Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pg dialect when rendering index expressions needs to apply self_group() the same way core DDL compiler does #1322

Closed
MIrinkov opened this issue Oct 12, 2023 · 5 comments

Comments

@MIrinkov
Copy link

Describe the bug
Alembic fails to generate a syntactically correct index migration when running alembic revision --autogenerate when creating an index on a field in a JSONB column using format like sqlalchemy.Index("idx_meta_file_id", meta["file_id"].astext) (where meta is a column in a model class, meta: Mapped[dict] = mapped_column(sqlalchemy.dialects.postgresql.JSONB)).

The constructed migration operations are:

    op.create_table(
        "test_table",
        sa.Column("id", sa.Integer(), nullable=False),
        sa.Column(
            "meta", postgresql.JSONB(astext_type=sa.Text()), nullable=False
        ),
        sa.PrimaryKeyConstraint("id"),
    )
    op.create_index(
        "idx_meta_file_id",
        "test_table",
        [sa.text("meta ->> 'file_id'")],
        unique=False,
    )

Running those with alembic upgrade head results in

sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax erro
r at or near "->>"
[SQL: CREATE INDEX idx_meta_file_id ON test_table (meta ->> 'file_id')]

Expected behavior
The [sa.text("meta ->> 'file_id'")], is missing parentheses, it is supposed to be [sa.text("(meta ->> 'file_id')")], because the correct postgres syntax for this index is

-- notice the double parentheses
CREATE INDEX idx_meta_file_id ON test_table ((meta ->> 'file_id'));

I can currently achieve this by replacing the .astext version with

sqlalchemy.Index("idx_meta_file_id", sqlalchemy.text("(meta ->> 'file_id')")) # Notice the parentheses withing the text expression

My question is - is this a known bug? Did I miss some configuration that's required for dialect JSONB to work properly? Or am I expected to use text() expressions in such cases?

To Reproduce

from sqlalchemy.ext.asyncio import AsyncAttrs
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
import sqlalchemy.dialects.postgresql

class Base(AsyncAttrs, DeclarativeBase):
    pass

class TestTable(Base):
    __tablename__ = "test_table"
    id: Mapped[int] = mapped_column(primary_key=True)
    meta: Mapped[dict] = mapped_column(sqlalchemy.dialects.postgresql.JSONB)
    __table_args__ = (
        sqlalchemy.Index("idx_meta_file_id", meta["file_id"].astext),
        # # The version below actually works.
        # sqlalchemy.Index("idx_meta_file_id", sqlalchemy.text("(meta ->> 'file_id')")),
    )

Then run alembic revision --autogenerate, and the results will be as stated above - the required "extra" parentheses will be missing.
Running alembic upgrade head will then result in the following error.

Error

(venv) PS C:\Users\User\project\migrations> alembic upgrade head
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.        
INFO  [alembic.runtime.migration] Running upgrade  -> 2614768f9f92, test
Traceback (most recent call last):
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 528, in _prepare_and_execute
    prepared_stmt, attributes = await adapt_connection._prepare(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 778, in _prepare
    prepared_stmt = await self._connection.prepare(
  File "C:\Users\User\project\venv\lib\site-packages\asyncpg\connection.py", line 565, in prepare
    return await self._prepare(
  File "C:\Users\User\project\venv\lib\site-packages\asyncpg\connection.py", line 583, in _prepare
    stmt = await self._get_statement(
  File "C:\Users\User\project\venv\lib\site-packages\asyncpg\connection.py", line 397, in _get_statement
    statement = await self._protocol.prepare(
  File "asyncpg\protocol\protocol.pyx", line 168, in prepare
asyncpg.exceptions.PostgresSyntaxError: syntax error at or near "->>"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 585, in execute
    self._adapt_connection.await_(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 125, in await_only
    return current.driver.switch(awaitable)  # type: ignore[no-any-return]
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 185, in greenlet_spawn
    value = await result
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 564, in _prepare_and_execute
    self._handle_exception(error)
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 515, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 802, in _handle_exception
    raise translated_error from error
sqlalchemy.dialects.postgresql.asyncpg.AsyncAdapt_asyncpg_dbapi.ProgrammingError: <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "->>"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\User\project\venv\Scripts\alembic.exe\__main__.py", line 7, in <module>
  File "C:\Users\User\project\venv\lib\site-packages\alembic\config.py", line 630, in main
    CommandLine(prog=prog).main(argv=argv)
  File "C:\Users\User\project\venv\lib\site-packages\alembic\config.py", line 624, in main
    self.run_cmd(cfg, options)
  File "C:\Users\User\project\venv\lib\site-packages\alembic\config.py", line 601, in run_cmd
    fn(
  File "C:\Users\User\project\venv\lib\site-packages\alembic\command.py", line 399, in upgrade
    script.run_env()
  File "C:\Users\User\project\venv\lib\site-packages\alembic\script\base.py", line 578, in run_env
    util.load_python_file(self.dir, "env.py")
  File "C:\Users\User\project\venv\lib\site-packages\alembic\util\pyfiles.py", line 93, in load_python_file
    module = load_module_py(module_id, path)
  File "C:\Users\User\project\venv\lib\site-packages\alembic\util\pyfiles.py", line 109, in load_module_py
    spec.loader.exec_module(module)  # type: ignore
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\User\project\migrations\.\env.py", line 89, in <module>
    run_migrations_online()
  File "C:\Users\User\project\migrations\.\env.py", line 83, in run_migrations_online
    asyncio.run(run_async_migrations())
  File "C:\Program Files\Python310\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Program Files\Python310\lib\asyncio\base_events.py", line 641, in run_until_complete
    return future.result()
  File "C:\Users\User\project\migrations\.\env.py", line 75, in run_async_migrations
    await connection.run_sync(do_run_migrations)
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\ext\asyncio\engine.py", line 884, in run_sync
    return await greenlet_spawn(fn, self._proxied, *arg, **kw)
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 192, in greenlet_spawn
    result = context.switch(value)
  File "C:\Users\User\project\migrations\.\env.py", line 59, in do_run_migrations
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "C:\Users\User\project\venv\lib\site-packages\alembic\runtime\environment.py", line 937, in run_migrations
    self.get_context().run_migrations(**kw)
  File "C:\Users\User\project\venv\lib\site-packages\alembic\runtime\migration.py", line 624, in run_migrations
    step.migration_fn(**kw)
  File "C:\Users\User\project\migrations\versions\2614768f9f92_test.py", line 74, in upgrade
    op.create_index(
  File "<string>", line 8, in create_index
  File "<string>", line 3, in create_index
  File "C:\Users\User\project\venv\lib\site-packages\alembic\operations\ops.py", line 994, in create_index
    return operations.invoke(op)
  File "C:\Users\User\project\venv\lib\site-packages\alembic\operations\base.py", line 393, in invoke
    return fn(self, operation)
  File "C:\Users\User\project\venv\lib\site-packages\alembic\operations\toimpl.py", line 105, in create_index
    operations.impl.create_index(idx, **kw)
  File "C:\Users\User\project\venv\lib\site-packages\alembic\ddl\postgresql.py", line 94, in create_index
    self._exec(CreateIndex(index, **kw))
  File "C:\Users\User\project\venv\lib\site-packages\alembic\ddl\impl.py", line 193, in _exec
    return conn.execute(  # type: ignore[call-overload]
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1412, in execute
    return meth(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\sql\ddl.py", line 181, in _execute_on_connection
    return connection._execute_ddl(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1524, in _execute_ddl
    ret = self._execute_context(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1844, in _execute_context
    return self._exec_single_context(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1984, in _exec_single_context
    self._handle_dbapi_exception(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\base.py", line 2339, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\engine\default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 585, in execute
    self._adapt_connection.await_(
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 125, in await_only
    return current.driver.switch(awaitable)  # type: ignore[no-any-return]
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 185, in greenlet_spawn
    value = await result
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 564, in _prepare_and_execute
    self._handle_exception(error)
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 515, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "C:\Users\User\project\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 802, in _handle_exception
    raise translated_error from error
sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "->>"
[SQL: CREATE INDEX idx_meta_file_id ON test_table (meta ->> 'file_id')]
(Background on this error at: https://sqlalche.me/e/20/f405)

Versions.

  • OS: Win 10
  • Python: 3.10.1
  • Alembic: 1.12.0
  • SQLAlchemy: 2.0.21
  • Database: PostgreSQL 15.4 (Debian 15.4-1.pgdg120+1) on x86_64-pc-linux-gnu
  • DBAPI:

Have a nice day!

@MIrinkov MIrinkov added the requires triage New issue that requires categorization label Oct 12, 2023
@MIrinkov MIrinkov changed the title Creating index on a JSONB field generates invalid SQL (missing parentheses) Creating index on a JSONB nested field generates invalid SQL (missing parentheses) Oct 12, 2023
@zzzeek
Copy link
Member

zzzeek commented Oct 12, 2023

it's a bit strange because a plain create table does do the parenthesis in that weird "double" way

from sqlalchemy.orm import declarative_base
from sqlalchemy import Column, Integer, Text, Index
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy import create_engine

Base = declarative_base()

class Thing(Base):
    __tablename__ = 'thing'

    id = Column(Integer, primary_key=True)
    data = Column(JSONB(astext_type=Text()))

    __table_args__ = (
        Index("idx_1", data["file_id"].astext),
    )



e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
Base.metadata.drop_all(e)
Base.metadata.create_all(e)
CREATE TABLE thing (
	id SERIAL NOT NULL, 
	data JSONB, 
	PRIMARY KEY (id)
)


2023-10-12 13:01:42,808 INFO sqlalchemy.engine.Engine [no key 0.00063s] {}
2023-10-12 13:01:42,835 INFO sqlalchemy.engine.Engine CREATE INDEX idx_1 ON thing ((data ->> 'file_id'))

obviously the text rendering for alembic is trying to normalize parens out and is failing for some reason here

@zzzeek
Copy link
Member

zzzeek commented Oct 12, 2023

OK well the PG dialect in SQLAlchemy just calls self_group() on these expressions. so you can work around like this:

class TestTable(Base):
    __tablename__ = "test_table"
    id: Mapped[int] = mapped_column(primary_key=True)
    meta: Mapped[dict] = mapped_column(sqlalchemy.dialects.postgresql.JSONB)
    __table_args__ = (
        sqlalchemy.Index("idx_meta_file_id", meta["file_id"].astext.self_group()),
    )

that should resolve the issue here

@zzzeek zzzeek changed the title Creating index on a JSONB nested field generates invalid SQL (missing parentheses) pg dialect when rendering index expressions needs to apply self_group() the same way core DDL compiler does Oct 12, 2023
@zzzeek
Copy link
Member

zzzeek commented Oct 12, 2023

the thing is , autogenerate on the second run will fail anyway as we dont do a good job comparing these indexes, so you'd really need to define the text like this probably:

sa.text('(data ->> \'$."file_id"\'::text)')

@sqla-tester
Copy link
Collaborator

Mike Bayer has proposed a fix for this issue in the main branch:

apply PG ddl paren rules to index expressions https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/4921

@CaselIT
Copy link
Member

CaselIT commented Oct 12, 2023

the thing is , autogenerate on the second run will fail anyway as we dont do a good job comparing these indexes, so you'd really need to define the text like this probably:

sa.text('(data ->> \'$."file_id"\'::text)')

so pg returns these like that? wow that's unfortunate.

Do you want to add a rule to ignore any index that contains an expression with -> and ->>?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants