Skip to content

The get_multi_indexes method in PGDialect does not work correctly with pgvector-rs #11724

@xianyuntang

Description

@xianyuntang

Describe the bug

When my database uses Postgres 16 with pgvector-rs and applies the following index, an unexpected error occurs when I attempt to auto-generate a revision through Alembic. The root cause is that the index in pgvector-rs is slightly different from the index in PostgreSQL

When I query the indexes using

SELECT
    indexname,
    indexdef
FROM
    pg_indexes
WHERE
    tablename = 'vectors_768';

I noticed that the PostgreSQL index is:

CREATE INDEX ix_vector_embedding ON public.vectors_768 USING hnsw (embedding vector_l2_ops) WITH (m='4', ef_construction='10');

However, the index from pgvector-rs is:

CREATE INDEX ix_vector_embedding ON public.vectors_768 USING vectors (embedding vector_cos_ops) WITH (options='[indexing.hnsw]+
                     | m = 4                                                                                                                         +
                     | ef_construction = 10');

To fix this issue, just need to modify the get_multi_indexes method in PGDialect.

if row["reloptions"]:
-    dialect_options["postgresql_with"] = dict([option.split("=") for option in row["reloptions"]])
+    dialect_options["postgresql_with"] = dict([option.split("=", 1) for option in row["reloptions"]])

Optional link from https://docs.sqlalchemy.org which documents the behavior that is expected

No response

SQLAlchemy Version in Use

2.0.31

DBAPI (i.e. the database driver)

asyncpg

Database Vendor and Major Version

Postgres 16.3 with pgvector-rs 0.2.1

Python Version

3.10.11

Operating system

OSX

To Reproduce

from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine
from sqlalchemy.ext.asyncio import AsyncAttrs
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy import Index
from pgvector.sqlalchemy import Vector

engine = create_async_engine("postgresql+asyncpg://username:password@localhost/pgdb")

AsyncSessionMaker = async_sessionmaker(
    bind=engine, autocommit=False, autoflush=False, expire_on_commit=False
)


class Base(AsyncAttrs, DeclarativeBase):
    pass


class Vector768(Base):
    __tablename__ = "vectors"

    __table_args__ = (
        Index(
            "ix_vector_embedding",
            "embedding",
            postgresql_using="hnsw",
            postgresql_with={"m": 4, "ef_construction": 10},
            postgresql_ops={"embedding": "vector_l2_ops"},
        ),
    )

    id: Mapped[int] = mapped_column(autoincrement=True, primary_key=True)

    embedding: Mapped[list[float]] = mapped_column(Vector(768))

Error

Traceback (most recent call last):
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/bin/alembic", line 8, in <module>
    sys.exit(main())
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/config.py", line 636, in main
    CommandLine(prog=prog).main(argv=argv)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/config.py", line 626, in main
    self.run_cmd(cfg, options)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/config.py", line 603, in run_cmd
    fn(
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/command.py", line 236, in revision
    script_directory.run_env()
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/script/base.py", line 582, in run_env
    util.load_python_file(self.dir, "env.py")
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 95, in load_python_file
    module = load_module_py(module_id, path)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 113, in load_module_py
    spec.loader.exec_module(module)  # type: ignore
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/Volumes/Data/Work/Other/alembic-reproduce/alembic/env.py", line 90, in <module>
    run_migrations_online()
  File "/Volumes/Data/Work/Other/alembic-reproduce/alembic/env.py", line 84, in run_migrations_online
    asyncio.run(run_async_migrations())
  File "/Users/tangxianyun/.asdf/installs/python/3.10.11/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/tangxianyun/.asdf/installs/python/3.10.11/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/Volumes/Data/Work/Other/alembic-reproduce/alembic/env.py", line 76, in run_async_migrations
    await connection.run_sync(do_run_migrations)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/ext/asyncio/engine.py", line 886, in run_sync
    return await greenlet_spawn(
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 203, in greenlet_spawn
    result = context.switch(value)
  File "/Volumes/Data/Work/Other/alembic-reproduce/alembic/env.py", line 60, in do_run_migrations
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/runtime/environment.py", line 946, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/runtime/migration.py", line 616, in run_migrations
    for step in self._migrations_fn(heads, self):
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/command.py", line 212, in retrieve_migrations
    revision_context.run_autogenerate(rev, context)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/api.py", line 570, in run_autogenerate
    self._run_environment(rev, migration_context, True)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/api.py", line 617, in _run_environment
    compare._populate_migration_script(
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/compare.py", line 65, in _populate_migration_script
    _produce_net_changes(autogen_context, upgrade_ops)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/compare.py", line 98, in _produce_net_changes
    comparators.dispatch("schema", autogen_context.dialect.name)(
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/util/langhelpers.py", line 310, in go
    fn(*arg, **kw)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/compare.py", line 137, in _autogen_for_tables
    _compare_tables(
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/compare.py", line 249, in _compare_tables
    sqla_compat._reflect_table(inspector, t)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/util/sqla_compat.py", line 359, in _reflect_table
    return inspector.reflect_table(table, None)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 1526, in reflect_table
    _reflect_info = self._get_reflection_info(
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 2011, in _get_reflection_info
    indexes=run(self.get_multi_indexes),
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 1992, in run
    res = meth(filter_names=_fn, **kw)
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 1185, in get_multi_indexes
    self.dialect.get_multi_indexes(
  File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/base.py", line 4582, in get_multi_indexes
    dialect_options["postgresql_with"] = dict(
ValueError: dictionary update sequence element #0 has length 4; 2 is required

Additional context

alembic migration scripts here

def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.execute("DROP EXTENSION IF EXISTS vectors;")
    op.execute("CREATE EXTENSION vectors;")
    op.execute("SET vectors.pgvector_compatibility=on;")
    op.create_table('vectors_768',
    sa.Column('id', sa.Integer(), autoincrement=True, nullable=False),
    sa.Column('embedding', vector.VECTOR(dim=768), nullable=False),
    sa.PrimaryKeyConstraint('id')
    )
    op.create_index('ix_vector_embedding', 'vectors_768',
                    ['embedding'],
                    unique=False,
                    postgresql_using='hnsw',
                    postgresql_with={'m': 4, 'ef_construction': 10},
                    postgresql_ops={'embedding': 'vector_l2_ops'})
    # ### end Alembic commands ###

Metadata

Metadata

Assignees

No one assigned

    Labels

    PRs (with tests!) welcomea fix or feature which is appropriate to be implemented by volunteersbugSomething isn't workingpostgresqlreflectionreflection of tables, columns, constraints, defaults, sequences, views, everything else

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions