-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Closed
Closed
Copy link
Labels
PRs (with tests!) welcomea fix or feature which is appropriate to be implemented by volunteersa fix or feature which is appropriate to be implemented by volunteersbugSomething isn't workingSomething isn't workingpostgresqlreflectionreflection of tables, columns, constraints, defaults, sequences, views, everything elsereflection of tables, columns, constraints, defaults, sequences, views, everything else
Description
Describe the bug
When my database uses Postgres 16 with pgvector-rs and applies the following index, an unexpected error occurs when I attempt to auto-generate a revision through Alembic. The root cause is that the index in pgvector-rs is slightly different from the index in PostgreSQL
When I query the indexes using
SELECT
indexname,
indexdef
FROM
pg_indexes
WHERE
tablename = 'vectors_768';
I noticed that the PostgreSQL index is:
CREATE INDEX ix_vector_embedding ON public.vectors_768 USING hnsw (embedding vector_l2_ops) WITH (m='4', ef_construction='10');
However, the index from pgvector-rs is:
CREATE INDEX ix_vector_embedding ON public.vectors_768 USING vectors (embedding vector_cos_ops) WITH (options='[indexing.hnsw]+
| m = 4 +
| ef_construction = 10');
To fix this issue, just need to modify the get_multi_indexes method in PGDialect.
if row["reloptions"]:
- dialect_options["postgresql_with"] = dict([option.split("=") for option in row["reloptions"]])
+ dialect_options["postgresql_with"] = dict([option.split("=", 1) for option in row["reloptions"]])Optional link from https://docs.sqlalchemy.org which documents the behavior that is expected
No response
SQLAlchemy Version in Use
2.0.31
DBAPI (i.e. the database driver)
asyncpg
Database Vendor and Major Version
Postgres 16.3 with pgvector-rs 0.2.1
Python Version
3.10.11
Operating system
OSX
To Reproduce
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine
from sqlalchemy.ext.asyncio import AsyncAttrs
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy import Index
from pgvector.sqlalchemy import Vector
engine = create_async_engine("postgresql+asyncpg://username:password@localhost/pgdb")
AsyncSessionMaker = async_sessionmaker(
bind=engine, autocommit=False, autoflush=False, expire_on_commit=False
)
class Base(AsyncAttrs, DeclarativeBase):
pass
class Vector768(Base):
__tablename__ = "vectors"
__table_args__ = (
Index(
"ix_vector_embedding",
"embedding",
postgresql_using="hnsw",
postgresql_with={"m": 4, "ef_construction": 10},
postgresql_ops={"embedding": "vector_l2_ops"},
),
)
id: Mapped[int] = mapped_column(autoincrement=True, primary_key=True)
embedding: Mapped[list[float]] = mapped_column(Vector(768))Error
Traceback (most recent call last):
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/bin/alembic", line 8, in <module>
sys.exit(main())
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/config.py", line 636, in main
CommandLine(prog=prog).main(argv=argv)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/config.py", line 626, in main
self.run_cmd(cfg, options)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/config.py", line 603, in run_cmd
fn(
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/command.py", line 236, in revision
script_directory.run_env()
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/script/base.py", line 582, in run_env
util.load_python_file(self.dir, "env.py")
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 95, in load_python_file
module = load_module_py(module_id, path)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 113, in load_module_py
spec.loader.exec_module(module) # type: ignore
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/Volumes/Data/Work/Other/alembic-reproduce/alembic/env.py", line 90, in <module>
run_migrations_online()
File "/Volumes/Data/Work/Other/alembic-reproduce/alembic/env.py", line 84, in run_migrations_online
asyncio.run(run_async_migrations())
File "/Users/tangxianyun/.asdf/installs/python/3.10.11/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/Users/tangxianyun/.asdf/installs/python/3.10.11/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/Volumes/Data/Work/Other/alembic-reproduce/alembic/env.py", line 76, in run_async_migrations
await connection.run_sync(do_run_migrations)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/ext/asyncio/engine.py", line 886, in run_sync
return await greenlet_spawn(
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 203, in greenlet_spawn
result = context.switch(value)
File "/Volumes/Data/Work/Other/alembic-reproduce/alembic/env.py", line 60, in do_run_migrations
context.run_migrations()
File "<string>", line 8, in run_migrations
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/runtime/environment.py", line 946, in run_migrations
self.get_context().run_migrations(**kw)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/runtime/migration.py", line 616, in run_migrations
for step in self._migrations_fn(heads, self):
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/command.py", line 212, in retrieve_migrations
revision_context.run_autogenerate(rev, context)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/api.py", line 570, in run_autogenerate
self._run_environment(rev, migration_context, True)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/api.py", line 617, in _run_environment
compare._populate_migration_script(
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/compare.py", line 65, in _populate_migration_script
_produce_net_changes(autogen_context, upgrade_ops)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/compare.py", line 98, in _produce_net_changes
comparators.dispatch("schema", autogen_context.dialect.name)(
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/util/langhelpers.py", line 310, in go
fn(*arg, **kw)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/compare.py", line 137, in _autogen_for_tables
_compare_tables(
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/autogenerate/compare.py", line 249, in _compare_tables
sqla_compat._reflect_table(inspector, t)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/alembic/util/sqla_compat.py", line 359, in _reflect_table
return inspector.reflect_table(table, None)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 1526, in reflect_table
_reflect_info = self._get_reflection_info(
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 2011, in _get_reflection_info
indexes=run(self.get_multi_indexes),
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 1992, in run
res = meth(filter_names=_fn, **kw)
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 1185, in get_multi_indexes
self.dialect.get_multi_indexes(
File "/Volumes/Data/Work/Other/alembic-reproduce/.venv/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/base.py", line 4582, in get_multi_indexes
dialect_options["postgresql_with"] = dict(
ValueError: dictionary update sequence element #0 has length 4; 2 is required
Additional context
alembic migration scripts here
def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.execute("DROP EXTENSION IF EXISTS vectors;")
op.execute("CREATE EXTENSION vectors;")
op.execute("SET vectors.pgvector_compatibility=on;")
op.create_table('vectors_768',
sa.Column('id', sa.Integer(), autoincrement=True, nullable=False),
sa.Column('embedding', vector.VECTOR(dim=768), nullable=False),
sa.PrimaryKeyConstraint('id')
)
op.create_index('ix_vector_embedding', 'vectors_768',
['embedding'],
unique=False,
postgresql_using='hnsw',
postgresql_with={'m': 4, 'ef_construction': 10},
postgresql_ops={'embedding': 'vector_l2_ops'})
# ### end Alembic commands ###
Metadata
Metadata
Assignees
Labels
PRs (with tests!) welcomea fix or feature which is appropriate to be implemented by volunteersa fix or feature which is appropriate to be implemented by volunteersbugSomething isn't workingSomething isn't workingpostgresqlreflectionreflection of tables, columns, constraints, defaults, sequences, views, everything elsereflection of tables, columns, constraints, defaults, sequences, views, everything else