
Fix: Map "ivf_flat" to "ivfflat" for pgvector index access method#763

Merged
XuanYang-cn merged 2 commits into zilliztech:main from NagarajuReddyBoggala:fix/pgvector-ivfflat-index-type
Apr 21, 2026
Conversation

@NagarajuReddyBoggala
Contributor

  • IndexType.IVFFlat.value="IVF_FLAT" → .lower()="ivf_flat" caused SQL to fail with "access method 'ivf_flat' does not exist"
  • pgvector PostgreSQL extension expects "ivfflat" (no underscore), not "ivf_flat"
  • Added explicit mapping after lowercase normalization: if index_type_lower == "ivf_flat": index_type_lower = "ivfflat"

Problem

PGVector IVFFlat index creation fails via UI with:

  • ERROR: access method "ivf_flat" does not exist

The same index creation works fine via the CLI.

Root Cause

The frontend config sends the wrong IndexType values:

  • Sends IndexType.IVFFlat.value = "IVF_FLAT" instead of "ivfflat"

pgvector PostgreSQL extension expects "ivfflat" (no underscore), not "ivf_flat"

Solution

if index_type_lower == "ivf_flat":
    index_type_lower = "ivfflat"  # [FIX] pgvector expects "ivfflat" not "ivf_flat"

Before: "IVF_FLAT" → "ivf_flat" → ERROR
After: "IVF_FLAT" → "ivf_flat" → "ivfflat" → ✅ CREATE INDEX USING ivfflat
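The mapping can be sketched as a small normalization helper (a sketch with an illustrative function name, not the actual VectorDBBench code):

```python
def normalize_pg_index_type(index_type: str) -> str:
    """Map a framework index-type name to a pgvector access method name.

    pgvector's access methods are spelled "ivfflat" and "hnsw" (no
    underscore), so "IVF_FLAT" must be remapped after lowercasing.
    Illustrative helper; the function name is an assumption.
    """
    index_type_lower = index_type.lower()
    if index_type_lower == "ivf_flat":
        index_type_lower = "ivfflat"  # pgvector expects "ivfflat", not "ivf_flat"
    return index_type_lower

# e.g. f'CREATE INDEX ON items USING {normalize_pg_index_type("IVF_FLAT")} ...'
# renders: CREATE INDEX ON items USING ivfflat ...
```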

Evidence

UI logs (broken):
Screenshot from 2026-04-20 16-14-45

@sre-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: NagarajuReddyBoggala
To complete the pull request process, please assign xuanyang-cn after the PR has been reviewed.
You can assign the PR to them by writing /assign @xuanyang-cn in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@NagarajuReddyBoggala
Contributor Author

/assign @XuanYang-cn

@XuanYang-cn
Collaborator

@NagarajuReddyBoggala Please run make format to pass the actions

@NagarajuReddyBoggala
Contributor Author

@XuanYang-cn Thanks for pointing this out!

Note: These linting issues (long comments at lines 338-340, commented-out code at lines 377/394) were introduced by PR #760, which was merged earlier.

My PR scope: Only added the ivf_flat → ivfflat fix, which is now lint-compliant after make format.

Should I fix all lint issues caused by PR #760?

@XuanYang-cn
Collaborator

@NagarajuReddyBoggala Thanks for the clarification — you're right, those are from #760, not this PR.

Since you're already in the file, it'd be great if you could include the cleanup here — but totally your call. If you'd rather keep this PR scoped to the ivf_flat → ivfflat fix, that's completely fine and I'll track the #760 issues separately. Either way works, and I appreciate you flagging it.

@NagarajuReddyBoggala
Contributor Author

Done @XuanYang-cn ! Pushed lint cleanup + fixes to the PR

Changes in 2nd commit:

  • Wrapped long comments to <120 chars (lines 338-340)
  • Removed commented-out code (lines 377, 394)

Thanks for the quick feedback

@XuanYang-cn XuanYang-cn merged commit b3613ff into zilliztech:main Apr 21, 2026
4 checks passed
XuanYang-cn added a commit to XuanYang-cn/VectorDBBench that referenced this pull request Apr 21, 2026
For non-thread-safe DBs (e.g. PgVector), ConcurrentInsertRunner clamps
max_workers to 1, so there is always exactly one worker thread. There is
no need to deepcopy self.db per thread — the single worker can use
self.db directly via the connection already opened by task()'s
`with self.db.init():`.

The original code called deepcopy(self.db) inside _get_thread_db() after
task() had already opened a live psycopg C-extension Connection on
self.db. C-extension objects cannot be deep-copied, causing:
  TypeError: no default __reduce__ due to non-trivial __cinit__

Fix: remove the deepcopy branch entirely. All workers (thread-safe or
not) now use self.db directly; thread-safety is guaranteed for
non-thread-safe DBs by the max_workers=1 clamp.

Also clean up stale comments in pgvector.py left over from zilliztech#760/zilliztech#763.

Adds tests/test_pgvector.py with:
- unit test that reproduces the bug (fails on original, passes on fix)
- e2e regression test via ConcurrentInsertRunner + OpenAI 50K dataset

See also: zilliztech#756

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
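The failure mode in this commit can be reproduced without psycopg: live C-level resources generally refuse deep-copying. The sketch below uses a stdlib lock as a dependency-free stand-in for psycopg's C-extension Connection (which raises "TypeError: no default __reduce__ due to non-trivial __cinit__"), and `effective_workers` is a hypothetical helper mirroring the max_workers=1 clamp described above, not the actual ConcurrentInsertRunner code:

```python
import copy
import threading

def can_deepcopy(obj) -> bool:
    """Return True if obj survives copy.deepcopy(), False on TypeError."""
    try:
        copy.deepcopy(obj)
        return True
    except TypeError:
        # Objects backed by C state (locks, sockets, DB connections)
        # typically cannot be pickled/deep-copied and raise TypeError.
        return False

assert not can_deepcopy(threading.Lock())  # stand-in for a live psycopg Connection

def effective_workers(requested: int, thread_safe: bool) -> int:
    """Illustrative clamp: non-thread-safe DBs get exactly one worker,
    so the single worker can reuse the already-open client directly
    instead of deep-copying it per thread."""
    return requested if thread_safe else 1
```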
XuanYang-cn added a commit that referenced this pull request Apr 21, 2026
