Skip to content

feat(vectorstore): task #61 P1-V vector adapter family — capability + filter Or guard + retrieve defense-in-depth#1948

Merged
earayu merged 2 commits into
mainfrom
bryce/task-83-p1-vector-adapter-family
Apr 30, 2026
Merged

feat(vectorstore): task #61 P1-V vector adapter family — capability + filter Or guard + retrieve defense-in-depth#1948
earayu merged 2 commits into
mainfrom
bryce/task-83-p1-vector-adapter-family

Conversation

@earayu
Copy link
Copy Markdown
Collaborator

@earayu earayu commented Apr 30, 2026

Summary

Closes task #83 per PM @不穷 dispatch (msg=29c9e753). Folds 4 P1-V items from task #61 spec v1 § 2.3 into a single PR.

Changes

  • P1-V1 ensure_collection Protocol docstring spells out the cross-adapter contract (idempotent / race-safe / fail-loud / cache-not-poisoned-on-failure)
  • P1-V2 New VectorBackendCapabilities frozen dataclass + BACKEND_CAPABILITIES class-level attribute on each adapter declares supports_atomic_batch_upsert (PGVector True, Qdrant False)
  • P1-V3 Both adapter translators guard against empty Or parts at translator boundary (defense-in-depth on top of Or.__post_init__ DSL-level reject) — prevents vacuous "match everything" disjunction
  • P1-V4 Qdrant retrieve applies belt-and-braces TENANT_PAYLOAD_KEY filter in both multitenant + legacy modes, with backwards-compat "no payload key → pass through" branch. supports_legacy_mode capability flag declared (PGVector False, Qdrant True).

Tests

  • New tests/unit_test/vectorstore/test_backend_capabilities.py (8 cases) — pins shape + per-flag values; coordinates with cuiwenbo task chore: remove the vicuna model #87 P1-D3 collection metadata Pydantic projection
  • test_pgvector_translator.py + test_qdrant_filter_translation.py — pin the Or empty-parts guard with frozen-dataclass-bypass coverage
  • test_qdrant_multitenancy_integration.py — new test_retrieve_legacy_mode_filters_stray_foreign_payload exercises P1-V4 with a real :memory: Qdrant client (no payload key passes through, own-tenant passes, foreign-tenant dropped)

Local: uv run pytest tests/unit_test/vectorstore/156 passed, 10 skipped, 1 warning.

Test plan

  • Local unit tests pass
  • ruff check + ruff format --check clean
  • CI lint-and-unit + e2e green
  • @符炫炜 architect ratify (per spec § 2.3 + Lesson feat: chat #17 backend converge contract)
  • @weston architecture CR (cross-adapter parity gate)
  • @huangzhangshu testing CR (boundary test grep gate + capability matrix)
  • @cuiwenbo coordinate task chore: remove the vicuna model #87 P1-D3 projection consumer (read BACKEND_CAPABILITIES values verbatim)

Spec / scope alignment

Follow-ups (NOT in this PR)

🤖 Generated with Claude Code

… filter Or guard + retrieve defense-in-depth

Closes task #83 per PM @不穷 dispatch (msg=29c9e753). Folds 4 P1-V
items from task #61 spec v1 § 2.3 into a single PR:

P1-V1 — collection init failure contract documentation
------------------------------------------------------
``ensure_collection`` Protocol docstring now spells out the cross-
adapter contract (idempotent / race-safe / fail-loud / cache-not-
poisoned-on-failure). Both adapters already implement these
behaviours; the documentation closes the spec drift gap so future
implementers have a checklist.

P1-V2 — batch upsert atomicity capability declaration
-----------------------------------------------------
New :class:`VectorBackendCapabilities` frozen dataclass on the base
module declares static per-backend behaviour flags. Each
``VectorStoreConnector`` subclass exposes an instance via the
``BACKEND_CAPABILITIES`` class-level attribute:

* ``PgvectorVectorStoreConnector.BACKEND_CAPABILITIES.supports_atomic_batch_upsert = True``
  (PGVector wraps bulk INSERT ON CONFLICT in ``engine.begin()`` —
  mid-batch failure rolls back the whole batch).
* ``QdrantVectorStoreConnector.BACKEND_CAPABILITIES.supports_atomic_batch_upsert = False``
  (Qdrant ``client.upsert(points, wait=True)`` is best-effort
  per-point — partial writes possible on mid-batch failure).

``upsert`` Protocol docstring now points at the capability flag so
callers know to chunk + verify on backends that declare ``False``.

P1-V3 — filter Or empty-parts guard
-----------------------------------
``Or.__post_init__`` already rejects empty ``parts`` at DSL
construction. Both adapter translators now also guard at the
translator boundary so a future refactor that bypasses the
constructor (e.g. ``object.__setattr__(or_node, "parts", ())`` on
the frozen dataclass, or a ``dataclasses.replace`` with empty
parts) can't silently degrade to a vacuous "match everything"
disjunction:

* ``aperag/vectorstore/pgvector_connector.py:_SqlFilter._walk`` —
  raises ``UnsupportedFilterError`` on empty post-walk parts.
* ``aperag/vectorstore/qdrant_connector.py:_translate_filter`` —
  raises ``UnsupportedFilterError`` on empty post-prune subs (so
  ``rest.Filter(should=[])`` — which Qdrant treats as match-all —
  is unreachable).

P1-V4 — Qdrant legacy mode defense-in-depth
-------------------------------------------
``QdrantVectorStoreConnector.retrieve`` now applies the same
``TENANT_PAYLOAD_KEY`` filter in **both** multitenant and legacy
modes, but with a backwards-compatible "no payload key → pass
through" branch so legacy-only rows that don't carry the payload
key keep working:

* In multitenant mode: filter is the primary tenant-isolation
  layer (unchanged behaviour).
* In legacy mode: collection-name isolation is the primary layer;
  the new payload-level filter is belt-and-braces against tooling
  drift / migration mistakes that could plant a stray foreign-tenant
  row in a legacy collection.

The new ``BACKEND_CAPABILITIES.supports_legacy_mode`` flag declares
which adapter supports the legacy layout (PGVector ``False``,
Qdrant ``True``) so callers can tell the difference machine-
readably.

Tests
-----
* ``tests/unit_test/vectorstore/test_backend_capabilities.py``
  (new) — pins shape + per-flag values for each adapter. Coordinates
  with cuiwenbo task #87 P1-D3 collection metadata Pydantic
  projection so the static capability matrix stays consistent
  across PRs.
* ``tests/unit_test/vectorstore/test_pgvector_translator.py`` and
  ``test_qdrant_filter_translation.py`` — pin the new Or empty-parts
  guard with frozen-dataclass-bypass coverage.
* ``tests/unit_test/vectorstore/test_qdrant_multitenancy_integration.py``
  — new ``test_retrieve_legacy_mode_filters_stray_foreign_payload``
  exercises the P1-V4 belt-and-braces filter on a real ``:memory:``
  Qdrant client: legacy-mode rows without payload key pass through
  (backward compat), own-tenant payload passes, foreign-tenant
  payload is dropped.

Local: ``uv run pytest tests/unit_test/vectorstore/`` →
**156 passed, 10 skipped, 1 warning**.

Spec / scope alignment
----------------------
* task #61 spec v1 § 2.3 P1-V1 → ensure_collection contract doc ✅
* task #61 spec v1 § 2.3 P1-V2 → BACKEND_CAPABILITIES.supports_atomic_batch_upsert ✅
* task #61 spec v1 § 2.3 P1-V3 → Or empty-parts guard ✅
* task #61 spec v1 § 2.3 P1-V4 → retrieve defense-in-depth + supports_legacy_mode ✅
* Lesson #14 multi-iteration cleanup — legacy mode flagged via
  ``supports_legacy_mode`` so a future PR can drop the mode
  entirely once telemetry confirms zero production usage ✅
* Lesson #17 backend 收敛 contract — capability declaration is the
  backend-side contract that lets callers (FE / API / MCP) read a
  single source of truth instead of forking on backend type ✅

Follow-ups (NOT in this PR)
---------------------------
* task #84 P1-G1+G2 graph store boundary tests — ziang
* task #85 P1-D1 e2e shape matrix — huangzhangshu
* task #86 P1-D2 Helm Nebula first-class — Planetegg
* task #87 P1-D3 collection metadata vector_backend projection —
  cuiwenbo + dongdong (consumes ``BACKEND_CAPABILITIES`` values)
* task #88 P2-S1+S2 batch alias resolution — Bryce after this PR

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@earayu
Copy link
Copy Markdown
Collaborator Author

earayu commented Apr 30, 2026

Testing CR LGTM. I verified the P1-V testing surface: capability matrix tests pin PGVector/Qdrant BACKEND_CAPABILITIES, both translators now fail-loud on empty Or after DSL guard bypass, and Qdrant legacy retrieve keeps no-payload rows while dropping stray foreign TENANT_PAYLOAD_KEY rows. Local validation: uv run --extra test pytest tests/unit_test/vectorstore/test_backend_capabilities.py tests/unit_test/vectorstore/test_pgvector_translator.py tests/unit_test/vectorstore/test_qdrant_filter_translation.py tests/unit_test/vectorstore/test_qdrant_multitenancy_integration.py -q -> 48 passed; uv run --extra test pytest tests/unit_test/vectorstore -q -> 156 passed / 10 skipped; git diff --check origin/main...HEAD passed. Note: GitHub identity cannot approve because it is treated as PR author, so recording LGTM as a PR comment.

…ton PR #1948 BLOCKER)

Weston PR #1948 architecture CR (msg=910cad66 BLOCKER) caught a real
correctness regression in the initial P1-V4 commit: the uniform
"no payload key → pass through" branch leaked stray ``{}`` payload
rows in the **shared multitenant collection** to every tenant on a
``retrieve(ids=...)`` call.

Local Qdrant ``:memory:`` repro (per Weston): a multitenant
connector ``tenant_a`` writes a point with ``payload={}`` directly
to the shared collection, then ``tenant_a.retrieve([id])`` returns
the row. Because ``upsert()`` always stamps the payload key, the
only way a missing-key row reaches the shared collection is tooling
drift / migration drift — exactly the case P1-V4 defense-in-depth
is supposed to catch.

Fix
---
Mode-specific semantics:

* **Multitenant mode** (shared physical collection): STRICT —
  every row MUST carry ``TENANT_PAYLOAD_KEY`` matching the
  connector's tenant id. No "no payload key → pass through"
  branch, because the shared collection means a missing key would
  expose the row to every tenant.
* **Legacy mode** (per-tenant physical collection, unchanged from
  initial commit): PERMISSIVE — a row that doesn't carry the
  payload key still passes through (typical pre-multitenant data
  shape), but a stray foreign-tenant payload gets dropped (catches
  tooling drift / migration mistakes).

Tests
-----
``test_retrieve_multitenant_mode_strict_requires_payload_key`` (new)
— Weston's exact repro: seed shared collection with ``{}`` payload
+ own-tenant payload + foreign-tenant payload, assert only the
own-tenant row passes through. The legacy-mode permissive
counterpart (``test_retrieve_legacy_mode_filters_stray_foreign_payload``)
stays unchanged so a future refactor that unifies them silently
re-opens the leak fails fast.

Local: ``uv run pytest tests/unit_test/vectorstore/`` →
**157 passed, 10 skipped** (one new case).

Sediment trigger
----------------
This is Lesson #12 v9 fifth-application demo same family — Weston
first-principles repro catches the unified branch as silent leak
that I missed when applying the legacy-compat optimization
uniformly. The narrower ``mode-specific`` framing matches the spec
language ("legacy compat for legacy mode only") more precisely.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@earayu
Copy link
Copy Markdown
Collaborator Author

earayu commented Apr 30, 2026

Testing CR re-final after 542dd65: LGTM. Weston catch was valid; my earlier LGTM missed that legacy no-key pass-through had been applied too broadly to multitenant retrieve. I rechecked the mode split: multitenant retrieve now strictly requires payload.collection_id == tenant_id, while legacy mode still allows no-key rows and drops foreign-tenant payloads. Local validation: uv run --extra test pytest tests/unit_test/vectorstore/test_qdrant_multitenancy_integration.py -q -> 13 passed; uv run --extra test pytest tests/unit_test/vectorstore/test_backend_capabilities.py tests/unit_test/vectorstore/test_pgvector_translator.py tests/unit_test/vectorstore/test_qdrant_filter_translation.py tests/unit_test/vectorstore/test_qdrant_multitenancy_integration.py -q -> 49 passed; git diff --check origin/main...HEAD passed.

@earayu earayu merged commit d433a5e into main Apr 30, 2026
25 of 27 checks passed
@earayu earayu deleted the bryce/task-83-p1-vector-adapter-family branch April 30, 2026 10:27
earayu added a commit that referenced this pull request Apr 30, 2026
…ty matrix (#1949)

* feat(collection): task #61 P1-D3 — vector backend identity + capability matrix

Project the deployment-wide ``settings.vector_db_type`` onto every
collection detail read so the FE can render a "what does this vector
backend actually support" panel without per-collection migration or
runtime probe.

Backend (output-only projection):
- ``aperag/schema/common.py``: ``VectorBackendCapabilities`` +
  ``VectorBackendInfo`` + ``_STATIC_VECTOR_BACKEND_CAPABILITIES`` dict +
  ``project_vector_backend_info()`` helper.
- ``aperag/domains/knowledge_base/schemas.py:Collection``: add
  ``vector_backend: Optional[VectorBackendInfo]``. **Intentionally NOT
  on ``CollectionConfig``** so the OpenAPI ``CollectionCreate`` /
  ``CollectionUpdate`` input shapes do not let callers mistake a
  deployment-wide setting for a per-collection editable knob (per
  dongdong msg=c2593fdd + PM msg=caf7e4df + architect msg=0044261f
  read-only projection lock).
- ``aperag/domains/knowledge_base/service/collection_service.py``:
  populate ``vector_backend`` in ``build_collection_response`` from
  ``settings.vector_db_type``; ``None`` for unknown backends so the FE
  can render a placeholder without a hard failure.

Cross-PR consistency with task #83 / PR #1948 (Bryce, vector adapter
behavior fixes):
- Bryce's connector-layer ``BACKEND_CAPABILITIES`` ClassVar declares 2
  truth flags (``supports_atomic_batch_upsert`` +
  ``supports_legacy_mode``); this PR's schema-layer Pydantic model
  mirrors those values plus a 3rd schema-layer-only flag
  ``supports_filter_or_with_empty_parts`` which is uniformly False
  across adapters after task #83 P1-V3 (translator-level
  defense-in-depth rejects empty Or parts).
- The 3rd flag stays in the schema so the FE can declare the uniform
  reject explicitly per spec § 2.3 P1-D3 「显示『允许差异但显式』」 —
  Lesson #17 backend 收敛 contract simple-stable family pattern (cite
  PR #1930 SearchHit normalize, PR #1935 GraphMergeSuggestionItem
  projection layer).

Mechanical gate (per Lesson #18 lesson-sediment + mechanical-gate 双
layer codification — first established by chenyexuan PR #1933 / PR
#1941, then PR #1940 ``model_validate`` boundary): 13-case unit suite
in ``tests/unit_test/contracts/test_vector_backend_capability_matrix.py``
pins each capability flag, normalizes inputs, and round-trips Pydantic
``model_dump`` so future drift between schema, projection helper, and
FE-consumed shape fails fast at unit-test time.

FE (read-only display):
- ``web/src/features/collection/types.ts``: typed mirrors
  ``VectorBackendInfo`` / ``VectorBackendCapabilities`` /
  ``VectorBackendType``.
- ``web/src/app/workspace/collections/[collectionId]/settings/collection-vector-backend-card.tsx``:
  new component that surfaces backend identity + capability matrix in
  the collection settings page (above the edit form). dongdong picks
  up rendering polish (responsive + dark mode + final copy) on the
  same PR per the joint A4-style split (cuiwenbo contract layer +
  dongdong rendering polish + CR pair).
- ``web/src/i18n/{en-US,zh-CN}/page_collections.json``: copy strings.
- ``web/src/api-v2/schema.d.ts`` regenerated via ``yarn api:v2:types``.

Local verification:
- ``uv run --extra test pytest tests/unit_test/contracts/test_vector_backend_capability_matrix.py tests/unit_test/contracts/test_collection_v2_openapi_contract.py -q`` → 23 passed
- ``make openapi-check`` → ok
- ``yarn type-check --pretty false`` → 0 new errors on this PR's files
  (pre-existing graph-lab cosmograph + agent-runtime errors unchanged)
- ``yarn lint --quiet`` → 0 warnings/errors
- ``yarn i18n:check`` → ok
- ``git diff --check`` → ok

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(collection): task #87 P1-D3 — convert vector_backend to computed_field

Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f /
msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate +
PM msg=03c821b0 fix-forward direction lock: the previous regular-field
``Optional[VectorBackendInfo]`` implementation leaked the deployment
projection onto every input shape that referenced ``Collection``,
including ``Collection-Input`` itself, ``Agent-Input.collections``,
and ``CreateTurnRequest.collections``. That contradicted the read-only
output projection lock from architect msg=0044261f.

Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field``
property so OpenAPI input/output schemas auto-split:

- ``Collection-Output`` now lists ``vector_backend`` with
  ``readonly: true`` (verified in regenerated
  ``web/src/api-v2/schema.d.ts``).
- ``Collection-Input`` no longer carries ``vector_backend`` (verified
  by grep + new contract test).
- ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` /
  ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``,
  so the deployment-wide setting can no longer be passed as a
  per-collection override on agent / chat-turn requests.

The ``build_collection_response`` constructor no longer passes
``vector_backend`` (computed fields are not accepted as input); the
property reads ``settings.vector_db_type`` lazily on each serialization.

Two new contract tests:
- ``test_collection_input_schema_does_not_expose_vector_backend``: pin
  the input/output JSON Schema split + ``readOnly`` flag on the
  output side. Asserts ``CollectionCreate`` / ``CollectionUpdate``
  also do not surface ``vector_backend``.
- ``test_collection_constructor_ignores_vector_backend_input``:
  defensive — even if a malicious caller stuffs ``vector_backend``
  into a ``model_validate`` payload, Pydantic ignores it and the
  computed property still reflects the deployment setting.

Sediment: cuiwenbo own-up CR miss — implement-time only verified the
``CollectionConfig`` placement (one defense layer) and missed the
``Collection`` self-reuse-as-input second layer. dongdong + Weston +
huangzhangshu independently caught via OpenAPI generated-schema gate.
mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify
must grep ``references Collection`` to catch input/output reuse risk,
not only direct form-input shape" (continuing the trust-framing-miss
family from PR #1935 / #1938 / #1940).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): consolidate vector_backend_capability_matrix imports for ruff

Combine the two from aperag.schema.common import ... statements
into a single block so ruff's import organization rule is satisfied.
No code-behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): apply ruff format to vector_backend test + common.py

Run `uv run ruff format` on ApeRAG/aperag/schema/common.py and
ApeRAG/tests/unit_test/contracts/test_vector_backend_capability_matrix.py
so `make lint` (`ruff format --check`) passes. Pure formatting; no
behavior change. Other unrelated files reverted to keep this PR scope
clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant