Skip to content

feat(vectorstore): cross-adapter filter fail-loud + score normalization (task #61 P0-A + P0-B)#1930

Merged
earayu merged 3 commits into
mainfrom
bryce/task-61-vector-p0v1-v2-v3-v4-fix
Apr 30, 2026
Merged

feat(vectorstore): cross-adapter filter fail-loud + score normalization (task #61 P0-A + P0-B)#1930
earayu merged 3 commits into
mainfrom
bryce/task-61-vector-p0v1-v2-v3-v4-fix

Conversation

@earayu
Copy link
Copy Markdown
Collaborator

@earayu earayu commented Apr 30, 2026

Summary

Closes the two task #61 vector-adapter contract gaps PM @不穷 dispatched to me (msg=a387a81e) and architect @符炫炜 ratified (msg=7646eb4f), per Weston contract-matrix scope (msg=8beffab5) + spec PR #1928 § 2.2 / § 5.3 lock.

  • P0-A: filter fail-loud — qdrant_connector._normalize_filter_input no longer logs warning + returns None (silent unfiltered scan); both adapters now raise the new cross-adapter UnsupportedFilterError (subclass of TypeError → existing except TypeError callers keep working)
  • P0-B: score normalization — SearchHit.score is now uniformly in [0, 1] higher=better across PGVector × Qdrant × cosine/L2/dot, with score_threshold on the same scale (pushdown via inverse + Python belt-and-braces)

P0-V1 (Qdrant legacy retrieve cross-tenant filter) intentionally NOT in scope — first-principles verify (msg=23a2f514) confirmed legacy mode is tenant-isolated by physical collection name (qdrant_connector.py:442-446), not query filter, so no leak. Architect ratified the reframe to P1-V4 defense-in-depth (msg=7646eb4f).

Changes

aperag/vectorstore/base.py

  • New UnsupportedFilterError(TypeError) cross-adapter exception
  • New normalize_score(metric, raw) + inverse denormalize_threshold_to_native
    • cosine: clamp to [0, 1]
    • euclid: 1 / (1 + L2) onto (0, 1]
    • dot: numerically-stable sigmoid onto (0, 1)
    • all monotone → top-k ordering preserved
  • VectorStoreConnector.search() docstring + class-level invariants spell out §5 / §6 contract

aperag/vectorstore/dto.py

  • SearchHit.__post_init__ validates 0.0 <= score <= 1.0 (Pydantic-Field-equivalent for the frozen dataclass)

aperag/vectorstore/pgvector_connector.py

  • _SqlFilter._walk raises UnsupportedFilterError
  • search() pushes threshold down via denormalize_threshold_to_native + applies normalize_score post-query + Python belt-and-braces

aperag/vectorstore/qdrant_connector.py

  • _normalize_filter_input raises UnsupportedFilterError instead of fail-silent return None
  • search() pushes threshold down + applies normalize_score + handles +inf / -inf pushdown edge cases

Tests

  • New tests/unit_test/vectorstore/test_score_normalization.py (33 cases): range invariants × 14 raw scores, ordering preservation × 4 metric pairs, denormalize→normalize roundtrip × 15 (3 metric × 5 thresholds), endpoint behaviour, UnsupportedFilterError isinstance TypeError
  • tests/unit_test/vectorstore/test_pgvector_translator.py + test_qdrant_filter_translation.py updated to assert UnsupportedFilterError while keeping TypeError back-compat assertion
  • tests/integration/compat/test_vector_compat.py extended with 4 new cross-backend cases:
    • test_search_unsupported_filter_raises_unsupported_filter_error
    • test_search_score_in_unit_interval
    • test_search_score_threshold_filters_low_scores
    • test_search_top_k_ranking_higher_is_better

Local: uv run pytest tests/unit_test/vectorstore/ tests/unit_test/indexing/ tests/unit_test/graph_curation/518 passed, 10 skipped.

Test plan

  • Local unit tests pass (vectorstore / indexing / graph_curation)
  • ruff check + ruff format --check clean
  • CI lint-and-unit green
  • Backend-Compat-Test green (PR ci(compat): trigger Backend-Compat-Test on real graph_storage path #1926 merged → compat-test now triggers on aperag/vectorstore/**)
  • @huangheng CR (per msg=7d4fbcf2 + msg=05203d11 4-point checklist: base.py contract Pydantic-equivalent / cross-metric enum coverage / cross-adapter parametrize / legacy follow-up)
  • @符炫炜 ratify (standing per msg=7646eb4f)

Spec / scope alignment

Follow-ups (NOT in this PR)

🤖 Generated with Claude Code

earayu and others added 3 commits April 30, 2026 13:04
…on (task #61 P0-A + P0-B)

Closes the two task #61 vector-adapter contract gaps PM @不穷 dispatched
to me (msg=a387a81e) and architect @符炫炜 ratified (msg=7646eb4f),
collapsed onto a single PR per Weston's contract-matrix scope (msg=8beffab5).

P0-A — filter fail-loud
-----------------------
* Add ``UnsupportedFilterError`` to ``aperag.vectorstore.base`` as a
  cross-adapter exception type. Subclasses ``TypeError`` so existing
  ``except TypeError`` callers (pgvector translator pre-this-PR) keep
  working unchanged.
* Qdrant ``_normalize_filter_input`` now raises instead of logging a
  warning + ``return None``. The previous behaviour silently dropped
  the filter and degraded the search into a tenant-wide unfiltered scan
  — a correctness footgun, not graceful degradation.
* Pgvector ``_SqlFilter._walk`` re-types its raise to the same exception
  so both backends fail the same way on the same input.

P0-B — score normalization onto [0, 1] with higher = better
-----------------------------------------------------------
* Add ``normalize_score(metric, raw)`` and inverse
  ``denormalize_threshold_to_native(metric, normalized)`` to
  ``aperag.vectorstore.base``. Cosine clamps to [0, 1]; euclid maps
  ``-L2`` via ``1/(1+L2)`` onto (0, 1]; dot uses a numerically-stable
  sigmoid onto (0, 1). All three transforms are monotone so top-k
  ordering is preserved versus the raw form.
* Both adapters apply ``normalize_score`` before constructing
  ``SearchHit`` and use ``denormalize_threshold_to_native`` to push
  ``QueryRequest.score_threshold`` down to the native query (SQL
  ``WHERE score >= …`` / Qdrant ``score_threshold=``) so the server-
  side cutoff is exactly equivalent to a Python post-filter on the
  normalized score. A belt-and-braces post-filter catches any inverse-
  roundoff drift so the [0, 1] contract holds exactly.
* ``SearchHit.__post_init__`` now validates ``0.0 <= score <= 1.0`` so
  any future direct-build path that bypassed an adapter's normalization
  surfaces at the DTO boundary instead of polluting downstream
  score-threshold logic.
* ``base.VectorStoreConnector`` docstring + ``search()`` contract
  updated to spell out the §5/§6 invariants.

Tests
-----
* New ``tests/unit_test/vectorstore/test_score_normalization.py``: range
  invariants per metric, ordering preservation, denormalize→normalize
  roundtrip on (0, 1), endpoint behaviour (-inf / +inf clamps for
  pushdown), and ``UnsupportedFilterError isinstance TypeError``.
* Existing translator unit tests updated to assert the cross-adapter
  exception type while still asserting ``TypeError`` for back-compat.
* ``tests/integration/compat/test_vector_compat.py`` adds three new
  cross-backend cases (filter fail-loud, score in [0, 1], threshold
  direction, top-k ranking monotone) so the contract is pinned across
  PGVector × Qdrant under compat-test, not just per-adapter.

Per spec PR #1928 § 2.2 / § 5.3, follow-up boundary test sub-PR by
@huangheng will extend the parametrize fixture to cover the full
PGVector × Qdrant × {cosine, euclid, dot} 6-cell grid; this PR ships
the cosine cell (the only metric currently exercised by the compat
fixture) plus the per-metric unit tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…angheng NIT 1)

huangheng PR #1930 line-level CR (msg=5eb7315c) NIT 1 fold-in: caller
chain audit surfaced that all three in-tree default thresholds
(``DEFAULT_VECTOR_SCORE_THRESHOLD = 0.72`` × 2 + retrieval ``score_threshold = 0.5``)
were tuned on cosine-distance embeddings. After P0-B normalization the
[0, 1] number is directly comparable across adapters but the *intent*
is still cosine-grade strictness — collections that pick ``euclid`` or
``dot`` distance may want to override.

This commit only adds explanatory docstrings; no behaviour change.
The metric-aware default refactor (Lesson #12 v7.3 cross-PR default
value alignment family) stays as a follow-up sub-PR per huangheng's
non-blocker NIT framing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…n BLOCKER)

Weston msg=86e05a8e caught a real bug in PR #1930's P0-B implementation:
``normalize_score("euclid", raw)`` assumes the canonical "negative L2,
higher=better" raw form (which pgvector's ``_score_expr = -(<->)``
produces directly), but Qdrant returns positive L2 distance natively
(smaller=better). Result: every Qdrant euclid hit was clamped to L2=0
→ score=1.0, and a tight ``score_threshold=0.9`` returned an empty
list because the inverse threshold was a negative number that Qdrant
re-interpreted as a positive-L2 *upper* bound (vacuous).

Per architect msg=06902347 + huangheng msg=99b52499, fix-forward Option
A: keep the shared ``normalize_score`` / ``denormalize_threshold_to_native``
helpers' contract (input is canonical "higher-is-better raw", output is
[0, 1]) and convert at the Qdrant adapter boundary for the asymmetric
metric. Cosine + dot agree on convention across both backends so they
need no boundary work; only euclid is asymmetric.

Changes
-------
* ``aperag/vectorstore/qdrant_connector.py``:
  * ``search()`` now negates ``p.score`` before calling
    ``normalize_score`` when the metric is euclid.
  * Threshold pushdown: when the metric is euclid, the helper-returned
    "negative L2" gets flipped back to a "positive L2 upper bound"
    before passing to Qdrant's native ``score_threshold``. Pre-existing
    ``+inf`` (return empty) / ``-inf`` (omit threshold) edge cases stay
    intact.
* ``aperag/vectorstore/base.py``: docstring for the score-normalization
  block now documents the canonical "higher-is-better raw" convention
  the helpers operate on, calls out the Qdrant euclid asymmetry
  explicitly, and pins the responsibility on adapters (math-only helper,
  adapters do raw → canonical conversion).

Tests (Weston requested cross-metric Qdrant-native verify)
----------------------------------------------------------
``tests/unit_test/vectorstore/test_score_normalization.py`` adds four
end-to-end Qdrant ``:memory:`` regressions:

* ``test_qdrant_euclid_normalized_scores_strictly_decreasing_with_distance``
  — pins Weston's exact failure mode: near/mid/far must produce
  strictly decreasing normalized scores.
* ``test_qdrant_euclid_score_threshold_filters_far_keeps_near`` — pins
  the threshold-pushdown direction: ``score_threshold=0.9`` must keep
  the L2=0 near point and drop the L2=3 far point.
* ``test_qdrant_dot_normalized_scores_strictly_increasing_with_inner_product``
  — explicit pin that dot is *not* asymmetric and a future refactor
  must not negate it accidentally.
* ``test_qdrant_cosine_normalized_scores_strictly_increasing_with_similarity``
  — completes the per-metric Qdrant pin so all three native conventions
  are documented next to each other.

Local ``uv run pytest tests/unit_test/vectorstore/`` → 146 passed, 10
skipped, 1 warning. Existing PGVector + cosine compat tests unchanged.

Sediment fold-in candidates per huangheng msg=99b52499:
* Lesson #12 v9 second-application demo (Weston msg=86e05a8e + Bryce
  msg=23a2f514, double-source) — first-principles verify catches
  surface-signal mistakes
* Lesson #12 v7 extension candidate — external API contract verify
  (Qdrant ``p.score`` raw convention vs in-tree docstring assumption)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@earayu earayu merged commit 7ab474b into main Apr 30, 2026
15 checks passed
@earayu earayu deleted the bryce/task-61-vector-p0v1-v2-v3-v4-fix branch April 30, 2026 05:27
earayu added a commit that referenced this pull request Apr 30, 2026
…#1932)

§ 四 加 8 lesson sediment(task #30 B3 + task #61 全 P0 闭环累计实证)+ § 六 sediment 引用追加 6 PR commit cross-link + § 八 修订记录追加本 PR fold trail。

新增 lesson:
- Lesson #12 v7.4: external API raw contract verify (task #61 P0-B PR #1930
  Qdrant euclid raw direction first-application + fix-forward 1e30a00)
- Lesson #12 v8 second-application: test docstring fake guardrail (task #61
  P0-G1 PR #1927 description_parts assertion 缺位 fix-forward 1953933)
- Lesson #12 v9: first-principles verify catch surface signal mistakes
  (task #61 P0-V1 重新定性 Bryce + task #61 P0-B Qdrant euclid Weston catch
  双独立 source 同源 first/second-application)
- Lesson #13 v2.3: deploy manifest dual-side rewrite (task #61 P0-D1 PR #1929
  Helm Neo4j worker env first-application)
- Lesson #13 v3 application demo 2: cross-source default value alignment
  (task #30 B3 PR #1925 commit dae43f5 三 source 同步 first-application)
- Lesson #14 application demo: spec 内部 default 漂浮 multi-iteration cleanup
  (task #30 B3 PR #1925 fix-forward dae43f5 § 3.1.1 line 85 cleanup
  second-application demo, first-application 在 task #35 6 轮 fix-forward)
- Lesson #16: CI workflow paths filter dead reference 反 pattern (task #61
  P0-W1 PR #1926 first-application demo + Lesson #15 file-move 3-step verify
  升级到 v2 4-step grep .github/workflows/*.yml paths 同步)
- Lesson #17: backend 收敛 contract 优于上层 fork (simple-stable + private-deploy
  paramount directive earayu2 msg=1224bec8 在 cross-adapter contract 设计时
  应用; task #69 P0-B + task #70 P1 候选 1 cross-PR 一次性收敛 first-application)

跨 PR 多独立 source 同源 catch trail:
- Lesson #12 v9: Bryce msg=23a2f514 + Weston msg=86e05a8e 双独立 source
- Lesson #16: chenyexuan msg=f298011e + 冬柏 msg=3e93bb64 双独立 source
- Lesson #17: cuiwenbo msg=cedc7703 + Bryce msg=9895a148 双独立 source
- Lesson #13 v3 application demo 2: huangheng msg=bf785b12 + Planetegg
  msg=c63acbf5 + Weston msg=1e6b0838 三独立 source

per architect msg=c4cdf634 + msg=daaeeab5 + msg=03c892e0 sediment dispatch.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026
…CCEPTED legacy semantic lock

8/8 lane final pass converge state machine NIT (per dongdong msg=e7d7600a + Weston msg=013fdc47/14859580 + ziang msg=378455ad/c2228ba1 + dongdong msg=ceca6063 + cuiwenbo msg=49beb855 集体 converge):

架构师拍板 Option B (4 new enum + APPLY_PENDING new + ACCEPTED legacy read-only):

实证根据: aperag/graph_curation/service.py:534 sync handle_action() 末尾 suggestion.status = GraphCurationSuggestionStatus.ACCEPTED (merge sync 已执行完成后写入), ACCEPTED 在历史代码中是 terminal status = merge 已执行完成 semantic. 新 async path 复用会造成同名不同义 -> 引入新 enum value apply_pending 表示新 async 决策态.

Final 状态机: pending -> dismissed | rejected | apply_pending -> applying -> applied | apply_failed

修订点:
- §3.1.6 表格 accepted 行 -> apply_pending 行 + legacy ACCEPTED semantic note 段落
- §3.1.6 enum 列表 -> 5 新加 dismissed/apply_pending/applying/applied/apply_failed
- §3.1.2 action contract pending->accepted->applying->applied|apply_failed -> pending->apply_pending->applying->applied|apply_failed + UI typed schema 7 active state + legacy accepted read-only display
- §4 #31-A2 status enum extend -> 4 新 value APPLY_PENDING/APPLYING/APPLIED/APPLY_FAILED + ACCEPTED 保留 legacy + service.py:534 实证 cite
- §5.1 + §5.2.b 4 新 value + legacy ACCEPTED zero-write grep gate (新 async path 模块 import allowlist 不含 GraphCurationSuggestionStatus.ACCEPTED 写入)
- 全文 "新增 4 value APPLY_PENDING/APPLYING/APPLIED/APPLY_FAILED" 替换 (line 70/119/133/237/297/328 6 处统一)

cuiwenbo NIT 3 fold: §5.2.b 加 Pydantic Field validator on MergeSuggestionView.confidence_score 锁 [0,1] (per Lesson #17 backend 收敛 contract + PR #1930 SearchHit.score 同 pattern)

8/8 lane final pass: huangheng + huangzhangshu + Weston + dongdong + Bryce + Planetegg + ziang + cuiwenbo

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026
…ty matrix (#1949)

* feat(collection): task #61 P1-D3 — vector backend identity + capability matrix

Project the deployment-wide ``settings.vector_db_type`` onto every
collection detail read so the FE can render a "what does this vector
backend actually support" panel without per-collection migration or
runtime probe.

Backend (output-only projection):
- ``aperag/schema/common.py``: ``VectorBackendCapabilities`` +
  ``VectorBackendInfo`` + ``_STATIC_VECTOR_BACKEND_CAPABILITIES`` dict +
  ``project_vector_backend_info()`` helper.
- ``aperag/domains/knowledge_base/schemas.py:Collection``: add
  ``vector_backend: Optional[VectorBackendInfo]``. **Intentionally NOT
  on ``CollectionConfig``** so the OpenAPI ``CollectionCreate`` /
  ``CollectionUpdate`` input shapes do not let callers mistake a
  deployment-wide setting for a per-collection editable knob (per
  dongdong msg=c2593fdd + PM msg=caf7e4df + architect msg=0044261f
  read-only projection lock).
- ``aperag/domains/knowledge_base/service/collection_service.py``:
  populate ``vector_backend`` in ``build_collection_response`` from
  ``settings.vector_db_type``; ``None`` for unknown backends so the FE
  can render a placeholder without a hard failure.

Cross-PR consistency with task #83 / PR #1948 (Bryce, vector adapter
behavior fixes):
- Bryce's connector-layer ``BACKEND_CAPABILITIES`` ClassVar declares 2
  truth flags (``supports_atomic_batch_upsert`` +
  ``supports_legacy_mode``); this PR's schema-layer Pydantic model
  mirrors those values plus a 3rd schema-layer-only flag
  ``supports_filter_or_with_empty_parts`` which is uniformly False
  across adapters after task #83 P1-V3 (translator-level
  defense-in-depth rejects empty Or parts).
- The 3rd flag stays in the schema so the FE can declare the uniform
  reject explicitly per spec § 2.3 P1-D3 「显示『允许差异但显式』」 —
  Lesson #17 backend 收敛 contract simple-stable family pattern (cite
  PR #1930 SearchHit normalize, PR #1935 GraphMergeSuggestionItem
  projection layer).

Mechanical gate (per Lesson #18 lesson-sediment + mechanical-gate 双
layer codification — first established by chenyexuan PR #1933 / PR
#1941, then PR #1940 ``model_validate`` boundary): 13-case unit suite
in ``tests/unit_test/contracts/test_vector_backend_capability_matrix.py``
pins each capability flag, normalizes inputs, and round-trips Pydantic
``model_dump`` so future drift between schema, projection helper, and
FE-consumed shape fails fast at unit-test time.

FE (read-only display):
- ``web/src/features/collection/types.ts``: typed mirrors
  ``VectorBackendInfo`` / ``VectorBackendCapabilities`` /
  ``VectorBackendType``.
- ``web/src/app/workspace/collections/[collectionId]/settings/collection-vector-backend-card.tsx``:
  new component that surfaces backend identity + capability matrix in
  the collection settings page (above the edit form). dongdong picks
  up rendering polish (responsive + dark mode + final copy) on the
  same PR per the joint A4-style split (cuiwenbo contract layer +
  dongdong rendering polish + CR pair).
- ``web/src/i18n/{en-US,zh-CN}/page_collections.json``: copy strings.
- ``web/src/api-v2/schema.d.ts`` regenerated via ``yarn api:v2:types``.

Local verification:
- ``uv run --extra test pytest tests/unit_test/contracts/test_vector_backend_capability_matrix.py tests/unit_test/contracts/test_collection_v2_openapi_contract.py -q`` → 23 passed
- ``make openapi-check`` → ok
- ``yarn type-check --pretty false`` → 0 new errors on this PR's files
  (pre-existing graph-lab cosmograph + agent-runtime errors unchanged)
- ``yarn lint --quiet`` → 0 warnings/errors
- ``yarn i18n:check`` → ok
- ``git diff --check`` → ok

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(collection): task #87 P1-D3 — convert vector_backend to computed_field

Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f /
msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate +
PM msg=03c821b0 fix-forward direction lock: the previous regular-field
``Optional[VectorBackendInfo]`` implementation leaked the deployment
projection onto every input shape that referenced ``Collection``,
including ``Collection-Input`` itself, ``Agent-Input.collections``,
and ``CreateTurnRequest.collections``. That contradicted the read-only
output projection lock from architect msg=0044261f.

Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field``
property so OpenAPI input/output schemas auto-split:

- ``Collection-Output`` now lists ``vector_backend`` with
  ``readonly: true`` (verified in regenerated
  ``web/src/api-v2/schema.d.ts``).
- ``Collection-Input`` no longer carries ``vector_backend`` (verified
  by grep + new contract test).
- ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` /
  ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``,
  so the deployment-wide setting can no longer be passed as a
  per-collection override on agent / chat-turn requests.

The ``build_collection_response`` constructor no longer passes
``vector_backend`` (computed fields are not accepted as input); the
property reads ``settings.vector_db_type`` lazily on each serialization.

Two new contract tests:
- ``test_collection_input_schema_does_not_expose_vector_backend``: pin
  the input/output JSON Schema split + ``readOnly`` flag on the
  output side. Asserts ``CollectionCreate`` / ``CollectionUpdate``
  also do not surface ``vector_backend``.
- ``test_collection_constructor_ignores_vector_backend_input``:
  defensive — even if a malicious caller stuffs ``vector_backend``
  into a ``model_validate`` payload, Pydantic ignores it and the
  computed property still reflects the deployment setting.

Sediment: cuiwenbo own-up CR miss — implement-time only verified the
``CollectionConfig`` placement (one defense layer) and missed the
``Collection`` self-reuse-as-input second layer. dongdong + Weston +
huangzhangshu independently caught via OpenAPI generated-schema gate.
mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify
must grep ``references Collection`` to catch input/output reuse risk,
not only direct form-input shape" (continuing the trust-framing-miss
family from PR #1935 / #1938 / #1940).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): consolidate vector_backend_capability_matrix imports for ruff

Combine the two from aperag.schema.common import ... statements
into a single block so ruff's import organization rule is satisfied.
No code-behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): apply ruff format to vector_backend test + common.py

Run `uv run ruff format` on ApeRAG/aperag/schema/common.py and
ApeRAG/tests/unit_test/contracts/test_vector_backend_capability_matrix.py
so `make lint` (`ruff format --check`) passes. Pure formatting; no
behavior change. Other unrelated files reverted to keep this PR scope
clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant