chore: remove the vicuna model#87
Closed
iziang wants to merge 0 commit into
Closed
Conversation
earayu
added a commit
that referenced
this pull request
Apr 30, 2026
…field Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f / msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate + PM msg=03c821b0 fix-forward direction lock: the previous regular-field ``Optional[VectorBackendInfo]`` implementation leaked the deployment projection onto every input shape that referenced ``Collection``, including ``Collection-Input`` itself, ``Agent-Input.collections``, and ``CreateTurnRequest.collections``. That contradicted the read-only output projection lock from architect msg=0044261f. Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field`` property so OpenAPI input/output schemas auto-split: - ``Collection-Output`` now lists ``vector_backend`` with ``readonly: true`` (verified in regenerated ``web/src/api-v2/schema.d.ts``). - ``Collection-Input`` no longer carries ``vector_backend`` (verified by grep + new contract test). - ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` / ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``, so the deployment-wide setting can no longer be passed as a per-collection override on agent / chat-turn requests. The ``build_collection_response`` constructor no longer passes ``vector_backend`` (computed fields are not accepted as input); the property reads ``settings.vector_db_type`` lazily on each serialization. Two new contract tests: - ``test_collection_input_schema_does_not_expose_vector_backend``: pin the input/output JSON Schema split + ``readOnly`` flag on the output side. Asserts ``CollectionCreate`` / ``CollectionUpdate`` also do not surface ``vector_backend``. - ``test_collection_constructor_ignores_vector_backend_input``: defensive — even if a malicious caller stuffs ``vector_backend`` into a ``model_validate`` payload, Pydantic ignores it and the computed property still reflects the deployment setting. Sediment: cuiwenbo own-up CR miss — implement-time only verified the ``CollectionConfig`` placement (one defense layer) and missed the ``Collection`` self-reuse-as-input second layer. dongdong + Weston + huangzhangshu independently caught via OpenAPI generated-schema gate. mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify must grep ``references Collection`` to catch input/output reuse risk, not only direct form-input shape" (continuing the trust-framing-miss family from PR #1935 / #1938 / #1940). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
earayu
added a commit
that referenced
this pull request
Apr 30, 2026
… filter Or guard + retrieve defense-in-depth (#1948) * feat(vectorstore): task #61 P1-V vector adapter family — capability + filter Or guard + retrieve defense-in-depth Closes task #83 per PM @不穷 dispatch (msg=29c9e753). Folds 4 P1-V items from task #61 spec v1 § 2.3 into a single PR: P1-V1 — collection init failure contract documentation ------------------------------------------------------ ``ensure_collection`` Protocol docstring now spells out the cross- adapter contract (idempotent / race-safe / fail-loud / cache-not- poisoned-on-failure). Both adapters already implement these behaviours; the documentation closes the spec drift gap so future implementers have a checklist. P1-V2 — batch upsert atomicity capability declaration ----------------------------------------------------- New :class:`VectorBackendCapabilities` frozen dataclass on the base module declares static per-backend behaviour flags. Each ``VectorStoreConnector`` subclass exposes an instance via the ``BACKEND_CAPABILITIES`` class-level attribute: * ``PgvectorVectorStoreConnector.BACKEND_CAPABILITIES.supports_atomic_batch_upsert = True`` (PGVector wraps bulk INSERT ON CONFLICT in ``engine.begin()`` — mid-batch failure rolls back the whole batch). * ``QdrantVectorStoreConnector.BACKEND_CAPABILITIES.supports_atomic_batch_upsert = False`` (Qdrant ``client.upsert(points, wait=True)`` is best-effort per-point — partial writes possible on mid-batch failure). ``upsert`` Protocol docstring now points at the capability flag so callers know to chunk + verify on backends that declare ``False``. P1-V3 — filter Or empty-parts guard ----------------------------------- ``Or.__post_init__`` already rejects empty ``parts`` at DSL construction. Both adapter translators now also guard at the translator boundary so a future refactor that bypasses the constructor (e.g. ``object.__setattr__(or_node, "parts", ())`` on the frozen dataclass, or a ``dataclasses.replace`` with empty parts) can't silently degrade to a vacuous "match everything" disjunction: * ``aperag/vectorstore/pgvector_connector.py:_SqlFilter._walk`` — raises ``UnsupportedFilterError`` on empty post-walk parts. * ``aperag/vectorstore/qdrant_connector.py:_translate_filter`` — raises ``UnsupportedFilterError`` on empty post-prune subs (so ``rest.Filter(should=[])`` — which Qdrant treats as match-all — is unreachable). P1-V4 — Qdrant legacy mode defense-in-depth ------------------------------------------- ``QdrantVectorStoreConnector.retrieve`` now applies the same ``TENANT_PAYLOAD_KEY`` filter in **both** multitenant and legacy modes, but with a backwards-compatible "no payload key → pass through" branch so legacy-only rows that don't carry the payload key keep working: * In multitenant mode: filter is the primary tenant-isolation layer (unchanged behaviour). * In legacy mode: collection-name isolation is the primary layer; the new payload-level filter is belt-and-braces against tooling drift / migration mistakes that could plant a stray foreign-tenant row in a legacy collection. The new ``BACKEND_CAPABILITIES.supports_legacy_mode`` flag declares which adapter supports the legacy layout (PGVector ``False``, Qdrant ``True``) so callers can tell the difference machine- readably. Tests ----- * ``tests/unit_test/vectorstore/test_backend_capabilities.py`` (new) — pins shape + per-flag values for each adapter. Coordinates with cuiwenbo task #87 P1-D3 collection metadata Pydantic projection so the static capability matrix stays consistent across PRs. * ``tests/unit_test/vectorstore/test_pgvector_translator.py`` and ``test_qdrant_filter_translation.py`` — pin the new Or empty-parts guard with frozen-dataclass-bypass coverage. * ``tests/unit_test/vectorstore/test_qdrant_multitenancy_integration.py`` — new ``test_retrieve_legacy_mode_filters_stray_foreign_payload`` exercises the P1-V4 belt-and-braces filter on a real ``:memory:`` Qdrant client: legacy-mode rows without payload key pass through (backward compat), own-tenant payload passes, foreign-tenant payload is dropped. Local: ``uv run pytest tests/unit_test/vectorstore/`` → **156 passed, 10 skipped, 1 warning**. Spec / scope alignment ---------------------- * task #61 spec v1 § 2.3 P1-V1 → ensure_collection contract doc ✅ * task #61 spec v1 § 2.3 P1-V2 → BACKEND_CAPABILITIES.supports_atomic_batch_upsert ✅ * task #61 spec v1 § 2.3 P1-V3 → Or empty-parts guard ✅ * task #61 spec v1 § 2.3 P1-V4 → retrieve defense-in-depth + supports_legacy_mode ✅ * Lesson #14 multi-iteration cleanup — legacy mode flagged via ``supports_legacy_mode`` so a future PR can drop the mode entirely once telemetry confirms zero production usage ✅ * Lesson #17 backend 收敛 contract — capability declaration is the backend-side contract that lets callers (FE / API / MCP) read a single source of truth instead of forking on backend type ✅ Follow-ups (NOT in this PR) --------------------------- * task #84 P1-G1+G2 graph store boundary tests — ziang * task #85 P1-D1 e2e shape matrix — huangzhangshu * task #86 P1-D2 Helm Nebula first-class — Planetegg * task #87 P1-D3 collection metadata vector_backend projection — cuiwenbo + dongdong (consumes ``BACKEND_CAPABILITIES`` values) * task #88 P2-S1+S2 batch alias resolution — Bryce after this PR Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(vectorstore): mode-specific tenant filter on Qdrant retrieve (Weston PR #1948 BLOCKER) Weston PR #1948 architecture CR (msg=910cad66 BLOCKER) caught a real correctness regression in the initial P1-V4 commit: the uniform "no payload key → pass through" branch leaked stray ``{}`` payload rows in the **shared multitenant collection** to every tenant on a ``retrieve(ids=...)`` call. Local Qdrant ``:memory:`` repro (per Weston): a multitenant connector ``tenant_a`` writes a point with ``payload={}`` directly to the shared collection, then ``tenant_a.retrieve([id])`` returns the row. Because ``upsert()`` always stamps the payload key, the only way a missing-key row reaches the shared collection is tooling drift / migration drift — exactly the case P1-V4 defense-in-depth is supposed to catch. Fix --- Mode-specific semantics: * **Multitenant mode** (shared physical collection): STRICT — every row MUST carry ``TENANT_PAYLOAD_KEY`` matching the connector's tenant id. No "no payload key → pass through" branch, because the shared collection means a missing key would expose the row to every tenant. * **Legacy mode** (per-tenant physical collection, unchanged from initial commit): PERMISSIVE — a row that doesn't carry the payload key still passes through (typical pre-multitenant data shape), but a stray foreign-tenant payload gets dropped (catches tooling drift / migration mistakes). Tests ----- ``test_retrieve_multitenant_mode_strict_requires_payload_key`` (new) — Weston's exact repro: seed shared collection with ``{}`` payload + own-tenant payload + foreign-tenant payload, assert only the own-tenant row passes through. The legacy-mode permissive counterpart (``test_retrieve_legacy_mode_filters_stray_foreign_payload``) stays unchanged so a future refactor that unifies them silently re-opens the leak fails fast. Local: ``uv run pytest tests/unit_test/vectorstore/`` → **157 passed, 10 skipped** (one new case). Sediment trigger ---------------- This is Lesson #12 v9 fifth-application demo same family — Weston first-principles repro catches the unified branch as silent leak that I missed when applying the legacy-compat optimization uniformly. The narrower ``mode-specific`` framing matches the spec language ("legacy compat for legacy mode only") more precisely. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu
added a commit
that referenced
this pull request
Apr 30, 2026
…ty matrix (#1949) * feat(collection): task #61 P1-D3 — vector backend identity + capability matrix Project the deployment-wide ``settings.vector_db_type`` onto every collection detail read so the FE can render a "what does this vector backend actually support" panel without per-collection migration or runtime probe. Backend (output-only projection): - ``aperag/schema/common.py``: ``VectorBackendCapabilities`` + ``VectorBackendInfo`` + ``_STATIC_VECTOR_BACKEND_CAPABILITIES`` dict + ``project_vector_backend_info()`` helper. - ``aperag/domains/knowledge_base/schemas.py:Collection``: add ``vector_backend: Optional[VectorBackendInfo]``. **Intentionally NOT on ``CollectionConfig``** so the OpenAPI ``CollectionCreate`` / ``CollectionUpdate`` input shapes do not let callers mistake a deployment-wide setting for a per-collection editable knob (per dongdong msg=c2593fdd + PM msg=caf7e4df + architect msg=0044261f read-only projection lock). - ``aperag/domains/knowledge_base/service/collection_service.py``: populate ``vector_backend`` in ``build_collection_response`` from ``settings.vector_db_type``; ``None`` for unknown backends so the FE can render a placeholder without a hard failure. Cross-PR consistency with task #83 / PR #1948 (Bryce, vector adapter behavior fixes): - Bryce's connector-layer ``BACKEND_CAPABILITIES`` ClassVar declares 2 truth flags (``supports_atomic_batch_upsert`` + ``supports_legacy_mode``); this PR's schema-layer Pydantic model mirrors those values plus a 3rd schema-layer-only flag ``supports_filter_or_with_empty_parts`` which is uniformly False across adapters after task #83 P1-V3 (translator-level defense-in-depth rejects empty Or parts). - The 3rd flag stays in the schema so the FE can declare the uniform reject explicitly per spec § 2.3 P1-D3 「显示『允许差异但显式』」 — Lesson #17 backend 收敛 contract simple-stable family pattern (cite PR #1930 SearchHit normalize, PR #1935 GraphMergeSuggestionItem projection layer). Mechanical gate (per Lesson #18 lesson-sediment + mechanical-gate 双 layer codification — first established by chenyexuan PR #1933 / PR #1941, then PR #1940 ``model_validate`` boundary): 13-case unit suite in ``tests/unit_test/contracts/test_vector_backend_capability_matrix.py`` pins each capability flag, normalizes inputs, and round-trips Pydantic ``model_dump`` so future drift between schema, projection helper, and FE-consumed shape fails fast at unit-test time. FE (read-only display): - ``web/src/features/collection/types.ts``: typed mirrors ``VectorBackendInfo`` / ``VectorBackendCapabilities`` / ``VectorBackendType``. - ``web/src/app/workspace/collections/[collectionId]/settings/collection-vector-backend-card.tsx``: new component that surfaces backend identity + capability matrix in the collection settings page (above the edit form). dongdong picks up rendering polish (responsive + dark mode + final copy) on the same PR per the joint A4-style split (cuiwenbo contract layer + dongdong rendering polish + CR pair). - ``web/src/i18n/{en-US,zh-CN}/page_collections.json``: copy strings. - ``web/src/api-v2/schema.d.ts`` regenerated via ``yarn api:v2:types``. Local verification: - ``uv run --extra test pytest tests/unit_test/contracts/test_vector_backend_capability_matrix.py tests/unit_test/contracts/test_collection_v2_openapi_contract.py -q`` → 23 passed - ``make openapi-check`` → ok - ``yarn type-check --pretty false`` → 0 new errors on this PR's files (pre-existing graph-lab cosmograph + agent-runtime errors unchanged) - ``yarn lint --quiet`` → 0 warnings/errors - ``yarn i18n:check`` → ok - ``git diff --check`` → ok Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(collection): task #87 P1-D3 — convert vector_backend to computed_field Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f / msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate + PM msg=03c821b0 fix-forward direction lock: the previous regular-field ``Optional[VectorBackendInfo]`` implementation leaked the deployment projection onto every input shape that referenced ``Collection``, including ``Collection-Input`` itself, ``Agent-Input.collections``, and ``CreateTurnRequest.collections``. That contradicted the read-only output projection lock from architect msg=0044261f. Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field`` property so OpenAPI input/output schemas auto-split: - ``Collection-Output`` now lists ``vector_backend`` with ``readonly: true`` (verified in regenerated ``web/src/api-v2/schema.d.ts``). - ``Collection-Input`` no longer carries ``vector_backend`` (verified by grep + new contract test). - ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` / ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``, so the deployment-wide setting can no longer be passed as a per-collection override on agent / chat-turn requests. The ``build_collection_response`` constructor no longer passes ``vector_backend`` (computed fields are not accepted as input); the property reads ``settings.vector_db_type`` lazily on each serialization. Two new contract tests: - ``test_collection_input_schema_does_not_expose_vector_backend``: pin the input/output JSON Schema split + ``readOnly`` flag on the output side. Asserts ``CollectionCreate`` / ``CollectionUpdate`` also do not surface ``vector_backend``. - ``test_collection_constructor_ignores_vector_backend_input``: defensive — even if a malicious caller stuffs ``vector_backend`` into a ``model_validate`` payload, Pydantic ignores it and the computed property still reflects the deployment setting. Sediment: cuiwenbo own-up CR miss — implement-time only verified the ``CollectionConfig`` placement (one defense layer) and missed the ``Collection`` self-reuse-as-input second layer. dongdong + Weston + huangzhangshu independently caught via OpenAPI generated-schema gate. mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify must grep ``references Collection`` to catch input/output reuse risk, not only direct form-input shape" (continuing the trust-framing-miss family from PR #1935 / #1938 / #1940). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(test): consolidate vector_backend_capability_matrix imports for ruff Combine the two from aperag.schema.common import ... statements into a single block so ruff's import organization rule is satisfied. No code-behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(test): apply ruff format to vector_backend test + common.py Run `uv run ruff format` on ApeRAG/aperag/schema/common.py and ApeRAG/tests/unit_test/contracts/test_vector_backend_capability_matrix.py so `make lint` (`ruff format --check`) passes. Pure formatting; no behavior change. Other unrelated files reverted to keep this PR scope clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.