chore: remove the vicuna model by iziang · Pull Request #87 · apecloud/ApeRAG

iziang · 2023-08-03T05:25:00Z

No description provided.


… filter Or guard + retrieve defense-in-depth (#1948)

* feat(vectorstore): task #61 P1-V vector adapter family — capability + filter Or guard + retrieve defense-in-depth

Closes task #83 per PM @不穷 dispatch (msg=29c9e753). Folds 4 P1-V items from task #61 spec v1 § 2.3 into a single PR:

P1-V1 — collection init failure contract documentation
------------------------------------------------------
``ensure_collection`` Protocol docstring now spells out the cross-adapter contract (idempotent / race-safe / fail-loud / cache-not-poisoned-on-failure). Both adapters already implement these behaviours; the documentation closes the spec drift gap so future implementers have a checklist.

P1-V2 — batch upsert atomicity capability declaration
-----------------------------------------------------
New :class:`VectorBackendCapabilities` frozen dataclass on the base module declares static per-backend behaviour flags. Each ``VectorStoreConnector`` subclass exposes an instance via the ``BACKEND_CAPABILITIES`` class-level attribute:

* ``PgvectorVectorStoreConnector.BACKEND_CAPABILITIES.supports_atomic_batch_upsert = True`` (PGVector wraps bulk INSERT ON CONFLICT in ``engine.begin()``, so a mid-batch failure rolls back the whole batch).
* ``QdrantVectorStoreConnector.BACKEND_CAPABILITIES.supports_atomic_batch_upsert = False`` (Qdrant ``client.upsert(points, wait=True)`` is best-effort per-point, so partial writes are possible on mid-batch failure).

``upsert`` Protocol docstring now points at the capability flag so callers know to chunk + verify on backends that declare ``False``.

P1-V3 — filter Or empty-parts guard
-----------------------------------
``Or.__post_init__`` already rejects empty ``parts`` at DSL construction. Both adapter translators now also guard at the translator boundary so a future refactor that bypasses the constructor (e.g. ``object.__setattr__(or_node, "parts", ())`` on the frozen dataclass, or a ``dataclasses.replace`` with empty parts) can't silently degrade to a vacuous "match everything" disjunction:

* ``aperag/vectorstore/pgvector_connector.py:_SqlFilter._walk`` — raises ``UnsupportedFilterError`` on empty post-walk parts.
* ``aperag/vectorstore/qdrant_connector.py:_translate_filter`` — raises ``UnsupportedFilterError`` on empty post-prune subs (so ``rest.Filter(should=[])``, which Qdrant treats as match-all, is unreachable).

P1-V4 — Qdrant legacy mode defense-in-depth
-------------------------------------------
``QdrantVectorStoreConnector.retrieve`` now applies the same ``TENANT_PAYLOAD_KEY`` filter in **both** multitenant and legacy modes, but with a backwards-compatible "no payload key → pass through" branch so legacy-only rows that don't carry the payload key keep working:

* In multitenant mode: filter is the primary tenant-isolation layer (unchanged behaviour).
* In legacy mode: collection-name isolation is the primary layer; the new payload-level filter is belt-and-braces against tooling drift / migration mistakes that could plant a stray foreign-tenant row in a legacy collection.

The new ``BACKEND_CAPABILITIES.supports_legacy_mode`` flag declares which adapter supports the legacy layout (PGVector ``False``, Qdrant ``True``) so callers can tell the difference machine-readably.

Tests
-----
* ``tests/unit_test/vectorstore/test_backend_capabilities.py`` (new) — pins shape + per-flag values for each adapter. Coordinates with cuiwenbo task #87 P1-D3 collection metadata Pydantic projection so the static capability matrix stays consistent across PRs.
* ``tests/unit_test/vectorstore/test_pgvector_translator.py`` and ``test_qdrant_filter_translation.py`` — pin the new Or empty-parts guard with frozen-dataclass-bypass coverage.
* ``tests/unit_test/vectorstore/test_qdrant_multitenancy_integration.py`` — new ``test_retrieve_legacy_mode_filters_stray_foreign_payload`` exercises the P1-V4 belt-and-braces filter on a real ``:memory:`` Qdrant client: legacy-mode rows without payload key pass through (backward compat), own-tenant payload passes, foreign-tenant payload is dropped.

Local: ``uv run pytest tests/unit_test/vectorstore/`` → **156 passed, 10 skipped, 1 warning**.

Spec / scope alignment
----------------------
* task #61 spec v1 § 2.3 P1-V1 → ensure_collection contract doc ✅
* task #61 spec v1 § 2.3 P1-V2 → BACKEND_CAPABILITIES.supports_atomic_batch_upsert ✅
* task #61 spec v1 § 2.3 P1-V3 → Or empty-parts guard ✅
* task #61 spec v1 § 2.3 P1-V4 → retrieve defense-in-depth + supports_legacy_mode ✅
* Lesson #14 multi-iteration cleanup — legacy mode flagged via ``supports_legacy_mode`` so a future PR can drop the mode entirely once telemetry confirms zero production usage ✅
* Lesson #17 backend convergence contract — capability declaration is the backend-side contract that lets callers (FE / API / MCP) read a single source of truth instead of forking on backend type ✅

Follow-ups (NOT in this PR)
---------------------------
* task #84 P1-G1+G2 graph store boundary tests — ziang
* task #85 P1-D1 e2e shape matrix — huangzhangshu
* task #86 P1-D2 Helm Nebula first-class — Planetegg
* task #87 P1-D3 collection metadata vector_backend projection — cuiwenbo + dongdong (consumes ``BACKEND_CAPABILITIES`` values)
* task #88 P2-S1+S2 batch alias resolution — Bryce after this PR

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(vectorstore): mode-specific tenant filter on Qdrant retrieve (Weston PR #1948 BLOCKER)

Weston PR #1948 architecture CR (msg=910cad66 BLOCKER) caught a real correctness regression in the initial P1-V4 commit: the uniform "no payload key → pass through" branch leaked stray ``{}`` payload rows in the **shared multitenant collection** to every tenant on a ``retrieve(ids=...)`` call.

Local Qdrant ``:memory:`` repro (per Weston): a multitenant connector ``tenant_a`` writes a point with ``payload={}`` directly to the shared collection, then ``tenant_a.retrieve([id])`` returns the row. Because ``upsert()`` always stamps the payload key, the only way a missing-key row reaches the shared collection is tooling drift / migration drift, which is exactly the case P1-V4 defense-in-depth is supposed to catch.

Fix
---
Mode-specific semantics:

* **Multitenant mode** (shared physical collection): STRICT — every row MUST carry ``TENANT_PAYLOAD_KEY`` matching the connector's tenant id. No "no payload key → pass through" branch, because the shared collection means a missing key would expose the row to every tenant.
* **Legacy mode** (per-tenant physical collection, unchanged from initial commit): PERMISSIVE — a row that doesn't carry the payload key still passes through (typical pre-multitenant data shape), but a stray foreign-tenant payload gets dropped (catches tooling drift / migration mistakes).

Tests
-----
``test_retrieve_multitenant_mode_strict_requires_payload_key`` (new) — Weston's exact repro: seed shared collection with ``{}`` payload + own-tenant payload + foreign-tenant payload, assert only the own-tenant row passes through. The legacy-mode permissive counterpart (``test_retrieve_legacy_mode_filters_stray_foreign_payload``) stays unchanged, so a future refactor that unifies the two branches and thereby silently re-opens the leak fails fast.

Local: ``uv run pytest tests/unit_test/vectorstore/`` → **157 passed, 10 skipped** (one new case).

Sediment trigger
----------------
This is the fifth application of Lesson #12 v9 (same family): Weston's first-principles repro caught the unified branch as a silent leak, which I missed when applying the legacy-compat optimization uniformly. The narrower ``mode-specific`` framing matches the spec language ("legacy compat for legacy mode only") more precisely.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
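The mode-specific tenant filter described in the fix commit above can be sketched as a small predicate. This is a minimal illustration, not the connector's actual code: the function name ``row_visible``, its signature, and the value ``"tenant_id"`` for ``TENANT_PAYLOAD_KEY`` are assumptions made for the example (the commit names the constant but not its value).

```python
from typing import Any, Mapping, Optional

# Assumed value: the commit message names TENANT_PAYLOAD_KEY but does not show it.
TENANT_PAYLOAD_KEY = "tenant_id"


def row_visible(
    payload: Optional[Mapping[str, Any]],
    tenant_id: str,
    multitenant: bool,
) -> bool:
    """Decide whether a retrieved row may be returned to `tenant_id`.

    Hypothetical helper mirroring the semantics in the commit message:
    strict in multitenant mode, permissive in legacy mode.
    """
    payload = payload or {}
    if TENANT_PAYLOAD_KEY not in payload:
        # Missing key: dropped in the shared collection (strict),
        # passed through in a per-tenant legacy collection (permissive).
        return not multitenant
    # Key present: in both modes only the connector's own tenant passes.
    return payload[TENANT_PAYLOAD_KEY] == tenant_id


# The three payload shapes from the repro: empty, own-tenant, foreign-tenant.
rows = [{}, {TENANT_PAYLOAD_KEY: "tenant_a"}, {TENANT_PAYLOAD_KEY: "tenant_b"}]
multitenant_visible = [row_visible(r, "tenant_a", multitenant=True) for r in rows]
legacy_visible = [row_visible(r, "tenant_a", multitenant=False) for r in rows]
```

Keeping the two modes in one predicate with an explicit ``multitenant`` argument makes the asymmetry visible: only the missing-key branch differs between modes, which is the branch the initial commit applied uniformly.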

…ty matrix (#1949)

* feat(collection): task #61 P1-D3 — vector backend identity + capability matrix

Project the deployment-wide ``settings.vector_db_type`` onto every collection detail read so the FE can render a "what does this vector backend actually support" panel without per-collection migration or runtime probe.

Backend (output-only projection):
- ``aperag/schema/common.py``: ``VectorBackendCapabilities`` + ``VectorBackendInfo`` + ``_STATIC_VECTOR_BACKEND_CAPABILITIES`` dict + ``project_vector_backend_info()`` helper.
- ``aperag/domains/knowledge_base/schemas.py:Collection``: add ``vector_backend: Optional[VectorBackendInfo]``. **Intentionally NOT on ``CollectionConfig``** so the OpenAPI ``CollectionCreate`` / ``CollectionUpdate`` input shapes do not let callers mistake a deployment-wide setting for a per-collection editable knob (per dongdong msg=c2593fdd + PM msg=caf7e4df + architect msg=0044261f read-only projection lock).
- ``aperag/domains/knowledge_base/service/collection_service.py``: populate ``vector_backend`` in ``build_collection_response`` from ``settings.vector_db_type``; ``None`` for unknown backends so the FE can render a placeholder without a hard failure.

Cross-PR consistency with task #83 / PR #1948 (Bryce, vector adapter behavior fixes):
- Bryce's connector-layer ``BACKEND_CAPABILITIES`` ClassVar declares 2 truth flags (``supports_atomic_batch_upsert`` + ``supports_legacy_mode``); this PR's schema-layer Pydantic model mirrors those values plus a 3rd schema-layer-only flag ``supports_filter_or_with_empty_parts``, which is uniformly False across adapters after task #83 P1-V3 (translator-level defense-in-depth rejects empty Or parts).
- The 3rd flag stays in the schema so the FE can declare the uniform reject explicitly per spec § 2.3 P1-D3 "show 'differences allowed, but explicit'" — Lesson #17 backend convergence contract, simple-stable family pattern (cite PR #1930 SearchHit normalize, PR #1935 GraphMergeSuggestionItem projection layer).

Mechanical gate (per Lesson #18 lesson-sediment + mechanical-gate dual-layer codification, first established by chenyexuan PR #1933 / PR #1941, then PR #1940 ``model_validate`` boundary): 13-case unit suite in ``tests/unit_test/contracts/test_vector_backend_capability_matrix.py`` pins each capability flag, normalizes inputs, and round-trips Pydantic ``model_dump`` so future drift between schema, projection helper, and FE-consumed shape fails fast at unit-test time.

FE (read-only display):
- ``web/src/features/collection/types.ts``: typed mirrors ``VectorBackendInfo`` / ``VectorBackendCapabilities`` / ``VectorBackendType``.
- ``web/src/app/workspace/collections/[collectionId]/settings/collection-vector-backend-card.tsx``: new component that surfaces backend identity + capability matrix in the collection settings page (above the edit form). dongdong picks up rendering polish (responsive + dark mode + final copy) on the same PR per the joint A4-style split (cuiwenbo contract layer + dongdong rendering polish + CR pair).
- ``web/src/i18n/{en-US,zh-CN}/page_collections.json``: copy strings.
- ``web/src/api-v2/schema.d.ts`` regenerated via ``yarn api:v2:types``.

Local verification:
- ``uv run --extra test pytest tests/unit_test/contracts/test_vector_backend_capability_matrix.py tests/unit_test/contracts/test_collection_v2_openapi_contract.py -q`` → 23 passed
- ``make openapi-check`` → ok
- ``yarn type-check --pretty false`` → 0 new errors on this PR's files (pre-existing graph-lab cosmograph + agent-runtime errors unchanged)
- ``yarn lint --quiet`` → 0 warnings/errors
- ``yarn i18n:check`` → ok
- ``git diff --check`` → ok

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(collection): task #87 P1-D3 — convert vector_backend to computed_field

Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f / msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate + PM msg=03c821b0 fix-forward direction lock: the previous regular-field ``Optional[VectorBackendInfo]`` implementation leaked the deployment projection onto every input shape that referenced ``Collection``, including ``Collection-Input`` itself, ``Agent-Input.collections``, and ``CreateTurnRequest.collections``. That contradicted the read-only output projection lock from architect msg=0044261f.

Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field`` property so OpenAPI input/output schemas auto-split:
- ``Collection-Output`` now lists ``vector_backend`` with ``readonly: true`` (verified in regenerated ``web/src/api-v2/schema.d.ts``).
- ``Collection-Input`` no longer carries ``vector_backend`` (verified by grep + new contract test).
- ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` / ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``, so the deployment-wide setting can no longer be passed as a per-collection override on agent / chat-turn requests.

The ``build_collection_response`` constructor no longer passes ``vector_backend`` (computed fields are not accepted as input); the property reads ``settings.vector_db_type`` lazily on each serialization.

Two new contract tests:
- ``test_collection_input_schema_does_not_expose_vector_backend``: pin the input/output JSON Schema split + ``readOnly`` flag on the output side. Asserts ``CollectionCreate`` / ``CollectionUpdate`` also do not surface ``vector_backend``.
- ``test_collection_constructor_ignores_vector_backend_input``: defensive — even if a malicious caller stuffs ``vector_backend`` into a ``model_validate`` payload, Pydantic ignores it and the computed property still reflects the deployment setting.

Sediment: cuiwenbo own-up CR miss — the implement-time check only verified the ``CollectionConfig`` placement (one defense layer) and missed the second layer, where ``Collection`` itself is reused as an input shape. dongdong + Weston + huangzhangshu independently caught it via the OpenAPI generated-schema gate. mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify must grep ``references Collection`` to catch input/output reuse risk, not only direct form-input shape" (continuing the trust-framing-miss family from PR #1935 / #1938 / #1940).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): consolidate vector_backend_capability_matrix imports for ruff

Combine the two ``from aperag.schema.common import ...`` statements into a single block so ruff's import organization rule is satisfied. No code-behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): apply ruff format to vector_backend test + common.py

Run `uv run ruff format` on ApeRAG/aperag/schema/common.py and ApeRAG/tests/unit_test/contracts/test_vector_backend_capability_matrix.py so `make lint` (`ruff format --check`) passes. Pure formatting; no behavior change. Other unrelated files reverted to keep this PR scope clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
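The static capability projection described in the first commit above (flag values per backend, ``None`` for unknown backends) might look roughly like the sketch below. It uses plain frozen dataclasses rather than the Pydantic models the PR describes, so that it runs with the standard library alone. The backend key strings ``"pgvector"`` and ``"qdrant"``, the field name ``type``, and the strip/lower-case normalization are assumptions; only the flag values and the ``None`` fallback come from the commit messages.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class VectorBackendCapabilities:
    supports_atomic_batch_upsert: bool
    supports_legacy_mode: bool
    # Uniformly False per the commit message: both translators reject empty Or parts.
    supports_filter_or_with_empty_parts: bool = False


@dataclass(frozen=True)
class VectorBackendInfo:
    type: str  # assumed field name for the backend identity
    capabilities: VectorBackendCapabilities


# Flag values as stated in the commit messages; the dict keys are assumed spellings.
_STATIC_VECTOR_BACKEND_CAPABILITIES: Dict[str, VectorBackendCapabilities] = {
    "pgvector": VectorBackendCapabilities(
        supports_atomic_batch_upsert=True, supports_legacy_mode=False
    ),
    "qdrant": VectorBackendCapabilities(
        supports_atomic_batch_upsert=False, supports_legacy_mode=True
    ),
}


def project_vector_backend_info(
    vector_db_type: Optional[str],
) -> Optional[VectorBackendInfo]:
    """Map a deployment-wide backend setting to its static capability record.

    Returns None for unknown or missing backends so a caller can render
    a placeholder instead of failing.
    """
    key = (vector_db_type or "").strip().lower()  # assumed normalization
    caps = _STATIC_VECTOR_BACKEND_CAPABILITIES.get(key)
    if caps is None:
        return None
    return VectorBackendInfo(type=key, capabilities=caps)


qdrant_info = project_vector_backend_info(" Qdrant ")
unknown_info = project_vector_backend_info("some-other-backend")
```

Because the table is static and keyed by backend type, the projection needs no per-collection state, which matches the commit's point that the value is a deployment-wide setting rather than a per-collection knob.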

apecloud-bot added the size/XS label (denotes a PR that changes 0-9 lines) Aug 3, 2023

iziang closed this Aug 6, 2023

iziang deleted the support/model branch December 1, 2023 07:36

This was referenced Apr 25, 2026

chore(devx #88): remove license header injection + git pre-commit hook #1701

Merged

feat(vectorstore): task #61 P1-V vector adapter family — capability + filter Or guard + retrieve defense-in-depth #1948

Merged

iziang wants to merge 0 commits into main from support/model
