Skip to content

chore: remove the vicuna model#87

Closed
iziang wants to merge 0 commit into
mainfrom
support/model
Closed

chore: remove the vicuna model#87
iziang wants to merge 0 commit into
mainfrom
support/model

Conversation

@iziang
Copy link
Copy Markdown
Contributor

@iziang iziang commented Aug 3, 2023

No description provided.

@apecloud-bot apecloud-bot added the size/XS Denotes a PR that changes 0-9 lines. label Aug 3, 2023
@iziang iziang closed this Aug 6, 2023
@iziang iziang deleted the support/model branch December 1, 2023 07:36
earayu added a commit that referenced this pull request Apr 30, 2026
…field

Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f /
msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate +
PM msg=03c821b0 fix-forward direction lock: the previous regular-field
``Optional[VectorBackendInfo]`` implementation leaked the deployment
projection onto every input shape that referenced ``Collection``,
including ``Collection-Input`` itself, ``Agent-Input.collections``,
and ``CreateTurnRequest.collections``. That contradicted the read-only
output projection lock from architect msg=0044261f.

Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field``
property so OpenAPI input/output schemas auto-split:

- ``Collection-Output`` now lists ``vector_backend`` with
  ``readonly: true`` (verified in regenerated
  ``web/src/api-v2/schema.d.ts``).
- ``Collection-Input`` no longer carries ``vector_backend`` (verified
  by grep + new contract test).
- ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` /
  ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``,
  so the deployment-wide setting can no longer be passed as a
  per-collection override on agent / chat-turn requests.

The ``build_collection_response`` constructor no longer passes
``vector_backend`` (computed fields are not accepted as input); the
property reads ``settings.vector_db_type`` lazily on each serialization.

Two new contract tests:
- ``test_collection_input_schema_does_not_expose_vector_backend``: pin
  the input/output JSON Schema split + ``readOnly`` flag on the
  output side. Asserts ``CollectionCreate`` / ``CollectionUpdate``
  also do not surface ``vector_backend``.
- ``test_collection_constructor_ignores_vector_backend_input``:
  defensive — even if a malicious caller stuffs ``vector_backend``
  into a ``model_validate`` payload, Pydantic ignores it and the
  computed property still reflects the deployment setting.

Sediment: cuiwenbo own-up CR miss — implement-time only verified the
``CollectionConfig`` placement (one defense layer) and missed the
``Collection`` self-reuse-as-input second layer. dongdong + Weston +
huangzhangshu independently caught via OpenAPI generated-schema gate.
mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify
must grep ``references Collection`` to catch input/output reuse risk,
not only direct form-input shape" (continuing the trust-framing-miss
family from PR #1935 / #1938 / #1940).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026
… filter Or guard + retrieve defense-in-depth (#1948)

* feat(vectorstore): task #61 P1-V vector adapter family — capability + filter Or guard + retrieve defense-in-depth

Closes task #83 per PM @不穷 dispatch (msg=29c9e753). Folds 4 P1-V
items from task #61 spec v1 § 2.3 into a single PR:

P1-V1 — collection init failure contract documentation
------------------------------------------------------
``ensure_collection`` Protocol docstring now spells out the cross-
adapter contract (idempotent / race-safe / fail-loud / cache-not-
poisoned-on-failure). Both adapters already implement these
behaviours; the documentation closes the spec drift gap so future
implementers have a checklist.

P1-V2 — batch upsert atomicity capability declaration
-----------------------------------------------------
New :class:`VectorBackendCapabilities` frozen dataclass on the base
module declares static per-backend behaviour flags. Each
``VectorStoreConnector`` subclass exposes an instance via the
``BACKEND_CAPABILITIES`` class-level attribute:

* ``PgvectorVectorStoreConnector.BACKEND_CAPABILITIES.supports_atomic_batch_upsert = True``
  (PGVector wraps bulk INSERT ON CONFLICT in ``engine.begin()`` —
  mid-batch failure rolls back the whole batch).
* ``QdrantVectorStoreConnector.BACKEND_CAPABILITIES.supports_atomic_batch_upsert = False``
  (Qdrant ``client.upsert(points, wait=True)`` is best-effort
  per-point — partial writes possible on mid-batch failure).

``upsert`` Protocol docstring now points at the capability flag so
callers know to chunk + verify on backends that declare ``False``.

P1-V3 — filter Or empty-parts guard
-----------------------------------
``Or.__post_init__`` already rejects empty ``parts`` at DSL
construction. Both adapter translators now also guard at the
translator boundary so a future refactor that bypasses the
constructor (e.g. ``object.__setattr__(or_node, "parts", ())`` on
the frozen dataclass, or a ``dataclasses.replace`` with empty
parts) can't silently degrade to a vacuous "match everything"
disjunction:

* ``aperag/vectorstore/pgvector_connector.py:_SqlFilter._walk`` —
  raises ``UnsupportedFilterError`` on empty post-walk parts.
* ``aperag/vectorstore/qdrant_connector.py:_translate_filter`` —
  raises ``UnsupportedFilterError`` on empty post-prune subs (so
  ``rest.Filter(should=[])`` — which Qdrant treats as match-all —
  is unreachable).

P1-V4 — Qdrant legacy mode defense-in-depth
-------------------------------------------
``QdrantVectorStoreConnector.retrieve`` now applies the same
``TENANT_PAYLOAD_KEY`` filter in **both** multitenant and legacy
modes, but with a backwards-compatible "no payload key → pass
through" branch so legacy-only rows that don't carry the payload
key keep working:

* In multitenant mode: filter is the primary tenant-isolation
  layer (unchanged behaviour).
* In legacy mode: collection-name isolation is the primary layer;
  the new payload-level filter is belt-and-braces against tooling
  drift / migration mistakes that could plant a stray foreign-tenant
  row in a legacy collection.

The new ``BACKEND_CAPABILITIES.supports_legacy_mode`` flag declares
which adapter supports the legacy layout (PGVector ``False``,
Qdrant ``True``) so callers can tell the difference machine-
readably.

Tests
-----
* ``tests/unit_test/vectorstore/test_backend_capabilities.py``
  (new) — pins shape + per-flag values for each adapter. Coordinates
  with cuiwenbo task #87 P1-D3 collection metadata Pydantic
  projection so the static capability matrix stays consistent
  across PRs.
* ``tests/unit_test/vectorstore/test_pgvector_translator.py`` and
  ``test_qdrant_filter_translation.py`` — pin the new Or empty-parts
  guard with frozen-dataclass-bypass coverage.
* ``tests/unit_test/vectorstore/test_qdrant_multitenancy_integration.py``
  — new ``test_retrieve_legacy_mode_filters_stray_foreign_payload``
  exercises the P1-V4 belt-and-braces filter on a real ``:memory:``
  Qdrant client: legacy-mode rows without payload key pass through
  (backward compat), own-tenant payload passes, foreign-tenant
  payload is dropped.

Local: ``uv run pytest tests/unit_test/vectorstore/`` →
**156 passed, 10 skipped, 1 warning**.

Spec / scope alignment
----------------------
* task #61 spec v1 § 2.3 P1-V1 → ensure_collection contract doc ✅
* task #61 spec v1 § 2.3 P1-V2 → BACKEND_CAPABILITIES.supports_atomic_batch_upsert ✅
* task #61 spec v1 § 2.3 P1-V3 → Or empty-parts guard ✅
* task #61 spec v1 § 2.3 P1-V4 → retrieve defense-in-depth + supports_legacy_mode ✅
* Lesson #14 multi-iteration cleanup — legacy mode flagged via
  ``supports_legacy_mode`` so a future PR can drop the mode
  entirely once telemetry confirms zero production usage ✅
* Lesson #17 backend 收敛 contract — capability declaration is the
  backend-side contract that lets callers (FE / API / MCP) read a
  single source of truth instead of forking on backend type ✅

Follow-ups (NOT in this PR)
---------------------------
* task #84 P1-G1+G2 graph store boundary tests — ziang
* task #85 P1-D1 e2e shape matrix — huangzhangshu
* task #86 P1-D2 Helm Nebula first-class — Planetegg
* task #87 P1-D3 collection metadata vector_backend projection —
  cuiwenbo + dongdong (consumes ``BACKEND_CAPABILITIES`` values)
* task #88 P2-S1+S2 batch alias resolution — Bryce after this PR

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(vectorstore): mode-specific tenant filter on Qdrant retrieve (Weston PR #1948 BLOCKER)

Weston PR #1948 architecture CR (msg=910cad66 BLOCKER) caught a real
correctness regression in the initial P1-V4 commit: the uniform
"no payload key → pass through" branch leaked stray ``{}`` payload
rows in the **shared multitenant collection** to every tenant on a
``retrieve(ids=...)`` call.

Local Qdrant ``:memory:`` repro (per Weston): a multitenant
connector ``tenant_a`` writes a point with ``payload={}`` directly
to the shared collection, then ``tenant_a.retrieve([id])`` returns
the row. Because ``upsert()`` always stamps the payload key, the
only way a missing-key row reaches the shared collection is tooling
drift / migration drift — exactly the case P1-V4 defense-in-depth
is supposed to catch.

Fix
---
Mode-specific semantics:

* **Multitenant mode** (shared physical collection): STRICT —
  every row MUST carry ``TENANT_PAYLOAD_KEY`` matching the
  connector's tenant id. No "no payload key → pass through"
  branch, because the shared collection means a missing key would
  expose the row to every tenant.
* **Legacy mode** (per-tenant physical collection, unchanged from
  initial commit): PERMISSIVE — a row that doesn't carry the
  payload key still passes through (typical pre-multitenant data
  shape), but a stray foreign-tenant payload gets dropped (catches
  tooling drift / migration mistakes).

Tests
-----
``test_retrieve_multitenant_mode_strict_requires_payload_key`` (new)
— Weston's exact repro: seed shared collection with ``{}`` payload
+ own-tenant payload + foreign-tenant payload, assert only the
own-tenant row passes through. The legacy-mode permissive
counterpart (``test_retrieve_legacy_mode_filters_stray_foreign_payload``)
stays unchanged so a future refactor that unifies them silently
re-opens the leak fails fast.

Local: ``uv run pytest tests/unit_test/vectorstore/`` →
**157 passed, 10 skipped** (one new case).

Sediment trigger
----------------
This is Lesson #12 v9 fifth-application demo same family — Weston
first-principles repro catches the unified branch as silent leak
that I missed when applying the legacy-compat optimization
uniformly. The narrower ``mode-specific`` framing matches the spec
language ("legacy compat for legacy mode only") more precisely.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026
…ty matrix (#1949)

* feat(collection): task #61 P1-D3 — vector backend identity + capability matrix

Project the deployment-wide ``settings.vector_db_type`` onto every
collection detail read so the FE can render a "what does this vector
backend actually support" panel without per-collection migration or
runtime probe.

Backend (output-only projection):
- ``aperag/schema/common.py``: ``VectorBackendCapabilities`` +
  ``VectorBackendInfo`` + ``_STATIC_VECTOR_BACKEND_CAPABILITIES`` dict +
  ``project_vector_backend_info()`` helper.
- ``aperag/domains/knowledge_base/schemas.py:Collection``: add
  ``vector_backend: Optional[VectorBackendInfo]``. **Intentionally NOT
  on ``CollectionConfig``** so the OpenAPI ``CollectionCreate`` /
  ``CollectionUpdate`` input shapes do not let callers mistake a
  deployment-wide setting for a per-collection editable knob (per
  dongdong msg=c2593fdd + PM msg=caf7e4df + architect msg=0044261f
  read-only projection lock).
- ``aperag/domains/knowledge_base/service/collection_service.py``:
  populate ``vector_backend`` in ``build_collection_response`` from
  ``settings.vector_db_type``; ``None`` for unknown backends so the FE
  can render a placeholder without a hard failure.

Cross-PR consistency with task #83 / PR #1948 (Bryce, vector adapter
behavior fixes):
- Bryce's connector-layer ``BACKEND_CAPABILITIES`` ClassVar declares 2
  truth flags (``supports_atomic_batch_upsert`` +
  ``supports_legacy_mode``); this PR's schema-layer Pydantic model
  mirrors those values plus a 3rd schema-layer-only flag
  ``supports_filter_or_with_empty_parts`` which is uniformly False
  across adapters after task #83 P1-V3 (translator-level
  defense-in-depth rejects empty Or parts).
- The 3rd flag stays in the schema so the FE can declare the uniform
  reject explicitly per spec § 2.3 P1-D3 「显示『允许差异但显式』」 —
  Lesson #17 backend 收敛 contract simple-stable family pattern (cite
  PR #1930 SearchHit normalize, PR #1935 GraphMergeSuggestionItem
  projection layer).

Mechanical gate (per Lesson #18 lesson-sediment + mechanical-gate 双
layer codification — first established by chenyexuan PR #1933 / PR
#1941, then PR #1940 ``model_validate`` boundary): 13-case unit suite
in ``tests/unit_test/contracts/test_vector_backend_capability_matrix.py``
pins each capability flag, normalizes inputs, and round-trips Pydantic
``model_dump`` so future drift between schema, projection helper, and
FE-consumed shape fails fast at unit-test time.

FE (read-only display):
- ``web/src/features/collection/types.ts``: typed mirrors
  ``VectorBackendInfo`` / ``VectorBackendCapabilities`` /
  ``VectorBackendType``.
- ``web/src/app/workspace/collections/[collectionId]/settings/collection-vector-backend-card.tsx``:
  new component that surfaces backend identity + capability matrix in
  the collection settings page (above the edit form). dongdong picks
  up rendering polish (responsive + dark mode + final copy) on the
  same PR per the joint A4-style split (cuiwenbo contract layer +
  dongdong rendering polish + CR pair).
- ``web/src/i18n/{en-US,zh-CN}/page_collections.json``: copy strings.
- ``web/src/api-v2/schema.d.ts`` regenerated via ``yarn api:v2:types``.

Local verification:
- ``uv run --extra test pytest tests/unit_test/contracts/test_vector_backend_capability_matrix.py tests/unit_test/contracts/test_collection_v2_openapi_contract.py -q`` → 23 passed
- ``make openapi-check`` → ok
- ``yarn type-check --pretty false`` → 0 new errors on this PR's files
  (pre-existing graph-lab cosmograph + agent-runtime errors unchanged)
- ``yarn lint --quiet`` → 0 warnings/errors
- ``yarn i18n:check`` → ok
- ``git diff --check`` → ok

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(collection): task #87 P1-D3 — convert vector_backend to computed_field

Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f /
msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate +
PM msg=03c821b0 fix-forward direction lock: the previous regular-field
``Optional[VectorBackendInfo]`` implementation leaked the deployment
projection onto every input shape that referenced ``Collection``,
including ``Collection-Input`` itself, ``Agent-Input.collections``,
and ``CreateTurnRequest.collections``. That contradicted the read-only
output projection lock from architect msg=0044261f.

Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field``
property so OpenAPI input/output schemas auto-split:

- ``Collection-Output`` now lists ``vector_backend`` with
  ``readonly: true`` (verified in regenerated
  ``web/src/api-v2/schema.d.ts``).
- ``Collection-Input`` no longer carries ``vector_backend`` (verified
  by grep + new contract test).
- ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` /
  ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``,
  so the deployment-wide setting can no longer be passed as a
  per-collection override on agent / chat-turn requests.

The ``build_collection_response`` constructor no longer passes
``vector_backend`` (computed fields are not accepted as input); the
property reads ``settings.vector_db_type`` lazily on each serialization.

Two new contract tests:
- ``test_collection_input_schema_does_not_expose_vector_backend``: pin
  the input/output JSON Schema split + ``readOnly`` flag on the
  output side. Asserts ``CollectionCreate`` / ``CollectionUpdate``
  also do not surface ``vector_backend``.
- ``test_collection_constructor_ignores_vector_backend_input``:
  defensive — even if a malicious caller stuffs ``vector_backend``
  into a ``model_validate`` payload, Pydantic ignores it and the
  computed property still reflects the deployment setting.

Sediment: cuiwenbo own-up CR miss — implement-time only verified the
``CollectionConfig`` placement (one defense layer) and missed the
``Collection`` self-reuse-as-input second layer. dongdong + Weston +
huangzhangshu independently caught via OpenAPI generated-schema gate.
mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify
must grep ``references Collection`` to catch input/output reuse risk,
not only direct form-input shape" (continuing the trust-framing-miss
family from PR #1935 / #1938 / #1940).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): consolidate vector_backend_capability_matrix imports for ruff

Combine the two from aperag.schema.common import ... statements
into a single block so ruff's import organization rule is satisfied.
No code-behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): apply ruff format to vector_backend test + common.py

Run `uv run ruff format` on ApeRAG/aperag/schema/common.py and
ApeRAG/tests/unit_test/contracts/test_vector_backend_capability_matrix.py
so `make lint` (`ruff format --check`) passes. Pure formatting; no
behavior change. Other unrelated files reverted to keep this PR scope
clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XS Denotes a PR that changes 0-9 lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants