feat(graph-curation): task #31 A2 suggestion status schema #1935
Merged
This was referenced Apr 30, 2026
earayu added a commit that referenced this pull request Apr 30, 2026

… 5 (DISMISSED introduced by PR #1935)

Per Weston msg=2ad46e97 BLOCKER + PM msg=58853156: the v1.1 amend did not globally sweep the 4 -> 5 enum count change, leaving 6 places with the old 4-value wording that contradict the new 5-value wording at lines 241/394.

Architect own-up #3 of the day (cross-spec multi-iteration cleanup miss):
- fix-forward 4 lane name sweep (v1)
- fix-forward 6 enum count micro-fix (v1)
- v1.1 fix-forward 1 (this PR): itself a case of the Lesson #14 multi-iteration cleanup family

Revisions:
- § 1.1 line 17 GraphCurationSuggestionStatus enum: "PENDING/ACCEPTED/REJECTED/DISMISSED already in place" -> "at v1 spec lock, main verifiably had only the 5 values PENDING/ACCEPTED/REJECTED/EXPIRED/SUPERSEDED; DISMISSED is introduced by PR #1935"
- § 2.2 P1-31-B / § 3.1.2 / § 3.1.6: 4 new values -> 5 new values (including DISMISSED; replace_all hit 3 places)
- § 3.1.6 line 163 AlembicMigration: 4 new values -> 5 new values + ACCEPTED/EXPIRED/SUPERSEDED kept as legacy
- § 3.1.6 line 243 Migration chain ordering: 4 new values -> 5 new values
- § 4 A2 line 271: "existing PENDING/ACCEPTED/REJECTED/DISMISSED/EXPIRED/SUPERSEDED + 4 newly added" -> "existing 5 values PENDING/ACCEPTED/REJECTED/EXPIRED/SUPERSEDED (DISMISSED not among them) + 5 new values"
- § 5.1 line 310 Alembic migration extend: 4 new values -> 5 new values
- § 5.2.b line 331 PG enum: 4 new values -> 5 new values

The enum count and the DISMISSED source are now consistent across the whole document (5 new values, including DISMISSED).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
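The kind of global sweep this commit describes can be approximated mechanically. The following is a minimal sketch, assuming the stale wording looks like "4 new value(s)" or "4-value"; the regex, the helper name `find_stale_counts`, and the sample spec text are all hypothetical and not taken from the repository.

```python
import re

# Hypothetical pattern for the stale 4-value wording; the real spec text may differ.
STALE = re.compile(r"\b4[- ](?:new )?values?\b", re.IGNORECASE)


def find_stale_counts(text):
    """Return (line_number, line) pairs that still use the old 4-value wording."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if STALE.search(line)
    ]


# Invented sample text, for illustration only.
spec = """\
3.1.6 Alembic migration adds 4 new values
4 A2 adds 5 new values (including DISMISSED)
5.2.b PG enum: 4 new value
"""

hits = find_stale_counts(spec)
for number, line in hits:
    print(f"line {number}: {line}")
```

Running a check like this before locking a spec amend would flag every remaining occurrence at once, rather than relying on a reviewer to spot them one by one.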
earayu added a commit that referenced this pull request Apr 30, 2026

2 cite accuracy fixes (Weston framing CR catch):

1. response_model validation failure status code: 422 -> 500
   - a response_model validation failure raises FastAPI ResponseValidationError
   - this usually maps to HTTP 500, not the 422 used for request body validation
   - affects the status code cited at line 745 + line 850 when describing the PR #1940 BLOCKER
2. GraphMergeSuggestionItem canonical schema fields corrected against main
   - originally written: ... / observed_types / type_conflict / suggested_entity_type
   - the actual main aperag/domains/knowledge_graph/schemas.py::GraphMergeSuggestionItem does not contain these three fields
   - in A4 (PR #1940) these fields are FE-derived display (the FE derives them from entities / suggested_target_entity / evidence_refs), not PR #1935 backend projection
   - affects the line 781 sect 4 Lesson #17 second-application demo description

Per Weston PR #1943 framing CR (msg=7690b723): sediment cite accuracy requires cleaning up factual drift, so that future onboarding references do not confuse the 422/500 status code semantics or the backend/FE field source attribution. This does not block the main fold-in scope: the 6 lesson sediment + 5 PR commit cross-link and the other framing are all accurate (Weston verified).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026

task #31 spec v1.1 amend: fold in the spec drift surfaced while implementing Phase A (4/4 done) + spec lock invariants + lesson sediment trail.

## v1.1 Amend Scope
- § 3.1.1 worker loop fail-safe invariant (PR #1938 Weston BLOCKER → spec lock)
- § 3.1.2 action API response shape model_validate contract (PR #1940 Weston BLOCKER → spec lock)
- § 3.1.6 DISMISSED enum source correction (PR #1935: ziang grepped main and verified the v1 spec drift; fixed here)
- § 5.2.b adds 3 boundary test invariants
- § 5.2.c adds the Phase A implementation sediment trail
- § 6 cr-checklist adds 5 sediment items
- Migration chain ordering: 5 new values (including DISMISSED), not 4
- fix-forward 1 (commit d50864f): document-wide sweep of 6 places for the 4→5 enum count + § 1.1 line 17 filled in with the verified pre-A2 wording (per Weston BLOCKER msg=2ad46e97)

## CR
- @符炫炜 architect (own draft)
- @huangheng cr-checklist sediment cite verify ✅ msg=b276da50
- @weston framing verify ✅ msg=a111fcc3 (re-final pass post fix-forward 1)

## CI
- lint-and-unit ✅
- e2e-http-smoke 3/3 ✅ (auto-merge after green)
- provider-preflight 3/3 ✅
- docs-only lite gate satisfied

🤖 Architect ratify by Claude Code
earayu added a commit that referenced this pull request Apr 30, 2026

#1943)

* docs(cr-checklist): sediment fold-in sub-PR 2 after task #31 Phase A fully closed

§ 4 adds 6 lesson sediments (task #31 Phase A 4 PRs + task #33 P3 PR #1933 codify accumulated evidence + multi-PR same-hour multi-source first-principles catches of trust-framing misses) + § 6 sediment references add 5 PR commit cross-links + § 8 revision log adds the complete trail of this PR's fold-in.

New lessons:
- Lesson #12 v9 third + fourth + fifth-application demos (PR #1935 ziang DISMISSED enum impl-side catch + dongdong response_model legacy field filter BLOCKER, both in the same PR / PR #1938 Weston worker fail-safe BLOCKER upstream raise points trace / PR #1940 Weston SuggestionActionResponse.message required field catch): the sediment is upgraded to a systemic signal, and the reviewer chain must independently re-verify from first principles
- Migration chain ordering second-application demo (PR #1935's reuse-table extend pattern has different migration ordering constraints from PR #1910's new-enum hard-cut; 5 new enum values APPLY_PENDING/APPLYING/APPLIED/APPLY_FAILED/DISMISSED + evidence_refs JSON column + ACCEPTED legacy zero-write grep gate)
- Lesson #17 second-application demo (when PR #1935 converged the backend onto the canonical contract, the same PR folded in a legacy projection layer to keep backward compatibility, e.g. the suggestion_batch_id=run_id alias; pairs with the deprecation marker Lesson #14 family)
- Lesson #18 formally established: lesson sediment + mechanical gate two-layer codification, "one record, one enforce" (first-app PR #1933 4-source default value parity / second-app PR #1941 description-free read scope + service.py:845 bonus catch / third-app PR #1941 fix-forward sister tests that prevent a whole-file exclude from silently weakening the gate)
- mini-pattern 19: spec lock pre-check, grep main to verify enum/contract assumptions (architect own-up, upgraded three-layer version: spec→impl / impl→response_model / impl catch path→upstream raise points)
- mini-pattern 20: a PR that adds response_model wire-up must run a model_validate(actual_handler_return_shape) boundary gate (PR #1940 first-application demo)

Per architect dispatch msg=b6726ac9 + msg=420ca548: started once sediment trigger A was satisfied (task #31 Phase A 4/4 done) + Phase B B1 lane, huangheng owner.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(cr-checklist): fix cite accuracy NIT per Weston msg=7690b723

2 cite accuracy fixes (Weston framing CR catch):

1. response_model validation failure status code: 422 -> 500
   - a response_model validation failure raises FastAPI ResponseValidationError
   - this usually maps to HTTP 500, not the 422 used for request body validation
   - affects the status code cited at line 745 + line 850 when describing the PR #1940 BLOCKER
2. GraphMergeSuggestionItem canonical schema fields corrected against main
   - originally written: ... / observed_types / type_conflict / suggested_entity_type
   - the actual main aperag/domains/knowledge_graph/schemas.py::GraphMergeSuggestionItem does not contain these three fields
   - in A4 (PR #1940) these fields are FE-derived display (the FE derives them from entities / suggested_target_entity / evidence_refs), not PR #1935 backend projection
   - affects the line 781 sect 4 Lesson #17 second-application demo description

Per Weston PR #1943 framing CR (msg=7690b723): sediment cite accuracy requires cleaning up factual drift, so that future onboarding references do not confuse the 422/500 status code semantics or the backend/FE field source attribution. This does not block the main fold-in scope: the 6 lesson sediment + 5 PR commit cross-link and the other framing are all accurate (Weston verified).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
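The boundary gate in mini-pattern 20 amounts to checking a handler's actual return shape against the declared response model in a unit test, before the framework does so at request time. Below is a minimal stdlib sketch of that idea, using a dataclass as a stand-in for the Pydantic model (in the real project the check would presumably be `SuggestionActionResponse.model_validate(...)`). Only the fact that `message` is required comes from the text above; the other fields, the handler, and the helper `missing_required` are hypothetical.

```python
from dataclasses import MISSING, dataclass, fields


@dataclass
class SuggestionActionResponse:
    # Only `message` being required is stated in the source;
    # the other two fields are invented for illustration.
    suggestion_id: str
    status: str
    message: str


def missing_required(model, payload):
    """Names of required dataclass fields that are absent from a handler's return dict."""
    return [
        f.name
        for f in fields(model)
        if f.default is MISSING
        and f.default_factory is MISSING
        and f.name not in payload
    ]


def handle_dismiss(suggestion_id):
    # Hypothetical handler that forgets to include `message`.
    return {"suggestion_id": suggestion_id, "status": "DISMISSED"}


gaps = missing_required(SuggestionActionResponse, handle_dismiss("s-1"))
print(gaps)
```

A test asserting that `gaps` is empty would fail here, which is the point: the mismatch shows up in CI instead of as a response validation error on a live request.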
earayu added a commit that referenced this pull request Apr 30, 2026

…field

Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f / msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate + PM msg=03c821b0 fix-forward direction lock: the previous regular-field ``Optional[VectorBackendInfo]`` implementation leaked the deployment projection onto every input shape that referenced ``Collection``, including ``Collection-Input`` itself, ``Agent-Input.collections``, and ``CreateTurnRequest.collections``. That contradicted the read-only output projection lock from architect msg=0044261f.

Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field`` property so OpenAPI input/output schemas auto-split:
- ``Collection-Output`` now lists ``vector_backend`` with ``readonly: true`` (verified in regenerated ``web/src/api-v2/schema.d.ts``).
- ``Collection-Input`` no longer carries ``vector_backend`` (verified by grep + new contract test).
- ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` / ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``, so the deployment-wide setting can no longer be passed as a per-collection override on agent / chat-turn requests.

The ``build_collection_response`` constructor no longer passes ``vector_backend`` (computed fields are not accepted as input); the property reads ``settings.vector_db_type`` lazily on each serialization.

Two new contract tests:
- ``test_collection_input_schema_does_not_expose_vector_backend``: pin the input/output JSON Schema split + ``readOnly`` flag on the output side. Asserts ``CollectionCreate`` / ``CollectionUpdate`` also do not surface ``vector_backend``.
- ``test_collection_constructor_ignores_vector_backend_input``: defensive. Even if a malicious caller stuffs ``vector_backend`` into a ``model_validate`` payload, Pydantic ignores it and the computed property still reflects the deployment setting.

Sediment: cuiwenbo own-up CR miss. Implement-time only verified the ``CollectionConfig`` placement (one defense layer) and missed the ``Collection`` self-reuse-as-input second layer. dongdong + Weston + huangzhangshu independently caught via OpenAPI generated-schema gate. mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify must grep ``references Collection`` to catch input/output reuse risk, not only direct form-input shape" (continuing the trust-framing-miss family from PR #1935 / #1938 / #1940).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
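The input/output split described in this commit can be illustrated without Pydantic. The sketch below is a plain-Python analogue, not the project's code: a derived read-only property stands in for ``@computed_field``, and a filtering constructor stands in for Pydantic ignoring unknown input keys. The ``SETTINGS`` dict, the value ``"qdrant"``, the ``id`` / ``title`` fields, and the helpers ``from_payload`` / ``to_output`` are all assumptions made for illustration.

```python
from dataclasses import asdict, dataclass, fields

# Hypothetical stand-in for the deployment-wide settings object.
SETTINGS = {"vector_db_type": "qdrant"}


@dataclass
class Collection:
    # Hypothetical input fields; the real model has more.
    id: str
    title: str

    @property
    def vector_backend(self):
        # Derived on every read from deployment settings, never stored on the instance.
        return SETTINGS["vector_db_type"]

    @classmethod
    def from_payload(cls, payload):
        # Keep only declared input fields, so a smuggled vector_backend is dropped.
        allowed = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in allowed})

    def to_output(self):
        # Output shape = input fields + the read-only derived field.
        return {**asdict(self), "vector_backend": self.vector_backend}


collection = Collection.from_payload(
    {"id": "c1", "title": "Docs", "vector_backend": "caller-supplied"}
)
print(collection.to_output())
```

Because the property is not a declared field, it cannot appear in the input shape at all, which mirrors why the computed-field approach removes ``vector_backend`` from every schema that reuses ``Collection`` as input.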
earayu added a commit that referenced this pull request Apr 30, 2026

…ty matrix (#1949)

* feat(collection): task #61 P1-D3 - vector backend identity + capability matrix

Project the deployment-wide ``settings.vector_db_type`` onto every collection detail read so the FE can render a "what does this vector backend actually support" panel without per-collection migration or runtime probe.

Backend (output-only projection):
- ``aperag/schema/common.py``: ``VectorBackendCapabilities`` + ``VectorBackendInfo`` + ``_STATIC_VECTOR_BACKEND_CAPABILITIES`` dict + ``project_vector_backend_info()`` helper.
- ``aperag/domains/knowledge_base/schemas.py:Collection``: add ``vector_backend: Optional[VectorBackendInfo]``. **Intentionally NOT on ``CollectionConfig``** so the OpenAPI ``CollectionCreate`` / ``CollectionUpdate`` input shapes do not let callers mistake a deployment-wide setting for a per-collection editable knob (per dongdong msg=c2593fdd + PM msg=caf7e4df + architect msg=0044261f read-only projection lock).
- ``aperag/domains/knowledge_base/service/collection_service.py``: populate ``vector_backend`` in ``build_collection_response`` from ``settings.vector_db_type``; ``None`` for unknown backends so the FE can render a placeholder without a hard failure.

Cross-PR consistency with task #83 / PR #1948 (Bryce, vector adapter behavior fixes):
- Bryce's connector-layer ``BACKEND_CAPABILITIES`` ClassVar declares 2 truth flags (``supports_atomic_batch_upsert`` + ``supports_legacy_mode``); this PR's schema-layer Pydantic model mirrors those values plus a 3rd schema-layer-only flag ``supports_filter_or_with_empty_parts`` which is uniformly False across adapters after task #83 P1-V3 (translator-level defense-in-depth rejects empty Or parts).
- The 3rd flag stays in the schema so the FE can declare the uniform reject explicitly per spec § 2.3 P1-D3 ("show that differences are allowed, but explicitly"). This follows the Lesson #17 backend-converged contract simple-stable family pattern (cite PR #1930 SearchHit normalize, PR #1935 GraphMergeSuggestionItem projection layer).

Mechanical gate (per Lesson #18 lesson-sediment + mechanical-gate two-layer codification, first established by chenyexuan PR #1933 / PR #1941, then PR #1940 ``model_validate`` boundary): 13-case unit suite in ``tests/unit_test/contracts/test_vector_backend_capability_matrix.py`` pins each capability flag, normalizes inputs, and round-trips Pydantic ``model_dump`` so future drift between schema, projection helper, and FE-consumed shape fails fast at unit-test time.

FE (read-only display):
- ``web/src/features/collection/types.ts``: typed mirrors ``VectorBackendInfo`` / ``VectorBackendCapabilities`` / ``VectorBackendType``.
- ``web/src/app/workspace/collections/[collectionId]/settings/collection-vector-backend-card.tsx``: new component that surfaces backend identity + capability matrix in the collection settings page (above the edit form). dongdong picks up rendering polish (responsive + dark mode + final copy) on the same PR per the joint A4-style split (cuiwenbo contract layer + dongdong rendering polish + CR pair).
- ``web/src/i18n/{en-US,zh-CN}/page_collections.json``: copy strings.
- ``web/src/api-v2/schema.d.ts`` regenerated via ``yarn api:v2:types``.

Local verification:
- ``uv run --extra test pytest tests/unit_test/contracts/test_vector_backend_capability_matrix.py tests/unit_test/contracts/test_collection_v2_openapi_contract.py -q`` → 23 passed
- ``make openapi-check`` → ok
- ``yarn type-check --pretty false`` → 0 new errors on this PR's files (pre-existing graph-lab cosmograph + agent-runtime errors unchanged)
- ``yarn lint --quiet`` → 0 warnings/errors
- ``yarn i18n:check`` → ok
- ``git diff --check`` → ok

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(collection): task #87 P1-D3 - convert vector_backend to computed_field

Per dongdong msg=fa88e97b BLOCKER + huangzhangshu msg=5b7cba0f / msg=ee6e7af2 + Weston msg=057f642c re-final framing verify gate + PM msg=03c821b0 fix-forward direction lock: the previous regular-field ``Optional[VectorBackendInfo]`` implementation leaked the deployment projection onto every input shape that referenced ``Collection``, including ``Collection-Input`` itself, ``Agent-Input.collections``, and ``CreateTurnRequest.collections``. That contradicted the read-only output projection lock from architect msg=0044261f.

Move ``Collection.vector_backend`` to a Pydantic v2 ``@computed_field`` property so OpenAPI input/output schemas auto-split:
- ``Collection-Output`` now lists ``vector_backend`` with ``readonly: true`` (verified in regenerated ``web/src/api-v2/schema.d.ts``).
- ``Collection-Input`` no longer carries ``vector_backend`` (verified by grep + new contract test).
- ``CollectionCreate`` / ``CollectionUpdate`` / ``Agent-Input.collections`` / ``CreateTurnRequest.collections`` all inherit the cleaned ``Collection-Input``, so the deployment-wide setting can no longer be passed as a per-collection override on agent / chat-turn requests.

The ``build_collection_response`` constructor no longer passes ``vector_backend`` (computed fields are not accepted as input); the property reads ``settings.vector_db_type`` lazily on each serialization.

Two new contract tests:
- ``test_collection_input_schema_does_not_expose_vector_backend``: pin the input/output JSON Schema split + ``readOnly`` flag on the output side. Asserts ``CollectionCreate`` / ``CollectionUpdate`` also do not surface ``vector_backend``.
- ``test_collection_constructor_ignores_vector_backend_input``: defensive. Even if a malicious caller stuffs ``vector_backend`` into a ``model_validate`` payload, Pydantic ignores it and the computed property still reflects the deployment setting.

Sediment: cuiwenbo own-up CR miss. Implement-time only verified the ``CollectionConfig`` placement (one defense layer) and missed the ``Collection`` self-reuse-as-input second layer. dongdong + Weston + huangzhangshu independently caught via OpenAPI generated-schema gate. mini-pattern 19 layer 5 candidate: "Pydantic schema placement verify must grep ``references Collection`` to catch input/output reuse risk, not only direct form-input shape" (continuing the trust-framing-miss family from PR #1935 / #1938 / #1940).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): consolidate vector_backend_capability_matrix imports for ruff

Combine the two from aperag.schema.common import ... statements into a single block so ruff's import organization rule is satisfied. No code-behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): apply ruff format to vector_backend test + common.py

Run `uv run ruff format` on ApeRAG/aperag/schema/common.py and ApeRAG/tests/unit_test/contracts/test_vector_backend_capability_matrix.py so `make lint` (`ruff format --check`) passes. Pure formatting; no behavior change. Other unrelated files reverted to keep this PR scope clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Task #31 Phase A2 implementation:
- `GraphCurationSuggestionStatus` with async apply states: `APPLY_PENDING`, `APPLYING`, `APPLIED`, `APPLY_FAILED`
- `DISMISSED`, because current main did not actually have it, while PR #1931 (docs(task-31): graph node merge & background suggestion task spec v1) spec/A4 already treat dismiss as part of the review queue status surface
- `graph_curation_suggestions.evidence_refs` column + model/service persistence and projection
- `/graphs/merge-suggestions` read/run/action, so OpenAPI + FE typed schema expose the curation suggestion contract
- `web/src/api-v2/schema.d.ts`
- `ACCEPTED` legacy zero-write guard

Notes
- `status` is stored as `String(50)`, not a native PG enum, so the Alembic migration only needs to add `evidence_refs`. The code/schema update is the status enum extension.
- `ACCEPTED` remains the legacy sync terminal status in `GraphCurationService._accept_and_supersede`; the new async path must use `APPLY_PENDING -> APPLYING -> APPLIED|APPLY_FAILED` and is guarded by `test_accepted_status_write_is_legacy_service_only`.

Verification
- `uv run pytest tests/unit_test/graph_curation/test_service.py tests/unit_test/contracts/test_graph_curation_suggestion_status_contract.py`
- `uv run pytest tests/unit_test/contracts/test_openapi_spec.py tests/unit_test/contracts/test_web_typed_api_contract.py`
- `uv run ruff check aperag/domains/knowledge_graph/api/routes.py aperag/domains/knowledge_graph/db/models.py aperag/domains/knowledge_graph/schemas.py aperag/graph_curation/service.py aperag/indexing/merge_candidate_detector.py tests/unit_test/graph_curation/test_service.py tests/unit_test/contracts/test_graph_curation_suggestion_status_contract.py aperag/migration/versions/20260430070500-7a2b1c3d4e5f.py`
- `uv run ruff format --check aperag/domains/knowledge_graph/api/routes.py aperag/domains/knowledge_graph/db/models.py aperag/domains/knowledge_graph/schemas.py aperag/graph_curation/service.py aperag/indexing/merge_candidate_detector.py tests/unit_test/graph_curation/test_service.py tests/unit_test/contracts/test_graph_curation_suggestion_status_contract.py aperag/migration/versions/20260430070500-7a2b1c3d4e5f.py`
- `make openapi-check`
- `uv run alembic -c aperag/alembic.ini heads`
- `yarn api:v2:types`
- `yarn type-check`
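The status surface described in the Summary and Notes can be sketched as a string-valued enum plus an explicit transition table for the async apply path. This is a minimal illustration, not the repository's definition: it assumes each stored string equals the member name, and `ASYNC_TRANSITIONS` and `can_transition` are hypothetical helpers. How a suggestion first enters `APPLY_PENDING` is not stated in this PR description, so that edge is left out.

```python
from enum import Enum


class GraphCurationSuggestionStatus(str, Enum):
    # Values present on main before this PR, per the referencing commits.
    PENDING = "PENDING"
    ACCEPTED = "ACCEPTED"  # legacy sync terminal status
    REJECTED = "REJECTED"
    EXPIRED = "EXPIRED"
    SUPERSEDED = "SUPERSEDED"
    # The 5 new values added by this PR.
    DISMISSED = "DISMISSED"
    APPLY_PENDING = "APPLY_PENDING"
    APPLYING = "APPLYING"
    APPLIED = "APPLIED"
    APPLY_FAILED = "APPLY_FAILED"


S = GraphCurationSuggestionStatus

# Async apply path from the Notes: APPLY_PENDING -> APPLYING -> APPLIED | APPLY_FAILED.
ASYNC_TRANSITIONS = {
    S.APPLY_PENDING: {S.APPLYING},
    S.APPLYING: {S.APPLIED, S.APPLY_FAILED},
}


def can_transition(current, target):
    """True only for steps on the async apply path; ACCEPTED is never a target."""
    return target in ASYNC_TRANSITIONS.get(current, set())


# Every value must fit the String(50) column mentioned in the Notes.
longest = max(len(status.value) for status in S)
print(longest, can_transition(S.APPLYING, S.APPLIED), can_transition(S.APPLYING, S.ACCEPTED))
```

Since the column is a plain string rather than a native PG enum, adding members like this needs no enum-altering migration, which is why the Alembic change in this PR is limited to `evidence_refs`.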