feat(collection): Wave 10 Chunks C+D: regen service + 2 OpenAPI endpoints #1786
…de) (#1790)

* docs(modularization): Wave 10 collection summary design, c1-extend-hide locked

Design for collection summary auto-regen with a per-user hidden summary bot. Captures the three-way ratify (architect + Bryce + huangheng) and earayu2's final pin on the (c1-extend-hide + defensive lazy fallback) approach. The PR #1786 amend implements it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: fix v1 → v2 OpenAPI path in §9 endpoints

Per the W9-1 V3→V2 rename, the ApeRAG OpenAPI base is /api/v2. huangheng CR nit fold-in (msg=032421e1).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
… C+D)

Wave 10 §K.13 Chunks C+D: collection auto-description two-stage regen pipeline.

## New module: ``collection_regen_service.py``

* **``regen_summary(collection_id)``**: Stage 1 with cluster-level lease + 3-tier fallback chain (per huangheng BLOCKER #2):
  * Tier 1: ``_invoke_summary_agent``, agent-runtime free-explore (scaffolded, returns ``None`` today; the Wave 10.1 follow-up wires the headless ``agent_runtime_manager.launch_turn``)
  * Tier 2: ``_invoke_summary_chunks_fallback``, chunks.jsonl first substantive chunk + LLM call (scaffolded, Wave 10.1)
  * Tier 3: transient skip (no writeback, reconciler retries next sweep)
* **``regen_description(collection_id)``**: Stage 2 cheap LLM derive from the existing ``Collection.summary``. Single LLM call, ~10s.
* **Lease primitives**: ``_try_acquire_lease`` / ``_release_lease`` use ``Collection.regen_lease_owner`` + ``regen_lease_expires_at`` (Chunk A schema). An atomic UPDATE acquires the lease, expired leases self-reclaim, and concurrent instances race to exactly one winner per collection.
* **Quality gates**: ``is_valid_summary`` / ``is_valid_description`` enforce length + LLM-error blacklist + alphabetic-char threshold (per huangheng N4 dual-gate).
* **Language detection**: ``_detect_language`` returns ``"zh"`` / ``"en"`` based on CJK-vs-Latin char count and drives Stage 2 prompt template selection (per huangheng N1 language-aware).

## New module: ``regen_constants.py``

Pinned thresholds + prompts + tool subset:

* ``BULK_THRESHOLD = 10`` / ``DEBOUNCE_WINDOW = 60min`` / ``MIN_STALE_AGE = 10min`` (per earayu2 msg=1b395cae, the moderately conservative option).
* ``LEASE_TTL = 900s``.
* ``SUMMARY_AGENT_SYSTEM_PROMPT``: hard-coded (not user-configurable per huangheng Q1.2), language-aware output (zh 5000-10000 / en 3000-7000), 13 read-only tools subset (per huangheng Q1.4).
* ``DESCRIPTION_DERIVE_PROMPT_ZH`` / ``DESCRIPTION_DERIVE_PROMPT_EN``: Stage 2 derive prompts.
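The lease primitives described above can be illustrated with a single conditional UPDATE. This is a minimal sketch, assuming an in-memory SQLite table in place of the real ``Collection`` table; the table layout, the function signatures, and the use of epoch seconds are assumptions for illustration, not the service's actual code.

```python
import sqlite3
import time

LEASE_TTL = 900  # seconds, the LEASE_TTL value quoted in the commit message


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for the Collection table (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE collection ("
        " id TEXT PRIMARY KEY,"
        " regen_lease_owner TEXT,"
        " regen_lease_expires_at REAL)"
    )
    conn.execute("INSERT INTO collection (id) VALUES ('c1')")
    conn.commit()
    return conn


def try_acquire_lease(conn, collection_id, owner, now=None) -> bool:
    """One conditional UPDATE: wins only if the lease is free or expired."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "UPDATE collection"
        " SET regen_lease_owner = ?, regen_lease_expires_at = ?"
        " WHERE id = ?"
        " AND (regen_lease_owner IS NULL OR regen_lease_expires_at < ?)",
        (owner, now + LEASE_TTL, collection_id, now),
    )
    conn.commit()
    # rowcount is 1 for the single winner and 0 for every loser.
    return cur.rowcount == 1


def release_lease(conn, collection_id, owner) -> None:
    """Clear the lease, but only when called by its current owner."""
    conn.execute(
        "UPDATE collection"
        " SET regen_lease_owner = NULL, regen_lease_expires_at = NULL"
        " WHERE id = ? AND regen_lease_owner = ?",
        (collection_id, owner),
    )
    conn.commit()
```

Because the ownership check and the write happen in one statement, no separate lock or queue is needed; a caller that sees ``False`` simply backs off.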
## New OpenAPI endpoints (in ``api/routes.py``)

* **``POST /api/v2/collections/{id}/summary/regen``**: Stage 1 trigger; 202 Accepted, 404 collection-not-found, 409 lease-busy, 403 permission-denied. Dispatches as a fire-and-forget asyncio task.
* **``POST /api/v2/collections/{id}/description/regen``**: Stage 2 trigger; **400 if summary IS NULL** (must regen summary first, an honest reject per the huangheng API contract), 202 otherwise.

## New schema: ``CollectionRegenTriggerResponse``

202 Accepted envelope: ``collection_id`` + ``stage`` (summary | description) + ``task_id`` + ``estimated_completion_seconds``.

## Tests (14 new pure-Python unit tests)

``tests/unit_test/knowledge_base/test_collection_regen_service.py`` covers the quality-gate + language-detection contracts. Lease + DB flow + OpenAPI integration are covered by the Chunk E reconciler tests + the Chunk G e2e narrative test (those need live DB fixtures).

## What lands in subsequent chunks

* **Chunk E**: ``reconcile_collection_descriptions_hook`` wired into the 30s reconciler; scans ``Collection`` rows by doc-change delta and dispatches Stage 1 + Stage 2 per the design pseudocode.
* **Chunk F**: frontend collection-form removes the description input + adds a placeholder.
* **Chunk G**: e2e narrative test mirroring the Wave 7 task #11 pattern.
* **Wave 10.1 follow-up**: fill in the Tier 1 agent-runtime invocation + Tier 2 chunks.jsonl fallback (currently scaffolded with an explicit ``return None`` so the 3-tier fallback contract still holds; every regen today returns the Tier 3 transient skip until 10.1 ships, which is honest behaviour and doesn't lie about progress).

## 12-invariant cross-check

* **#10 DB column cap**: enforced via prompt + quality-gate (5000-10000 / 200-500 chars) per spec §K.13.
* **#12 grep-zero**: no LightRAG references introduced.
* All other invariants: n/a (pure new code in the collection-regen lane).
## 4-pattern + 11 mini-pattern pre-check

* Pattern 2 (state binding): lease columns from Chunk A wired into service primitives + reconciler hook (Chunk E).
* Mini #4 (DTO names): reuse the ``Collection`` ORM, no new DTO.
* Mini #5 (dependency interface signatures): ``LLMCall = Callable[[str], Awaitable[str]]`` matches the ``aperag.indexing.llm.build_collection_llm_callable`` shape (sync wrapped via ``asyncio.to_thread``).
* Mini #7 (grep before adding X): grepped existing services for similar regen patterns; none found, so this is a clean new domain service.
* Mini #10 (trigger 3-scenario): the 3-tier fallback chain explicitly enumerates the agent-failure / chunks-failure / input-not-ready paths.

## Simple-stable 4-guardrail

* #1 no unbounded scope growth: 2 new modules + 2 endpoints + 1 schema; Tier 1+2 are scaffolded to avoid coupling to the agent-runtime headless API formalisation (huangheng N2 sediment, deferred to Wave 11).
* #2 ship as soon as possible: independent ship, unblocks Chunk E (reconciler), which needs the regen_summary / regen_description entry points.
* #3 simple and stable: clear 3-tier fallback contract, lease UNIQUE constraint via UPDATE atomicity (no extra serialisation infrastructure).
* #4 maintenance-free private deployment: 0 operator config; thresholds + prompts are hard-coded per design lock, and the LLM uses the collection's existing completion model.

## Test plan

- [x] ``uv run pytest tests/unit_test/``: 1186 pass + 15 skipped
- [x] ``ruff format --check`` / ``ruff check``: clean on touched files
- [ ] e2e CI gates (post-push)
- [ ] CR by @huangheng
- [ ] Architect ratify after CI green
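The 3-tier fallback contract from this commit can be sketched as an ordered list of async tiers. This is a minimal illustration, not the service's code: the names ``run_fallback_chain`` and ``scaffolded_tier`` are hypothetical, and catching exceptions per tier is a choice made for this sketch rather than something the commit message states.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

# Each tier returns a candidate summary, or None when it cannot produce one.
Tier = Callable[[str], Awaitable[Optional[str]]]


async def run_fallback_chain(
    collection_id: str,
    tiers: List[Tier],
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Try tiers in order; return the first valid output, else None."""
    for tier in tiers:
        try:
            candidate = await tier(collection_id)
        except Exception:
            # Assumption of this sketch: a failing tier falls through
            # to the next one instead of aborting the chain.
            candidate = None
        if candidate is not None and is_valid(candidate):
            return candidate
    # Tier 3, the transient skip: nothing is written, a later sweep retries.
    return None


async def scaffolded_tier(collection_id: str) -> Optional[str]:
    """Stands in for a scaffolded tier that always returns None."""
    return None
```

With both real tiers scaffolded, ``run_fallback_chain`` always returns ``None``, which matches the "every regen today returns the Tier 3 transient skip" behaviour described in the commit.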
Wave 10 §K.13: per-user hidden summary bot for collection regen Stage 1 agent-runtime free-explore.

Per ratified design (PR #1790 + thread msg=d6f5e819 ratify):

* **(c1-extend-hide)** main path: register-time creation via the ``_BotInitOpsAdapter`` extension (same transaction as the default agent bot; both succeed or both fail at register time).
* **Defense-in-depth lazy fallback**: ``collection_regen_service`` Stage 1 will use ``get_or_create_summary_bot_for_user`` so a user that registered before this PR landed (or whose register hook silently failed per ``user_manager.py:137``) still gets a bot on the first regen attempt.

## Schema (1 column + 1 partial-unique index + Python enum value + backfill)

* ``Bot.is_system: Boolean default False``: mirrors the existing ``ApiKey.is_system`` precedent (governance/db/models.py:98). UI listings filter ``is_system=True`` rows out so the summary bot is invisible to end users.
* Partial unique index ``(user, type, is_system)`` over active (``gmt_deleted IS NULL``) ``is_system=TRUE`` rows: defends against race conditions during register-time + lazy-create.
* ``BotType.SUMMARY = "summary"``: Python enum addition only; ``Bot.type`` is already ``VARCHAR(50)`` (per the ``_enum_column(BotType)`` shape) so there is 0 DDL change for the enum value.
* Alembic data migration ``f2c3d4e5b6a8`` backfills one summary bot row per existing user that doesn't already have one. An idempotent ``WHERE NOT EXISTS`` guard makes the migration safe to re-run.

## Register-hook extension (``aperag/app.py``)

``_BotInitOpsAdapter.create_default_bot_for_user`` now also calls ``_create_summary_bot_for_user`` after creating the default agent bot. The new method:

* Bypasses ``bot_service.create_bot`` because (a) ``BotCreate.type`` is ``Literal["agent"]`` so the public schema cannot express ``"summary"``, and (b) system bots intentionally skip user-quota / user-visibility logic.
* Uses ``get_async_session`` + direct ORM insert; rolls back on ``IntegrityError`` so racing concurrent registers / the backfill migration don't crash.
* Tool subset enforcement (13 read-only tools) lives in the agent runtime, NOT on the Bot row; this keeps the schema minimal per simple-stable directive #1.

## What's NOT in this commit (deferred to subsequent commits in this PR)

This commit is the **bot infrastructure scaffolding only**. The remaining Wave 10 §K.13 work (Tier 1 agent runtime invocation + Tier 2 chunks.jsonl real impl + ``get_or_create_summary_bot_for_user`` service in ``collection_regen_service`` + supplementary #1 tests + supplementary #2 silent-failure fix + Chunk E reconciler hook) lands in subsequent commits on this same PR before merge, per the earayu2 ratify "do it all in one go".

## 12-invariant + 4-pattern + simple-stable 4-guardrail

* #10 DB column cap: n/a this commit (no new variable-length data).
* #12 grep-zero LightRAG: clean.
* Pattern 2 (state binding): ``Bot.is_system`` ORM + alembic both updated atomically in this commit.
* Mini #4 (DTO names): no new DTO, reuse existing ``Bot`` / ``BotType`` / ``BotCreate``.
* Mini #5 (dependency interface signatures): grepped the existing register-time call site (``aperag/app.py:171-186``); extended in place rather than carving a new init op.
* Mini #7 (grep before adding X-similar): ``ApiKey.is_system`` precedent verified before adding ``Bot.is_system``.
* simple-stable #1 no unbounded scope growth: 1 column + 1 index, no new service / no new endpoint / no schema for the tool subset.
* simple-stable #4 maintenance-free private deployment: the backfill data migration ensures existing deployments get summary bots automatically on upgrade, with no operator setup.
## Test plan

- [x] ``uv run pytest tests/unit_test/``: 1186 pass + 15 skipped
- [x] ``alembic upgrade e1a2b3c4d5f6:f2c3d4e5b6a8 --sql`` emits ``ALTER TABLE bot ADD COLUMN is_system`` + 2 ``CREATE INDEX`` + ``INSERT INTO bot`` backfill SELECT
- [x] ``ruff check`` clean
- [ ] e2e CI gates (post-push)
- [ ] Subsequent commits (Tier 1+2 + tests + Chunk E) before CR
…ave 10 §K.13)

Wave 10 §K.13: defense-in-depth fallback for the per-user summary bot, whose main creation point is the register-time hook ``_BotInitOpsAdapter.create_default_bot_for_user``.

The register hook in ``user_manager.py:137`` only logs via ``log.error`` on init failures; it does NOT roll back the user. So a user can register successfully but lack their summary bot. ``get_or_create_summary_bot_for_user`` fetches by ``(user, type=SUMMARY, is_system=True, gmt_deleted IS NULL)`` and lazy-creates the row if it is missing. The partial unique index (Chunk B schema migration ``f2c3d4e5b6a8``) handles concurrent race conditions: one caller wins the INSERT, the other rolls back and re-fetches the winner's row.

This commit also wires ``_invoke_summary_agent`` to call the helper so the bot infrastructure is reachable end-to-end. The actual ``agent_runtime_manager.launch_turn`` invocation still falls through to Tier 2 (the next commit on this PR ships the full launch_turn flow mirroring ``aperag/domains/evaluation/worker.py:114-180``).

## Test plan

- [x] ``uv run pytest tests/unit_test/knowledge_base/``: 14 pass
- [x] ``ruff check`` clean
- [ ] CR by @huangheng (queue held until the full Wave 10 amend is ready)
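The fetch-then-lazy-create flow guarded by a partial unique index can be sketched as follows. This is a minimal illustration using SQLite rather than the project's SQLAlchemy session; the table layout, index name, and function names are assumptions, and only the overall pattern (INSERT, then roll back and re-fetch on ``IntegrityError``) comes from the commit message.

```python
import sqlite3
import uuid
from typing import Optional


def make_db() -> sqlite3.Connection:
    """Hypothetical bot table with a partial unique index on system rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE bot (id TEXT PRIMARY KEY, "user" TEXT, type TEXT,'
        " is_system INTEGER DEFAULT 0, gmt_deleted TEXT)"
    )
    conn.execute(
        'CREATE UNIQUE INDEX uq_system_bot ON bot ("user", type, is_system)'
        " WHERE is_system = 1 AND gmt_deleted IS NULL"
    )
    return conn


def fetch_summary_bot(conn, user: str) -> Optional[str]:
    row = conn.execute(
        'SELECT id FROM bot WHERE "user" = ? AND type = ?'
        " AND is_system = 1 AND gmt_deleted IS NULL",
        (user, "summary"),
    ).fetchone()
    return row[0] if row else None


def get_or_create_summary_bot(conn, user: str) -> str:
    """Return the user's hidden summary bot id, creating it when missing."""
    existing = fetch_summary_bot(conn, user)
    if existing is not None:
        return existing
    try:
        new_id = "bot" + uuid.uuid4().hex
        conn.execute(
            'INSERT INTO bot (id, "user", type, is_system) VALUES (?, ?, ?, 1)',
            (new_id, user, "summary"),
        )
        conn.commit()
        return new_id
    except sqlite3.IntegrityError:
        # A concurrent caller won the INSERT: roll back and use its row.
        conn.rollback()
        winner = fetch_summary_bot(conn, user)
        assert winner is not None
        return winner
```

The index, not application-level locking, is what guarantees a single row per user; the ``except`` branch only turns the loser's error into a normal return value.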
Replace the stubbed ``_invoke_summary_agent`` with a full ``agent_runtime_manager.launch_turn`` integration mirroring ``aperag/domains/evaluation/worker.py:114-180`` (real Bot/Chat/AgentTurn ORMs, fire-and-forget launch, terminal-status polling with SUMMARY_TIMEOUT_SECONDS, UIMessage-store text extraction).

Replace the stubbed ``_invoke_summary_chunks_fallback`` with a chunks.jsonl read + single-LLM-call path: pulls active vector DocumentIndex source_paths for documents in the collection, stitches representative chunks up to CHUNKS_FALLBACK_MAX_CHARS, and calls the collection LLM with a Tier-2 prompt that mirrors the agent path's voice and length contract.

Fix ``_default_llm_factory``: ``build_collection_llm_callable`` already returns an async callable; surface it directly instead of wrapping with ``asyncio.to_thread`` (which would have run the coroutine in a worker thread).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
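The ``_default_llm_factory`` fix can be illustrated in isolation. In plain asyncio, handing an async function to ``asyncio.to_thread`` only calls it in the worker thread, which creates a coroutine object that comes back un-awaited. The sketch below assumes a hypothetical ``llm_call`` standing in for the collection LLM callable; it is not the project's code.

```python
import asyncio
import inspect


async def llm_call(prompt: str) -> str:
    """Hypothetical async LLM callable used only for this demonstration."""
    await asyncio.sleep(0)
    return f"echo:{prompt}"


async def wrapped_with_to_thread(prompt: str):
    # to_thread runs llm_call(prompt) in a worker thread; for an async
    # function that call just builds a coroutine object and returns it.
    return await asyncio.to_thread(llm_call, prompt)


async def awaited_directly(prompt: str) -> str:
    # The callable is already async, so awaiting it is all that is needed.
    return await llm_call(prompt)


async def main():
    wrong = await wrapped_with_to_thread("hi")
    is_coroutine = inspect.iscoroutine(wrong)
    wrong.close()  # avoid a "coroutine was never awaited" warning
    right = await awaited_directly("hi")
    return is_coroutine, right
```

Running ``asyncio.run(main())`` shows that the wrapped variant yields a coroutine object instead of a string, while the direct variant yields the text.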
…0 §K.13)

Wrap the MCP toolset in a ``FilteredToolset`` when ``bot.type`` matches a hardcoded subset. ``BotType.SUMMARY`` gets 13 read-only tools (vector_search / fulltext_search / graph_search / read_document / get_collection_metadata / etc.); other bot types pass through with the full toolset. The mapping lives in the runtime layer (per design doc §4) so the LLM cannot route around it via system-prompt edits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
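The idea of restricting tools by bot type at the runtime layer can be sketched with a small wrapper. Everything here is hypothetical: ``Toolset``, ``SketchFilteredToolset``, and ``wrap_for_bot`` are illustration-only names, and the subset lists only the five tool names the commit message spells out, not the full 13.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

# Assumption: only the tool names quoted in the commit message are listed.
TOOL_SUBSET_BY_BOT_TYPE: Dict[str, FrozenSet[str]] = {
    "summary": frozenset(
        {
            "vector_search",
            "fulltext_search",
            "graph_search",
            "read_document",
            "get_collection_metadata",
        }
    ),
}


@dataclass(frozen=True)
class Toolset:
    """Hypothetical full toolset: tool name -> callable or handle."""

    tools: Dict[str, object]

    def names(self) -> List[str]:
        return sorted(self.tools)

    def get(self, name: str) -> object:
        return self.tools[name]


class SketchFilteredToolset:
    """Exposes only the allowed names of an inner toolset."""

    def __init__(self, inner: Toolset, allowed: FrozenSet[str]) -> None:
        self._inner = inner
        self._allowed = allowed

    def names(self) -> List[str]:
        return [n for n in self._inner.names() if n in self._allowed]

    def get(self, name: str) -> object:
        if name not in self._allowed:
            raise PermissionError(f"tool not available to this bot: {name}")
        return self._inner.get(name)


def wrap_for_bot(bot_type: str, toolset: Toolset):
    """Filter when the bot type has a subset; otherwise pass through."""
    allowed = TOOL_SUBSET_BY_BOT_TYPE.get(bot_type)
    return toolset if allowed is None else SketchFilteredToolset(toolset, allowed)
```

Because the filter is applied where tools are resolved, a prompt cannot widen the set; an unlisted tool name fails at lookup time.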
…ave 10 §K.13)

Supplementary #2: the explicit operator override paths (``POST /collections/{id}/summary/regen`` + ``description/regen``) now await the regen result inline and surface 503 + a structured error envelope when all tiers return invalid output, instead of the misleading 202 fire-and-forget. The reconciler path keeps log+skip semantics (the next sweep retries).

Chunk E: wire ``reconcile_collection_descriptions_hook`` into ``run_reconcile_loop``. Three scenarios covered:

* Stage 1 missing-summary: collections with NULL summary
* Stage 1 stale-summary: a doc was added/edited/deleted after ``summary_updated_at`` AND the latest edit is past MIN_STALE_AGE
* Stage 2 derive: summary newer than description

Lease-busy / lease-expired / soft-delete are filtered at SQL selection time so the hook only dispatches collections that are actually eligible. Stage 1 hits exclude themselves from Stage 2 to avoid wasted regen pairs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
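The three reconciler scenarios can be expressed as one classification function. This is a pure-Python sketch of the selection logic only; the real hook filters in SQL. The ``Row`` fields ``last_doc_change_at`` and the function name ``classify`` are hypothetical, and treating an expired lease as eligible is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MIN_STALE_AGE = timedelta(minutes=10)  # value quoted in regen_constants.py


@dataclass
class Row:
    """Hypothetical projection of the Collection columns the hook reads."""

    summary: Optional[str] = None
    summary_updated_at: Optional[datetime] = None
    description_updated_at: Optional[datetime] = None
    last_doc_change_at: Optional[datetime] = None
    regen_lease_owner: Optional[str] = None
    regen_lease_expires_at: Optional[datetime] = None
    gmt_deleted: Optional[datetime] = None


def classify(row: Row, now: datetime) -> Optional[str]:
    """Return "stage1", "stage2", or None when nothing should be dispatched."""
    if row.gmt_deleted is not None:
        return None  # soft-deleted collections are never regenerated
    lease_live = (
        row.regen_lease_owner is not None
        and row.regen_lease_expires_at is not None
        and row.regen_lease_expires_at > now
    )
    if lease_live:
        return None  # another worker currently holds the lease
    if row.summary is None:
        return "stage1"  # missing-summary scenario
    if (
        row.last_doc_change_at is not None
        and row.summary_updated_at is not None
        and row.last_doc_change_at > row.summary_updated_at
        and now - row.last_doc_change_at >= MIN_STALE_AGE
    ):
        return "stage1"  # stale-summary scenario
    if row.summary_updated_at is not None and (
        row.description_updated_at is None
        or row.summary_updated_at > row.description_updated_at
    ):
        return "stage2"  # derive scenario: summary newer than description
    return None
```

Returning a single label per row is what makes Stage 1 hits exclude themselves from Stage 2 in this sketch.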
…K.13)
29 new tests across 3 files covering:
* Tool subset mapping + FilteredToolset wrapping
(``test_summary_bot_tool_subset.py``, 7 tests)
* regen_summary state machine: lease busy / collection deleted /
all-tiers-invalid / Tier 1 success / Tier 1→Tier 2 fallthrough
+ regen_description state machine: summary missing / valid LLM
output. Plus the chunk picker + UIMessage text extractor pure
helpers. (``test_collection_regen_supplementary.py``, 12 tests)
* Reconciler hook scan: missing-summary / lease-held / lease-
expired / soft-deleted / doc-edit-stale / doc-edit-too-recent /
description-stale / description-current / Stage 1 excludes
Stage 2. (``test_collection_regen_reconciler.py``, 9 tests)
Companion file ``test_collection_regen_service.py`` keeps the
quality-gate + language-detection contract pins; this set adds
orchestration coverage so the merge gate (huangheng N4) is met.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
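The quality-gate and language-detection contracts pinned by the companion file can be illustrated with a short sketch. The thresholds, the marker phrases, the CJK range, and the function names below are assumptions for illustration; the real values live in ``regen_constants.py`` and are not shown in this thread.

```python
import re
from typing import Optional

# Hypothetical values, chosen only to make the sketch runnable.
MIN_ALPHA_RATIO = 0.5
ERROR_MARKERS = ("i'm sorry", "i cannot", "as an ai")

_CJK = re.compile(r"[\u4e00-\u9fff]")
_LATIN = re.compile(r"[A-Za-z]")


def detect_language(text: str) -> str:
    """Return "zh" when CJK characters outnumber Latin letters, else "en"."""
    cjk = len(_CJK.findall(text))
    latin = len(_LATIN.findall(text))
    return "zh" if cjk > latin else "en"


def is_valid_text(text: Optional[str], min_len: int, max_len: int) -> bool:
    """Length bounds + error-phrase blacklist + alphabetic-char threshold."""
    if text is None:
        return False
    stripped = text.strip()
    if not (min_len <= len(stripped) <= max_len):
        return False
    lowered = stripped.lower()
    if any(marker in lowered for marker in ERROR_MARKERS):
        return False
    # str.isalpha() is True for CJK ideographs as well as Latin letters.
    alpha = sum(ch.isalpha() for ch in stripped)
    return alpha / len(stripped) >= MIN_ALPHA_RATIO
```

A summary gate and a description gate would then differ only in the ``min_len`` / ``max_len`` bounds passed in.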
a82fb78 to cfe290d
…hs (Wave 10 §K.13)
The Wave 10 hidden per-user summary bot (`is_system=True`,
`type=summary`) was leaking through ``GET /api/v2/bots``: the
endpoint serialised every row through ``Bot`` Pydantic schema, but
``Bot.type: Literal["knowledge", "common", "agent"]`` does not
include ``"summary"`` → 400 ValidationError → e2e-http-provider
hurl test ``12_bot.hurl`` failed.
Fix at the ``db_ops`` layer with a default-deny ``exclude_system``
kwarg so all four user-facing API paths share one guard:
* ``query_bots(users, exclude_system=True)`` — list path
* ``query_bot(user, bot_id, exclude_system=True)`` — get / update /
delete paths
The Pydantic ``Bot.type`` Literal stays unchanged: summary bots are
backend implementation detail and must never reach the public API
surface.
Internal regen plumbing (``get_or_create_summary_bot_for_user``)
queries the ``Bot`` ORM directly via raw SQLAlchemy ``select``, so
the default-deny filter does not block legitimate internal
lookups.
Adds ``tests/unit_test/conversation/test_bot_service_filter_system_bots.py``
with 5 enumeration tests: list excludes, get returns 404, update
returns 404 (no mutation), delete silently ignores (idempotent
no-op), get returns user bot unchanged.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
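The default-deny ``exclude_system`` guard described in this commit can be sketched with an in-memory store. ``BotRow`` and ``BotStore`` are hypothetical stand-ins for the ORM and ``db_ops`` layer; only the keyword name and its default come from the commit message.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class BotRow:
    id: str
    user: str
    type: str
    is_system: bool = False


class BotStore:
    """In-memory stand-in for the db_ops bot queries."""

    def __init__(self, rows: Iterable[BotRow]) -> None:
        self._rows = list(rows)

    def query_bots(self, users: List[str], *, exclude_system: bool = True) -> List[BotRow]:
        # Default-deny: callers must opt in explicitly to see system bots.
        return [
            r
            for r in self._rows
            if r.user in users and not (exclude_system and r.is_system)
        ]

    def query_bot(self, user: str, bot_id: str, *, exclude_system: bool = True) -> Optional[BotRow]:
        for r in self.query_bots([user], exclude_system=exclude_system):
            if r.id == bot_id:
                return r
        return None  # user-facing callers translate this into a 404
```

Putting the default on the query layer means a new user-facing route is safe without having to remember the filter, while internal plumbing opts in with ``exclude_system=False``.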
CR by @huangheng: 🟢 LGTM ✅ (round 2, post-rebase + post-BLOCKER fix)
CI all green (per own-up #10 SOP, explicit `gh pr checks 1786` verify)
Round 2 verification
Rebase verify ✅
BLOCKER #1 fix verify ✅ (commit `bd46f92c`)
Substantive verification (round 1 retained)
12-invariant + 4-pattern + 11 mini-pattern + simple-stable 4-guardrail
simple-stable 4-guardrail
Wave 10 W9-1 conflict check
✅ Doesn't break any of #1772-#1790 (alias_redirect_store / cicd-push.yml / Makefile / openapi paths / V3→V2 / FE bot-types / vector batching / model display_name / dev/turbopack / i18n / dynamic entity types / design docs)
Verdict
🟢 LGTM ✅ round 2. Wave 10 ships in one go: schema + Bot infrastructure + Tier 1 (real agent runtime) + Tier 2 (chunks.jsonl) + tool subset + 3-tier fallback + cluster lease + reconciler + 503 silent fix + default-deny filter + 47 new tests.
Architect own-up #16 (appendix A grep miss) + #17 (hidden entity API surface enumeration) sediment to mini-pattern #12. Both lessons are captured in this PR's implementation.
Per agent ratify lane SOP (own-up #10): @符炫炜 ratify after explicit `gh pr checks 1786` re-verify (already shown 10/10 pass) → `gh pr merge 1786 --squash` (no `--auto` shortcut).
Wave 10 PR #1786 ready for architect ratify merge.
Architect ratify ✅: three-section hard-gate verdict (12-invariant + 4-pattern/12 mini-pattern incl. new #12 default-deny + simple-stable 4-guardrail) all pass. huangheng round 2 LGTM + CI 10/10 + Tier 1 canonical evaluation/worker.py pattern verified + own-up #17 default-deny enumeration fixed. Proceeding to squash merge per own-up #10 explicit verify SOP.
…+ 4-language radio (#1793)

Wave 10 §K.13 makes ``Collection.description`` (short) and ``Collection.summary`` (long) auto-generated by the backend regen pipeline. The collection create/settings form must reflect that:

* Create page (``action="add"``): drop the description input box; the user no longer types it.
* Settings page (``action="edit"``): replace the editable description textarea with a read-only display + "Regenerate description" button that calls ``POST /api/v2/collections/{id}/description/regen``. Add a parallel summary read-only block + "Regenerate summary" button calling ``POST /api/v2/collections/{id}/summary/regen``.
* Wave 11 follow-up: expand the language radio from 2 (zh-CN / en-US) to all 4 backend-supported locales (zh-CN / en-US / ja-JP / ko-KR).

The endpoints await regen inline (not 202 fire-and-forget) and surface 503 on transient skip; the FE catches the typed response body and surfaces a success toast on the 200 path.

Schema: ``description`` becomes optional in the form so existing edit-mode round-trips still validate. The Pydantic backend Bot schema's ``Literal`` is unchanged; system bots stay hidden behind the default-deny db_ops filter shipped in PR #1786.
Files:

* web/src/features/collection/types.ts: drop removed ``CollectionSummaryTriggerResponse``, add ``CollectionRegenTriggerResponse``
* web/src/features/collection/client-api.ts: new ``regenCollectionSummary`` / ``regenCollectionDescription``
* web/src/app/workspace/collections/collection-form.tsx: replace description input with two read-only blocks + regen buttons, expand language radio, mark description schema optional
* web/src/i18n/{zh-CN,en-US}/page_collections.json + merged JSONs: new keys for the auto-generated hint text, regen button labels, success toasts, and ja-JP / ko-KR language labels
* web/src/api-v2/schema.d.ts: regenerated from openapi-typescript

Local gates: `pnpm exec next build` clean (Wave 10 + Wave 11 admin TS noise pre-existing, baseline preserved); `node scripts/i18n-check.mjs` passes for both locales; `pnpm exec eslint` 0 errors on edited files.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…rrative (#1792)

* test(W10-task#6 Chunk G): collection summary/description regen e2e narrative

Wave 10 §K.13 Chunk G: end-to-end narrative validation for the summary/description regen flow that the chunked PRs A+B (#1783) + C+D+E (#1786) + design (#1790) compose into one user journey.

What lands

* Single 9-step narrative test in ``tests/integration/test_w10_e2e_summary_description_regen.py``.
* Layer 2 env-gated (``RUN_W10_E2E_NARRATIVE=1``); default pytest stays fast (9 collected + 9 skipped).
* Module-scoped fixture seeds one synthetic Collection + 3 Documents so the narrative shares state via the production data plane (Postgres ``Collection`` row + lease columns), not Python globals.

Step coverage

* step 1: a freshly-seeded Collection has summary / description / *_updated_at all NULL; the Wave 10 hard-cut shipped these as nullable Text.
* step 2: ``regen_summary`` writes ``Collection.summary`` + ``summary_updated_at`` atomically and releases the lease (Tier 1 agent / Tier 2 chunks-fallback path validated by 200 success).
* step 3: ``is_valid_summary`` / ``is_valid_description`` reject empty, short, and LLM-refusal templates; they pass substantive long text (quality gate per design §6.2).
* step 4: ``regen_description`` derives ``Collection.description`` from the now-populated ``summary``; the cheap LLM path returns True and writes ``description_updated_at``.
* step 5: the ``POST /api/v2/collections/{id}/summary/regen`` route is exercised via ``regen_collection_summary_view.__wrapped__`` (the ``@audit`` decorator wraps the view); asserts the ``CollectionRegenTriggerResponse`` shape + ``stage="summary"`` + uuid task_id.
* step 6: ``POST /description/regen`` on a fresh no-summary collection raises ``HTTPException(400)`` with ``"summary"`` in detail (design §9 + §10.4: Stage 2 cannot run without input).
* step 7: ``reconcile_collection_descriptions_hook`` picks up a collection whose Document was edited past ``MIN_STALE_AGE`` and dispatches at least one regen task; this proves the §K.13 Chunk E hook is wired into the reconciler main loop.
* step 8: lease-busy state. Writing ``regen_lease_owner`` + a far-future ``regen_lease_expires_at`` directly causes ``regen_summary`` to return False without overwriting the row (design §7 atomic semantics).
* step 9: failure-mode fold-in (mirrors Wave 7 task #11 step 9 + design §10.9). Patching ``_default_llm_factory`` to raise makes ``regen_description`` return False; the row's ``description`` and ``description_updated_at`` are NOT mutated. Pins the no-silent-write contract end-to-end.

12-invariant table

Mostly n/a: narrative-correctness is the hard gate for an e2e PR; material invariants are validated implicitly:

* §10.1 lease atomic semantics: step 8
* §10.2 3-tier fallback chain: step 2 (agent / chunks-fallback path alive; transient-skip exercised by step 8 indirectly)
* §10.4 API 400 reject when summary IS NULL: step 6
* §10.5 quality gate ``is_valid_summary`` / ``is_valid_description``: step 3
* §10.6 trigger three scenarios (edit case end-to-end): step 7
* §10.9 silent-failure fix: step 9

4-pattern pre-check matrix

* Pattern 1 v1: ``regen_summary`` / ``regen_description`` importable from ``aperag/domains/knowledge_base/service/collection_regen_service.py`` (Chunk C). ✅
* Pattern 1 v2: 6 ``Collection`` columns (Wave 10 Chunk A) are read by the narrative. ✅
* Pattern 2: the ``reconcile_collection_descriptions_hook`` invocation return value is a non-zero ``dispatched`` count (Chunk E wired). ✅
* Pattern 3: route surface ``regen_collection_summary_view`` / ``regen_collection_description_view`` exposed on the knowledge_base router (Chunk D). ✅

simple-stable 4-guardrail

* #1 no unbounded scope growth: one file, no production code change.
* #2 make the feature solid first: real Postgres + real provider; the narrative validates production behaviour, not a stubbed surface.
* #3 simple and stable: one happy-path narrative + one 400-reject pin + one failure-mode step. Not a regression matrix.
* #4 maintenance-free private deployment: env-var-gated; the CI Wave 10 lane flips it on, local dev stays fast by default.

Local verification

* ``uv run pytest tests/integration/test_w10_e2e_summary_description_regen.py --collect-only`` → 9 collected.
* ``uv run pytest`` (default gate off) → 9 skipped.
* ``uv run ruff check`` clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: ruff format Wave 10 e2e narrative test

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…found) (#1822)

PR #1786 turned ``query_bot``'s default into ``exclude_system=True``, but ``collection_regen_service`` Stage 1 Tier 1 still routes through ``chat_service.create_chat`` + ``TurnService.get_chat_and_bot``, both of which call ``query_bot`` and additionally enforce ``bot.type == AGENT``. The hidden ``BotType.SUMMARY`` bot fails both gates, surfacing as ``Bot not found: bot…`` 404 toasts on the FE Regen button. The Tier 2 fallback never runs because the exception propagates up out of ``_invoke_summary_agent``.

Add a ``_allow_system_bot`` keyword on both methods (default ``False`` keeps the user-facing API safe). When ``True``, pass ``exclude_system=False`` to ``query_bot`` and accept SUMMARY in addition to AGENT; both share the agent-runtime infrastructure (the SUMMARY toolset is restricted by ``aperag/domains/agent_runtime`` based on ``bot.type``). ``_invoke_summary_agent`` now opts in.

Tests: 6 new unit tests covering both the default-deny and ``_allow_system_bot=True`` branches on ``create_chat`` / ``get_chat_and_bot`` / ``create_or_get_turn``.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
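The two gates and the opt-in keyword can be sketched as below. This is an illustration only: ``Bot``, ``query_bot``, and ``get_bot_for_turn`` are simplified hypothetical stand-ins for the project's ORM, ``db_ops`` query, and service methods, and the exception types are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Bot:
    id: str
    type: str
    is_system: bool = False


def query_bot(bots: Dict[str, Bot], bot_id: str, *, exclude_system: bool = True) -> Optional[Bot]:
    """Gate 1: the default-deny lookup hides system bots."""
    bot = bots.get(bot_id)
    if bot is None or (exclude_system and bot.is_system):
        return None
    return bot


def get_bot_for_turn(bots: Dict[str, Bot], bot_id: str, *, _allow_system_bot: bool = False) -> Bot:
    """Resolve a bot for a chat turn; internal callers may opt in to system bots."""
    bot = query_bot(bots, bot_id, exclude_system=not _allow_system_bot)
    if bot is None:
        raise LookupError(f"Bot not found: {bot_id}")
    # Gate 2: the type check widens to SUMMARY only on the opt-in path.
    allowed_types = {"agent", "summary"} if _allow_system_bot else {"agent"}
    if bot.type not in allowed_types:
        raise ValueError(f"Unsupported bot type: {bot.type}")
    return bot
```

Keeping the default at ``False`` preserves the user-facing behaviour, so only the regen service, which passes ``_allow_system_bot=True``, can reach the hidden bot.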
Summary
Wave 10 Chunks C+D — collection auto-description two-stage regen pipeline service + OpenAPI endpoints. Builds on PR #1783 (Chunks A+B, merged).
Scaffolded: the Tier 1 agent-runtime invocation + Tier 2 chunks.jsonl fallback both return ``None`` today, forcing the 3-tier fallback chain to land on Tier 3 (transient skip). The Wave 10.1 follow-up wires the actual agent-runtime headless call (per huangheng N2 sediment + design appendix A) and the chunks.jsonl read primitive. The lease + state machine + quality gates + 2 OpenAPI endpoints + reconciler integration interface are fully working.

What's in this PR

``aperag/domains/knowledge_base/service/collection_regen_service.py`` (new, ~270 LOC)

* ``regen_summary(collection_id)``: Stage 1 with cluster lease + 3-tier fallback
* ``regen_description(collection_id)``: Stage 2 cheap LLM derive
* ``is_valid_summary`` / ``is_valid_description``: quality gates (per huangheng N4)
* ``_detect_language``: language-aware Stage 2 prompt selection (per huangheng N1)
* ``_try_acquire_lease`` / ``_release_lease``: atomic UPDATE on the Collection row

``aperag/domains/knowledge_base/service/regen_constants.py`` (new, ~140 LOC)

2 OpenAPI endpoints in ``aperag/domains/knowledge_base/api/routes.py``

* ``POST /collections/{id}/summary/regen`` → 202/404, dispatches Stage 1 fire-and-forget
* ``POST /collections/{id}/description/regen`` → 400 if summary IS NULL (honest reject per huangheng API contract), 202/404 otherwise

``CollectionRegenTriggerResponse`` schema

* 202 envelope: ``collection_id`` + ``stage`` + ``task_id`` + ``estimated_completion_seconds``

14 unit tests

* ``tests/unit_test/knowledge_base/test_collection_regen_service.py``: quality-gate + language-detection contracts.

What lands subsequently

* ``reconcile_collection_descriptions_hook`` wired into the 30s reconciler

12-invariant cross-check

4-pattern + 11 mini-pattern

* ``Collection`` ORM
* ``LLMCall = Callable[[str], Awaitable[str]]`` matches the ``build_collection_llm_callable`` shape

Simple-stable 4-guardrail

Test plan

* ``uv run pytest tests/unit_test/``: 1186 pass + 15 skipped
* ``ruff format --check`` / ``ruff check``: clean

🤖 Generated with Claude Code