
docs(modularization): Wave 10 collection summary design (c1-extend-hide) #1790

Merged
earayu merged 2 commits into main from docs/wave10-collection-summary-design
Apr 28, 2026

Collaborator

@earayu earayu commented Apr 28, 2026

Summary

Key decisions

  • Minimal schema: Bot.is_system Boolean + (user, type, is_system) unique index + a new BotType.SUMMARY member on the Python enum (0 DB schema changes)
  • Register hook reuses _BotInitOpsAdapter.create_default_bot_for_user and adds a sibling method that creates the summary bot in the same transaction
  • Tool subset is keyed on bot.type=summary → hardcoded mapping to 13 read-only tools in the agent runtime
  • Defensive lazy fallback self-heals the edge case where the register hook fails (user_manager.py:137 does not roll back the user)
  • Two-stage: agent → summary (canonical) → derive → description (short)
  • Three-tier fallback: Tier 1 agent runtime → Tier 2 chunks.jsonl → Tier 3 transient skip

Test plan

🤖 Generated with Claude Code

earayu and others added 2 commits April 28, 2026 17:44
…ide locked

Design for collection summary auto-regen with per-user hidden summary bot.
Captures three-way ratify (architect + Bryce + huangheng) + earayu2 final pin
on the (c1-extend-hide + defensive lazy fallback) approach. The PR #1786 amendment implements it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per W9-1 V3→V2 rename, ApeRAG OpenAPI base is /api/v2.
huangheng CR nit fold-in (msg=032421e1).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 28, 2026
Wave 10 §K.13 — per-user hidden summary bot for collection regen
Stage 1 agent-runtime free-explore. Per ratified design (PR #1790
+ thread msg=d6f5e819 ratify):

* **(c1-extend-hide)** main path: register-time creation via
  ``_BotInitOpsAdapter`` extension (same transaction as default
  agent bot, both succeed-or-both-fail at register time).
* **Defense-in-depth lazy fallback**: ``collection_regen_service``
  Stage 1 will use ``get_or_create_summary_bot_for_user`` so a
  user that registered before this PR landed (or whose register
  hook silently failed per ``user_manager.py:137``) still gets a
  bot on first regen attempt.

## Schema (1 column + 1 partial-unique index + Python enum value + backfill)

* ``Bot.is_system: Boolean default False`` — mirrors existing
  ``ApiKey.is_system`` precedent (governance/db/models.py:98). UI
  listings filter ``is_system=True`` rows out so the summary bot
  is invisible to end users.
* Partial unique index ``(user, type, is_system)`` over active
  (``gmt_deleted IS NULL``) ``is_system=TRUE`` rows — defends
  against race conditions during register-time + lazy-create.
* ``BotType.SUMMARY = "summary"`` — Python enum addition only;
  ``Bot.type`` is already ``VARCHAR(50)`` (per
  ``_enum_column(BotType)`` shape) so 0 DDL change for the
  enum value.
* Alembic data migration ``f2c3d4e5b6a8`` backfills one summary
  bot row per existing user that doesn't already have one. Idempotent
  ``WHERE NOT EXISTS`` guard so the migration is safe to re-run.
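
As a rough illustration of the two schema ideas above (a partial unique index over active system rows, and an idempotent ``WHERE NOT EXISTS`` backfill), here is a minimal sketch using the standard-library ``sqlite3`` module rather than the project's Alembic/Postgres stack. The ``app_user`` table, the index name, and the helper names are assumptions made for this example, not taken from the migration itself.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns named in the commit message.
SCHEMA = """
CREATE TABLE app_user (id TEXT PRIMARY KEY);
CREATE TABLE bot (
    id INTEGER PRIMARY KEY,
    user TEXT NOT NULL,
    type TEXT NOT NULL,
    is_system BOOLEAN NOT NULL DEFAULT 0,
    gmt_deleted TEXT
);
-- Partial unique index: only active (not soft-deleted) system rows are constrained.
CREATE UNIQUE INDEX uq_bot_user_type_system
    ON bot (user, type, is_system)
    WHERE gmt_deleted IS NULL AND is_system = 1;
"""

# Idempotent backfill: insert one summary bot per user that lacks an active one.
BACKFILL = """
INSERT INTO bot (user, type, is_system)
SELECT u.id, 'summary', 1
FROM app_user AS u
WHERE NOT EXISTS (
    SELECT 1 FROM bot AS b
    WHERE b.user = u.id
      AND b.type = 'summary'
      AND b.is_system = 1
      AND b.gmt_deleted IS NULL
)
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two users; alice already has a bot."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO app_user (id) VALUES ('alice'), ('bob')")
    conn.execute("INSERT INTO bot (user, type, is_system) VALUES ('alice', 'summary', 1)")
    conn.commit()
    return conn


def backfill(conn: sqlite3.Connection) -> None:
    """Run the backfill; re-running it must not add duplicate rows."""
    conn.execute(BACKFILL)
    conn.commit()


def count_summary_bots(conn: sqlite3.Connection) -> int:
    """Count active system summary bots across all users."""
    row = conn.execute(
        "SELECT COUNT(*) FROM bot"
        " WHERE type = 'summary' AND is_system = 1 AND gmt_deleted IS NULL"
    ).fetchone()
    return row[0]
```

Running ``backfill`` twice leaves exactly one active summary bot per user, and a direct duplicate insert for the same user is rejected by the index with ``sqlite3.IntegrityError``.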

## Register-hook extension (``aperag/app.py``)

``_BotInitOpsAdapter.create_default_bot_for_user`` now also calls
``_create_summary_bot_for_user`` after creating the default agent
bot. The new method:

* Bypasses ``bot_service.create_bot`` because (a) ``BotCreate.type``
  is ``Literal["agent"]`` so the public schema cannot express
  ``"summary"``, and (b) system bots intentionally skip user-quota /
  user-visibility logic.
* Uses ``get_async_session`` + direct ORM insert; rollback on
  ``IntegrityError`` so race-conditioned concurrent registers /
  the backfill migration don't crash.
* Tool subset enforcement (13 read-only tools) lives in agent
  runtime, NOT on the Bot row — keeps schema minimal per
  simple-stable directive #1.

## What's NOT in this commit (deferred to subsequent commits in this PR)

This commit is the **bot infrastructure scaffolding only**. The
remaining Wave 10 §K.13 work (Tier 1 agent runtime invocation +
Tier 2 chunks.jsonl real impl + ``get_or_create_summary_bot_for_user``
service in ``collection_regen_service`` + supplementary #1 tests +
supplementary #2 silent-failure fix + Chunk E reconciler hook)
lands in subsequent commits on this same PR before merge, per
earayu2's ratified "get it all done in one pass" directive.

## 12-invariant + 4-pattern + simple-stable 4-guardrail

* #10 DB column cap: n/a this commit (no new variable-length data).
* #12 grep-zero LightRAG: clean.
* Pattern 2 (state binding): ``Bot.is_system`` ORM + alembic both
  updated atomically in this commit.
* Mini #4 (DTO names): no new DTO, reuse existing ``Bot`` /
  ``BotType`` / ``BotCreate``.
* Mini #5 (dependency interface signatures): grepped existing
  register-time call site (``aperag/app.py:171-186``); extended in
  place rather than carving a new init op.
* Mini #7 (grep before adding X-similar): ``ApiKey.is_system``
  precedent verified before adding ``Bot.is_system``.
* simple-stable #1 (don't expand scope without limit): 1 column + 1 index, no new
  service / no new endpoint / no schema for tool subset.
* simple-stable #4 (maintenance-free private deployment): backfill data migration
  ensures existing deployments get summary bots automatically on
  upgrade, with no operator setup.

## Test plan

- [x] ``uv run pytest tests/unit_test/`` — 1186 pass + 15 skipped
- [x] ``alembic upgrade e1a2b3c4d5f6:f2c3d4e5b6a8 --sql`` emits
      ``ALTER TABLE bot ADD COLUMN is_system`` + 2 ``CREATE INDEX``
      + ``INSERT INTO bot`` backfill SELECT
- [x] ``ruff check`` clean
- [ ] e2e CI gates (post-push)
- [ ] Subsequent commits (Tier 1+2 + tests + Chunk E) before CR
Collaborator Author

earayu commented Apr 28, 2026

Architect ratify ✅ doc-only PR — three-way design lock (architect + Bryce + huangheng) + earayu2 final ratify (msg=d6f5e819) all captured. CR LGTM (huangheng msg=032421e1) + v1→v2 nit fold-in commit 6f158b1. CI 10/10 green. Proceeding to merge per own-up #10 explicit verify SOP.

@earayu earayu merged commit d99e82b into main Apr 28, 2026
10 checks passed
@earayu earayu deleted the docs/wave10-collection-summary-design branch April 28, 2026 09:55
earayu added a commit that referenced this pull request Apr 28, 2026
…oints (#1786)

* feat(collection): regen service + 2 OpenAPI endpoints (Wave 10 Chunks C+D)

Wave 10 §K.13 Chunks C+D — collection auto-description two-stage
regen pipeline.

## New module: ``collection_regen_service.py``

* **``regen_summary(collection_id)``** — Stage 1 with cluster-level
  lease + 3-tier fallback chain (per huangheng BLOCKER #2):
    Tier 1: ``_invoke_summary_agent`` — agent-runtime free-explore
            (scaffolded, returns ``None`` today; Wave 10.1 follow-up
            wires the headless ``agent_runtime_manager.launch_turn``)
    Tier 2: ``_invoke_summary_chunks_fallback`` — chunks.jsonl 1st
            substantive chunk + LLM call (scaffolded, Wave 10.1)
    Tier 3: transient skip (no writeback, reconciler retries next sweep)
* **``regen_description(collection_id)``** — Stage 2 cheap LLM derive
  from existing ``Collection.summary``. Single LLM call, ~10s.
* **Lease primitives** — ``_try_acquire_lease`` / ``_release_lease``
  use ``Collection.regen_lease_owner`` + ``regen_lease_expires_at``
  (Chunk A schema). Atomic UPDATE acquires, expired leases self-
  reclaim, concurrent instances race exactly one winner per collection.
* **Quality gates** — ``is_valid_summary`` / ``is_valid_description``
  enforce length + LLM-error blacklist + alphabetic-char threshold
  (per huangheng N4 dual-gate).
* **Language detection** — ``_detect_language`` returns ``"zh"`` /
  ``"en"`` based on CJK-vs-Latin char count, drives Stage 2 prompt
  template selection (per huangheng N1 language-aware).
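
The lease primitives above can be pictured with a small sketch: a single conditional ``UPDATE`` both checks and takes the lease, so exactly one concurrent caller sees a row count of 1. This uses ``sqlite3`` and float timestamps purely for illustration; the function signatures and the owner-checked release are assumptions, not the service's actual code.

```python
import sqlite3

LEASE_TTL_SECONDS = 900  # matches the LEASE_TTL value quoted in this commit


def make_db() -> sqlite3.Connection:
    """In-memory table holding only the two lease columns named above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE collection ("
        " id TEXT PRIMARY KEY,"
        " regen_lease_owner TEXT,"
        " regen_lease_expires_at REAL)"
    )
    conn.execute("INSERT INTO collection (id) VALUES ('col-1')")
    conn.commit()
    return conn


def try_acquire_lease(conn: sqlite3.Connection, collection_id: str, owner: str, now: float) -> bool:
    """Atomically take the lease if it is free or already expired."""
    cur = conn.execute(
        "UPDATE collection"
        " SET regen_lease_owner = ?, regen_lease_expires_at = ?"
        " WHERE id = ?"
        " AND (regen_lease_owner IS NULL OR regen_lease_expires_at < ?)",
        (owner, now + LEASE_TTL_SECONDS, collection_id, now),
    )
    conn.commit()
    # Exactly one row updated means this caller won the race.
    return cur.rowcount == 1


def release_lease(conn: sqlite3.Connection, collection_id: str, owner: str) -> None:
    """Clear the lease, but only if this owner still holds it (assumed behaviour)."""
    conn.execute(
        "UPDATE collection"
        " SET regen_lease_owner = NULL, regen_lease_expires_at = NULL"
        " WHERE id = ? AND regen_lease_owner = ?",
        (collection_id, owner),
    )
    conn.commit()
```

Because the eligibility check lives in the ``WHERE`` clause, no separate read-then-write step exists for two instances to interleave.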

## New module: ``regen_constants.py``

Pinned thresholds + prompts + tool subset:
* ``BULK_THRESHOLD = 10`` / ``DEBOUNCE_WINDOW = 60min`` /
  ``MIN_STALE_AGE = 10min`` (per earayu2 msg=1b395cae, the moderately conservative option).
* ``LEASE_TTL = 900s``.
* ``SUMMARY_AGENT_SYSTEM_PROMPT`` — hard-coded (not user-configurable
  per huangheng Q1.2), language-aware output (zh 5000-10000 / en
  3000-7000), 13 read-only tools subset (per huangheng Q1.4).
* ``DESCRIPTION_DERIVE_PROMPT_ZH`` / ``DESCRIPTION_DERIVE_PROMPT_EN``
  — Stage 2 derive prompts.
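
To make the constants and the language-aware prompt selection concrete, here is a hedged sketch. The numeric values come from the list above; the prompt bodies are placeholders (the real wording is not shown in this commit message), and the CJK range and tie-breaking rule in ``detect_language`` are assumptions.

```python
from datetime import timedelta

# Values quoted in the commit message.
BULK_THRESHOLD = 10
DEBOUNCE_WINDOW = timedelta(minutes=60)
MIN_STALE_AGE = timedelta(minutes=10)
LEASE_TTL = timedelta(seconds=900)

# Placeholder templates; the real prompt text is not part of this sketch.
DESCRIPTION_DERIVE_PROMPT_ZH = "<zh derive prompt placeholder>\n{summary}"
DESCRIPTION_DERIVE_PROMPT_EN = "<en derive prompt placeholder>\n{summary}"


def detect_language(text: str) -> str:
    """Return "zh" when CJK characters outnumber ASCII letters, else "en".

    Assumption: only the CJK Unified Ideographs block is counted, and ties
    (including empty input) fall back to "en".
    """
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return "zh" if cjk > latin else "en"


def pick_description_prompt(summary: str) -> str:
    """Select the Stage 2 template by detected language and fill in the summary."""
    if detect_language(summary) == "zh":
        template = DESCRIPTION_DERIVE_PROMPT_ZH
    else:
        template = DESCRIPTION_DERIVE_PROMPT_EN
    return template.format(summary=summary)
```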

## New OpenAPI endpoints (in ``api/routes.py``)

* **``POST /api/v2/collections/{id}/summary/regen``** — Stage 1
  trigger; 202 Accepted, 404 collection-not-found, 409 lease-busy,
  403 permission-denied. Dispatches as fire-and-forget asyncio task.
* **``POST /api/v2/collections/{id}/description/regen``** — Stage 2
  trigger; **400 if summary IS NULL** (must regen summary first —
  per huangheng API contract honest reject), 202 otherwise.

## New schema: ``CollectionRegenTriggerResponse``

202 Accepted envelope: ``collection_id`` + ``stage`` (summary |
description) + ``task_id`` + ``estimated_completion_seconds``.
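
For readers who want to picture the envelope, the following is a stand-in built on a standard-library dataclass. The four field names come from the description above; the field types, the use of a UUID string for ``task_id``, and the ``build_trigger_response`` helper are assumptions for illustration only.

```python
import uuid
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CollectionRegenTriggerResponse:
    """Stand-in for the 202 Accepted envelope (types are assumed)."""

    collection_id: str
    stage: str  # "summary" or "description"
    task_id: str
    estimated_completion_seconds: int


def build_trigger_response(
    collection_id: str, stage: str, estimated_completion_seconds: int
) -> CollectionRegenTriggerResponse:
    """Hypothetical helper: validate the stage and mint a fresh task id."""
    if stage not in ("summary", "description"):
        raise ValueError(f"unknown stage: {stage!r}")
    return CollectionRegenTriggerResponse(
        collection_id=collection_id,
        stage=stage,
        task_id=str(uuid.uuid4()),
        estimated_completion_seconds=estimated_completion_seconds,
    )
```

``asdict(build_trigger_response("col-1", "description", 10))`` then yields a plain dict with the four keys, ready for JSON serialisation.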

## Tests (14 new pure-Python unit tests)

``tests/unit_test/knowledge_base/test_collection_regen_service.py``
covers quality-gate + language-detection contracts. Lease + DB flow
+ OpenAPI integration are covered by Chunk E reconciler tests +
Chunk G e2e narrative test (those need live DB fixtures).

## What lands in subsequent chunks

* **Chunk E**: ``reconcile_collection_descriptions_hook`` wired into
  the 30s reconciler — scans ``Collection`` rows by doc-change delta
  and dispatches Stage 1 + Stage 2 per design pseudocode.
* **Chunk F**: frontend collection-form removes description input +
  adds placeholder.
* **Chunk G**: e2e narrative test mirroring the Wave 7 task #11 pattern.
* **Wave 10.1 follow-up**: fill in Tier 1 agent-runtime invocation +
  Tier 2 chunks.jsonl fallback (currently scaffolded with explicit
  ``return None`` so the 3-tier fallback contract still holds —
  every regen today returns the Tier 3 transient skip until 10.1
  ships, which is honest behaviour and doesn't lie about progress).
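
The 3-tier fallback contract described above can be sketched as a loop over invokers that each return text or ``None``. The function name, the tier labels, and the choice to treat a raising tier like a ``None`` result are assumptions for this example.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence, Tuple

# Each tier takes a collection id and returns candidate text, or None on failure.
Invoker = Callable[[str], Awaitable[Optional[str]]]


async def run_fallback_chain(
    collection_id: str,
    tiers: Sequence[Tuple[str, Invoker]],
    is_valid: Callable[[str], bool],
) -> Tuple[str, Optional[str]]:
    """Try each tier in order; return the first output that passes the gate.

    If every tier yields None or invalid text, report the transient skip so
    the caller writes nothing back and a later sweep can retry.
    """
    for name, invoke in tiers:
        try:
            candidate = await invoke(collection_id)
        except Exception:
            # Assumption: an exception in one tier just falls through to the next.
            candidate = None
        if candidate is not None and is_valid(candidate):
            return name, candidate
    return "tier3_transient_skip", None
```

With both Tier 1 and Tier 2 scaffolded to return ``None``, this loop always ends at the transient skip, which matches the behaviour the bullet above describes.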

## 12-invariant cross-check

* **#10 DB column cap**: enforced via prompt + quality-gate
  (5000-10000 / 200-500 chars) per spec §K.13.
* **#12 grep-zero**: no LightRAG references introduced.
* All other invariants: n/a (pure new code in collection-regen lane).
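
The quality gate mentioned under #10 combines three checks (length, an LLM-error blacklist, and an alphabetic-character threshold). A minimal sketch follows; the character ranges are the ones quoted above, while the blacklist phrases and the 0.5 ratio are hypothetical values chosen for illustration.

```python
from typing import Optional

SUMMARY_CHARS = (5000, 10000)    # range quoted in the invariant above
DESCRIPTION_CHARS = (200, 500)   # range quoted in the invariant above

# Hypothetical values: the real blacklist and threshold are not shown here.
ERROR_MARKERS = ("i'm sorry", "i cannot", "as an ai")
MIN_ALPHA_RATIO = 0.5


def passes_quality_gate(text: Optional[str], min_chars: int, max_chars: int) -> bool:
    """Apply the length, blacklist, and alphabetic-ratio checks in order."""
    if text is None:
        return False
    stripped = text.strip()
    if not stripped or not (min_chars <= len(stripped) <= max_chars):
        return False
    lowered = stripped.lower()
    if any(marker in lowered for marker in ERROR_MARKERS):
        return False
    # str.isalpha() is True for CJK characters as well as Latin letters.
    alpha = sum(1 for ch in stripped if ch.isalpha())
    return alpha / len(stripped) >= MIN_ALPHA_RATIO


def is_valid_summary(text: Optional[str]) -> bool:
    return passes_quality_gate(text, *SUMMARY_CHARS)


def is_valid_description(text: Optional[str]) -> bool:
    return passes_quality_gate(text, *DESCRIPTION_CHARS)
```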

## 4-pattern + 11 mini-pattern pre-check

* Pattern 2 (state binding): lease columns from Chunk A wired into
  service primitives + reconciler hook (Chunk E).
* Mini #4 (DTO names): reuse ``Collection`` ORM, no new DTO.
* Mini #5 (dependency interface signatures): ``LLMCall = Callable[[str], Awaitable[str]]`` matches ``aperag.indexing.llm.build_collection_llm_callable`` shape (sync wrapped via ``asyncio.to_thread``).
* Mini #7 (grep before adding X): grepped existing services for
  similar regen patterns — none, this is a clean new domain service.
* Mini #10 (trigger 3-scenario): the 3-tier fallback chain explicitly
  enumerates agent-failure / chunks-failure / input-not-ready paths.

## Simple-stable 4-guardrail

* #1 Don't expand scope without limit: 2 new modules + 2 endpoints + 1 schema, scaffolded
  Tier 1+2 to avoid coupling to agent-runtime headless API formalisation
  (huangheng N2 sediment, deferred to Wave 11).
* #2 Ship as soon as possible: independent ship, unblocks Chunk E (reconciler) which
  needs the regen_summary / regen_description entry points.
* #3 Simple and stable: clear 3-tier fallback contract, lease uniqueness enforced
  via UPDATE atomicity (no extra serialisation infrastructure).
* #4 Maintenance-free private deployment: 0 operator config; thresholds + prompts are
  hard-coded per design lock; LLM uses collection's existing
  completion model.

## Test plan

- [x] ``uv run pytest tests/unit_test/`` — 1186 pass + 15 skipped
- [x] ``ruff format --check`` / ``ruff check`` — clean on touched files
- [ ] e2e CI gates (post-push)
- [ ] CR by @huangheng
- [ ] Architect ratify after CI green

* feat(bot): summary bot infrastructure (Wave 10 §K.13)

Wave 10 §K.13 — per-user hidden summary bot for collection regen
Stage 1 agent-runtime free-explore. Per ratified design (PR #1790
+ thread msg=d6f5e819 ratify):

* **(c1-extend-hide)** main path: register-time creation via
  ``_BotInitOpsAdapter`` extension (same transaction as default
  agent bot, both succeed-or-both-fail at register time).
* **Defense-in-depth lazy fallback**: ``collection_regen_service``
  Stage 1 will use ``get_or_create_summary_bot_for_user`` so a
  user that registered before this PR landed (or whose register
  hook silently failed per ``user_manager.py:137``) still gets a
  bot on first regen attempt.

## Schema (1 column + 1 partial-unique index + Python enum value + backfill)

* ``Bot.is_system: Boolean default False`` — mirrors existing
  ``ApiKey.is_system`` precedent (governance/db/models.py:98). UI
  listings filter ``is_system=True`` rows out so the summary bot
  is invisible to end users.
* Partial unique index ``(user, type, is_system)`` over active
  (``gmt_deleted IS NULL``) ``is_system=TRUE`` rows — defends
  against race conditions during register-time + lazy-create.
* ``BotType.SUMMARY = "summary"`` — Python enum addition only;
  ``Bot.type`` is already ``VARCHAR(50)`` (per
  ``_enum_column(BotType)`` shape) so 0 DDL change for the
  enum value.
* Alembic data migration ``f2c3d4e5b6a8`` backfills one summary
  bot row per existing user that doesn't already have one. Idempotent
  ``WHERE NOT EXISTS`` guard so the migration is safe to re-run.

## Register-hook extension (``aperag/app.py``)

``_BotInitOpsAdapter.create_default_bot_for_user`` now also calls
``_create_summary_bot_for_user`` after creating the default agent
bot. The new method:

* Bypasses ``bot_service.create_bot`` because (a) ``BotCreate.type``
  is ``Literal["agent"]`` so the public schema cannot express
  ``"summary"``, and (b) system bots intentionally skip user-quota /
  user-visibility logic.
* Uses ``get_async_session`` + direct ORM insert; rollback on
  ``IntegrityError`` so race-conditioned concurrent registers /
  the backfill migration don't crash.
* Tool subset enforcement (13 read-only tools) lives in agent
  runtime, NOT on the Bot row — keeps schema minimal per
  simple-stable directive #1.

## What's NOT in this commit (deferred to subsequent commits in this PR)

This commit is the **bot infrastructure scaffolding only**. The
remaining Wave 10 §K.13 work (Tier 1 agent runtime invocation +
Tier 2 chunks.jsonl real impl + ``get_or_create_summary_bot_for_user``
service in ``collection_regen_service`` + supplementary #1 tests +
supplementary #2 silent-failure fix + Chunk E reconciler hook)
lands in subsequent commits on this same PR before merge, per
earayu2's ratified "get it all done in one pass" directive.

## 12-invariant + 4-pattern + simple-stable 4-guardrail

* #10 DB column cap: n/a this commit (no new variable-length data).
* #12 grep-zero LightRAG: clean.
* Pattern 2 (state binding): ``Bot.is_system`` ORM + alembic both
  updated atomically in this commit.
* Mini #4 (DTO names): no new DTO, reuse existing ``Bot`` /
  ``BotType`` / ``BotCreate``.
* Mini #5 (dependency interface signatures): grepped existing
  register-time call site (``aperag/app.py:171-186``); extended in
  place rather than carving a new init op.
* Mini #7 (grep before adding X-similar): ``ApiKey.is_system``
  precedent verified before adding ``Bot.is_system``.
* simple-stable #1 (don't expand scope without limit): 1 column + 1 index, no new
  service / no new endpoint / no schema for tool subset.
* simple-stable #4 (maintenance-free private deployment): backfill data migration
  ensures existing deployments get summary bots automatically on
  upgrade, with no operator setup.

## Test plan

- [x] ``uv run pytest tests/unit_test/`` — 1186 pass + 15 skipped
- [x] ``alembic upgrade e1a2b3c4d5f6:f2c3d4e5b6a8 --sql`` emits
      ``ALTER TABLE bot ADD COLUMN is_system`` + 2 ``CREATE INDEX``
      + ``INSERT INTO bot`` backfill SELECT
- [x] ``ruff check`` clean
- [ ] e2e CI gates (post-push)
- [ ] Subsequent commits (Tier 1+2 + tests + Chunk E) before CR

* feat(collection): get_or_create_summary_bot_for_user lazy fallback (Wave 10 §K.13)

Wave 10 §K.13 — defense-in-depth fallback for the per-user summary
bot whose main creation point is the register-time hook
``_BotInitOpsAdapter.create_default_bot_for_user``.

The register hook in ``user_manager.py:137`` only calls ``log.error`` on
init failures; it does NOT roll back the user. So a user can
register successfully but lack their summary bot. ``get_or_create_summary_bot_for_user``
fetches by ``(user, type=SUMMARY, is_system=True, gmt_deleted IS NULL)``
and lazy-creates if missing. The partial unique index (Chunk B
schema migration ``f2c3d4e5b6a8``) handles concurrent race
conditions: one caller wins the INSERT, the other rolls back and
re-fetches the winner's row.

This commit also wires ``_invoke_summary_agent`` to call the helper
so the bot infrastructure is reachable end-to-end. The actual
``agent_runtime_manager.launch_turn`` invocation still falls
through to Tier 2 (next commit on this PR ships the full launch_turn
flow mirroring ``aperag/domains/evaluation/worker.py:114-180``).

## Test plan

- [x] ``uv run pytest tests/unit_test/knowledge_base/`` — 14 pass
- [x] ``ruff check`` clean
- [ ] CR by @huangheng (queue held until full Wave 10 amend ready)

* feat(collection): Tier 1 + Tier 2 real invokers (Wave 10 §K.13)

Replace the stubbed ``_invoke_summary_agent`` with a full
``agent_runtime_manager.launch_turn`` integration mirroring
``aperag/domains/evaluation/worker.py:114-180`` (real Bot/Chat/AgentTurn
ORMs, fire-and-forget launch, terminal-status polling with
SUMMARY_TIMEOUT_SECONDS, UIMessage-store text extraction).

Replace the stubbed ``_invoke_summary_chunks_fallback`` with a
chunks.jsonl read + single-LLM-call path: pulls active vector
DocumentIndex source_paths for documents in the collection,
stitches representative chunks up to CHUNKS_FALLBACK_MAX_CHARS, and
calls the collection LLM with a Tier-2 prompt that mirrors the
agent path's voice and length contract.

Fix ``_default_llm_factory``: ``build_collection_llm_callable``
already returns an async callable; surface it directly instead of
wrapping with ``asyncio.to_thread`` (which would have run the
coroutine in a worker thread).
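
A small standalone demonstration of why that wrapper was a bug: in plain asyncio, passing an ``async def`` function to ``asyncio.to_thread`` only calls it in the worker thread, which produces an un-awaited coroutine object instead of the result. ``fake_llm`` is a hypothetical stand-in for the callable returned by ``build_collection_llm_callable``.

```python
import asyncio


async def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an already-async LLM callable."""
    return f"echo: {prompt}"


async def wrapped_call(prompt: str):
    # Mistaken pattern: the worker thread merely calls fake_llm(prompt),
    # which creates a coroutine object and never runs its body.
    return await asyncio.to_thread(fake_llm, prompt)


async def direct_call(prompt: str) -> str:
    # Fixed pattern: await the async callable directly.
    return await fake_llm(prompt)


async def main():
    wrapped = await wrapped_call("hi")
    got_coroutine = asyncio.iscoroutine(wrapped)
    wrapped.close()  # avoid a "coroutine was never awaited" warning
    direct = await direct_call("hi")
    return got_coroutine, direct
```

``asyncio.run(main())`` returns ``(True, "echo: hi")``: the wrapped path hands back a coroutine object, while the direct path returns the text.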

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(agent-runtime): tool subset enforcement for SUMMARY bots (Wave 10 §K.13)

Wrap the MCP toolset in a ``FilteredToolset`` when ``bot.type``
matches a hardcoded subset. ``BotType.SUMMARY`` gets 13 read-only
tools (vector_search / fulltext_search / graph_search / read_document
/ get_collection_metadata / etc.); other bot types pass through with
the full toolset. The mapping lives in the runtime layer (per design
doc §4) so the LLM cannot route around it via system-prompt edits.
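
A minimal sketch of the runtime-side filtering idea, assuming a toolset can be modelled as a name-to-callable mapping. Only the five tool names spelled out above are listed (the rest of the 13 are elided in this message and are not guessed here); the class shape, ``wrap_toolset_for_bot``, and the ``delete_document`` example tool are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List

ToolFn = Callable[..., object]

# Only the tool names this commit message names explicitly.
SUMMARY_READ_ONLY_TOOLS: FrozenSet[str] = frozenset(
    {
        "vector_search",
        "fulltext_search",
        "graph_search",
        "read_document",
        "get_collection_metadata",
    }
)

# Hardcoded mapping from bot type to allowed tool names.
TOOL_SUBSET_BY_BOT_TYPE: Dict[str, FrozenSet[str]] = {
    "summary": SUMMARY_READ_ONLY_TOOLS,
}


class FilteredToolset:
    """Expose only the allowed tools; anything else is rejected at call time."""

    def __init__(self, tools: Dict[str, ToolFn], allowed: FrozenSet[str]) -> None:
        self._tools = {name: fn for name, fn in tools.items() if name in allowed}

    def names(self) -> List[str]:
        return sorted(self._tools)

    def call(self, name: str, *args, **kwargs):
        if name not in self._tools:
            raise PermissionError(f"tool not available for this bot: {name}")
        return self._tools[name](*args, **kwargs)


def wrap_toolset_for_bot(bot_type: str, tools: Dict[str, ToolFn]) -> FilteredToolset:
    """Bot types without a mapping entry pass through with every tool allowed."""
    allowed = TOOL_SUBSET_BY_BOT_TYPE.get(bot_type, frozenset(tools))
    return FilteredToolset(tools, allowed)
```

Because the filter is applied when the toolset is built, a prompt asking for a tool outside the subset finds nothing to call.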

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(collection): silent-failure 503 fix + Chunk E reconciler hook (Wave 10 §K.13)

Supplementary #2 — explicit operator override paths
(``POST /collections/{id}/summary/regen`` + ``description/regen``)
now await the regen result inline and surface 503 + structured
error envelope when all tiers return invalid output, instead of the
misleading 202 fire-and-forget. The reconciler path keeps log+skip
semantics (next sweep retries).

Chunk E — wire ``reconcile_collection_descriptions_hook`` into
``run_reconcile_loop``. Three scenarios covered:

  * Stage 1 missing-summary: collections with NULL summary
  * Stage 1 stale-summary: a doc was added/edited/deleted after
    ``summary_updated_at`` AND the latest edit is past MIN_STALE_AGE
  * Stage 2 derive: summary newer than description

Lease-busy / lease-expired / soft-delete are filtered at SQL
selection time so the hook only dispatches collections that are
actually eligible. Stage 1 hits exclude themselves from Stage 2 to
avoid wasted regen pairs.
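
The eligibility rules above can be expressed as a pure function for readability, although the commit applies them at SQL selection time. The ``CollectionState`` dataclass, its field names, and the treatment of an expired lease as eligible again are assumptions for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MIN_STALE_AGE = timedelta(minutes=10)  # value quoted in regen_constants


@dataclass
class CollectionState:
    """Hypothetical in-memory view of the columns the hook inspects."""

    summary: Optional[str] = None
    summary_updated_at: Optional[datetime] = None
    description_updated_at: Optional[datetime] = None
    latest_doc_change_at: Optional[datetime] = None
    lease_expires_at: Optional[datetime] = None
    deleted: bool = False


def pick_stage(c: CollectionState, now: datetime) -> Optional[str]:
    """Return "summary", "description", or None if nothing should be dispatched."""
    if c.deleted:
        return None
    if c.lease_expires_at is not None and c.lease_expires_at > now:
        return None  # lease busy; an expired lease is treated as free
    if c.summary is None or c.summary_updated_at is None:
        return "summary"  # Stage 1: missing summary
    changed = c.latest_doc_change_at
    if (
        changed is not None
        and changed > c.summary_updated_at
        and now - changed >= MIN_STALE_AGE
    ):
        return "summary"  # Stage 1: stale summary, edit has settled
    if c.description_updated_at is None or c.summary_updated_at > c.description_updated_at:
        return "description"  # Stage 2: summary newer than description
    return None
```

Returning a single stage per collection is what keeps a Stage 1 hit from also being dispatched for Stage 2 in the same sweep.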

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(collection): supplementary #1 mandatory test coverage (Wave 10 §K.13)

29 new tests across 3 files covering:

  * Tool subset mapping + FilteredToolset wrapping
    (``test_summary_bot_tool_subset.py``, 7 tests)
  * regen_summary state machine: lease busy / collection deleted /
    all-tiers-invalid / Tier 1 success / Tier 1→Tier 2 fallthrough
    + regen_description state machine: summary missing / valid LLM
    output. Plus the chunk picker + UIMessage text extractor pure
    helpers. (``test_collection_regen_supplementary.py``, 12 tests)
  * Reconciler hook scan: missing-summary / lease-held / lease-
    expired / soft-deleted / doc-edit-stale / doc-edit-too-recent /
    description-stale / description-current / Stage 1 excludes
    Stage 2. (``test_collection_regen_reconciler.py``, 9 tests)

Companion file ``test_collection_regen_service.py`` keeps the
quality-gate + language-detection contract pins; this set adds
orchestration coverage so the merge gate (huangheng N4) is met.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(bot): default-deny filter for is_system bots across all 4 API paths (Wave 10 §K.13)

The Wave 10 hidden per-user summary bot (``is_system=True``,
``type=summary``) was leaking through ``GET /api/v2/bots``: the
endpoint serialised every row through ``Bot`` Pydantic schema, but
``Bot.type: Literal["knowledge", "common", "agent"]`` does not
include ``"summary"`` → 400 ValidationError → e2e-http-provider
hurl test ``12_bot.hurl`` failed.

Fix at the ``db_ops`` layer with a default-deny ``exclude_system``
kwarg so all four user-facing API paths share one guard:

  * ``query_bots(users, exclude_system=True)`` — list path
  * ``query_bot(user, bot_id, exclude_system=True)`` — get / update /
    delete paths

The Pydantic ``Bot.type`` Literal stays unchanged: summary bots are
backend implementation detail and must never reach the public API
surface.

Internal regen plumbing (``get_or_create_summary_bot_for_user``)
queries the ``Bot`` ORM directly via raw SQLAlchemy ``select``, so
the default-deny filter does not block legitimate internal
lookups.
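
A rough in-memory model of the default-deny guard: both query helpers take ``exclude_system=True`` by default, so callers have to opt in explicitly to see system rows. The ``BotRow`` dataclass and the list-based storage are assumptions; the real helpers live in ``db_ops`` and query the database.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class BotRow:
    """Hypothetical minimal stand-in for the Bot ORM row."""

    id: str
    user: str
    type: str
    is_system: bool = False


def query_bots(
    rows: Iterable[BotRow], users: Iterable[str], exclude_system: bool = True
) -> List[BotRow]:
    """List path: system bots are hidden unless the caller opts in."""
    wanted = set(users)
    return [
        r for r in rows
        if r.user in wanted and not (exclude_system and r.is_system)
    ]


def query_bot(
    rows: Iterable[BotRow], user: str, bot_id: str, exclude_system: bool = True
) -> Optional[BotRow]:
    """Get / update / delete path: a hidden system bot looks like a missing row."""
    for r in rows:
        if r.id == bot_id and r.user == user:
            if exclude_system and r.is_system:
                return None
            return r
    return None
```

A ``None`` result from ``query_bot`` is what lets the API layer answer 404 for a system bot exactly as it would for a nonexistent id.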

Adds ``tests/unit_test/conversation/test_bot_service_filter_system_bots.py``
with 5 enumeration tests: list excludes, get returns 404, update
returns 404 (no mutation), delete silently ignores (idempotent
no-op), get returns user bot unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 28, 2026
…rrative (#1792)

* test(W10-task#6 Chunk G): collection summary/description regen e2e narrative

Wave 10 §K.13 Chunk G — end-to-end narrative validation for the
summary/description regen flow that chunked PRs A+B (#1783) +
C+D+E (#1786) + design (#1790) compose into one user journey.

What lands

* Single 9-step narrative test in
  ``tests/integration/test_w10_e2e_summary_description_regen.py``.
* Layer 2 env-gated (``RUN_W10_E2E_NARRATIVE=1``); default pytest stays
  fast (9 collected + 9 skipped).
* Module-scoped fixture seeds one synthetic Collection + 3 Documents
  so the narrative shares state via the production data plane
  (Postgres ``Collection`` row + lease columns), not Python globals.

Step coverage

  step 1  freshly-seeded Collection has summary / description / *_updated_at
          all NULL — Wave 10 hard-cut shipped these as nullable Text.
  step 2  ``regen_summary`` writes ``Collection.summary`` +
          ``summary_updated_at`` atomically, releases the lease (Tier 1
          agent / Tier 2 chunks-fallback path validated by 200 success).
  step 3  ``is_valid_summary`` / ``is_valid_description`` reject empty,
          short, and LLM-refusal templates; pass substantive long text
          (quality gate per design §6.2).
  step 4  ``regen_description`` derives ``Collection.description`` from
          the now-populated ``summary``; cheap LLM path returns True
          and writes ``description_updated_at``.
  step 5  ``POST /api/v2/collections/{id}/summary/regen`` route
          exercised via ``regen_collection_summary_view.__wrapped__``
          (the ``@audit`` decorator wraps the view); asserts
          ``CollectionRegenTriggerResponse`` shape + ``stage="summary"``
          + uuid task_id.
  step 6  ``POST /description/regen`` on a fresh no-summary collection
          raises ``HTTPException(400)`` with ``"summary"`` in detail
          (design §9 + §10.4 — Stage 2 cannot run without input).
  step 7  ``reconcile_collection_descriptions_hook`` picks up a
          collection whose Document was edited past ``MIN_STALE_AGE``
          and dispatches at least one regen task — proves the §K.13
          Chunk E hook is wired into the reconciler main loop.
  step 8  Lease-busy state: writing ``regen_lease_owner`` + a far-future
          ``regen_lease_expires_at`` directly causes ``regen_summary``
          to return False without overwriting the row (design §7
          atomic semantics).
  step 9  Failure-mode fold-in (mirror Wave 7 task #11 step 9 +
          design §10.9): patching ``_default_llm_factory`` to raise
          makes ``regen_description`` return False; the row's
          ``description`` and ``description_updated_at`` are NOT
          mutated. Pins the no-silent-write contract end-to-end.

12-invariant table mostly n/a — narrative-correctness is the hard
gate for an e2e PR; material invariants validated implicitly:

* §10.1 lease atomic semantics: step 8
* §10.2 3-tier fallback chain: step 2 (agent / chunks-fallback path
  alive; transient-skip exercised by step 8 indirectly)
* §10.4 API 400 reject when summary IS NULL: step 6
* §10.5 quality gate ``is_valid_summary`` / ``is_valid_description``:
  step 3
* §10.6 three trigger scenarios (edit case end-to-end): step 7
* §10.9 silent-failure fix: step 9

4-pattern pre-check matrix

* Pattern 1 v1: ``regen_summary`` / ``regen_description`` importable
  from ``aperag/domains/knowledge_base/service/collection_regen_service.py``
  (Chunk C). ✅
* Pattern 1 v2: 6 ``Collection`` columns (Wave 10 Chunk A) are read
  by the narrative. ✅
* Pattern 2: ``reconcile_collection_descriptions_hook`` invocation
  return value is a non-zero ``dispatched`` count (Chunk E wired). ✅
* Pattern 3: route surface ``regen_collection_summary_view`` /
  ``regen_collection_description_view`` exposed on the
  knowledge_base router (Chunk D). ✅

simple-stable 4-guardrail

* #1 Don't expand scope without limit: one file, no production code change.
* #2 Make the feature real first: real Postgres + real provider, so the narrative validates
  production behaviour, not stubbed surface.
* #3 Simple and stable: one happy-path narrative + one 400-reject pin + one
  failure-mode step. Not a regression matrix.
* #4 Maintenance-free private deployment: env-var-gated; CI Wave 10 lane flips it on,
  local-dev stays fast by default.

Local verification

* ``uv run pytest tests/integration/test_w10_e2e_summary_description_regen.py
  --collect-only`` → 9 collected.
* ``uv run pytest`` (default gate off) → 9 skipped.
* ``uv run ruff check`` clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: ruff format Wave 10 e2e narrative test

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>