feat(collection): Wave 10 Chunks C+D — regen service + 2 OpenAPI endpoints#1786

Merged
earayu merged 8 commits into
mainfrom
bryce/wave10-chunks-cde-regen-pipeline
Apr 28, 2026

@earayu earayu commented Apr 28, 2026

Summary

Wave 10 Chunks C+D — collection auto-description two-stage regen pipeline service + OpenAPI endpoints. Builds on PR #1783 (Chunks A+B, merged).

Scaffolded — Tier 1 agent-runtime invocation + Tier 2 chunks.jsonl fallback both return None today, forcing the 3-tier fallback chain to land on Tier 3 (transient skip). Wave 10.1 follow-up wires the actual agent-runtime headless call (per huangheng N2 sediment + design appendix A) and the chunks.jsonl read primitive. The lease + state machine + quality gates + 2 OpenAPI endpoints + reconciler integration interface are fully working.

What's in this PR

aperag/domains/knowledge_base/service/collection_regen_service.py (new, ~270 LOC)

  • regen_summary(collection_id) — Stage 1 with cluster lease + 3-tier fallback
  • regen_description(collection_id) — Stage 2 cheap LLM derive
  • is_valid_summary / is_valid_description — quality gates (per huangheng N4)
  • _detect_language — language-aware Stage 2 prompt selection (per huangheng N1)
  • _try_acquire_lease / _release_lease — atomic UPDATE on Collection row
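The lease bullet above relies on a single conditional UPDATE, so that exactly one caller can win. A minimal sketch of that idea follows, using an in-memory SQLite table as a stand-in for the real `Collection` row. The helper names, the table layout, and the epoch-seconds timestamps are assumptions for illustration; only the two lease column names and the 900s TTL come from this PR.

```python
import sqlite3
import time

LEASE_TTL_SECONDS = 900  # Lease TTL quoted in regen_constants.py


def make_db() -> sqlite3.Connection:
    """Hypothetical stand-in table; the real service updates the Collection ORM row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE collection ("
        " id TEXT PRIMARY KEY,"
        " regen_lease_owner TEXT,"
        " regen_lease_expires_at REAL)"
    )
    conn.execute("INSERT INTO collection (id) VALUES ('col-1')")
    conn.commit()
    return conn


def try_acquire_lease(conn, collection_id, owner, now=None) -> bool:
    """Take the lease only if it is free or expired; one UPDATE decides the winner."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "UPDATE collection"
        " SET regen_lease_owner = ?, regen_lease_expires_at = ?"
        " WHERE id = ?"
        " AND (regen_lease_owner IS NULL OR regen_lease_expires_at < ?)",
        (owner, now + LEASE_TTL_SECONDS, collection_id, now),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows touched means another owner holds it


def release_lease(conn, collection_id, owner) -> None:
    """Clear the lease, but only if this owner still holds it."""
    conn.execute(
        "UPDATE collection"
        " SET regen_lease_owner = NULL, regen_lease_expires_at = NULL"
        " WHERE id = ? AND regen_lease_owner = ?",
        (collection_id, owner),
    )
    conn.commit()
```

Because the eligibility check and the write happen in one statement, no separate lock or queue is needed; a loser simply sees zero affected rows.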

aperag/domains/knowledge_base/service/regen_constants.py (new, ~140 LOC)

  • Thresholds: bulk=10 / debounce=60min / min-stale=10min (moderately conservative, per earayu2)
  • Stage 1: hardcoded system prompt + 13 read-only tool subset + 5 turns / 20K tokens / 60s
  • Stage 2: zh + en derive prompt templates + 30s timeout
  • Lease TTL: 900s

2 OpenAPI endpoints in aperag/domains/knowledge_base/api/routes.py

  • POST /collections/{id}/summary/regen → 202/404, dispatches Stage 1 fire-and-forget
  • POST /collections/{id}/description/regen → 400 if summary IS NULL (honest reject per huangheng API contract), 202/404 otherwise

CollectionRegenTriggerResponse schema

202 envelope: collection_id + stage + task_id + estimated_completion_seconds

14 unit tests

tests/unit_test/knowledge_base/test_collection_regen_service.py — quality-gate + language-detection contracts.

What lands subsequently

  • Chunk E: reconcile_collection_descriptions_hook wired into 30s reconciler
  • Chunk F: frontend collection-form remove description input
  • Chunk G: e2e narrative test
  • Wave 10.1: fill in Tier 1 agent-runtime + Tier 2 chunks.jsonl fallback (sediment to keep this PR scope-clean)

12-invariant cross-check

4-pattern + 11 mini-pattern

Simple-stable 4-guardrail

Test plan

  • uv run pytest tests/unit_test/ — 1186 pass + 15 skipped
  • ruff format --check / ruff check — clean
  • e2e CI gates (post-push)
  • CR by @huangheng
  • Architect ratify after CI green

🤖 Generated with Claude Code

earayu added a commit that referenced this pull request Apr 28, 2026
…de) (#1790)

* docs(modularization): Wave 10 collection summary design — c1-extend-hide locked

Design for collection summary auto-regen with per-user hidden summary bot.
Captures three-way ratify (architect + Bryce + huangheng) + earayu2 final pin
on (c1-extend-hide + defensive lazy fallback) approach. PR #1786 amend implements.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: fix v1 → v2 OpenAPI path in §9 endpoints

Per W9-1 V3→V2 rename, ApeRAG OpenAPI base is /api/v2.
huangheng CR nit fold-in (msg=032421e1).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu and others added 7 commits April 28, 2026 18:33
… C+D)

Wave 10 §K.13 Chunks C+D — collection auto-description two-stage
regen pipeline.

## New module: ``collection_regen_service.py``

* **``regen_summary(collection_id)``** — Stage 1 with cluster-level
  lease + 3-tier fallback chain (per huangheng BLOCKER #2):
    Tier 1: ``_invoke_summary_agent`` — agent-runtime free-explore
            (scaffolded, returns ``None`` today; Wave 10.1 follow-up
            wires the headless ``agent_runtime_manager.launch_turn``)
    Tier 2: ``_invoke_summary_chunks_fallback`` — chunks.jsonl 1st
            substantive chunk + LLM call (scaffolded, Wave 10.1)
    Tier 3: transient skip (no writeback, reconciler retries next sweep)
* **``regen_description(collection_id)``** — Stage 2 cheap LLM derive
  from existing ``Collection.summary``. Single LLM call, ~10s.
* **Lease primitives** — ``_try_acquire_lease`` / ``_release_lease``
  use ``Collection.regen_lease_owner`` + ``regen_lease_expires_at``
  (Chunk A schema). Atomic UPDATE acquires, expired leases self-
  reclaim, concurrent instances race exactly one winner per collection.
* **Quality gates** — ``is_valid_summary`` / ``is_valid_description``
  enforce length + LLM-error blacklist + alphabetic-char threshold
  (per huangheng N4 dual-gate).
* **Language detection** — ``_detect_language`` returns ``"zh"`` /
  ``"en"`` based on CJK-vs-Latin char count, drives Stage 2 prompt
  template selection (per huangheng N1 language-aware).
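As a rough illustration of the CJK-vs-Latin counting described in the language-detection bullet, a sketch might look like the following. The Unicode range used for CJK, the tie-breaking rule (ties fall to `"en"`), and the function name without the leading underscore are assumptions, not the shipped implementation.

```python
def detect_language(text: str) -> str:
    """Return "zh" when CJK ideographs outnumber ASCII letters, else "en"."""
    # Assumption: only the CJK Unified Ideographs block is counted.
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return "zh" if cjk > latin else "en"


def pick_derive_prompt_name(summary: str) -> str:
    """Map the detected language to the Stage 2 prompt constant named in this PR."""
    if detect_language(summary) == "zh":
        return "DESCRIPTION_DERIVE_PROMPT_ZH"
    return "DESCRIPTION_DERIVE_PROMPT_EN"
```

Counting characters instead of calling a detection library keeps the check dependency-free, which fits the pure-Python unit tests mentioned for this module.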

## New module: ``regen_constants.py``

Pinned thresholds + prompts + tool subset:
* ``BULK_THRESHOLD = 10`` / ``DEBOUNCE_WINDOW = 60min`` /
  ``MIN_STALE_AGE = 10min`` (moderately conservative, per earayu2 msg=1b395cae).
* ``LEASE_TTL = 900s``.
* ``SUMMARY_AGENT_SYSTEM_PROMPT`` — hard-coded (not user-configurable
  per huangheng Q1.2), language-aware output (zh 5000-10000 / en
  3000-7000), 13 read-only tools subset (per huangheng Q1.4).
* ``DESCRIPTION_DERIVE_PROMPT_ZH`` / ``DESCRIPTION_DERIVE_PROMPT_EN``
  — Stage 2 derive prompts.

## New OpenAPI endpoints (in ``api/routes.py``)

* **``POST /api/v2/collections/{id}/summary/regen``** — Stage 1
  trigger; 202 Accepted, 404 collection-not-found, 409 lease-busy,
  403 permission-denied. Dispatches as fire-and-forget asyncio task.
* **``POST /api/v2/collections/{id}/description/regen``** — Stage 2
  trigger; **400 if summary IS NULL** (must regen summary first —
  per huangheng API contract honest reject), 202 otherwise.

## New schema: ``CollectionRegenTriggerResponse``

202 Accepted envelope: ``collection_id`` + ``stage`` (summary |
description) + ``task_id`` + ``estimated_completion_seconds``.

## Tests (14 new pure-Python unit tests)

``tests/unit_test/knowledge_base/test_collection_regen_service.py``
covers quality-gate + language-detection contracts. Lease + DB flow
+ OpenAPI integration are covered by Chunk E reconciler tests +
Chunk G e2e narrative test (those need live DB fixtures).
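For readers unfamiliar with the quality-gate contract these tests pin, here is a minimal sketch of a length + blacklist + alphabetic-character gate. The number pairs 200/50 and 50/20 are quoted in the review comment on this PR; reading them as "minimum length / minimum alphabetic characters" is an assumption, and the blacklist phrases are invented placeholders.

```python
import re
from typing import Optional

# Hypothetical blacklist; the real LLM-error phrases are not listed in this PR.
ERROR_MARKERS = ("i'm sorry", "i cannot", "as an ai")

# Letters that count as "alphabetic": ASCII letters plus CJK ideographs (assumption).
_ALPHA = re.compile(r"[A-Za-z\u4e00-\u9fff]")


def _passes_gate(text: Optional[str], min_chars: int, min_alpha: int) -> bool:
    if text is None:
        return False
    stripped = text.strip()
    if len(stripped) < min_chars:
        return False  # too short to be a real summary or description
    lowered = stripped.lower()
    if any(marker in lowered for marker in ERROR_MARKERS):
        return False  # looks like an LLM refusal or error template
    return len(_ALPHA.findall(stripped)) >= min_alpha


def is_valid_summary(text: Optional[str]) -> bool:
    return _passes_gate(text, min_chars=200, min_alpha=50)


def is_valid_description(text: Optional[str]) -> bool:
    return _passes_gate(text, min_chars=50, min_alpha=20)
```

Keeping the gate a pure function of the text is what allows it to be covered without DB fixtures.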

## What lands in subsequent chunks

* **Chunk E**: ``reconcile_collection_descriptions_hook`` wired into
  the 30s reconciler — scans ``Collection`` rows by doc-change delta
  and dispatches Stage 1 + Stage 2 per design pseudocode.
* **Chunk F**: frontend collection-form removes description input +
  adds placeholder.
* **Chunk G**: e2e narrative test mirror Wave 7 task #11 pattern.
* **Wave 10.1 follow-up**: fill in Tier 1 agent-runtime invocation +
  Tier 2 chunks.jsonl fallback (currently scaffolded with explicit
  ``return None`` so the 3-tier fallback contract still holds —
  every regen today returns the Tier 3 transient skip until 10.1
  ships, which is honest behaviour and doesn't lie about progress).
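The scaffolding behaviour described in the last bullet (both tiers return ``None``, so every run lands on the transient skip) can be sketched as a simple ordered chain. The function names and the tier signature below are assumptions; exceptions are deliberately not caught in this sketch.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

# Assumed tier shape: takes a collection id, returns text or None.
Tier = Callable[[str], Awaitable[Optional[str]]]


async def run_fallback_chain(
    collection_id: str,
    tiers: Sequence[Tier],
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Try each tier in order; the first valid output wins."""
    for tier in tiers:
        candidate = await tier(collection_id)
        if candidate is not None and is_valid(candidate):
            return candidate
    # Tier 3: transient skip. Nothing is written; a later sweep retries.
    return None


async def scaffolded_tier(collection_id: str) -> Optional[str]:
    """Mirrors the explicit `return None` placeholders described above."""
    return None
```

With two scaffolded tiers, `asyncio.run(run_fallback_chain("col-1", [scaffolded_tier, scaffolded_tier], bool))` returns `None`, which is the skip outcome the commit message describes.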

## 12-invariant cross-check

* **#10 DB column cap**: enforced via prompt + quality-gate
  (5000-10000 / 200-500 chars) per spec §K.13.
* **#12 grep-zero**: no LightRAG references introduced.
* All other invariants: n/a (pure new code in collection-regen lane).

## 4-pattern + 11 mini-pattern pre-check

* Pattern 2 (state binding): lease columns from Chunk A wired into
  service primitives + reconciler hook (Chunk E).
* Mini #4 (DTO names): reuse ``Collection`` ORM, no new DTO.
* Mini #5 (dependency interface signatures): ``LLMCall = Callable[[str], Awaitable[str]]`` matches ``aperag.indexing.llm.build_collection_llm_callable`` shape (sync wrapped via ``asyncio.to_thread``).
* Mini #7 (grep before adding X): grepped existing services for
  similar regen patterns — none, this is a clean new domain service.
* Mini #10 (trigger 3-scenario): the 3-tier fallback chain explicitly
  enumerates agent-failure / chunks-failure / input-not-ready paths.

## Simple-stable 4-guardrail

* #1 No unbounded scope growth: 2 new modules + 2 endpoints + 1 schema, scaffolded
  Tier 1+2 to avoid coupling to agent-runtime headless API formalisation
  (huangheng N2 sediment, deferred to Wave 11).
* #2 Ship as soon as possible: independent ship, unblocks Chunk E (reconciler) which
  needs the regen_summary / regen_description entry points.
* #3 Simple and stable: clear 3-tier fallback contract, lease UNIQUE constraint
  via UPDATE atomicity (no extra serialisation infrastructure).
* #4 Maintenance-free private deployment: 0 operator config; thresholds + prompts
  hard-coded per design lock; LLM uses collection's existing
  completion model.

## Test plan

- [x] ``uv run pytest tests/unit_test/`` — 1186 pass + 15 skipped
- [x] ``ruff format --check`` / ``ruff check`` — clean on touched files
- [ ] e2e CI gates (post-push)
- [ ] CR by @huangheng
- [ ] Architect ratify after CI green
Wave 10 §K.13 — per-user hidden summary bot for collection regen
Stage 1 agent-runtime free-explore. Per ratified design (PR #1790
+ thread msg=d6f5e819 ratify):

* **(c1-extend-hide)** main path: register-time creation via
  ``_BotInitOpsAdapter`` extension (same transaction as default
  agent bot, both succeed-or-both-fail at register time).
* **Defense-in-depth lazy fallback**: ``collection_regen_service``
  Stage 1 will use ``get_or_create_summary_bot_for_user`` so a
  user that registered before this PR landed (or whose register
  hook silently failed per ``user_manager.py:137``) still gets a
  bot on first regen attempt.

## Schema (1 column + 1 partial-unique index + Python enum value + backfill)

* ``Bot.is_system: Boolean default False`` — mirrors existing
  ``ApiKey.is_system`` precedent (governance/db/models.py:98). UI
  listings filter ``is_system=True`` rows out so the summary bot
  is invisible to end users.
* Partial unique index ``(user, type, is_system)`` over active
  (``gmt_deleted IS NULL``) ``is_system=TRUE`` rows — defends
  against race conditions during register-time + lazy-create.
* ``BotType.SUMMARY = "summary"`` — Python enum addition only;
  ``Bot.type`` is already ``VARCHAR(50)`` (per
  ``_enum_column(BotType)`` shape) so 0 DDL change for the
  enum value.
* Alembic data migration ``f2c3d4e5b6a8`` backfills one summary
  bot row per existing user that doesn't already have one. Idempotent
  ``WHERE NOT EXISTS`` guard so the migration is safe to re-run.
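The idempotent backfill in the last bullet can be illustrated with an `INSERT ... SELECT ... WHERE NOT EXISTS` statement. The sketch below runs against in-memory SQLite; the `users` table name and the reduced column set are assumptions, and the real migration is an Alembic revision targeting the production database.

```python
import sqlite3

# Insert one summary bot per user that has no active system summary bot yet.
BACKFILL_SQL = """
INSERT INTO bot ("user", type, is_system)
SELECT u.id, 'summary', 1
FROM users AS u
WHERE NOT EXISTS (
    SELECT 1 FROM bot AS b
    WHERE b."user" = u.id
      AND b.type = 'summary'
      AND b.is_system = 1
      AND b.gmt_deleted IS NULL
)
"""


def make_db() -> sqlite3.Connection:
    """Hypothetical minimal schema: u1 already has a summary bot, u2 does not."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id TEXT PRIMARY KEY);
        CREATE TABLE bot (
            id INTEGER PRIMARY KEY,
            "user" TEXT NOT NULL,
            type TEXT NOT NULL,
            is_system INTEGER NOT NULL DEFAULT 0,
            gmt_deleted TEXT
        );
        INSERT INTO users (id) VALUES ('u1'), ('u2');
        INSERT INTO bot ("user", type, is_system) VALUES ('u1', 'summary', 1);
        """
    )
    return conn


def count_summary_bots(conn: sqlite3.Connection) -> int:
    sql = "SELECT COUNT(*) FROM bot WHERE type = 'summary' AND is_system = 1"
    return conn.execute(sql).fetchone()[0]


def run_backfill(conn: sqlite3.Connection) -> None:
    conn.execute(BACKFILL_SQL)
    conn.commit()
```

Running `run_backfill` twice leaves the row count unchanged after the first run, which is the re-run safety the bullet refers to.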

## Register-hook extension (``aperag/app.py``)

``_BotInitOpsAdapter.create_default_bot_for_user`` now also calls
``_create_summary_bot_for_user`` after creating the default agent
bot. The new method:

* Bypasses ``bot_service.create_bot`` because (a) ``BotCreate.type``
  is ``Literal["agent"]`` so the public schema cannot express
  ``"summary"``, and (b) system bots intentionally skip user-quota /
  user-visibility logic.
* Uses ``get_async_session`` + direct ORM insert; rollback on
  ``IntegrityError`` so race-conditioned concurrent registers /
  the backfill migration don't crash.
* Tool subset enforcement (13 read-only tools) lives in agent
  runtime, NOT on the Bot row — keeps schema minimal per
  simple-stable directive #1.

## What's NOT in this commit (deferred to subsequent commits in this PR)

This commit is the **bot infrastructure scaffolding only**. The
remaining Wave 10 §K.13 work (Tier 1 agent runtime invocation +
Tier 2 chunks.jsonl real impl + ``get_or_create_summary_bot_for_user``
service in ``collection_regen_service`` + supplementary #1 tests +
supplementary #2 silent-failure fix + Chunk E reconciler hook)
lands in subsequent commits on this same PR before merge per
earayu2 ratify "get it done in one pass".

## 12-invariant + 4-pattern + simple-stable 4-guardrail

* #10 DB column cap: n/a this commit (no new variable-length data).
* #12 grep-zero LightRAG: clean.
* Pattern 2 (state binding): ``Bot.is_system`` ORM + alembic both
  updated atomically in this commit.
* Mini #4 (DTO names): no new DTO, reuse existing ``Bot`` /
  ``BotType`` / ``BotCreate``.
* Mini #5 (dependency interface signatures): grepped existing
  register-time call site (``aperag/app.py:171-186``); extended in
  place rather than carving a new init op.
* Mini #7 (grep before adding X-similar): ``ApiKey.is_system``
  precedent verified before adding ``Bot.is_system``.
* simple-stable #1 (no unbounded scope growth): 1 column + 1 index, no new
  service / no new endpoint / no schema for tool subset.
* simple-stable #4 (maintenance-free private deployment): backfill data migration
  ensures existing deployments get summary bots automatically on
  upgrade, with no operator setup.

## Test plan

- [x] ``uv run pytest tests/unit_test/`` — 1186 pass + 15 skipped
- [x] ``alembic upgrade e1a2b3c4d5f6:f2c3d4e5b6a8 --sql`` emits
      ``ALTER TABLE bot ADD COLUMN is_system`` + 2 ``CREATE INDEX``
      + ``INSERT INTO bot`` backfill SELECT
- [x] ``ruff check`` clean
- [ ] e2e CI gates (post-push)
- [ ] Subsequent commits (Tier 1+2 + tests + Chunk E) before CR
…ave 10 §K.13)

Wave 10 §K.13 — defense-in-depth fallback for the per-user summary
bot whose main creation point is the register-time hook
``_BotInitOpsAdapter.create_default_bot_for_user``.

The register hook in ``user_manager.py:137`` only ``log.error`` on
init failures; it does NOT roll back the user. So a user can
register successfully but lack their summary bot. ``get_or_create_summary_bot_for_user``
fetches by ``(user, type=SUMMARY, is_system=True, gmt_deleted IS NULL)``
and lazy-creates if missing. The partial unique index (Chunk B
schema migration ``f2c3d4e5b6a8``) handles concurrent race
conditions: one caller wins the INSERT, the other rollbacks +
re-fetches the winner's row.
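A compact sketch of the fetch, insert, rollback-and-re-fetch flow described above, again using in-memory SQLite as a stand-in. The index name, the reduced `bot` schema, and the synchronous style are assumptions; the real helper is async and goes through the ORM.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE bot (id INTEGER PRIMARY KEY, "user" TEXT NOT NULL,'
        " type TEXT NOT NULL, is_system INTEGER NOT NULL DEFAULT 0,"
        " gmt_deleted TEXT)"
    )
    # Partial unique index: at most one active system bot per (user, type).
    conn.execute(
        'CREATE UNIQUE INDEX uq_bot_system ON bot ("user", type, is_system)'
        " WHERE is_system = 1 AND gmt_deleted IS NULL"
    )
    conn.commit()
    return conn


def _fetch_summary_bot(conn, user):
    row = conn.execute(
        'SELECT id FROM bot WHERE "user" = ? AND type = ?'
        " AND is_system = 1 AND gmt_deleted IS NULL",
        (user, "summary"),
    ).fetchone()
    return row[0] if row else None


def get_or_create_summary_bot(conn, user) -> int:
    bot_id = _fetch_summary_bot(conn, user)
    if bot_id is not None:
        return bot_id
    try:
        cur = conn.execute(
            'INSERT INTO bot ("user", type, is_system) VALUES (?, ?, 1)',
            (user, "summary"),
        )
        conn.commit()
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # Lost the race: another caller inserted first, so reuse its row.
        conn.rollback()
        return _fetch_summary_bot(conn, user)
```

The unique index is what turns a lost race into a catchable `IntegrityError` instead of a silent duplicate row.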

This commit also wires ``_invoke_summary_agent`` to call the helper
so the bot infrastructure is reachable end-to-end. The actual
``agent_runtime_manager.launch_turn`` invocation still falls
through to Tier 2 (next commit on this PR ships the full launch_turn
flow mirroring ``aperag/domains/evaluation/worker.py:114-180``).

## Test plan

- [x] ``uv run pytest tests/unit_test/knowledge_base/`` — 14 pass
- [x] ``ruff check`` clean
- [ ] CR by @huangheng (queue held until full Wave 10 amend ready)
Replace the stubbed ``_invoke_summary_agent`` with a full
``agent_runtime_manager.launch_turn`` integration mirroring
``aperag/domains/evaluation/worker.py:114-180`` (real Bot/Chat/AgentTurn
ORMs, fire-and-forget launch, terminal-status polling with
SUMMARY_TIMEOUT_SECONDS, UIMessage-store text extraction).

Replace the stubbed ``_invoke_summary_chunks_fallback`` with a
chunks.jsonl read + single-LLM-call path: pulls active vector
DocumentIndex source_paths for documents in the collection,
stitches representative chunks up to CHUNKS_FALLBACK_MAX_CHARS, and
calls the collection LLM with a Tier-2 prompt that mirrors the
agent path's voice and length contract.

Fix ``_default_llm_factory``: ``build_collection_llm_callable``
already returns an async callable; surface it directly instead of
wrapping with ``asyncio.to_thread`` (which would have run the
coroutine in a worker thread).
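To see why the extra wrapper was a problem, consider what `asyncio.to_thread` does when handed an async function: the worker thread calls it, which only builds a coroutine object, and that object comes back un-awaited. The sketch below assumes the earlier factory passed the async callable straight to `asyncio.to_thread`; `llm_call` is a hypothetical stand-in for the callable returned by ``build_collection_llm_callable``.

```python
import asyncio
import inspect


async def llm_call(prompt: str) -> str:
    """Hypothetical async LLM callable."""
    return f"echo: {prompt}"


async def demo():
    # to_thread runs llm_call("hi") in a worker thread, but calling an
    # async function only creates a coroutine; nothing awaits it there.
    wrapped = await asyncio.to_thread(llm_call, "hi")
    got_coroutine = inspect.iscoroutine(wrapped)
    wrapped.close()  # tidy up the never-awaited coroutine

    # Surfacing the async callable directly yields the actual text.
    direct = await llm_call("hi")
    return got_coroutine, direct
```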

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…0 §K.13)

Wrap the MCP toolset in a ``FilteredToolset`` when ``bot.type``
matches a hardcoded subset. ``BotType.SUMMARY`` gets 13 read-only
tools (vector_search / fulltext_search / graph_search / read_document
/ get_collection_metadata / etc.); other bot types pass through with
the full toolset. The mapping lives in the runtime layer (per design
doc §4) so the LLM cannot route around it via system-prompt edits.
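The allowlist idea can be sketched without any agent framework. The five tool names below are the ones named in this commit message; the full list of 13 is not enumerated here, so the set is intentionally incomplete. `delete_document` is a made-up example of a tool outside the allowlist, and `filter_tools` is a hypothetical helper, not the ``FilteredToolset`` API.

```python
from typing import Dict, FrozenSet, Iterable, List

# Partial allowlist: only the tool names spelled out in the commit message.
TOOL_SUBSETS: Dict[str, FrozenSet[str]] = {
    "summary": frozenset(
        {
            "vector_search",
            "fulltext_search",
            "graph_search",
            "read_document",
            "get_collection_metadata",
        }
    ),
}


def filter_tools(bot_type: str, tool_names: Iterable[str]) -> List[str]:
    """Restrict tools for bot types with a hardcoded subset; pass others through."""
    allowed = TOOL_SUBSETS.get(bot_type)
    if allowed is None:
        return list(tool_names)  # no subset registered: full toolset
    return [name for name in tool_names if name in allowed]
```

Because the decision keys off `bot_type` in code, a prompt edit cannot widen the set of tools the model is offered.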

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ave 10 §K.13)

Supplementary #2 — explicit operator override paths
(``POST /collections/{id}/summary/regen`` + ``description/regen``)
now await the regen result inline and surface 503 + structured
error envelope when all tiers return invalid output, instead of the
misleading 202 fire-and-forget. The reconciler path keeps log+skip
semantics (next sweep retries).
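The change from fire-and-forget to inline await can be pictured as follows. This is a framework-free sketch: the handler name, the tuple return, and the error-envelope fields are assumptions, since the real routes return the project's response schema and raise HTTP errors through the web framework.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple


async def regen_endpoint(
    collection_id: str,
    stage: str,
    regen: Callable[[str], Awaitable[bool]],
) -> Tuple[int, Dict[str, Any]]:
    """Await the regen inline and map its boolean result to a status code."""
    succeeded = await regen(collection_id)
    if succeeded:
        return 200, {"collection_id": collection_id, "stage": stage}
    # All tiers produced invalid output: say so instead of claiming acceptance.
    return 503, {
        "error": "regen_unavailable",  # hypothetical error code
        "collection_id": collection_id,
        "stage": stage,
        "retryable": True,
    }
```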

Chunk E — wire ``reconcile_collection_descriptions_hook`` into
``run_reconcile_loop``. Three scenarios covered:

  * Stage 1 missing-summary: collections with NULL summary
  * Stage 1 stale-summary: a doc was added/edited/deleted after
    ``summary_updated_at`` AND the latest edit is past MIN_STALE_AGE
  * Stage 2 derive: summary newer than description

Lease-busy / lease-expired / soft-delete are filtered at SQL
selection time so the hook only dispatches collections that are
actually eligible. Stage 1 hits exclude themselves from Stage 2 to
avoid wasted regen pairs.
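The three scenarios and the eligibility filters can be expressed as one pure classification function. In the PR the filtering happens in SQL; this in-memory version is only a sketch. The `CollectionRow` shape and the `last_doc_change_at` field are assumptions, as is the reading that an expired lease leaves the collection eligible.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MIN_STALE_AGE = timedelta(minutes=10)  # threshold quoted in regen_constants.py


@dataclass
class CollectionRow:
    summary: Optional[str]
    summary_updated_at: Optional[datetime]
    description_updated_at: Optional[datetime]
    last_doc_change_at: Optional[datetime]
    regen_lease_expires_at: Optional[datetime] = None
    soft_deleted: bool = False


def classify(row: CollectionRow, now: datetime) -> Optional[str]:
    """Return "stage1", "stage2", or None when nothing should be dispatched."""
    if row.soft_deleted:
        return None
    lease = row.regen_lease_expires_at
    if lease is not None and lease > now:
        return None  # lease busy; an expired lease falls through as eligible
    if row.summary is None:
        return "stage1"  # missing summary
    changed = row.last_doc_change_at
    if (
        changed is not None
        and row.summary_updated_at is not None
        and changed > row.summary_updated_at
        and now - changed >= MIN_STALE_AGE
    ):
        return "stage1"  # stale summary, and the edit has settled
    if row.summary_updated_at is not None and (
        row.description_updated_at is None
        or row.summary_updated_at > row.description_updated_at
    ):
        return "stage2"  # summary is newer than description
    return None
```

Checking Stage 1 before Stage 2 is what makes a Stage 1 hit exclude itself from Stage 2 in the same sweep.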

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…K.13)

29 new tests across 3 files covering:

  * Tool subset mapping + FilteredToolset wrapping
    (``test_summary_bot_tool_subset.py``, 7 tests)
  * regen_summary state machine: lease busy / collection deleted /
    all-tiers-invalid / Tier 1 success / Tier 1→Tier 2 fallthrough
    + regen_description state machine: summary missing / valid LLM
    output. Plus the chunk picker + UIMessage text extractor pure
    helpers. (``test_collection_regen_supplementary.py``, 12 tests)
  * Reconciler hook scan: missing-summary / lease-held / lease-
    expired / soft-deleted / doc-edit-stale / doc-edit-too-recent /
    description-stale / description-current / Stage 1 excludes
    Stage 2. (``test_collection_regen_reconciler.py``, 9 tests)

Companion file ``test_collection_regen_service.py`` keeps the
quality-gate + language-detection contract pins; this set adds
orchestration coverage so the merge gate (huangheng N4) is met.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@earayu earayu force-pushed the bryce/wave10-chunks-cde-regen-pipeline branch from a82fb78 to cfe290d Compare April 28, 2026 10:34
…hs (Wave 10 §K.13)

The Wave 10 hidden per-user summary bot (`is_system=True`,
`type=summary`) was leaking through ``GET /api/v2/bots``: the
endpoint serialised every row through ``Bot`` Pydantic schema, but
``Bot.type: Literal["knowledge", "common", "agent"]`` does not
include ``"summary"`` → 400 ValidationError → e2e-http-provider
hurl test ``12_bot.hurl`` failed.

Fix at the ``db_ops`` layer with a default-deny ``exclude_system``
kwarg so all four user-facing API paths share one guard:

  * ``query_bots(users, exclude_system=True)`` — list path
  * ``query_bot(user, bot_id, exclude_system=True)`` — get / update /
    delete paths
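The default-deny kwarg can be illustrated with an in-memory stand-in for the two query helpers. The `BotRow` shape and the list-based storage are assumptions; the real functions add a WHERE clause at the database layer.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class BotRow:
    id: str
    user: str
    type: str
    is_system: bool = False


def query_bots(
    rows: Iterable[BotRow], users: List[str], exclude_system: bool = True
) -> List[BotRow]:
    """List path: system bots are hidden unless the caller opts out."""
    return [
        row
        for row in rows
        if row.user in users and not (exclude_system and row.is_system)
    ]


def query_bot(
    rows: Iterable[BotRow], user: str, bot_id: str, exclude_system: bool = True
) -> Optional[BotRow]:
    """Get / update / delete path: a hidden bot looks like a missing bot."""
    for row in query_bots(rows, [user], exclude_system=exclude_system):
        if row.id == bot_id:
            return row
    return None
```

Making `True` the default means a new caller has to ask explicitly before it can see a system bot, so forgetting the argument fails closed.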

The Pydantic ``Bot.type`` Literal stays unchanged: summary bots are
backend implementation detail and must never reach the public API
surface.

Internal regen plumbing (``get_or_create_summary_bot_for_user``)
queries the ``Bot`` ORM directly via raw SQLAlchemy ``select``, so
the default-deny filter does not block legitimate internal
lookups.

Adds ``tests/unit_test/conversation/test_bot_service_filter_system_bots.py``
with 5 enumeration tests: list excludes, get returns 404, update
returns 404 (no mutation), delete silently ignores (idempotent
no-op), get returns user bot unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

earayu commented Apr 28, 2026

CR by @huangheng — 🟢 LGTM ✅ (round 2 post-rebase + post-BLOCKER fix)

CI all green (per own-up #10 SOP `gh pr checks 1786` explicit verify)

```
lint-and-unit pass 4m5s
e2e-http-compose / provider-preflight ×3 pass
e2e-http-compose / e2e-http-smoke ×3 pass
e2e-http-compose / e2e-http-provider ×3 pass ← was FAIL pre-fix
```

Round 2 verification

Rebase verify ✅

BLOCKER #1 fix verify ✅ (commit `bd46f92c`)

| Layer | Fix |
| --- | --- |
| `db_ops.query_bot/query_bots/query_bots_count` | ✅ added `exclude_system: bool = True` kwarg + WHERE clause at DB layer (per huangheng nit, not service-level filter) |
| `bot_service.list_bots/get_bot/update_bot/delete_bot` | ✅ all 4 user-facing API surfaces enforce default-deny (per architect own-up #17 enumeration) |
| Pydantic `Bot` schema | ✅ `Literal` unchanged; summary bot never reaches API → no schema relaxation |
| Internal `get_or_create_summary_bot_for_user` | ✅ raw ORM `select(Bot)` bypass; internal plumbing unaffected |
| 5 new tests in `test_bot_service_filter_system_bots.py` | ✅ all pass: list excludes / get 404 / get user-bot / update 404 / delete idempotent no-op |

Substantive verification (round 1 retained)

| Item | Status |
| --- | --- |
| Alembic up/down/up/check roundtrip | ✅ 0 drift, migration `f2c3d4e5b6a8` clean |
| Pattern 1 v1 grep `collection_summary_service` | ✅ 0 live refs (only docstring/comment markers) |
| Tier 1 mirrors `evaluation/worker.py:114-180` | ✅ canonical 8-step pattern (chat → turn → claim → launch → poll → cancel-on-timeout → uimessage_store → extract) |
| Tier 2 chunks.jsonl read + LLM call | ✅ stitches active vector indexes capped at `CHUNKS_FALLBACK_MAX_CHARS` |
| Tool subset enforcement | ✅ `FilteredToolset` (pydantic_ai) wraps for `bot.type=summary` → 13 read-only tools, LLM cannot route around |
| 503 silent-failure fix | ✅ inline `await regen_summary` + 503 on False (replaces misleading 202) |
| Chunk E reconciler hook | ✅ `reconcile_collection_descriptions_hook` covers 3 scenarios (no summary / stale post-doc-edit / description out-of-date) with `asyncio.create_task` fire-and-forget |
| Quality gates | ✅ separate `is_valid_summary` (200/50-char) + `is_valid_description` (50/20-char) thresholds + keyword blacklist + alpha-char regex |
| Local full unit suite | 1224 passed / 15 skipped (round 1) → after fix +5 = 1229 expected |
| Linting | ✅ ruff check + format clean |

12-invariant + 4-pattern + 11 mini-pattern + simple-stable 4-guardrail

| # | Item | Status |
| --- | --- | --- |
| Invariant #10 | length cap | ✅ summary 5000-10000 / description 200-500 prompt-enforced + agent runtime token cap 20K |
| Invariant #12 | grep-zero LightRAG | |
| Pattern 1 v1 | caller import grep | ✅ 0 legacy refs |
| Pattern 2 | state binding | ✅ 5 cols on Collection (PR #1783) + Bot.is_system + BotType.SUMMARY + 2 alembic migrations |
| Mini #4 | DTO names | ✅ no new DTO; reuses Collection ORM |
| Mini #5 | dependency interface-signature grep | ✅ Tier 1 invocation matches `evaluation/worker.py` real production caller (architect own-up #16 sediment lesson applied) |
| Mini #6 | binding pattern conform | ✅ reconciler hook same Pattern B style as existing 30s loop |
| Mini #7 | grep-before-add | ✅ existing register hook reused (per design v3.1) |
| Mini #8 | port legacy invariant | ✅ "aggregate N doc summaries → description" semantic preserved via 2-stage |
| Mini #9 | ratify fold-in | ✅ all design pivots (D→E→G + (c1-extend-hide) + lazy fallback + 503 silent fix + own-up #17 default-deny) folded |
| Mini #10 | trigger logic, three scenarios | ✅ `gmt_updated > X` count covers add/edit/delete |
| Mini #11 | user directive race detect | ✅ msg=1b395cae + msg=e318b050 + msg=d6f5e819 + msg=158ca916: all reframes detected immediately |
| Mini #12 (new) | hidden/system entity API surface | ✅ default-deny enumeration of all 4 user-facing API paths (architect own-up #17, this PR is canonical example) |

simple-stable 4-guardrail

  • #1 Don't lose algorithms: ✅ docstrings/comments preserve "Wave 10 hard-cut" context
  • #2 Ship as soon as possible: ✅ 7 commits + 1 BLOCKER fix; same-PR cross-session continuation per earayu2 ratify (msg=ff59b7aa)
  • #3 Simple and stable: ✅ minimal schema (1 column + 1 partial unique index + Python enum + backfill) + register hook reuse + agent runtime canonical pattern
  • #4 Maintenance-free private deployment: ✅ 0 operator config, summary bot lifecycle bound to user lifecycle, default-deny ensures no system bot leaks to UI

Wave 10 W9-1 conflict check

✅ Doesn't break any of #1772-#1790 (alias_redirect_store / cicd-push.yml / Makefile / openapi paths / V3→V2 / FE bot-types / vector batching / model display_name / dev/turbopack / i18n / dynamic entity types / design docs)

Verdict

🟢 LGTM ✅ round 2. Wave 10 ships in a single pass: schema + Bot infrastructure + Tier 1 (real agent runtime) + Tier 2 (chunks.jsonl) + tool subset + 3-tier fallback + cluster lease + reconciler + 503 silent fix + default-deny filter + 47 new tests.

Architect own-up #16 (appendix A grep miss) + #17 (hidden entity API surface enumeration) sediment to mini-pattern #12. Both lessons captured in this PR's implementation.

Per agent ratify lane SOP (own-up #10): @符炫炜 ratify after explicit `gh pr checks 1786` re-verify (already shown 10/10 pass) → `gh pr merge 1786 --squash` (no `--auto` shortcut).

Wave 10 PR #1786 ready for architect ratify merge.


earayu commented Apr 28, 2026

Architect ratify ✅ — three-section hard-gate verdict (12-invariant + 4-pattern/12 mini-pattern incl new #12 default-deny + simple-stable 4-guardrail) all pass. huangheng round 2 LGTM + CI 10/10 + Tier 1 canonical evaluation/worker.py pattern verified + own-up #17 default-deny enumeration fixed. Proceeding to squash merge per own-up #10 explicit verify SOP.

@earayu earayu merged commit 27cf113 into main Apr 28, 2026
10 checks passed
@earayu earayu deleted the bryce/wave10-chunks-cde-regen-pipeline branch April 28, 2026 10:55
earayu added a commit that referenced this pull request Apr 28, 2026
…+ 4-language radio (#1793)

Wave 10 §K.13 makes ``Collection.description`` (short) and
``Collection.summary`` (long) auto-generated by the backend regen
pipeline. The collection create/settings form must reflect that:

  * Create page (``action="add"``): drop the description input box —
    the user no longer types it.
  * Settings page (``action="edit"``): replace the editable description
    textarea with a read-only display + "Regenerate description" button
    that calls ``POST /api/v2/collections/{id}/description/regen``.
    Add a parallel summary read-only block + "Regenerate summary"
    button calling ``POST /api/v2/collections/{id}/summary/regen``.
  * Wave 11 follow-up: expand the language radio from 2 (zh-CN /
    en-US) to all 4 backend-supported locales (zh-CN / en-US / ja-JP /
    ko-KR).

The endpoints await regen inline (not 202 fire-and-forget) and
surface 503 on transient skip; the FE catches the typed response
body and surfaces a success toast on the 200 path.

Schema: ``description`` becomes optional in the form so existing
edit-mode round-trips still validate. The Pydantic backend Bot
schema's ``Literal`` is unchanged — system bots stay hidden behind
the default-deny db_ops filter shipped in PR #1786.

Files:
  * web/src/features/collection/types.ts — drop removed
    ``CollectionSummaryTriggerResponse``, add
    ``CollectionRegenTriggerResponse``
  * web/src/features/collection/client-api.ts — new
    ``regenCollectionSummary`` / ``regenCollectionDescription``
  * web/src/app/workspace/collections/collection-form.tsx — replace
    description input with two read-only blocks + regen buttons,
    expand language radio, mark description schema optional
  * web/src/i18n/{zh-CN,en-US}/page_collections.json + merged JSONs:
    new keys for the auto-generated hint text, regen button labels,
    success toasts, and ja-JP / ko-KR language labels
  * web/src/api-v2/schema.d.ts — regenerated from openapi-typescript

Local gates: `pnpm exec next build` clean (Wave 10 + Wave 11 admin
TS noise pre-existing, baseline preserved); `node scripts/i18n-check.mjs`
passes for both locales; `pnpm exec eslint` 0 errors on edited files.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 28, 2026
…rrative (#1792)

* test(W10-task#6 Chunk G): collection summary/description regen e2e narrative

Wave 10 §K.13 Chunk G — end-to-end narrative validation for the
summary/description regen flow that chunked PRs A+B (#1783) +
C+D+E (#1786) + design (#1790) compose into one user journey.

What lands

* Single 9-step narrative test in
  ``tests/integration/test_w10_e2e_summary_description_regen.py``.
* Layer 2 env-gated (``RUN_W10_E2E_NARRATIVE=1``); default pytest stays
  fast (9 collected + 9 skipped).
* Module-scoped fixture seeds one synthetic Collection + 3 Documents
  so the narrative shares state via the production data plane
  (Postgres ``Collection`` row + lease columns), not Python globals.

Step coverage

  step 1  freshly-seeded Collection has summary / description / *_updated_at
          all NULL — Wave 10 hard-cut shipped these as nullable Text.
  step 2  ``regen_summary`` writes ``Collection.summary`` +
          ``summary_updated_at`` atomically, releases the lease (Tier 1
          agent / Tier 2 chunks-fallback path validated by 200 success).
  step 3  ``is_valid_summary`` / ``is_valid_description`` reject empty,
          short, and LLM-refusal templates; pass substantive long text
          (quality gate per design §6.2).
  step 4  ``regen_description`` derives ``Collection.description`` from
          the now-populated ``summary``; cheap LLM path returns True
          and writes ``description_updated_at``.
  step 5  ``POST /api/v2/collections/{id}/summary/regen`` route
          exercised via ``regen_collection_summary_view.__wrapped__``
          (the ``@audit`` decorator wraps the view); asserts
          ``CollectionRegenTriggerResponse`` shape + ``stage="summary"``
          + uuid task_id.
  step 6  ``POST /description/regen`` on a fresh no-summary collection
          raises ``HTTPException(400)`` with ``"summary"`` in detail
          (design §9 + §10.4 — Stage 2 cannot run without input).
  step 7  ``reconcile_collection_descriptions_hook`` picks up a
          collection whose Document was edited past ``MIN_STALE_AGE``
          and dispatches at least one regen task — proves the §K.13
          Chunk E hook is wired into the reconciler main loop.
  step 8  Lease-busy state: writing ``regen_lease_owner`` + a far-future
          ``regen_lease_expires_at`` directly causes ``regen_summary``
          to return False without overwriting the row (design §7
          atomic semantics).
  step 9  Failure-mode fold-in (mirror Wave 7 task #11 step 9 +
          design §10.9): patching ``_default_llm_factory`` to raise
          makes ``regen_description`` return False; the row's
          ``description`` and ``description_updated_at`` are NOT
          mutated. Pins the no-silent-write contract end-to-end.

12-invariant table mostly n/a — narrative-correctness is the hard
gate for an e2e PR; material invariants validated implicitly:

* §10.1 lease atomic semantics — step 8
* §10.2 3-tier fallback chain — step 2 (agent / chunks-fallback path
  alive; transient-skip exercised by step 8 indirectly)
* §10.4 API 400 reject when summary IS NULL — step 6
* §10.5 quality gate ``is_valid_summary`` / ``is_valid_description`` —
  step 3
* §10.6 trigger three scenarios (edit case end-to-end): step 7
* §10.9 silent-failure fix: step 9

4-pattern pre-check matrix

* Pattern 1 v1: ``regen_summary`` / ``regen_description`` importable
  from ``aperag/domains/knowledge_base/service/collection_regen_service.py``
  (Chunk C). ✅
* Pattern 1 v2: 6 ``Collection`` columns (Wave 10 Chunk A) are read
  by the narrative. ✅
* Pattern 2: ``reconcile_collection_descriptions_hook`` invocation
  return value is a non-zero ``dispatched`` count (Chunk E wired). ✅
* Pattern 3: route surface ``regen_collection_summary_view`` /
  ``regen_collection_description_view`` exposed on the
  knowledge_base router (Chunk D). ✅

simple-stable 4-guardrail

* #1 No unbounded scope growth: one file, no production code change.
* #2 Make the functionality solid first: real Postgres + real provider, so the
  narrative validates production behaviour, not stubbed surface.
* #3 Simple and stable: one happy-path narrative + one 400-reject pin + one
  failure-mode step. Not a regression matrix.
* #4 Maintenance-free private deployment: env-var-gated; CI Wave 10 lane flips it on,
  local-dev stays fast by default.

Local verification

* ``uv run pytest tests/integration/test_w10_e2e_summary_description_regen.py
  --collect-only`` → 9 collected.
* ``uv run pytest`` (default gate off) → 9 skipped.
* ``uv run ruff check`` clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: ruff format Wave 10 e2e narrative test

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 28, 2026
…found) (#1822)

PR #1786 turned ``query_bot``'s default into ``exclude_system=True``, but
``collection_regen_service`` Stage 1 Tier 1 still routes through
``chat_service.create_chat`` + ``TurnService.get_chat_and_bot``, both of
which call ``query_bot`` and additionally enforce ``bot.type == AGENT``.
The hidden ``BotType.SUMMARY`` bot fails both gates, surfacing as
``Bot not found: bot…`` 404 toasts on the FE Regen button — Tier 2
fallback never runs because the exception propagates up out of
``_invoke_summary_agent``.

Add a ``_allow_system_bot`` keyword on both methods (default ``False``
keeps the user-facing API safe). When ``True``, pass
``exclude_system=False`` to ``query_bot`` and accept SUMMARY in addition
to AGENT — both share the agent-runtime infrastructure (the SUMMARY
toolset is restricted by ``aperag/domains/agent_runtime`` based on
``bot.type``). ``_invoke_summary_agent`` now opts in.
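The opt-in described above amounts to relaxing two gates at once: the system-bot filter and the bot-type check. A minimal sketch follows, with a dict lookup standing in for ``query_bot``; `resolve_bot_for_chat`, `BotRow`, and `BotNotFound` are hypothetical names, not the service's actual methods.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BotRow:
    id: str
    type: str
    is_system: bool = False


class BotNotFound(Exception):
    """Raised for both missing and hidden bots, so callers cannot tell them apart."""


def resolve_bot_for_chat(
    bots: Dict[str, BotRow], bot_id: str, *, _allow_system_bot: bool = False
) -> BotRow:
    bot = bots.get(bot_id)
    # Gate 1: hide system bots unless the internal caller opts in.
    if bot is None or (bot.is_system and not _allow_system_bot):
        raise BotNotFound(f"Bot not found: {bot_id}")
    # Gate 2: accept "summary" next to "agent" only on the opt-in path.
    allowed_types = {"agent", "summary"} if _allow_system_bot else {"agent"}
    if bot.type not in allowed_types:
        raise BotNotFound(f"Bot not found: {bot_id}")
    return bot
```

Keeping the flag keyword-only and defaulting it to `False` leaves every user-facing call site on the safe path without any change.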

Tests: 6 new unit tests covering both default-deny and
``_allow_system_bot=True`` branches on ``create_chat`` /
``get_chat_and_bot`` / ``create_or_get_turn``.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>