refactor(phase8 #73 D8.1): backend AI SDK v5 stream emitter by earayu · Pull Request #1695 · apecloud/ApeRAG

earayu · 2026-04-25T12:30:02Z

Phase 8 task #73 — D8.1 backend AI SDK v5-compatible stream emitter

Closes task #73 (msg=6a5a8459) per PM scope-lock msg=82ba98fc + architect contract first-cut LGTM msg=27ca7cee.

Scope (PM-locked)

Backend wire emission only. Replace the legacy AgentTimelineEventEnvelope SSE wire with AI SDK v5 UI Message Stream Protocol parts (per docs/modularization/agent-message-protocol-design.md).

NOT touched (handed to other lanes):

DB / Redis storage → #74 Bryce (D8.2)
Tool lifecycle / consent / elicitation gates → #75 chenyexuan (D8.3)
FE consumer → #76 dongdong (D8.4a)
The agent reasoning loop, model invocation, tool execution internals

Write set (5 files, +1571/-43)

Backend

aperag/domains/agent_runtime/wire/__init__.py (NEW) — public surface (StreamPart, StreamPartAdapter, TranslatorState, translate_envelope, dump_part, parse_part)
aperag/domains/agent_runtime/wire/parts.py (NEW, 411 LOC) — Pydantic models for AI SDK v5 stream parts
aperag/domains/agent_runtime/wire/translator.py (NEW, 420 LOC) — pure translate_envelope(envelope, state, *, safe_tool_name_resolver=None) -> list[StreamPart] + per-turn TranslatorState
aperag/domains/agent_runtime/api/routes.py — SSE route updated:
- Header now includes x-vercel-ai-ui-message-stream: v1
- _format_part_frame replaces legacy _format_sse (no event: field, single-line JSON data:)
- Sequence semantics: Envelope-atomic replay (see §Sequence Convention below)
- Generator wrapped in try/except → emits synthetic error part on uncaught exception

Tests

tests/unit_test/test_agent_runtime_wire_parts.py (NEW, 570 LOC) — 14 tests
2 existing agent_runtime tests updated to v5 shape

Sequence Convention (canonical lock per architect msg=7b2169c4 + clarification msg=0b8516b6)

Implementation choice: Option 2 — Envelope-atomic replay (per architect's enumeration).

Mechanics:

Each AgentTimelineEvent.sequence (DB-backed) maps 1:1 to one envelope, which the translator may fan out into N stream parts.
On the SSE wire, only the LAST part of a fan-out group carries the id: {sequence} line; intermediate parts are emitted with data: only (no id:).
HTML5 SSE spec: client Last-Event-ID only advances when a frame with id: is received. So if a client disconnects mid-fan-out (before the closing id: of the current envelope), its last advanced cursor remains at the previous envelope's sequence.
On reconnect (Last-Event-ID: <prev>), server replays from sequence > prev, which means the entire current envelope is replayed from scratch — including parts the client may have already received.
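The id-on-last-part rule above can be sketched in a few lines. This is a minimal illustration, not the code in api/routes.py: the names format_part_frame and frames_for_envelope are hypothetical, and parts are modeled as plain dicts rather than the Pydantic StreamPart models.

```python
import json
from typing import Iterator, List, Optional


def format_part_frame(part: dict, sequence: Optional[int] = None) -> str:
    """Render one part as an SSE frame; the id: line appears only if sequence is given."""
    # Compact separators keep the JSON on one line; embedded newlines are escaped by json.dumps.
    payload = json.dumps(part, separators=(",", ":"))
    id_line = f"id: {sequence}\n" if sequence is not None else ""
    return f"{id_line}data: {payload}\n\n"


def frames_for_envelope(sequence: int, parts: List[dict]) -> Iterator[str]:
    """Yield the frames for one envelope's fan-out group.

    Only the last part carries id: <sequence>, so a client that disconnects
    mid-group keeps its cursor at the previous envelope.
    """
    last_index = len(parts) - 1
    for index, part in enumerate(parts):
        yield format_part_frame(part, sequence if index == last_index else None)
```

For a two-part fan-out at sequence 7, the first frame is data-only and the second starts with id: 7.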

Client expectations (relevant to #76 D8.4a FE):

Must dedup by stable part identifiers: toolCallId (tool parts), artifact_id (citations / source-urls), text-block id (text-start/-delta/-end). On replay, an already-received part is identified by the same stable id and dropped.
AI SDK v5 useChat / message-store layers handle this naturally — their reducers are id-keyed.
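As a rough illustration of the dedup expectation, the sketch below models a replay filter in Python (the real consumer is the FE). ReplayDeduper and stable_key are hypothetical names. The sketch assumes a part is uniquely identified by its type plus one of the stable ids listed above, and it deliberately lets text-delta parts pass through, because deltas of one block share the block id and need reducer-level handling that is out of scope here.

```python
from typing import Hashable, Optional, Set


def stable_key(part: dict) -> Optional[Hashable]:
    """Return a dedup key for a part, or None if the sketch cannot dedup it."""
    part_type = part.get("type", "")
    if part_type == "text-delta":
        # Deltas repeat the text-block id, so (type, id) is not unique for them.
        return None
    for field in ("toolCallId", "artifact_id", "id"):
        if field in part:
            return (part_type, part[field])
    return None


class ReplayDeduper:
    """Drops parts whose stable key was already seen in this turn."""

    def __init__(self) -> None:
        self._seen: Set[Hashable] = set()

    def accept(self, part: dict) -> bool:
        key = stable_key(part)
        if key is None:
            return True  # no stable id: pass through untouched
        if key in self._seen:
            return False  # replayed duplicate
        self._seen.add(key)
        return True
```

Keying on the part type as well as the id keeps distinct lifecycle parts of one tool call (same toolCallId, different type) from being dropped as duplicates.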

Trade-off rationale:

Pro: zero DB / Redis schema change, full alignment with existing sequence semantics, #74 at-rest storage unchanged.
Con: mid-fan-out disconnect causes client to receive duplicates of earlier parts in the in-flight envelope; client-side dedup expected.

Errata for first-cut msg=df929617 §B.4

The first-cut said "keep sequence monotonically increasing during fan-out (one sequence per new part)"; that wording would imply Option 3 (per-part new sequence). The actual implementation is Option 2 above; the PR description is the authoritative canonical reference.

Contract first-cut for #74 / #75 unblock (per PM msg=387fd639)

Wire format

HTTP/1.1 200 OK
Content-Type: text/event-stream
x-vercel-ai-ui-message-stream: v1

[ intermediate parts of fan-out: ]
data: {"type":"<part-type>", ...}

[ last part of fan-out: ]
id: <sequence>
data: {"type":"<part-type>", ...}

: heartbeat

Resume / error / abort

Resume: Last-Event-ID: <last-seen-sequence> header → replay from sequence > last-seen (envelope-atomic).
Error: turn-level uncaught exception → synthetic error {errorText} + finish part emitted before stream closes.
Abort: user cancel / lease loss → abort + finish part emitted before stream closes.
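The resume and error rules can be sketched as two small generators. This is a simplified model under stated assumptions: replay_after and guarded_parts are hypothetical names, envelopes are modeled as (sequence, parts) tuples, and the error text is taken directly from the exception, which real code would likely sanitize. The re-raise after the synthetic parts follows the commit message's wording.

```python
from typing import Iterable, Iterator, List, Tuple

Envelope = Tuple[int, List[dict]]  # (sequence, translated parts)


def replay_after(envelopes: Iterable[Envelope], last_seen: int) -> Iterator[Envelope]:
    """Envelope-atomic resume: yield only envelopes with sequence > last_seen."""
    for sequence, parts in envelopes:
        if sequence > last_seen:
            yield sequence, parts


def guarded_parts(parts: Iterable[dict]) -> Iterator[dict]:
    """Pass parts through; on an uncaught exception emit error + finish, then re-raise."""
    try:
        yield from parts
    except Exception as exc:
        yield {"type": "error", "errorText": str(exc)}
        yield {"type": "finish"}
        raise
```

A client that reconnects with Last-Event-ID: 1 would receive every envelope from sequence 2 onward in full, including parts of an envelope it had partially received.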

Hand-off seams (for #74 / #75)

@bryce #74: emitter consumes AgentTimelineEventEnvelope unchanged. UIMessage at-rest schema (D8 §2) is your write-set; my wire/parts.py literal type tags align with your aperag/domains/agent_runtime/uimessage.py (Bryce msg=f3f9fc90 confirmed).
@chenyexuan #75: translator exposes safe_tool_name_resolver: Callable[[str], tuple[str, dict]] | None hook on TranslatorState. data-tool-consent / data-elicitation part literals are reserved in parts.py.
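To make the resolver seam concrete, here is a minimal sketch of a callable matching the Callable[[str], tuple[str, dict]] shape, plus the fallback behavior when no resolver is plugged in. The names example_resolver and resolve_tool_name, the sanitizing rule, and the rawToolName metadata key are all illustrative assumptions; the real SafeToolName resolution belongs to #75.

```python
import re
from typing import Callable, Dict, Optional, Tuple

SafeToolNameResolver = Callable[[str], Tuple[str, Dict]]

# Assumption: anything outside [A-Za-z0-9_-] is replaced with an underscore.
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_-]")


def example_resolver(raw_tool_name: str) -> Tuple[str, Dict]:
    """Hypothetical resolver: sanitize the name and keep the raw one as metadata."""
    safe_name = _UNSAFE_CHARS.sub("_", raw_tool_name)
    return safe_name, {"rawToolName": raw_tool_name}


def resolve_tool_name(
    raw_tool_name: str, resolver: Optional[SafeToolNameResolver] = None
) -> Tuple[str, Dict]:
    """Without a resolver, fall back to the raw name and empty metadata."""
    if resolver is None:
        return raw_tool_name, {}
    return resolver(raw_tool_name)
```

Keeping the hook optional means the emitter works unchanged until a resolver is supplied, which matches the raw-name-plus-empty-metadata behavior described in this PR.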

Acceptance gates

✅ pytest test_agent_runtime_wire_parts + agent_runtime/ + test_v1_ghost_guard + test_modularization_boundaries + test_openapi_spec -q → 55 passed
✅ Pre-commit make lint + make add-license clean
✅ Rebased on latest main 128409ba (#66 G5b-impl included)

Caveats / known gaps (out of #73 scope, flagged for downstream)

reference_bundle items not yet inlined in envelope data: today runtime.py emits data={artifact_id, artifact_type} only. Translator looks for data['items'] first then data['payload']['items'], defaulting to empty list. Without runtime inlining items into the envelope (or storage materializing them when SSE reads), no citations will surface to FE. Out of #73 scope; flagged for the #74 / runtime inlining hook decision before #76 D8.4a integration.
text-end not emitted on mid-stream turn.failed: failure paths emit error + finish but don't close any open text-start block. FE should treat error/finish as implicit close.
SafeToolName + metadata population: deferred to #75 D8.3 via the safe_tool_name_resolver hook. Translator currently emits raw envelope tool_name with empty metadata={}.
data-tool-consent / data-elicitation flow: part-type literals reserved in parts.py; flow logic is #75's lane.
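The reference_bundle lookup order described in the first caveat can be sketched as follows. extract_reference_items is a hypothetical name, and treating a missing or null items key as "fall through to payload" is an assumption about the translator's behavior.

```python
from typing import Any, Dict, List


def extract_reference_items(data: Dict[str, Any]) -> List[Any]:
    """Look for data['items'], then data['payload']['items'], else return []."""
    items = data.get("items")
    if items is None:
        payload = data.get("payload")
        if isinstance(payload, dict):
            items = payload.get("items")
    return list(items) if items else []
```

With today's envelope shape (artifact_id and artifact_type only) this returns an empty list, which is why no citations surface until items are inlined or materialized.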

Ghost-check

none. No new backend coupling, no DB schema change, no FE touch. The wire format change is the explicit hard-cut per Phase 8 philosophy (earayu2 msg=78fdb6fc) — FE consumers updated in #76 D8.4a (standby).

🤖 Generated with Claude Code

Land the wire-emission half of D8.1: the agent-runtime SSE endpoint now emits AI SDK v5 ``UI Message Stream Protocol`` part frames in place of the legacy ``AgentTimelineEventEnvelope`` JSON, advertising itself via the ``x-vercel-ai-ui-message-stream: v1`` response header that the FE ``@ai-sdk/react`` consumer (#76) keys on.

New ``aperag/domains/agent_runtime/wire/`` sub-package:

* ``parts.py``: Pydantic models for every v5 part type the runtime emits + ``data-citation`` (Anthropic-shape) / ``data-activity`` ApeRAG extensions + placeholder ``data-tool-consent`` / ``data-elicitation`` literals reserved for #75 chenyexuan; exposed as a discriminated ``StreamPart`` union with a ``TypeAdapter`` for round-trip parsing.
* ``translator.py``: pure ``translate_envelope(envelope, state)`` function mapping each timeline envelope to one-or-more parts per the D8.1 mapping table; per-turn ``TranslatorState`` carries text-block lifecycle bookkeeping; ``safe_tool_name_resolver`` hook reserved for #75 (raw tool name + empty metadata until then).

SSE route (``api/routes.py``) updated:

* New ``_format_part_frame`` writes ``id: <seq>\ndata: <json>\n\n`` AI SDK v5 frames; only the LAST part of an envelope fan-out gets the SSE ``id:`` so ``Last-Event-ID`` resume keeps pointing at the next envelope (translator docstring documents the invariant).
* ``stream_turn_events_view`` now wraps each envelope through the translator and yields one frame per part. Heartbeat switched to the SSE-comment form (``: heartbeat\n\n``) which is invisible to the v5 consumer. Generator wrapped in try/except that emits a synthetic ``error`` part on uncaught exceptions before re-raising.

Out of scope (per PM lock msg=82ba98fc): DB / Redis storage (#74), tool consent / elicitation / SafeToolName plumbing (#75), FE consumer (#76), agent reasoning loop. The translator is read-only over envelopes; storage shape is unchanged.

Tests:

* ``tests/unit_test/test_agent_runtime_wire_parts.py``: 14 contract tests covering every envelope→part mapping, JSON round-trip across the union, ``safe_tool_name_resolver`` plug-in seam, SSE response headers (v5 marker + Content-Type), and ``Last-Event-ID`` resume semantics.
* Updated ``test_agent_runtime_v3.py`` and ``test_agent_runtime_openapi_contract.py`` to assert on the new AI SDK v5 wire shape (hard-cut per Phase 8 msg=78fdb6fc: no dual emission, no envelope-format fallback).

Acceptance gates green: wire-parts suite + modularization_boundaries + v1_ghost_guard + openapi_spec all pass; ``make lint`` + ``make add-license`` clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… elicitation (#1696) * feat(phase8 #74 D8.2): first-cut UIMessage at-rest storage for agent path Phase 8 task #74 (D8.2) — first cut of the at-rest UIMessage storage layer per the canonical ``docs/modularization/agent-message-protocol-design.md`` and ``docs/modularization/agent-runtime-mcp-design.md`` (in main). This PR delivers the foundation: * ``aperag/domains/agent_runtime/uimessage.py`` (NEW) — pydantic schema for ``UIMessage`` and every ``UIMessagePart`` variant (text / tool / source-url / source-document / data-citation / data-activity / data-tool-consent / data-elicitation), plus ``persistable_parts`` / ``args_preview`` / ``args_hash`` helpers enforcing D9 §A7 raw-args-private rule. * ``aperag/domains/agent_runtime/db/models.py`` — new ``AgentMessage`` ORM (``agent_message`` table; 1:1 with ``agent_turn`` via ``turn_id``; ``parts`` JSON column carries the full UIMessage at rest; ``schema_version`` tag for FE forward-compat). Legacy ``AgentArtifact`` / ``AgentTimelineEvent`` tables retained during D8.x rollout — D8.6 (#80) will drop them once the FE renderer is consuming AgentMessage exclusively. * ``aperag/migration/versions/...d8e2c4a17b91_add_agent_message_table.py`` — new alembic revision chained off ``7c4e9e1f8b21``; pure additive (no rename / drop in this PR), idempotent migration. * ``aperag/domains/agent_runtime/storage.py`` — extend ``AgentRuntimeRedisStore`` with ``write_message_snapshot`` / ``read_message_snapshot`` / ``delete_message_snapshot`` keyed on ``agent_runtime:turn:<id>:message``; same TTL as the live event buffer. * ``aperag/domains/agent_runtime/uimessage_store.py`` (NEW) — ``UIMessageStore`` wraps the DB row + Redis snapshot behind a single ``write`` / ``read`` / ``delete`` surface. ``write`` filters transient parts (currently only ``data-activity``); ``read`` prefers Redis but falls back to the durable DB row when the snapshot is cold. 
``UIMessageDbOps`` is a SQLAlchemy-bound helper kept separate so unit tests can inject in-memory fakes. * ``tests/unit_test/agent_runtime/test_uimessage_at_rest.py`` (NEW) — at-rest reload contract tests pinning the three invariants Weston named as the prerequisite for unblocking D8.4b (msg=50c90f6f / msg=cef89ed8): round-trip fidelity across every persistable part variant, transient exclusion, snapshot consistency between Redis and DB. Out of scope (left for follow-up commits / sibling lanes per PM msg=a3c31f79): * Wire/streaming emitter — D8.1 (#73, cuiwenbo) * Tool / citation / consent / elicitation enforcement of the 7-point D9 §A4 contract — D8.3 (#75, chenyexuan) * Full event-to-UIMessage projection in the runtime services — follow-up commit on this branch once #73 stream contract is visible * Drop of legacy ``agent_artifact`` / ``agent_timeline_event`` tables — D8.6 (#80) * Non-agent bot path migration — D8.5 (#79) * FE renderer — D8.4a/b/c (#76/#77/#78) Gates: 709 pass / 29 skip / 1 deselect / 0 fail unit suite (incl. 7 new contract tests + 24 boundary intact); ruff lint+format clean. * refactor(phase8 #73 D8.1): backend AI SDK v5 stream emitter Land the wire-emission half of D8.1 — the agent-runtime SSE endpoint now emits AI SDK v5 ``UI Message Stream Protocol`` part frames in place of the legacy ``AgentTimelineEventEnvelope`` JSON, advertising itself via the ``x-vercel-ai-ui-message-stream: v1`` response header that the FE ``@ai-sdk/react`` consumer (#76) keys on. New ``aperag/domains/agent_runtime/wire/`` sub-package: * ``parts.py`` — Pydantic models for every v5 part type the runtime emits + ``data-citation`` (Anthropic-shape) / ``data-activity`` ApeRAG extensions + placeholder ``data-tool-consent`` / ``data-elicitation`` literals reserved for #75 chenyexuan; exposed as a discriminated ``StreamPart`` union with a ``TypeAdapter`` for round-trip parsing. 
* ``translator.py`` — pure ``translate_envelope(envelope, state)`` function mapping each timeline envelope to one-or-more parts per the D8.1 mapping table; per-turn ``TranslatorState`` carries text-block lifecycle bookkeeping; ``safe_tool_name_resolver`` hook reserved for #75 (raw tool name + empty metadata until then). SSE route (``api/routes.py``) updated: * New ``_format_part_frame`` writes ``id: <seq>\ndata: <json>\n\n`` AI SDK v5 frames; only the LAST part of an envelope fan-out gets the SSE ``id:`` so ``Last-Event-ID`` resume keeps pointing at the next envelope (translator docstring documents the invariant). * ``stream_turn_events_view`` now wraps each envelope through the translator and yields one frame per part. Heartbeat switched to the SSE-comment form (``: heartbeat\n\n``) which is invisible to the v5 consumer. Generator wrapped in try/except that emits a synthetic ``error`` part on uncaught exceptions before re-raising. Out of scope (per PM lock msg=82ba98fc): DB / Redis storage (#74), tool consent / elicitation / SafeToolName plumbing (#75), FE consumer (#76), agent reasoning loop. The translator is read-only over envelopes; storage shape is unchanged. Tests: * ``tests/unit_test/test_agent_runtime_wire_parts.py`` — 14 contract tests covering every envelope→part mapping, JSON round-trip across the union, ``safe_tool_name_resolver`` plug-in seam, SSE response headers (v5 marker + Content-Type), and ``Last-Event-ID`` resume semantics. * Updated ``test_agent_runtime_v3.py`` and ``test_agent_runtime_openapi_contract.py`` to assert on the new AI SDK v5 wire shape (hard-cut per Phase 8 msg=78fdb6fc — no dual emission, no envelope-format fallback). Acceptance gates green: wire-parts suite + modularization_boundaries + v1_ghost_guard + openapi_spec all pass; ``make lint`` + ``make add-license`` clean. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(phase8 #74 D8.2): wrap data-* parts in {type, data: {...}} per D8 §2 canonical Architect canonical lock 2026-04-25 (msg=ad6168e7) + PM scope-tightening (msg=1ff7ed9e): persisted data-* parts must round-trip byte-for-byte with the wire shape produced by #73 cuiwenbo's emitter — D8 §2 forbids a wire/at-rest converter layer. Pre-fix at-rest used flat fields (DataCitationPart.cited_text/.location, DataToolConsentPart.tool_call_id/..., DataElicitationPart.elicitation_id/...) which violated the same-schema canonical and would have forced #75 chenyexuan or the FE renderer (#76/#77) to maintain dual code paths. This commit: - Introduces inner data classes (CitationData / ActivityData / ToolConsentData / ElicitationData) so each data-* part follows {type, data: {...}} with the field set unchanged. - Updates the every-part fixture in the contract test to construct parts via the wrapped form. - Adds test_data_parts_use_wrapped_data_shape — a dedicated lock that reads the persisted DB row and asserts each data-* part's keys are exactly {type, data} and that data carries the canonical fields. Tests: 8/8 in agent_runtime/test_uimessage_at_rest.py pass; full unit suite 711/711 (29 skip), ruff check + format clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(phase8 #74 D8.2): align ToolPart with D8 §2.4 / D9 §A1+§A6 SafeToolName shape Weston minimal CR (msg=1812fb03) + architect canonical affirm (msg=8412dce5): the at-rest ToolPart used a flat `type: "tool"` literal plus a separate `tool_name` field, which is neither the AI SDK v5 streaming form (`tool-input-*` / `tool-output-*`) nor the v5 consolidated form (`type: "tool-<safeName>"`). That third intermediate shape would have forced #75 emit + #76/#77 FE renderer to do `tool` -> `tool-<name>` conversion — the same wire/at-rest schema drift class we just rejected for the data-* parts. 
This commit: - Encodes the SafeToolName directly in `ToolPart.type` via a regex- validated `^tool-[A-Za-z0-9_-]+$` discriminator string, matching D8 §2.4 + D9 §A1/§A6. - Drops the redundant `tool_name` field; MCP server/tool identity remains carried in `metadata`. - Replaces the misplaced `args_preview` / `args_hash` fields with the canonical `input: Optional[Any]`. Those redaction helpers stay module-level (`args_preview()` / `args_hash()`) so #75 D8.3 can use them when building DataToolConsentPart.data per D9 §A7. - Updates the every-part fixture and the round-trip expected_types to the new tool-`<name>` discriminator. - Adds test_tool_part_type_uses_safe_tool_name_form — pins the persisted tool part `type` matches the SafeToolName regex and confirms no top-level `tool_name` field leaks back. SafeToolName *resolution* (raw MCP name → safe form, collision hash suffix per D9 §A6) remains #75's scope; #74 only enforces the canonical storage shape. Tests: 9/9 in agent_runtime/test_uimessage_at_rest.py pass; full unit suite 711/711 (29 skip) — the one observed concurrent_control flake passes on rerun. Ruff check + format clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(phase8 #74 D8.2): persist UIMessage parts with canonical camelCase aliases Weston minimal CR (msg=59a459c6) + architect canonical affirm: the at-rest part models lacked Pydantic aliases, so `model_dump(by_alias=True)` fell back to snake_case (`source_id`, `tool_call_id`, `args_preview`, `elicitation_id`, etc.) — diverging from cuiwenbo wire `parts.py` (#73) which already serializes camelCase per AI SDK v5. That breaks the D8 §2 same-schema invariant a third time and would have forced #76/#77 FE renderer to handle two casings. 
This commit attaches `Field(alias=...)` + `ConfigDict(populate_by_name=True)` to every camelCase-canonical field so JSON serialization matches the wire byte-for-byte while Python call sites still use snake_case: - SourceUrlPart.source_id → sourceId - SourceDocumentPart.source_id → sourceId - SourceDocumentPart.media_type → mediaType - ToolPart.tool_call_id → toolCallId - ToolPart.error_text → errorText - ToolConsentData.tool_call_id → toolCallId - ToolConsentData.tool_name → toolName - ToolConsentData.args_preview → argsPreview - ToolConsentData.args_hash → argsHash - ToolConsentData.requested_at → requestedAt - ElicitationData.elicitation_id → elicitationId Snake_case stays where D8 §2 / Anthropic-shape canon requires it: CitationData.cited_text and the four CitationLocation variants (char_location / page_location / content_block_location / url_citation plus their internal start_char / end_char / doc_index / doc_title / page_index / block_index fields) follow the Anthropic citation convention unchanged. Tests: - test_data_parts_use_wrapped_data_shape now asserts the wrapped data-tool-consent / data-elicitation payloads carry camelCase keys (toolCallId / argsPreview / requestedAt / elicitationId, etc.). - New test_persisted_keys_use_canonical_camelcase locks the camelCase contract end-to-end against the persisted DB row, explicitly failing if any of the legacy snake_case forms reappear. - test_tool_part_type_uses_safe_tool_name_form additionally pins toolCallId on the tool part. Gates: 10/10 in agent_runtime/test_uimessage_at_rest.py pass; full unit suite 712/29 skip/0 fail (concurrent_control flake deselected, pre-existing). Ruff check + format clean. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(phase8 #75 D8.3): backend tool lifecycle + citations + consent + elicitation Implements the seven-point D9 §A4 contract that gates tool execution in the agent runtime, plus the Anthropic-shape citation transform: - tools/safe_name.py -- D9 §A1+§A6 SafeToolName + collision sha256 suffix + (mcpServer, mcpToolName, safeName) reverse lookup - tools/registry.py -- D9 §1.1+§A5 three-tier MCP registry with system-namespace reservation and audit-logged admin alias (no silent override) - tools/authorization.py -- D9 §2 three-level auth (visibility / invocation / consent) with §2.2 default policy + per-tool risk overrides - tools/args_cache.py -- D9 §A7 backend-private raw-args cache with short TTL; wire-side argsPreview / argsHash re-exported from the canonical helpers in aperag/domains/agent_runtime/uimessage.py (single-source-of-truth) - tools/consent.py -- D9 §3 consent request <-> decision flow with asyncio.Event waiter, single-use raw-args consume, denial-drops-cache invariant - tools/elicitation.py -- D9 §5 elicitation request <-> answer flow with schema-validated response + cancel hook; pluggable validator (default checks JSON Schema required fields) - tools/lifecycle.py -- envelope event-type constants for tool.consent.* / tool.elicitation.* + translate_lifecycle_envelope() translator extension + LifecycleEmitter glue between consent/elicitation services and the runtime's EventService.append_event path - tools/citations.py -- typed Anthropic-shape citation builder for char_location / page_location / content_block_location / url_citation, fed from RAG ReferenceBundleItem metadata Wire-side refinement: - wire/parts.py DataToolConsentPart + DataElicitationPart placeholders refined to use the canonical wrapped {type, data: ToolConsentData / ElicitationData} shape (no more `transient: True` placeholder; per D9 §3.1 / §5.1 these parts are persisted, audit-trail relevant) api/routes.py: - chained 
translate_lifecycle_envelope() after translate_envelope() so consent/elicitation envelopes emit DataToolConsentPart / DataElicitationPart on the SSE stream - new POST /agent/turns/{turn_id}/consent/{tool_call_id} -- records the user's decision, wakes the runtime waiter, appends the tool.consent.decided envelope so SSE replay carries the resolved part - new POST /agent/turns/{turn_id}/elicit/{elicitation_id} -- submits a schema-validated response, wakes the waiter, appends the tool.elicitation.resolved envelope Contract tests (focused unit_test/agent_runtime/test_tools_*.py, 82 new tests, all passing locally; full unit suite 814 / 29 skip / 0 fail): - test_tools_safe_name.py (12 tests) -- D9 §A1+§A6 lock - test_tools_registry.py (12 tests) -- D9 §1.1+§A5 lock - test_tools_authorization.py (11 tests) -- D9 §2 lock - test_tools_args_cache.py (12 tests) -- D9 §A7 raw-args privacy lock - test_tools_consent.py ( 9 tests) -- D9 §3 consent flow lock - test_tools_elicitation.py ( 9 tests) -- D9 §5 elicitation lock - test_tools_lifecycle.py ( 9 tests) -- D9 §6 translator extension - test_tools_citations.py ( 9 tests) -- D8 §2.5 typed citation lock 7-point D9 §A4 verification: 1. SafeToolName + MCP metadata (D9 §A1+§A6) -- safe_name.py 2. AI SDK v5 + data-tool-consent custom data-part (§A2) -- wire/parts.py + lifecycle.py 3. argsPreview + argsHash backend-private (§A7) -- args_cache.py + consent.py 4. Registry no silent system override (§A5) -- registry.py 5. data-elicitation schema-validated input (§5) -- elicitation.py 6. Three-level authorization (§2) -- authorization.py 7. 
PydanticAI as default candidate (§A3) -- runtime backbone unchanged (per architect msg=ff619d8a / Weston msg=50c90f6f C2 lock, this PR scope explicitly excludes backbone rewrite) Built on: - #73 D8.1 wire emitter (cuiwenbo, PR #1695 / 5113730 in main) -- consumes wire/parts.py + chains lifecycle translator via api/routes.py - #74 D8.2 at-rest UIMessage storage (Bryce, PR #1694 head be7406c) -- imports ToolConsentData / ElicitationData / args_preview / args_hash from aperag/domains/agent_runtime/uimessage.py for wire/at-rest same-schema canonical * fix(phase8 #74 D8.2): align DataElicitationPart with D9 §5.1 canonical Weston minimal CR (msg=51dffdc9) + PM lock (msg=042b0a7b): the at-rest ElicitationData was missing the canonical `serverName` field and used a non-canonical `submitted` state literal. D9 §5.1 locks the shape as: { type: "data-elicitation", data: { elicitationId: string, serverName: string, // MCP server requesting input prompt: string, schema: JsonSchema, state: "pending" | "answered" | "cancelled" }} This commit: - Adds `server_name: str = Field(alias="serverName")` to ElicitationData so MCP server identity round-trips with the elicitation request. - Tightens `state` to `Literal["pending", "answered", "cancelled"]` per D9 §5.1 / §6.3; the previous `submitted` would have forced #75 emit to translate state on every elicitation reply. - Keeps `response: Optional[dict[str, Any]]` per PM msg=042b0a7b ("may be kept but must not replace the canonical fields"); it carries the user's submitted value at-rest after the POST endpoint completes the round-trip. Tests: - Updates the every-part fixture with a representative serverName. - test_data_parts_use_wrapped_data_shape now asserts `serverName` is in the persisted data-elicitation keys. - test_persisted_keys_use_canonical_camelcase locks `serverName` (not `server_name`) and the canonical state literal.
- New test_data_elicitation_answered_state_round_trip — explicit round-trip of a `state="answered"` elicitation with a populated response, pinning the canonical state vocabulary against regression. Gates: 11/11 in agent_runtime/test_uimessage_at_rest.py pass; full unit suite 713 passed / 29 skipped / 0 failed (concurrent_control flake deselected, pre-existing). Ruff check + format clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(phase8 #75 D8.3): align elicitation to D9 §5 / D9.1 canonical (serverName + state="answered") Fast-follow per PR description's Test plan TODO. Reconciles ``ElicitationService`` and ``LifecycleEmitter.request_elicitation`` with the canonical ``ElicitationData`` shape locked by Bryce's #1694 head ``04d268be`` (Weston msg=89bafde9 4th-blocker fix + architect msg=8a76e5e0 D9.1 amend): - ``ElicitationOutcome`` literal: ``"submitted"`` -> ``"answered"`` (canonical state vocabulary per D9 §5.1 / D9.1) - ``ElicitationService.request_input(*, server_name=...)``: required kwarg threaded through to populate ``ElicitationData.server_name`` so the FE consent UI can surface which MCP server initiated the elicitation - ``LifecycleEmitter.request_elicitation(*, server_name=...)``: matching kwarg propagated to the underlying service - contract tests updated: ``test_payload_carries_canonical_server_name`` + ``test_request_input_rejects_empty_server_name`` added; existing state assertions flipped to ``"answered"`` Tests: ``pytest tests/unit_test/agent_runtime/test_tools_*.py -q`` => 84 passed (was 82 + 2 new server_name tests). Wire / at-rest shape stays canonical-clean: ``ElicitationData`` is imported directly from ``aperag/domains/agent_runtime/uimessage.py`` so the field set + alias casing follow #74 ``be7406c5`` -> ``04d268be`` single-source-of-truth. 
* fix(phase8 #75 D8.3): tenant ownership + multi-tenant registry + default-deny auth Address Weston's three blockers from minimal CR (msg=57cf4632) + the architect-upgraded fourth blocker (msg=19f2c9a9). All within PR scope per PM lock (msg=ab2ed5d3); none deferred. ## B2 (tenant-bound consent + elicitation ownership) - ``ConsentService`` records ``ConsentBinding(turn_id, user_id)`` at ``request_consent`` time; ``decide()`` raises :class:`ConsentOwnershipError` when ``actor_user_id`` does not match the bound user, or when ``expected_turn_id`` is provided and does not match the bound turn (defense in depth even when the user matches). - ``ElicitationService`` mirrors the same pattern via ``ElicitationBinding`` + :class:`ElicitationOwnershipError`. ``cancel(*, bypass_ownership=True)`` is reserved for internal-only callers (timeout sweeper / abort path) so user- facing handlers cannot accidentally skip the check. - ``LifecycleEmitter.request_consent`` / ``LifecycleEmitter.request_elicitation`` thread the new ``turn_id`` + ``user_id`` kwargs through to the underlying services. - HTTP endpoints moved to ``chat_id``-scoped paths to align with the existing pattern (``/agent/chats/{chat_id}/turns/{turn_id}/...``) and to leverage ``turn_service.get_turn_snapshot(user, chat, turn)`` for HTTP-layer ownership pre-check (raises ``ResourceNotFoundException`` -> 404 on cross-user / unknown turn). New endpoints: POST /agent/chats/{chat_id}/turns/{turn_id}/consent/{tool_call_id} POST /agent/chats/{chat_id}/turns/{turn_id}/elicit/{elicitation_id} Both translate ``ConsentOwnershipError`` / ``ElicitationOwnershipError`` -> 403, ``KeyError`` -> 404, ``ValueError`` -> 409 (already resolved) or 422 (validation). 
- Regression tests: test_decide_rejects_cross_user_actor / cross_turn_actor (consent) test_submit_rejects_cross_user_actor / cross_turn_actor (elicitation) test_request_consent_rejects_empty_turn_or_user test_request_input_rejects_empty_server_name (already there) ## B3 (registry composite key per scope_ref) - ``_ScopeIndex.entries`` keyed on ``(scope_ref, name)`` tuple; system tier uses ``scope_ref=None`` (single global namespace). Bot/user tiers use the owning ``scope_ref`` so different bots / users can independently register the same name without collision -- per D9 §1.1 multi-tenant boundary. - New ``_tier_key()`` helper composes the right key shape per scope. - ``effective_servers()`` switched to keyed iteration so the ``scope_ref`` filter happens at lookup time (was after iteration, which was too late once a same-name entry had already been overwritten). - ``unregister(scope, name, *, scope_ref=None)`` API added so bot/user removals can target the right (scope_ref, name) pair. - Regression tests: test_two_bots_can_register_same_name_without_collision test_two_users_can_register_same_name_without_collision test_user_register_does_not_leak_to_other_user_resolution test_bot_register_does_not_leak_to_other_bot_resolution test_unregister_is_scope_ref_aware_for_bot_user_tiers ## B4 (unknown-risk default-deny) - ``ToolAuthorizationPolicy.evaluate`` -- when the ``risk_resolver`` returns ``None`` for an unknown tool, the policy now returns ``visible=True, can_invoke_auto=False, requires_consent=True, risk="writes_user_data"`` instead of the previous ``READ_ONLY`` auto-invocable default. Per architect canonical lock msg=19f2c9a9: misclassified side-effect tools must NOT silently bypass the consent gate; the security-first fail-closed posture only costs an extra consent prompt for tools that operators forget to classify as ``READ_ONLY``. 
- Regression test: test_unknown_tool_default_deny_per_security_canonical test_unknown_tool_filter_visible_keeps_consent_required_tool ## Gates - ``pytest tests/unit_test/agent_runtime/test_tools_*.py -q``: 95 passed (was 84 + 11 new B2/B3/B4 tests; old elicitation tests re-targeted to ``actor_user_id="user-1"`` to match the test-fixture binding ``user_id="user-1"``) - ``pytest tests/unit_test/ -q --deselect concurrent_control/test_performance_comparison.py``: 828 passed / 29 skipped / 0 failed - ``ruff check`` + ``ruff format --check``: clean --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

earayu merged commit 5113730 into main Apr 25, 2026
4 checks passed

earayu deleted the phase8/d81-stream-emitter branch April 25, 2026 12:42

earayu mentioned this pull request Apr 25, 2026

feat(phase8 #75 D8.3): backend tool lifecycle + citations + consent + elicitation #1696

Merged

10 tasks
