refactor(phase8 #73 D8.1): backend AI SDK v5 stream emitter #1695
Merged
Land the wire-emission half of D8.1: the agent-runtime SSE endpoint now emits AI SDK v5 ``UI Message Stream Protocol`` part frames in place of the legacy ``AgentTimelineEventEnvelope`` JSON, advertising itself via the ``x-vercel-ai-ui-message-stream: v1`` response header that the FE ``@ai-sdk/react`` consumer (#76) keys on.

New ``aperag/domains/agent_runtime/wire/`` sub-package:

* ``parts.py`` -- Pydantic models for every v5 part type the runtime emits + ``data-citation`` (Anthropic-shape) / ``data-activity`` ApeRAG extensions + placeholder ``data-tool-consent`` / ``data-elicitation`` literals reserved for #75 chenyexuan; exposed as a discriminated ``StreamPart`` union with a ``TypeAdapter`` for round-trip parsing.
* ``translator.py`` -- pure ``translate_envelope(envelope, state)`` function mapping each timeline envelope to one-or-more parts per the D8.1 mapping table; per-turn ``TranslatorState`` carries text-block lifecycle bookkeeping; ``safe_tool_name_resolver`` hook reserved for #75 (raw tool name + empty metadata until then).

SSE route (``api/routes.py``) updated:

* New ``_format_part_frame`` writes ``id: <seq>\ndata: <json>\n\n`` AI SDK v5 frames; only the LAST part of an envelope fan-out gets the SSE ``id:`` so ``Last-Event-ID`` resume keeps pointing at the next envelope (translator docstring documents the invariant).
* ``stream_turn_events_view`` now wraps each envelope through the translator and yields one frame per part. Heartbeat switched to the SSE-comment form (``: heartbeat\n\n``) which is invisible to the v5 consumer. Generator wrapped in try/except that emits a synthetic ``error`` part on uncaught exceptions before re-raising.

Out of scope (per PM lock msg=82ba98fc): DB / Redis storage (#74), tool consent / elicitation / SafeToolName plumbing (#75), FE consumer (#76), agent reasoning loop. The translator is read-only over envelopes; storage shape is unchanged.

Tests:

* ``tests/unit_test/test_agent_runtime_wire_parts.py`` -- 14 contract tests covering every envelope→part mapping, JSON round-trip across the union, ``safe_tool_name_resolver`` plug-in seam, SSE response headers (v5 marker + Content-Type), and ``Last-Event-ID`` resume semantics.
* Updated ``test_agent_runtime_v3.py`` and ``test_agent_runtime_openapi_contract.py`` to assert on the new AI SDK v5 wire shape (hard-cut per Phase 8 msg=78fdb6fc: no dual emission, no envelope-format fallback).

Acceptance gates green: wire-parts suite + modularization_boundaries + v1_ghost_guard + openapi_spec all pass; ``make lint`` + ``make add-license`` clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
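The frame layout described in the commit message can be sketched as follows. This is a minimal illustration, assuming each part is already a JSON-serializable dict; `format_part_frame`, `frames_for_envelope`, and `HEARTBEAT` are hypothetical names, and the real `_format_part_frame` signature is not shown on this page.

```python
import json
from typing import Any, Optional

# SSE comment line: clients ignore lines that start with a colon.
HEARTBEAT = ": heartbeat\n\n"


def format_part_frame(part: dict[str, Any], seq: Optional[int] = None) -> str:
    """Render one stream part as an SSE frame with single-line JSON data."""
    payload = json.dumps(part, separators=(",", ":"))
    prefix = f"id: {seq}\n" if seq is not None else ""
    return f"{prefix}data: {payload}\n\n"


def frames_for_envelope(seq: int, parts: list[dict[str, Any]]) -> list[str]:
    """Fan one envelope out into frames; only the last frame carries id:."""
    last = len(parts) - 1
    return [
        format_part_frame(part, seq if index == last else None)
        for index, part in enumerate(parts)
    ]


# Illustrative payloads only; the real part fields live in wire/parts.py.
frames = frames_for_envelope(
    7,
    [
        {"type": "text-start", "id": "t1"},
        {"type": "text-delta", "id": "t1", "delta": "hi"},
    ],
)
```

Because the `id:` line is attached only to the closing frame, a client that disconnects before it still holds the previous envelope's sequence as its cursor.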
earayu added a commit that referenced this pull request on Apr 25, 2026
… elicitation (#1696)

* feat(phase8 #74 D8.2): first-cut UIMessage at-rest storage for agent path

Phase 8 task #74 (D8.2) -- first cut of the at-rest UIMessage storage layer per the canonical ``docs/modularization/agent-message-protocol-design.md`` and ``docs/modularization/agent-runtime-mcp-design.md`` (in main). This PR delivers the foundation:

  * ``aperag/domains/agent_runtime/uimessage.py`` (NEW) -- pydantic schema for ``UIMessage`` and every ``UIMessagePart`` variant (text / tool / source-url / source-document / data-citation / data-activity / data-tool-consent / data-elicitation), plus ``persistable_parts`` / ``args_preview`` / ``args_hash`` helpers enforcing D9 §A7 raw-args-private rule.
  * ``aperag/domains/agent_runtime/db/models.py`` -- new ``AgentMessage`` ORM (``agent_message`` table; 1:1 with ``agent_turn`` via ``turn_id``; ``parts`` JSON column carries the full UIMessage at rest; ``schema_version`` tag for FE forward-compat). Legacy ``AgentArtifact`` / ``AgentTimelineEvent`` tables retained during D8.x rollout -- D8.6 (#80) will drop them once the FE renderer is consuming AgentMessage exclusively.
  * ``aperag/migration/versions/...d8e2c4a17b91_add_agent_message_table.py`` -- new alembic revision chained off ``7c4e9e1f8b21``; pure additive (no rename / drop in this PR), idempotent migration.
  * ``aperag/domains/agent_runtime/storage.py`` -- extend ``AgentRuntimeRedisStore`` with ``write_message_snapshot`` / ``read_message_snapshot`` / ``delete_message_snapshot`` keyed on ``agent_runtime:turn:<id>:message``; same TTL as the live event buffer.
  * ``aperag/domains/agent_runtime/uimessage_store.py`` (NEW) -- ``UIMessageStore`` wraps the DB row + Redis snapshot behind a single ``write`` / ``read`` / ``delete`` surface. ``write`` filters transient parts (currently only ``data-activity``); ``read`` prefers Redis but falls back to the durable DB row when the snapshot is cold. ``UIMessageDbOps`` is a SQLAlchemy-bound helper kept separate so unit tests can inject in-memory fakes.
  * ``tests/unit_test/agent_runtime/test_uimessage_at_rest.py`` (NEW) -- at-rest reload contract tests pinning the three invariants Weston named as the prerequisite for unblocking D8.4b (msg=50c90f6f / msg=cef89ed8): round-trip fidelity across every persistable part variant, transient exclusion, snapshot consistency between Redis and DB.

Out of scope (left for follow-up commits / sibling lanes per PM msg=a3c31f79):

  * Wire/streaming emitter -- D8.1 (#73, cuiwenbo)
  * Tool / citation / consent / elicitation enforcement of the 7-point D9 §A4 contract -- D8.3 (#75, chenyexuan)
  * Full event-to-UIMessage projection in the runtime services -- follow-up commit on this branch once #73 stream contract is visible
  * Drop of legacy ``agent_artifact`` / ``agent_timeline_event`` tables -- D8.6 (#80)
  * Non-agent bot path migration -- D8.5 (#79)
  * FE renderer -- D8.4a/b/c (#76/#77/#78)

Gates: 709 pass / 29 skip / 1 deselect / 0 fail unit suite (incl. 7 new contract tests + 24 boundary intact); ruff lint+format clean.

* refactor(phase8 #73 D8.1): backend AI SDK v5 stream emitter

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(phase8 #74 D8.2): wrap data-* parts in {type, data: {...}} per D8 §2 canonical

Architect canonical lock 2026-04-25 (msg=ad6168e7) + PM scope-tightening (msg=1ff7ed9e): persisted data-* parts must round-trip byte-for-byte with the wire shape produced by #73 cuiwenbo's emitter -- D8 §2 forbids a wire/at-rest converter layer. Pre-fix at-rest used flat fields (DataCitationPart.cited_text/.location, DataToolConsentPart.tool_call_id/..., DataElicitationPart.elicitation_id/...) which violated the same-schema canonical and would have forced #75 chenyexuan or the FE renderer (#76/#77) to maintain dual code paths.

This commit:

- Introduces inner data classes (CitationData / ActivityData / ToolConsentData / ElicitationData) so each data-* part follows {type, data: {...}} with the field set unchanged.
- Updates the every-part fixture in the contract test to construct parts via the wrapped form.
- Adds test_data_parts_use_wrapped_data_shape -- a dedicated lock that reads the persisted DB row and asserts each data-* part's keys are exactly {type, data} and that data carries the canonical fields.

Tests: 8/8 in agent_runtime/test_uimessage_at_rest.py pass; full unit suite 711/711 (29 skip), ruff check + format clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(phase8 #74 D8.2): align ToolPart with D8 §2.4 / D9 §A1+§A6 SafeToolName shape

Weston minimal CR (msg=1812fb03) + architect canonical affirm (msg=8412dce5): the at-rest ToolPart used a flat `type: "tool"` literal plus a separate `tool_name` field, which is neither the AI SDK v5 streaming form (`tool-input-*` / `tool-output-*`) nor the v5 consolidated form (`type: "tool-<safeName>"`). That third intermediate shape would have forced #75 emit + #76/#77 FE renderer to do `tool` -> `tool-<name>` conversion -- the same wire/at-rest schema drift class we just rejected for the data-* parts.
This commit:

- Encodes the SafeToolName directly in `ToolPart.type` via a regex-validated `^tool-[A-Za-z0-9_-]+$` discriminator string, matching D8 §2.4 + D9 §A1/§A6.
- Drops the redundant `tool_name` field; MCP server/tool identity remains carried in `metadata`.
- Replaces the misplaced `args_preview` / `args_hash` fields with the canonical `input: Optional[Any]`. Those redaction helpers stay module-level (`args_preview()` / `args_hash()`) so #75 D8.3 can use them when building DataToolConsentPart.data per D9 §A7.
- Updates the every-part fixture and the round-trip expected_types to the new `tool-<name>` discriminator.
- Adds test_tool_part_type_uses_safe_tool_name_form -- pins the persisted tool part `type` matches the SafeToolName regex and confirms no top-level `tool_name` field leaks back.

SafeToolName *resolution* (raw MCP name → safe form, collision hash suffix per D9 §A6) remains #75's scope; #74 only enforces the canonical storage shape.

Tests: 9/9 in agent_runtime/test_uimessage_at_rest.py pass; full unit suite 711/711 (29 skip); the one observed concurrent_control flake passes on rerun. Ruff check + format clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(phase8 #74 D8.2): persist UIMessage parts with canonical camelCase aliases

Weston minimal CR (msg=59a459c6) + architect canonical affirm: the at-rest part models lacked Pydantic aliases, so `model_dump(by_alias=True)` fell back to snake_case (`source_id`, `tool_call_id`, `args_preview`, `elicitation_id`, etc.), diverging from cuiwenbo wire `parts.py` (#73) which already serializes camelCase per AI SDK v5. That breaks the D8 §2 same-schema invariant a third time and would have forced #76/#77 FE renderer to handle two casings.

This commit attaches `Field(alias=...)` + `ConfigDict(populate_by_name=True)` to every camelCase-canonical field so JSON serialization matches the wire byte-for-byte while Python call sites still use snake_case:

- SourceUrlPart.source_id → sourceId
- SourceDocumentPart.source_id → sourceId
- SourceDocumentPart.media_type → mediaType
- ToolPart.tool_call_id → toolCallId
- ToolPart.error_text → errorText
- ToolConsentData.tool_call_id → toolCallId
- ToolConsentData.tool_name → toolName
- ToolConsentData.args_preview → argsPreview
- ToolConsentData.args_hash → argsHash
- ToolConsentData.requested_at → requestedAt
- ElicitationData.elicitation_id → elicitationId

Snake_case stays where D8 §2 / Anthropic-shape canon requires it: CitationData.cited_text and the four CitationLocation variants (char_location / page_location / content_block_location / url_citation plus their internal start_char / end_char / doc_index / doc_title / page_index / block_index fields) follow the Anthropic citation convention unchanged.

Tests:

- test_data_parts_use_wrapped_data_shape now asserts the wrapped data-tool-consent / data-elicitation payloads carry camelCase keys (toolCallId / argsPreview / requestedAt / elicitationId, etc.).
- New test_persisted_keys_use_canonical_camelcase locks the camelCase contract end-to-end against the persisted DB row, explicitly failing if any of the legacy snake_case forms reappear.
- test_tool_part_type_uses_safe_tool_name_form additionally pins toolCallId on the tool part.

Gates: 10/10 in agent_runtime/test_uimessage_at_rest.py pass; full unit suite 712/29 skip/0 fail (concurrent_control flake deselected, pre-existing). Ruff check + format clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(phase8 #75 D8.3): backend tool lifecycle + citations + consent + elicitation

Implements the seven-point D9 §A4 contract that gates tool execution in the agent runtime, plus the Anthropic-shape citation transform:

- tools/safe_name.py -- D9 §A1+§A6 SafeToolName + collision sha256 suffix + (mcpServer, mcpToolName, safeName) reverse lookup
- tools/registry.py -- D9 §1.1+§A5 three-tier MCP registry with system-namespace reservation and audit-logged admin alias (no silent override)
- tools/authorization.py -- D9 §2 three-level auth (visibility / invocation / consent) with §2.2 default policy + per-tool risk overrides
- tools/args_cache.py -- D9 §A7 backend-private raw-args cache with short TTL; wire-side argsPreview / argsHash re-exported from the canonical helpers in aperag/domains/agent_runtime/uimessage.py (single-source-of-truth)
- tools/consent.py -- D9 §3 consent request <-> decision flow with asyncio.Event waiter, single-use raw-args consume, denial-drops-cache invariant
- tools/elicitation.py -- D9 §5 elicitation request <-> answer flow with schema-validated response + cancel hook; pluggable validator (default checks JSON Schema required fields)
- tools/lifecycle.py -- envelope event-type constants for tool.consent.* / tool.elicitation.* + translate_lifecycle_envelope() translator extension + LifecycleEmitter glue between consent/elicitation services and the runtime's EventService.append_event path
- tools/citations.py -- typed Anthropic-shape citation builder for char_location / page_location / content_block_location / url_citation, fed from RAG ReferenceBundleItem metadata

Wire-side refinement:

- wire/parts.py DataToolConsentPart + DataElicitationPart placeholders refined to use the canonical wrapped {type, data: ToolConsentData / ElicitationData} shape (no more `transient: True` placeholder; per D9 §3.1 / §5.1 these parts are persisted, audit-trail relevant)

api/routes.py:

- chained translate_lifecycle_envelope() after translate_envelope() so consent/elicitation envelopes emit DataToolConsentPart / DataElicitationPart on the SSE stream
- new POST /agent/turns/{turn_id}/consent/{tool_call_id} -- records the user's decision, wakes the runtime waiter, appends the tool.consent.decided envelope so SSE replay carries the resolved part
- new POST /agent/turns/{turn_id}/elicit/{elicitation_id} -- submits a schema-validated response, wakes the waiter, appends the tool.elicitation.resolved envelope

Contract tests (focused unit_test/agent_runtime/test_tools_*.py, 82 new tests, all passing locally; full unit suite 814 / 29 skip / 0 fail):

- test_tools_safe_name.py (12 tests) -- D9 §A1+§A6 lock
- test_tools_registry.py (12 tests) -- D9 §1.1+§A5 lock
- test_tools_authorization.py (11 tests) -- D9 §2 lock
- test_tools_args_cache.py (12 tests) -- D9 §A7 raw-args privacy lock
- test_tools_consent.py (9 tests) -- D9 §3 consent flow lock
- test_tools_elicitation.py (9 tests) -- D9 §5 elicitation lock
- test_tools_lifecycle.py (9 tests) -- D9 §6 translator extension
- test_tools_citations.py (9 tests) -- D8 §2.5 typed citation lock

7-point D9 §A4 verification:

1. SafeToolName + MCP metadata (D9 §A1+§A6) -- safe_name.py
2. AI SDK v5 + data-tool-consent custom data-part (§A2) -- wire/parts.py + lifecycle.py
3. argsPreview + argsHash backend-private (§A7) -- args_cache.py + consent.py
4. Registry no silent system override (§A5) -- registry.py
5. data-elicitation schema-validated input (§5) -- elicitation.py
6. Three-level authorization (§2) -- authorization.py
7. PydanticAI as default candidate (§A3) -- runtime backbone unchanged (per architect msg=ff619d8a / Weston msg=50c90f6f C2 lock, this PR scope explicitly excludes backbone rewrite)

Built on:

- #73 D8.1 wire emitter (cuiwenbo, PR #1695 / 5113730 in main) -- consumes wire/parts.py + chains lifecycle translator via api/routes.py
- #74 D8.2 at-rest UIMessage storage (Bryce, PR #1694 head be7406c) -- imports ToolConsentData / ElicitationData / args_preview / args_hash from aperag/domains/agent_runtime/uimessage.py for wire/at-rest same-schema canonical

* fix(phase8 #74 D8.2): align DataElicitationPart with D9 §5.1 canonical

Weston minimal CR (msg=51dffdc9) + PM lock (msg=042b0a7b): the at-rest ElicitationData was missing the canonical `serverName` field and used a non-canonical `submitted` state literal. D9 §5.1 locks the shape as:

    { type: "data-elicitation",
      data: {
        elicitationId: string,
        serverName: string,   // MCP server requesting input
        prompt: string,
        schema: JsonSchema,
        state: "pending" | "answered" | "cancelled"
      }}

This commit:

- Adds `server_name: str = Field(alias="serverName")` to ElicitationData so MCP server identity round-trips with the elicitation request.
- Tightens `state` to `Literal["pending", "answered", "cancelled"]` per D9 §5.1 / §6.3; the previous `submitted` would have forced #75 emit to translate state on every elicitation reply.
- Keeps `response: Optional[dict[str, Any]]` per PM msg=042b0a7b ("may be kept, but must not replace the canonical field"); it carries the user's submitted value at-rest after the POST endpoint completes the round-trip.

Tests:

- Updates the every-part fixture with a representative serverName.
- test_data_parts_use_wrapped_data_shape now asserts `serverName` is in the persisted data-elicitation keys.
- test_persisted_keys_use_canonical_camelcase locks `serverName` (not `server_name`) and the canonical state literal.
- New test_data_elicitation_answered_state_round_trip -- explicit round-trip of a `state="answered"` elicitation with a populated response, pinning the canonical state vocabulary against regression.

Gates: 11/11 in agent_runtime/test_uimessage_at_rest.py pass; full unit suite 713 passed / 29 skipped / 0 failed (concurrent_control flake deselected, pre-existing). Ruff check + format clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(phase8 #75 D8.3): align elicitation to D9 §5 / D9.1 canonical (serverName + state="answered")

Fast-follow per PR description's Test plan TODO. Reconciles ``ElicitationService`` and ``LifecycleEmitter.request_elicitation`` with the canonical ``ElicitationData`` shape locked by Bryce's #1694 head ``04d268be`` (Weston msg=89bafde9 4th-blocker fix + architect msg=8a76e5e0 D9.1 amend):

- ``ElicitationOutcome`` literal: ``"submitted"`` -> ``"answered"`` (canonical state vocabulary per D9 §5.1 / D9.1)
- ``ElicitationService.request_input(*, server_name=...)``: required kwarg threaded through to populate ``ElicitationData.server_name`` so the FE consent UI can surface which MCP server initiated the elicitation
- ``LifecycleEmitter.request_elicitation(*, server_name=...)``: matching kwarg propagated to the underlying service
- contract tests updated: ``test_payload_carries_canonical_server_name`` + ``test_request_input_rejects_empty_server_name`` added; existing state assertions flipped to ``"answered"``

Tests: ``pytest tests/unit_test/agent_runtime/test_tools_*.py -q`` => 84 passed (was 82 + 2 new server_name tests).

Wire / at-rest shape stays canonical-clean: ``ElicitationData`` is imported directly from ``aperag/domains/agent_runtime/uimessage.py`` so the field set + alias casing follow #74 ``be7406c5`` -> ``04d268be`` single-source-of-truth.
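The elicitation commits above mention a pluggable validator whose default checks JSON Schema required fields, and a state vocabulary of `pending` / `answered` / `cancelled`. A minimal sketch of that default check follows; `missing_required` and `ElicitationState` are illustrative names, not the actual contents of `tools/elicitation.py`.

```python
from typing import Any, Literal, get_args

# State vocabulary quoted from D9 §5.1 as cited in the commit message.
ElicitationState = Literal["pending", "answered", "cancelled"]
STATES = get_args(ElicitationState)


def missing_required(schema: dict[str, Any], response: dict[str, Any]) -> list[str]:
    """Return the names listed in schema["required"] that response lacks."""
    return [name for name in schema.get("required", []) if name not in response]


# Hypothetical schema and response, for illustration only.
schema = {"type": "object", "required": ["city", "date"]}
missing = missing_required(schema, {"city": "Oslo"})
```

A response with a non-empty `missing` list would be rejected; how the real service reports that (the commit mentions a 422 for validation errors) is outside this sketch.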
* fix(phase8 #75 D8.3): tenant ownership + multi-tenant registry + default-deny auth

Address Weston's three blockers from minimal CR (msg=57cf4632) + the architect-upgraded fourth blocker (msg=19f2c9a9). All within PR scope per PM lock (msg=ab2ed5d3); none deferred.

## B2 (tenant-bound consent + elicitation ownership)

- ``ConsentService`` records ``ConsentBinding(turn_id, user_id)`` at ``request_consent`` time; ``decide()`` raises :class:`ConsentOwnershipError` when ``actor_user_id`` does not match the bound user, or when ``expected_turn_id`` is provided and does not match the bound turn (defense in depth even when the user matches).
- ``ElicitationService`` mirrors the same pattern via ``ElicitationBinding`` + :class:`ElicitationOwnershipError`. ``cancel(*, bypass_ownership=True)`` is reserved for internal-only callers (timeout sweeper / abort path) so user-facing handlers cannot accidentally skip the check.
- ``LifecycleEmitter.request_consent`` / ``LifecycleEmitter.request_elicitation`` thread the new ``turn_id`` + ``user_id`` kwargs through to the underlying services.
- HTTP endpoints moved to ``chat_id``-scoped paths to align with the existing pattern (``/agent/chats/{chat_id}/turns/{turn_id}/...``) and to leverage ``turn_service.get_turn_snapshot(user, chat, turn)`` for HTTP-layer ownership pre-check (raises ``ResourceNotFoundException`` -> 404 on cross-user / unknown turn). New endpoints:

      POST /agent/chats/{chat_id}/turns/{turn_id}/consent/{tool_call_id}
      POST /agent/chats/{chat_id}/turns/{turn_id}/elicit/{elicitation_id}

  Both translate ``ConsentOwnershipError`` / ``ElicitationOwnershipError`` -> 403, ``KeyError`` -> 404, ``ValueError`` -> 409 (already resolved) or 422 (validation).
- Regression tests:

      test_decide_rejects_cross_user_actor / cross_turn_actor (consent)
      test_submit_rejects_cross_user_actor / cross_turn_actor (elicitation)
      test_request_consent_rejects_empty_turn_or_user
      test_request_input_rejects_empty_server_name (already there)

## B3 (registry composite key per scope_ref)

- ``_ScopeIndex.entries`` keyed on ``(scope_ref, name)`` tuple; system tier uses ``scope_ref=None`` (single global namespace). Bot/user tiers use the owning ``scope_ref`` so different bots / users can independently register the same name without collision -- per D9 §1.1 multi-tenant boundary.
- New ``_tier_key()`` helper composes the right key shape per scope.
- ``effective_servers()`` switched to keyed iteration so the ``scope_ref`` filter happens at lookup time (was after iteration, which was too late once a same-name entry had already been overwritten).
- ``unregister(scope, name, *, scope_ref=None)`` API added so bot/user removals can target the right (scope_ref, name) pair.
- Regression tests:

      test_two_bots_can_register_same_name_without_collision
      test_two_users_can_register_same_name_without_collision
      test_user_register_does_not_leak_to_other_user_resolution
      test_bot_register_does_not_leak_to_other_bot_resolution
      test_unregister_is_scope_ref_aware_for_bot_user_tiers

## B4 (unknown-risk default-deny)

- ``ToolAuthorizationPolicy.evaluate`` -- when the ``risk_resolver`` returns ``None`` for an unknown tool, the policy now returns ``visible=True, can_invoke_auto=False, requires_consent=True, risk="writes_user_data"`` instead of the previous ``READ_ONLY`` auto-invocable default. Per architect canonical lock msg=19f2c9a9: misclassified side-effect tools must NOT silently bypass the consent gate; the security-first fail-closed posture only costs an extra consent prompt for tools that operators forget to classify as ``READ_ONLY``.
- Regression tests:

      test_unknown_tool_default_deny_per_security_canonical
      test_unknown_tool_filter_visible_keeps_consent_required_tool

## Gates

- ``pytest tests/unit_test/agent_runtime/test_tools_*.py -q``: 95 passed (was 84 + 11 new B2/B3/B4 tests; old elicitation tests re-targeted to ``actor_user_id="user-1"`` to match the test-fixture binding ``user_id="user-1"``)
- ``pytest tests/unit_test/ -q --deselect concurrent_control/test_performance_comparison.py``: 828 passed / 29 skipped / 0 failed
- ``ruff check`` + ``ruff format --check``: clean

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
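The B3 composite-key idea can be illustrated with a toy single-tier registry. This is a sketch under stated assumptions: `ScopedRegistry` and its methods are hypothetical stand-ins, not the real `_ScopeIndex` / `effective_servers()` code, and the URLs are placeholders.

```python
from typing import Optional

# (scope_ref, name); the system tier would use scope_ref=None.
Key = tuple[Optional[str], str]


class ScopedRegistry:
    """Toy registry for one tier, keyed on (scope_ref, name)."""

    def __init__(self) -> None:
        self._entries: dict[Key, str] = {}

    def register(self, name: str, url: str, *, scope_ref: Optional[str] = None) -> None:
        self._entries[(scope_ref, name)] = url

    def unregister(self, name: str, *, scope_ref: Optional[str] = None) -> None:
        self._entries.pop((scope_ref, name), None)

    def resolve(self, name: str, *, scope_ref: Optional[str] = None) -> Optional[str]:
        # The scope filter is part of the lookup key, so one owner's entry
        # can neither overwrite nor leak into another owner's resolution.
        return self._entries.get((scope_ref, name))


bots = ScopedRegistry()
bots.register("search", "https://a.example/mcp", scope_ref="bot-a")
bots.register("search", "https://b.example/mcp", scope_ref="bot-b")
bots.unregister("search", scope_ref="bot-b")
```

Keying on the name alone would have let the second `register` call overwrite the first, which is the failure mode the commit describes; filtering by `scope_ref` after the fact cannot recover an entry that was already replaced.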
Phase 8 task #73 — D8.1 backend AI SDK v5-compatible stream emitter
Closes task #73 (msg=6a5a8459) per PM scope-lock msg=82ba98fc + architect contract first-cut LGTM msg=27ca7cee.
Scope (PM-locked)
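Backend wire emission only. Replace the legacy `AgentTimelineEventEnvelope` SSE wire with AI SDK v5 UI Message Stream Protocol parts (per `docs/modularization/agent-message-protocol-design.md`).
NOT touched (handed to other lanes): DB / Redis storage (#74); tool consent / elicitation / SafeToolName plumbing (#75); FE consumer (#76); agent reasoning loop.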
Write set (5 files, +1571/-43)
Backend
- `aperag/domains/agent_runtime/wire/__init__.py` (NEW): public surface (`StreamPart`, `StreamPartAdapter`, `TranslatorState`, `translate_envelope`, `dump_part`, `parse_part`)
- `aperag/domains/agent_runtime/wire/parts.py` (NEW, 411 LOC): Pydantic models for AI SDK v5 stream parts
- `aperag/domains/agent_runtime/wire/translator.py` (NEW, 420 LOC): pure `translate_envelope(envelope, state, *, safe_tool_name_resolver=None) -> list[StreamPart]` + per-turn `TranslatorState`
- `aperag/domains/agent_runtime/api/routes.py`: SSE route updated:
  - Response header `x-vercel-ai-ui-message-stream: v1`
  - `_format_part_frame` replaces legacy `_format_sse` (no `event:` field, single-line JSON `data:`)
  - Synthetic `error` part on uncaught exception

Tests
- `tests/unit_test/test_agent_runtime_wire_parts.py` (NEW, 570 LOC): 14 tests

Sequence Convention (canonical lock per architect msg=7b2169c4 + clarification msg=0b8516b6)
Implementation choice: Option 2 — Envelope-atomic replay (per architect's enumeration).
Mechanics:
- `AgentTimelineEvent.sequence` (DB-backed) maps 1:1 to one envelope, which the translator may fan out into N stream parts.
- Only the last part of an envelope's fan-out carries the `id: {sequence}` line; intermediate parts are emitted with `data:` only (no `id:`).
- The client's `Last-Event-ID` only advances when a frame with `id:` is received. So if a client disconnects mid-fan-out (before the closing `id:` of the current envelope), its last advanced cursor remains at the previous envelope's sequence.
- On reconnect (`Last-Event-ID: <prev>`), the server replays from `sequence > prev`, which means the entire current envelope is replayed from scratch, including parts the client may have already received.

Client expectations (relevant to #76 D8.4a FE):
- Replayed parts are deduplicated by stable id: `toolCallId` (tool parts), `artifact_id` (citations / source-urls), text-block `id` (`text-start` / `text-delta` / `text-end`). On replay, an already-received part is identified by the same stable id and dropped.
- `useChat` / message-store layers handle this naturally: their reducers are id-keyed.

Trade-off rationale:
- `sequence` semantics and #74 at-rest storage are unchanged.

Errata for first-cut msg=df929617 §B.4
The first-cut said "keep sequence monotonically increasing during fan-out (one sequence per new part)"; that wording would imply Option 3 (per-part new sequence). The actual implementation is Option 2 above; the PR description is the authoritative canonical reference.
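The envelope-atomic replay convention above can be simulated in a few lines. This is a sketch, not the route code: `to_frames`, `cursor_after`, and `replay_from` are hypothetical helpers, and the part payloads are illustrative.

```python
from typing import Any, Optional

Part = dict[str, Any]
Envelope = tuple[int, list[Part]]      # (sequence, fanned-out parts)
Frame = tuple[Optional[int], Part]     # (SSE id or None, part payload)


def to_frames(envelopes: list[Envelope]) -> list[Frame]:
    """Flatten envelopes into frames; only the closing part carries the id."""
    frames: list[Frame] = []
    for seq, parts in envelopes:
        for index, part in enumerate(parts):
            is_last = index == len(parts) - 1
            frames.append((seq if is_last else None, part))
    return frames


def cursor_after(frames: list[Frame], start: int = 0) -> int:
    """Client-side Last-Event-ID: advances only on frames that carry an id."""
    cursor = start
    for frame_id, _part in frames:
        if frame_id is not None:
            cursor = frame_id
    return cursor


def replay_from(envelopes: list[Envelope], last_event_id: int) -> list[Envelope]:
    """Server-side resume: whole envelopes with sequence > last_event_id."""
    return [(seq, parts) for seq, parts in envelopes if seq > last_event_id]


envelopes: list[Envelope] = [
    (1, [{"type": "text-start", "id": "t1"}]),
    (2, [
        {"type": "tool-input-available", "toolCallId": "c1"},
        {"type": "tool-output-available", "toolCallId": "c1"},
    ]),
]
received = to_frames(envelopes)[:2]   # client disconnects mid fan-out of envelope 2
cursor = cursor_after(received)       # cursor stays at envelope 1
replayed = replay_from(envelopes, cursor)
```

In this run the client has already seen the first part of envelope 2, yet the replay returns both of its parts, which is why the consumer side has to tolerate repeated parts.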
Contract first-cut for #74 / #75 unblock (per PM msg=387fd639)
Wire format
Resume / error / abort
- `Last-Event-ID: <last-seen-sequence>` header → replay from `sequence > last-seen` (envelope-atomic).
- `error {errorText}` + `finish` part emitted before stream closes.
- `abort` + `finish` part emitted before stream closes.

Hand-off seams (for #74 / #75)
- `AgentTimelineEventEnvelope` unchanged. UIMessage at-rest schema (D8 §2) is your write-set; my `wire/parts.py` literal type tags align with your `aperag/domains/agent_runtime/uimessage.py` (Bryce msg=f3f9fc90 confirmed).
- `safe_tool_name_resolver: Callable[[str], tuple[str, dict]] | None` hook on `TranslatorState`.
- `data-tool-consent` / `data-elicitation` part literals are reserved in `parts.py`.

Acceptance gates
- `pytest test_agent_runtime_wire_parts + agent_runtime/ + test_v1_ghost_guard + test_modularization_boundaries + test_openapi_spec -q` → 55 passed
- `make lint` + `make add-license` clean
- `128409ba` (#66 G5b-impl included)

Caveats / known gaps (out of #73 scope, flagged for downstream)
- `reference_bundle` items not yet inlined in envelope `data`: today `runtime.py` emits `data={artifact_id, artifact_type}` only. Translator looks for `data['items']` first then `data['payload']['items']`, defaulting to empty list. Without runtime inlining items into the envelope (or storage materializing them when SSE reads), no citations will surface to FE. Out of #73 scope; flag for #74 / runtime inlining hook decision before #76 D8.4a integration.
- `text-end` not emitted on mid-stream `turn.failed`: failure paths emit `error` + `finish` but don't close any open `text-start` block. FE should treat `error` / `finish` as implicit close.
- `safe_tool_name_resolver` hook. Translator currently emits raw envelope `tool_name` with empty `metadata={}`.
- `data-tool-consent` / `data-elicitation` flow: part-type literals reserved in `parts.py`; flow logic is #75's lane.

Ghost-check
none. No new backend coupling, no DB schema change, no FE touch. The wire format change is the explicit hard-cut per Phase 8 philosophy (earayu2 msg=78fdb6fc) — FE consumers updated in #76 D8.4a (standby).
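Two of the translator fallbacks listed under the caveats can be sketched as plain functions. Both `extract_reference_items` and `resolve_tool_name` are hypothetical names written for illustration; only the lookup order and the `Callable[[str], tuple[str, dict]] | None` hook shape come from this description. Treating a present-but-empty `data['items']` as final is an assumption of the sketch.

```python
from typing import Any, Callable, Optional

Resolver = Callable[[str], tuple[str, dict]]


def extract_reference_items(data: dict[str, Any]) -> list[Any]:
    """Look for data['items'], then data['payload']['items'], else []."""
    items = data.get("items")
    if items is None:
        payload = data.get("payload")
        items = payload.get("items") if isinstance(payload, dict) else None
    return list(items or [])


def resolve_tool_name(raw_name: str, resolver: Optional[Resolver] = None) -> tuple[str, dict]:
    """Use the resolver hook when supplied; else raw name + empty metadata."""
    if resolver is None:
        return raw_name, {}
    return resolver(raw_name)


# Envelope data as emitted today per the caveat: no items inlined.
today = {"artifact_id": "a1", "artifact_type": "reference_bundle"}
citations = extract_reference_items(today)
```

With today's envelope shape the item list comes back empty, which matches the caveat that no citations surface to the FE until items are inlined or materialized.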
🤖 Generated with Claude Code