feat(openab-agent): text streaming via SSE#933
Conversation
- LlmProvider::chat() now accepts an optional TextCallback that receives text chunks as they arrive from the LLM - AnthropicProvider: switch to stream:true, parse SSE events (content_block_delta/text_delta), invoke callback per chunk - OpenAiProvider: parse SSE line-by-line (response.output_text.delta), invoke callback per chunk instead of collecting full response - Agent::run() forwards the callback to the provider - ACP server emits session/update notifications per text chunk, enabling real-time streaming to Discord - Set agentCapabilities.streaming = true in initialize response - Add reqwest 'stream' feature, tokio-util, futures-util deps - Add test: test_agent_streams_text_via_callback
|
All PRs must reference a prior Discord discussion to ensure community alignment before implementation. Please edit the PR description to include a link like: This PR will be automatically closed in 3 days if the link is not added. |
Address findings from 覺渡法師: F1 🔴: Fix fake streaming — callback now writes directly to stdout via Arc<Mutex<Stdout>> instead of buffering in a Vec. Text chunks reach the harness immediately as they arrive from the LLM. F2 🟡: Mark filesystem-touching test with #[ignore] F3 🟡: Rename tests to <scenario>_<expected_outcome> pattern F4 🟡: Change TextCallback from Box<dyn Fn> to dyn Fn (type alias) to avoid double-indirection when passed as &TextCallback
F1 🔴: Fix premature break when model returns text + tool_calls in same turn. Now only breaks when tool_calls is empty — text with concurrent tool_use correctly continues the loop. F3 🟡: Add 'error' event handling to Anthropic SSE parser for robustness against mid-stream errors. F4 🟡: Add 'error' event handling to OpenAI SSE parser. Note: the OpenAI Responses API emits fully-assembled items via response.output_item.done — no manual argument fragment merging is needed (unlike Chat Completions streaming).
|
CHANGES REQUESTED What This PR DoesAdds real-time text streaming to openab-agent. LLM responses are now emitted as How It Works
Findings
Reviewers
What's NextThe harness side ( |
- Remove redundant session_id_owned/stdout intermediates that trigger clippy::redundant_clone with -D warnings - Add move to streaming callback closure for clarity - Fix Cargo.lock to list futures-util (matches Cargo.toml)
OpenAB PR ScreeningThis is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Screening reportposted screening comment and moved the project item to `PR-Screening`.GitHub comment: #933 (comment) IntentAdd real-time text streaming to FeatFeature work. Adds provider SSE streaming, optional Who It ServesPrimary: Discord users. Secondary: agent runtime operators and maintainers. Rewritten PromptImplement text-only streaming in Merge PitchWorth advancing because it tackles visible response latency. Main risks are SSE parsing, callback ordering/backpressure, and preserving tool-call behavior. Best-Practice ComparisonOpenClaw/Hermes mostly apply around explicit delivery routing, session isolation, and observable run logs. Persistence/file locking are less relevant unless streamed partial state is later resumed. Implementation Options
Comparison TableIncluded in the GitHub comment. RecommendationBalanced path: advance the agent-side streaming PR if tests protect provider parsing and tool behavior, then split harness handling into the next PR. |
- Replace match with single arm to if-let (clippy::single_match) - Replace len() > 0 with !is_empty() (clippy::len_zero)
…utcome> convention
- Revert loop condition to original: break when text+tool_calls coexist (F1 — behavioral change reverted) - Preserve current_text in OpenAI path when output_items also present, avoiding silent discard (F3) - Add TODO for stdout handle consolidation in future multi-session work (F2)
What This PR Does
Adds real-time text streaming to openab-agent. Instead of waiting for the full LLM response before sending anything to the harness, text chunks are now emitted as
session/updatenotifications as they arrive from the API.How It Works
Architecture change:
Changes by file:
Cargo.tomlreqwest/stream,tokio-util,futures-utilllm.rsLlmProvider::chat()acceptsOption<&TextCallback>agent.rsAgent::run()accepts and forwards the callbackacp.rssession/updateper text chunk;streaming: truein capabilitiesKey design decisions:
Nonefor the callback gives the same behavior as beforecontent_block_delta/text_deltaeventsresponse.output_text.deltaeventsTesting
test_agent_streams_text_via_callbackunit testNonefor callback (no behavior change)What's Next
The harness (
src/acp/connection.rs) needs to handle multiplesession/updatenotifications and progressively edit the Discord message. That's a separate PR.