Description
Problem
When using runner.run_async() with StreamingMode.SSE, there is no way to distinguish which partial streaming chunks belong to which LLM response turn within a single invocation.
In a typical flow where the LLM responds with text, calls a tool, and then responds with more text, the events look like this:
Event(partial=True, text="Looking up...") # LLM call 1
Event(partial=True, text="Looking up info...") # LLM call 1
Event(partial=False, function_call=...) # LLM call 1 (final)
Event(function_response=...) # Tool result
Event(partial=True, text="Here are the...") # LLM call 2
Event(partial=True, text="Here are the results...") # LLM call 2
Event(partial=False, text="Here are the results...") # LLM call 2 (final)
All of these events share the same invocation_id, and the id field changes on every yield. There is no field to programmatically determine that the first two partial chunks belong to LLM call 1 and the last two belong to LLM call 2.
Current workaround
The only way to detect turn boundaries is to watch for transition patterns: a partial=False event followed by function call/response events implicitly signals the end of one text group. This is fragile and forces the consumer to track state across events.
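For illustration, the heuristic looks roughly like the sketch below. StubEvent is a hypothetical stand-in that mirrors only the fields shown in the listing above; it is not the real Event class. The sketch also assumes that every LLM call ends with exactly one partial=False event, which is what the example flow suggests but is not something the consumer can rely on.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class StubEvent:
    """Hypothetical stand-in for an event; not the real Event class."""
    partial: bool = False
    text: Optional[str] = None
    function_call: Optional[dict] = None
    function_response: Optional[dict] = None


def infer_turns(
    events: Iterable[StubEvent],
) -> Iterator[Tuple[Optional[int], StubEvent]]:
    """Guess which LLM call each event belongs to from transition patterns."""
    turn = 1
    for event in events:
        if event.function_response is not None:
            # Tool results are not produced by an LLM call.
            yield None, event
            continue
        yield turn, event
        if not event.partial:
            # Assumption: a non-partial model event closes the current LLM call.
            turn += 1
```

If an LLM call ever emits more than one non-partial event, or none, the inferred numbering drifts without any error, which is the fragility described above.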
Proposed solution
Add a turn_id field to Event that:
- Is a 1-based integer counter incremented with each LLM call inside run_async
- Remains the same across all partial chunks and the final aggregated event of that LLM call
- Changes when a new LLM call starts (e.g., after tool execution)
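The intended semantics can be sketched as follows. This is only a model of the proposed behavior, not a patch against the actual run_async internals: SketchEvent and tag_turns are hypothetical names, and the input is assumed to be already grouped into one iterable of events per LLM call.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class SketchEvent:
    """Hypothetical event carrying the proposed turn_id field."""
    text: str = ""
    partial: bool = False
    turn_id: Optional[int] = None


def tag_turns(
    llm_calls: Iterable[Iterable[SketchEvent]],
) -> Iterator[SketchEvent]:
    """Stamp every event of the n-th LLM call with turn_id == n (1-based)."""
    for turn_id, call_events in enumerate(llm_calls, start=1):
        for event in call_events:
            # Partial chunks and the final event of one call share the same id.
            yield replace(event, turn_id=turn_id)
```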
Use case
We build a chat UI that streams partial text to the user via WebSocket. When the agent produces text, calls a tool, and then produces more text, we need to render them as separate message bubbles. Without a turn identifier, we cannot cleanly separate the two text blocks on the client side without implementing brittle heuristics based on event type transitions.
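With such a field, the client-side grouping would reduce to a keyed update, roughly as in the sketch below. build_bubbles is a hypothetical helper that takes (turn_id, text) pairs rather than real events. It assumes each event carries the latest full text for its turn, as the example listing suggests; if the chunks are deltas instead, the assignment would become an append.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_bubbles(
    events: Iterable[Tuple[Optional[int], Optional[str]]],
) -> List[str]:
    """Return one message bubble per turn_id, in turn order."""
    bubbles: Dict[int, str] = {}
    for turn_id, text in events:
        if turn_id is None or not text:
            # Skip tool results and events that carry no text.
            continue
        # Assumption: text is cumulative, so the newest value replaces the old.
        bubbles[turn_id] = text
    return [bubbles[key] for key in sorted(bubbles)]
```

No transition tracking is needed here: the consumer keeps only a map from turn_id to the current bubble text.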