
stream_async() yields events with non-serializable fields, breaking AgentCore SSE streaming #1928

@sbaryakov

Summary

When using stream_async() with AgentCore (following the official example), events yielded to the caller contain non-JSON-serializable objects (Agent, OpenTelemetry Span, etc.) merged into the event dict by prepare(). This causes AgentCore to fall back to repr() serialization, producing massive Python repr strings on the SSE wire that the consumer cannot parse.

Observed behavior

On long-running agent invocations (~13 tool-calling turns with extended thinking enabled), the SSE response stream hits the AgentCore 100 MB response payload limit and is truncated with HTTP/2 INTERNAL_ERROR before the final structured_result can be emitted.

Measured breakdown of a typical stream:

Content                                        Events  Size     % of stream
ModelStreamEvent subclasses (repr serialized)  782     99.8 MB  99.8%
ModelStreamChunkEvent (valid JSON)             956     200 KB   0.2%

The 782 repr-serialized events average 131 KB each because they contain the full chain-of-thought reasoning text. They appear on the wire as Python repr (single quotes) instead of JSON, e.g.:

data: {"reasoningText": "Let me think...", "delta": {"reasoningContent": {"text": "Let me think..."}}, "reasoning": true, "agent": <strands.agent.agent.Agent object at 0x...>, "event_loop_cycle_span": <opentelemetry.trace.Span object at 0x...>, ...}
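For illustration, the serialization failure can be reproduced in isolation. `FakeAgent` below is a hypothetical stand-in for the `Agent`/`Span` instances present in the real event dict; it is not part of the SDK:

```python
import json

class FakeAgent:
    """Stand-in for the non-serializable Agent/Span objects in the real event."""

event = {
    "reasoningText": "Let me think...",
    "delta": {"reasoningContent": {"text": "Let me think..."}},
    "reasoning": True,
    "agent": FakeAgent(),  # json.dumps cannot serialize arbitrary objects
}

try:
    json.dumps(event)
except TypeError as exc:
    # This is the point at which AgentCore falls back to repr() serialization.
    print("serialization failed:", exc)
```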

Root cause

ModelStreamEvent.prepare() (src/strands/types/_events.py) merges invocation_state directly into the event dict:

class ModelStreamEvent(TypedEvent):
    def prepare(self, invocation_state: dict) -> None:
        if "delta" in self:
            self.update(invocation_state)

invocation_state contains non-serializable objects (Agent instance, Span, etc.). Since all ModelStreamEvent subclasses (TextStreamEvent, ReasoningTextStreamEvent, ToolUseStreamEvent, ReasoningSignatureStreamEvent) have a "delta" key, they all get these fields merged in.

ModelStreamChunkEvent is unaffected because it has no prepare() override.

Reproduction

Follow the AgentCore streaming example with an agent that uses extended thinking and makes multiple tool calls. The stream will contain both valid JSON events and Python repr events. On long invocations the repr events accumulate past 100 MB and the stream is truncated.

Current workaround

We filter events before yielding to AgentCore:

import json

async for event in agent.stream_async(prompt):
    try:
        json.dumps(event)  # probe: can this event be serialized at all?
    except (TypeError, ValueError):
        continue  # drop events carrying Agent/Span objects
    yield event

This works but feels like a workaround for something the SDK should handle.
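A less lossy variant (also just a sketch, not an SDK API) would strip only the offending keys rather than drop the whole event, so the serializable delta content still reaches the consumer; `sanitize` here is a hypothetical helper:

```python
import json

def sanitize(event: dict) -> dict:
    """Return a copy of the event with non-JSON-serializable values removed."""
    clean = {}
    for key, value in event.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            continue  # drop Agent, Span, and similar internal objects
        clean[key] = value
    return clean
```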

Possible fixes (open question — we may be missing something)

We are not sure if there is a reason prepare() needs to merge into the dict itself, or if there is a recommended pattern for filtering events before yielding to AgentCore that we have missed. If so, we would appreciate guidance.

Some ideas if this is indeed unintended:

  1. Store internal state separately — e.g. event._context = invocation_state instead of self.update(invocation_state), keeping the dict serializable
  2. Strip internal fields in stream_async() before yielding to the caller
  3. Document the filtering requirement in the AgentCore streaming example
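Idea 1 can be sketched with a minimal mock of the event hierarchy (the real classes live in src/strands/types/_events.py; this is only an illustration of the shape of the change, not a proposed patch):

```python
import json

# Minimal mock: TypedEvent is a dict subclass, as in the SDK.
class TypedEvent(dict):
    pass

class ModelStreamEvent(TypedEvent):
    def prepare(self, invocation_state: dict) -> None:
        if "delta" in self:
            # Instead of self.update(invocation_state), keep internal
            # state off the serializable dict:
            self._context = invocation_state

event = ModelStreamEvent({"delta": {"text": "hi"}})
event.prepare({"agent": object()})  # non-serializable internals
print(json.dumps(event))  # succeeds; internals stay reachable via event._context
```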

Environment

  • strands-agents >= 0.1.0
  • AgentCore Runtime (us-east-1)
  • Agent uses Bedrock Claude with extended thinking and structured output
  • ~13 tool-calling turns per invocation
