feat(bidirectional_streaming): Add experimental bidirectional streaming MVP POC implementation #1

mehtarac · 2025-10-03T13:32:45Z

Description

Pull Request: Bidirectional Streaming Implementation

Overview

This PR introduces bidirectional streaming capabilities to Strands SDK for real-time, interactive conversations between users and AI models through persistent connections. It adds a concurrent, connection-based streaming approach alongside the existing request-response pattern, which remains unchanged.

Problem Statement

Strands currently uses a sequential request-response architecture that prevents real-time interaction:

Users cannot interrupt ongoing responses
No support for concurrent tool execution during model generation
Each interaction requires a complete request-response cycle
No native audio input/output capabilities

Solution

Bidirectional streaming introduces persistent connections with concurrent processing:

Real-time interruption during model generation
Concurrent tool execution without blocking conversation flow
Native audio support with format normalization across providers
Persistent connections lasting 8-30 minutes depending on provider

Architecture Overview

graph TB
    subgraph "Current Unidirectional Architecture"
        A1[Agent] --> B1[Model.stream]
        B1 --> C1[Sequential Events]
        C1 --> D1[Tool Execution BLOCKS]
        D1 --> E1[Response Complete]
    end
    
    subgraph "New Bidirectional Architecture"
        A2[BidirectionalAgent] --> B2[BidirectionalConnection]
        B2 --> C2[Model Events Processor]
        B2 --> D2[Tool Execution Processor]  
        B2 --> E2[Connection Coordinator]
        
        C2 --> F2[Event Queue]
        D2 --> G2[Tool Queue]
        E2 --> H2[Background Tasks Management]
        
        F2 --> I2[Agent.receive]
        G2 --> J2[Concurrent Tool Execution]
    end

Component Architecture

1. BidirectionalAgent - User Interface Layer

The BidirectionalAgent provides the user-facing interface for bidirectional streaming conversations. It follows the same patterns as Strands' existing Agent class but is built for persistent connections and real-time interaction.

Like the standard Agent, BidirectionalAgent uses compositional design, delegating to specialized components (ToolRegistry, ToolExecutor) rather than implementing functionality directly. Its constructor requires a BidirectionalModel, so a static type checker can flag a mismatched model before the code runs instead of leaving it to surface as a runtime configuration error.

Key differences from the standard Agent:

Connection Management: Manages persistent connections instead of discrete request-response cycles
Real-time Interface: Provides concurrent methods (send_audio(), interrupt(), receive()) for live interaction
Concurrent Design: Built for real-time processing from initialization, maintaining familiar patterns (start_conversation() parallels invoke_async())
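
To illustrate the interface shape described above, here is a minimal sketch. FakeBidirectionalAgent is a hypothetical stand-in written for this example so that it runs without a provider; only the method names (start_conversation(), send_audio(), interrupt(), receive()) come from this description, and the real signatures and event keys may differ.

```python
import asyncio
from collections.abc import AsyncIterator


class FakeBidirectionalAgent:
    """Hypothetical stand-in that echoes input; not the real BidirectionalAgent."""

    def __init__(self) -> None:
        self._output_queue: asyncio.Queue = asyncio.Queue()
        self._active = False

    async def start_conversation(self) -> None:
        # The real agent would open a persistent provider connection here.
        self._active = True

    async def send_audio(self, audio: bytes) -> None:
        # Pretend the model answers every audio chunk with one output event.
        await self._output_queue.put({"audioOutput": {"audioData": audio}})

    async def interrupt(self) -> None:
        # Drop output that has not been consumed yet, then signal the interruption.
        while not self._output_queue.empty():
            self._output_queue.get_nowait()
        await self._output_queue.put({"interruptionDetected": {}})

    async def receive(self) -> AsyncIterator[dict]:
        while self._active:
            event = await self._output_queue.get()
            yield event
            if "interruptionDetected" in event:
                self._active = False  # end the demo conversation


async def demo() -> list[str]:
    agent = FakeBidirectionalAgent()
    await agent.start_conversation()
    await agent.send_audio(b"a")
    seen: list[str] = []
    async for event in agent.receive():
        seen.append(next(iter(event)))
        if len(seen) == 1:
            await agent.send_audio(b"b")  # queued, then discarded by interrupt()
            await agent.interrupt()
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the sketch is that sending, interrupting, and receiving are independent calls on one long-lived object, rather than one call that returns a complete response.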

2. BidirectionalConnection - Concurrent Event Loop Engine

The BidirectionalConnection transforms Strands from sequential event processing to concurrent task coordination. This replaces the existing event_loop_cycle() pattern with persistent, concurrent processing.

Current Event Loop Architecture

The existing event loop processes one conversation turn at a time in a sequential pattern (see Event Loop Cycle documentation).

Each call to event_loop_cycle() handles one complete conversation turn then terminates. Tool execution blocks the entire conversation flow until completion.

New Concurrent Architecture

BidirectionalConnection runs continuously throughout the connection (8-30 minutes) with three concurrent processors working together:

graph TB
    A[Model Events Processor] --> D[Event Queue]
    B[Tool Execution Processor] --> E[Tool Queue]
    C[Connection Coordinator] --> F[Connection State]
    
    D --> G[Agent.receive]
    E --> H[Tool Results]
    
    I[Provider Events] --> A
    J[Tool Requests] --> B
    K[User Input] --> A

The three processors work concurrently:

Model Events Processor: Receives continuous events from the provider, converts them to Strands format, and routes to appropriate handlers
Tool Execution Processor: Executes tools concurrently without blocking conversation flow, with cancellation support during interruptions
Connection Coordinator: Supervises background tasks, manages connection lifecycle, and coordinates interruption handling
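
The division of labor between the three processors can be sketched with plain asyncio tasks and queues. This is a simplified illustration under stated assumptions, not the code in this PR: the function names, the event keys ("toolUse", "textOutput", "toolResult"), and the finite list standing in for the provider stream are all hypothetical.

```python
import asyncio


async def model_events_processor(provider_events, tool_queue, output_queue):
    """Route each provider event to the tool queue or the output queue."""
    for event in provider_events:
        if "toolUse" in event:
            await tool_queue.put(event["toolUse"])
        else:
            await output_queue.put(event)
        await asyncio.sleep(0)  # yield control, as a real network read would


async def tool_execution_processor(tool_queue, output_queue):
    """Run tools as they arrive without blocking the model events processor."""
    while True:
        tool_use = await tool_queue.get()
        result = sum(tool_use["input"]["numbers"])  # stand-in for a real tool
        await output_queue.put({"toolResult": result})
        tool_queue.task_done()


async def connection_coordinator(tasks, stop):
    """Supervise background tasks and cancel them when the connection ends."""
    await stop.wait()
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)


async def run_connection(provider_events):
    tool_queue, output_queue = asyncio.Queue(), asyncio.Queue()
    stop = asyncio.Event()
    model_task = asyncio.create_task(
        model_events_processor(provider_events, tool_queue, output_queue)
    )
    tool_task = asyncio.create_task(tool_execution_processor(tool_queue, output_queue))
    coordinator = asyncio.create_task(
        connection_coordinator([model_task, tool_task], stop)
    )

    await model_task         # the fake provider stream is finite
    await tool_queue.join()  # let in-flight tools finish
    stop.set()               # ask the coordinator to shut everything down
    await coordinator

    events = []
    while not output_queue.empty():
        events.append(output_queue.get_nowait())
    return events
```

Because the tool processor has its own queue and task, a slow tool delays only its own result; text and audio events keep flowing to the output queue in the meantime.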

Event Loop Design

sequenceDiagram
    participant User
    participant Agent as BidirectionalAgent
    participant Conn as BidirectionalConnection
    participant ModelSession as BidirectionalModelSession
    participant ModelEventsTask as _process_model_events
    participant ToolExecTask as _process_tool_execution
    participant CycleTask as bidirectional_event_loop_cycle
    participant Provider as Provider Stream

    User->>Agent: start_conversation()
    Agent->>+Conn: start_bidirectional_connection(agent)
    Conn->>+ModelSession: model.create_bidirectional_connection()
    ModelSession->>Provider: Initialize provider stream
    
    par Background Task Initialization
        Conn->>ModelEventsTask: asyncio.create_task(_process_model_events)
        Conn->>ToolExecTask: asyncio.create_task(_process_tool_execution)
        Conn->>CycleTask: asyncio.create_task(bidirectional_event_loop_cycle)
    end
    
    Conn-->>-Agent: return BidirectionalConnection
    
    User->>Agent: send_audio(audio_input)
    Agent->>ModelSession: send_audio_content(audio_input)
    ModelSession->>Provider: Send formatted provider event
    
    loop Concurrent Processing
        Provider-->>ModelSession: Raw provider events
        ModelSession->>ModelSession: Convert to standardized format
        ModelEventsTask->>ModelSession: receive_events()
        ModelSession-->>ModelEventsTask: Standardized events
        
        alt Tool Use Event
            ModelEventsTask->>ToolExecTask: tool_queue.put(tool_use)
            ToolExecTask->>ToolExecTask: Execute tool with Strands infrastructure
            ToolExecTask->>ModelSession: send_tool_result(result)
            ModelSession->>Provider: Send formatted tool result
        else Text/Audio Output
            ModelEventsTask->>Agent: agent._output_queue.put(event)
            Agent-->>User: receive() yields event
        else Interruption Detected
            ModelEventsTask->>Conn: _handle_interruption()
            Conn->>ToolExecTask: Cancel pending tool tasks
            Conn->>Agent: Clear audio output queue
        end
        
        CycleTask->>CycleTask: Supervise background tasks health
    end

Event Flow and Processing

The sequence diagram shows the actual implementation flow with accurate component interactions:

Connection Setup: start_bidirectional_connection() creates a model session and launches three background tasks
Task Management: Model events task calls receive_events(), tool execution task monitors tool queue, cycle task supervises health
Input Processing: User input goes through Agent → ModelSession → Provider with proper formatting
Event Streaming: Provider events flow through ModelSession normalization before reaching background tasks
Tool Execution: Tools execute using existing Strands infrastructure with results sent back through ModelSession
Output Flow: Events reach user through Agent's output queue consumed by receive() method
Interruption: Detected by model events task, handled by connection with task cancellation and queue clearing

Key implementation detail: Events flow through the BidirectionalModelSession layer which normalizes provider-specific formats before reaching the background processing tasks.
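
The interruption step (cancel pending tool tasks, clear queued output) might look roughly like the sketch below. ConnectionState, handle_interruption, and slow_tool are hypothetical names used only for illustration; the lock reflects the commit note about adding a lock for interruption handling, but how the PR actually applies it is an assumption here.

```python
import asyncio


class ConnectionState:
    """Hypothetical holder for the state touched during an interruption."""

    def __init__(self) -> None:
        self.output_queue: asyncio.Queue = asyncio.Queue()
        self.pending_tool_tasks: set = set()
        self.interruption_lock = asyncio.Lock()


async def handle_interruption(state: ConnectionState) -> int:
    """Cancel in-flight tools and drop queued output; return how many events were dropped."""
    async with state.interruption_lock:  # serialize overlapping interruptions
        for task in state.pending_tool_tasks:
            task.cancel()
        # Wait for the cancellations to take effect before touching the queue.
        await asyncio.gather(*state.pending_tool_tasks, return_exceptions=True)
        state.pending_tool_tasks.clear()

        dropped = 0
        while not state.output_queue.empty():
            state.output_queue.get_nowait()
            dropped += 1
        return dropped


async def slow_tool() -> str:
    await asyncio.sleep(60)  # stands in for a long-running tool call
    return "never returned"


async def demo() -> tuple:
    state = ConnectionState()
    task = asyncio.create_task(slow_tool())
    state.pending_tool_tasks.add(task)
    for chunk in (b"a", b"b", b"c"):
        state.output_queue.put_nowait({"audioOutput": chunk})
    dropped = await handle_interruption(state)
    return dropped, task.cancelled()
```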

3. Model Interface - Protocol Normalization

The new model interface creates a unified interface across different bidirectional streaming protocols. This design maintains Strands' core philosophy that users should be able to switch between model providers without changing their application code.

Separation from Existing Model Architecture

The existing Model interface handles stateless, discrete operations where each stream() call is independent. The new BidirectionalModel interfaces manage persistent connections with continuous event streams and multiple concurrent input methods (send_audio_content(), send_text_content(), send_interrupt()). This separation is necessary because bidirectional streaming providers use different protocols compared to traditional request-response models. Each provider implements their own event sequences, connection management, and data formats for real-time streaming.
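
As a rough illustration of that separation, a session-style interface could be expressed as an abstract base class. The method names below appear in this PR; the class names ModelSessionSketch and EchoSession, the dict-based event shapes, and the parameter types are assumptions made for this example.

```python
import abc
import asyncio
from collections.abc import AsyncIterator


class ModelSessionSketch(abc.ABC):
    """Illustrative session interface; the PR's actual signatures may differ."""

    @abc.abstractmethod
    async def send_audio_content(self, audio_input: dict) -> None: ...

    @abc.abstractmethod
    async def send_text_content(self, text: str) -> None: ...

    @abc.abstractmethod
    async def send_interrupt(self) -> None: ...

    @abc.abstractmethod
    def receive_events(self) -> AsyncIterator[dict]: ...


class EchoSession(ModelSessionSketch):
    """Hypothetical provider that turns everything it is sent into output events."""

    def __init__(self) -> None:
        self._events: asyncio.Queue = asyncio.Queue()

    async def send_audio_content(self, audio_input: dict) -> None:
        await self._events.put({"audioOutput": audio_input})

    async def send_text_content(self, text: str) -> None:
        await self._events.put({"textOutput": {"text": text}})

    async def send_interrupt(self) -> None:
        await self._events.put({"interruptionDetected": {}})

    async def receive_events(self) -> AsyncIterator[dict]:
        while True:
            event = await self._events.get()
            yield event
            if "interruptionDetected" in event:
                return  # end the stream so the demo terminates


async def demo() -> list[str]:
    session: ModelSessionSketch = EchoSession()
    await session.send_text_content("hello")
    await session.send_interrupt()
    return [next(iter(event)) async for event in session.receive_events()]
```

Unlike a stateless stream() call, the session object persists between sends, and the caller consumes a single continuous event stream for the life of the connection.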

4. Bidirectional Type System

The type system extends Strands' existing StreamEvent types to support bidirectional streaming while maintaining full backward compatibility.

New event types include:

Audio Events: audioOutput and audioInput with standardized format (raw bytes, explicit sample rates)
Connection Events: BidirectionalConnectionStart and BidirectionalConnectionEnd for lifecycle management
Interruption Events: interruptionDetected for real-time conversation control
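
A sketch of what such event types could look like as TypedDicts follows. Only TextOutputEvent and the channels: Literal[1, 2] field are visible in the reviewed diff; the other field names (audioData, format, sampleRate), the wrapper type, and the 16-bit sample width used in the helper are assumptions for illustration.

```python
from typing import Literal, TypedDict


class AudioOutputEvent(TypedDict):
    audioData: bytes                   # raw audio bytes (assumed field name)
    format: Literal["pcm"]             # assumed field
    sampleRate: Literal[16000, 24000]  # explicit sample rate in Hz (assumed field name)
    channels: Literal[1, 2]


class TextOutputEvent(TypedDict):
    text: str                          # assumed field


class StreamEventSketch(TypedDict, total=False):
    """Hypothetical wrapper: each event carries exactly one of these keys."""

    audioOutput: AudioOutputEvent
    textOutput: TextOutputEvent
    interruptionDetected: dict


def audio_duration_seconds(event: AudioOutputEvent, bytes_per_sample: int = 2) -> float:
    """Duration of a chunk, assuming fixed-width PCM samples (16-bit by default)."""
    frame_size = bytes_per_sample * event["channels"]
    return len(event["audioData"]) / (frame_size * event["sampleRate"])


chunk: AudioOutputEvent = {
    "audioData": bytes(48000),  # 48,000 zero bytes
    "format": "pcm",
    "sampleRate": 24000,
    "channels": 1,
}
```

Carrying the sample rate and channel count on every audio event lets a consumer compute playback timing without knowing which provider produced the chunk.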

5. Nova Sonic Model Provider Implementation

Strands follows a model-agnostic philosophy, supporting multiple AI providers through a unified interface. Users can switch between Amazon Bedrock, Anthropic, OpenAI, Ollama, and others without changing their application code. This same philosophy extends to bidirectional streaming.

Nova Sonic is Amazon's bidirectional speech-to-speech streaming model, and serves as the reference implementation for this architecture. Nova Sonic requires event sequencing with hierarchical structures (sessionStart → promptStart → contentStart → input → contentEnd). The implementation handles this complexity internally while presenting a simple send_text() and send_audio() interface to users.
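
The ordering requirement can be pictured as a helper that always emits events in the sequence named above. This is a sketch only: build_text_turn is a hypothetical function, and the payload fields (promptName, contentName, type, content, and the textInput key standing in for the "input" step) are placeholders rather than a statement of the provider's actual schema.

```python
import json


def build_text_turn(text: str, prompt_name: str, content_name: str) -> list:
    """Return events ordered sessionStart, promptStart, contentStart, input, contentEnd."""
    ids = {"promptName": prompt_name, "contentName": content_name}
    return [
        {"event": {"sessionStart": {}}},
        {"event": {"promptStart": {"promptName": prompt_name}}},
        {"event": {"contentStart": {**ids, "type": "TEXT"}}},
        {"event": {"textInput": {**ids, "content": text}}},  # the "input" step
        {"event": {"contentEnd": ids}},
    ]


def event_names(events: list) -> list:
    """Extract the single event-type key from each wrapped event."""
    return [next(iter(e["event"])) for e in events]


turn = build_text_turn("hello", prompt_name="prompt-1", content_name="content-1")
wire_messages = [json.dumps(e) for e in turn]  # one serialized message per event
```

Hiding this sequencing behind a single call is what lets the user-facing methods stay simple while the provider still receives a well-formed hierarchy.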

Implementation Benefits

Architecture Advantages

Separation of Concerns: Each component has a single responsibility
Concurrent Design: Built for real-time processing
Provider Agnostic: Unified interface abstracts protocol complexity
Type Safe: Static type annotations let type checkers catch configuration errors before runtime

Maintained Compatibility

Existing Agent Class: Unchanged and fully functional
Current Model Providers: No modifications to existing model implementations
Tool Definitions: All existing tools work with bidirectional agents
Type System: BidirectionalStreamEvent inherits all existing StreamEvent fields

Experimental Status

Current State

This implementation is a working proof-of-concept that validates the architectural approach with Nova Sonic integration. The core functionality is operational and demonstrates end-to-end bidirectional streaming capabilities.

API Stability Warning

This feature is experimental and subject to breaking changes:

Interface methods and parameters may evolve
Event types and data structures will be refined
Provider implementations may undergo changes
Integration patterns will be optimized based on usage feedback

Testing and Validation

Interactive Test Script

The implementation includes a comprehensive test script at src/strands/experimental/bidirectional_streaming/tests/test_bidirectional_streaming.py that demonstrates real-time bidirectional streaming capabilities:

# Run the interactive test
python src/strands/experimental/bidirectional_streaming/tests/test_bidirectional_streaming.py

Recommended Setup: Use headphones for the best experience to prevent audio feedback between microphone and speakers.

The test script demonstrates:

Real-time Audio Processing: Live microphone input and speaker output with 16kHz/24kHz sample rates
Interruption Handling: Responsive interruption detection with immediate audio queue clearing
Concurrent Operations: Simultaneous audio recording, playback, event processing, and sending
Tool Integration: Calculator tool execution during conversation flow
Connection Management: Complete connection lifecycle with proper cleanup

Related Issues

strands-agents#217

Documentation PR

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

I ran hatch run prepare

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

JackYPCOnline · 2025-10-03T16:51:33Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

+
+    def __init__(
+        self,
+        model: BidirectionalModel,


In Strands we have Union[Model, str, None] = None in init; we can make it the same here for future extensibility.

Can make this change

linked in https://github.com/orgs/strands-agents/projects/12/views/1?pane=issue&itemId=131451564&issue=strands-agents%7Cprivate-sdk-python-staging%7C245

src/strands/experimental/bidirectional_streaming/agent/agent.py

JackYPCOnline · 2025-10-03T17:33:30Z

src/strands/experimental/bidirectional_streaming/models/bidirectional_model.py

+        Converts provider-specific events to a common format that can be
+        processed uniformly by the event loop.
+        """
+        raise NotImplementedError


I got a similar comment: just leave it blank instead of raising an error.

can make this change

linked in https://github.com/orgs/strands-agents/projects/12/views/1?pane=issue&itemId=131450934&issue=strands-agents%7Cprivate-sdk-python-staging%7C243

src/strands/experimental/bidirectional_streaming/models/novasonic.py

JackYPCOnline · 2025-10-03T18:27:29Z

src/strands/experimental/bidirectional_streaming/models/novasonic.py

+                        await self._handle_response_data(result.value.bytes_.decode("utf-8"))
+
+                except asyncio.TimeoutError:
+                    await asyncio.sleep(0.1)


Question: Does this mean we do nothing if there is a timeout while processing? Do we want to retry here?

src/strands/experimental/bidirectional_streaming/models/novasonic.py

JackYPCOnline · 2025-10-03T18:40:15Z

src/strands/experimental/bidirectional_streaming/models/novasonic.py

+            raise
+
+
+class NovaSonicBidirectionalModel(BidirectionalModel):


Maybe not in same file?

Can iterate on this in a follow-up

mkmeral · 2025-10-03T15:26:20Z

src/strands/experimental/bidirectional_streaming/event_loop/bidirectional_event_loop.py

+            logger.error("Tool error send failed: %s", str(send_error))
+
+
+def _extract_callable_function(tool_func: any) -> any:


why do we need this?

We needed this because the tool executor was not implemented as part of the POC, so this method let us call the tool directly as a function. This is fixed in the new PR: #5

mkmeral · 2025-10-03T15:26:25Z

src/strands/experimental/bidirectional_streaming/event_loop/bidirectional_event_loop.py

+                    # Execute tool function with provided input
+                    result = actual_func(**tool_use.get("input", {}))
+
+                    tool_result = _create_success_result(tool_use["toolUseId"], result)


TODO: ToolContext

I am assuming this is part of bar raising the types?

mkmeral · 2025-10-03T15:30:18Z

src/strands/experimental/bidirectional_streaming/types/bidirectional_streaming.py

+    channels: Literal[1, 2]
+
+
+class TextOutputEvent(TypedDict):


TODO: Add transcription

I am assuming this is part of bar raising the types?

mkmeral · 2025-10-03T15:30:54Z

src/strands/experimental/bidirectional_streaming/models/bidirectional_model.py

+        raise NotImplementedError
+
+    @abc.abstractmethod
+    async def send_audio_content(self, audio_input: AudioInputEvent) -> None:


we will merge these into single send(BidirectionalInputEvent) (or sth similar) right?

yes we can do that as part of issue concerned with bar-raising the model interface.

mkmeral · 2025-10-06T09:59:17Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

+        tools: list | None = None,
+        system_prompt: str | None = None,
+        messages: Messages | None = None,
+    ):


we should add **kwargs to our interfaces/methods for future extensibility

Can iterate on this

mkmeral

Approving to merge and iterate faster. We should still iterate over these comments

mehtarac added 15 commits September 25, 2025 10:47

feat(bidirectional_streaming): Add experimental bidirectional streaming MVP POC implementation

107f035

Updated doc strings, updated method from send_text() and send_audio() to send(), Updated imports

9165a20

Updated minimum python runtime dependency

15df9f9

fix imports

3a0e7d5

fix linting issues

f7e67ae

Remove typing module and rely on python's built-in types

c654621

add typing to methods

1f1abac

Improve comments and remove unused method _convert_to_strands_event

eb543b5

Updated: fixed module imports based on the new smithy python release on 09-29, added a lock for interruption handling

5921f8b

Removed unnecessary _output_queue check as the queue will always be initialized, and removed asyncio.sleep() as they were mainly for defensive purposes and following the pattern of nova sonic samples.

8cb4d98

Remove redundant interruption checks

7a6e53e

Unified tool result and tool error methods, Added implementation to add user messages to the agent messages

a586261

Modified logging to use python logger

16d9b46

Removed logging utility

04265ba

Updated types

8a7396c

mehtarac had a problem deploying to auto-approve October 3, 2025 13:32 — with GitHub Actions Failure

mehtarac changed the title ~~Strands bidi~~ feat(bidirectional_streaming): Add experimental bidirectional streaming MVP POC implementation Oct 3, 2025

mehtarac mentioned this pull request Oct 3, 2025

feat(bidirectional_streaming): Add experimental bidirectional streaming MVP POC implementation strands-agents/sdk-python#924

Closed

7 tasks

JackYPCOnline reviewed Oct 3, 2025

src/strands/experimental/bidirectional_streaming/agent/agent.py

JackYPCOnline reviewed Oct 3, 2025

src/strands/experimental/bidirectional_streaming/models/novasonic.py

JackYPCOnline reviewed Oct 3, 2025

src/strands/experimental/bidirectional_streaming/models/novasonic.py

JackYPCOnline reviewed Oct 3, 2025

mkmeral reviewed Oct 6, 2025

mkmeral approved these changes Oct 6, 2025

mehtarac merged commit 759eba5 into main Oct 6, 2025
1 of 12 checks passed
