Python: feat(a2a): use non-streaming transport and return_immediately for background ops#5963
giles17 wants to merge 5 commits into
…kground ops

When stream=False, use a client configured with streaming=False so the SDK sends a single HTTP POST to message/send instead of opening an SSE connection via message/stream. This matches the A2A protocol's design: non-streaming calls use direct request/response, streaming calls use Server-Sent Events.

Also sets return_immediately=background on SendMessageConfiguration so the server respects the caller's intent for background operations.

Changes:
- Create separate streaming and non-streaming internal clients (sharing the same httpx connection pool) to match protocol transport semantics
- Select the non-streaming client for run(stream=False) calls
- Add SendMessageConfiguration with return_immediately=background
- Fall back to the streaming client when the non-streaming client is unavailable (e.g. the user provides their own client via the constructor)
- Add tests for client selection and return_immediately behavior

Resolves microsoft#5936

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR updates the Python A2A agent to use the correct transport for non-streaming calls and to signal background execution to the server via return_immediately, aligning behavior with A2A protocol intent and the .NET implementation.
Changes:
- Instantiate both streaming and non-streaming internal A2A clients and select between them per `run(stream=...)`.
- Set `SendMessageConfiguration(return_immediately=background)` on outgoing `SendMessageRequest`.
- Add tests covering transport selection, fallback behavior, and `return_immediately` configuration.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| python/packages/a2a/agent_framework_a2a/_agent.py | Adds dual-client construction (streaming + non-streaming) and per-call client selection; attaches return_immediately to send requests. |
| python/packages/a2a/tests/test_a2a_agent.py | Adds tests for client selection/fallback and background return_immediately; captures last request in the mock client. |
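
The per-call client selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in `_agent.py`: `StubClient`, `AgentSketch`, and `select_client` are hypothetical names, and the stub only records which transport it would use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubClient:
    """Hypothetical stand-in for an A2A SDK client; it only records its transport."""

    transport: str  # "message/stream" (SSE) or "message/send" (single POST)


class AgentSketch:
    """Illustrative only; not the actual A2AAgent implementation."""

    def __init__(
        self,
        streaming_client: StubClient,
        non_streaming_client: Optional[StubClient] = None,
    ) -> None:
        self.client = streaming_client
        # None when the caller supplied their own client via the constructor.
        self._non_streaming_client = non_streaming_client

    def select_client(self, stream: bool) -> StubClient:
        # Explicit `is not None` check rather than truthiness.
        if not stream and self._non_streaming_client is not None:
            return self._non_streaming_client
        # Streaming calls, and the fallback case, use the streaming client.
        return self.client
```

With both clients present, `select_client(stream=False)` returns the non-streaming one; with only a streaming client, every call falls back to it.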
Python Test Coverage Report •
Python Unit Test Overview
Automated Code Review
Reviewers: 4 | Confidence: 81%
✓ Correctness
The PR correctly implements non-streaming transport selection and return_immediately configuration. The logic in active_client selection, the constructor initialization paths, and the continuation_token/subscribe code path are all sound. The two issues flagged in the prior review thread (mock last_request not initialized in `__init__`, and using truthiness instead of `is not None` for the client check) remain unresolved, but no new correctness issues were found.
✓ Security Reliability
The changes are well-structured from a security and reliability perspective. Both streaming and non-streaming clients share the same httpx.AsyncClient connection pool, which is properly cleaned up in `__aexit__`. The _non_streaming_client attribute is correctly initialized on all successful construction paths (set to None when a user-provided client is used, or set via the factory on the normal path). The existing unresolved review comments already cover the relevant defensive-coding concerns (explicit is-not-None check, last_request initialization, and conditional configuration). No new security or reliability issues were found beyond those.
✓ Test Coverage
The new tests cover the key behavioral aspects well: client selection logic (streaming vs non-streaming), fallback behavior, and `return_immediately` configuration. However, the constructor path that creates dual clients via `ClientFactory` with `streaming=True/False` configs is entirely untested: all tests use the `client=` parameter, which bypasses that code. The existing `test_a2a_agent_initialization_with_timeout_parameter` patches `ClientFactory` at class level, so it doesn't verify that the two separate streaming configs are created correctly.
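
One way to close that gap is a test that records which configs the factory receives. The sketch below is hypothetical: `build_clients` is an invented helper standing in for the constructor path, and the factory and config classes are `MagicMock` objects rather than the real SDK types, so it only shows the shape such a test could take.

```python
from unittest.mock import MagicMock


def build_clients(factory_cls, config_cls, agent_card):
    """Hypothetical helper mirroring the dual-client constructor path."""
    streaming = factory_cls(config_cls(streaming=True)).create(agent_card)
    non_streaming = factory_cls(config_cls(streaming=False)).create(agent_card)
    return streaming, non_streaming


def test_build_clients_creates_both_configs():
    factory_cls = MagicMock(name="ClientFactory")
    config_cls = MagicMock(name="ClientConfig")

    build_clients(factory_cls, config_cls, agent_card=object())

    # One config per transport mode, in construction order.
    flags = [call.kwargs["streaming"] for call in config_cls.call_args_list]
    assert flags == [True, False]
    # The factory is instantiated once per config.
    assert factory_cls.call_count == 2
```

In the real test suite the same assertions would be made against whatever the agent constructor passes to the patched factory.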
✗ Design Approach
The main design issue is in A2AAgent initialization: the new constructor makes creation of the optional non-streaming client part of the same required negotiation step as the streaming client. As a result, a server or configuration that works via streaming can still fail agent construction entirely if only the non-streaming client cannot be created, even though the updated run path already treats a missing `_non_streaming_client` as a valid fallback.
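
A possible shape for the fix is to keep the streaming client mandatory and degrade gracefully when only the non-streaming client fails. This is a sketch under assumptions: `create_clients` and `make_client` are hypothetical names, and the broad `except Exception` is for illustration only; real code would likely catch a narrower error type.

```python
from typing import Callable, Optional, Tuple


def create_clients(
    make_client: Callable[[bool], object],
) -> Tuple[object, Optional[object]]:
    """Hypothetical sketch: `make_client(streaming)` stands in for the factory call."""
    # The streaming client stays mandatory, so a failure here still propagates.
    streaming_client = make_client(True)
    try:
        non_streaming_client: Optional[object] = make_client(False)
    except Exception:
        # Degrade to the fallback that the run path already supports.
        non_streaming_client = None
    return streaming_client, non_streaming_client
```

With this arrangement, construction succeeds whenever the streaming client can be created, and `run(stream=False)` simply falls back to it.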
Automated review by giles17's agents
- Initialize last_request in MockA2AClient.__init__ for explicit state
- Use 'is not None' instead of truthiness for _non_streaming_client check
- Assert return_immediately propagates through non-streaming client path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Only attach SendMessageConfiguration to the request when background=True, keeping requests minimal and preserving server-side defaults for normal (foreground) operations. This follows the framework pattern of only setting optional fields when they have meaningful values. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    a2a_stream = self.client.send_message(SendMessageRequest(message=a2a_message))
    request = SendMessageRequest(message=a2a_message)
    if background:
        request.configuration.return_immediately = True
The return_immediately option has no effect on streaming operations according to the spec: https://a2a-protocol.org/latest/specification/#322-sendmessageconfiguration.
So, I'm curious about what the active_client.send_message call returns in a non-streaming scenario. Does it return a stream with a single update containing a task Id, or does it return an open stream that will receive updates as the task is updated on the server side? The first option seems correct, while the second does not.
Thanks, I've updated the code to only set it when background=True AND stream=False.
In non-streaming mode, the SDK's send_message does a single POST to message/send, gets back a SendMessageResponse (containing either a Task or Message), wraps it in a single StreamResponse, and yields it. So, it's a stream with exactly one item.
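
The "stream with exactly one item" behavior can be illustrated with a small async generator. This is not the SDK's code: `post_message_send` is a hypothetical stand-in for the HTTP POST, and the response is a plain dict rather than a real `SendMessageResponse`.

```python
import asyncio
from typing import AsyncIterator


async def post_message_send(payload: dict) -> dict:
    """Hypothetical stand-in for the single HTTP POST to message/send."""
    return {"kind": "task", "id": "task-123", "echo": payload}


async def send_message(payload: dict) -> AsyncIterator[dict]:
    # One request, one response, wrapped as a stream that yields exactly once.
    response = await post_message_send(payload)
    yield response


async def collect(payload: dict) -> list:
    # Drain the stream into a list to show how many items it produces.
    return [item async for item in send_message(payload)]


items = asyncio.run(collect({"text": "Hello"}))
```

Because the generator returns after its single `yield`, the caller's `async for` loop ends immediately instead of waiting on an open connection.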
Per the A2A spec, return_immediately only applies to message/send (non-streaming). It has no effect on streaming operations. Only set the configuration field when both background=True and stream=False. Adds test verifying streaming+background does not set return_immediately. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
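
The resulting rule (set the field only when `background=True` and `stream=False`) can be sketched as below. `ConfigSketch`, `RequestSketch`, and `build_request` are hypothetical stand-ins for illustration; the real `SendMessageRequest` and `SendMessageConfiguration` types come from the SDK and may be populated differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for SendMessageConfiguration."""

    return_immediately: bool = False


@dataclass
class RequestSketch:
    """Hypothetical stand-in for SendMessageRequest."""

    message: str
    configuration: Optional[ConfigSketch] = None


def build_request(message: str, *, stream: bool, background: bool) -> RequestSketch:
    request = RequestSketch(message=message)
    # return_immediately only affects message/send, so skip it when streaming.
    if background and not stream:
        request.configuration = ConfigSketch(return_immediately=True)
    return request
```

Foreground calls and streaming calls leave `configuration` unset, which keeps requests minimal and preserves server-side defaults.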
Motivation and Context
Resolves #5936
The Python A2A agent always used the SSE streaming transport (`message/stream`), regardless of whether the caller requested streaming. This meant that even a simple `run('Hello')` call opened an SSE connection instead of making a single HTTP POST to `message/send`. Additionally, when `background=True`, the server had no way to know it should return immediately.

Changes

Non-streaming transport selection
- Create separate streaming and non-streaming internal clients (sharing the same `httpx.AsyncClient` connection pool) to match A2A protocol transport semantics
- When `stream=False`, select the non-streaming client, which uses `message/send` (single request/response)
- When `stream=True`, use the streaming client, which uses `message/stream` (SSE)

return_immediately configuration
- Set `SendMessageConfiguration(return_immediately=background)` on outgoing `SendMessageRequest`
- When `background=True`, the server is informed it should return the Task immediately without blocking

Why two clients?

The Python A2A SDK (`a2a-sdk`) bakes the transport choice (streaming vs non-streaming) into `ClientConfig` at client construction time; there is no per-call control. The .NET SDK exposes separate `SendMessageAsync` and `SendStreamingMessageAsync` methods on a single client. To achieve equivalent behavior in Python, we create two lightweight wrapper clients that share the same underlying HTTP connection pool.

Tests
- Client selection when `stream=False`
- Client selection when `stream=True`
- `return_immediately` being set correctly

Comparison with .NET

This matches the .NET implementation in `A2AAgent.cs`:
- `RunCoreAsync` → `SendMessageAsync` (non-streaming) with `ReturnImmediately`
- `RunCoreStreamingAsync` → `SendStreamingMessageAsync` (streaming)