Python: [BREAKING] Moved to a single get_response and run API #3379

eavanvalkenburg · 2026-01-22T17:34:18Z

Motivation and Context

Summary

Migrate chat/agent telemetry to mixin-based usage and remove legacy decorators, with streaming telemetry now using finalizers/teardown hooks instead of consuming streams.
- This makes the code much easier to understand, because we can set attributes on the chat client in the init of those mixins (which technically makes them not mixins)
- Added those parameters to the constructors, making it easier to configure things like function calling
Replace function invocation decorators with FunctionInvokingChatClient/FunctionInvokingMixin across clients, tests, and samples; update docs/comments accordingly.
Introducing a ResponseStream object that is created to unify the APIs
- It is generic over TUpdate and TFinal, which in our case are usually ChatResponseUpdate and ChatResponse, or the agent equivalents.
- It features an update_hook mechanism that lets you run code while the internal stream is being unpacked; this is mostly useful for middleware.
- It features a teardown hook mechanism that runs when the stream is exhausted; telemetry now uses it to record the duration.
- It features a finalizer mechanism (one or more finalizers) that runs after the end of the stream and turns the list of updates into a final object. Middleware can use it, and function calling and telemetry already do.
- In principle the ResponseStream is created by the lowest-level object, the actual chat client implementation, and ideally all the layers in between only use the hooks. Function calling does not work that way, because it makes multiple calls to the underlying chat client that all have to be combined into a single stream at runtime. The agent also creates a new stream, because it goes from ResponseStream[ChatResponseUpdate, ChatResponse] to ResponseStream[AgentResponseUpdate, AgentResponse]; for that, the class has a classmethod called wrap that wraps the ResponseStream from the chat client into the agent's new ResponseStream.
Overall this change reduces the number of times we iterate the stream and return a new AsyncGenerator, and the new hooks make it simpler to create middleware that alters the stream (as the sample shows). It should therefore also improve performance a bit.
Removed use_instrumentation/use_agent_instrumentation and use_function_invocation decorators; mixins are now the supported path.
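
The three ResponseStream mechanisms described above (update hooks, teardown hooks, and a finalizer) can be illustrated with a small stand-in class. This is only a sketch under assumptions: `SketchResponseStream`, `with_update_hook`, `with_teardown_hook`, and `get_final_response` as written here are hypothetical and are not the framework's actual implementation or signatures.

```python
import asyncio
from collections.abc import AsyncIterator, Callable
from typing import Generic, TypeVar

TUpdate = TypeVar("TUpdate")
TFinal = TypeVar("TFinal")


class SketchResponseStream(Generic[TUpdate, TFinal]):
    """Hypothetical stand-in for ResponseStream; not the framework's class."""

    def __init__(
        self,
        source: AsyncIterator[TUpdate],
        finalizer: Callable[[list[TUpdate]], TFinal],
    ) -> None:
        self._source = source
        self._finalizer = finalizer
        self._updates: list[TUpdate] = []
        self._update_hooks: list[Callable[[TUpdate], TUpdate]] = []
        self._teardown_hooks: list[Callable[[], None]] = []
        self._exhausted = False

    def with_update_hook(self, hook: Callable[[TUpdate], TUpdate]) -> "SketchResponseStream[TUpdate, TFinal]":
        self._update_hooks.append(hook)
        return self

    def with_teardown_hook(self, hook: Callable[[], None]) -> "SketchResponseStream[TUpdate, TFinal]":
        self._teardown_hooks.append(hook)
        return self

    async def __aiter__(self) -> AsyncIterator[TUpdate]:
        # Update hooks see each item while the inner stream is unpacked.
        async for update in self._source:
            for hook in self._update_hooks:
                update = hook(update)
            self._updates.append(update)
            yield update
        # Teardown hooks run once the inner stream is exhausted.
        self._exhausted = True
        for teardown in self._teardown_hooks:
            teardown()

    async def get_final_response(self) -> TFinal:
        # Drain whatever is left, then fold the updates into one final object.
        if not self._exhausted:
            async for _ in self:
                pass
        return self._finalizer(self._updates)


async def fake_chunks() -> AsyncIterator[str]:
    for text in ("Hel", "lo", "!"):
        yield text


async def demo() -> tuple[list[str], str, list[str]]:
    events: list[str] = []
    stream = (
        SketchResponseStream(fake_chunks(), finalizer="".join)
        .with_update_hook(str.upper)
        .with_teardown_hook(lambda: events.append("teardown"))
    )
    seen = [update async for update in stream]
    final = await stream.get_final_response()
    return seen, final, events


seen, final, events = asyncio.run(demo())
```

In this sketch the layers above the source never re-wrap the stream in a new generator; they only register hooks, which is the property the description attributes to the new design.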

Description

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? If yes, add [BREAKING] prefix to the title of the PR.

Fixes #3585
Fixes #3607
Fixes #3617

Copilot

Pull request overview

This PR consolidates the Python Agent Framework's streaming and non-streaming APIs into a unified interface. The primary changes include:

Changes:

Unified run() and get_response() methods with stream parameter replacing separate run_stream() and get_streaming_response() methods
Migration from decorator-based (@use_instrumentation, @use_function_invocation) to mixin-based architecture for telemetry and function invocation
Introduction of ResponseStream class for unified stream handling with hooks, finalizers, and teardown support
Renamed AgentExecutionException to AgentRunException
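
As a rough illustration of the unified call shape, the sketch below uses `typing.overload` so that a single method serves both modes depending on `stream`. `SketchChatClient` and its return types are assumptions made for this example; the real clients return framework response types rather than plain strings.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator, Awaitable
from typing import Literal, overload


class SketchChatClient:
    """Hypothetical client; only illustrates the stream=bool call shape."""

    @overload
    def get_response(self, prompt: str, *, stream: Literal[False] = False) -> Awaitable[str]: ...

    @overload
    def get_response(self, prompt: str, *, stream: Literal[True]) -> AsyncIterator[str]: ...

    def get_response(self, prompt: str, *, stream: bool = False) -> Awaitable[str] | AsyncIterator[str]:
        # One entry point: the flag selects an awaitable or an async iterator.
        if stream:
            return self._stream(prompt)
        return self._complete(prompt)

    async def _complete(self, prompt: str) -> str:
        return prompt.upper()

    async def _stream(self, prompt: str) -> AsyncIterator[str]:
        for word in prompt.upper().split():
            yield word


async def demo() -> tuple[str, list[str]]:
    client = SketchChatClient()
    full = await client.get_response("hello world")
    chunks = [c async for c in client.get_response("hello world", stream=True)]
    return full, chunks


full, chunks = asyncio.run(demo())
```

Callers that previously used a separate streaming method would, under this shape, pass `stream=True` to the same method instead.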

Reviewed changes

Copilot reviewed 84 out of 85 changed files in this pull request and generated 28 comments.


File	Description
`_types.py`	Added `ResponseStream` class for unified streaming, updated `prepare_messages` to handle None
`_clients.py`	Refactored `BaseChatClient` with unified `get_response()` method, introduced `FunctionInvokingChatClient` mixin
`openai/_responses_client.py`	Consolidated streaming/non-streaming into single `_inner_get_response()` method
`openai/_chat_client.py`	Similar consolidation for chat completions API
`openai/_assistants_client.py`	Unified assistants API with stream parameter
`_workflows/_workflow.py`	Consolidated `run()` and `run_stream()` into single `run(stream=bool)` method
`_workflows/_agent.py`	Updated `WorkflowAgent.run()` to use stream parameter
Test files (multiple)	Updated all tests to use `run(stream=True)` and `get_response(stream=True)`
Sample files (multiple)	Updated samples to demonstrate new unified API
Provider clients	Updated all provider implementations (Azure, Anthropic, Bedrock, Ollama, etc.) to use mixins

markwallace-microsoft · 2026-01-23T11:16:58Z

Python Test Coverage Report

File	Stmts	Miss	Cover	Missing
packages/a2a/agent_framework_a2a
_agent.py	148	8	94%	262, 400–401, 438–439, 468–470
packages/ag-ui/agent_framework_ag_ui
_client.py	150	17	88%	83–84, 88–92, 96–100, 263, 295, 464–466
_event_converters.py	69	0	100%
_message_adapters.py	443	89	79%	64, 74–75, 84–87, 116–117, 120, 124–126, 135–137, 146–157, 159–161, 191, 204–205, 215–216, 253, 256, 258, 261, 264, 280, 297, 319, 350, 355, 366–367, 418, 434–435, 495–498, 500, 506, 514–515, 517, 521–524, 537, 626–629, 631, 697, 732–734, 736–739, 742–743, 745, 751, 754, 756, 759, 761, 767–768, 770
_run.py	441	124	71%	154–161, 304, 323–324, 339–340, 351, 354–355, 357–358, 360–362, 372, 382–385, 389–391, 393, 403, 406–409, 411–412, 415–421, 424–426, 429, 445–447, 454, 460–461, 463–464, 478–484, 495, 508, 510–511, 545–546, 603–605, 617–619, 643, 648–650, 766, 777–778, 785, 803–805, 839–841, 856, 862, 870, 872, 908–914, 917–920, 922–931, 934, 941–942, 947, 953–955, 968–970
_types.py	36	0	100%
_utils.py	101	2	98%	257, 262
packages/ag-ui/agent_framework_ag_ui/_orchestration
_tooling.py	57	0	100%
packages/anthropic/agent_framework_anthropic
_chat_client.py	361	150	58%	371, 403, 405, 420, 442–445, 454, 456, 487–491, 493, 495–496, 498, 503–504, 506, 539–540, 549, 551–552, 557, 574–575, 617, 632, 636–637, 653, 662, 664, 668–669, 712–714, 716, 729–730, 737–739, 743–745, 749–752, 763, 765, 787, 797, 819–825, 832–833, 841–842, 850–853, 860–861, 867–868, 874–875, 881, 889–891, 895, 902–903, 909–910, 916–917, 923, 931–934, 941–942, 961, 968–969, 988, 1010, 1012, 1021–1022, 1028, 1050–1051, 1057–1058, 1067–1077, 1084–1090, 1097–1103, 1110–1119, 1126–1129
packages/azure-ai/agent_framework_azure_ai
_agent_provider.py	115	3	97%	122–123, 251
_chat_client.py	484	75	84%	382, 387–388, 390–391, 394, 397, 399, 404, 665–666, 668, 671, 674, 677–682, 685, 687, 695, 707–709, 713, 716–717, 725–728, 738, 746–749, 751–752, 754–755, 762, 770–771, 779–780, 785–786, 790–797, 802, 805, 813, 819, 827–829, 832, 854–855, 988, 1016, 1031, 1152, 1178, 1187, 1196, 1329
_client.py	196	13	93%	360, 362, 411, 440–445, 488, 524, 526, 602
_project_provider.py	115	6	94%	132–133, 211, 309, 353, 386
packages/chatkit/agent_framework_chatkit
_converter.py	133	46	65%	115, 120, 168, 170, 340, 393, 395, 414–416, 418, 436, 438, 440, 443, 455, 465, 483, 503–527, 529–531
packages/copilotstudio/agent_framework_copilotstudio
_agent.py	83	5	93%	155–156, 191, 199, 316
packages/core/agent_framework
_agents.py	320	35	89%	473, 885, 921, 1020–1022, 1135, 1176, 1178, 1187–1192, 1198, 1200, 1210–1211, 1218, 1220–1221, 1229–1233, 1241–1242, 1244, 1249, 1251, 1285, 1325, 1345
_clients.py	52	3	94%	294, 495, 497
_middleware.py	335	16	95%	80, 83, 88, 797, 799, 801, 922, 949, 951, 976, 1057, 1061, 1183, 1187, 1248, 1322
_serialization.py	105	4	96%	516, 532, 542, 610
_tools.py	793	73	90%	232, 278, 329, 331, 359, 529, 564–565, 667, 669, 689, 707, 721, 733, 738, 740, 747, 780, 851–853, 894, 919–928, 934–943, 979, 987, 1228, 1433, 1490, 1494, 1574–1577, 1595, 1597–1598, 1703, 1759, 1761, 1777, 1779, 1844, 1871, 1924, 1992, 2194, 2223–2224, 2339–2344
_types.py	1130	101	91%	86, 109–110, 164, 169, 188, 190, 194, 198, 200, 202, 204, 222, 226, 252, 274, 279, 284, 288, 314, 318, 664–665, 1036, 1098, 1115, 1133, 1138, 1156, 1166, 1183–1184, 1186, 1204–1205, 1207, 1214–1215, 1217, 1252, 1263–1264, 1266, 1304, 1549, 1554, 1558, 1562, 1748, 1757, 1767, 1812, 1855–1860, 1882, 1887, 2182, 2288, 2297, 2445, 2672, 2676, 2688, 2695, 2706, 2866–2868, 2999, 3026, 3035, 3294–3296, 3299–3301, 3305, 3310, 3314, 3426–3428, 3456, 3510, 3514–3516, 3518, 3529–3530, 3533–3537, 3543
exceptions.py	48	0	100%
observability.py	606	84	86%	332, 334–336, 339–341, 346–347, 353–354, 360–361, 368, 370–372, 375–377, 382–383, 389–390, 396–397, 404, 660, 663, 671–672, 675–678, 680, 683–685, 688–689, 717, 719, 730–732, 734–737, 741, 749, 850, 852, 1001, 1003, 1007–1012, 1014, 1017–1021, 1023, 1135–1136, 1138, 1189–1190, 1325, 1373–1374, 1490–1492, 1551, 1721, 1875, 1877
packages/core/agent_framework/_workflows
_agent.py	284	45	84%	61, 69–75, 103–104, 296, 354, 368, 381, 430–433, 439, 445, 449–450, 453–459, 463–464, 533, 540, 546–547, 558, 590, 597, 618, 627, 631, 633–635, 642
_agent_executor.py	171	23	86%	95, 117, 151, 167–168, 219–220, 222–223, 255–257, 265–267, 277–279, 281, 285, 289, 293–294
_base_group_chat_orchestrator.py	170	12	92%	135, 301, 316, 350–352, 356, 375, 436, 480–482
_const.py	6	0	100%
_conversation_state.py	36	4	88%	40, 44, 47, 64
_group_chat.py	285	37	87%	172, 333, 340, 367, 378–379, 385, 390, 406, 433–438, 440, 473–476, 478, 483–487, 648, 653, 667, 748, 754, 799, 819, 915, 934, 953, 963
_handoff.py	381	57	85%	110–111, 113, 142–143, 168–178, 180, 182, 184, 189, 291, 345, 370, 396, 404–405, 419, 468–469, 499, 546–548, 731, 738, 743, 830, 833, 842–845, 855, 860, 867, 873–876, 911, 916, 1113, 1116, 1124, 1142, 1149, 1224
_magentic.py	614	91	85%	68–77, 82, 86–97, 262, 273, 277, 297, 358, 367, 369, 411, 428, 437–438, 440–442, 444, 455, 597, 599, 639, 687, 723–725, 727, 735–738, 742–745, 788, 815–818, 909, 915, 921, 960, 998, 1027, 1044, 1055, 1109–1110, 1114–1116, 1140, 1161–1162, 1175, 1191, 1213, 1261–1262, 1300–1301, 1457, 1466, 1469, 1474, 1870, 1912, 1927, 1956
_message_utils.py	18	3	83%	22, 33, 37
_orchestration_request_info.py	54	0	100%
_orchestrator_helpers.py	21	3	85%	44, 90–91
_runner_context.py	168	6	96%	84, 87, 383, 403, 491, 495
_workflow.py	252	17	93%	89, 259–261, 263–264, 282, 310, 411, 679, 713, 718, 721, 740–742, 807
_workflow_builder.py	278	36	87%	259, 594, 693, 700–701, 802, 805, 810, 812, 819, 822–826, 828, 890, 965, 968, 1028–1029, 1174, 1188–1195, 1197, 1200, 1202–1204, 1212
_workflow_context.py	177	24	86%	63–64, 72, 76, 90, 166, 191, 309, 428, 471–473, 475, 477–478, 480–481, 490–492, 494–496, 498
packages/core/agent_framework/azure
_chat_client.py	79	4	94%	301, 303, 316–317
_responses_client.py	37	6	83%	146, 169, 198–201
packages/core/agent_framework/openai
_assistant_provider.py	110	11	90%	156–157, 169, 294, 360, 475–480
_assistants_client.py	275	35	87%	359, 361, 363, 366, 370–371, 374, 377, 382–383, 385, 388–390, 395, 406, 431, 433, 435, 437, 439, 444, 447, 450, 454, 465, 550, 635, 672, 709–712, 764, 781
_chat_client.py	264	21	92%	180–181, 185, 295, 302, 383–390, 392–395, 405, 490, 527, 543
_responses_client.py	560	62	88%	277–278, 283, 314, 322, 345, 407, 439, 464, 470, 488–489, 511, 516, 572, 587, 601, 614, 669, 748, 753, 757–759, 763–764, 787, 856, 878–879, 894–895, 913–914, 1045–1046, 1062, 1064, 1139–1147, 1195, 1250, 1265, 1301–1302, 1304–1306, 1320–1322, 1332–1333, 1339, 1354
_shared.py	135	16	88%	63, 69–72, 151, 153, 155, 162, 164, 177, 253, 277, 341–342, 344
packages/mem0/agent_framework_mem0
_provider.py	85	3	96%	174–175, 178
packages/purview/agent_framework_purview
_middleware.py	95	0	100%
packages/redis/agent_framework_redis
_chat_message_store.py	149	14	90%	199, 232, 322–323, 326, 329, 485, 575–579, 588, 592
_provider.py	189	9	95%	255, 257, 265, 270–271, 274, 329, 386, 398
TOTAL	16506	2017	87%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
3855	225 💤	0 ❌	0 🔥	1m 11s ⏱️

eavanvalkenburg · 2026-02-02T10:26:04Z

Fixes lingering CI failures: import missing response types in streaming telemetry finalizers, move AG-UI tests to ag_ui_tests with config updates, and track service thread IDs in AG-UI test client.

Checks: uv run poe fmt/lint/pyright/mypy; uv run poe all-tests.

Remove the hardcoded default of 'auto' for tool_choice in ChatAgent init. When tool_choice is not specified (None), it will now not be sent to the API, allowing the API's default behavior to be used. Users who want tool_choice='auto' can still explicitly set it either in default_options or at runtime. Fixes microsoft#3585

In OpenAI Assistants client, tools were not being sent when tool_choice='none'. This was incorrect - tool_choice='none' means the model won't call tools, but tools should still be available in the request (they may be used later in the conversation). Fixes microsoft#3585

Adds a regression test to ensure that when tool_choice='none' is set but tools are provided, the tools are still sent to the API. This verifies the fix for microsoft#3585.

Apply the same fix to OpenAI Responses client and Azure AI client:
- OpenAI Responses: Remove else block that popped tool_choice/parallel_tool_calls
- Azure AI: Remove tool_choice != 'none' check when adding tools

When tool_choice='none', the model won't call tools, but tools should still be sent to the API so they're available for future turns. Also update README to clarify tool_choice=required supports multiple tools. Fixes microsoft#3585

Move tool_choice processing outside of the 'if tools' block in OpenAI Responses client so tool_choice is sent to the API even when no tools are provided.
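
The option handling described in the commit messages above can be summarized in a small sketch: tools are kept when tool_choice is 'none', tool_choice is omitted when it is None, and tool_choice is sent even when there are no tools. The function `prepare_options_sketch` and the shape of the request dict are assumptions for illustration, not the clients' actual code.

```python
from __future__ import annotations

from typing import Any


def prepare_options_sketch(
    tools: list[dict[str, Any]] | None,
    tool_choice: str | None,
) -> dict[str, Any]:
    """Hypothetical helper mirroring the described behavior."""
    request: dict[str, Any] = {}
    if tools:
        # Tools are sent regardless of tool_choice, including 'none'.
        request["tools"] = tools
    if tool_choice is not None:
        # Handled outside the tools check, so it is sent even with no tools.
        # When it is None the key is omitted and the API default applies.
        request["tool_choice"] = tool_choice
    return request


weather_tool = [{"name": "get_weather"}]
with_none_choice = prepare_options_sketch(weather_tool, "none")
without_choice = prepare_options_sketch(weather_tool, None)
choice_without_tools = prepare_options_sketch(None, "auto")
```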

Changed test_prepare_options_removes_parallel_tool_calls_when_no_tools to test_prepare_options_preserves_parallel_tool_calls_when_no_tools to reflect that parallel_tool_calls is now preserved even when no tools are present, consistent with the tool_choice behavior.

- Update ChatMessage instantiation to use keyword args (role=, text=, contents=)
- Fix Role enum comparisons to use .value for string comparison
- Add created_at to AgentResponse in error handling
- Fix AgentResponse.from_updates -> from_agent_run_response_updates
- Fix DurableAgentStateMessage.from_chat_message to convert Role enum to string
- Add Role import where needed

- Fix ChatMessage usage in workflow files (use text= instead of contents= for strings)
- Fix AgentResponse.from_updates -> from_agent_run_response_updates in workflow files
- Fix test files for ChatMessage and Role enum usage

- Fix ChatMessage in _magentic.py replan method
- Fix Role enum comparison in test assertions
- Fix remaining test files with old ChatMessage syntax

- Add Role import where missing
- Fix ChatMessage signature: positional args to keyword args (role=, text=, contents=)
- Fix Role enum comparisons: .role.value instead of .role string
- Fix FinishReason enum usage in ag-ui event converters
- Rename AgentResponse.from_updates to from_agent_run_response_updates in ag-ui

Fixes API compatibility after Types API Review improvements merge

…ages
- Fix redis provider: Role enum comparison using .value
- Fix redis tests: ChatMessage signature and Role comparisons
- Fix github_copilot tests: ChatMessage signature and Role comparisons
- Update docstring examples in redis chat message store

- Fix executor: ChatMessage signature change
- Fix conversations: Role enum to string conversion in two places
- Fix tests: ChatMessage signatures and Role comparisons

- Fix a2a tests: Role comparisons and ChatMessage signatures
- Fix lab tau2 source: Role enum comparison in flip_messages, log_messages, sliding_window
- Fix lab tau2 tests: ChatMessage signatures and Role comparisons

After rebasing on upstream/main which merged PR microsoft#3647 (Types API Review improvements), fix all packages to use the new API:
- ChatMessage: Use keyword args (role=, text=, contents=) instead of positional args
- Role: Compare using .value attribute since it's now an enum

Packages fixed:
- ag-ui: Fixed Role value extraction bugs in _message_adapters.py
- anthropic: Fixed ChatMessage and Role comparisons in tests
- azure-ai: Fixed Role comparison in _client.py
- azure-ai-search: Fixed ChatMessage and Role in source/tests
- bedrock: Fixed ChatMessage signatures in tests
- chatkit: Fixed ChatMessage and Role in source/tests
- copilotstudio: Fixed ChatMessage and Role in tests
- declarative: Fixed ChatMessage in _executors_agents.py
- mem0: Fixed ChatMessage and Role in source/tests
- purview: Fixed ChatMessage in source/tests

- durabletask: Use str() fallback in role value extraction
- core: Fix ChatMessage in _orchestrator_helpers.py to use keyword args
- core: Add type ignore for _conversation_state.py contents deserialization
- ag-ui: Fix type ignore comments (call-overload instead of arg-type)
- azure-ai-search: Fix get_role_value type hint to accept Any
- lab: Move get_role_value to module level with Any type hint

- Increase job timeout from 10 to 15 minutes
- Reduce per-test timeout to 60s (was 900s/300s)
- Add --timeout_method thread for better timeout handling
- Add --timeout-verbose to see which tests are slow
- Reduce retries from 3 to 2 and delay from 10s to 5s

This ensures individual test timeouts are shorter than the job timeout, providing better visibility when tests hang. With 60s timeout and 2 retries, worst case per test is ~180s.
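
The ~180s figure can be checked with a few lines of arithmetic. This sketch assumes that "2 retries" means two re-runs after the first attempt, and that the retry delay is paid once per retry; both are interpretations rather than something stated in the configuration.

```python
PER_TEST_TIMEOUT_S = 60
RETRIES = 2
RETRY_DELAY_S = 5
JOB_TIMEOUT_S = 15 * 60

# First attempt plus every retry can each run into the per-test timeout.
attempts = 1 + RETRIES
worst_case_run_s = attempts * PER_TEST_TIMEOUT_S

# Including the pause before each retry adds a few more seconds.
worst_case_with_delays_s = worst_case_run_s + RETRIES * RETRY_DELAY_S

fits_in_job = worst_case_with_delays_s < JOB_TIMEOUT_S
```

Under these assumptions a single hanging test uses about three minutes of the fifteen-minute job budget.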

@overload

* WIP * big update to new ResponseStream model * fixed tests and typing * fixed tests and typing * fixed tools typevar import * fix * mypy fix * mypy fixes and some cleanup * fix missing quoted names * and client * fix imports agui * fix anthropic override * fix agui * fix ag ui * fix import * fix anthropic types * fix mypy * refactoring * updated typing * fix 3.11 * fixes * redid layering of chat clients and agents * redid layering of chat clients and agents * Fix lint, type, and test issues after rebase - Add @overload decorators to AgentProtocol.run() for type compatibility - Add missing docstring params (middleware, function_invocation_configuration) - Fix TODO format (TD002) by adding author tags - Fix broken observability tests from upstream: - Replace non-existent use_instrumentation with direct instantiation - Replace non-existent use_agent_instrumentation with AgentTelemetryLayer mixin - Fix get_streaming_response to use get_response(stream=True) - Add AgentInitializationError import - Update streaming exception tests to match actual behavior * Fix AgentExecutionException import error in test_agents.py - Replace non-existent AgentExecutionException with AgentRunException * Fix test import and asyncio deprecation issues - Add 'tests' to pythonpath in ag-ui pyproject.toml for utils_test_ag_ui import - Replace deprecated asyncio.get_event_loop().run_until_complete with asyncio.run * Fix azure-ai test failures - Update _prepare_options patching to use correct class path - Fix test_to_azure_ai_agent_tools_web_search_missing_connection to clear env vars * Convert ag-ui utils_test_ag_ui.py to conftest.py - Move test utilities to conftest.py for proper pytest discovery - Update all test imports to use conftest instead of utils_test_ag_ui - Remove old utils_test_ag_ui.py file - Revert pythonpath change in pyproject.toml * fix: use relative imports for ag-ui test utilities * fix agui * Rename Bare*Client to Raw*Client and BaseChatClient - Renamed BareChatClient to 
BaseChatClient (abstract base class) - Renamed BareOpenAIChatClient to RawOpenAIChatClient - Renamed BareOpenAIResponsesClient to RawOpenAIResponsesClient - Renamed BareAzureAIClient to RawAzureAIClient - Added warning docstrings to Raw* classes about layer ordering - Updated README in samples/getting_started/agents/custom with layer docs - Added test for span ordering with function calling * Fix layer ordering: FunctionInvocationLayer before ChatTelemetryLayer This ensures each inner LLM call gets its own telemetry span, resulting in the correct span sequence: chat -> execute_tool -> chat Updated all production clients and test mocks to use correct ordering: - ChatMiddlewareLayer (first) - FunctionInvocationLayer (second) - ChatTelemetryLayer (third) - BaseChatClient/Raw...Client (fourth) * Remove run_stream usage * Fix conversation_id propagation * Update uv.lock with latest dependencies * Python: Add BaseAgent implementation for Claude Agent SDK (#3509) * Added ClaudeAgent implementation * Updated streaming logic * Small updates * Small update * Fixes * Small fix * Naming improvements * Updated imports * Addressed comments * Updated package versions * Update Claude agent connector layering * fix test and plugin * Store function middleware in invocation layer * Fix telemetry streaming and ag-ui tests * Remove legacy ag-ui tests folder * updates * Remove terminate flag from FunctionInvocationContext, use MiddlewareTermination instead - Remove terminate attribute from FunctionInvocationContext - Add result attribute to MiddlewareTermination to carry function results - FunctionMiddlewarePipeline.execute() now lets MiddlewareTermination propagate - _auto_invoke_function captures context.result in exception before re-raising - _try_execute_function_calls catches MiddlewareTermination and sets should_terminate - Fix handoff middleware to append to chat_client.function_middleware directly - Update tests to use raise MiddlewareTermination instead of context.terminate - 
Add middleware flow documentation in samples/concepts/tools/README.md - Fix ag-ui to use FunctionMiddlewarePipeline instead of removed create_function_middleware_pipeline * fix: remove references to removed terminate flag in purview tests, add type ignore * fix: move _test_utils.py from package to test folder * fix: call get_final_response() to trigger context provider notification in streaming test * fix: correct broken links in tools README * docs: clarify default middleware behavior in summary table * fix: ensure inner stream result hooks are called when using map()/from_awaitable() * Fix mypy type errors * Address PR review comments on observability.py - Remove TODO comment about unconsumed streams, add explanatory note instead - Remove redundant _close_span cleanup hook (already called in _finalize_stream) - Clarify behavior: cleanup hooks run after stream iteration, if stream is not consumed the span remains open until garbage collected * Remove gen_ai.client.operation.duration from span attributes Duration is a metrics-only attribute per OpenTelemetry semantic conventions. It should be recorded to the histogram but not set as a span attribute. * Remove duration from _get_response_attributes, pass directly to _capture_response Duration is a metrics-only attribute. It's now passed directly to _capture_response instead of being included in the attributes dict that gets set on the span. * Remove redundant _close_span cleanup hook in AgentTelemetryLayer _finalize_stream already calls _close_span() in its finally block, so adding it as a separate cleanup hook is redundant. * Use weakref.finalize to close span when stream is garbage collected If a user creates a streaming response but never consumes it, the cleanup hooks won't run. Now we register a weak reference finalizer that will close the span when the stream object is garbage collected, ensuring spans don't leak in this scenario. 
* Fix _get_finalizers_from_stream to use _result_hooks attribute Renamed function to _get_result_hooks_from_stream and fixed it to look for the _result_hooks attribute which is the correct name in ResponseStream class. * Add missing asyncio import in test_request_info_mixin.py * Fix leftover merge conflict marker in image_generation sample * Update integration tests * Fix integration tests: increase max_iterations from 1 to 2 Tests with tool_choice options require at least 2 iterations: 1. First iteration to get function call and execute the tool 2. Second iteration to get the final text response With max_iterations=1, streaming tests would return early with only the function call/result but no final text content. * Fix duplicate function call error in conversation-based APIs When using conversation_id (for Responses/Assistants APIs), the server already has the function call message from the previous response. We should only send the new function result message, not all messages including the function call which would cause a duplicate ID error. Fix: When conversation_id is set, only send the last message (the tool result) instead of all response.messages. * Add regression test for conversation_id propagation between tool iterations Port test from PR #3664 with updates for new streaming API pattern. Tests that conversation_id is properly updated in options dict during function invocation loop iterations. * Fix tool_choice=required to return after tool execution When tool_choice is 'required', the user's intent is to force exactly one tool call. After the tool executes, return immediately with the function call and result - don't continue to call the model again. This fixes integration tests that were failing with empty text responses because with tool_choice=required, the model would keep returning function calls instead of text. 
Also adds regression tests for: - conversation_id propagation between tool iterations (from PR #3664) - tool_choice=required returns after tool execution * Document tool_choice behavior in tools README - Add table explaining tool_choice values (auto, none, required) - Explain why tool_choice=required returns immediately after tool execution - Add code example showing the difference between required and auto - Update flow diagram to show the early return path for tool_choice=required * Fix tool_choice=None behavior - don't default to 'auto' Remove the hardcoded default of 'auto' for tool_choice in ChatAgent init. When tool_choice is not specified (None), it will now not be sent to the API, allowing the API's default behavior to be used. Users who want tool_choice='auto' can still explicitly set it either in default_options or at runtime. Fixes #3585 * Fix tool_choice=none should not remove tools In OpenAI Assistants client, tools were not being sent when tool_choice='none'. This was incorrect - tool_choice='none' means the model won't call tools, but tools should still be available in the request (they may be used later in the conversation). Fixes #3585 * Add test for tool_choice=none preserving tools Adds a regression test to ensure that when tool_choice='none' is set but tools are provided, the tools are still sent to the API. This verifies the fix for #3585. * Fix tool_choice=none should not remove tools in all clients Apply the same fix to OpenAI Responses client and Azure AI client: - OpenAI Responses: Remove else block that popped tool_choice/parallel_tool_calls - Azure AI: Remove tool_choice != 'none' check when adding tools When tool_choice='none', the model won't call tools, but tools should still be sent to the API so they're available for future turns. Also update README to clarify tool_choice=required supports multiple tools. 
Fixes #3585 * Keep tool_choice even when tools is None Move tool_choice processing outside of the 'if tools' block in OpenAI Responses client so tool_choice is sent to the API even when no tools are provided. * Update test to match new parallel_tool_calls behavior Changed test_prepare_options_removes_parallel_tool_calls_when_no_tools to test_prepare_options_preserves_parallel_tool_calls_when_no_tools to reflect that parallel_tool_calls is now preserved even when no tools are present, consistent with the tool_choice behavior. * Fix ChatMessage API and Role enum usage after rebase - Update ChatMessage instantiation to use keyword args (role=, text=, contents=) - Fix Role enum comparisons to use .value for string comparison - Add created_at to AgentResponse in error handling - Fix AgentResponse.from_updates -> from_agent_run_response_updates - Fix DurableAgentStateMessage.from_chat_message to convert Role enum to string - Add Role import where needed * Fix additional ChatMessage API and method name changes - Fix ChatMessage usage in workflow files (use text= instead of contents= for strings) - Fix AgentResponse.from_updates -> from_agent_run_response_updates in workflow files - Fix test files for ChatMessage and Role enum usage * Fix remaining ChatMessage API usage in test files * Fix more ChatMessage and Role API changes in source and test files - Fix ChatMessage in _magentic.py replan method - Fix Role enum comparison in test assertions - Fix remaining test files with old ChatMessage syntax * Fix ChatMessage and Role API changes across packages - Add Role import where missing - Fix ChatMessage signature: positional args to keyword args (role=, text=, contents=) - Fix Role enum comparisons: .role.value instead of .role string - Fix FinishReason enum usage in ag-ui event converters - Rename AgentResponse.from_updates to from_agent_run_response_updates in ag-ui Fixes API compatibility after Types API Review improvements merge * Fix ChatMessage and Role API changes in 
github_copilot tests * Fix ChatMessage and Role API changes in redis and github_copilot packages - Fix redis provider: Role enum comparison using .value - Fix redis tests: ChatMessage signature and Role comparisons - Fix github_copilot tests: ChatMessage signature and Role comparisons - Update docstring examples in redis chat message store * Fix ChatMessage and Role API changes in devui package - Fix executor: ChatMessage signature change - Fix conversations: Role enum to string conversion in two places - Fix tests: ChatMessage signatures and Role comparisons * Fix ChatMessage and Role API changes in a2a and lab packages - Fix a2a tests: Role comparisons and ChatMessage signatures - Fix lab tau2 source: Role enum comparison in flip_messages, log_messages, sliding_window - Fix lab tau2 tests: ChatMessage signatures and Role comparisons * Remove duplicate test files from ag-ui/tests (tests are in ag_ui_tests) * Fix ChatMessage and Role API changes across packages After rebasing on upstream/main which merged PR #3647 (Types API Review improvements), fix all packages to use the new API: - ChatMessage: Use keyword args (role=, text=, contents=) instead of positional args - Role: Compare using .value attribute since it's now an enum Packages fixed: - ag-ui: Fixed Role value extraction bugs in _message_adapters.py - anthropic: Fixed ChatMessage and Role comparisons in tests - azure-ai: Fixed Role comparison in _client.py - azure-ai-search: Fixed ChatMessage and Role in source/tests - bedrock: Fixed ChatMessage signatures in tests - chatkit: Fixed ChatMessage and Role in source/tests - copilotstudio: Fixed ChatMessage and Role in tests - declarative: Fixed ChatMessage in _executors_agents.py - mem0: Fixed ChatMessage and Role in source/tests - purview: Fixed ChatMessage in source/tests * Fix mypy errors for ChatMessage and Role API changes - durabletask: Use str() fallback in role value extraction - core: Fix ChatMessage in _orchestrator_helpers.py to use keyword args - 
core: Add type ignore for _conversation_state.py contents deserialization - ag-ui: Fix type ignore comments (call-overload instead of arg-type) - azure-ai-search: Fix get_role_value type hint to accept Any - lab: Move get_role_value to module level with Any type hint * Improve CI test timeout configuration - Increase job timeout from 10 to 15 minutes - Reduce per-test timeout to 60s (was 900s/300s) - Add --timeout_method thread for better timeout handling - Add --timeout-verbose to see which tests are slow - Reduce retries from 3 to 2 and delay from 10s to 5s This ensures individual test timeouts are shorter than the job timeout, providing better visibility when tests hang. With 60s timeout and 2 retries, worst case per test is ~180s. --------- Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
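
The layer-ordering point in the commit message above (function invocation placed before telemetry so that each inner model call gets its own span) can be sketched with cooperative mixins and Python's method resolution order. All class names below are hypothetical stand-ins, the middleware layer is left out, and a list of strings plays the role of telemetry spans.

```python
class RawSketchClient:
    """Stand-in for a raw client: asks for a tool first, then answers."""

    def get_response(self, messages: list[str], trace: list[str]) -> dict[str, str]:
        if any(m.startswith("tool result:") for m in messages):
            return {"text": "It is sunny."}
        return {"tool_call": "get_weather"}


class TelemetrySketchLayer:
    """Records one 'chat' entry for every call that passes through it."""

    def get_response(self, messages: list[str], trace: list[str]) -> dict[str, str]:
        trace.append("chat")
        return super().get_response(messages, trace)  # next class in the MRO


class FunctionInvocationSketchLayer:
    """Loops until the inner client stops requesting tools."""

    def get_response(self, messages: list[str], trace: list[str]) -> dict[str, str]:
        response = super().get_response(messages, trace)
        while "tool_call" in response:
            trace.append("execute_tool")
            messages = [*messages, f"tool result: {response['tool_call']} done"]
            response = super().get_response(messages, trace)
        return response


class LayeredSketchClient(FunctionInvocationSketchLayer, TelemetrySketchLayer, RawSketchClient):
    """Function invocation outside telemetry: every model call is traced."""


class SwappedSketchClient(TelemetrySketchLayer, FunctionInvocationSketchLayer, RawSketchClient):
    """Telemetry outside function invocation: the whole loop is one entry."""


layered_trace: list[str] = []
LayeredSketchClient().get_response(["What is the weather?"], layered_trace)

swapped_trace: list[str] = []
SwappedSketchClient().get_response(["What is the weather?"], swapped_trace)
```

With function invocation first, the trace reads chat, execute_tool, chat, matching the span sequence the commit message describes; with the layers swapped, the second model call is hidden inside the single outer entry.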

Copilot AI review requested due to automatic review settings January 22, 2026 17:34

markwallace-microsoft added the documentation (Improvements or additions to documentation) and python labels Jan 22, 2026

Copilot started reviewing on behalf of eavanvalkenburg January 22, 2026 17:35

Copilot AI reviewed Jan 22, 2026

eavanvalkenburg force-pushed the python_single_response branch 3 times, most recently from 07afd46 to dd65afa Compare January 23, 2026 10:46

eavanvalkenburg changed the title ~~Python: [BREAKING} Python single response~~ Python: [BREAKING] Moved to a single get_response and run API Jan 23, 2026

eavanvalkenburg force-pushed the python_single_response branch 4 times, most recently from 32f0473 to 5c78d91 Compare January 30, 2026 05:03

markwallace-microsoft mentioned this pull request Jan 30, 2026

Python: NET: [Feature] Simplify AIAgent run methods #3515

Open

eavanvalkenburg requested a review from a team as a code owner January 30, 2026 16:25

TaoChenOSU reviewed Jan 30, 2026

markwallace-microsoft added the .NET, workflows (Related to Workflows in agent-framework), and lab (Agent Framework Lab) labels Feb 1, 2026

github-actions bot changed the title ~~Python: [BREAKING] Moved to a single get_response and run API~~ .NET: Python: [BREAKING] Moved to a single get_response and run API Feb 1, 2026

eavanvalkenburg force-pushed the python_single_response branch from ebfc3b0 to 92995e6 Compare February 1, 2026 14:58

eavanvalkenburg removed the .NET label Feb 1, 2026

eavanvalkenburg changed the title ~~.NET: Python: [BREAKING] Moved to a single get_response and run API~~ Python: [BREAKING] Moved to a single get_response and run API Feb 1, 2026

moonbox3 reviewed Feb 3, 2026

python/packages/core/agent_framework/observability.py (3 review threads, outdated and resolved)

eavanvalkenburg force-pushed the python_single_response branch from a8f7c92 to a5dadf8 Compare February 3, 2026 15:36

dmytrostruk approved these changes Feb 4, 2026

python/packages/core/agent_framework/_tools.py (outdated, resolved)

eavanvalkenburg force-pushed the python_single_response branch from 92df8e3 to a99fdba Compare February 4, 2026 08:08

eavanvalkenburg enabled auto-merge February 4, 2026 08:23

eavanvalkenburg added 11 commits February 4, 2026 12:21

Add test for tool_choice=none preserving tools

0f416c0

Adds a regression test to ensure that when tool_choice='none' is set but tools are provided, the tools are still sent to the API. This verifies the fix for microsoft#3585.

Keep tool_choice even when tools is None

f9f5bd7

Move tool_choice processing outside of the 'if tools' block in OpenAI Responses client so tool_choice is sent to the API even when no tools are provided.
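The two commits above describe complementary behaviors: tools are still sent when `tool_choice='none'`, and `tool_choice` is sent even when there are no tools. A minimal sketch of that control flow follows. `build_request_options` and the tool dictionary are hypothetical stand-ins for illustration, not the actual code of the OpenAI Responses client in this repository.

```python
from __future__ import annotations

from typing import Any


def build_request_options(
    tools: list[dict[str, Any]] | None,
    tool_choice: str | None,
) -> dict[str, Any]:
    """Assemble request keyword arguments (hypothetical helper)."""
    options: dict[str, Any] = {}
    if tools:
        # Tools are forwarded whenever they are provided, including
        # when tool_choice is "none".
        options["tools"] = tools
    # tool_choice is handled outside the `if tools` block, so it is
    # sent even when no tools are provided.
    if tool_choice is not None:
        options["tool_choice"] = tool_choice
    return options


weather_tool = {"type": "function", "name": "get_weather"}  # illustrative only
with_tools = build_request_options([weather_tool], "none")
without_tools = build_request_options(None, "none")
```

Here `with_tools` carries both keys, while `without_tools` carries only `tool_choice`, which is the behavior the regression test and the follow-up fix are described as protecting.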

Fix additional ChatMessage API and method name changes

d3daf0f

- Fix ChatMessage usage in workflow files (use text= instead of contents= for strings)
- Fix AgentResponse.from_updates -> from_agent_run_response_updates in workflow files
- Fix test files for ChatMessage and Role enum usage

Fix remaining ChatMessage API usage in test files

c824cc3

Fix more ChatMessage and Role API changes in source and test files

11bd057

- Fix ChatMessage in _magentic.py replan method
- Fix Role enum comparison in test assertions
- Fix remaining test files with old ChatMessage syntax
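The pattern behind these fixes (keyword-only ChatMessage construction and Role comparison through `.value`) can be sketched with stand-in types. The `Role` and `ChatMessage` classes below are simplified placeholders written for this example, not the framework's real classes. `get_role_value` borrows a helper name mentioned in the commit messages, but its body here is an assumption.

```python
from __future__ import annotations

from enum import Enum
from typing import Any


class Role(Enum):
    """Stand-in for the framework's Role enum (illustrative only)."""

    USER = "user"
    ASSISTANT = "assistant"


class ChatMessage:
    """Stand-in message type that accepts keyword arguments only."""

    def __init__(self, *, role: Role, text: str | None = None,
                 contents: list[Any] | None = None) -> None:
        self.role = role
        self.text = text
        self.contents = contents if contents is not None else []


def get_role_value(role: Any) -> str:
    """Return the string form of a role, with a str() fallback.

    Accepts either an enum member (uses .value) or a plain string.
    """
    value = getattr(role, "value", None)
    return value if isinstance(value, str) else str(role)


# Keyword arguments: text= for a plain string, contents= for content items.
message = ChatMessage(role=Role.USER, text="Hello")

# Compare via .value rather than comparing the enum member to a string.
is_user = message.role.value == "user"
naive_comparison = message.role == "user"  # False for a plain Enum
```

With a plain `Enum`, comparing the member directly to a string evaluates to `False`, which is why assertions written against the older string-style role would need to go through `.value` or a helper such as `get_role_value`.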

eavanvalkenburg force-pushed the python_single_response branch from 04fa714 to 23b22c0 Compare February 4, 2026 12:58

eavanvalkenburg added 7 commits February 4, 2026 14:00

Fix ChatMessage and Role API changes in github_copilot tests

32aa605

Fix ChatMessage and Role API changes in devui package

0e56ecf

- Fix executor: ChatMessage signature change
- Fix conversations: Role enum to string conversion in two places
- Fix tests: ChatMessage signatures and Role comparisons

Fix ChatMessage and Role API changes in a2a and lab packages

9ebb1e3

- Fix a2a tests: Role comparisons and ChatMessage signatures
- Fix lab tau2 source: Role enum comparison in flip_messages, log_messages, sliding_window
- Fix lab tau2 tests: ChatMessage signatures and Role comparisons

Remove duplicate test files from ag-ui/tests (tests are in ag_ui_tests)

4cefbc0

eavanvalkenburg enabled auto-merge February 4, 2026 16:05

eavanvalkenburg added the breaking change label (Introduces changes that are not backward compatible and may require updates to dependent code.) Feb 4, 2026

markwallace-microsoft approved these changes Feb 4, 2026

eavanvalkenburg added this pull request to the merge queue Feb 4, 2026

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 4, 2026

eavanvalkenburg force-pushed the python_single_response branch from 5031785 to d42c470 Compare February 4, 2026 18:07

dmytrostruk approved these changes Feb 4, 2026

eavanvalkenburg added this pull request to the merge queue Feb 4, 2026

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 4, 2026

eavanvalkenburg commented Jan 22, 2026 (edited)

Copilot AI left a comment

markwallace-microsoft commented Jan 23, 2026 (edited)

Python Unit Test Overview

eavanvalkenburg commented Feb 2, 2026

5 participants
