Python: Fix file_search citations breaking assistant history roundtrip by moonbox3 · Pull Request #5557 · microsoft/agent-framework

moonbox3 · 2026-04-29T02:15:18Z

Motivation and Context

Closes #5556. After the RC5 → 1.0 migration to the Responses API, multi-agent flows (SequentialBuilder, GroupChatBuilder) break with a 400 from the API whenever one agent uses file_search and its history is forwarded to another agent. The root cause is in OpenAIResponsesClient (inherited by FoundryChatClient):

Outbound _prepare_content_for_openai maps hosted_file → input_file for any role, but input_file is an input-only content type that the Responses API rejects inside an assistant message.
Streaming attaches file_citation / container_file_citation / file_path annotations as standalone HostedFileContent items, while the non-streaming path attaches them as text annotations. The asymmetry means streaming users always trip (1).
Outbound output_text hardcodes "annotations": [], silently dropping citation context on every roundtrip even when (1) wouldn't fire.

This worked in RC5 because Chat Completions used flat text annotations and had no input/output content-type schema split.

Description

Three coordinated fixes in packages/openai/agent_framework_openai/_chat_client.py:

Skip hosted_file for assistant role in _prepare_content_for_openai. Hosted-file content on an assistant message is a citation reference, not a replayable input file; dropping it stops the 400.
Streaming attaches citations as text annotations, matching the non-streaming path. file_citation, container_file_citation, and file_path now produce Content.from_text(text="", annotations=[Annotation(type="citation", ...)]) instead of a standalone HostedFileContent. URL citations (which already used this pattern) are unchanged.
output_text preserves annotations on roundtrip via a new _annotations_to_output_text helper that converts framework Annotation objects back to Responses API annotation dicts (file_citation, url_citation, container_file_citation, file_path).

Built TDD-style: 4 new behavior tests plus an end-to-end regression test that exercises the exact streaming-citation → assistant-history-forwarding flow from the bug report. 3 existing streaming-annotation tests were updated to assert the new (consistent) behavior — they previously asserted the buggy split-content shape.

Verified locally: pytest packages/openai/tests/openai/, pytest packages/foundry/tests/foundry/, pytest packages/core/tests/, mypy, and ruff all clean.

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

The Responses API rejects 'input_file' inside an assistant message, but the SDK was emitting it whenever an assistant Message contained a hosted_file content (which is what file_search citations become). Three coordinated fixes: 1. _prepare_content_for_openai now skips hosted_file for the assistant role instead of mapping to input_file (which the API rejects there). 2. The streaming response.output_text.annotation.added handler attaches file_citation, container_file_citation, and file_path as annotations on text content, matching the non-streaming path. Previously streaming produced standalone HostedFileContent items that always tripped (1). 3. output_text serialization preserves Annotation objects on roundtrip via a new _annotations_to_output_text helper instead of hardcoding 'annotations' to []. file_search citations now survive multi-agent forwarding. Closes microsoft#5556.

moonbox3 · 2026-04-29T02:17:53Z

Python Test Coverage Report •

File	Stmts	Miss	Cover	Missing
packages/openai/agent_framework_openai
_chat_client.py	968	123	87%	263, 276, 601–604, 608–609, 615–616, 651–657, 678, 686, 709, 827, 926, 985, 987, 989, 991, 1057, 1071, 1151, 1161, 1166, 1209, 1325, 1506, 1511, 1515–1517, 1521–1522, 1588, 1624, 1630, 1640, 1646, 1651, 1657, 1662–1663, 1682, 1772, 1794–1795, 1810–1811, 1829–1830, 1873, 2039, 2077–2078, 2094, 2096, 2176–2184, 2214, 2324, 2359, 2374, 2394–2404, 2417, 2428–2432, 2446, 2460–2471, 2480, 2512–2515, 2525–2526, 2537–2539, 2553–2555, 2565–2566, 2572, 2587
TOTAL	30554	3562	88%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
6139	30 💤	0 ❌	0 🔥	1m 46s ⏱️

Copilot

Pull request overview

Fixes a Responses API incompatibility where file_search citations (and other hosted-file citation artifacts) could be serialized back into assistant history as input_file, causing 400s when multi-agent workflows forward assistant messages.

Changes:

Preserve assistant output_text.annotations on serialization by converting framework Annotation(type="citation") back into Responses API annotation dicts.
Prevent hosted_file contents in assistant messages from roundtripping as input_file by dropping them during outbound preparation.
Make streaming citation events attach as text annotations (via empty text Content) rather than standalone HostedFileContent, aligning streaming with non-streaming behavior; update/add regression tests.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
python/packages/openai/agent_framework_openai/_chat_client.py	Adds annotation roundtrip helper, drops assistant hosted_file on outbound, and changes streaming citation handling to produce text annotations.
python/packages/openai/tests/openai/test_openai_chat_client.py	Adds/updates tests to cover citation roundtrip behavior and the streaming→history regression.

github-actions

Automated Code Review

Reviewers: 4 | Confidence: 90%

✓ Correctness

This PR fixes a roundtrip bug where streaming file_search citations were stored as HostedFileContent, then serialized as input_file items in assistant history — rejected by the Responses API. The fix correctly converts these to text annotations (matching the non-streaming path) and adds a helper to reconstruct API-format annotations on output. The branching logic in _annotations_to_output_text correctly maps all four annotation variants (file_path, file_citation, url_citation, container_file_citation) and the priority ordering of the elif chain prevents misclassification. Dropping hosted_file for assistant role returns {} which is properly filtered by the existing if prepared_content: guard in _prepare_message_for_openai. Tests are comprehensive and cover the main code paths.

✓ Security Reliability

This diff fixes a real bug where streaming file_search citations were represented as HostedFileContent, which then serialized as input_file in assistant history and was rejected by the Responses API. The fix converts these to text annotations that roundtrip cleanly. The new _annotations_to_output_text helper safely reshapes Annotation TypedDicts (all fields optional, accessed via .get()) into API-compatible dicts. The empty-dict guard for hosted_file on assistant messages is correctly filtered by existing calers (line 1352: if prepared_content: — empty dict is falsy). The getattr(content, "annotations", None) is slightly redundant since content.annotations is unconditionally set in Content.__init__, but is harmlessly defensive. No injection risks, resource leaks, secrets exposure, or unhandled failure modes found.

✓ Test Coverage

The PR adds four new tests and updates three existing streaming annotation tests to reflect the behavioral change from Content.from_hosted_file to Content.from_text with citation annotations. Test coverage is solid for the major paths: file_citation, url_citation, and container_file_citation roundtrips are all verified, as is the hosted_file → {} guard for assistant role. However, the file_path branch of _annotations_to_output_text (the elif file_id: path on diff line ~300) has no roundtrip test, even though it is one of the three streaming annotation types that changed representation in this PR. The streaming-parse side is tested (test_streaming_annotation_added_with_file_path), but no test verifies that the resulting annotation serializes back to {"type": "file_path", "file_id": ...} via _annotations_to_output_text.

✓ Design Approach

The change fixes the immediate input_file rejection, but the streaming-side approach still breaks the underlying model of citations: it now emits annotation-added events as separate empty text contents instead of attaching them to the streamed text they annotate. Since assistant history serialization writes each Content as its own output_text item, the annotation offsets are no longer relative to the actual text, unlike the non-streaming path which keeps text and annotations together. That means the PR avoids one API error while still losing the text/citation relationship during round-trip.

Automated review by moonbox3's agents

- _annotations_to_output_text: fan out one entry per annotated_region for url_citation/container_file_citation (Annotation.annotated_regions is a Sequence; the API form carries one start/end per entry). - Validate region span bounds are ints before emitting; skip otherwise. - Add test for the file_path branch (annotation with file_id only). - Add test verifying streamed citation events coalesce onto surrounding text via _finalize_response so span indices reference the merged text, not the empty-text streaming carrier.

moonbox3 · 2026-04-29T02:25:24Z

Pushed 4c67642 addressing all four review comments:

annotated_regions fan-out: _annotations_to_output_text now emits one Responses API entry per region for url_citation / container_file_citation, instead of collapsing to regions[0].
Span validation: regions whose start_index / end_index are not ints are skipped, so the helper can't produce malformed payloads when fed partial regions.
file_path test: added test_assistant_text_preserves_file_path_annotation covering the file_id-only branch that wasn't exercised before.
Streaming citation attachment: the concern that span indices would reference text == "" doesn't apply once _finalize_response runs — _coalesce_text_content merges consecutive text contents and unions their annotations, so the citation lands on the merged assistant text. Added test_streamed_file_citation_coalesces_onto_surrounding_text to lock that behavior in. Verified locally with the full chunk → ChatResponse.from_updates path: empty-text + annotation update gets merged into the preceding "Hello world." text content with the annotation attached.

* Python: bump package versions for 1.2.2 release PATCH bump (1.2.1 -> 1.2.2) for the released cohort. Five PRs land in this window: - agent-framework-openai: fix file_search citations breaking the assistant- message history roundtrip (#5557) — drives the released-tier PATCH - agent-framework-orchestrations: [BREAKING] standardize orchestration terminal outputs as AgentResponse (#5301) - agent-framework-core, agent-framework-declarative: preserve Workflow.run() shared state across calls, accept list[Message] in declarative start executor, and coerce Enum values when serializing PowerFx symbols (#5531) - agent-framework-foundry-hosting: add hosted Durable Workflow support (#5531) - agent-framework-azure-contentunderstanding: new alpha package — Azure AI Content Understanding context provider (#4829) - dependencies: workspace package dependency refresh (#5555) Per lockstep convention, all 21 beta packages stamp 1.0.0b260429 and all 4 alpha packages (now including the new contentunderstanding) stamp 1.0.0a260429. Date stamp reflects 2026-04-29 Pacific. Every non-core package floor on agent-framework-core is raised to >=1.2.2; the new contentunderstanding package's stale >=1.0.0 floor is brought into line. Two follow-on fixes bundled to keep validate-dependency-bounds-test green at lowest-direct resolution: - Bump agent-framework-azure-contentunderstanding's azure-ai-content understanding lower bound from >=1.0.0 to >=1.0.1 (1.0.0 ships without proper typing — pyright reports 65 unknown-type errors) - Add pyright ignore comments to core/foundry/__init__.pyi for the new alpha package's type-stub imports, since alpha packages are not in core's [all] extra and therefore aren't installed at lowest-direct * Python: add #5552 to 1.2.2 CHANGELOG Add the streaming-span observability fix to the Fixed section. PR is on upstream/main but not yet pulled into origin/main; the code itself will land via the PR merge. * Python: address PR #5561 review feedback on dependency bounds Two packaging fixes flagged in review: 1. agent-framework-azure-contentunderstanding: add agent-framework-foundry as a runtime dependency. The package's README directs users to `pip install agent-framework-azure-contentunderstanding --pre` and the basic example imports `FoundryChatClient` from `agent_framework.foundry`, so the documented install path was failing with ImportError. Pulling agent-framework-foundry into deps makes the advertised entry path self-contained. 2. agent-framework-foundry: bump agent-framework-openai lower bound from >=1.1.0 to >=1.2.2,<2. Foundry imports private modules from agent_framework_openai (`_chat_client.py:22`, `_agent.py:34`), so resolvers were free to pair foundry==1.2.2 with older OpenAI versions that lack this release's coordinated Responses/history fix. Lockstep the floor with the released cohort to prevent mismatched installs. Both changes pass `validate-dependency-bounds-test` lower + upper at their respective packages.

Copilot AI review requested due to automatic review settings April 29, 2026 02:15

moonbox3 added the python label Apr 29, 2026

Copilot started reviewing on behalf of moonbox3 April 29, 2026 02:15 View session

Copilot AI reviewed Apr 29, 2026

View reviewed changes

Comment thread python/packages/openai/agent_framework_openai/_chat_client.py Outdated

Comment thread python/packages/openai/agent_framework_openai/_chat_client.py Outdated

github-actions Bot reviewed Apr 29, 2026

View reviewed changes

Comment thread python/packages/openai/tests/openai/test_openai_chat_client.py

Comment thread python/packages/openai/agent_framework_openai/_chat_client.py

moonbox3 self-assigned this Apr 29, 2026

moonbox3 added the agents Issues related to single agents label Apr 29, 2026

moonbox3 added this to Agent Framework Apr 29, 2026

TaoChenOSU approved these changes Apr 29, 2026

View reviewed changes

moonbox3 enabled auto-merge April 29, 2026 05:38

moonbox3 requested a review from eavanvalkenburg April 29, 2026 06:41

eavanvalkenburg approved these changes Apr 29, 2026

View reviewed changes

moonbox3 added this pull request to the merge queue Apr 29, 2026

Merged via the queue into microsoft:main with commit 46ab47b Apr 29, 2026
33 checks passed

github-project-automation Bot moved this to Done in Agent Framework Apr 29, 2026

moonbox3 mentioned this pull request Apr 29, 2026

Python: bump package versions for 1.2.2 release #5561

Merged

This was referenced Apr 29, 2026

Bump the microsoft group with 7 updates razeone/CloudEngAgent#27

Open

Bump the dotnet group with 1 update dotnet/docs#53477

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: Fix file_search citations breaking assistant history roundtrip#5557

Python: Fix file_search citations breaking assistant history roundtrip#5557
moonbox3 merged 2 commits intomicrosoft:mainfrom
moonbox3:5556-fix

moonbox3 commented Apr 29, 2026

Uh oh!

moonbox3 commented Apr 29, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Uh oh!

moonbox3 commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

moonbox3 commented Apr 29, 2026

Motivation and Context

Description

Contribution Checklist

Uh oh!

moonbox3 commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python Unit Test Overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✓ Design Approach

Uh oh!

Uh oh!

Uh oh!

moonbox3 commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

moonbox3 commented Apr 29, 2026 •

edited

Loading