Skip to content

Python: Fix file_search citations breaking assistant history roundtrip#5557

Merged
moonbox3 merged 2 commits intomicrosoft:mainfrom
moonbox3:5556-fix
Apr 29, 2026
Merged

Python: Fix file_search citations breaking assistant history roundtrip#5557
moonbox3 merged 2 commits intomicrosoft:mainfrom
moonbox3:5556-fix

Conversation

@moonbox3
Copy link
Copy Markdown
Contributor

Motivation and Context

Closes #5556. After the RC5 → 1.0 migration to the Responses API, multi-agent flows (SequentialBuilder, GroupChatBuilder) break with a 400 from the API whenever one agent uses file_search and its history is forwarded to another agent. The root cause is in OpenAIResponsesClient (inherited by FoundryChatClient):

  1. Outbound _prepare_content_for_openai maps hosted_file → input_file for any role, but input_file is an input-only content type that the Responses API rejects inside an assistant message.
  2. Streaming attaches file_citation / container_file_citation / file_path annotations as standalone HostedFileContent items, while the non-streaming path attaches them as text annotations. The asymmetry means streaming users always trip (1).
  3. Outbound output_text hardcodes "annotations": [], silently dropping citation context on every roundtrip even when (1) wouldn't fire.

This worked in RC5 because Chat Completions used flat text annotations and had no input/output content-type schema split.

Description

Three coordinated fixes in packages/openai/agent_framework_openai/_chat_client.py:

  • Skip hosted_file for assistant role in _prepare_content_for_openai. Hosted-file content on an assistant message is a citation reference, not a replayable input file; dropping it stops the 400.
  • Streaming attaches citations as text annotations, matching the non-streaming path. file_citation, container_file_citation, and file_path now produce Content.from_text(text="", annotations=[Annotation(type="citation", ...)]) instead of a standalone HostedFileContent. URL citations (which already used this pattern) are unchanged.
  • output_text preserves annotations on roundtrip via a new _annotations_to_output_text helper that converts framework Annotation objects back to Responses API annotation dicts (file_citation, url_citation, container_file_citation, file_path).

Built TDD-style: 4 new behavior tests plus an end-to-end regression test that exercises the exact streaming-citation → assistant-history-forwarding flow from the bug report. 3 existing streaming-annotation tests were updated to assert the new (consistent) behavior — they previously asserted the buggy split-content shape.

Verified locally: pytest packages/openai/tests/openai/, pytest packages/foundry/tests/foundry/, pytest packages/core/tests/, mypy, and ruff all clean.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

The Responses API rejects 'input_file' inside an assistant message, but the
SDK was emitting it whenever an assistant Message contained a hosted_file
content (which is what file_search citations become). Three coordinated fixes:

1. _prepare_content_for_openai now skips hosted_file for the assistant role
   instead of mapping to input_file (which the API rejects there).

2. The streaming response.output_text.annotation.added handler attaches
   file_citation, container_file_citation, and file_path as annotations on
   text content, matching the non-streaming path. Previously streaming
   produced standalone HostedFileContent items that always tripped (1).

3. output_text serialization preserves Annotation objects on roundtrip via a
   new _annotations_to_output_text helper instead of hardcoding 'annotations'
   to []. file_search citations now survive multi-agent forwarding.

Closes microsoft#5556.
Copilot AI review requested due to automatic review settings April 29, 2026 02:15
@moonbox3
Copy link
Copy Markdown
Contributor Author

moonbox3 commented Apr 29, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/openai/agent_framework_openai
   _chat_client.py96812387%263, 276, 601–604, 608–609, 615–616, 651–657, 678, 686, 709, 827, 926, 985, 987, 989, 991, 1057, 1071, 1151, 1161, 1166, 1209, 1325, 1506, 1511, 1515–1517, 1521–1522, 1588, 1624, 1630, 1640, 1646, 1651, 1657, 1662–1663, 1682, 1772, 1794–1795, 1810–1811, 1829–1830, 1873, 2039, 2077–2078, 2094, 2096, 2176–2184, 2214, 2324, 2359, 2374, 2394–2404, 2417, 2428–2432, 2446, 2460–2471, 2480, 2512–2515, 2525–2526, 2537–2539, 2553–2555, 2565–2566, 2572, 2587
TOTAL30554356288% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
6139 30 💤 0 ❌ 0 🔥 1m 46s ⏱️

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a Responses API incompatibility where file_search citations (and other hosted-file citation artifacts) could be serialized back into assistant history as input_file, causing 400s when multi-agent workflows forward assistant messages.

Changes:

  • Preserve assistant output_text.annotations on serialization by converting framework Annotation(type="citation") back into Responses API annotation dicts.
  • Prevent hosted_file contents in assistant messages from roundtripping as input_file by dropping them during outbound preparation.
  • Make streaming citation events attach as text annotations (via empty text Content) rather than standalone HostedFileContent, aligning streaming with non-streaming behavior; update/add regression tests.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
python/packages/openai/agent_framework_openai/_chat_client.py Adds annotation roundtrip helper, drops assistant hosted_file on outbound, and changes streaming citation handling to produce text annotations.
python/packages/openai/tests/openai/test_openai_chat_client.py Adds/updates tests to cover citation roundtrip behavior and the streaming→history regression.

Comment thread python/packages/openai/agent_framework_openai/_chat_client.py Outdated
Comment thread python/packages/openai/agent_framework_openai/_chat_client.py Outdated
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 90%

✓ Correctness

This PR fixes a roundtrip bug where streaming file_search citations were stored as HostedFileContent, then serialized as input_file items in assistant history — rejected by the Responses API. The fix correctly converts these to text annotations (matching the non-streaming path) and adds a helper to reconstruct API-format annotations on output. The branching logic in _annotations_to_output_text correctly maps all four annotation variants (file_path, file_citation, url_citation, container_file_citation) and the priority ordering of the elif chain prevents misclassification. Dropping hosted_file for assistant role returns {} which is properly filtered by the existing if prepared_content: guard in _prepare_message_for_openai. Tests are comprehensive and cover the main code paths.

✓ Security Reliability

This diff fixes a real bug where streaming file_search citations were represented as HostedFileContent, which then serialized as input_file in assistant history and was rejected by the Responses API. The fix converts these to text annotations that roundtrip cleanly. The new _annotations_to_output_text helper safely reshapes Annotation TypedDicts (all fields optional, accessed via .get()) into API-compatible dicts. The empty-dict guard for hosted_file on assistant messages is correctly filtered by existing calers (line 1352: if prepared_content: — empty dict is falsy). The getattr(content, "annotations", None) is slightly redundant since content.annotations is unconditionally set in Content.__init__, but is harmlessly defensive. No injection risks, resource leaks, secrets exposure, or unhandled failure modes found.

✓ Test Coverage

The PR adds four new tests and updates three existing streaming annotation tests to reflect the behavioral change from Content.from_hosted_file to Content.from_text with citation annotations. Test coverage is solid for the major paths: file_citation, url_citation, and container_file_citation roundtrips are all verified, as is the hosted_file{} guard for assistant role. However, the file_path branch of _annotations_to_output_text (the elif file_id: path on diff line ~300) has no roundtrip test, even though it is one of the three streaming annotation types that changed representation in this PR. The streaming-parse side is tested (test_streaming_annotation_added_with_file_path), but no test verifies that the resulting annotation serializes back to {"type": "file_path", "file_id": ...} via _annotations_to_output_text.

✓ Design Approach

The change fixes the immediate input_file rejection, but the streaming-side approach still breaks the underlying model of citations: it now emits annotation-added events as separate empty text contents instead of attaching them to the streamed text they annotate. Since assistant history serialization writes each Content as its own output_text item, the annotation offsets are no longer relative to the actual text, unlike the non-streaming path which keeps text and annotations together. That means the PR avoids one API error while still losing the text/citation relationship during round-trip.


Automated review by moonbox3's agents

Comment thread python/packages/openai/tests/openai/test_openai_chat_client.py
Comment thread python/packages/openai/agent_framework_openai/_chat_client.py
@moonbox3 moonbox3 self-assigned this Apr 29, 2026
@moonbox3 moonbox3 added the agents Issues related to single agents label Apr 29, 2026
- _annotations_to_output_text: fan out one entry per annotated_region for
  url_citation/container_file_citation (Annotation.annotated_regions is a
  Sequence; the API form carries one start/end per entry).
- Validate region span bounds are ints before emitting; skip otherwise.
- Add test for the file_path branch (annotation with file_id only).
- Add test verifying streamed citation events coalesce onto surrounding
  text via _finalize_response so span indices reference the merged text,
  not the empty-text streaming carrier.
@moonbox3
Copy link
Copy Markdown
Contributor Author

Pushed 4c67642 addressing all four review comments:

  • annotated_regions fan-out: _annotations_to_output_text now emits one Responses API entry per region for url_citation / container_file_citation, instead of collapsing to regions[0].
  • Span validation: regions whose start_index / end_index are not ints are skipped, so the helper can't produce malformed payloads when fed partial regions.
  • file_path test: added test_assistant_text_preserves_file_path_annotation covering the file_id-only branch that wasn't exercised before.
  • Streaming citation attachment: the concern that span indices would reference text == "" doesn't apply once _finalize_response runs — _coalesce_text_content merges consecutive text contents and unions their annotations, so the citation lands on the merged assistant text. Added test_streamed_file_citation_coalesces_onto_surrounding_text to lock that behavior in. Verified locally with the full chunk → ChatResponse.from_updates path: empty-text + annotation update gets merged into the preceding "Hello world." text content with the annotation attached.

@moonbox3 moonbox3 enabled auto-merge April 29, 2026 05:38
@moonbox3 moonbox3 added this pull request to the merge queue Apr 29, 2026
Merged via the queue into microsoft:main with commit 46ab47b Apr 29, 2026
33 checks passed
moonbox3 added a commit that referenced this pull request Apr 29, 2026
* Python: bump package versions for 1.2.2 release

PATCH bump (1.2.1 -> 1.2.2) for the released cohort. Five PRs land in this
window:

- agent-framework-openai: fix file_search citations breaking the assistant-
  message history roundtrip (#5557) — drives the released-tier PATCH
- agent-framework-orchestrations: [BREAKING] standardize orchestration
  terminal outputs as AgentResponse (#5301)
- agent-framework-core, agent-framework-declarative: preserve Workflow.run()
  shared state across calls, accept list[Message] in declarative start
  executor, and coerce Enum values when serializing PowerFx symbols (#5531)
- agent-framework-foundry-hosting: add hosted Durable Workflow support
  (#5531)
- agent-framework-azure-contentunderstanding: new alpha package — Azure AI
  Content Understanding context provider (#4829)
- dependencies: workspace package dependency refresh (#5555)

Per lockstep convention, all 21 beta packages stamp 1.0.0b260429 and all 4
alpha packages (now including the new contentunderstanding) stamp
1.0.0a260429. Date stamp reflects 2026-04-29 Pacific. Every non-core package
floor on agent-framework-core is raised to >=1.2.2; the new
contentunderstanding package's stale >=1.0.0 floor is brought into line.

Two follow-on fixes bundled to keep validate-dependency-bounds-test green
at lowest-direct resolution:
- Bump agent-framework-azure-contentunderstanding's azure-ai-content
  understanding lower bound from >=1.0.0 to >=1.0.1 (1.0.0 ships without
  proper typing — pyright reports 65 unknown-type errors)
- Add pyright ignore comments to core/foundry/__init__.pyi for the new
  alpha package's type-stub imports, since alpha packages are not in
  core's [all] extra and therefore aren't installed at lowest-direct

* Python: add #5552 to 1.2.2 CHANGELOG

Add the streaming-span observability fix to the Fixed section. PR is on
upstream/main but not yet pulled into origin/main; the code itself will
land via the PR merge.

* Python: address PR #5561 review feedback on dependency bounds

Two packaging fixes flagged in review:

1. agent-framework-azure-contentunderstanding: add agent-framework-foundry
   as a runtime dependency. The package's README directs users to
   `pip install agent-framework-azure-contentunderstanding --pre` and the
   basic example imports `FoundryChatClient` from `agent_framework.foundry`,
   so the documented install path was failing with ImportError. Pulling
   agent-framework-foundry into deps makes the advertised entry path
   self-contained.

2. agent-framework-foundry: bump agent-framework-openai lower bound from
   >=1.1.0 to >=1.2.2,<2. Foundry imports private modules from
   agent_framework_openai (`_chat_client.py:22`, `_agent.py:34`), so
   resolvers were free to pair foundry==1.2.2 with older OpenAI versions
   that lack this release's coordinated Responses/history fix. Lockstep the
   floor with the released cohort to prevent mismatched installs.

Both changes pass `validate-dependency-bounds-test` lower + upper at
their respective packages.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Issues related to single agents python

Projects

Status: Done

4 participants