Python: Handle url_citation annotations in FoundryChatClient streaming responses #5071

Merged
giles17 merged 5 commits into microsoft:main from giles17:agent/fix-5029-1
Apr 16, 2026

Conversation

@giles17
Contributor

@giles17 giles17 commented Apr 2, 2026

Motivation and Context

When using FoundryChatClient with SharePoint grounding, url_citation annotations present in the streamed response were silently dropped — inline citation markers appeared in the text but no URL metadata was surfaced to consumers, making citations unusable.

Fixes #5029

Description

The streaming handler for response.output_text.annotation.added events handled file_path, file_citation, and container_file_citation but had no branch for url_citation, causing it to fall through to a debug log with no content emitted. The fix adds a url_citation case that constructs an Annotation with type="citation", title, url, and TextSpanRegion-based annotated_regions, mirroring the logic already present in the non-streaming completed-response path. The existing test for unknown annotation types was updated to use a genuinely unknown type instead of url_citation, and new tests verify both the happy path and the missing-URL guard.
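The branch described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: `parse_url_citation` and `_get_value` are hypothetical names, plain dicts stand in for the framework's `Annotation` and `TextSpanRegion` types, and the `"text_span"` type string is an assumption.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def _get_value(annotation: Any, key: str) -> Any:
    """Read a field from either a dict-style or an attribute-style annotation."""
    if isinstance(annotation, dict):
        return annotation.get(key)
    return getattr(annotation, key, None)


def parse_url_citation(annotation: Any) -> Optional[Dict[str, Any]]:
    """Return a citation dict for a url_citation annotation, otherwise None.

    Hypothetical stand-in: the real handler builds framework objects,
    this sketch only mirrors the branching described in the PR.
    """
    if _get_value(annotation, "type") != "url_citation":
        return None  # other annotation types are handled elsewhere

    url = _get_value(annotation, "url")
    if not url:
        return None  # missing-URL guard: nothing useful to emit

    citation: Dict[str, Any] = {
        "type": "citation",
        "title": _get_value(annotation, "title") or "",
        "url": str(url),
    }

    start = _get_value(annotation, "start_index")
    end = _get_value(annotation, "end_index")
    if start is not None and end is not None:
        # Region is only attached when both indices are present.
        citation["annotated_regions"] = [
            {"type": "text_span", "start_index": start, "end_index": end}
        ]
    return citation


# Usage: dict-style and object-style annotations are both accepted.
as_dict = {"type": "url_citation", "url": "https://example.com/doc", "title": "Doc",
           "start_index": 0, "end_index": 12}
as_object = SimpleNamespace(type="url_citation", url="https://example.com/doc", title=None)
print(parse_url_citation(as_dict))
print(parse_url_citation(as_object))
```

Returning `None` for both the unknown-type and the missing-URL cases keeps the caller's logic simple in this sketch; whether the real handler logs or emits anything in those cases is described only in the PR text above.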

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Note: PR autogenerated by giles17's agent

Add url_citation branch to the streaming annotation handler in
_parse_chunk_from_openai, mirroring the existing non-streaming path.
The handler creates an Annotation with type='citation', title, url,
and annotated_regions (TextSpanRegion), wrapped in Content.from_text.

Update test_streaming_annotation_added_with_unknown_type to use a
truly unknown type, and add new tests for url_citation (with and
without url).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 2, 2026 20:30
@giles17 giles17 self-assigned this Apr 2, 2026
Contributor Author

@giles17 giles17 left a comment

Automated Code Review

Reviewers: 4 | Confidence: 92%

✓ Correctness

The fix correctly adds url_citation handling to the streaming annotation path, mirroring the non-streaming path at line 1615. The implementation properly uses Content.from_text with an Annotation (instead of Content.from_hosted_file), defensively guards against missing URL, and conditionally builds annotated_regions. Tests are well-structured covering happy path, missing-URL edge case, and a properly updated unknown-type test. One minor inconsistency: the other three streaming annotation branches (file_path, file_citation, container_file_citation) all store annotation_index in additional_properties, but the new url_citation branch omits it. This is not a bug (the non-streaming path also omits it for url_citation), but it breaks the pattern established by the other streaming branches and could surprise consumers that rely on annotation_index for ordering.

✓ Security Reliability

The change adds url_citation handling in the streaming annotation path, mirroring the existing non-streaming path. The implementation is sound from a security/reliability perspective: the URL is treated as opaque data (stored, not dereferenced), the required url field is guarded with a truthiness check before processing (consistent with how other annotation types guard on ann_file_id), and str(ann_url) ensures type safety. The _get_ann_value helper already handles both dict and object annotation formats safely. Test coverage is good, including the missing-URL edge case and the updated unknown-type test.

✗ Test Coverage

The test coverage for the new url_citation streaming branch is good overall: there's a happy-path test with full assertions on type, title, url, and annotated_regions, a test for the missing-url guard, and the unknown-type test was properly updated. However, there is one meaningful uncovered branch: the code conditionally omits annotated_regions when start_index or end_index is None (line 2429), but no test exercises that path. A test for url_citation with a URL but no start/end indices would close this gap.

✗ Design Approach

The fix correctly adds url_citation handling to the streaming annotation path, mirroring the non-streaming path's logic. The approach is sound and the two new tests are well-structured. One design inconsistency stands out: every other streaming annotation type (file_path, file_citation, container_file_citation) stores annotation_index from the event in additional_properties, but the new url_citation branch omits it entirely. This breaks parity with the established streaming-path pattern and means consumers who rely on annotation_index to correlate annotations with their position in the streamed text will not receive it for url_citation annotations — even though the data is available on the same event object.

Flagged Issues

  • The url_citation streaming handler omits "annotation_index": event.annotation_index in additional_properties, unlike every other annotation type in the same streaming handler (file_path, file_citation, container_file_citation). This breaks parity with the established pattern and means consumers relying on annotation_index to order or correlate streaming annotations will silently get no value for url_citation annotations.
  • Missing test for the url_citation branch where url is present but start_index/end_index are absent. The conditional on line 2429 (if ann_start is not None and ann_end is not None) is never exercised by the current tests, leaving the no-region annotation path uncovered.

Suggestions

  • The non-streaming path (lines 1615–1630) unconditionally accesses annotation.url and annotation.start_index/annotation.end_index without None guards, while the streaming path defensively checks them. Consider adding similar guards to the non-streaming path for consistency, though that is outside this diff's scope.
  • Consider adding a test where title is None to verify that the or "" fallback on line 2425 produces an empty-string title.
  • Once annotation_index is added to the implementation, the test test_streaming_annotation_added_with_url_citation should assert that annotation_index is present in additional_properties to lock in the behaviour.
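The flagged issues and suggestions above translate into three small tests. The sketch below is written against a hypothetical stand-in handler (`handle_annotation_added`) and a hypothetical event factory (`make_event`), since the real tests depend on the framework's client and event types; it assumes the annotation payload is a dict and that `annotation_index` lives on the event.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def handle_annotation_added(event: Any) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in for the streaming url_citation branch."""
    ann = event.annotation  # assumed to be a dict in this sketch
    if ann.get("type") != "url_citation" or not ann.get("url"):
        return None
    result: Dict[str, Any] = {
        "title": ann.get("title") or "",
        "url": str(ann["url"]),
        # Streaming-specific metadata taken from the event, not the payload.
        "additional_properties": {"annotation_index": event.annotation_index},
    }
    start, end = ann.get("start_index"), ann.get("end_index")
    if start is not None and end is not None:
        result["annotated_regions"] = [(start, end)]
    return result


def make_event(annotation_index: int = 0, **fields: Any) -> SimpleNamespace:
    """Build a minimal fake annotation.added event for a url_citation."""
    return SimpleNamespace(
        annotation={"type": "url_citation", **fields},
        annotation_index=annotation_index,
    )


def test_no_indices_omits_regions() -> None:
    parsed = handle_annotation_added(make_event(url="https://example.com/doc"))
    assert parsed is not None
    assert "annotated_regions" not in parsed  # absent, not an empty list


def test_missing_title_falls_back_to_empty_string() -> None:
    parsed = handle_annotation_added(make_event(url="https://example.com/doc", title=None))
    assert parsed is not None
    assert parsed["title"] == ""


def test_annotation_index_is_preserved() -> None:
    parsed = handle_annotation_added(make_event(annotation_index=3, url="https://example.com/doc"))
    assert parsed is not None
    assert parsed["additional_properties"] == {"annotation_index": 3}
```

Run under pytest, each function exercises one of the paths the reviewers called out: the no-region path, the empty-title fallback, and the presence of `annotation_index`.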

Automated review by giles17's agents

Comment thread python/packages/openai/agent_framework_openai/_chat_client.py
Comment thread python/packages/openai/tests/openai/test_openai_chat_client.py
@markwallace-microsoft
Contributor

markwallace-microsoft commented Apr 2, 2026

Python Test Coverage

Python Test Coverage Report

File | Stmts | Miss | Cover | Missing
packages/openai/agent_framework_openai | | | |
   _chat_client.py | 879 | 123 | 86% | 522–525, 529–530, 536–537, 547–548, 555, 570–576, 597, 605, 628, 746, 845, 904, 906, 908, 910, 976, 990, 1070, 1080, 1085, 1128, 1244, 1425, 1430, 1434–1436, 1440–1441, 1507, 1536, 1542, 1552, 1558, 1563, 1569, 1574–1575, 1636, 1658–1659, 1674–1675, 1693–1694, 1737, 1900, 1938–1939, 1955, 1957, 2036–2044, 2074, 2181, 2216, 2231, 2251–2261, 2274, 2285–2289, 2303, 2317–2328, 2337, 2369–2372, 2380–2381, 2383–2385, 2399–2401, 2411–2412, 2418, 2433
TOTAL | 27679 | 3200 | 88% |

Python Unit Test Overview

Tests Skipped Failures Errors Time
5596 20 💤 0 ❌ 0 🔥 1m 31s ⏱️

Contributor

Copilot AI left a comment

Pull request overview

Fixes streaming support for OpenAI Responses API response.output_text.annotation.added events so url_citation annotations (e.g., SharePoint grounding citations) are surfaced to consumers instead of being silently dropped.

Changes:

  • Add a url_citation branch in the streaming annotation handler to emit a text Content containing a citation Annotation (with URL + optional TextSpanRegion).
  • Add unit tests covering the url_citation happy path and the missing-URL guard, and update the “unknown type” test to use a truly unknown annotation type.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
python/packages/openai/agent_framework_openai/_chat_client.py | Parses streaming url_citation annotations into citation annotations on text content updates.
python/packages/openai/tests/openai/test_openai_chat_client.py | Adds/adjusts tests to validate streaming url_citation handling and unknown-type behavior.

Comment thread python/packages/openai/agent_framework_openai/_chat_client.py Outdated
…on annotations silently dropped in Foundry streaming (SharePoint grounding citations lost)
Contributor Author

@giles17 giles17 left a comment

Automated Code Review

Reviewers: 4 | Confidence: 92%

✓ Correctness

The diff correctly adds additional_properties and changes raw_representation from event to annotation (i.e., event.annotation) in the streaming url_citation handler. This aligns with the non-streaming path (line 1628) which also uses raw_representation=annotation. The new tests properly cover the happy path with indices, the no-URL edge case, the no-indices edge case, and the unknown-type fallback. No correctness issues found.

✓ Security Reliability

The diff correctly fixes the streaming url_citation handler to store raw_representation=annotation (matching the non-streaming path at line 1628) instead of raw_representation=event, and adds additional_properties with the annotation_index. The change is consistent with the existing non-streaming code path. Tests cover the happy path (with indices), missing-url edge case, missing-indices edge case, and unknown-type fallback. No security or reliability concerns found.

✓ Test Coverage

The diff adds two new test assertions to the existing test_streaming_annotation_added_with_url_citation test (checking additional_properties and raw_representation) and introduces a new test_streaming_annotation_added_with_url_citation_no_indices test covering the edge case where start_index/end_index are absent. The test coverage is thorough: the happy path with indices, missing URL (ignored), missing indices (no annotated_regions), and unknown type (ignored) are all covered. The new assertions are meaningful and verify the behavioral change in the production code (raw_representation now stores the annotation dict instead of the event, and additional_properties includes annotation_index). The no_indices test correctly verifies that annotated_regions is absent rather than empty, matching the conditional logic in production code. No issues found.

✓ Design Approach

The diff is a small, focused fix: it corrects raw_representation on the Annotation object from event (the streaming event) to annotation (i.e., event.annotation, the annotation payload), making the streaming path consistent with the non-streaming path at line 1628. It also adds additional_properties={'annotation_index': event.annotation_index} for streaming-specific metadata, and adds a well-targeted test for the no-indices edge case. The design approach is correct and consistent with the existing codebase conventions. No blocking issues were found.

Suggestions

  • Consider adding a test assertion for raw_representation on the Content object itself (i.e., response.contents[0].raw_representation == event) in test_streaming_annotation_added_with_url_citation, since Content.from_text passes raw_representation=event while the Annotation now gets raw_representation=annotation. This would verify both levels are set correctly.
  • For consistency with the non-streaming path (line 1628), consider whether the annotated_regions conditional guard (if ann_start is not None and ann_end is not None) is truly needed — the non-streaming path accesses those fields unconditionally, suggesting they are always present for url_citation. Keeping the guard is harmless but may silently produce an incomplete Annotation if the API omits them unexpectedly.
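The first suggestion above concerns two different `raw_representation` values: the outer content keeps the whole event, while the nested annotation keeps only the annotation payload. A rough sketch of that shape, using hypothetical dataclasses (`SketchContent`, `SketchAnnotation`) in place of the framework's `Content` and `Annotation`, and assuming an empty text body for the emitted content:

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any, List


@dataclass
class SketchAnnotation:
    """Hypothetical stand-in for the framework's Annotation."""
    url: str
    raw_representation: Any = None


@dataclass
class SketchContent:
    """Hypothetical stand-in for the framework's text Content."""
    text: str
    annotations: List[SketchAnnotation] = field(default_factory=list)
    raw_representation: Any = None


def build_content(event: Any) -> SketchContent:
    """Attach the payload to the annotation and the full event to the content."""
    annotation = event.annotation
    return SketchContent(
        text="",  # assumption: the citation update carries no new text
        annotations=[
            SketchAnnotation(url=str(annotation["url"]), raw_representation=annotation)
        ],
        raw_representation=event,
    )


event = SimpleNamespace(
    annotation={"type": "url_citation", "url": "https://example.com/doc"},
    annotation_index=0,
)
content = build_content(event)

# Both levels can be asserted independently, as the suggestion proposes.
assert content.raw_representation is event
assert content.annotations[0].raw_representation is event.annotation
```

Asserting both levels guards against a regression in which one of them is swapped for the other, which a single-level assertion would not catch.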

Automated review by giles17's agents

@giles17 giles17 changed the title Python: Handle url_citation annotations in streaming responses Python: Handle url_citation annotations in FoundryChatClient streaming responses Apr 3, 2026
@giles17 giles17 enabled auto-merge April 14, 2026 05:52
Comment thread python/packages/openai/agent_framework_openai/_chat_client.py
@giles17 giles17 added this pull request to the merge queue Apr 16, 2026
Merged via the queue into microsoft:main with commit 435c66e Apr 16, 2026
31 checks passed

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: [Bug]: url_citation annotations silently dropped in Foundry streaming (SharePoint grounding citations lost)

5 participants