[Content Understanding] Update to_llm_input page markers and filter telemetry warnings by chienyuanchang · Pull Request #47326 · Azure/azure-sdk-for-python

chienyuanchang · 2026-06-03T20:36:22Z

Description

Updates the azure-ai-contentunderstanding to_llm_input() helper to align its rendered output with the upcoming service page-marker format and to remove non-user-facing telemetry from RAI warning output.

Changes made:

Updated SDK-injected document page markers to the new InputPageNumber HTML-comment format.
Added duplicate-marker defense: if service markdown already contains <!-- InputPageNumber:, to_llm_input() does not inject additional page markers.
Filtered service-emitted internal telemetry warnings whose message starts with LLMStats: from the rendered rai_warnings front matter.
Preserved LLMStats: text when it appears in the document markdown body; only structured warnings are filtered.
Updated unit tests and sample tests for the new marker format and warning-filter behavior.
Updated CHANGELOG.md.
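
The duplicate-marker defense described above could be sketched roughly as follows. This is an illustration only, not the code in _helpers.py: the function name render_pages, its parameters, and the exact marker layout after the `<!-- InputPageNumber:` prefix (spacing and the closing `-->`) are assumptions.

```python
from typing import List

# Prefix quoted in the PR description. Everything after it in the marker
# (the space, the page number position, the closing "-->") is assumed here.
MARKER_PREFIX = "<!-- InputPageNumber:"


def render_pages(page_markdowns: List[str], start_page: int = 1) -> str:
    """Hypothetical helper: join per-page markdown, adding page markers
    only when the service markdown does not already contain them."""
    joined = "\n\n".join(page_markdowns)
    if MARKER_PREFIX in joined:
        # Duplicate-marker defense: the service already emitted markers,
        # so return the markdown without injecting a second set.
        return joined
    parts = []
    for offset, body in enumerate(page_markdowns):
        parts.append(f"{MARKER_PREFIX} {start_page + offset} -->\n{body}")
    return "\n\n".join(parts)
```

Checking for the prefix rather than a full marker keeps the guard independent of how the service formats the page number itself.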

Relevant issues / context:

Agent Framework feedback that prompted the LLMStats: filtering: "Python: Adopt azure-ai-contentunderstanding to_llm_input in CU context provider" (microsoft/agent-framework#5796)

This PR is not based on regenerated SDK code from a new API spec.

All SDK Contribution checklist:

The pull request does not introduce [breaking changes]
- No public API signatures are changed. This only changes rendered text produced by the preview to_llm_input() helper.
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which has an informative message.

Testing Guidelines

Pull request includes test coverage for the included changes.

Testing performed:

cd sdk/contentunderstanding/azure-ai-contentunderstanding

.venv/bin/python -m pytest tests/test_to_llm_input.py -q
# 84 passed

AZURE_TEST_RUN_LIVE=true .venv/bin/python -m pytest tests/samples/test_sample_to_llm_input.py::TestSampleToLlmInput::test_to_llm_input_multi_page_content_range -q -s
# 1 passed

Copilot

Pull request overview

This PR updates the azure-ai-contentunderstanding to_llm_input() helper output format to align with an upcoming service page-marker convention and to suppress non-user-facing telemetry warnings from the rendered rai_warnings YAML front matter.

Changes:

Switched SDK-injected page markers to the new InputPageNumber format, and avoided injecting markers when the service markdown already includes InputPageNumber markers.
Filtered service warning messages that begin with LLMStats: (after leading whitespace) from the rendered rai_warnings block.
Updated unit tests and sample tests to validate the new marker format and warning filtering, and bumped package version/changelog.
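
The warning filter summarized above might look something like this minimal sketch. The function name filter_rai_warnings, the dict shape with "code" and "message" keys, and the sample warning codes are all hypothetical; only the LLMStats: prefix and the leading-whitespace behavior come from the PR text.

```python
from typing import Dict, List

# Prefix named in the PR; matched after stripping leading whitespace.
TELEMETRY_PREFIX = "LLMStats:"


def filter_rai_warnings(warnings: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Hypothetical sketch: drop structured warnings whose message
    starts with LLMStats: and keep every other warning unchanged."""
    kept = []
    for warning in warnings:
        message = warning.get("message") or ""
        if message.lstrip().startswith(TELEMETRY_PREFIX):
            continue  # internal telemetry, not user-facing
        kept.append(warning)
    return kept


# Assumed sample data; the codes and messages are made up for illustration.
sample = [
    {"code": "ContentFiltered", "message": "Text was filtered."},
    {"code": "Info", "message": "  LLMStats: tokens=123"},
    {"code": "Note", "message": "Mentions LLMStats: mid-sentence"},
]
visible = filter_rai_warnings(sample)
```

Because the check is a prefix match, a warning that merely mentions LLMStats: later in its message is kept. Document body text never passes through this filter, so it is untouched.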

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/test_to_llm_input.py | Updates assertions for the new `InputPageNumber` marker format and adds coverage for `LLMStats:` warning filtering and duplicate-marker defense. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/samples/test_sample_to_llm_input.py | Updates sample test expectations to the new page marker format. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/samples/test_sample_to_llm_input_async.py | Updates async sample test expectations to the new page marker format. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/README.md | Adds `1.2.0b2` to the SDK-to-service-version compatibility table. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/CHANGELOG.md | Adds an unreleased `1.2.0b2` entry documenting the marker change and telemetry-warning filtering. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_version.py | Bumps the package version to `1.2.0b2`. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_helpers.py | Implements `InputPageNumber` marker injection and duplicate-marker bypass, and filters `LLMStats:` entries from rendered RAI warnings. |

first version

d46eef2

github-actions Bot added the Cognitive - Content Understanding label Jun 3, 2026

chienyuanchang added 2 commits June 3, 2026 13:41

update version

8f725dd

Merge branch 'main' into cu-sdk/llm-input-helper-update

a637834

chienyuanchang marked this pull request as ready for review June 3, 2026 20:42

Copilot AI review requested due to automatic review settings June 3, 2026 20:42

chienyuanchang requested review from bojunehsu, changjian-wang and yungshinlintw as code owners June 3, 2026 20:42

Copilot started reviewing on behalf of chienyuanchang June 3, 2026 20:43

Copilot AI reviewed Jun 3, 2026

chienyuanchang wants to merge 3 commits into main from cu-sdk/llm-input-helper-update
