[Content Understanding] Update to_llm_input page markers and filter telemetry warnings #47326

Open
chienyuanchang wants to merge 3 commits into main from cu-sdk/llm-input-helper-update

@chienyuanchang
Member

Description

Updates the azure-ai-contentunderstanding to_llm_input() helper to align its rendered output with the upcoming service page-marker format and to remove non-user-facing telemetry from RAI warning output.

Changes made:

  • Updated SDK-injected document page markers from <!-- page N --> to <!-- InputPageNumber: N -->.
  • Added duplicate-marker defense: if service markdown already contains <!-- InputPageNumber:, to_llm_input() does not inject additional page markers.
  • Filtered service-emitted internal telemetry warnings whose message starts with LLMStats: from the rendered rai_warnings front matter.
  • Preserved LLMStats: text when it appears in the document markdown body; only structured warnings are filtered.
  • Updated unit tests and sample tests for the new marker format and warning-filter behavior.
  • Updated CHANGELOG.md.
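
For reviewers, the marker injection and duplicate-marker defense described above can be sketched roughly as follows. This is an illustration only, not the code in _helpers.py: the function name `join_pages_with_markers`, the constant `PAGE_MARKER_PREFIX`, and the assumption that the markdown is already split into one string per page are all hypothetical.

```python
from typing import List

# Hypothetical constant; the real helper may detect markers differently.
PAGE_MARKER_PREFIX = "<!-- InputPageNumber:"


def join_pages_with_markers(pages: List[str], first_page_number: int = 1) -> str:
    """Join per-page markdown, prefixing each page with an InputPageNumber marker.

    Duplicate-marker defense: if any page already contains a service-emitted
    marker, the pages are joined unchanged and no markers are injected.
    """
    if any(PAGE_MARKER_PREFIX in page for page in pages):
        return "\n\n".join(pages)

    parts = []
    for offset, page in enumerate(pages):
        number = first_page_number + offset
        parts.append(f"<!-- InputPageNumber: {number} -->\n\n{page}")
    return "\n\n".join(parts)
```

In this sketch the check is all-or-nothing: a single existing marker anywhere in the input disables injection for every page, which avoids mixing SDK-injected and service-emitted markers in one output.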

Relevant issues / context:

This PR is not based on regenerated SDK code from a new API spec.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
    • No public API signatures are changed. This only changes rendered text produced by the preview to_llm_input() helper.
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Testing performed:

cd sdk/contentunderstanding/azure-ai-contentunderstanding

.venv/bin/python -m pytest tests/test_to_llm_input.py -q
# 84 passed

AZURE_TEST_RUN_LIVE=true .venv/bin/python -m pytest tests/samples/test_sample_to_llm_input.py::TestSampleToLlmInput::test_to_llm_input_multi_page_content_range -q -s
# 1 passed

@chienyuanchang chienyuanchang marked this pull request as ready for review June 3, 2026 20:42
Copilot AI review requested due to automatic review settings June 3, 2026 20:42
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the azure-ai-contentunderstanding to_llm_input() helper output format to align with an upcoming service page-marker convention and to suppress non-user-facing telemetry warnings from the rendered rai_warnings YAML front matter.

Changes:

  • Switched SDK-injected page markers from <!-- page N --> to <!-- InputPageNumber: N -->, and avoided injecting markers when the service markdown already includes InputPageNumber markers.
  • Filtered service warning messages that begin with LLMStats: (after leading whitespace) from the rendered rai_warnings block.
  • Updated unit tests and sample tests to validate the new marker format and warning filtering, and bumped package version/changelog.
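
The warning filter summarized above might look something like the sketch below. It assumes warnings have already been reduced to plain message strings, which is a simplification; `filter_rai_warnings` and `TELEMETRY_PREFIX` are hypothetical names and not taken from the SDK.

```python
from typing import Iterable, List

# Hypothetical constant naming the telemetry prefix mentioned in the PR.
TELEMETRY_PREFIX = "LLMStats:"


def filter_rai_warnings(messages: Iterable[str]) -> List[str]:
    """Drop messages that start with LLMStats: once leading whitespace is ignored.

    Messages that only mention LLMStats: later in the text are kept.
    """
    return [
        message
        for message in messages
        if not message.lstrip().startswith(TELEMETRY_PREFIX)
    ]
```

Because the filter only looks at the start of each structured warning message, it never touches the document markdown body, which matches the behavior the PR describes for LLMStats: text appearing in body content.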

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/test_to_llm_input.py | Updates assertions for the new InputPageNumber marker format and adds coverage for LLMStats: warning filtering and duplicate-marker defense. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/samples/test_sample_to_llm_input.py | Updates sample test expectations to the new page marker format. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/samples/test_sample_to_llm_input_async.py | Updates async sample test expectations to the new page marker format. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/README.md | Adds 1.2.0b2 to the SDK-to-service-version compatibility table. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/CHANGELOG.md | Adds an unreleased 1.2.0b2 entry documenting the marker change and telemetry-warning filtering. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_version.py | Bumps the package version to 1.2.0b2. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_helpers.py | Implements InputPageNumber marker injection and duplicate-marker bypass, and filters LLMStats: entries from rendered RAI warnings. |
