Fix LangGraph content-block serialization in gen_ai.output.messages (#189) #193

Merged
JacksonWeber merged 4 commits into microsoft:main from JacksonWeber:jacksonweber/fix-189-langgraph-content-blocks
Jun 8, 2026

JacksonWeber (Contributor) commented Jun 8, 2026

Summary

Fixes #189.

LangChain/LangGraph AIMessage.content may be either a plain string or a list of content-block dicts (e.g. [{"type": "text", "text": "...", "phase": "final_answer", "id": "msg_..."}]). The previous _langchain_content helper called str(c) on this value, producing a Python-repr blob with single quotes and leaked phase/index/id keys inside what the GenAI semconv requires to be a plain TextPart.content string. This broke Foundry's cloud trace evaluators (builtin.coherence, builtin.fluency, etc.) which read the assistant text out of gen_ai.output.messages[*].parts[*].content.
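
The failure mode described above can be reproduced without LangChain installed. The snippet below is a minimal illustration, not the library's code: the `content` value is a made-up block list shaped like the example in the summary, and `is_valid_json` is a hypothetical helper used only to show that the repr string is not something a downstream evaluator can parse as JSON.

```python
import json

# Hypothetical content-block list, shaped like the example in the summary.
content = [{"type": "text", "text": "Hello", "phase": "final_answer", "id": "msg_1"}]

# Calling str() on the list yields a Python repr: single quotes, metadata keys included.
as_repr = str(content)
print(as_repr)
# [{'type': 'text', 'text': 'Hello', 'phase': 'final_answer', 'id': 'msg_1'}]


def is_valid_json(s: str) -> bool:
    """Hypothetical helper: report whether a string parses as JSON."""
    try:
        json.loads(s)
    except json.JSONDecodeError:
        return False
    return True


# The repr blob is neither the assistant's text nor valid JSON.
print(is_valid_json(as_repr))              # False
print(is_valid_json(json.dumps(content)))  # True
```

A consumer that expects the assistant's text in `content` therefore receives the block structure and its metadata instead of the string `Hello`.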

Verification

Serialized a real LangGraph-shaped output and validated against the upstream gen-ai-output-messages.json schema:

[
  {
    "role": "assistant",
    "parts": [
      { "type": "text", "content": "# One-Day Food Walk in Vancouver\n## Assumptions" },
      { "type": "tool_call", "id": "tool_1", "name": "search", "arguments": "{\"q\":\"food\"}" }
    ],
    "finish_reason": "stop"
  }
]

All spec constraints satisfied: role ∈ allowed enum, TextPart.content is a plain string with no repr blob or leaked metadata, ToolCallRequestPart has const type: "tool_call" + required name, finish_reason in allowed enum.
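
For a quick local sanity check of the two part-level constraints named above, something like the following could be used. It is only a sketch: `check_output_message` is a hypothetical function, it does not load or apply the real gen-ai-output-messages.json schema, and it deliberately skips the role and finish_reason enums rather than guess their allowed values.

```python
from typing import Any


def check_output_message(message: dict[str, Any]) -> list[str]:
    """Hypothetical structural check; NOT a substitute for schema validation."""
    problems = []
    for i, part in enumerate(message.get("parts", [])):
        kind = part.get("type")
        if kind == "text":
            text = part.get("content")
            if not isinstance(text, str):
                problems.append(f"parts[{i}]: text content is not a string")
            elif text.lstrip().startswith("[{'"):
                # Heuristic only: looks like a stringified Python list of dicts.
                problems.append(f"parts[{i}]: text content looks like a Python repr")
        elif kind == "tool_call":
            name = part.get("name")
            if not isinstance(name, str) or not name:
                problems.append(f"parts[{i}]: tool_call is missing a name")
    return problems


# The message from the verification example above.
message = {
    "role": "assistant",
    "parts": [
        {"type": "text", "content": "# One-Day Food Walk in Vancouver\n## Assumptions"},
        {"type": "tool_call", "id": "tool_1", "name": "search", "arguments": "{\"q\":\"food\"}"},
    ],
    "finish_reason": "stop",
}
print(check_output_message(message))  # []
```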

tests/langchain/test_utils.py: 128/128 pass (was 125; +3 new).

…icrosoft#189)

LangChain/LangGraph AIMessage.content may be a list of content-block dicts (e.g. [{'type': 'text', 'text': '...', 'phase': 'final_answer', 'id': '...'}]). The previous _langchain_content helper called str(c) on this value, producing a Python-repr blob with single quotes and leaked phase/index/id keys inside what the GenAI semconv requires to be a plain TextPart.content string.

Changes:

- New _flatten_lc_content_blocks helper concatenates the text of every type=='text' block (joined with newline) into a plain string.

- _langchain_content now delegates to that helper for both BaseMessage and dict-shaped messages.

- _langchain_tool_calls additionally harvests {'type': 'tool_use', ...} entries embedded in list-shaped content as ToolCallRequest parts so they surface as spec-typed parts instead of being dropped.

- Three regression tests covering the exact issue shape, multi-block text join, and embedded tool_use harvest.

- CHANGELOG entry under Unreleased.
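
The flattening behavior described in the first bullet can be sketched roughly as follows. This is not the code from the PR: `flatten_text_blocks` is a hypothetical stand-in for `_flatten_lc_content_blocks`, and its handling of bare strings inside the list and of non-list, non-string values is an assumption.

```python
from typing import Any


def flatten_text_blocks(content: Any) -> str:
    """Hypothetical stand-in for the PR's helper: join the text of type == 'text' blocks."""
    if isinstance(content, str):
        return content  # plain-string content passes through unchanged
    if not isinstance(content, list):
        # Assumption: anything else falls back to str(), with None mapped to "".
        return "" if content is None else str(content)
    texts = []
    for block in content:
        if isinstance(block, str):
            # Assumption: bare strings inside the list are kept as text.
            texts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            text = block.get("text")
            if isinstance(text, str):
                texts.append(text)  # only the text survives; phase/index/id are dropped
    return "\n".join(texts)


blocks = [
    {"type": "text", "text": "First", "phase": "final_answer", "id": "msg_1"},
    {"type": "tool_use", "id": "tool_1", "name": "search", "input": {"q": "food"}},
    {"type": "text", "text": "Second"},
]
print(repr(flatten_text_blocks(blocks)))  # 'First\nSecond'
```

Non-text blocks are skipped here rather than stringified, which is what keeps metadata and tool-call structure out of the text part.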

Verified the serialized output against the upstream gen-ai-output-messages.json schema.

Fixes microsoft#189

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 8, 2026 17:15

github-actions Bot commented Jun 8, 2026

Performance comparison

Threshold: regressions >15.0% on gating scenarios fail the build. Higher ops/s is better; positive Δ means the PR is slower.

| Scenario | Gating | Baseline (ops/s) | Candidate (ops/s) | Δ % | Status |
| --- | --- | --- | --- | --- | --- |
| azure_monitor_log | yes | 25,905.4 | 26,300.6 | -1.50% | |
| azure_monitor_span | yes | 154,035.7 | 155,472.6 | -0.92% | |
| otel_log | no | 31,259.8 | 32,460.2 | -3.70% | |
| otel_span | no | 34,042.6 | 34,112.2 | -0.20% | |
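
The Δ values in the table are numerically consistent with a relative change in time per operation, that is, baseline ops/s divided by candidate ops/s, minus one. That formula is inferred from the four rows, not taken from the benchmark script, so the sketch below should be read as an assumption; `delta_percent` and `fails_gate` are hypothetical names.

```python
def delta_percent(baseline_ops: float, candidate_ops: float) -> float:
    """Relative change in time per operation, in percent (positive means slower).

    Assumed formula, inferred from the table: (baseline / candidate - 1) * 100.
    """
    return (baseline_ops / candidate_ops - 1.0) * 100.0


def fails_gate(delta: float, gating: bool, threshold: float = 15.0) -> bool:
    """A gating scenario fails when its regression exceeds the threshold."""
    return gating and delta > threshold


rows = [
    ("azure_monitor_log", True, 25905.4, 26300.6),
    ("azure_monitor_span", True, 154035.7, 155472.6),
    ("otel_log", False, 31259.8, 32460.2),
    ("otel_span", False, 34042.6, 34112.2),
]
for name, gating, baseline, candidate in rows:
    delta = delta_percent(baseline, candidate)
    print(f"{name}: {delta:.2f}% fails={fails_gate(delta, gating)}")
```

All four deltas are negative, so the candidate is slightly faster in every scenario and none comes near the 15.0% gate.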

Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes LangChain/LangGraph AIMessage.content serialization for GenAI semantic conventions by normalizing list-shaped “content blocks” into spec-compliant output message parts, preventing Python repr blobs from being emitted in gen_ai.output.messages.

Changes:

  • Add _flatten_lc_content_blocks() to extract/concatenate text from LangGraph-style content-block lists for TextPart.content.
  • Extend tool-call extraction to also harvest tool_use blocks embedded inside list-shaped content.
  • Add regression tests covering content-block flattening, multi-block joining, and embedded tool_use handling; document the fix in CHANGELOG.md.
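
The tool_use harvesting mentioned in the second bullet might look roughly like this. It is a sketch under stated assumptions, not the PR's implementation: `harvest_tool_use_parts` is a hypothetical name, the `input` key on the block is assumed (Anthropic-style tool_use blocks), and serializing the arguments as a compact JSON string mirrors the verification example rather than a confirmed detail of the code.

```python
import json
from typing import Any


def harvest_tool_use_parts(content: Any) -> list[dict[str, Any]]:
    """Hypothetical sketch: turn embedded tool_use blocks into tool_call parts."""
    parts: list[dict[str, Any]] = []
    if not isinstance(content, list):
        return parts  # plain-string content carries no embedded blocks
    for block in content:
        if isinstance(block, dict) and block.get("type") == "tool_use":
            parts.append(
                {
                    "type": "tool_call",
                    "id": block.get("id"),
                    "name": block.get("name"),
                    # Assumption: arguments live under "input" and are emitted as compact JSON.
                    "arguments": json.dumps(block.get("input", {}), separators=(",", ":")),
                }
            )
    return parts


content = [
    {"type": "text", "text": "Looking that up."},
    {"type": "tool_use", "id": "tool_1", "name": "search", "input": {"q": "food"}},
]
print(harvest_tool_use_parts(content))
```

Text blocks are ignored here, since the flattening helper handles them separately.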

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/langchain/test_utils.py | Adds regression tests to ensure LangGraph content blocks serialize to spec-compliant text/tool-call parts. |
| src/microsoft/opentelemetry/_genai/_langchain/_utils.py | Implements content-block flattening and harvesting of embedded tool_use blocks into tool-call parts. |
| CHANGELOG.md | Notes the fix under “Unreleased”. |


Comment thread src/microsoft/opentelemetry/_genai/_langchain/_utils.py
Comment thread CHANGELOG.md Outdated
…ELOG heading level

- _langchain_tool_calls now copies raw_calls before appending harvested tool_use blocks, so it never mutates BaseMessage.tool_calls or additional_kwargs['tool_calls'].

- New test_extraction_does_not_mutate_input_message guards against regressions.

- CHANGELOG Unreleased heading switched from ## to # to match the surrounding release headings.
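
The copy-before-append fix in the first bullet addresses a common aliasing bug: appending to a list that the message object still owns. A minimal sketch of the pattern, using the hypothetical name `combine_tool_calls` rather than the PR's actual code:

```python
from typing import Any, Optional


def combine_tool_calls(
    existing: Optional[list[dict[str, Any]]],
    harvested: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Hypothetical sketch: merge harvested calls without touching the caller's list."""
    combined = list(existing or [])  # shallow copy, so the input list is never appended to
    combined.extend(harvested)
    return combined


message_tool_calls = [{"id": "call_1", "name": "lookup"}]
harvested = [{"id": "tool_1", "name": "search"}]

merged = combine_tool_calls(message_tool_calls, harvested)
print(len(merged))              # 2
print(len(message_tool_calls))  # 1, the original list is unchanged
```

The copy is shallow, so the dicts themselves are still shared; that is enough to keep the original list's length and order intact, which is the property the new test is described as guarding.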

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread CHANGELOG.md Outdated
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Comment thread src/microsoft/opentelemetry/_genai/_langchain/_utils.py Outdated
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@JacksonWeber JacksonWeber requested a review from rads-1996 June 8, 2026 18:25

rads-1996 (Member) left a comment

LGTM

@JacksonWeber JacksonWeber merged commit 1eb2308 into microsoft:main Jun 8, 2026
11 checks passed
Development

Successfully merging this pull request may close these issues.

LangGraph hosted agent: gen_ai.output.messages text part content is a stringified Python list, not GenAI-spec text
