fix(traces-observer): render gen_ai parts[] tool calls and tool responses#884

Merged
nadheesh merged 1 commit into
wso2:mainfrom
RAVEENSR:fix/observer-otel-message-parts
May 16, 2026

Conversation

@RAVEENSR
Member

@RAVEENSR RAVEENSR commented May 15, 2026

Related to #840

What

Per-span Input Messages / Output Messages bubbles in the Console rendered empty (role chip only, no body) for any LLM/agent span emitted by opentelemetry-instrumentation-openai 0.60.0 or opentelemetry-instrumentation-openai-agents — i.e. anything using the OpenAI Agents SDK, or pure-OTel OpenAI auto-instrumentation. The data was present on the span (gen_ai.input.messages / gen_ai.output.messages were populated), so this looked like a UI bug, but the data was being dropped server-side in parseOTELMessage.

Concretely, those instrumentors emit each message with bodies nested under parts: [{type, …}] instead of a top-level content/toolCalls field:

[
  {"role": "user",      "parts": [{"type": "text", "content": "search flights ZRH → LHR"}]},
  {"role": "assistant", "parts": [{"type": "tool_call", "id": "...", "name": "search_flights", "arguments": {...}}]},
  {"role": "tool",      "parts": [{"type": "tool_call_response", "id": "...", "response": "Error: ..."}]}
]

parseOTELMessage only read rawMsg["content"] / rawMsg["toolCalls"] at the top level, so role was captured but everything else was lost. Result: each turn became {role: "user|assistant|tool", content: ""}, which rendered as the empty bubbles people kept reporting.

Why this regressed

The pre-#181 implementation (introduced in #109) had a parts walker that handled text / tool_call / tool_call_response. #181 ("Add ballerina trace parsing") refactored parseOTELMessage to add recursive JSON parsing and stricter input handling — and in doing so inadvertently dropped the entire parts walker. The PR body never called out removing tool_call support, and no follow-up issue/PR re-introduced it. So this is a regression-restore, not new scope.

The user-visible breakage only surfaced recently because (a) LangChain-instrumented spans were emitting Traceloop-flat attributes (gen_ai.prompt.N.*) which extractTraceloopPromptMessages still handles, and (b) the bare openai.chat leaf span was the only one emitting the new parts[] shape, so a healthy LangChain trace tree masked the problem. Once wrapt 2.x broke the LangChain instrumentor (#881), the bare openai.chat became the only span emitted, and the empty-bubble bug became impossible to miss.

Changes

  • traces-observer-service/opensearch/process.go: parseOTELMessage now walks rawMsg["parts"] when present and switches on type:
    • "" | "text" → append content to a strings.Builder; written to msg.Content as a fallback when no top-level content was set.
    • "tool_call" → build a ToolCall{ID, Name, Arguments} and append to msg.ToolCalls. arguments is accepted as either a string or an object (object → json.Marshal).
    • "tool_call_response" → surface the response text into msg.Content for role="tool" messages. Accepts both the current OpenLLMetry field name response and the older result, since the field was renamed between versions and we want this to remain robust across the matrix.
    • Other types (image, audio, ...) are intentionally skipped.
  • Top-level content / toolCalls reads are unchanged and still run at higher priority, so old-shape spans (and the existing top-level test cases) render exactly the same.
  • No console changes needed — console/workspaces/pages/traces/src/subComponents/spanDetails/Overview.tsx already renders message.toolCalls[] as a card with name + monospaced arguments and message.content via MarkdownView when role is set.
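The branches listed above can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the code in process.go: `Message`, `ToolCall`, `str`, and `parseParts` are simplified hypothetical names, the top-level toolCalls read is omitted, and the real implementation may additionally check the role before surfacing a tool response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ToolCall and Message are simplified stand-ins for the service's types.
type ToolCall struct {
	ID        string
	Name      string
	Arguments string
}

type Message struct {
	Role      string
	Content   string
	ToolCalls []ToolCall
}

// str returns m[key] as a string, or "" when the key is absent or not a string.
func str(m map[string]any, key string) string {
	s, _ := m[key].(string)
	return s
}

// parseParts is a hypothetical sketch of the parts walker.
func parseParts(rawMsg map[string]any) Message {
	// Top-level content is read first and keeps priority.
	msg := Message{Role: str(rawMsg, "role"), Content: str(rawMsg, "content")}
	parts, ok := rawMsg["parts"].([]any)
	if !ok {
		return msg
	}
	var text strings.Builder
	for _, p := range parts {
		part, ok := p.(map[string]any)
		if !ok {
			continue
		}
		switch str(part, "type") {
		case "", "text":
			text.WriteString(str(part, "content"))
		case "tool_call":
			tc := ToolCall{ID: str(part, "id"), Name: str(part, "name")}
			switch args := part["arguments"].(type) {
			case string:
				tc.Arguments = args
			case nil:
				// No arguments emitted; leave the field empty.
			default:
				// Object form: serialize it back to a JSON string.
				if b, err := json.Marshal(args); err == nil {
					tc.Arguments = string(b)
				}
			}
			msg.ToolCalls = append(msg.ToolCalls, tc)
		case "tool_call_response":
			// Prefer the current field name, fall back to the older one.
			resp := str(part, "response")
			if resp == "" {
				resp = str(part, "result")
			}
			text.WriteString(resp)
		}
		// Any other type (image, audio, ...) falls through and is skipped.
	}
	if msg.Content == "" {
		msg.Content = text.String()
	}
	return msg
}

func main() {
	payload := `{"role":"assistant","parts":[
		{"type":"text","content":"Looking that up."},
		{"type":"image","content":"<binary>"},
		{"type":"tool_call","id":"call_1","name":"search_flights","arguments":{"to":"LHR","from":"ZRH"}}
	]}`
	var rawMsg map[string]any
	if err := json.Unmarshal([]byte(payload), &rawMsg); err != nil {
		panic(err)
	}
	msg := parseParts(rawMsg)
	fmt.Printf("content=%q toolCalls=%+v\n", msg.Content, msg.ToolCalls)
}
```

In this sketch the image part contributes nothing to the content, the text part becomes the message body, and the tool call is collected with its arguments re-encoded as a JSON string.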

Tests

go test ./... green in traces-observer-service. Added under TestExtractPromptMessages:

  • OTEL format with gen_ai parts[] — basic text-only parts.
  • OTEL format with gen_ai parts[] - skips non-text parts: a type:"image" part is correctly skipped instead of having its binary payload concatenated into text.
  • OTEL format with parts[] tool_call — assistant turn with only a tool_call part → empty Content, one ToolCalls entry with the right id/name/arguments (object form JSON-marshalled deterministically).
  • OTEL format with parts[] tool_call_response - 'response' field — current OpenLLMetry field name surfaces into Content.
  • OTEL format with parts[] tool_call_response - legacy 'result' field — older field name still works.
  • OTEL format with parts[] mixed text + tool_call — both Content and ToolCalls populated together.

Existing baseline cases (OTEL format gen_ai.input.messages, Traceloop format gen_ai.prompt.*) untouched and still pass.
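The "JSON-marshalled deterministically" point in the tool_call case relies on a documented property of Go's encoding/json: map keys are sorted when a map is marshalled, so an object-form arguments value always serializes to the same string. A minimal sketch, assuming a hypothetical helper named `normalizeArguments` (not a function from the PR):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeArguments is a hypothetical helper: strings pass through
// unchanged, nil becomes "", and any other value is JSON-encoded.
func normalizeArguments(raw any) (string, error) {
	switch v := raw.(type) {
	case nil:
		return "", nil
	case string:
		return v, nil
	default:
		b, err := json.Marshal(v)
		if err != nil {
			return "", err
		}
		return string(b), nil
	}
}

func main() {
	// Object form: encoding/json sorts map keys, so the output is stable
	// regardless of the order in which the keys were inserted.
	obj := map[string]any{"departure_airport": "Zurich", "arrival_airport": "London"}
	got, err := normalizeArguments(obj)
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // {"arrival_airport":"London","departure_airport":"Zurich"}

	// String form: returned as-is.
	s, _ := normalizeArguments(`{"query":"flights"}`)
	fmt.Println(s) // {"query":"flights"}
}
```

Because the output is stable, a test can compare the marshalled arguments against a literal string without normalizing key order first.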

Test plan

  • Unit: go test ./opensearch/... -run TestExtractPromptMessages -v — all 8 sub-tests green.
  • Local end-to-end: rebuilt amp-traces-observer:0.0.0-dev, loaded into k3d, rolled the deployment, sent a multi-tool-call chat to a Python LangGraph agent, and queried /api/v1/traces/<id>/spans/<id> directly. Returned conversation includes populated content for user/assistant-text/tool turns and populated toolCalls for assistant turns that invoked tools (raw observer response, abbreviated):
    [0] role=user       content='search flights from Zurich to London tomorrow'   toolCalls=0
    [1] role=assistant  content=''                                                  toolCalls=1
            -> search_flights {"arrival_airport":"London","departure_airport":"Zurich",...}
    [2] role=tool       content="Error: ValueError('Either DATABASE_URL ...')"      toolCalls=0
    [3] role=assistant  content='I encountered an error ...'                        toolCalls=1
            -> tavily_search_results_json {"query":"Swiss Airlines flights ..."}
    [4] role=tool       content="HTTPError('401 Client Error: Unauthorized ...')"  toolCalls=0
    
  • Browser: refreshed the Trace Details panel — assistant tool-call turns now render a card with the tool name + JSON arguments, tool response turns render the response text, and previously-broken Input/Output bubbles are no longer empty.
  • Regression: existing top-level-shape spans still render via the unchanged code path.

Release

No new instrumentation image release and no new amp-instrumentation PyPI release are needed — agents emit the data correctly, this is purely an observer rendering fix. The amp-traces-observer image is in release-config.json images and rebuilds on every product release, so the fix ships with the next AMP release of the observer chart.

…nses

opentelemetry-instrumentation-openai (0.60.0) and
opentelemetry-instrumentation-openai-agents emit
`gen_ai.input.messages` / `gen_ai.output.messages` with each message
nested under `parts: [{type, ...}]`, not as a top-level `content` field
on each message. parseOTELMessage only read the top-level form, so role
was captured but content/tool-calls were dropped, and the Console's
per-span Input Messages / Output Messages panel rendered empty bubbles
for any agent using the OpenAI Agents SDK or any pure-OTel OpenAI
instrumentation.

This is a regression that wso2#181 ("Add ballerina trace parsing") inadvertently
introduced when it refactored parseOTELMessage and dropped the `parts`
walker that wso2#109 had originally added. The refactor's PR body never
called out removing tool_call handling, and no follow-up issue/PR
re-introduced it.

Add a parts walker in parseOTELMessage that handles all three branches
the pre-wso2#181 code did:

- `type: "text"` → append `content` to msg.Content
- `type: "tool_call"` → append {id, name, arguments} to msg.ToolCalls
  (JSON-marshal arguments when emitted as an object)
- `type: "tool_call_response"` → surface the response text as msg.Content
  for role="tool" messages; accept both the current OpenLLMetry field
  name `response` and the older `result` for version-drift safety

Top-level content / toolCalls reads are preserved at higher priority, so
old-shape spans render unchanged. Console-side rendering needs no
change: Overview.tsx already renders `message.toolCalls[]` as a card
with `name` + monospaced `arguments`, and `message.content` as
Markdown when `role` is set.
@coderabbitai
Contributor

coderabbitai Bot commented May 15, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fac13d50-b5f6-403a-8337-f345f91c5f39

📥 Commits

Reviewing files that changed from the base of the PR and between b649880 and 925e40a.

📒 Files selected for processing (2)
  • traces-observer-service/opensearch/process.go
  • traces-observer-service/opensearch/process_test.go

📝 Walkthrough

This PR extends message parsing in the traces observer service to support OpenTelemetry GenAI messages using the newer parts array representation. The implementation assembles text content from parts, extracts tool-call structures, and handles legacy field names, with comprehensive regression test coverage.

Changes

OTEL GenAI Parts Array Message Parsing

Layer / File(s) | Summary
Parts array parsing in parseOTELMessage (traces-observer-service/opensearch/process.go) | parseOTELMessage detects and processes parts arrays in newer OTEL GenAI shapes: concatenates text parts, extracts tool-calls with JSON-marshaled arguments, handles tool-call-response content, and maintains backward compatibility with prior extraction logic.
Regression test coverage for parts array parsing (traces-observer-service/opensearch/process_test.go) | TestExtractPromptMessages gains sub-tests verifying text extraction from parts, non-text part skipping, tool-call population, and tool-call-response handling with legacy result field support.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A GenAI message found a fancy new dress,
With parts arrays bundled to handle with finesse,
Text, tools, and responses all dance in a row,
While tests stand as witness to all that we know!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title accurately describes the main change: adding support for rendering OTEL GenAI parts[] tool calls and tool responses in the observer service.
Description check | ✅ Passed | The description provides comprehensive context on the regression, root cause, implementation details, and testing, though it deviates from the template structure by combining sections into a narrative format rather than following discrete template sections.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



@nadheesh nadheesh merged commit e4d9e0e into wso2:main May 16, 2026
7 checks passed
RAVEENSR pushed a commit to RAVEENSR/agent-manager that referenced this pull request May 27, 2026
…arts

fix(traces-observer): render gen_ai parts[] tool calls and tool responses

3 participants