fix(traces-observer): render gen_ai parts[] tool calls and tool responses #884
opentelemetry-instrumentation-openai (0.60.0) and
opentelemetry-instrumentation-openai-agents emit
`gen_ai.input.messages` / `gen_ai.output.messages` with each message
nested under `parts: [{type, ...}]`, not as a top-level `content` field
on each message. parseOTELMessage only read the top-level form, so role
was captured but content/tool-calls were dropped, and the Console's
per-span Input Messages / Output Messages panel rendered empty bubbles
for any agent using the OpenAI Agents SDK or any pure-OTel OpenAI
instrumentation.
This is a regression wso2#181 ("Add ballerina trace parsing") inadvertently
introduced when it refactored parseOTELMessage and dropped the `parts`
walker that wso2#109 had originally added. The refactor's PR body never
called out removing tool_call handling, and no follow-up issue/PR
re-introduced it.
Add a parts walker in parseOTELMessage that handles all three branches
the pre-wso2#181 code did:
- `type: "text"` → append `content` to msg.Content
- `type: "tool_call"` → append {id, name, arguments} to msg.ToolCalls
  (JSON-marshal arguments when emitted as an object)
- `type: "tool_call_response"` → surface the response text as msg.Content
  for role="tool" messages; accept both the current OpenLLMetry field
  name `response` and the older `result` for version-drift safety
Top-level content / toolCalls reads are preserved at higher priority, so
old-shape spans render unchanged. Console-side rendering needs no
change: Overview.tsx already renders `message.toolCalls[]` as a card
with `name` + monospaced `arguments`, and `message.content` as
Markdown when `role` is set.
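
The three branches above can be sketched roughly as follows. This is a minimal illustration, not the code in `process.go`: the `Message` and `ToolCall` types are simplified stand-ins, and `parseParts` and `str` are hypothetical helper names chosen for the example. It also assumes the response text is a plain string and does not check that the role is `tool`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ToolCall and Message are simplified stand-ins for the observer's types
// (assumption: the real structs in process.go may carry more fields).
type ToolCall struct {
	ID        string
	Name      string
	Arguments string
}

type Message struct {
	Role      string
	Content   string
	ToolCalls []ToolCall
}

// str returns m[key] when it is a string and "" otherwise.
func str(m map[string]any, key string) string {
	s, _ := m[key].(string)
	return s
}

// parseParts is a hypothetical helper that builds a Message from one raw
// message. A top-level "content" string keeps priority over parts text.
func parseParts(rawMsg map[string]any) Message {
	msg := Message{Role: str(rawMsg, "role"), Content: str(rawMsg, "content")}
	parts, _ := rawMsg["parts"].([]any)
	var text strings.Builder
	for _, p := range parts {
		part, ok := p.(map[string]any)
		if !ok {
			continue
		}
		switch str(part, "type") {
		case "", "text":
			text.WriteString(str(part, "content"))
		case "tool_call":
			tc := ToolCall{ID: str(part, "id"), Name: str(part, "name")}
			switch args := part["arguments"].(type) {
			case string:
				tc.Arguments = args
			case nil:
				// No arguments emitted; leave the field empty.
			default:
				// Object form: encode it back to a JSON string.
				if b, err := json.Marshal(args); err == nil {
					tc.Arguments = string(b)
				}
			}
			msg.ToolCalls = append(msg.ToolCalls, tc)
		case "tool_call_response":
			// Prefer the current field name, fall back to the older one.
			resp := str(part, "response")
			if resp == "" {
				resp = str(part, "result")
			}
			text.WriteString(resp)
		}
		// Any other part type (image, audio, ...) is skipped.
	}
	if msg.Content == "" {
		msg.Content = text.String()
	}
	return msg
}

func main() {
	// Sample payload in the parts[] shape; the ids and values are made up.
	raw := `[
	  {"role":"user","parts":[{"type":"text","content":"search flights"}]},
	  {"role":"assistant","parts":[{"type":"tool_call","id":"call_1","name":"search_flights","arguments":{"to":"LHR","from":"ZRH"}}]},
	  {"role":"tool","parts":[{"type":"tool_call_response","id":"call_1","response":"Error: no flights"}]}
	]`
	var msgs []map[string]any
	if err := json.Unmarshal([]byte(raw), &msgs); err != nil {
		panic(err)
	}
	for _, m := range msgs {
		fmt.Printf("%+v\n", parseParts(m))
	}
}
```

Collecting text in a `strings.Builder` and writing it to `Content` only when the top-level value is empty is one way to keep old-shape spans rendering unchanged.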
Related to #840
What
Per-span Input Messages / Output Messages bubbles in the Console rendered empty (role chip only, no body) for any LLM/agent span emitted by `opentelemetry-instrumentation-openai` 0.60.0 or `opentelemetry-instrumentation-openai-agents`, i.e. anything using the OpenAI Agents SDK or pure-OTel OpenAI auto-instrumentation. The data was present on the span (`gen_ai.input.messages` / `gen_ai.output.messages` were populated), so this looked like a UI bug, but the data was being dropped server-side in `parseOTELMessage`.

Concretely, those instrumentors emit each message with bodies nested under `parts: [{type, …}]` instead of a top-level `content` / `toolCalls` field:

```json
[
  {"role": "user", "parts": [{"type": "text", "content": "search flights ZRH → LHR"}]},
  {"role": "assistant", "parts": [{"type": "tool_call", "id": "...", "name": "search_flights", "arguments": {...}}]},
  {"role": "tool", "parts": [{"type": "tool_call_response", "id": "...", "response": "Error: ..."}]}
]
```

`parseOTELMessage` only read `rawMsg["content"]` / `rawMsg["toolCalls"]` at the top level, so role was captured but everything else was lost. Result: each turn became `{role: "user|assistant|tool", content: ""}`, which rendered as the empty bubbles people kept reporting.

Why this regressed
The pre-#181 implementation (introduced in #109) had a `parts` walker that handled `text` / `tool_call` / `tool_call_response`. #181 ("Add ballerina trace parsing") refactored `parseOTELMessage` to add recursive JSON parsing and stricter input handling, and in doing so inadvertently dropped the entire parts walker. The PR body never called out removing tool_call support, and no follow-up issue/PR re-introduced it. So this is a regression-restore, not new scope.

The user-visible breakage only surfaced recently because (a) LangChain-instrumented spans were emitting Traceloop-flat attributes (`gen_ai.prompt.N.*`), which `extractTraceloopPromptMessages` still handles, and (b) the bare `openai.chat` leaf span was the only one emitting the new `parts[]` shape, so a healthy LangChain trace tree masked the problem. Once wrapt 2.x broke the LangChain instrumentor (#881), the bare `openai.chat` became the only span emitted, and the empty-bubble bug became impossible to miss.

Changes
- `traces-observer-service/opensearch/process.go`: `parseOTELMessage` now walks `rawMsg["parts"]` when present and switches on `type`:
  - `""` | `"text"` → append `content` to a `strings.Builder`; written to `msg.Content` as a fallback when no top-level `content` was set.
  - `"tool_call"` → build a `ToolCall{ID, Name, Arguments}` and append to `msg.ToolCalls`. `arguments` is accepted as either a string or an object (object → `json.Marshal`).
  - `"tool_call_response"` → surface the response text into `msg.Content` for `role="tool"` messages. Accepts both the current OpenLLMetry field name `response` and the older `result`, since the field was renamed between versions and this needs to stay robust across the version matrix.
  - Other part types (`image`, `audio`, ...) are intentionally skipped.
- Top-level `content` / `toolCalls` reads are unchanged and still run at higher priority, so old-shape spans (and the existing top-level test cases) render exactly the same.
- No Console change is needed: `console/workspaces/pages/traces/src/subComponents/spanDetails/Overview.tsx` already renders `message.toolCalls[]` as a card with `name` + monospaced `arguments`, and `message.content` via `MarkdownView` when `role` is set.

Tests
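
The `tool_call` case in this section expects object-form arguments to be "JSON-marshalled deterministically". That can rely on a documented property of Go's `encoding/json`: map keys are sorted when a map is encoded. A small sketch, where `marshalArguments` is a hypothetical helper name and not necessarily how `process.go` structures it:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalArguments is a hypothetical helper: a string passes through
// unchanged, any other value is JSON-encoded.
func marshalArguments(v any) (string, error) {
	if s, ok := v.(string); ok {
		return s, nil
	}
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// The same keys written in a different order encode identically,
	// because encoding/json sorts map keys.
	a, _ := marshalArguments(map[string]any{"origin": "ZRH", "destination": "LHR"})
	b, _ := marshalArguments(map[string]any{"destination": "LHR", "origin": "ZRH"})
	fmt.Println(a == b, a) // prints: true {"destination":"LHR","origin":"ZRH"}
}
```

Because the encoded string is stable, a test can compare `Arguments` against a literal instead of decoding it again.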
`go test ./...` green in `traces-observer-service`. Added under `TestExtractPromptMessages`:

- `OTEL format with gen_ai parts[]`: basic text-only parts.
- `OTEL format with gen_ai parts[] - skips non-text parts`: `type:"image"` is correctly skipped instead of having its binary payload concatenated into text.
- `OTEL format with parts[] tool_call`: assistant turn with only a `tool_call` part → empty `Content`, one `ToolCalls` entry with the right id/name/arguments (object form JSON-marshalled deterministically).
- `OTEL format with parts[] tool_call_response - 'response' field`: current OpenLLMetry field name surfaces into `Content`.
- `OTEL format with parts[] tool_call_response - legacy 'result' field`: older field name still works.
- `OTEL format with parts[] mixed text + tool_call`: both `Content` and `ToolCalls` populated together.

Existing baseline cases (`OTEL format gen_ai.input.messages`, `Traceloop format gen_ai.prompt.*`) untouched and still pass.

Test plan
- `go test ./opensearch/... -run TestExtractPromptMessages -v`: all 8 sub-tests green.
- Built `amp-traces-observer:0.0.0-dev`, loaded it into k3d, rolled the deployment, sent a multi-tool-call chat to a Python LangGraph agent, and queried `/api/v1/traces/<id>/spans/<id>` directly. The returned conversation includes populated `content` for user/assistant-text/tool turns and populated `toolCalls` for assistant turns that invoked tools.

Release
No new instrumentation image release and no new `amp-instrumentation` PyPI release are needed: agents emit the data correctly, and this is purely an observer rendering fix. The `amp-traces-observer` image is in `release-config.json` `images` and rebuilds on every product release, so the fix ships with the next AMP release of the observer chart.