feat(openai-agents): emit structured handoffs + output_type on agent spans by hansmire · Pull Request #4066 · traceloop/openllmetry

hansmire · 2026-04-29T15:55:58Z

Summary

The Agents SDK's AgentSpanData documents its handoffs field as list[str] (the names of handoff target agents). The current instrumentor assumed each entry was an Agent object and read .name, which silently produced "unknown" for every handoff when the SDK follows its documented contract. It also stored each handoff under a separately-indexed attribute (openai.agent.handoff0, ...1, ...2, ...) nested as a JSON blob, which trace UIs can't easily aggregate across.
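To make the failure mode concrete, here is a minimal sketch of the behaviour described above. It is not the instrumentor's actual code: `legacy_handoff_attributes` is a hypothetical reconstruction, and the `"No instructions"` default is taken from the "before" attribute value reported later in this description. The point is only that a plain `str` has no `.name` attribute, so an attribute lookup with a default falls through to `"unknown"`.

```python
import json


def legacy_handoff_attributes(handoffs):
    """Hypothetical reconstruction of the old behaviour.

    Every entry is treated as an Agent-like object, so a plain string
    (the documented list[str] contract) never yields its own value.
    """
    attributes = {}
    for index, handoff in enumerate(handoffs):
        info = {
            # str has no .name or .instructions, so both defaults are used.
            "name": getattr(handoff, "name", "unknown"),
            "instructions": getattr(handoff, "instructions", "No instructions"),
        }
        # One JSON blob per separately indexed attribute key.
        attributes[f"openai.agent.handoff{index}"] = json.dumps(info)
    return attributes


attrs = legacy_handoff_attributes(["Super Agent"])
print(attrs["openai.agent.handoff0"])
# {"name": "unknown", "instructions": "No instructions"}
```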

It also dropped AgentSpanData.output_type entirely — which Braintrust's native Agents-SDK processor surfaces via _agent_log_data.

This PR:

1. Normalises handoffs extraction: strings pass through, Agent-like objects fall back to .name, unknown types produce "unknown".
2. Emits a unified gen_ai.agent.handoffs JSON-list attribute so backends can show the full handoff target list at a glance. The legacy per-index openai.agent.handoffN attributes are kept for back-compat with existing dashboards.
3. Emits gen_ai.agent.output_type from AgentSpanData.output_type.

Both emissions are defensive: absent/empty values skip the attribute entirely (no stringified "None" / empty-list snuck into metadata).
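The normalisation and the defensive emission can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the helper names `normalise_handoff_names` and `agent_span_attributes` are hypothetical, a plain dict stands in for the span's attributes, and the legacy per-index value is assumed to be a JSON object holding only the name, as in the "after" value reported under End-to-end verification.

```python
import json
from types import SimpleNamespace


def normalise_handoff_names(handoffs):
    """Return handoff target names for list[str] or list[Agent-like] input."""
    if not handoffs:
        return []
    names = []
    for entry in handoffs:
        if isinstance(entry, str):
            # Documented contract: the entry already is the agent name.
            names.append(entry)
        else:
            # Legacy shape: an Agent-like object exposing .name.
            name = getattr(entry, "name", None)
            names.append(name if isinstance(name, str) else "unknown")
    return names


def agent_span_attributes(handoffs=None, output_type=None):
    """Build the attribute dict; absent or empty values are skipped."""
    attributes = {}
    names = normalise_handoff_names(handoffs)
    if names:
        # Unified JSON-list attribute.
        attributes["gen_ai.agent.handoffs"] = json.dumps(names)
        # Legacy per-index attributes, kept for existing dashboards.
        for index, name in enumerate(names):
            attributes[f"openai.agent.handoff{index}"] = json.dumps({"name": name})
    if output_type:
        attributes["gen_ai.agent.output_type"] = str(output_type)
    return attributes


# SimpleNamespace stands in for an Agent object; 42 is an unknown type.
mixed = ["Super Agent", SimpleNamespace(name="Billing Agent"), 42]
populated = agent_span_attributes(mixed, "str")
empty = agent_span_attributes([], None)
print(populated["gen_ai.agent.handoffs"])
print(empty)  # {}
```

Returning an empty dict for absent input means the caller never has to decide how to serialise `None` or `[]`; the attribute keys are simply not set.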

Before / After

In the Braintrust UI

Side-by-side shots of the agent-span Metadata panel for the same Super Agent run, before vs. after this PR:

Before — openai.agent.handoff0 renders as {name: "unknown", instructions: "No instructions"}. Neither gen_ai.agent.handoffs nor gen_ai.agent.output_type appears.

After — openai.agent.handoff0.name is the real agent name ("Super Agent" for this self-handoff case), gen_ai.agent.handoffs shows the full target list, and gen_ai.agent.output_type is captured.

Attribute-level diff

Same trace, attribute keys present on the Super Agent.agent span:

5 keys → 7 keys.

Tests

Extended the existing VCR-backed test_agent_with_function_tool_spans to pin gen_ai.agent.output_type == "str" and assert the absent-handoffs regression guard (no stringified None on an empty list).
Added a direct unit test test_agent_span_attributes_handoffs_from_agent_objects that pins the handoff-name normalisation for both list[str] (documented) and list[Agent] (legacy).

All 11 tests in tests/test_openai_agents.py pass locally. uv run ruff check clean.

End-to-end verification

Installed this branch into a downstream agent repo and re-ran an agent with a self-handoff (activate_code_interpreter handoff-to-self). Before the patch: openai.agent.handoff0 = {"name": "unknown", "instructions": "No instructions"}. After: gen_ai.agent.handoffs = ["Super Agent"], openai.agent.handoff0 = {"name": "Super Agent"}, gen_ai.agent.output_type = "str".

Notes

Part of a small series of openai-agents parity fixes (#4061 cached_tokens + reasoning_tokens, #4062 tool span type + duration, #4063 tool span input + output, #4065 LLM span response metadata). Each stands alone off main and can be merged in any order.

Scope check: AgentSpanData.tools is NOT currently populated by the Agents SDK's Runner (verified live — the field reads None even for agents with non-empty tool lists), so it's not a parity delta worth closing downstream. Both Braintrust's native processor and this instrumentor would benefit from an upstream SDK fix that plumbs the registered tool names into AgentSpanData.tools; I can file that separately if useful.

…spans

The Agents SDK's `AgentSpanData` documents its `handoffs` field as ``list[str]`` (the names of handoff target agents). The current instrumentor assumed each entry was an ``Agent`` object and read a `.name` attribute, which silently produced ``"unknown"`` for every handoff when the SDK follows its documented contract. It also stored each handoff under a separately-indexed attribute name (``openai.agent.handoff0``, ``...1``, ``...2``, ...) nested as a JSON blob, which downstream UIs can't easily aggregate across.

This patch:

1. Normalises `handoffs` extraction: strings pass through, Agent-like objects fall back to `.name`, unknown types produce ``"unknown"``.
2. Emits a unified ``gen_ai.agent.handoffs`` JSON-list attribute so backends can show the full handoff target list at a glance. The legacy per-index ``openai.agent.handoffN`` attributes are kept for back-compat with existing dashboards.
3. Emits ``gen_ai.agent.output_type`` capturing `AgentSpanData.output_type`, matching what Braintrust's native Agents-SDK processor logs via `_agent_log_data` and that this instrumentor was previously dropping.

Both emissions are defensive: absent/empty values skip the attribute entirely (no stringified ``None`` / empty-list snuck into metadata).

Tests extend the existing VCR-backed `test_agent_with_function_tool_spans` to pin `gen_ai.agent.output_type == "str"` on the WeatherAgent span and assert the absent-handoffs regression guard; a new direct unit test pins the handoff-name normalisation for both string and Agent-object inputs.

CLAassistant · 2026-04-29T15:56:06Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Max Hansmire seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

coderabbitai · 2026-04-29T15:56:08Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.




hansmire pushed a commit to hansmire/openllmetry that referenced this pull request Apr 29, 2026

assets: add UI before/after screenshot for PR traceloop#4066 agent me…

45cc297

…tadata

hansmire wants to merge 1 commit into traceloop:main from hansmire:fix/openai-agents-agent-metadata
