
fix: ensure call_llm tracing spans are always ended in multi-agent setups #4742

Open
atian8179 wants to merge 1 commit into google:main from atian8179:fix/tracing-span-loss-in-multi-agent

Conversation

@atian8179

Problem

In multi-agent setups using transfer_to_agent, call_llm tracing spans for parent agents are created but never exported. Only sub-agent spans appear in the trace backend.

Root Cause

_call_llm_with_tracing() uses tracer.start_as_current_span('call_llm') as a context manager around an async generator. When transfer_to_agent triggers sub-agent execution and the generator is later closed, GeneratorExit is raised inside the with block. The OTel context.detach() then raises ValueError (stale contextvars token after async context switch), preventing span.end() from being reached. Unended spans are never exported.
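The failure mode above can be reproduced without OpenTelemetry at all. This is a minimal, illustrative sketch (names are hypothetical): closing an async generator throws GeneratorExit at the suspended `yield`, which surfaces inside whatever `with` block wraps the generator body.

```python
import asyncio

events = []

async def traced_stream():
    # Simulates the body of `with tracer.start_as_current_span('call_llm'):`
    # wrapped around an async generator (names here are illustrative).
    try:
        events.append("span started")
        yield "llm response"
        events.append("span ended normally")  # skipped when closed early
    except GeneratorExit:
        events.append("GeneratorExit raised inside the with-block")
        raise  # async generators must re-raise GeneratorExit

async def main():
    agen = traced_stream()
    await agen.__anext__()  # consume one event, like one partial response
    await agen.aclose()     # transfer_to_agent path: generator closed early

asyncio.run(main())
```

The "span ended normally" branch is never reached; in the real code, the analogous casualty is the span cleanup performed by the context manager's exit path.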

This is the same root cause as #501 and #1670 (fixed in base_agent.py), but base_llm_flow.py was not updated with the same pattern.

Solution

Replace start_as_current_span with explicit span lifecycle management:

from opentelemetry import context as otel_context
from opentelemetry import trace

span = tracer.start_span('call_llm')
ctx = trace.set_span_in_context(span)
token = otel_context.attach(ctx)
try:
    ...
finally:
    try:
        otel_context.detach(token)
    except ValueError:
        # Stale contextvars token after an async context switch.
        pass
    span.end()  # Always called, even on GeneratorExit

This ensures span.end() is always reached regardless of GeneratorExit or ValueError from context detach.
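The before/after behavior can be checked with plain Python, no OpenTelemetry install needed. In this sketch, `FakeSpan` and `flaky_detach` are stand-ins (not real ADK or OTel APIs) for a span and for a `detach()` that fails on a stale token:

```python
import asyncio

class FakeSpan:
    """Stand-in for an OTel span; only tracks whether end() ran."""
    def __init__(self):
        self.ended = False
    def end(self):
        self.ended = True

def flaky_detach():
    # Stand-in for otel_context.detach(token) failing on a stale token.
    raise ValueError("token was created in a different Context")

async def traced_old(span):
    # Old shape: a detach failure during cleanup masks span.end().
    try:
        yield "event"
    finally:
        flaky_detach()  # raises, so the next line never runs
        span.end()

async def traced_new(span):
    # Fixed shape: swallow the detach error, then always end the span.
    try:
        yield "event"
    finally:
        try:
            flaky_detach()
        except ValueError:
            pass
        span.end()

async def drive(agen):
    await agen.__anext__()
    try:
        await agen.aclose()  # throws GeneratorExit into the generator
    except ValueError:
        pass  # the old shape leaks the detach error out of aclose()

old_span, new_span = FakeSpan(), FakeSpan()
asyncio.run(drive(traced_old(old_span)))
asyncio.run(drive(traced_new(new_span)))
```

Only `new_span` ends up with `ended == True`, which is why the explicit try/finally shape guarantees the span is exported.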

Fixes #4715

fix: ensure call_llm tracing spans are always ended in multi-agent setups

Replace start_as_current_span context manager with explicit span
management in _call_llm_with_tracing to guarantee span.end() is
called even when GeneratorExit is raised during async iteration.

When transfer_to_agent triggers sub-agent execution, the parent
agent's AsyncGenerator is closed, raising GeneratorExit inside the
context manager. The OTel context.detach() then raises ValueError
(stale contextvars token after async context switch), preventing
span.end() from being reached. Unended spans are never exported.

This fix uses try/finally with explicit span lifecycle to ensure
spans are always properly ended and exported.

Fixes google#4715
@google-cla

google-cla bot commented Mar 6, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the OpenTelemetry tracing integration for multi-agent LLM flows. It ensures that tracing spans for parent agents are correctly exported by modifying the span management logic to explicitly handle the span lifecycle, preventing premature closure or unexported spans caused by asynchronous generator behavior and context switching errors.

Highlights

  • Tracing Span Export Fix: Resolved an issue where call_llm tracing spans for parent agents were not exported in multi-agent setups due to GeneratorExit and ValueError preventing span.end() from being called.
  • Explicit Span Management: Replaced the tracer.start_as_current_span context manager with explicit OpenTelemetry span lifecycle management, including tracer.start_span, trace.set_span_in_context, otel_context.attach, and a try...finally block to guarantee span.end() execution.
  • Robust Context Detachment: Implemented a try...except ValueError block around otel_context.detach(token) within the finally clause to handle stale contextvars tokens after async context switches, ensuring reliable span closure.


Changelog
  • src/google/adk/flows/llm_flows/base_llm_flow.py
    • Updated LLM call tracing logic to ensure spans are always ended and exported.
    • Introduced explicit OpenTelemetry context and span management.

@adk-bot adk-bot added the tracing [Component] This issue is related to OpenTelemetry tracing label Mar 6, 2026
@adk-bot

adk-bot commented Mar 6, 2026

Response from ADK Triaging Agent

Hello @atian8179, thank you for your contribution!

Before we can review this PR, could you please sign the Contributor License Agreement (CLA)? You can find more information at https://cla.developers.google.com/.

Also, could you please add a testing plan section to your PR description to describe how you've tested these changes?

This information will help us to review your PR more efficiently. Thanks!


@gemini-code-assist bot left a comment


Code Review

This pull request effectively resolves a bug where tracing spans were not being correctly exported in multi-agent scenarios. The change from using tracer.start_as_current_span to explicit span lifecycle management with a try...finally block is the correct approach to ensure spans are always ended, even in the presence of GeneratorExit or ValueError during async operations. The implementation is robust and directly addresses the root cause described.


Labels

tracing [Component] This issue is related to OpenTelemetry tracing

Development

Successfully merging this pull request may close these issues.

[Bug] call_llm spans not exported in multi-agent setups due to GeneratorExit breaking span.end() in _call_llm_async

2 participants