
fix(agents): preserve text streamed before tool calls in output_key #5616

Open

vipin-v-nair wants to merge 1 commit into google:main from
vipin-v-nair:fix/5590-call-site-accumulation-d2

Conversation

@vipin-v-nair

Fixes #5590

Problem

When an LlmAgent is configured with output_key and runs in
StreamingMode.SSE while making tool calls, only the model text on the
very last tool-free event is written to state[output_key]. Any text
the model emits before or between tool calls is silently dropped, even
though it streams to the user. Reporters on the linked issue measure a
60–70% loss of the streamed text in production. A live reproduction
against gemini-2.5-pro on Vertex
shows the intro and progress narration discarded, with only the
conclusion text reaching output_key.
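
For context, a minimal configuration that hits the bug could look like
the sketch below. The agent name, tool, and instruction are
illustrative; LlmAgent, output_key, RunConfig, and StreamingMode.SSE
are the ADK surface actually involved.

from google.adk.agents import LlmAgent
from google.adk.agents.run_config import RunConfig, StreamingMode


def check_status(region: str) -> dict:
  """Illustrative tool; any tool call triggers the gate discussed below."""
  return {'region': region, 'status': 'ok'}


agent = LlmAgent(
    name='reporter',  # hypothetical agent name
    model='gemini-2.5-pro',
    instruction='Narrate progress, call check_status, then conclude.',
    tools=[check_status],
    output_key='final_output',  # before the fix, only the last
)                               # tool-free event's text lands here

run_config = RunConfig(streaming_mode=StreamingMode.SSE)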

Root cause

LlmAgent.__maybe_save_output_to_state only writes
state_delta[output_key] on events where Event.is_final_response()
returns True. That gate returns False for any event carrying a
function_call or function_response part. Under streaming, Gemini
emits non-partial events that bundle text with a function_call; those
events fail the gate, so their text is skipped. The post-aggregation
events that follow the tool calls are also rejected (they
contain function_response parts). Only the final tool-free event
passes, and state_delta[output_key] gets overwritten with just that
text. Every prior text segment from the same model turn is silently
discarded.
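
Illustratively, the gate behaves like this simplified sketch (not the
real Event.is_final_response implementation, which checks additional
conditions):

def is_final_response(event) -> bool:
  """Simplified reading of the gate; the real method checks more."""
  if event.partial:
    return False
  parts = event.content.parts if event.content else []
  # Any function_call or function_response part fails the gate, so text
  # bundled with a tool call on the same event never reaches output_key.
  return not any(p.function_call or p.function_response for p in parts)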

Fix

Add a new private helper LlmAgent.__maybe_accumulate_streaming_output
that receives the current event and a running accumulator string. It
returns the new accumulator value, and on applicable events also writes
the running value to state_delta[output_key]. The helper is invoked
from the call sites in _run_async_impl and _run_live_impl,
alongside (and after) the existing __maybe_save_output_to_state call.
Each call site holds a local output_accumulator: str that lives for
the duration of the invocation — no new instance state on the agent.

__maybe_save_output_to_state is byte-identical to main. The schema
path is preserved unchanged: when output_schema is set, the helper
short-circuits and the existing single-final-document validation path
runs as before. The function-response-only callback path is also
preserved (the helper short-circuits on empty text).
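
A minimal sketch of the helper's shape, under stand-in Event/Part
types (the real types live in google.adk.events and carry more fields,
with state_delta nested under event.actions; only the behavior
described above is modeled):

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Part:  # stand-in, not the real ADK Part
  text: Optional[str] = None


@dataclass
class Event:  # stand-in, not the real ADK Event
  parts: list[Part] = field(default_factory=list)
  partial: bool = False
  state_delta: dict = field(default_factory=dict)


def maybe_accumulate_streaming_output(
    event: Event,
    accumulated: str,
    output_key: str,
    has_output_schema: bool,
) -> str:
  """Returns the new accumulator; writes the running value to
  state_delta[output_key] on applicable events."""
  if has_output_schema:
    # Schema path unchanged: the single-final-document validation in
    # __maybe_save_output_to_state still runs on its own.
    return accumulated
  if event.partial:
    # Partial chunks are re-emitted later as a non-partial aggregate;
    # accumulating them here would double-count text.
    return accumulated
  text = ''.join(p.text for p in event.parts if p.text)
  if not text:
    # Function-response-only events: nothing to add or write.
    return accumulated
  accumulated += text
  event.state_delta[output_key] = accumulated
  return accumulated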

Tests added

  • tests/unittests/agents/test_llm_agent_streaming_output.py::test_run_async_accumulates_text_around_tool_calls
    Regression test for #5590 (Text accumulation issue for output_key);
    the canned sequence is replayed in the sketch after this list.
    Drives _run_async_impl with a stubbed
    _llm_flow that yields the canned event sequence the streaming
    flow produces (interleaved partial text, text+function_call,
    function_response, more text, final tool-free text), merges each
    event's state_delta into session state via
    session_service.append_event, and asserts on
    session.state["final_output"] — the value named in the bug
    report. Fails on main with
    AssertionError: 'Conclusion one. Conclusion two.' == 'Intro one. Intro two.Progress.Conclusion one. Conclusion two.';
    passes with this fix.
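
For intuition, an illustrative pure-Python replay of that canned
sequence through the accumulation rule (the tuples stand in for real
events; tool parts are noted in comments only):

# (text, partial) stand-ins for the events the stubbed _llm_flow yields.
sequence = [
    ('Intro one. ', True),             # partial chunk, streamed only
    ('Intro two.', True),              # partial chunk, streamed only
    ('Intro one. Intro two.', False),  # non-partial text + function_call
    ('', False),                       # function_response, no text
    ('Progress.', False),              # mid-turn text
    ('Conclusion one. Conclusion two.', False),  # final tool-free text
]

accumulated = ''
state = {}
for text, partial in sequence:
  if partial or not text:
    continue  # the helper short-circuits on partials and empty text
  accumulated += text
  state['final_output'] = accumulated  # running value, written each time

# On main, only the last event's text survives; with the fix:
assert state['final_output'] == (
    'Intro one. Intro two.Progress.Conclusion one. Conclusion two.'
)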

Verification

  • pre-commit run --all-files — clean
  • pyink --check --diff --config pyproject.toml src/ tests/ — clean
  • isort --check src/ tests/ — clean
  • Mypy New Error Check (full-repo two-branch diff) — 0 new errors,
    0 removed
  • pylint clean on touched files (no new warnings; pre-existing
    warnings unchanged)
  • pytest tests/unittests --ignore=tests/unittests/artifacts/test_artifact_service.py --ignore=tests/unittests/tools/google_api_tool/test_googleapi_to_openapi_converter.py
    5608 passed, 2266 warnings, 6 subtests passed in 436.97s
  • Live reproduction against gemini-2.5-pro on Vertex confirms all
    three text sections (intro + progress + conclusion) now reach
    state[output_key]; pre-fix run had only the conclusion.

Design notes

I considered an alternative that adds a PrivateAttr accumulator to
LlmAgent and keeps the fix inside __maybe_save_output_to_state. I
went with the call-site approach because:

  • It avoids adding state to the agent (no new instance fields).
  • It keeps LlmAgent's field set unchanged.
  • A PrivateAttr on LlmAgent would be the first in the agent class
    hierarchy — this bug didn't feel like the right place to introduce
    that pattern, even though PrivateAttr is already used elsewhere
    in the codebase (agents/invocation_context.py:221,
    sessions/session.py:53). The rejected shape is sketched below.
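
For contrast, the rejected alternative would look roughly like this
(the class and field names here are hypothetical):

from pydantic import BaseModel, PrivateAttr


class LlmAgentSketch(BaseModel):  # stand-in for LlmAgent
  output_key: str | None = None
  # Per-instance mutable state: it outlives a single invocation unless
  # explicitly reset, which is the cost the call-site approach avoids.
  _output_accumulator: str = PrivateAttr(default='')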

The trade-off: the regression test constructs an InvocationContext
directly, so it's mildly brittle to future changes in
InvocationContext's required fields. The 14 existing tests in
tests/unittests/agents/test_llm_agent_output_save.py are byte-
identical to main and continue to pass unchanged.

Out of scope

  • Other tests in test_llm_agent_output_save.py use name-mangled
    access to internal state (agent._LlmAgent__maybe_save_output_to_state(event)).
    The new test deliberately asserts at a higher level
    (_run_async_impl + session.state[output_key]), but the existing
    tests are untouched.

Commit message

LlmAgent.__maybe_save_output_to_state only writes state_delta[output_key]
on events where Event.is_final_response() returns True. That gate is
False for any event carrying a function_call or function_response part,
so under StreamingMode.SSE the text on text+function_call events was
silently dropped from output_key — only the final tool-free event's
text was saved. Production agents using output_key with tools were
losing 60–70% of model text, including intros and progress narration.

Add a private helper that accumulates non-partial text across the
model turn into a local string (per-invocation, scoped to the caller)
and writes the running value to state_delta[output_key] on every
text-bearing event from this agent. Wire it into _run_async_impl and
_run_live_impl alongside the existing __maybe_save_output_to_state
call. __maybe_save_output_to_state is unchanged; output_schema
behavior is preserved (validate one complete final document, never
accumulate partial JSON).

Fixes google#5590
adk-bot added the core label ([Component] This issue is related to the core interface and implementation) on May 7, 2026