Fix: agent engine sandbox code executor gemini 2.x incompatibility#5499

Open
JesserHamdaoui wants to merge 13 commits into google:main from JesserHamdaoui:fix/5315-AgentEngineSandboxCodeExecutor-Gemini-2.x-incompatibility

Conversation

@JesserHamdaoui
Contributor


Problem:

AgentEngineSandboxCodeExecutor fails with Gemini 2.x models when the agent is asked to execute Python code. The observed errors are UNEXPECTED_TOOL_CALL (when no other tools are registered) or MALFORMED_FUNCTION_CALL (when other tools are present).

The failure is a contract mismatch across three layers that must agree for code execution to succeed:

| Layer | What it expects |
| --- | --- |
| AgentEngineSandboxCodeExecutor | Code returned by the model in ```python / ```tool_code markdown fences inside text parts |
| extract_code_and_truncate_content | Already handles native executable_code parts correctly; this layer is not the problem |
| Vertex AI Gemini API server | A response containing an executable_code part is only valid if the request explicitly declared Tool(code_execution=ToolCodeExecution()); otherwise the response is rejected before ADK ever sees it |

Gemini 2.x models appear to be post-trained to satisfy "execute Python" requests by emitting structured native executable_code parts rather than markdown text. Because AgentEngineSandboxCodeExecutor does not declare the code_execution tool in the outgoing request (it was designed to receive markdown), the Vertex AI API validator rejects the model's response and returns content=null. The ADK post-processor never receives anything to route to the sandbox.
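To make the markdown side of this contract concrete, here is a minimal sketch of pulling code out of a fenced text reply. It is not ADK's actual implementation: the pattern and the `extract_fenced_code` helper are hypothetical, and the fence string is built programmatically only so the example nests cleanly inside this description.

```python
import re
from typing import Optional

# Three backticks, built at runtime so this example can sit inside a markdown fence.
FENCE = "`" * 3

# Hypothetical pattern: an opening fence tagged python or tool_code,
# the code body, then a closing fence on its own line.
_FENCED_CODE_RE = re.compile(
    re.escape(FENCE) + r"(?:python|tool_code)\n(.*?)\n" + re.escape(FENCE),
    re.DOTALL,
)


def extract_fenced_code(text: str) -> Optional[str]:
    """Return the first fenced code body found in a text part, or None."""
    match = _FENCED_CODE_RE.search(text)
    return match.group(1) if match else None


# A text reply in the shape the sandbox executor was designed to receive.
reply = f"Here is the code:\n{FENCE}tool_code\nprint(1 + 1)\n{FENCE}\nDone."
code = extract_fenced_code(reply)
```

A native executable_code part never passes through text like this, which is why the request-side tool declaration becomes the deciding factor.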

This is why passing code_executor=AgentEngineSandboxCodeExecutor(...) to the agent constructor does not help: code_executor is an ADK-side construct only. It tells ADK where to send code once a response arrives; it does not communicate anything to the Vertex AI API server, which has no knowledge of the attached sandbox and enforces the tool-declaration contract at response validation time.

Solution:

Two complementary fixes are applied entirely within google/adk/flows/llm_flows/_code_execution.py, with no changes to any other ADK file and no public API changes:

Layer 1: pre-processor steering (_run_pre_processor). When the configured executor is a BaseCodeExecutor but not a BuiltInCodeExecutor, append a system instruction (_NON_BUILTIN_EXECUTOR_INSTRUCTION) to every outgoing request. The instruction explicitly tells the model to wrap Python code in ```tool_code markdown fences and forbids native executable_code emission, which reduces how often the model triggers the API validator.
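The steering check can be sketched roughly as follows. Everything here is a stand-in: the executor classes only mimic the inheritance relationship described above, `FakeRequest` and `maybe_append_steering` are hypothetical names, and the instruction text is illustrative rather than the wording used in the PR.

```python
from dataclasses import dataclass, field
from typing import List


# Stand-ins that mirror the class hierarchy only; these are not ADK's classes.
class BaseCodeExecutor:
    pass


class BuiltInCodeExecutor(BaseCodeExecutor):
    pass


class SandboxCodeExecutor(BaseCodeExecutor):
    """Stand-in for a non-built-in executor such as the Agent Engine sandbox."""


# Illustrative wording; the real constant's text is not shown in this PR page.
NON_BUILTIN_EXECUTOR_INSTRUCTION = (
    "When you need to run Python, write it in a markdown fence tagged "
    "tool_code. Do not emit native executable_code parts."
)


@dataclass
class FakeRequest:
    """Hypothetical request object holding system instructions."""
    system_instructions: List[str] = field(default_factory=list)


def maybe_append_steering(request: FakeRequest, executor: object) -> bool:
    """Append the steering instruction only for non-built-in executors."""
    if isinstance(executor, BaseCodeExecutor) and not isinstance(
        executor, BuiltInCodeExecutor
    ):
        request.system_instructions.append(NON_BUILTIN_EXECUTOR_INSTRUCTION)
        return True
    return False


sandbox_request = FakeRequest()
builtin_request = FakeRequest()
steered = maybe_append_steering(sandbox_request, SandboxCodeExecutor())
skipped = maybe_append_steering(builtin_request, BuiltInCodeExecutor())
```

The built-in executor is excluded because it declares the code_execution tool itself, so native executable_code parts are valid for it.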

Layer 2: response-processor recovery (_run_post_processor). When the API still rejects the response with UNEXPECTED_TOOL_CALL or MALFORMED_FUNCTION_CALL, parse the rejected code out of error_message via _extract_code_from_error_message, reconstruct a synthetic executable_code part on llm_response.content, clear the error fields, and let the existing extract_code_and_truncate_content → code_executor.execute_code pipeline handle it exactly as if the model had emitted the part cleanly. Note that extract_code_and_truncate_content already supports executable_code parts; this recovery path simply gives it the chance to run.
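A rough sketch of the recovery step, under loud assumptions: the error-message layout (`rejected call: <code>`) is invented purely for illustration and is not the real API format, `FakeResponse` is a hypothetical stand-in for the response object, and a plain string stands in for the reconstructed executable_code part.

```python
import re
from dataclasses import dataclass
from typing import Optional

# The two error codes named in this PR as recoverable.
RECOVERABLE_ERROR_CODES = frozenset(
    {"UNEXPECTED_TOOL_CALL", "MALFORMED_FUNCTION_CALL"}
)

# Invented message layout for illustration only: "rejected call: <code>".
_PAYLOAD_RE = re.compile(r"^rejected call:\s*(.+)\Z", re.DOTALL)


@dataclass
class FakeResponse:
    """Hypothetical response; `content` stands in for an executable_code part."""
    content: Optional[str] = None
    error_code: Optional[str] = None
    error_message: Optional[str] = None


def extract_code(error_message: Optional[str]) -> Optional[str]:
    """Return the code payload from the error message, or None if absent."""
    if not error_message:
        return None
    match = _PAYLOAD_RE.match(error_message)
    if match is None:
        return None
    code = match.group(1).strip()
    return code or None


def maybe_recover(response: FakeResponse) -> bool:
    """Rebuild content from a rejected call and clear the error fields."""
    if response.error_code not in RECOVERABLE_ERROR_CODES:
        return False
    code = extract_code(response.error_message)
    if code is None:
        return False
    response.content = code
    response.error_code = None
    response.error_message = None
    return True


rejected = FakeResponse(
    error_code="UNEXPECTED_TOOL_CALL",
    error_message="rejected call: x = 1\nprint(x)",
)
recovered = maybe_recover(rejected)
```

Leaving the response untouched when the code is unrecognised or the message cannot be parsed keeps unrelated errors visible to the caller.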

Together, Layer 1 stops most rejections at the request side and Layer 2 rescues the cases where the model still emits a native tool call despite the steering, ensuring the full User → Model → Executor → Sandbox flow completes across Gemini 2.x.


Testing Plan

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Four new test groups were added to tests/unittests/flows/llm_flows/test_code_execution.py:

| Test group | What it covers |
| --- | --- |
| test_extract_code_from_error_message_* | Valid single-line payload, multiline payload, None input, non-matching error message |
| test_maybe_recover_from_api_rejection_* | UNEXPECTED_TOOL_CALL recovery, MALFORMED_FUNCTION_CALL recovery, unrecognised error code, missing error code, unparseable message |
| test_pre_processor_injects_instruction_* | Instruction appended for non-built-in executor; not appended for BuiltInCodeExecutor |
| test_post_processor_recovery_* | Rejected response with no content is recovered and routed to the sandbox executor; BuiltInCodeExecutor skips the recovery path entirely |

Manual End-to-End (E2E) Tests:

Please refer to #5315 for the full reproduction script and setup steps. Using the reproduction case from that issue, the fix was verified against a live Vertex AI Agent Engine sandbox.

Steering prompt that tells Gemini 2.x to wrap code in tool_code fences
instead of emitting native executable_code parts when no code_execution
tool is declared on the request.
…L_CALL_RE constants

Frozenset of Gemini 2.x error codes that indicate a native
code_execution tool call was rejected, and a regex to extract the
code payload from the error message.

Parses the code payload out of a Gemini UNEXPECTED_TOOL_CALL rejection
error message using _UNEXPECTED_TOOL_CALL_RE.

Reconstructs the executable_code part that Gemini 2.x intended to emit
when the API rejected the response with UNEXPECTED_TOOL_CALL or
MALFORMED_FUNCTION_CALL, allowing the sandbox executor pipeline to
proceed normally.
…-in executors

Appends _NON_BUILTIN_EXECUTOR_INSTRUCTION to every LLM request that
uses a non-BuiltInCodeExecutor, steering Gemini 2.x to output code in
tool_code markdown fences rather than native executable_code parts
which the API rejects as UNEXPECTED_TOOL_CALL.
…ost-processor

When Gemini 2.x emits a native code_execution call and the API rejects
it, llm_response.content is empty. For non-built-in executors, attempt
to reconstruct the executable_code part from the error message via
_maybe_recover_from_api_rejection so the sandbox executor pipeline can
still run the code.
@adk-bot adk-bot added the core [Component] This issue is related to the core interface and implementation label Apr 26, 2026
@JesserHamdaoui JesserHamdaoui changed the title Fix/5315 agent engine sandbox code executor gemini 2.x incompatibility Fix: agent engine sandbox code executor gemini 2.x incompatibility Apr 27, 2026
@rohityan rohityan self-assigned this Apr 27, 2026
@rohityan rohityan added the request clarification [Status] The maintainer need clarification or more information from the author label Apr 27, 2026
@rohityan
Collaborator

Hi @JesserHamdaoui, thank you for your contribution! We appreciate you taking the time to submit this pull request. Could you please fix the failing mypy-diff tests so that we can proceed with the review?

@JesserHamdaoui
Contributor Author

Hi @rohityan,
Done! All the mypy-diff errors should be fixed now.

Labels

core [Component] This issue is related to the core interface and implementation request clarification [Status] The maintainer need clarification or more information from the author

Development

Successfully merging this pull request may close these issues.

AgentEngineSandboxCodeExecutor incompatible with Gemini 2.x models (MALFORMED_FUNCTION_CALL)

3 participants