Fix/5315 agent engine sandbox code executor gemini 2.x incompatibility#5498
Steering prompt that tells Gemini 2.x to wrap code in `tool_code` fences instead of emitting native `executable_code` parts when no `code_execution` tool is declared on the request. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…L_CALL_RE constants: Frozenset of Gemini 2.x error codes that indicate a native `code_execution` tool call was rejected, and a regex to extract the code payload from the error message.
Parses the code payload out of a Gemini `UNEXPECTED_TOOL_CALL` rejection error message using `_UNEXPECTED_TOOL_CALL_RE`.
Reconstructs the `executable_code` part that Gemini 2.x intended to emit when the API rejected the response with `UNEXPECTED_TOOL_CALL` or `MALFORMED_FUNCTION_CALL`, allowing the sandbox executor pipeline to proceed normally.
…-in executors: Appends `_NON_BUILTIN_EXECUTOR_INSTRUCTION` to every LLM request that uses a non-`BuiltInCodeExecutor`, steering Gemini 2.x to output code in `tool_code` markdown fences rather than native `executable_code` parts, which the API rejects as `UNEXPECTED_TOOL_CALL`.
…ost-processor: When Gemini 2.x emits a native `code_execution` call and the API rejects it, `llm_response.content` is empty. For non-built-in executors, attempt to reconstruct the `executable_code` part from the error message via `_maybe_recover_from_api_rejection` so the sandbox executor pipeline can still run the code.
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Response from ADK Triaging Agent: Hello @JesserHamdaoui, thank you for your contribution! Before we can review your pull request, you'll need to sign our Contributor License Agreement (CLA). It looks like the CLA check is currently failing. You can find more information and the link to sign the agreement in the "cla/google" check at the bottom of this PR. Thank you!
Closed this because the commit lists Claude Code as a co-author, which caused the Google CLA check to fail.
Problem:
`AgentEngineSandboxCodeExecutor` fails with Gemini 2.x models when the agent is asked to execute Python code. The observed errors are `UNEXPECTED_TOOL_CALL` (when no other tools are registered) or `MALFORMED_FUNCTION_CALL` (when other tools are present). The failure is a contract mismatch across three layers that must agree for code execution to succeed:

- `AgentEngineSandboxCodeExecutor` expects code in `python`/`tool_code` markdown fences inside text parts.
- `extract_code_and_truncate_content` handles `executable_code` parts correctly — this layer is not the problem.
- A native `executable_code` part is only valid if the request explicitly declared `Tool(code_execution=ToolCodeExecution())`; otherwise the response is rejected before ADK ever sees it.

Gemini 2.x models, as it seems, are post-trained to satisfy "execute Python" requests by emitting structured native `executable_code` parts rather than markdown text. Because `AgentEngineSandboxCodeExecutor` does not declare the `code_execution` tool in the outgoing request (it was designed to receive markdown), the Vertex AI API validator rejects the model's response and returns `content=null`. The ADK post-processor never receives anything to route to the sandbox.

This is why passing `code_executor=AgentEngineSandboxCodeExecutor(...)` to the agent constructor does not help: `code_executor` is an ADK-side construct only. It tells ADK where to send code once a response arrives; it does not communicate anything to the Vertex AI API server, which has no knowledge of the attached sandbox and enforces the tool-declaration contract at response validation time.

Solution:
Two complementary fixes, applied entirely within `google/adk/flows/llm_flows/_code_execution.py`, with no changes to any other ADK file and no public API changes:

Layer 1 — Pre-processor steering (`_run_pre_processor`): When the configured executor is a `BaseCodeExecutor` but not a `BuiltInCodeExecutor`, append a system instruction (`_NON_BUILTIN_EXECUTOR_INSTRUCTION`) to every outgoing request. The instruction explicitly tells the model to wrap Python code in `tool_code` markdown fences and forbids native `executable_code` emission, reducing the frequency with which the model triggers the API validator.
Layer 2 — Response-processor recovery (`_run_post_processor`): When the API still rejects the response with `UNEXPECTED_TOOL_CALL` or `MALFORMED_FUNCTION_CALL`, parse the rejected code out of `error_message` via `_extract_code_from_error_message`, reconstruct a synthetic `executable_code` part on `llm_response.content`, clear the error fields, and let the existing `extract_code_and_truncate_content` → `code_executor.execute_code` pipeline handle it exactly as if the model had emitted the part cleanly. Note that `extract_code_and_truncate_content` already supports `executable_code` parts; this recovery path simply gives it the chance to run.

Together, Layer 1 stops most rejections at the request side and Layer 2 rescues the cases where the model still emits a native tool call despite the steering, ensuring the full User → Model → Executor → Sandbox flow completes across Gemini 2.x.
Testing Plan
Unit Tests:
Four new test groups were added to `tests/unittests/flows/llm_flows/test_code_execution.py`:

- `test_extract_code_from_error_message_*`: `None` input, non-matching error message
- `test_maybe_recover_from_api_rejection_*`: `UNEXPECTED_TOOL_CALL` recovery, `MALFORMED_FUNCTION_CALL` recovery, unrecognised error code, missing error code, unparseable message
- `test_pre_processor_injects_instruction_*`: instruction injection behaviour, including the `BuiltInCodeExecutor` case
- `test_post_processor_recovery_*`: `BuiltInCodeExecutor` skips the recovery path entirely

Manual End-to-End (E2E) Tests:
Please refer to #5315 for the full reproduction script and setup steps. Using the reproduction case from that issue, the fix was verified against a live Vertex AI Agent Engine sandbox.