fix: skip code extraction regex without delimiters #6002

Open
he-yufeng wants to merge 1 commit into google:main from he-yufeng:fix/code-executor-delimiter-scan

@he-yufeng
Contributor

Summary

Fixes #5992.

extract_code_and_truncate_content() runs the full code-block regex over joined model text. When the response has no opening code delimiter, that regex can only return None, but it still scans through the expensive prefix/suffix pattern.

This adds a cheap opening-delimiter check before compiling/searching the full regex. The normal extraction path is unchanged when a delimiter is present.
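The idea can be sketched roughly as follows. This is a minimal illustration, not the actual diff: the function name `find_code`, the `DELIMITERS` tuple, and the exact regex shape are assumptions made for the example, and the real `extract_code_and_truncate_content()` operates on model content rather than a plain string.

```python
from __future__ import annotations

import re

# Built programmatically so the example is easy to embed in Markdown.
FENCE = "`" * 3

# Assumed (opening, closing) delimiter pairs, for illustration only.
DELIMITERS = ((FENCE + "python\n", "\n" + FENCE),)


def find_code(text: str, delimiters=DELIMITERS) -> str | None:
    """Hypothetical helper: return the first delimited code block, or None."""
    # Cheap pre-check: a plain substring scan per opening delimiter.
    # If no opening delimiter is present, the full regex cannot match,
    # so skip compiling and searching it entirely.
    if not any(opening in text for opening, _ in delimiters):
        return None

    leading = "|".join(re.escape(opening) for opening, _ in delimiters)
    trailing = "|".join(re.escape(closing) for _, closing in delimiters)
    pattern = re.compile(
        rf"(?P<prefix>.*?)(?:{leading})(?P<code>.*?)(?:{trailing})(?P<suffix>.*?)$",
        re.DOTALL,
    )
    match = pattern.search(text)
    return match.group("code") if match else None
```

When an opening delimiter is present, the sketch falls through to the same regex search as before, which mirrors the claim that the normal extraction path is unchanged.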

Testing

python -m pytest tests\unittests\code_executors\test_code_execution_utils.py -q
python -m py_compile src\google\adk\code_executors\code_execution_utils.py tests\unittests\code_executors\test_code_execution_utils.py
python -m pyink --check src\google\adk\code_executors\code_execution_utils.py tests\unittests\code_executors\test_code_execution_utils.py
git diff --check

@adk-bot added the "tools" label ([Component] This issue is related to tools) on Jun 7, 2026
@adk-bot
Collaborator

adk-bot commented Jun 7, 2026

Response from ADK Triaging Agent

Hello @he-yufeng, thank you for creating this PR!

This is a great fix for the regex code extraction issue. We noticed that you've included a nice Testing section with the commands you ran, which is awesome. To help the reviewers fully verify and understand the fix:

  • Could you please include a brief summary/output log of the passed pytest results (e.g., the terminal output after running pytest on your new test file)?

This information will help our reviewers approve your PR more efficiently. Thanks!

@adk-bot
Collaborator

adk-bot commented Jun 7, 2026

  • I will view the adk-pr-analyze skill instructions to ensure I follow its protocols correctly.
  • I will now run the compliance verification script on PR #6002 to check if the contributor has signed the Google CLA. This is a mandatory safety and compliance step before any diff inspection or codebase verification takes place.
  • I will attempt to retrieve the PR metadata directly using the custom Python tool fetch_github_pr to proceed with compiling the necessary information for the analysis.
  • I will now fetch the linked issue #5992 to understand the original problem reported by the user, using the custom tool fetch_github_issue.
  • I will run the gh pr diff command to retrieve the actual diff of PR #6002 so that we can evaluate the user's implementation.
  • I will view the contents of src/google/adk/code_executors/code_execution_utils.py around line 144 using view_file to verify the codebase's current baseline state.
  • I will view the top lines of src/google/adk/code_executors/code_execution_utils.py using view_file to see its header, license block, and imports, especially focusing on whether and how typing annotations are imported.
  • I will view the adk-style skill instructions to check the Python design guidelines, import conventions, and formatting requirements.
  • I will search for document artifacts or style guidance references in the adk-style skill subdirectory using find_by_name.
  • I will view references/file-organization.md to check rules about license headers and from __future__ import annotations.
  • I will view references/testing.md using view_file to verify the code quality and structural expectations for testing.
  • I will run the newly added tests to verify that they work correctly within the repository.
  • I will perform a search in the codebase to inspect the standard delimiters used in code block extraction.
  • I will view src/google/adk/code_executors/base_code_executor.py around line 62 to see what the default code_block_delimiters are and how they are defined.
  • I will now write a comprehensive analysis report for PR #6002 as a markdown artifact inside the workspace's designated artifacts folder. This is a read-only analysis report that thoroughly reviews the bug fix, architectural compliance, code style, and test completeness.

I have successfully analyzed GitHub PR #6002 for the google/adk-python repository, including its linked issue #5992.

The detailed analysis across all dimensions—objectives, value, and architectural/style alignment—has been compiled into a premium analysis report.

📝 Analysis Report

Please review the complete, structured analysis here:
👉 analysis_results.md


Key Decision Points & Highlights

  1. Bug Severity: High. Without this fix, multi-hundred-kilobyte non-code responses processed through _CodeExecutionResponseProcessor trigger O(n²) regex backtracking inside extract_code_and_truncate_content, resulting in CPU exhaustion and apparent hangs.
  2. The Fix Quality: Excellent. The substring-check short-circuit (any(d[0] in response_text)) is incredibly fast, clean, and behavior-preserving.
  3. Recommendation: Approve with Nits. The proposed changes are outstanding and ready for integration once the contributor addresses two minor styling nits:
    • Add standard from __future__ import annotations at the top of the newly added test module test_code_execution_utils.py.
    • Add clear, descriptive test docstrings for the new test definitions.
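For illustration, the two nits could look roughly like this in a test module. This is a sketch under stated assumptions, not the contents of test_code_execution_utils.py: `has_opening_delimiter`, `DELIMS`, and the test names are hypothetical stand-ins, and the real tests would exercise the ADK utility itself.

```python
from __future__ import annotations

# Built programmatically so the example is easy to embed in Markdown.
FENCE = "`" * 3

# Assumed (opening, closing) delimiter pairs, for illustration only.
DELIMS = [(FENCE + "python\n", "\n" + FENCE)]


def has_opening_delimiter(text: str, delimiters: list[tuple[str, str]]) -> bool:
    """Hypothetical stand-in for the PR's pre-check; not the ADK function."""
    return any(opening in text for opening, _ in delimiters)


def test_large_text_without_delimiter_is_skipped() -> None:
    """A large delimiter-free response is rejected by the pre-check."""
    text = "lorem ipsum " * 50_000
    assert not has_opening_delimiter(text, DELIMS)


def test_text_with_delimiter_reaches_regex_path() -> None:
    """A response containing an opening delimiter is not short-circuited."""
    text = "see:\n" + FENCE + "python\nprint(1)\n" + FENCE
    assert has_opening_delimiter(text, DELIMS)
```

Both functions follow pytest's `test_` naming convention, so pytest would collect them as written; they can also be called directly.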

Summary of Work Done

Labels

tools [Component] This issue is related to tools

Development

Successfully merging this pull request may close these issues.

Code executor: catastrophic O(n^2) regex backtracking in extract_code_and_truncate_content hangs on large delimiter-free responses

2 participants