Code executor: catastrophic O(n^2) regex backtracking in extract_code_and_truncate_content hangs on large delimiter-free responses #5992

@RexLeee

Version: google-adk 1.28.0 (the regex is unchanged on the 2.x line as well)

What happens

When an LlmAgent has a code_executor, _CodeExecutionResponseProcessor runs CodeExecutionUtils.extract_code_and_truncate_content on every model response. The regex (in google/adk/code_executors/code_execution_utils.py):

pattern = re.compile(
    rf'(?P<prefix>.*?)({leading_delimiter_pattern})(?P<code>.*?)({trailing_delimiter_pattern})(?P<suffix>.*?)$'.encode(),
    re.DOTALL,
)
pattern_match = pattern.search(response_text.encode())

is run via search() on the full joined response text. When the text contains no leading delimiter (e.g. a large structured/JSON response returned by a before_model_callback short-circuit, which still flows through the response post-processors), search() retries the match at every start position, and each attempt lazily scans the rest of the string looking for the delimiter, so the total work is O(n²). On a few-hundred-KB response this stalls for tens of minutes, which is effectively a hang, and no per-turn timeout catches it because it is a synchronous, CPU-bound regex call inside the worker thread.
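The scaling can be checked outside ADK with a minimal sketch like the one below. It rebuilds a pattern of the same shape and times search() on delimiter-free input. The helper names (build_pattern, time_no_match) are hypothetical, and the trailing delimiter (a newline followed by a fence) is an assumption for illustration; only the leading delimiters come from this report. If the analysis above is right, the time should roughly quadruple each time the input doubles.

```python
import re
import time

FENCE = "`" * 3  # three backticks, built at runtime to keep this snippet fence-safe

# Leading delimiters are the ones named in this report; the trailing
# delimiter (newline + fence) is an assumption for illustration.
DELIMITERS = [
    (FENCE + "tool_code\n", "\n" + FENCE),
    (FENCE + "python\n", "\n" + FENCE),
]


def build_pattern(delimiters):
    """Build a bytes pattern with the same shape as the one quoted above."""
    leading = "|".join(d[0] for d in delimiters)
    trailing = "|".join(d[1] for d in delimiters)
    return re.compile(
        rf"(?P<prefix>.*?)({leading})(?P<code>.*?)({trailing})(?P<suffix>.*?)$".encode(),
        re.DOTALL,
    )


def time_no_match(n, pattern):
    """Return the seconds spent searching n bytes that contain no delimiter."""
    payload = b"x" * n
    start = time.perf_counter()
    result = pattern.search(payload)
    elapsed = time.perf_counter() - start
    assert result is None  # nothing to find, so the full scan cost is paid
    return elapsed


if __name__ == "__main__":
    pattern = build_pattern(DELIMITERS)
    # Small sizes only: the cost grows quadratically with the input length.
    for n in (1_000, 2_000, 4_000, 8_000):
        print(f"{n:>6} bytes: {time_no_match(n, pattern):.4f} s")
```

The sizes are kept small on purpose. Extrapolating the same growth to a few hundred KB is what produces the multi-minute stall described above.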

Reproduce

  1. Attach any BaseCodeExecutor (e.g. UnsafeLocalCodeExecutor) to an LlmAgent.
  2. Have the model (or a before_model_callback) emit a single text part of ~300KB that contains no ```python / ```tool_code fence (e.g. a large Plotly figure JSON).
  3. The post-processor hangs in pattern.search while extracting code.

Suggested fix

Before compiling/searching, short-circuit when no leading delimiter substring is present. In that case search() can only return None, so the check is behavior-preserving and makes the no-match path O(n):

text_parts = [p for p in content.parts if p.text]
if text_parts:
    response_text = '\n'.join(p.text for p in text_parts)
    if not any(d[0] in response_text for d in code_block_delimiters):
        return None  # no opening delimiter -> no code block

(The default delimiters '```tool_code\n' and '```python\n' are literal substrings, so a plain in check is sufficient.)
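As a standalone illustration of the guard, here is a sketch that works on a plain string rather than on content.parts. The function find_code_block is a hypothetical stand-in and not the ADK method, and the trailing delimiter is again an assumption. It shows that the early return leaves the matching path unchanged while the delimiter-free path reduces to substring checks.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime to keep this snippet fence-safe

# Leading delimiters as named in this report; the trailing delimiter
# (newline + fence) is an assumption for illustration.
DELIMITERS = [
    (FENCE + "tool_code\n", "\n" + FENCE),
    (FENCE + "python\n", "\n" + FENCE),
]


def find_code_block(response_text, code_block_delimiters):
    """Hypothetical helper: return the first fenced code block, or None."""
    # Linear-time guard: with no opening delimiter present, the regex
    # below could not match, so skip it entirely.
    if not any(lead in response_text for lead, _ in code_block_delimiters):
        return None

    leading = "|".join(lead for lead, _ in code_block_delimiters)
    trailing = "|".join(trail for _, trail in code_block_delimiters)
    pattern = re.compile(
        rf"(?P<prefix>.*?)({leading})(?P<code>.*?)({trailing})(?P<suffix>.*?)$".encode(),
        re.DOTALL,
    )
    match = pattern.search(response_text.encode())
    return match.group("code").decode() if match else None


if __name__ == "__main__":
    # ~300 KB with no fence: returns immediately instead of stalling.
    print(find_code_block("x" * 300_000, DELIMITERS))
    fenced = "intro\n" + FENCE + "python\nprint(1)\n" + FENCE + "\ndone"
    print(find_code_block(fenced, DELIMITERS))
```

The guard relies on the delimiters being literal strings, as noted above. If a caller passed delimiters containing regex syntax, a substring check would no longer be equivalent to the pattern.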

Labels: tools ([Component] This issue is related to tools)
