Version: google-adk 1.28.0 (the regex is unchanged on the 2.x line as well)
What happens
When an LlmAgent has a code_executor, _CodeExecutionResponseProcessor runs CodeExecutionUtils.extract_code_and_truncate_content on every model response. The regex (in google/adk/code_executors/code_execution_utils.py):
pattern = re.compile(
rf'(?P<prefix>.*?)({leading_delimiter_pattern})(?P<code>.*?)({trailing_delimiter_pattern})(?P<suffix>.*?)$'.encode(),
re.DOTALL,
)
pattern_match = pattern.search(response_text.encode())
is run via search() on the full joined response text. When the text contains no leading delimiter (e.g. a large structured/JSON response returned by a before_model_callback short-circuit, which still flows through the response post-processors), every start position rescans the whole string looking for the delimiter → O(n²). On a few-hundred-KB response this stalls for tens of minutes — effectively a hang, and no per-turn timeout catches it because it is a synchronous CPU/regex call inside the worker thread.
Reproduce
- Attach any
BaseCodeExecutor (e.g. UnsafeLocalCodeExecutor) to an LlmAgent.
- Have the model (or a
before_model_callback) emit a single text part of ~300KB that contains no ```python / ```tool_code fence (e.g. a large Plotly figure JSON).
- The post-processor hangs in
pattern.search while extracting code.
Suggested fix
Before compiling/searching, short-circuit when no leading delimiter substring is present — the regex can only return None in that case, so this is behavior-preserving and makes the no-match path O(n):
text_parts = [p for p in content.parts if p.text]
if text_parts:
response_text = '\n'.join(p.text for p in text_parts)
if not any(d[0] in response_text for d in code_block_delimiters):
return None # no opening delimiter -> no code block
(The default delimiters '```tool_code\n' and '```python\n' are literal substrings, so a plain in check is sufficient.)
Version: google-adk 1.28.0 (the regex is unchanged on the 2.x line as well)
What happens
When an
LlmAgenthas acode_executor,_CodeExecutionResponseProcessorrunsCodeExecutionUtils.extract_code_and_truncate_contenton every model response. The regex (ingoogle/adk/code_executors/code_execution_utils.py):is run via
search()on the full joined response text. When the text contains no leading delimiter (e.g. a large structured/JSON response returned by abefore_model_callbackshort-circuit, which still flows through the response post-processors), every start position rescans the whole string looking for the delimiter → O(n²). On a few-hundred-KB response this stalls for tens of minutes — effectively a hang, and no per-turn timeout catches it because it is a synchronous CPU/regex call inside the worker thread.Reproduce
BaseCodeExecutor(e.g.UnsafeLocalCodeExecutor) to anLlmAgent.before_model_callback) emit a single text part of ~300KB that contains no```python/```tool_codefence (e.g. a large Plotly figure JSON).pattern.searchwhile extracting code.Suggested fix
Before compiling/searching, short-circuit when no leading delimiter substring is present — the regex can only return
Nonein that case, so this is behavior-preserving and makes the no-match path O(n):(The default delimiters
'```tool_code\n'and'```python\n'are literal substrings, so a plainincheck is sufficient.)