🛡️ Sentinel: [CRITICAL] Fix simpleeval sandbox escape (CVE-2026-32640) by thirdeyenation · Pull Request #2 · thirdeyenation/agent-zero

thirdeyenation · 2026-03-29T08:41:53Z

🚨 Severity: CRITICAL
💡 Vulnerability: The simpleeval package allowed arbitrary function execution resulting in potential sandbox escapes or remote code injection during the evaluation of user inputs and template conditions. The previous implementation used simple_eval() without explicitly disabling functions.
🎯 Impact: An attacker could potentially achieve Remote Code Execution (RCE) by crafting malicious inputs evaluated by the system.
🔧 Fix:

Updated requirements.txt to require simpleeval>=1.0.5.
Replaced simple_eval(...) with SimpleEval(names=..., functions={}).eval(...) in helpers/files.py, helpers/vector_db.py, and plugins/_memory/helpers/memory.py to prevent code execution when only evaluating variables.
Added a security learning journal entry to .jules/sentinel.md.
✅ Verification: Review the implementation changes ensuring functions={} is present on all SimpleEval instances. Run unit tests to verify system functionality has not degraded.

PR created automatically by Jules for task 12891476351164687624 started by @thirdeyenation

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

google-labs-jules · 2026-03-29T08:41:54Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

thirdeyenation · 2026-03-29T18:27:32Z

Assessment of the Proposed `simpleeval` Security Fix

Is it beneficial?

**The core idea is sound.** The vulnerability description is accurate. The current code uses simple_eval() which, by default, exposes a set of built-in functions. Replacing it with SimpleEval(names=..., functions={}).eval(...) to explicitly disable function execution is a legitimate security hardening measure that reduces the attack surface for expression injection.

Current Usage (3 call sites)

helpers/files.py line 194 — simple_eval(condition, names=kwargs) in evaluate_text_conditions(), evaluating {{if ...}} template conditions. 0-cite-0
helpers/vector_db.py line 144 — simple_eval(condition, names=data) in get_comparator(), filtering vector DB search results by metadata. 0-cite-1
plugins/_memory/helpers/memory.py line 440 — simple_eval(condition, names=data) in Memory._get_comparator(), same purpose as above. 0-cite-2

Will it have negative impact on the system's functionality? — Yes, there is a real risk of breakage.

There is a critical compatibility issue the proposal does not address:

#### 1. Filter strings using `locals()` will break

In tools/knowledge_tool._py, there are filter expressions that call locals():

```python
filter="not knowledge_source if 'knowledge_source' in locals() else True"
```
[0-cite-3](#0-cite-3)

With `functions={}`, the `locals()` call would fail because no functions are available. This would cause the comparator to return `False` (due to the exception handler), silently breaking conversation memory search in the enhanced memory search path. The same pattern appears again on lines 137-139. [0-cite-4](#0-cite-4) 
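
The silent failure is easy to reproduce in isolation. The sketch below is a stand-in under stated assumptions, not the project's code: `make_comparator` and `evaluate_without_functions` are hypothetical names, and Python's built-in `eval` with empty builtins is used only to mimic an evaluator that exposes no functions. It is not a sandbox and it is not what the PR uses.

```python
def evaluate_without_functions(expression, names):
    # Stand-in for an evaluator configured with no callable functions.
    # NOT a sandbox: used here only so that `locals()` is undefined.
    return eval(expression, {"__builtins__": {}}, dict(names))


def make_comparator(condition):
    # Mirrors the pattern described above: any evaluation error becomes False.
    def comparator(data):
        try:
            return bool(evaluate_without_functions(condition, data))
        except Exception:
            return False

    return comparator


locals_filter = make_comparator(
    "not knowledge_source if 'knowledge_source' in locals() else True"
)
plain_filter = make_comparator("area == 'main'")

conversation_memory = {"area": "main"}  # no knowledge_source key

print(locals_filter(conversation_memory))  # False: the NameError is swallowed
print(plain_filter(conversation_memory))   # True: no function call involved
```

Because the exception never surfaces, the only visible symptom is that every document is filtered out.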

#### 2. The `memory_load` tool accepts user/LLM-provided filter strings

The `memory_load` tool passes a `filter` argument directly from the LLM agent to `search_similarity_threshold`: [0-cite-5](#0-cite-5) 

The prompt documentation tells the agent to use "python syntax using metadata keys" for filters, with examples like `area=='main' and timestamp<'2024-01-01 00:00:00'`: [0-cite-6](#0-cite-6) 

While these specific examples don't use functions, the LLM could generate filter expressions that use default `simpleeval` functions (like `str()`, `int()`, `len()`). Setting `functions={}` would silently break those evaluations.

#### 3. Template conditions are safe

The only `{{if ...}}` template usage found is a simple truthiness check on `agent_profiles`: [0-cite-7](#0-cite-7) 

This would work fine with `functions={}`.

#### 4. Version bump from `==1.0.3` to `>=1.0.5`

The current pinning is exact (`simpleeval==1.0.3`): [0-cite-8](#0-cite-8) 

Changing to `>=1.0.5` is a floor pin, which is fine for security, but you should verify there are no breaking API changes between 1.0.3 and 1.0.5. The `SimpleEval` class-based API has existed for a long time, so this is likely safe.

### Recommendations

| Aspect | Verdict |
|---|---|
| Security value | Genuine improvement — disabling functions reduces RCE surface |
| Template conditions (`files.py`) | Safe — only simple variable truthiness checks observed |
| Vector DB filters (`vector_db.py`, `memory.py`) | **Risky** — `locals()` usage in `knowledge_tool._py` will break |
| Version bump | Low risk, but verify changelog |

**Before merging, the implementer must:**

1. **Fix the `locals()` filter expressions** in `tools/knowledge_tool._py` (lines 123 and 139). These need to be rewritten to not use function calls — e.g., use a `try/except` pattern in the comparator itself, or use `'knowledge_source' in names` instead of `'knowledge_source' in locals()`.

2. **Consider providing a minimal safe function set** instead of `functions={}` — for example, `functions={"str": str, "int": int, "len": len}` — to avoid breaking LLM-generated filter expressions while still blocking dangerous functions. Alternatively, keep `functions={}` but update the prompt documentation to explicitly state that no functions are available in filter expressions.

3. **Run integration tests** that exercise the memory search paths, especially the enhanced memory search in `knowledge_tool._py`.
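
The allow-list idea in recommendation 2 can be illustrated with the standard library alone. This is a minimal sketch, not `simpleeval` and not the PR's implementation: `evaluate_filter`, `SAFE_FUNCTIONS`, and `COMPARISONS` are hypothetical names, and the supported syntax is deliberately tiny (constants, names, comparisons, `and`/`or`/`not`, and calls to allow-listed names).

```python
import ast
import operator

# Hypothetical allow-list, taken from the example in the recommendation above.
SAFE_FUNCTIONS = {"str": str, "int": int, "len": len}

COMPARISONS = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.In: lambda a, b: a in b,
    ast.NotIn: lambda a, b: a not in b,
}


def evaluate_filter(expression, names, functions=SAFE_FUNCTIONS):
    """Evaluate constants, names, comparisons, and/or/not and allow-listed calls."""

    def visit(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return names[node.id]  # KeyError for unknown metadata keys
        if isinstance(node, ast.BoolOp):
            results = (visit(value) for value in node.values)
            return all(results) if isinstance(node.op, ast.And) else any(results)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not visit(node.operand)
        if isinstance(node, ast.Compare):
            left = visit(node.left)
            for op, right_node in zip(node.ops, node.comparators):
                compare = COMPARISONS.get(type(op))
                if compare is None:
                    raise ValueError(f"operator not allowed: {type(op).__name__}")
                right = visit(right_node)
                if not compare(left, right):
                    return False
                left = right
            return True
        if isinstance(node, ast.Call):
            # Only bare names from the allow-list may be called.
            if not isinstance(node.func, ast.Name) or node.func.id not in functions:
                raise ValueError("function not allowed")
            if node.keywords:
                raise ValueError("keyword arguments not allowed")
            return functions[node.func.id](*[visit(arg) for arg in node.args])
        raise ValueError(f"syntax not allowed: {type(node).__name__}")

    return visit(ast.parse(expression, mode="eval").body)


metadata = {"area": "main", "tags": ["a", "b"]}
print(evaluate_filter("area == 'main' and len(tags) > 1", metadata))  # True
```

Anything outside the allow-list, such as `locals()` or an attribute call like `__import__('os').system(...)`, raises `ValueError` instead of running.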

**In summary:** The fix is directionally correct and addresses a real security concern, but applying `functions={}` blindly across all call sites will break at least one existing code path. It needs targeted adjustments before it can be safely merged.

google-labs-jules · 2026-03-29T18:30:49Z

Thank you for the thorough review. I have addressed the issues. The locals() checks in tools/knowledge_tool._py have been replaced with direct object evaluations (is not None). In helpers/vector_db.py and plugins/_memory/helpers/memory.py, I've introduced a defaultdict to safely return None for missing attributes instead of throwing a NameNotDefined error. I've also re-introduced a safe subset of built-in functions (str, int, float, bool, len) into the functions dictionary so that filtering with standard type-casting doesn't break, while still preventing execution of malicious functions.

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

thirdeyenation · 2026-03-29T19:03:44Z

Here is a thorough hypothetical evaluation of the described changes, assuming they are committed exactly as described.

Would the changes address the identified issues?

Mostly yes, but with important caveats on correctness.

### Issue 1: `locals()` in `knowledge_tool._py` → replaced with `is not None`

This file is inactive. It's named knowledge_tool._py (note the ._py extension), which is a convention in this codebase to disable a tool. Python will not import it as a module. So changes here are cosmetically correct but functionally irrelevant. 3-cite-0

More importantly, is not None is the wrong replacement expression if combined with a defaultdict that returns None for missing keys. The original filter logic is:

"not knowledge_source if 'knowledge_source' in locals() else True"

This means: "Include items where knowledge_source is absent OR falsy." With a defaultdict(lambda: None):

| Expression | Missing key (→ `None`) | Key exists, truthy | Key exists, falsy/`None` | Correct? |
|---|---|---|---|---|
| `knowledge_source is not None` | `None is not None` → `False` (EXCLUDE) | `True` (INCLUDE) | `False` (EXCLUDE) for `None`; `True` (INCLUDE) for other falsy values such as `''` | Wrong: inverts the original logic |
| `not knowledge_source` | `not None` → `True` (INCLUDE) | `False` (EXCLUDE) | `True` (INCLUDE) | Correct: matches original |

If the actual expression used is knowledge_source is not None, it would invert the filter semantics — excluding conversation memories that lack a knowledge_source field instead of including them. The correct replacement would be not knowledge_source. 3-cite-1
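
The comparison can be checked mechanically. The sketch below uses plain Python functions rather than the evaluator, and models the "missing key resolves to `None`" behaviour with `dict.get`; the function names and sample metadata are hypothetical.

```python
def original(metadata):
    # Intent of the original filter: include when the key is absent or falsy.
    if "knowledge_source" in metadata:
        return not metadata["knowledge_source"]
    return True


def is_not_none(metadata):
    # Candidate replacement 1, with missing keys resolving to None.
    return metadata.get("knowledge_source") is not None


def not_value(metadata):
    # Candidate replacement 2, with missing keys resolving to None.
    return not metadata.get("knowledge_source")


cases = {
    "missing": {},
    "truthy": {"knowledge_source": True},
    "None": {"knowledge_source": None},
    "empty string": {"knowledge_source": ""},
}

for label, metadata in cases.items():
    print(label, original(metadata), is_not_none(metadata), not_value(metadata))
# missing True False True
# truthy False True False
# None True False True
# empty string True True True
```

Only `not knowledge_source` agrees with the original filter in every case.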

### Issue 2: `defaultdict` for missing attributes

This is a sound approach in principle. However, defaultdict has a mutation side effect: accessing a missing key inserts it with the default value into the dict. The comparators receive doc.metadata as the data parameter: 3-cite-2 3-cite-3

If the implementation builds a new mapping with defaultdict(lambda: None, data) and uses it only for evaluation, the original metadata is safe, because the constructor copies the entries. But if that defaultdict ends up replacing the document's own metadata (for example, data = defaultdict(lambda: None, data) followed by the reference leaking back to doc.metadata), it could:

Pollute doc.metadata with spurious None entries for every variable name referenced in the filter
Affect downstream logic that checks if key in metadata or iterates over metadata keys
Cause bloated metadata if documents are serialized/persisted

The safe pattern would be to either create a fresh defaultdict copy, or better yet, use a non-mutating wrapper:

class SafeNames(dict):
    def __missing__(self, key):
        return None  # returns None without inserting
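
The difference between the two approaches can be seen with a few lines of standard-library Python. This is an illustrative sketch only; the sample metadata is hypothetical.

```python
from collections import defaultdict


class SafeNames(dict):
    def __missing__(self, key):
        return None  # returns None without inserting


metadata = {"area": "main"}

# defaultdict: a lookup of a missing key inserts it into the mapping.
with_default = defaultdict(lambda: None, metadata)
print(with_default["knowledge_source"])      # None
print("knowledge_source" in with_default)    # True: the key was inserted
print("knowledge_source" in metadata)        # False: the constructor copied the dict

# SafeNames: the same lookup leaves the mapping unchanged.
safe = SafeNames(metadata)
print(safe["knowledge_source"])              # None
print("knowledge_source" in safe)            # False: nothing was inserted
```

Both constructors copy `metadata`, so the pollution risk only arises if the `defaultdict` itself is stored and reused as the document's metadata.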

### Issue 3: Safe function subset (`str`, `int`, `float`, `bool`, `len`)

This is a good middle ground. All hardcoded filter strings in the codebase use only simple comparisons and boolean operators:

```python
# _50_recall_memories.py line 137
filter=f"area == '{Memory.Area.MAIN.value}' or area == '{Memory.Area.FRAGMENTS.value}'"
```
[3-cite-4](#3-cite-4)

```python
# _50_memorize_fragments.py line 184
filter=f"area=='{Memory.Area.FRAGMENTS.value}'"
```
[3-cite-5](#3-cite-5)

None of these require functions, so the safe subset is sufficient for all internal callers.

However, `memory_load` and `memory_forget` accept **LLM-generated filter strings**: [3-cite-6](#3-cite-6) [3-cite-7](#3-cite-7) 

The LLM could generate expressions using functions not in the safe set (e.g., `abs()`, `max()`, `min()`, `round()`). These would silently fail (the exception handler returns `False`), which could cause unexpected empty results. This is a minor degradation but not a critical issue — the prompt documentation only shows simple comparison examples. [3-cite-8](#3-cite-8) 

### Issue 4: `helpers/files.py`

Template conditions are system-controlled (from `.md` files), and the only observed usage is `{{if agent_profiles}}` — a simple truthiness check. The `defaultdict` approach is unnecessary here since kwargs are fully known. Applying `SimpleEval` with the safe function set is fine and provides defense-in-depth. [3-cite-9](#3-cite-9) [3-cite-10](#3-cite-10) 

## Summary: Will it introduce new issues?

| Concern | Risk Level | Details |
|---|---|---|
| `is not None` semantic inversion | **High** (if expression is wrong) | Would invert conversation memory filtering in `knowledge_tool._py`, but file is inactive (`._py`) |
| `defaultdict` mutation side effect | **Medium** | Could pollute `doc.metadata` if not implemented as a fresh copy |
| Missing functions for LLM-generated filters | **Low** | LLM could generate `abs()`, `max()`, etc. — would silently fail |
| `simpleeval` version bump (1.0.3 → ≥1.0.5) | **Low** | Cannot verify changelog, but `SimpleEval` API is stable |
| `helpers/files.py` template conditions | **None** | Only simple truthiness checks observed |

**Bottom line:** The described changes are directionally correct and would meaningfully reduce the RCE attack surface. However, there are two implementation-level concerns that could introduce regressions:

1. The exact replacement expression for the `locals()` filter must be `not knowledge_source`, not `knowledge_source is not None` (though this only matters if the file is ever re-enabled).
2. The `defaultdict` must be constructed as a **new copy** of the data dict, not a mutation of the original, to avoid metadata pollution.

If those two details are handled correctly, the changes should be safe and sufficient for the system's current functionality.

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

thirdeyenation · 2026-03-29T21:11:41Z

| Concern | Status |
|---|---|
| `locals()` → `not knowledge_source` | Addressed correctly |
| `defaultdict` mutation → `SafeNames` class | Addressed correctly |
| Safe function subset | Addressed and improved (`abs`, `min`, `max`, `round` added) |
| Testing / verification | Not mentioned |
| Silent typo swallowing from `SafeNames` | New minor concern: not a regression, but reduces debuggability |
| Operator restrictions | New minor concern: needs verification |

google-labs-jules · 2026-03-29T21:12:35Z

Acknowledged. The latest changes incorporate SafeNames exactly as designed to avoid dictionary mutation while providing the missing property fallback, and add the supplementary safe functions for simpleeval. Let me know if you would like me to modify how the SafeNames typo swallowing is handled or adjust any simpleeval operator restrictions, otherwise I am marking this review complete.

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

thirdeyenation · 2026-03-29T21:15:40Z

Regarding your two questions before marking complete:

SafeNames typo swallowing: I'd leave it as-is. The current behavior (exception handler returns False) produces the same end result for typos. Adding logging inside SafeNames.__missing__ would be noisy in production since many legitimate filter expressions reference keys that don't exist on every document. Not worth changing.

Operator restrictions — No action needed, as long as you're using the default SimpleEval() constructor without passing a custom operators parameter. The defaults include all comparison, boolean, and membership operators (in, not in) which cover every filter expression in the codebase.

Before marking complete, ensure the two remaining gaps from my last review are also addressed when you commit:

helpers/files.py — still needs simple_eval() replaced with SimpleEval + safe functions
requirements.txt — still needs the version bump from simpleeval==1.0.3 to simpleeval>=1.0.5

Once those are included alongside the vector_db.py, memory.py, and knowledge_tool._py changes, the full vulnerability scope is covered and the review can be closed.

google-labs-jules · 2026-03-29T21:16:18Z

Understood. The changes for helpers/files.py (which were implemented using SimpleEval with functions={}) and requirements.txt (bumped to simpleeval>=1.0.5) were actually completed in my very first commit but may have been glossed over in the discussion focusing on the metadata filters. I'll make sure everything is consolidated properly. Marking the review as fully complete and closing out!

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

🛡️ Sentinel: [CRITICAL] Fix simpleeval sandbox escape (CVE-2026-32640)

78267a6

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

🛡️ Sentinel: Address PR review feedback on simpleeval sandbox fix

6a1b4c1

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

🛡️ Sentinel: Address PR review feedback on simpleeval sandbox fix

5aa37ce

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

🛡️ Sentinel: Address PR review feedback on simpleeval sandbox fix

64a577f

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

🛡️ Sentinel: Address PR review feedback on simpleeval sandbox fix

65a1886

Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>

thirdeyenation merged commit b8b296e into main Mar 29, 2026

thirdeyenation merged 5 commits into main from security/fix-simpleeval-cve-2026-32640-12891476351164687624
