π‘οΈ Sentinel: [CRITICAL] Fix simpleeval sandbox escape (CVE-2026-32640)#2
Conversation
Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>
|
π Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a π emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
Assessment of the Proposed
|
Thank you for the thorough review. I have addressed the issues. The |
Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>
|
Here is a thorough hypothetical evaluation of the described changes, assuming they are committed exactly as described. Would the changes address the identified issues?Mostly yes, but with important caveats on correctness. Issue 1:
|
| Expression | Missing key (β None) |
Key exists, truthy | Key exists, falsy/None |
Correct? |
|---|---|---|---|---|
knowledge_source is not None |
None is not None β False (EXCLUDE) |
True (INCLUDE) |
False (EXCLUDE) |
Wrong β inverts the original logic |
not knowledge_source |
not None β True (INCLUDE) |
False (EXCLUDE) |
True (INCLUDE) |
Correct β matches original |
If the actual expression used is knowledge_source is not None, it would invert the filter semantics β excluding conversation memories that lack a knowledge_source field instead of including them. The correct replacement would be not knowledge_source. 3-cite-1
Issue 2: defaultdict for missing attributes
This is a sound approach in principle. However, defaultdict has a mutation side effect: accessing a missing key inserts it with the default value into the dict. The comparators receive doc.metadata as the data parameter: 3-cite-2 3-cite-3
If the implementation wraps the original metadata dict like defaultdict(lambda: None, data) (creating a new dict), the original metadata is safe. But if it converts in-place like data = defaultdict(lambda: None, data) and the reference leaks back, it could:
- Pollute
doc.metadatawith spuriousNoneentries for every variable name referenced in the filter - Affect downstream logic that checks
if key in metadataor iterates over metadata keys - Cause bloated metadata if documents are serialized/persisted
The safe pattern would be to either create a fresh defaultdict copy, or better yet, use a non-mutating wrapper:
class SafeNames(dict):
def __missing__(self, key):
return None # returns None without insertingIssue 3: Safe function subset (str, int, float, bool, len)
This is a good middle ground. All hardcoded filter strings in the codebase use only simple comparisons and boolean operators:
# _50_recall_memories.py line 137
filter=f"area == '{Memory.Area.MAIN.value}' or area == '{Memory.Area.FRAGMENTS.value}'"
``` [3-cite-4](#3-cite-4)
```python
# _50_memorize_fragments.py line 184
filter=f"area=='{Memory.Area.FRAGMENTS.value}'"
``` [3-cite-5](#3-cite-5)
None of these require functions, so the safe subset is sufficient for all internal callers.
However, `memory_load` and `memory_forget` accept **LLM-generated filter strings**: [3-cite-6](#3-cite-6) [3-cite-7](#3-cite-7)
The LLM could generate expressions using functions not in the safe set (e.g., `abs()`, `max()`, `min()`, `round()`). These would silently fail (the exception handler returns `False`), which could cause unexpected empty results. This is a minor degradation but not a critical issue β the prompt documentation only shows simple comparison examples. [3-cite-8](#3-cite-8)
### Issue 4: `helpers/files.py`
Template conditions are system-controlled (from `.md` files), and the only observed usage is `{{if agent_profiles}}` β a simple truthiness check. The `defaultdict` approach is unnecessary here since kwargs are fully known. Applying `SimpleEval` with the safe function set is fine and provides defense-in-depth. [3-cite-9](#3-cite-9) [3-cite-10](#3-cite-10)
## Summary: Will it introduce new issues?
| Concern | Risk Level | Details |
|---|---|---|
| `is not None` semantic inversion | **High** (if expression is wrong) | Would invert conversation memory filtering in `knowledge_tool._py`, but file is inactive (`._py`) |
| `defaultdict` mutation side effect | **Medium** | Could pollute `doc.metadata` if not implemented as a fresh copy |
| Missing functions for LLM-generated filters | **Low** | LLM could generate `abs()`, `max()`, etc. β would silently fail |
| `simpleeval` version bump (1.0.3 β β₯1.0.5) | **Low** | Cannot verify changelog, but `SimpleEval` API is stable |
| `helpers/files.py` template conditions | **None** | Only simple truthiness checks observed |
**Bottom line:** The described changes are directionally correct and would meaningfully reduce the RCE attack surface. However, there are two implementation-level concerns that could introduce regressions:
1. The exact replacement expression for the `locals()` filter must be `not knowledge_source`, not `knowledge_source is not None` (though this only matters if the file is ever re-enabled).
2. The `defaultdict` must be constructed as a **new copy** of the data dict, not a mutation of the original, to avoid metadata pollution.
If those two details are handled correctly, the changes should be safe and sufficient for the system's current functionality.Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>
|
Acknowledged. The latest changes incorporate |
Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>
|
Regarding your two questions before marking complete: SafeNames typo swallowing β I'd leave it as-is. The current behavior (exception handler returns False) produces the same end result for typos. Adding logging inside SafeNames.missing would be noisy in production since many legitimate filter expressions reference keys that don't exist on every document. Not worth changing. Operator restrictions β No action needed, as long as you're using the default SimpleEval() constructor without passing a custom operators parameter. The defaults include all comparison, boolean, and membership operators (in, not in) which cover every filter expression in the codebase. Before marking complete, ensure the two remaining gaps from my last review are also addressed when you commit: Once those are included alongside the vector_db.py, memory.py, and knowledge_tool._py changes, the full vulnerability scope is covered and the review can be closed. |
Understood. The changes for |
Co-authored-by: thirdeyenation <133812267+thirdeyenation@users.noreply.github.com>
π¨ Severity: CRITICAL
π‘ Vulnerability: The
simpleevalpackage allowed arbitrary function execution resulting in potential sandbox escapes or remote code injection during the evaluation of user inputs and template conditions. The previous implementation usedsimple_eval()without explicitly disablingfunctions.π― Impact: An attacker could potentially achieve Remote Code Execution (RCE) by crafting malicious inputs evaluated by the system.
π§ Fix:
requirements.txtto requiresimpleeval>=1.0.5.simple_eval(...)withSimpleEval(names=..., functions={}).eval(...)inhelpers/files.py,helpers/vector_db.py, andplugins/_memory/helpers/memory.pyto prevent code execution when only evaluating variables..jules/sentinel.md.β Verification: Review the implementation changes ensuring
functions={}is present on allSimpleEvalinstances. Run unit tests to verify system functionality has not degraded.PR created automatically by Jules for task 12891476351164687624 started by @thirdeyenation