Add a general purpose expression filter capability (PP-4422)#3373
Add a general purpose expression filter capability (PP-4422)#3373tdilauro wants to merge 9 commits into
Conversation
|
Claude finished @tdilauro's task in 5m 34s —— View job Code Review
SummaryThe implementation is correct and the security model is sound. DetailsMinor:
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3373 +/- ##
==========================================
+ Coverage 93.31% 93.35% +0.04%
==========================================
Files 504 508 +4
Lines 46326 46497 +171
Branches 6325 6344 +19
==========================================
+ Hits 43230 43409 +179
+ Misses 2003 1999 -4
+ Partials 1093 1089 -4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
dbernstein
left a comment
There was a problem hiding this comment.
Code Review
Hi @tdilauro : nice work. I'm giving this an approval with a few suggestions. I asked Claude to give special attention to the functional overlap with the patron blocking rules engine. It may be worth creating a tech debt ticket in case we want to eliminate some of the duplication down the road.
Relationship to patron_blocking_rules/rule_engine.py
Both systems wrap simpleeval.EvalWithCompoundTypes and require a strict bool result, but they differ enough that they are not straightforwardly interchangeable:
| Concern | rule_engine.py (existing) |
FilterExpression (this PR) |
|---|---|---|
| Context injection | {placeholder} syntax compiled to __v_key variable names |
Plain dict keys become identifiers directly |
| Dict dot-access | Not supported (context is flat scalars) | First-class feature |
| Type safety | No method whitelist | _safe_types whitelist + mutation blocklist |
| Domain functions | age_in_years, int |
General builtins (abs, all, any, len, min, max, int, float, str) |
| Admin-time validation | validate_rule_expression |
check_syntax (parse-only) |
| Error fail-open | Caller (patron_blocking.py) is fail-open |
evaluate() raises on any error |
The {placeholder} syntax is user-facing and already live, so replacing rule_engine.py with this would be a breaking change. These two systems can coexist.
There is genuine duplication worth noting:
- Both construct a new evaluator per call for thread safety
- Both gate on
isinstance(result, bool) - Both convert various
simpleevalexceptions into a single custom exception type
If/when FilterExpression matures, it would make sense to migrate rule_engine.py to build on top of it (with a custom functions map for age_in_years and the {placeholder} compilation layer on top).
Security
The security model is sound overall.
-
_eval_attributedouble-evaluation:node.valueis evaluated twice — once inobj = self._eval(node.value)and again insidesuper()._eval_attribute(node). The comment correctly notes this is harmless for pure expressions, and the callable-dict guard fires beforesuper()is reached on that path. Correct. -
_MUTATION_METHODSas a blocklist is inherently a maintenance burden. The comment acknowledges this. Suggest adding a defensive test that validates this set covers all non-dunder mutating methods onlistanddict, so a future Python version doesn't silently introduce a gap. -
missing_attribute_returns_false+ method chaining footgun is well-documented in the docstring and covered by themissing-attr-method-chain-raisestest. No issue.
Code Quality
Strengths: Excellent docstrings (especially the missing_attribute_returns_false warning), correct use of frozendict/frozenset for constants, and the type: ignore[misc] with explanation comment follows project conventions exactly.
Issues:
-
No
functionsparameter.rule_engine.pyexposesmake_evaluator(allowed_functions=...)precisely because callers need domain-specific functions.FilterExpressionhardcodes_SAFE_BUILTINS. If this is meant to serve as a general primitive for OIDC/SAML filtering, callers will eventually need to inject custom functions. Recommend adding an optionalfunctionsparameter now. -
No admin-time validation equivalent.
rule_engine.pyhasvalidate_rule_expressionfor a trial evaluation at save-time.check_syntax()only parses — it won't catchNameNotDefined, type errors, or non-bool results. If this class will back configuration-driven rules, avalidate(context: dict[str, Any]) -> Nonemethod would be valuable. Not a blocker, but worth a follow-up ticket. -
functions=dict(_SAFE_BUILTINS)on everyevaluate()call: minor, but worth confirming whethersimpleevalmutates thefunctionsdict. If it doesn't, the copy is unnecessary and_SAFE_BUILTINS(afrozendict) could be passed directly.
Test Coverage
Tests are thorough. A few gaps:
notoperator — not tested (not x,not (a and b)).all()andany()builtins — present in_SAFE_BUILTINSbut no test cases.int(),float(),str()cast builtins — same gap.- Tuple values —
tupleis in_ALWAYS_SAFE_TYPESbut there's no test exercising a tuple context value. check_syntaxboundary: a test showing thatcheck_syntax()passes butevaluate()raises on asimpleeval-disallowed construct (e.g.import os) would document this boundary explicitly.
Minor Nits
FilterExpressionError.__doc__: "fails to parse or evaluate" doesn't cover the non-bool-result case. Consider: "raised when a filter expression cannot be compiled or evaluated, or returns a non-boolean result."
Summary
The implementation is correct, well-documented, and follows project conventions. The duplication with rule_engine.py is real but scoped appropriately — they serve different expression languages with different user-facing contracts. Most actionable before merge:
- Add a
functionsparameter toFilterExpressionto make it genuinely general-purpose. - Add tests for
not,all()/any(), the builtin cast functions, and tuple values. - Consider a defensive test that validates
_MUTATION_METHODSagainst the actuallist/dictAPI surface.
|
@dbernstein Thanks for the review. I mentioned the possibility of more integration in the original PR description above:
|
|
@tdilauro : I think it is good as is. Addressing the test gaps is probably overkill. |
Description
Adds
FilterExpression, a safe boolean expression evaluator backed bysimpleeval. Expressions are Python-like DSL strings evaluated against a named context dictionary without usingeval().Key features:
claim.ouinstead ofclaim["ou"]).str,int,float,dict,list, andtuple; mutating methods (append,update, etc.) are blocked.extra_safe_types.missing_attribute_returns_falsemode returnsFalseinstead of raising on missing attributes.evaluate()call.Motivation and Context
Provides a general purpose filtering primitive for cases where configuration-driven boolean rules need to be evaluated against structured data.
It is a prerequisite for new OIDC patron filtering capabilities and will replace similar capabilities in the SAML patron authentication integration. It may also be possible to refactor the patron blocking rules engine to use these capabilities, but that was not a primary use case.
[Jira PP-4422]
How Has This Been Tested?
Checklist