
Add a general purpose expression filter capability (PP-4422)#3373

Open
tdilauro wants to merge 9 commits into main from feature/general-purpose-filter

@tdilauro
Contributor

Description

Adds FilterExpression, a safe boolean expression evaluator backed by simpleeval. Expressions are Python-like DSL strings evaluated against a named context dictionary without using eval().

Key features:

  • Dot-access on dict values resolves to key lookup first (claim.ou instead of claim["ou"]).
  • Method calls are permitted on str, int, float, dict, list, and tuple; mutating methods (append, update, etc.) are blocked.
  • Additional types can be whitelisted via extra_safe_types.
  • missing_attribute_returns_false mode returns False instead of raising on missing attributes.
  • Thread-safe: a new evaluator is constructed per evaluate() call.
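
The dot-access and strict-boolean behaviour listed above can be illustrated with a small standalone sketch. This is not the PR's implementation (which is backed by simpleeval). It uses only the standard library `ast` module, and the names `tiny_evaluate` and `TinyFilterError` are hypothetical stand-ins. It also omits the fallback to ordinary attribute access, method calls, and the safe-type checks.

```python
import ast
import operator
from typing import Any


class TinyFilterError(Exception):
    """Hypothetical stand-in for the PR's FilterExpressionError."""


# Comparison operators supported by this sketch (a small subset).
_COMPARE_OPS = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Lt: operator.lt,
    ast.Gt: operator.gt,
    ast.In: lambda left, right: left in right,
}


def tiny_evaluate(
    expression: str,
    context: dict[str, Any],
    missing_attribute_returns_false: bool = False,
) -> bool:
    """Evaluate names, constants, dotted dict access, comparisons, and/or/not."""

    def ev(node: ast.AST) -> Any:
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in context:
                raise TinyFilterError(f"unknown name: {node.id}")
            return context[node.id]
        if isinstance(node, ast.Attribute):
            obj = ev(node.value)
            # Dot-access on a dict resolves to a key lookup.
            if isinstance(obj, dict) and node.attr in obj:
                return obj[node.attr]
            if missing_attribute_returns_false:
                return False
            raise TinyFilterError(f"missing attribute: {node.attr}")
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not ev(node.operand)
        if isinstance(node, ast.BoolOp):
            # Mirror Python's short-circuit semantics for and/or.
            result = ev(node.values[0])
            for value in node.values[1:]:
                stop = (not result) if isinstance(node.op, ast.And) else bool(result)
                if stop:
                    return result
                result = ev(value)
            return result
        if isinstance(node, ast.Compare):
            left = ev(node.left)
            for op, comparator in zip(node.ops, node.comparators):
                compare = _COMPARE_OPS.get(type(op))
                if compare is None:
                    raise TinyFilterError(f"unsupported operator: {type(op).__name__}")
                right = ev(comparator)
                if not compare(left, right):
                    return False
                left = right
            return True
        raise TinyFilterError(f"unsupported syntax: {type(node).__name__}")

    try:
        tree = ast.parse(expression.strip(), mode="eval")
    except SyntaxError as exc:
        raise TinyFilterError(f"cannot parse: {expression!r}") from exc

    result = ev(tree.body)
    # Strict gate: only a real bool is an acceptable filter result.
    if not isinstance(result, bool):
        raise TinyFilterError(f"non-boolean result: {result!r}")
    return result
```

With a context such as `{"claim": {"ou": "staff", "blocked": False}}`, the expression `claim.ou == "staff" and not claim.blocked` passes the gate, while a bare `claim.ou` is rejected because it yields a string rather than a bool.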

Motivation and Context

Provides a general purpose filtering primitive for cases where configuration-driven boolean rules need to be evaluated against structured data.

It is a prerequisite for new OIDC patron filtering capabilities and will replace similar capabilities in the SAML patron authentication integration. It may also be possible to refactor the patron blocking rules engine to use these capabilities, but that was not a primary use case.

[Jira PP-4422]

How Has This Been Tested?

  • Thorough unit tests for the new functionality.
  • All tests pass locally.
  • CI tests and other checks pass.

Checklist

  • N/A - I have updated the documentation accordingly.
  • All new and existing tests passed.

@claude
Contributor

claude Bot commented May 21, 2026

Claude finished @tdilauro's task in 5m 34s.


Code Review

  • Read changed files
  • Analyze implementation for bugs and issues
  • Post review findings

Summary

The implementation is correct and the security model is sound. _eval_attribute cleanly layers dict-key-shadow logic before the parent's attribute-access guard, the safe-type/mutation whitelist is properly enforced, and thread safety is maintained by constructing a fresh evaluator per evaluate() call. The not operator test gap from the prior review has been addressed in the latest commit (b6f5bcc). The findings below are all test-coverage gaps or docstring nits — no behavioral bugs found.

Details

Minor: src/palace/manager/util/filter.py:21-22

all and any are included in _SAFE_BUILTINS but have zero test coverage — not even a negative/error case. A simple parametrized entry like "all([x > 0, y > 0])" with a bool-returning expression would confirm they evaluate correctly and that the result passes the isinstance(result, bool) gate (since all()/any() return bool).

"any": any,
"len": len,

Minor: tests/manager/util/test_filter.py:374-388

int(), float(), and str() have error-path coverage (int('not-a-number'), int(float('inf'))) but no happy-path cases confirming that they work. Because they return non-bool values, a bare cast cannot satisfy the isinstance(result, bool) check downstream and must be composed into a comparison. A test like "int(x) == 5" with {"x": "5"} would document the intended usage.

"int('not-a-number') == 1",
{},
None,
None,
True,
id="value-error-raises",
),
pytest.param(
"int(float('inf')) == 1",
{},
None,
None,
True,
id="overflow-error-raises",
),
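
The happy path and the two error paths can be sketched in plain Python, outside the PR's evaluator. The helper `error_name` is a hypothetical convenience for this example only.

```python
from typing import Callable, Optional


def error_name(thunk: Callable[[], object]) -> Optional[str]:
    """Return the name of the exception raised by thunk, or None if it succeeds."""
    try:
        thunk()
    except Exception as exc:
        return type(exc).__name__
    return None


x = "5"

bare_cast = int(x)           # an int: would fail an isinstance(result, bool) gate
composed = int(x) == 5       # a bool: the intended usage

# The two error paths already covered by the existing tests.
value_error = error_name(lambda: int("not-a-number") == 1)
overflow_error = error_name(lambda: int(float("inf")) == 1)
```

In the PR these exceptions are presumably converted into the single custom exception type; the sketch only shows which underlying Python errors are involved.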

Minor: src/palace/manager/util/filter.py:40-54

_MUTATION_METHODS carries a maintenance comment ("if Python adds new named mutation methods … this set must be updated") but there is no defensive test validating it against the actual list/dict API surface. A test that asserts _MUTATION_METHODS is a superset of all non-dunder mutating names on list and dict would catch a gap introduced by a future Python version without relying on manual review.

_MUTATION_METHODS: frozenset[str] = frozenset(
    {
        "append",
        "clear",
        "extend",
        "insert",
        "pop",
        "popitem",
        "remove",
        "reverse",
        "setdefault",
        "sort",
        "update",
    }
)

Minor: tests/manager/util/test_filter.py — tuple context value untested

tuple appears in _ALWAYS_SAFE_TYPES at filter.py:31-33 and its methods are therefore permitted, but no test passes a tuple as a context value or calls a tuple method (e.g. "t.count(1) == 2" with {"t": (1, 2, 1)}). Without coverage the tuple branch in the safe-type check is never exercised.

_ALWAYS_SAFE_TYPES: frozenset[type[Any]] = frozenset(
    {float, int, str, dict, list, tuple}
)
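
A minimal sketch of how a safe-type check and the mutation blocklist might combine is shown below, including the tuple case the review asks about. The function `is_method_call_allowed` is hypothetical, and whether the PR matches on exact type or uses isinstance is not stated in this conversation. The sketch assumes exact type matching.

```python
from typing import Any, Iterable

_ALWAYS_SAFE_TYPES: frozenset[type] = frozenset({float, int, str, dict, list, tuple})
_MUTATION_METHODS: frozenset[str] = frozenset(
    {
        "append", "clear", "extend", "insert", "pop", "popitem",
        "remove", "reverse", "setdefault", "sort", "update",
    }
)


def is_method_call_allowed(
    obj: Any, method_name: str, extra_safe_types: Iterable[type] = ()
) -> bool:
    """Allow a method call only on a safe type and only if it is not a known mutator."""
    safe_types = _ALWAYS_SAFE_TYPES | frozenset(extra_safe_types)
    if type(obj) not in safe_types:  # assumption: exact type match
        return False
    if method_name.startswith("_"):
        return False
    return method_name not in _MUTATION_METHODS


# The tuple case suggested in the review: "t.count(1) == 2" with {"t": (1, 2, 1)}.
t = (1, 2, 1)
tuple_count_allowed = is_method_call_allowed(t, "count")
tuple_count_result = t.count(1) == 2
```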

Nit: src/palace/manager/util/filter.py:58

The docstring "Raised when a filter expression fails to parse or evaluate" omits the non-bool-result case, which is a distinct third condition (evaluate() raises when the expression returns e.g. an int without any evaluation error). Consider: "Raised when a filter expression cannot be compiled or evaluated, or returns a non-boolean result."

"""Raised when a filter expression fails to parse or evaluate."""

@codecov

codecov Bot commented May 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.35%. Comparing base (95a3a48) to head (b6f5bcc).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3373      +/-   ##
==========================================
+ Coverage   93.31%   93.35%   +0.04%     
==========================================
  Files         504      508       +4     
  Lines       46326    46497     +171     
  Branches     6325     6344      +19     
==========================================
+ Hits        43230    43409     +179     
+ Misses       2003     1999       -4     
+ Partials     1093     1089       -4     


@tdilauro tdilauro requested a review from a team May 21, 2026 05:16
Contributor

@dbernstein dbernstein left a comment

Code Review

Hi @tdilauro: nice work. I'm approving this with a few suggestions. I asked Claude to pay special attention to the functional overlap with the patron blocking rules engine. It may be worth creating a tech debt ticket in case we want to eliminate some of the duplication down the road.


Relationship to patron_blocking_rules/rule_engine.py

Both systems wrap simpleeval.EvalWithCompoundTypes and require a strict bool result, but they differ enough that they are not straightforwardly interchangeable:

| Concern | rule_engine.py (existing) | FilterExpression (this PR) |
|---|---|---|
| Context injection | {placeholder} syntax compiled to __v_key variable names | Plain dict keys become identifiers directly |
| Dict dot-access | Not supported (context is flat scalars) | First-class feature |
| Type safety | No method whitelist | _safe_types whitelist + mutation blocklist |
| Domain functions | age_in_years, int | General builtins (abs, all, any, len, min, max, int, float, str) |
| Admin-time validation | validate_rule_expression | check_syntax (parse-only) |
| Error fail-open | Caller (patron_blocking.py) is fail-open | evaluate() raises on any error |

The {placeholder} syntax is user-facing and already live, so replacing rule_engine.py with this would be a breaking change. These two systems can coexist.

There is genuine duplication worth noting:

  • Both construct a new evaluator per call for thread safety
  • Both gate on isinstance(result, bool)
  • Both convert various simpleeval exceptions into a single custom exception type

If/when FilterExpression matures, it would make sense to migrate rule_engine.py to build on top of it (with a custom functions map for age_in_years and the {placeholder} compilation layer on top).
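
To make the idea of a {placeholder} compilation layer concrete, here is a rough sketch of how such a layer could rewrite a rule into plain identifiers before handing it to a generic evaluator. The exact scheme used by rule_engine.py is not shown in this conversation, so the regular expression, the function name `compile_placeholders`, and the error behaviour are all assumptions. Only the `__v_key` naming comes from the comparison above.

```python
import re
from typing import Any

# Assumed placeholder shape: {identifier}
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def compile_placeholders(
    rule: str, values: dict[str, Any]
) -> tuple[str, dict[str, Any]]:
    """Rewrite {key} placeholders to __v_key names and build the matching context."""
    names: dict[str, Any] = {}

    def replace(match: re.Match[str]) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder {{{key}}}")
        variable = f"__v_{key}"
        names[variable] = values[key]
        return variable

    return _PLACEHOLDER.sub(replace, rule), names


expression, context = compile_placeholders(
    "{fines} > 10 and {status} == 'active'",
    {"fines": 12, "status": "active"},
)
```

The resulting `expression` and `context` could then be passed to an expression evaluator that treats context keys as identifiers. Note that this naive substitution would also rewrite braces that appear inside string literals, which a production version would need to handle.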


Security

The security model is sound overall.

  • _eval_attribute double-evaluation: node.value is evaluated twice — once in obj = self._eval(node.value) and again inside super()._eval_attribute(node). The comment correctly notes this is harmless for pure expressions, and the callable-dict guard fires before super() is reached on that path. Correct.

  • _MUTATION_METHODS as a blocklist is inherently a maintenance burden. The comment acknowledges this. Suggest adding a defensive test that validates this set covers all non-dunder mutating methods on list and dict, so a future Python version doesn't silently introduce a gap.

  • missing_attribute_returns_false + method chaining footgun is well-documented in the docstring and covered by the missing-attr-method-chain-raises test. No issue.


Code Quality

Strengths: Excellent docstrings (especially the missing_attribute_returns_false warning), correct use of frozendict/frozenset for constants, and the type: ignore[misc] with explanation comment follows project conventions exactly.

Issues:

  1. No functions parameter. rule_engine.py exposes make_evaluator(allowed_functions=...) precisely because callers need domain-specific functions. FilterExpression hardcodes _SAFE_BUILTINS. If this is meant to serve as a general primitive for OIDC/SAML filtering, callers will eventually need to inject custom functions. Recommend adding an optional functions parameter now.

  2. No admin-time validation equivalent. rule_engine.py has validate_rule_expression for a trial evaluation at save-time. check_syntax() only parses — it won't catch NameNotDefined, type errors, or non-bool results. If this class will back configuration-driven rules, a validate(context: dict[str, Any]) -> None method would be valuable. Not a blocker, but worth a follow-up ticket.

  3. functions=dict(_SAFE_BUILTINS) on every evaluate() call: minor, but worth confirming whether simpleeval mutates the functions dict. If it doesn't, the copy is unnecessary and _SAFE_BUILTINS (a frozendict) could be passed directly.


Test Coverage

Tests are thorough. A few gaps:

  • not operator — not tested (not x, not (a and b)).
  • all() and any() builtins — present in _SAFE_BUILTINS but no test cases.
  • int(), float(), str() cast builtins — same gap.
  • Tuple values: tuple is in _ALWAYS_SAFE_TYPES but there's no test exercising a tuple context value.
  • check_syntax boundary: a test showing that check_syntax() passes but evaluate() raises on a simpleeval-disallowed construct (e.g. import os) would document this boundary explicitly.
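
The check_syntax boundary can be illustrated with the standard library alone, under the assumption that a parse-only check behaves like `ast.parse`. How check_syntax() is actually implemented is not shown in this conversation, and the helper names below are hypothetical.

```python
import ast


def parses(source: str) -> bool:
    """Parse-only check: True if the source is syntactically valid Python."""
    try:
        ast.parse(source.strip())
    except SyntaxError:
        return False
    return True


def is_single_expression(source: str) -> bool:
    """Stricter check: the source must be exactly one expression, not a statement."""
    try:
        body = ast.parse(source.strip()).body
    except SyntaxError:
        return False
    return len(body) == 1 and isinstance(body[0], ast.Expr)


import_parses = parses("import os")                       # valid syntax
import_is_expression = is_single_expression("import os")  # but a statement
```

A parse-only check therefore accepts `import os`, and rejection has to come from a later stage that refuses non-expression nodes.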

Minor Nits

  • FilterExpressionError.__doc__: "fails to parse or evaluate" doesn't cover the non-bool-result case. Consider: "Raised when a filter expression cannot be compiled or evaluated, or returns a non-boolean result."

Summary

The implementation is correct, well-documented, and follows project conventions. The duplication with rule_engine.py is real but scoped appropriately — they serve different expression languages with different user-facing contracts. Most actionable before merge:

  1. Add a functions parameter to FilterExpression to make it genuinely general-purpose.
  2. Add tests for not, all()/any(), the builtin cast functions, and tuple values.
  3. Consider a defensive test that validates _MUTATION_METHODS against the actual list/dict API surface.

@tdilauro
Contributor Author

@dbernstein Thanks for the review.

I mentioned the possibility of more integration in the original PR description above:

"It may also be possible to refactor the patron blocking rules engine to use these capabilities, but that was not a primary use case."

@dbernstein
Contributor

dbernstein commented May 21, 2026

@tdilauro: I think it is good as is. Addressing the test gaps is probably overkill.
