Skip to content

Conversation

@omertuc
Copy link
Contributor

@omertuc omertuc commented Sep 2, 2025

Description

When we originally added the JWT role extraction operators, we made some basic ones which thought would be enough, but we found that there's too much complexity and variance in the various fields we find in Red Hat SSO tokens, so we realized we're going to need a regex operator to handle more complex matching scenarios

This commit simply adds a new MATCH operator to the JsonPathOperator enum, and implements the logic to handle it in the JwtRolesResolver.

_evaluate_operator had to be modified to accept the entire JwtRoleRule so that it can access the pre-compiled regex pattern for better performance.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #
  • Closes #

MGMT-21654

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Tested manually, wrote unit-tests

Summary by CodeRabbit

  • New Features
    • Added regex-based MATCH operator for role rules with iterable support for pattern matching (e.g., email domains).
    • Updated operator semantics for equals/contains/in and consistent negation handling affecting role evaluation.
  • Bug Fixes
    • Improved validation with clear errors for invalid or non-string regex patterns.
  • Tests
    • Added comprehensive tests covering MATCH behavior, validation failures, and compiled-pattern availability.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 2, 2025

Walkthrough

Refactors JwtRolesResolver to pass the full JwtRoleRule to its operator evaluator, consolidates operator dispatch and negation there, and adds a new MATCH operator with regex validation and a compiled_pattern on JwtRoleRule. Unit tests updated to exercise MATCH semantics and related validations.

Changes

Cohort / File(s) Summary
Authorization resolver refactor & MATCH semantics
src/authorization/resolvers.py
_evaluate_operator signature changed to accept JwtRoleRule and match. Dispatch now uses rule.operator. Implements EQUALS, CONTAINS, IN semantics with operands adjusted, adds MATCH handling using rule.compiled_regex, applies rule.negate once after evaluation, and updates call sites in evaluate_role_rules.
Config model: operator enum, validation, compiled regex
src/models/config.py
Adds JsonPathOperator.MATCH. Adds JwtRoleRule.check_regex_pattern to validate MATCH values (must be string and valid regex) and a compiled_regex cached_property returning a compiled Pattern[str] when applicable.
Tests: MATCH behavior and validation
tests/unit/authorization/test_resolvers.py
Adds tests for positive/negative MATCH regex behavior, invalid regex and non-string-pattern error cases, verifies non-string values don't match, and asserts compiled_regex present for MATCH and None otherwise. Adds necessary imports/helpers.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller as Caller
  participant Resolver as JwtRolesResolver
  participant Rule as JwtRoleRule
  participant Claims as JWT Claims

  Caller->>Resolver: evaluate_role_rules(rules, token)
  Resolver->>Claims: extract values via JSONPath
  loop for each extracted match
    Resolver->>Resolver: _evaluate_operator(rule, match)
    alt rule.operator == MATCH
      Resolver->>Rule: access compiled_regex
      alt compiled_regex exists
        Note right of Resolver #DFF2E1: iterate or test single value
        Resolver->>Resolver: apply compiled_regex.match (any true => result=true)
      else
        Resolver->>Resolver: result = False
      end
    else Other operators
      Resolver->>Resolver: evaluate EQUALS / CONTAINS / IN semantics
    end
    alt rule.negate
      Resolver->>Resolver: result = not result
    end
  end
  Resolver-->>Caller: final boolean decision
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I twitch my whiskers at a MATCH,
Regex trails in every patch.
Rules bundled whole, no scattered bits,
Negation flips and logic fits.
Tests hop in—secure the latch.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@omertuc omertuc marked this pull request as draft September 2, 2025 09:56
@omertuc
Copy link
Contributor Author

omertuc commented Sep 2, 2025

Draft because ran into some issues with some recent changes

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a0d12e2 and 1202be9.

📒 Files selected for processing (3)
  • src/authorization/resolvers.py (2 hunks)
  • src/models/config.py (4 hunks)
  • tests/unit/authorization/test_resolvers.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/unit/authorization/test_resolvers.py (2)
src/models/config.py (3)
  • JwtRoleRule (241-299)
  • JsonPathOperator (232-238)
  • compiled_regex (295-299)
src/authorization/resolvers.py (4)
  • JwtRolesResolver (53-122)
  • resolve_roles (29-30)
  • resolve_roles (36-39)
  • resolve_roles (60-67)
src/authorization/resolvers.py (1)
src/models/config.py (3)
  • JwtRoleRule (241-299)
  • JsonPathOperator (232-238)
  • compiled_regex (295-299)
🪛 GitHub Actions: Unit tests
tests/unit/authorization/test_resolvers.py

[error] 141-144: pytest step failed: tests/unit/authorization/test_resolvers.py::TestJwtRolesResolver::test_resolve_roles_match_operator_non_string_pattern - expected ValueError with message containing 'MATCH operator requires a string regex pattern', but got ValidationError: 'MATCH operator requires a string pattern, int'.

🪛 GitHub Actions: Pyright
src/authorization/resolvers.py

[error] 115-115: Pyright error during 'uv run pyright src': Cannot access attribute 'search' for class 'MethodType'. Attribute 'search' is unknown (reportAttributeAccessIssue).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-pr
🔇 Additional comments (11)
src/models/config.py (2)

4-6: Imports for regex support — LGTM.

Also applies to: 15-15


238-239: New JsonPathOperator.MATCH — LGTM.

src/authorization/resolvers.py (2)

74-77: Passing the full rule into the evaluator — LGTM.


119-122: Negation after evaluation — LGTM.

tests/unit/authorization/test_resolvers.py (7)

5-8: Test imports — LGTM.


81-103: MATCH positive path — LGTM.


104-126: MATCH negative path — LGTM.


127-138: Invalid regex validation — LGTM.


151-172: Non-string match value — LGTM.


173-194: Use correct commands to verify compiled_regex implementation
Ensure JwtRoleRule.compiled_regex is a property returning Pattern[str] so the isinstance(..., re.Pattern) assertions pass, then run:

pyright src
pytest -q tests/unit/authorization/test_resolvers.py::TestJwtRolesResolver

139-150: Non-string pattern expectation — message updated to “MATCH operator requires a string regex pattern”. After applying the config.py change, re-run:

pytest -q tests/unit/authorization/test_resolvers.py::TestJwtRolesResolver::test_resolve_roles_match_operator_non_string_pattern

When we originally added the JWT role extraction operators, we made some
basic ones which thought would be enough, but we found that there's too
much complexity and variance in the various fields we find in Red Hat
SSO tokens, so we realized we're going to need a regex operator to
handle more complex matching scenarios

This commit simply adds a new `MATCH` operator to the `JsonPathOperator`
enum, and implements the logic to handle it in the `JwtRolesResolver`.

`_evaluate_operator` had to be modified to accept the entire
`JwtRoleRule` so that it can access the pre-compiled regex pattern for
better performance.
@omertuc omertuc marked this pull request as ready for review September 2, 2025 10:05
@omertuc
Copy link
Contributor Author

omertuc commented Sep 2, 2025

Undrafted

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/authorization/resolvers.py (1)

100-111: Fix list-handling semantics for EQUALS/IN; keep CONTAINS robust.

match is a list of JSONPath values; current EQUALS compares list-to-scalar and IN checks list-in-container. Evaluate per-item; also make CONTAINS support substring checks for string items. (Re-raising a prior point.)

Apply:

-        rule: JwtRoleRule, match: Any
+        rule: JwtRoleRule, matches: Any
 ) -> bool:  # pylint: disable=too-many-branches
         """Evaluate an operator against a match and rule."""
-        result = False
-        match rule.operator:
+        result = False
+        items = matches if isinstance(matches, list) else [matches]
+        match rule.operator:
             case JsonPathOperator.EQUALS:
-                result = match == rule.value
+                # Any JSONPath value equals the rule value
+                result = any(item == rule.value for item in items)
             case JsonPathOperator.CONTAINS:
-                result = rule.value in match
+                # List membership or substring containment (strings only)
+                result = rule.value in items
+                if isinstance(rule.value, str):
+                    result = result or any(
+                        isinstance(item, str) and rule.value in item for item in items
+                    )
             case JsonPathOperator.IN:
-                result = match in rule.value
+                # Any JSONPath value is contained in the rule's container
+                result = hasattr(rule.value, "__contains__") and any(
+                    item in rule.value for item in items
+                )
🧹 Nitpick comments (2)
src/authorization/resolvers.py (2)

111-118: Quiet Pyright error and make MATCH branch type-safe; also iterate items consistently.

Cast the cached property and reuse per-item iteration.

Apply:

-            case JsonPathOperator.MATCH:
-                # Use the pre-compiled regex pattern for better performance
-                if rule.compiled_regex is not None:
-                    result = any(
-                        isinstance(item, str) and bool(rule.compiled_regex.search(item))
-                        for item in match
-                    )
+            case JsonPathOperator.MATCH:
+                # Use the pre-compiled regex pattern for better performance
+                pattern = cast(Optional[Pattern[str]], getattr(rule, "compiled_regex", None))
+                if pattern is not None:
+                    result = any(
+                        isinstance(item, str) and bool(pattern.search(item))
+                        for item in (matches if isinstance(matches, list) else [matches])
+                    )

Add to imports at the top of the file:

from typing import Any, Optional, Pattern, cast

Also consider ReDoS hardening for untrusted patterns: prefer RE2 (e.g., via re2/regex with timeouts) or require anchored, length-bounded patterns for MATCH.


119-120: Negation after a non-applicable operator could grant roles on misconfig.

If MATCH is misconfigured and produces no decision, negation flips False→True. Consider short-circuiting to False on invalid operator config (log and return) before applying negation.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1202be9 and 17d0419.

📒 Files selected for processing (3)
  • src/authorization/resolvers.py (2 hunks)
  • src/models/config.py (3 hunks)
  • tests/unit/authorization/test_resolvers.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/models/config.py
  • tests/unit/authorization/test_resolvers.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/authorization/resolvers.py (1)
src/models/config.py (3)
  • JwtRoleRule (241-299)
  • JsonPathOperator (232-238)
  • compiled_regex (295-299)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (1)
src/authorization/resolvers.py (1)

73-77: Passing the full rule into _evaluate_operator is a good change.

This simplifies dispatch and enables reuse of precompiled regex.

Copy link
Contributor

@tisnik tisnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the new match rules looks nice!

@tisnik tisnik merged commit be56d50 into lightspeed-core:main Sep 2, 2025
19 checks passed
omertuc added a commit to omertuc/assisted-chat that referenced this pull request Sep 2, 2025
omertuc added a commit to omertuc/assisted-chat that referenced this pull request Sep 2, 2025
omertuc added a commit to omertuc/assisted-chat that referenced this pull request Sep 2, 2025
Bump to lightspeed-core/lightspeed-stack#483

Also include changes from rh-ecosystem-edge#171 so we only have one bump PR
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants