
feat: implement zift CLI scanner and Rego policy generation#1

Merged
boorad merged 11 commits into main from feat/initial-implementation
Apr 23, 2026

Conversation

Contributor

@boorad boorad commented Apr 22, 2026

Summary

Initial implementation of the zift CLI tool — scans codebases for embedded authorization logic and generates Rego policies for OPA.

Changes

  • Core scanner engine — tree-sitter AST parsing for TypeScript/JavaScript with 12 built-in detection rules covering RBAC, ABAC, middleware, ownership, and feature gate patterns
  • CLI — clap-based interface with scan, extract, rules, init commands; bare zift . defaults to scan
  • Rego policy generation — extract command groups findings by source file into per-file .rego packages with template interpolation from tree-sitter captures
  • Rego syntax validation — integrates Microsoft's regorus crate to validate generated policies and rule templates
  • OPA enforcement point detection — traces imports to identify functions sourced from policy-engine modules (authz, opa, policy, rego, enforce); skips these as already-externalized and reports coverage percentage
  • Rule engine — TOML-defined tree-sitter queries with regex predicates, embedded at compile time, extensible via --rules-dir or config
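
A rule file might look roughly like the following sketch — the field names here are illustrative (inferred from the review threads in this PR, e.g. `[[rule.tests]]`, `rule.predicates`, `rego_template`), not the tool's exact schema:

```toml
# Illustrative rule sketch — exact keys may differ from zift's real schema.
[rule]
id = "rbac-role-equality"
description = "Direct role comparison in a conditional"
category = "rbac"
confidence = "high"

# Tree-sitter query; @role_value feeds the Rego template below.
query = """
(binary_expression
  left: (member_expression
    property: (property_identifier) @prop)
  right: (string) @role_value) @match
"""

rego_template = """
default allow := false
allow if input.user.role == "{{role_value}}"
"""

# Regex predicate restricting the @prop capture.
[rule.predicates.prop]
match = "^roles?$"

[[rule.tests]]
input = 'if (user.role === "admin") { grantAccess(); }'
expect_match = true
```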

Scan output example (EA codebase)

Summary: 17 findings in 6 files (15 high, 2 medium)
       87 enforcement points (already using OPA/policy engine, not flagged) — 84% externalized

Testing

cargo test                    # 57 tests
cargo clippy                  # clean (1 expected dead_code warning)
cargo run -- scan .           # scan current directory
cargo run -- rules validate   # validate all rules (tree-sitter + Rego templates)
cargo run -- rules test       # run rule test fixtures
cargo run -- scan --format json . | cargo run -- extract  # end-to-end pipeline

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Adds a CLI tool to scan projects, generate text/JSON reports, extract policy stubs to Rego files, manage rule sets, and initialize config; includes filtering, confidence thresholds, and optional deep/LLM scan parameters.
  • New Rules
    • Many TypeScript/JavaScript rules for RBAC/ABAC/ownership/feature-gates/middleware/JWT/permissions and NestJS patterns.
  • Documentation
    • New user workflows for addressing PR feedback, commits, handoffs, reviews, and PR creation.
  • Tests
    • Expanded unit tests across parsing, scanning, matching, template generation, and validation.

boorad added 5 commits April 22, 2026 15:24
Adapted from ea repo for the zift project — includes commit, pr, review,
handoff, and address-pr-feedback commands with Rust/cargo tooling references
instead of bun/TypeScript. Skipped ea-specific commands (deploy, jira,
validate-integration).
Add the zift CLI tool for scanning codebases to detect embedded authorization
logic. Uses tree-sitter for AST-based pattern matching against TypeScript and
JavaScript source files.

- CLI with clap derive API: scan, extract (stub), report, rules, init commands
- 12 built-in detection rules for RBAC, ABAC, middleware, ownership, and
  feature gate patterns
- Rule engine with TOML-defined tree-sitter queries and regex predicates
- File discovery using the ignore crate (respects .gitignore)
- SHA-256 content hashing for finding deduplication
- Text and JSON output formatters with SARIF placeholder
- Per-language query compilation cache for performance
- Configuration via .zift.toml with sensible defaults
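
A hypothetical shape for the .zift.toml mentioned above — the `package_prefix` and `output_dir` keys and their defaults appear in this PR's review threads; the rest is illustrative, not the tool's actual schema:

```toml
# Illustrative only — actual keys may differ.
[scan]
exclude = ["node_modules/**", "dist/**"]
min_confidence = "medium"

[extract]
package_prefix = "app"
output_dir = "./policies/generated"
```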
Implement the extract command to generate OPA Rego policies from scanner
output. Findings are grouped by source file into per-file .rego packages
with proper import statements and package naming.

- Rego template engine with {{variable}} interpolation from tree-sitter
  captures
- Category-specific default stubs (RBAC, ABAC, ownership, feature gate, etc.)
- Confidence-based wrapping: high passes through, medium adds TODO, low
  comments out
- Per-rule rego_template sections in 7 detection rules for precise policy
  generation
- Extract command reads findings from file or stdin (JSON), filters by
  confidence, and writes .rego files grouped by source
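
The interpolation step can be sketched in plain Rust. Helper names here are hypothetical, but the quote-stripping behavior matches what the review threads describe: string-literal captures lose their surrounding quotes before substitution, so the template supplies its own quoting.

```rust
// Hypothetical sketch of {{variable}} interpolation — not zift's actual API.
use std::collections::HashMap;

/// Replace each `{{name}}` placeholder with the capture's text,
/// stripping surrounding quotes from string literals first.
fn render_template(template: &str, captures: &HashMap<String, String>) -> String {
    let mut out = template.to_string();
    for (name, value) in captures {
        let placeholder = format!("{{{{{name}}}}}");
        out = out.replace(&placeholder, strip_quotes(value));
    }
    out
}

/// Remove one matching pair of quotes, if present; a lone quote
/// character is returned unchanged rather than panicking.
fn strip_quotes(s: &str) -> &str {
    let b = s.as_bytes();
    if b.len() >= 2 && (b[0] == b'"' || b[0] == b'\'') && b[b.len() - 1] == b[0] {
        &s[1..s.len() - 1]
    } else {
        s
    }
}

fn main() {
    let mut caps = HashMap::new();
    caps.insert("role_value".to_string(), "\"admin\"".to_string());
    let rego = render_template("allow if input.user.role == \"{{role_value}}\"", &caps);
    println!("{rego}"); // allow if input.user.role == "admin"
}
```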
Integrate Microsoft's regorus (Rust OPA engine) to validate generated Rego
policies at two points: extract command now checks each .rego file before
writing and shows [OK]/[WARN] status, and rules validate now checks
rego_template sections in addition to tree-sitter queries.
Functions imported from policy-engine modules (paths containing authz,
opa, policy, rego, enforce) are now recognized as enforcement points
rather than flagged as inline authorization logic. The scan summary
shows enforcement point count and externalization percentage.

Reduces false positives in codebases already using OPA — e.g., EA repo
drops from 87 findings to 17, with 84% externalized.
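
The heuristic can be sketched as follows (names hypothetical; the rounding choice is an assumption made so the example reproduces the 84% figure above):

```rust
// Sketch of the enforcement-point heuristic described in this commit.
const POLICY_MODULE_HINTS: [&str; 5] = ["authz", "opa", "policy", "rego", "enforce"];

/// True when an import path looks like a policy-engine module,
/// e.g. "./services/authz/client" or "@acme/opa-client".
fn is_policy_import(import_path: &str) -> bool {
    let lower = import_path.to_lowercase();
    POLICY_MODULE_HINTS.iter().any(|hint| lower.contains(hint))
}

/// Externalization percentage: enforcement points over all detections,
/// rounded to the nearest whole percent.
fn externalized_pct(enforcement_points: usize, findings: usize) -> usize {
    let total = enforcement_points + findings;
    if total == 0 {
        return 0;
    }
    (enforcement_points * 100 + total / 2) / total
}

fn main() {
    assert!(is_policy_import("./services/authz/client"));
    assert!(!is_policy_import("./utils/strings"));
    // EA repo numbers from this commit: 87 enforcement points, 17 findings.
    println!("{}%", externalized_pct(87, 17)); // 84%
}
```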
@boorad boorad self-assigned this Apr 22, 2026
@coderabbitai

coderabbitai Bot commented Apr 22, 2026

📝 Walkthrough

Walkthrough

Adds a Rust CLI tool "zift" implementing Tree-sitter based code scanning (rules, enforcement-point detection), Rego stub/template generation and validation, many TypeScript/JavaScript pattern rules, config/error/logging infrastructure, outputs (text/json), CLI commands, and several .claude workflow docs. (49 words)

Changes

Cohort / File(s) Summary
Documentation Workflows
\.claude/commands/address-pr-feedback.md, \.claude/commands/commit.md, \.claude/commands/handoff.md, \.claude/commands/pr.md, \.claude/commands/review.md
Five new workflow/command templates covering PR feedback, commit flow, handoffs, PR creation, and branch-based reviews.
Project Manifest
Cargo.toml
New Cargo manifest for the zift package declaring dependencies and a dev-dependency.
TypeScript/JS Rule Set
rules/typescript/*.toml
Many new TOML pattern rules (RBAC/ABAC/ownership/middleware/feature-gates/JWT/NestJS/session/permission checks) including predicates, Rego templates, and tests.
CLI & Enums
src/cli.rs
New public CLI types, subcommands (scan/extract/report/rules/init), argument structs, ValueEnum enums and unit tests for parsing/verbosity.
Commands & Dispatcher
src/commands/...
src/commands/{mod.rs,scan.rs,extract.rs,report.rs,rules.rs,init.rs}
New command implementations and central dispatch: scan orchestration, extract (generate/validate Rego), rules management (list/validate/test), report stub, and init (write default config).
Core Types, Config & Errors
src/types.rs, src/config.rs, src/error.rs, src/logging.rs
Added Finding and enums, TOML-backed ZiftConfig loader with defaults and tests, centralized ZiftError/Result alias, and logging init helper.
Scanner Pipeline
src/scanner/...
src/scanner/{mod.rs,discovery.rs,parser.rs,matcher.rs,imports.rs}
File discovery, language detection, parser helpers, rule compilation & execution, predicate filtering, policy-import/enforcement-point detection, deduplication, and ScanResult aggregation.
Rules Ingestion & Embedding
src/rules/{mod.rs,embedded.rs}
PatternRule model, TOML parsing/validation, embedded rules via include_str!, filesystem rule loading and override-by-id merge semantics.
Rego Generation & Validation
src/rego/{templates.rs,grouping.rs,validator.rs,mod.rs}
Template rendering, literal extraction, default stub generators, confidence wrapping, grouping findings into .rego files, and Rego syntax validation via regorus.
Output Writers
src/output/{mod.rs,text.rs,json.rs}
Text and JSON output modules formatting findings, computing summaries/enforcement percentages, and writing reports.
Utilities & Entrypoint
src/main.rs
Main wiring: CLI parse, logging init, config load, command dispatch, and top-level error handling.
Tests & Fixtures
src/**, rules/typescript/*
Unit tests across CLI, parser, matcher, templates, rego validator, discovery, imports, rules parsing/override, commands, config; rule test fixtures for TOML rules.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant CLI as CLI (arg parsing)
    participant Config as ZiftConfig
    participant Rules as RulesLoader
    participant Scanner as Scanner
    participant Parser as Parser (tree-sitter)
    participant Matcher as Matcher (queries)
    participant Rego as Rego (templates/validator)
    participant Output as Output (text/json)

    User->>CLI: invoke zift scan --path=...
    CLI->>Config: load_config(path)
    Config-->>CLI: ZiftConfig
    CLI->>Rules: load_rules(extra_dir?, config)
    Rules-->>CLI: Vec<PatternRule>
    CLI->>Scanner: scan(root, rules, args, config)
    Scanner->>Scanner: discover_files(root, exclude)
    Scanner->>Parser: parse_source(file, lang, tsx?)
    Parser-->>Scanner: Tree
    Scanner->>Matcher: compile_rule(rule, ts_lang)
    Matcher-->>Scanner: CompiledRule
    Scanner->>Matcher: execute_query(compiled, tree, source)
    Matcher-->>Scanner: Vec<Finding>
    Scanner->>Scanner: detect policy imports & enforcement points
    Scanner-->>Scanner: ScanResult { findings, enforcement_points }
    CLI->>Rego: group_findings(findings, prefix, outdir)
    Rego-->>CLI: Vec<RegoFile>
    CLI->>Rego: validate_template/render_template(...)
    Rego-->>CLI: ValidationResult
    CLI->>Output: print(findings, format)
    Output-->>User: Report (text/json)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I am a rabbit, nibbling through code,
I sniff for rules on every road.
I stitch Rego stubs with careful paws,
I group, validate, and mind the laws.
Hooray — I hop where safety grows! 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 70.80% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: implement zift CLI scanner and Rego policy generation' is a clear and concise summary of the main changes, following conventional commit conventions and accurately reflecting the primary objective of the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.



@amazon-q-developer amazon-q-developer Bot left a comment


Security and Crash Risk Review

I've identified several critical security vulnerabilities and crash risks that must be addressed before merging:

🛑 Critical Security Issues (3)

  1. Path Traversal in File Writing - User-controlled file paths can write outside intended directories, potentially overwriting system files
  2. Path Traversal in Rego Path Generation - Malicious finding paths can bypass output directory restrictions
  3. Symlink/Directory Traversal - Recursive rule loading lacks symlink protection and depth limits, enabling DoS attacks

⚠️ Crash Risks (3)

  1. Array Out of Bounds - Unchecked array indexing in matcher that will panic on invalid capture indices
  2. UTF-8 String Slicing - Byte-based slicing on multi-byte characters causes panics
  3. String Slicing Underflow - Single-character quote strings cause panic in strip_quotes

🔍 Logic Errors (1)

  1. Silent Path Canonicalization Failure - unwrap_or silently falls back without proper error handling

All issues have actionable fix suggestions. The security vulnerabilities are particularly concerning for a tool that processes untrusted input and writes files to the filesystem. Please address these before merging.
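
Hedged sketches of the boundary-safe patterns these findings call for (names hypothetical; not the code under review):

```rust
// Defensive variants of the panicking patterns flagged above.

/// Truncate to at most `max` bytes without splitting a multi-byte
/// character, avoiding the UTF-8 slicing panic.
fn truncate_on_char_boundary(s: &str, max: usize) -> &str {
    if s.len() <= max {
        return s;
    }
    let mut end = max;
    // Walk back to the nearest character boundary.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Checked capture lookup: `get` returns None instead of panicking
/// on an out-of-bounds capture index.
fn capture_text(captures: &[String], index: usize) -> Option<&str> {
    captures.get(index).map(String::as_str)
}

fn main() {
    assert_eq!(truncate_on_char_boundary("héllo", 2), "h"); // 'é' is 2 bytes
    let caps = vec!["role".to_string()];
    assert_eq!(capture_text(&caps, 5), None); // no panic
}
```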




⚠️ This PR contains more than 30 files. Amazon Q is better at reviewing smaller PRs, and may miss issues in larger changesets.

Collapsed comment threads:
  • src/commands/scan.rs (outdated)
  • src/scanner/matcher.rs (outdated)
  • src/commands/extract.rs
  • src/rego/grouping.rs (×2)
  • src/rules/mod.rs
  • src/rego/templates.rs

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 16

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (5)
rules/typescript/feature-gate-check.toml-20-27 (1)

20-27: ⚠️ Potential issue | 🟡 Minor

Add negative test case to validate the rule does not produce false positives.

The rule currently has only one positive test case. Add a negative test to verify that non-matching patterns are correctly rejected (e.g., if (user.name === "admin") should not match, or checking a different property like user.tier with a non-matching predicate value).

Note: The concern about quote handling in template interpolation is not an issue—render_template explicitly calls strip_quotes() on captured values before interpolation, so "enterprise" becomes enterprise in the final Rego output.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rules/typescript/feature-gate-check.toml` around lines 20 - 27, Add a
negative test that asserts the rule.rego_template does not allow non-matching
inputs: create a test case that invokes the template (using template/template
rendering path that substitutes "{{plan_value}}") with an input where
input.user.plan is a different value (e.g., "starter" or use a different
property like input.user.tier) and assert allow == false; target the
rule.rego_template/template and the resulting allow decision (default allow :=
false, allow if { input.user.plan in {"{{plan_value}}"} }) so the test ensures a
non-matching plan_value does not produce a true decision.
src/commands/scan.rs-9-9 (1)

9-9: ⚠️ Potential issue | 🟡 Minor

Handle non-NotFound canonicalization failures explicitly.

At Line 9, all canonicalize() errors are swallowed. This can hide permission or filesystem errors and degrade diagnostics.

🛠️ Narrow fallback to only missing-path cases
-    let path = args.path.canonicalize().unwrap_or(args.path.clone());
+    let path = match args.path.canonicalize() {
+        Ok(p) => p,
+        Err(e) if e.kind() == std::io::ErrorKind::NotFound => args.path.clone(),
+        Err(e) => return Err(e.into()),
+    };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/scan.rs` at line 9, The current assignment let path =
args.path.canonicalize().unwrap_or(args.path.clone()) swallows all canonicalize
errors; change it to explicitly match on args.path.canonicalize() and only fall
back to args.path.clone() when the error.kind() == std::io::ErrorKind::NotFound,
otherwise propagate or return the error (e.g., return Err(e) or use ?), so
permission and other filesystem errors are not hidden; update the code around
the path variable where args.path.canonicalize() is called to implement this
conditional handling.
.claude/commands/address-pr-feedback.md-94-94 (1)

94-94: ⚠️ Potential issue | 🟡 Minor

Use clearer wording than “Needs input.”

“Needs user input” reads more naturally in this checklist context.

✏️ Suggested edit
-- **Needs input**: Any ambiguous items requiring user decision
+- **Needs user input**: Any ambiguous items requiring user decision
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/commands/address-pr-feedback.md at line 94, Replace the checklist
phrase "Needs input" with "Needs user input" to improve clarity; locate the
exact string "Needs input" in .claude/commands/address-pr-feedback.md (the
checklist item) and update it to "Needs user input" wherever it appears in that
checklist section so the wording reads naturally for users.
src/scanner/discovery.rs-39-49 (1)

39-49: ⚠️ Potential issue | 🟡 Minor

Silent failures when adding invalid exclude patterns.

Invalid override patterns and build failures are silently ignored with let _ = and the if let Ok() pattern. Users won't know their exclusion patterns are malformed.

Consider logging a warning when patterns fail to parse:

🛡️ Suggested fix to log pattern errors
     if !exclude_patterns.is_empty() {
         let mut overrides = OverrideBuilder::new(root);
         for pattern in exclude_patterns {
             // Negate the pattern so it becomes an exclusion
-            let _ = overrides.add(&format!("!{pattern}"));
+            if let Err(e) = overrides.add(&format!("!{pattern}")) {
+                tracing::warn!("invalid exclude pattern '{}': {}", pattern, e);
+            }
         }
-        if let Ok(ov) = overrides.build() {
-            builder.overrides(ov);
+        match overrides.build() {
+            Ok(ov) => { builder.overrides(ov); }
+            Err(e) => { tracing::warn!("failed to build overrides: {}", e); }
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scanner/discovery.rs` around lines 39 - 49, The current loop silently
ignores failures when adding exclude patterns and when building overrides:
detect errors from OverrideBuilder::add while iterating over exclude_patterns
and log a warning including the invalid pattern and error; likewise, when
overrides.build() returns Err, log a warning with the build error and the root
or patterns and avoid silently proceeding—use the same logging facility used
elsewhere (e.g., log::warn! or the crate's logger) and still only call
builder.overrides(ov) on Ok(ov).
src/output/text.rs-13-16 (1)

13-16: ⚠️ Potential issue | 🟡 Minor

Still report enforcement coverage when there are no findings.

This early return hides enforcement_points, so a repo that is fully externalized just prints “No authorization patterns found.” without the coverage summary. That makes text output disagree with the JSON summary for the same scan.

Suggested fix
     if findings.is_empty() {
         writeln!(writer, "No authorization patterns found.")?;
+        if enforcement_points > 0 {
+            writeln!(
+                writer,
+                "{enforcement_points} enforcement points detected (already using OPA/policy engine)."
+            )?;
+        }
         return Ok(());
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/output/text.rs` around lines 13 - 16, The function currently returns
early when findings.is_empty(), which prevents printing enforcement coverage
info (enforcement_points) and causes text output to omit the coverage summary;
change the logic so that if findings.is_empty() you still write "No
authorization patterns found." to writer but do NOT return—continue to call the
same enforcement coverage generation/printing code that uses enforcement_points
(the same codepath used for non-empty findings) so the coverage summary is
always emitted; update the block around findings.is_empty() in
src/output/text.rs (the writer/writeln and the surrounding control flow) to
remove the early return and ensure enforcement_points printing runs afterward.
🧹 Nitpick comments (11)
rules/typescript/feature-gate-check.toml (1)

29-35: Add a negative test case to validate predicate filtering.

Only the positive case is tested. Consider adding a test where the property name doesn't match the predicate (e.g., if (user.name === "admin")) to ensure non-matching properties are correctly excluded.

📝 Example negative test case
 expect_match = true
+
+[[rule.tests]]
+input = """
+if (user.name === "admin") {
+  grantAccess();
+}
+"""
+expect_match = false
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rules/typescript/feature-gate-check.toml` around lines 29 - 35, Add a
negative test case to the existing [[rule.tests]] entries to verify predicate
filtering: create a new [[rule.tests]] block with an input that uses a
non-matching property name (e.g., `if (user.name === "admin")`) and set
expect_match = false so the rule does not trigger; reference the existing
rule.tests/input/expect_match pattern to mirror the positive case and ensure
non-matching properties are excluded by the predicate filter.
src/commands/init.rs (1)

25-32: Minor TOCTOU race between exists check and write.

The path.exists() check followed by fs::write() creates a small race window where another process could create the file between the check and write. For a CLI init command this is low risk, but using OpenOptions::new().create_new(true) would make this atomic.

♻️ Atomic create-if-not-exists alternative
-    if path.exists() {
-        return Err(ZiftError::General(format!(
-            "config file already exists: {}",
-            path.display()
-        )));
-    }
-
-    std::fs::write(&path, DEFAULT_CONFIG)?;
+    use std::fs::OpenOptions;
+    use std::io::Write;
+
+    let mut file = OpenOptions::new()
+        .write(true)
+        .create_new(true)
+        .open(&path)
+        .map_err(|e| {
+            if e.kind() == std::io::ErrorKind::AlreadyExists {
+                ZiftError::General(format!("config file already exists: {}", path.display()))
+            } else {
+                e.into()
+            }
+        })?;
+    file.write_all(DEFAULT_CONFIG.as_bytes())?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/init.rs` around lines 25 - 32, The exists-then-write sequence
(path.exists() followed by std::fs::write(&path, DEFAULT_CONFIG)) has a TOCTOU
race; replace it with an atomic create-if-not-exists using
OpenOptions::new().write(true).create_new(true).open(&path) and then write
DEFAULT_CONFIG into that file handle, propagating errors as before (return the
same ZiftError::General on already-existing or map the io error), so the init
logic in this function becomes atomic and race-free.
rules/typescript/express-auth-middleware.toml (1)

7-14: Consider extending query to match invoked middleware patterns.

The current query only matches identifiers passed directly to use() (e.g., app.use(requireAuth)), but misses invoked middleware like app.use(requireAuth()) or app.use(passport.authenticate("jwt")). These are common patterns in Express apps.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rules/typescript/express-auth-middleware.toml` around lines 7 - 14, The query
only captures bare identifier arguments to use() and misses invoked middleware
forms; update the Tree-sitter query (the query variable that uses captures
`@method`, `@middleware_name`, `@match`) to also match arguments that are
call_expression nodes and member_expression call_expression nodes (e.g.,
app.use(requireAuth()) and app.use(passport.authenticate("jwt"))). Specifically
add alternative patterns for the arguments: a direct call_expression (with its
callee captured) and a member_expression whose parent is a call_expression, and
capture those callee identifiers/member_expression as
`@middleware_name`/`@middleware_call` so invoked middleware is recognized by the
same rule.
rules/typescript/session-auth-check.toml (1)

8-15: Constrain the receiver to reduce false positives.

At Lines 8-15, matching by method name alone can flag unrelated APIs that also expose isAuthenticated/isAuthorized. Adding receiver capture/predicates (e.g., req, request, ctx, session) will improve precision.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rules/typescript/session-auth-check.toml` around lines 8 - 15, The rule
currently matches any call_expression with a member_expression by method name
(captured as `@method`) which yields false positives; modify the tree-sitter
pattern to also capture the member_expression's object as a receiver (e.g., add
a capture like `@receiver` on the object node within member_expression) and add a
new predicate (e.g., rule.predicates.receiver) that matches common receiver
identifiers such as ^(req|request|ctx|session|ctx.request)$ so the rule only
triggers when the method (isAuthenticated|isLoggedIn|isAuthorized) is invoked on
one of those receivers; keep the existing method predicate
(rule.predicates.method) and ensure both predicates must match.
rules/typescript/express-route-middleware.toml (1)

8-16: Rule currently misses common middleware argument shapes.

At Line 13, the query only matches middleware when it is a plain identifier in the second argument slot. This misses patterns like auth.requireAuth, requireAuth(), and other common middleware forms, reducing match coverage.

♻️ Suggested query broadening
 (call_expression
   function: (member_expression
     property: (property_identifier) @method)
   arguments: (arguments
     (string)
-    (identifier) @middleware_name
+    [
+      (identifier) @middleware_name
+      (member_expression
+        property: (property_identifier) @middleware_name)
+      (call_expression
+        function: (identifier) @middleware_name)
+    ]
     .
     (_))
 ) @match
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rules/typescript/express-route-middleware.toml` around lines 8 - 16, The
current call_expression query only binds middleware when the second argument is
a plain (identifier) `@middleware_name`; update the pattern so `@middleware_name`
also captures member expressions and invocation forms (e.g., auth.requireAuth,
requireAuth(), or auth.requireAuth()) by allowing the second argument to be
(identifier | member_expression | call_expression) and optionally a
call_expression whose function is a member_expression, keeping the existing
captures like (member_expression property:(property_identifier) `@method`) and the
outer (call_expression function: ... ) `@match` so the rule will match both direct
identifiers, dotted properties, and common invoked middleware factories.
src/commands/extract.rs (1)

17-32: Fragile coupling to hardcoded CLI default values.

The config fallback logic compares against hardcoded strings ("app" and "./policies/generated") that must stay in sync with the #[arg(default_value = ...)] attributes in ExtractArgs. If someone changes the CLI defaults without updating these checks, the config override silently stops working.

Consider defining these defaults as constants shared between cli.rs and here, or using a different pattern (e.g., Option without defaults in CLI, defaulting in the command).

♻️ Example: extract shared constants
// In cli.rs or a shared constants module:
pub const DEFAULT_PACKAGE_PREFIX: &str = "app";
pub const DEFAULT_OUTPUT_DIR: &str = "./policies/generated";

// Then use in both places
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/extract.rs` around lines 17 - 32, The fallback logic in
execute() for ExtractArgs currently compares CLI values to hardcoded defaults
("app" and "./policies/generated"), which is fragile; replace these literal
checks by referencing shared constants or by removing CLI defaults and treating
those fields as Option so config wins when present. Specifically, introduce
shared DEFAULT_PACKAGE_PREFIX and DEFAULT_OUTPUT_DIR constants (used by cli.rs
and here) or change ExtractArgs.package_prefix/output_dir to Option and adjust
execute() to use config.extract.package_prefix.unwrap_or(args.package_prefix)
and config.extract.output_dir.map(PathBuf::from).unwrap_or_else(||
args.output_dir.clone()) so the comparisons to hardcoded strings are eliminated
and the fallback behavior is deterministic.
rules/typescript/jwt-token-check.toml (1)

6-6: Description and predicate scope mismatch.

The description says "JWT token verification or authorization header access" but the method predicate includes sign, which is for creating tokens, not verifying them. Consider either:

  1. Updating the description to include "signing/creation"
  2. Removing sign from the predicate if only verification is intended
♻️ Option 1: Update description
-description = "JWT token verification or authorization header access"
+description = "JWT token verification, decoding, or signing"

Also applies to: 20-21

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rules/typescript/jwt-token-check.toml` at line 6, The rule's description
string ("JWT token verification or authorization header access") is inconsistent
with the predicate which includes the "sign" method (used for token creation);
either update the description to mention signing/creation (e.g., add
"signing/creation" or "signing") or remove "sign" from the method predicate so
the rule only targets verification/authorization checks — locate the rule with
the description line and the predicate that lists "sign" and apply one of these
two fixes consistently (also update the similar occurrence referenced around the
other predicate block).
src/output/json.rs (1)

38-46: Inconsistent casing between category and confidence aggregation.

Category keys are lowercased (.to_lowercase()) but confidence keys are not. This creates inconsistent JSON output where categories might be "rbac" while confidences are "High". Consider making both consistent.

♻️ Suggested fix for consistency
         *by_confidence
-            .entry(f.confidence.to_string())
+            .entry(f.confidence.to_string().to_lowercase())
             .or_default() += 1;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/output/json.rs` around lines 38 - 46, The aggregation uses lowercased
keys for categories but not for confidences, causing inconsistent JSON keys; in
the loop over findings where by_category and by_confidence are updated
(variables f, f.category, f.confidence, entries via .entry(...).or_default()),
normalize the confidence key the same way as category (e.g., call
.to_lowercase() on f.confidence when building the entry key) so both maps
produce consistently-cased keys in the output.
src/cli.rs (1)

88-104: Deep scan options lack --deep requirement enforcement at parse time.

The --provider, --model, --max-cost, and --api-key options are documented as requiring --deep, but clap doesn't enforce this constraint. Users can specify --provider anthropic without --deep, which would be silently ignored or cause confusion.

Consider using clap's requires attribute to enforce the dependency:

♻️ Enforce --deep requirement
     // -- Deep scan options --
     /// LLM provider (requires --deep)
-    #[arg(long)]
+    #[arg(long, requires = "deep")]
     pub provider: Option<LlmProvider>,

     /// Model to use (requires --deep)
-    #[arg(long)]
+    #[arg(long, requires = "deep")]
     pub model: Option<String>,

     /// Maximum spend limit for LLM calls (requires --deep)
-    #[arg(long)]
+    #[arg(long, requires = "deep")]
     pub max_cost: Option<f64>,

     /// API key (or set ZIFT_API_KEY / provider-specific env vars)
-    #[arg(long, env = "ZIFT_API_KEY")]
+    #[arg(long, env = "ZIFT_API_KEY", requires = "deep")]
     pub api_key: Option<String>,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli.rs` around lines 88 - 104, The CLI deep-scan flags (fields provider,
model, max_cost, api_key in the CLI struct) are documented as requiring --deep
but clap won't enforce that; update each field's #[arg(...)] attribute to
include requires = "deep" (e.g., add requires = "deep" to the attributes for
provider, model, max_cost and api_key) so clap validates that those options can
only be used when --deep is present while preserving existing attributes like
long and env.
src/rules/mod.rs (1)

90-113: Non-deterministic predicate evaluation order.

The r.predicates HashMap iteration order is non-deterministic. If multiple predicates are defined for a rule, they will be pushed into the Vec<(String, Predicate)> in an arbitrary order that may vary between runs. While the current check_predicates logic uses short-circuit AND semantics (all must pass), the order could affect debugging reproducibility and any future order-dependent logic.

♻️ Consider sorting predicates for deterministic order
     let mut predicates = Vec::new();
-    for (capture_name, pred) in r.predicates {
+    let mut pred_entries: Vec<_> = r.predicates.into_iter().collect();
+    pred_entries.sort_by(|a, b| a.0.cmp(&b.0));
+    for (capture_name, pred) in pred_entries {
         let p = if let Some(re) = pred.match_re {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/mod.rs` around lines 90 - 113, Iteration over r.predicates (a
HashMap) yields non-deterministic predicate order; ensure deterministic ordering
by sorting predicates by their capture name before constructing the Vec<(String,
Predicate)>. Modify the loop building predicates (the block that creates
Predicate::Match/Eq/NotMatch/NotEq and pushes into predicates) to first collect
r.predicates.into_iter() (or keys) into a temporary Vec, sort that Vec by
capture_name (or key) and then iterate the sorted collection to build and push
(capture_name, p); keep the same error handling and use the same Predicate enum
and check_predicates logic so behavior is identical except for deterministic
ordering.
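The sort-before-iterate step the prompt describes is a one-liner in isolation; a minimal std-only sketch (string values stand in for the real `Predicate` type):

```rust
use std::collections::HashMap;

/// Collect a HashMap's entries and sort by key so downstream iteration
/// order is deterministic across runs, unlike raw HashMap iteration.
fn sorted_entries(map: HashMap<String, String>) -> Vec<(String, String)> {
    let mut entries: Vec<_> = map.into_iter().collect();
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}

fn main() {
    let mut preds = HashMap::new();
    preds.insert("role_value".to_string(), "^admin$".to_string());
    preds.insert("method".to_string(), "hasRole".to_string());
    let entries = sorted_entries(preds);
    let keys: Vec<&str> = entries.iter().map(|(k, _)| k.as_str()).collect();
    assert_eq!(keys, ["method", "role_value"]);
}
```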
src/scanner/matcher.rs (1)

65-69: Silent fallback for invalid UTF-8 capture text.

Using unwrap_or_default() silently converts invalid UTF-8 to an empty string. This could cause predicates to fail unexpectedly or produce empty capture values without any diagnostic. Consider logging a warning or returning an error for malformed source.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scanner/matcher.rs` around lines 65 - 69, The code silently drops invalid
UTF-8 by calling capture.node.utf8_text(source).unwrap_or_default(); change this
to explicitly handle the Result from capture.node.utf8_text(source) (e.g. match
or if let Err(e) = ...) and either return an error from the surrounding function
or emit a warning with context (include capture identifier like capture.index or
capture.name and the error e) instead of producing an empty string; replace the
unwrap_or_default call and ensure callers/signature of the function (in
matcher.rs) propagate or handle the error accordingly.
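The explicit handling the prompt asks for might look like this sketch; `capture_text` is a hypothetical helper operating on raw bytes rather than the real `tree_sitter` node API:

```rust
/// Decode a capture's byte range, warning (instead of silently
/// substituting "") when the bytes are not valid UTF-8.
fn capture_text<'a>(source: &'a [u8], capture_name: &str) -> Option<&'a str> {
    match std::str::from_utf8(source) {
        Ok(text) => Some(text),
        Err(e) => {
            // Surface the problem with context instead of unwrap_or_default().
            eprintln!("warning: capture `{capture_name}` is not valid UTF-8: {e}");
            None
        }
    }
}

fn main() {
    assert_eq!(capture_text(b"hasRole", "method"), Some("hasRole"));
    assert_eq!(capture_text(&[0xff, 0xfe], "method"), None);
}
```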
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rules/typescript/has-role-call.toml`:
- Around line 15-24: The current predicate regex (matching
hasRole|checkRole|isRole|requireRole|hasPermission|checkPermission|isAuthorized|authorize)
incorrectly maps permission/authorization functions to a role-based Rego
template that uses input.user.role and {{role_value}}; update the config so
role-only functions use the existing role template and permission/authorization
functions use a different template that checks input.user.permissions (or
similar). Concretely, split the predicate matching into two distinct sets (e.g.,
keep hasRole|checkRole|isRole|requireRole for the role template, and create a
new predicate group for hasPermission|checkPermission|isAuthorized|authorize)
and add a corresponding [rule.rego_template] for the permission group that tests
membership against input.user.permissions (or a configurable permission field)
instead of input.user.role.

In `@rules/typescript/nestjs-roles-decorator.toml`:
- Around line 8-17: The rule erroneously matches both Roles and SetMetadata in
the decorator_name predicate, causing `@SetMetadata`("roles","admin") to be parsed
like `@Roles` and extract the wrong string; update the predicate for
decorator_name used by the call_expression pattern so it only matches "Roles"
(remove "SetMetadata") so the call_expression pattern that captures the string
argument only applies to `@Roles` and will no longer extract the first argument of
`@SetMetadata` as a role.

In `@rules/typescript/ownership-check.toml`:
- Line 11: The operator list in the config allows both equality and inequality
(operator key) but the Rego template that emits checks always produces a
positive equality against hardcoded fields (resource.owner and user.id), causing
mismatches; update the template that generates the Rego rule so it uses the
actual matched operands and the actual operator from the match (emit "==" or
"===" for equality and emit "!=" or "!==" as a negated check for inequality), do
not hardcode resource.owner/user.id—use the captured LHS/RHS identifiers from
the parsed expression, and ensure the generated allow/deny rule preserves
default-deny semantics and mirrors the original operator polarity.

In `@rules/typescript/role-check-conditional.toml`:
- Around line 10-13: The binary_expression pattern in
role-check-conditional.toml currently matches any operator and thus treats
inequality as equality; update the pattern to constrain the operator to equality
only (match only '==' and '===' operators) by adding an operator node check
alongside the existing (member_expression property: (property_identifier) `@prop`)
and (string) `@role_value` bindings, and add a unit test that includes a
conditional using '!==', e.g. `if (user.role !== "admin")`, to assert the
pattern does NOT match (preventing the template from emitting a positive allow
for inequality cases).

In `@rules/typescript/role-includes-check.toml`:
- Around line 24-30: The generated Rego template incorrectly checks a scalar
field `input.user.role` against a set using `in`, which loses array-membership
semantics from patterns like `roles.includes("...")`; update the
`rule.rego_template` so the policy tests membership against the
`input.user.roles` array (or uses an array-aware check) instead of
`input.user.role`, and substitute `{{role_value}}` into an array-membership
expression (e.g., use `some i; input.user.roles[i] == "{{role_value}}"` or the
Rego `role in input.user.roles` pattern) so templates produced by `template`
correctly represent `roles.includes(...)` semantics.

In `@src/commands/rules.rs`:
- Around line 34-38: The validation currently always calls
ts_parser::get_language(*lang, false) which only tests the base TypeScript
grammar; update the loop that builds tree_sitter::Query::new to try both
TypeScript variants for TypeScript rules: when iterating rule.languages, if the
language is the TypeScript identifier used in your codebase (the same enum/const
used by the scanner), call ts_parser::get_language(*lang, false) and
ts_parser::get_language(*lang, true) (or otherwise select the TSX variant) and
attempt tree_sitter::Query::new with each variant, treating the rule as valid if
either variant succeeds; only emit an error from the validation logic if
Query::new fails for both variants (for non-TypeScript languages keep the single
existing get_language call).
- Around line 67-76: Add an optional boolean field is_tsx_jsx to the RuleTest
struct (defaulting to false and exposed in TOML deserialization), then use that
field when choosing the parser variant: replace the hardcoded false passed to
ts_parser::get_language(lang, false) and to ts_parser::parse_source(..., lang,
false) with test.is_tsx_jsx so the Parser created in rules.rs (the parser/tree
creation path using parser and tree variables) will parse with the TSX/JSX
grammar when the test metadata requests it.

In `@src/rego/grouping.rs`:
- Around line 123-126: The truncation logic for first_line (from
finding.code_snippet.lines()) is slicing bytes with first_line[..77], which can
panic on non-ASCII UTF-8; replace the byte-slice with a Unicode-safe truncation
using first_line.chars().take(77).collect::<String>() (or an equivalent
chars()-based helper) and then build truncated accordingly (add "..." only when
the char-count exceeded 77) so variables first_line and truncated remain correct
and safe for multi-byte characters.
- Around line 29-43: The generated package segment needs to ensure no segment
starts with a digit: after computing sanitized (from stem) split sanitized on
'.' and transform each segment so that if a segment's first character is a digit
you prefix it with '_' (e.g., "2024" -> "_2024"); then rejoin with '.' and use
that adjusted sanitized when building the final package string with prefix;
apply the same rule to prefix if prefix contains dot-separated segments (or
ensure prefix segments are already validated) so the final output from the block
that computes sanitized/prefix never yields a segment beginning with a digit.
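The char-based truncation for the first grouping.rs comment can be sketched as a small helper; 77 matches the width mentioned in the prompt:

```rust
/// Truncate to at most `max` characters, appending "..." only when
/// something was cut. Byte-slicing (`&s[..77]`) can panic mid-codepoint
/// on multi-byte UTF-8; counting chars cannot.
fn truncate_chars(s: &str, max: usize) -> String {
    if s.chars().count() <= max {
        s.to_string()
    } else {
        let mut t: String = s.chars().take(max).collect();
        t.push_str("...");
        t
    }
}

fn main() {
    assert_eq!(truncate_chars("short", 77), "short");
    // 80 two-byte chars: a byte slice at index 77 would split a codepoint.
    let long = "é".repeat(80);
    let t = truncate_chars(&long, 77);
    assert!(t.ends_with("..."));
    assert_eq!(t.chars().count(), 80); // 77 kept + 3 dots
}
```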
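The digit-prefixing rule for package segments in the second grouping.rs comment is also easy to isolate; `sanitize_package` is a hypothetical helper name:

```rust
/// Rego package segments must not start with a digit; prefix such
/// segments with '_' (e.g. "2024.reports" -> "_2024.reports").
fn sanitize_package(sanitized: &str) -> String {
    sanitized
        .split('.')
        .map(|seg| {
            if seg.chars().next().map_or(false, |c| c.is_ascii_digit()) {
                format!("_{seg}")
            } else {
                seg.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(".")
}

fn main() {
    assert_eq!(sanitize_package("2024.reports"), "_2024.reports");
    assert_eq!(sanitize_package("app.auth"), "app.auth");
}
```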

In `@src/rego/templates.rs`:
- Around line 105-135: The Rbac and FeatureGate branches in templates.rs are
incorrectly treating every string literal as a role/plan; stop inferring
roles/plans from arbitrary literals by removing the block that formats literals
into a set and instead default to the safe stub (e.g., "{\"TODO\"}") unless the
caller explicitly marked the literals as roles/plans; update the
AuthCategory::Rbac and AuthCategory::FeatureGate branches to insert the TODO
stub into vars (roles/plans) and, if you add a new explicit flag/parameter
(e.g., explicit_roles or explicit_plans), only format and insert the literals
when that flag is true; keep use of vars.insert("roles"/"plans") but eliminate
the automatic literals -> set conversion to avoid overly permissive policies.

In `@src/rego/validator.rs`:
- Around line 31-38: The current two-pass substitution in validate_template
(validator.rs) naively replaces all {{...}} with a quoted "placeholder", causing
invalid code when a placeholder appears in identifier positions (e.g., after a
dot); change validate_template to perform context-aware replacement: detect
placeholder occurrences and if the placeholder is in an identifier position
(e.g., preceded by a dot operator or at a location where an unquoted identifier
is required) replace it with an unquoted safe identifier (e.g., placeholder_id)
or bracket-index form, otherwise replace with a quoted placeholder string;
implement this by scanning template with a regex that captures surrounding
character (or a small token-aware pass) and update the replacement logic in the
same function (look for re_quoted, re_bare, rendered, full_policy, validate_rego
references) and add a regression test that feeds an ABAC-like template such as
"input.user.{{attribute}} == {{value}}" to validate_template to assert it
produces valid Rego.
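A minimal character-level version of the context-aware replacement described above; a real validator may need a token-aware pass, and `placeholder_id` / `"placeholder"` are arbitrary stand-in values:

```rust
/// Replace `{{name}}` placeholders so the result parses as Rego: a bare
/// identifier when the placeholder follows a dot (identifier position),
/// a quoted string otherwise.
fn substitute_placeholders(template: &str) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        let Some(end) = rest[start..].find("}}") else { break };
        out.push_str(&rest[..start]);
        if out.ends_with('.') {
            // Identifier position: quoting here would be a syntax error.
            out.push_str("placeholder_id");
        } else {
            out.push_str("\"placeholder\"");
        }
        rest = &rest[start + end + 2..];
    }
    out.push_str(rest);
    out
}

fn main() {
    // The ABAC-like regression case from the review comment.
    let rendered = substitute_placeholders("input.user.{{attribute}} == {{value}}");
    assert_eq!(rendered, "input.user.placeholder_id == \"placeholder\"");
}
```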

In `@src/scanner/imports.rs`:
- Around line 98-101: The current is_enforcement_point function uses substring
matching (code_snippet.contains) which yields false positives (e.g.,
authorizeResult, comments, string literals); update is_enforcement_point to
detect imported identifiers as standalone tokens or use AST/token-based
detection instead of raw contains: for a quick fix use a regex/tokenizer to
match word-boundary/identifier boundaries against each entry in policy_imports,
or better yet parse the snippet into an AST or tokens and check for Identifier
nodes matching the policy_imports set; ensure behavior aligns with scan()
expectations so only true identifier uses are treated as enforcement points.
- Around line 23-39: The parser only handles TS_NAMED_IMPORT_QUERY and
TS_DEFAULT_IMPORT_QUERY so namespace imports like `import * as authz from
"../policy"` are ignored; update find_policy_imports() to also handle a new
TS_NAMESPACE_IMPORT_QUERY that matches an import_clause with a namespace_import
(capture the namespace identifier as `@name` and the source as `@source`) and add a
unit test that parses a file containing `import * as authz from "../policy"` and
subsequent `authz.authorize(...)` to ensure policy_imports is populated;
reference TS_NAMED_IMPORT_QUERY, TS_DEFAULT_IMPORT_QUERY,
TS_NAMESPACE_IMPORT_QUERY, find_policy_imports(), and policy_imports when making
the change.
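The word-boundary check for `is_enforcement_point` does not strictly need the regex crate; a std-only sketch that inspects the neighbouring characters:

```rust
/// True when `ident` occurs in `snippet` as a standalone identifier,
/// not as a substring of a longer one (avoids matching `authorize`
/// inside `authorizeResult`). A \b regex would also work.
fn contains_identifier(snippet: &str, ident: &str) -> bool {
    fn is_word(c: char) -> bool {
        c.is_ascii_alphanumeric() || c == '_'
    }
    let mut from = 0;
    while let Some(pos) = snippet[from..].find(ident) {
        let start = from + pos;
        let end = start + ident.len();
        let before_ok = !snippet[..start].chars().next_back().map_or(false, is_word);
        let after_ok = !snippet[end..].chars().next().map_or(false, is_word);
        if before_ok && after_ok {
            return true;
        }
        from = end;
    }
    false
}

fn main() {
    assert!(contains_identifier("authorize(user, action)", "authorize"));
    assert!(!contains_identifier("const x = authorizeResult;", "authorize"));
}
```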

In `@src/scanner/matcher.rs`:
- Around line 157-176: The project must declare Rust 1.76 to allow the let-chain
syntax used in filter_findings; open Cargo.toml and in the [package] section add
rust-version = "1.76" so the compiler toolchain requirement is enforced for the
repo (this will ensure code using the if let Some(min) = min_confidence && ...
expression in filter_findings compiles consistently).

In `@src/scanner/mod.rs`:
- Around line 119-140: The enforcement_points counter is incremented before
deduplication, causing duplicate enforcement calls to be double-counted; change
the logic in the loop that uses imports::is_enforcement_point to collect skipped
enforcement findings into a temporary Vec (e.g. skipped_enforcement), push
non-enforcement findings into all_findings as before, then after the outer loop
run matcher::dedup_findings on both all_findings and skipped_enforcement and set
enforcement_points = deduped_skipped_enforcement.len() (or add that length)
instead of incrementing enforcement_points inline; reference enforcement_points,
imports::is_enforcement_point, all_findings, and matcher::dedup_findings to
locate and update the code.
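Counting after deduplication can be sketched like this; keying on (file, line) reflects the "double-counting across rules" wording, though the real `matcher::dedup_findings` may use a different key:

```rust
use std::collections::HashSet;

// Hypothetical stand-in for the scanner's Finding type.
struct Finding {
    file: String,
    line: usize,
    rule_id: String,
}

/// Dedup the skipped enforcement findings BEFORE counting, so a call
/// matched by several rules is counted as one enforcement point.
fn count_enforcement_points(skipped: &[Finding]) -> usize {
    let unique: HashSet<(&str, usize)> =
        skipped.iter().map(|f| (f.file.as_str(), f.line)).collect();
    unique.len()
}

fn main() {
    let skipped = vec![
        Finding { file: "src/app.ts".into(), line: 12, rule_id: "has-role-call".into() },
        Finding { file: "src/app.ts".into(), line: 12, rule_id: "permission-check-call".into() },
    ];
    let _ = &skipped[0].rule_id; // two rules, one call site
    assert_eq!(count_enforcement_points(&skipped), 1);
}
```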

In `@src/scanner/parser.rs`:
- Around line 6-13: The get_language function currently panics for unsupported
Language variants; change its signature from returning tree_sitter::Language to
returning Result<tree_sitter::Language, YourErrorType> (e.g., a custom error
enum or anyhow::Error), update the match in get_language so each supported
branch returns Ok(<language>.into()) and replace the unimplemented! branch with
Err(...) containing a clear error message about the unsupported Language
variant, and then update callers (e.g., places that call get_language and any
use in discover_files or parser initialization) to handle the Result (propagate
with ? or map the error) instead of relying on a panic. Ensure to reference the
Language enum and get_language when adding the new error case so unsupported
languages are handled gracefully.
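The panic-to-Result refactor can be shown in miniature; `Language`, the grammar-name strings, and `UnsupportedLanguage` are all hypothetical stand-ins for the real `tree_sitter::Language` plumbing:

```rust
use std::fmt;

#[derive(Debug, Clone, Copy)]
enum Language {
    TypeScript,
    JavaScript,
    Python, // stand-in for a variant with no grammar wired up yet
}

#[derive(Debug)]
struct UnsupportedLanguage(Language);

impl fmt::Display for UnsupportedLanguage {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "no tree-sitter grammar registered for {:?}", self.0)
    }
}

impl std::error::Error for UnsupportedLanguage {}

/// Return the grammar for a language, or a descriptive error instead
/// of panicking via `unimplemented!` on unsupported variants.
fn get_language(lang: Language) -> Result<&'static str, UnsupportedLanguage> {
    match lang {
        Language::TypeScript => Ok("tree-sitter-typescript"),
        Language::JavaScript => Ok("tree-sitter-javascript"),
        other => Err(UnsupportedLanguage(other)),
    }
}

fn main() {
    assert!(get_language(Language::TypeScript).is_ok());
    // Callers now propagate with `?` instead of hitting a panic.
    assert!(get_language(Language::Python).is_err());
}
```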

---

Minor comments:
In @.claude/commands/address-pr-feedback.md:
- Line 94: Replace the checklist phrase "Needs input" with "Needs user input" to
improve clarity; locate the exact string "Needs input" in
.claude/commands/address-pr-feedback.md (the checklist item) and update it to
"Needs user input" wherever it appears in that checklist section so the wording
reads naturally for users.

In `@rules/typescript/feature-gate-check.toml`:
- Around line 20-27: Add a negative test that asserts the rule.rego_template
does not allow non-matching inputs: create a test case that invokes the template
(using template/template rendering path that substitutes "{{plan_value}}") with
an input where input.user.plan is a different value (e.g., "starter" or use a
different property like input.user.tier) and assert allow == false; target the
rule.rego_template/template and the resulting allow decision (default allow :=
false, allow if { input.user.plan in {"{{plan_value}}"} }) so the test ensures a
non-matching plan_value does not produce a true decision.

In `@src/commands/scan.rs`:
- Line 9: The current assignment let path =
args.path.canonicalize().unwrap_or(args.path.clone()) swallows all canonicalize
errors; change it to explicitly match on args.path.canonicalize() and only fall
back to args.path.clone() when the error.kind() == std::io::ErrorKind::NotFound,
otherwise propagate or return the error (e.g., return Err(e) or use ?), so
permission and other filesystem errors are not hidden; update the code around
the path variable where args.path.canonicalize() is called to implement this
conditional handling.
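The conditional fallback described above, as a std-only helper:

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Canonicalize, falling back to the raw path only when it does not
/// exist; permission and other I/O errors propagate instead of being
/// silently swallowed by unwrap_or.
fn resolve_path(path: &Path) -> io::Result<PathBuf> {
    match path.canonicalize() {
        Ok(p) => Ok(p),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(path.to_path_buf()),
        Err(e) => Err(e),
    }
}

fn main() {
    // An existing path canonicalizes normally.
    assert!(resolve_path(Path::new(".")).is_ok());
    // A missing path falls back to itself rather than erroring.
    let p = resolve_path(Path::new("no-such-zift-path")).unwrap();
    assert_eq!(p, PathBuf::from("no-such-zift-path"));
}
```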

In `@src/output/text.rs`:
- Around line 13-16: The function currently returns early when
findings.is_empty(), which prevents printing enforcement coverage info
(enforcement_points) and causes text output to omit the coverage summary; change
the logic so that if findings.is_empty() you still write "No authorization
patterns found." to writer but do NOT return—continue to call the same
enforcement coverage generation/printing code that uses enforcement_points (the
same codepath used for non-empty findings) so the coverage summary is always
emitted; update the block around findings.is_empty() in src/output/text.rs (the
writer/writeln and the surrounding control flow) to remove the early return and
ensure enforcement_points printing runs afterward.

In `@src/scanner/discovery.rs`:
- Around line 39-49: The current loop silently ignores failures when adding
exclude patterns and when building overrides: detect errors from
OverrideBuilder::add while iterating over exclude_patterns and log a warning
including the invalid pattern and error; likewise, when overrides.build()
returns Err, log a warning with the build error and the root or patterns and
avoid silently proceeding—use the same logging facility used elsewhere (e.g.,
log::warn! or the crate's logger) and still only call builder.overrides(ov) on
Ok(ov).

---

Nitpick comments:
In `@rules/typescript/express-auth-middleware.toml`:
- Around line 7-14: The query only captures bare identifier arguments to use()
and misses invoked middleware forms; update the Tree-sitter query (the query
variable that uses captures `@method`, `@middleware_name`, `@match`) to also match
arguments that are call_expression nodes and member_expression call_expression
nodes (e.g., app.use(requireAuth()) and app.use(passport.authenticate("jwt"))).
Specifically add alternative patterns for the arguments: a direct
call_expression (with its callee captured) and a member_expression whose parent
is a call_expression, and capture those callee identifiers/member_expression as
`@middleware_name`/`@middleware_call` so invoked middleware is recognized by the
same rule.

In `@rules/typescript/express-route-middleware.toml`:
- Around line 8-16: The current call_expression query only binds middleware when
the second argument is a plain (identifier) `@middleware_name`; update the pattern
so `@middleware_name` also captures member expressions and invocation forms (e.g.,
auth.requireAuth, requireAuth(), or auth.requireAuth()) by allowing the second
argument to be (identifier | member_expression | call_expression) and optionally
a call_expression whose function is a member_expression, keeping the existing
captures like (member_expression property:(property_identifier) `@method`) and the
outer (call_expression function: ... ) `@match` so the rule will match both direct
identifiers, dotted properties, and common invoked middleware factories.

In `@rules/typescript/feature-gate-check.toml`:
- Around line 29-35: Add a negative test case to the existing [[rule.tests]]
entries to verify predicate filtering: create a new [[rule.tests]] block with an
input that uses a non-matching property name (e.g., `if (user.name ===
"admin")`) and set expect_match = false so the rule does not trigger; reference
the existing rule.tests/input/expect_match pattern to mirror the positive case
and ensure non-matching properties are excluded by the predicate filter.

In `@rules/typescript/jwt-token-check.toml`:
- Line 6: The rule's description string ("JWT token verification or
authorization header access") is inconsistent with the predicate which includes
the "sign" method (used for token creation); either update the description to
mention signing/creation (e.g., add "signing/creation" or "signing") or remove
"sign" from the method predicate so the rule only targets
verification/authorization checks — locate the rule with the description line
and the predicate that lists "sign" and apply one of these two fixes
consistently (also update the similar occurrence referenced around the other
predicate block).

In `@rules/typescript/session-auth-check.toml`:
- Around line 8-15: The rule currently matches any call_expression with a
member_expression by method name (captured as `@method`) which yields false
positives; modify the tree-sitter pattern to also capture the
member_expression's object as a receiver (e.g., add a capture like `@receiver` on
the object node within member_expression) and add a new predicate (e.g.,
rule.predicates.receiver) that matches common receiver identifiers such as
^(req|request|ctx|session|ctx.request)$ so the rule only triggers when the
method (isAuthenticated|isLoggedIn|isAuthorized) is invoked on one of those
receivers; keep the existing method predicate (rule.predicates.method) and
ensure both predicates must match.

In `@src/cli.rs`:
- Around line 88-104: The CLI deep-scan flags (fields provider, model, max_cost,
api_key in the CLI struct) are documented as requiring --deep but clap won't
enforce that; update each field's #[arg(...)] attribute to include requires =
"deep" (e.g., add requires = "deep" to the attributes for provider, model,
max_cost and api_key) so clap validates that those options can only be used when
--deep is present while preserving existing attributes like long and env.

In `@src/commands/extract.rs`:
- Around line 17-32: The fallback logic in execute() for ExtractArgs currently
compares CLI values to hardcoded defaults ("app" and "./policies/generated"),
which is fragile; replace these literal checks by referencing shared constants
or by removing CLI defaults and treating those fields as Option so config wins
when present. Specifically, introduce shared DEFAULT_PACKAGE_PREFIX and
DEFAULT_OUTPUT_DIR constants (used by cli.rs and here) or change
ExtractArgs.package_prefix/output_dir to Option and adjust execute() to use
config.extract.package_prefix.unwrap_or(args.package_prefix) and
config.extract.output_dir.map(PathBuf::from).unwrap_or_else(||
args.output_dir.clone()) so the comparisons to hardcoded strings are eliminated
and the fallback behavior is deterministic.
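One plausible shape for the Option-based fallback, with shared constants instead of re-typed literals (names are illustrative, and the CLI > config > default precedence is an assumption):

```rust
use std::path::PathBuf;

// Hypothetical shared defaults, usable from both cli.rs and extract.rs.
const DEFAULT_PACKAGE_PREFIX: &str = "app";
const DEFAULT_OUTPUT_DIR: &str = "./policies/generated";

struct ExtractConfig {
    package_prefix: Option<String>,
    output_dir: Option<String>,
}

/// Explicit CLI value wins, then config, then the shared default --
/// no comparison against hardcoded default strings needed.
fn effective_prefix(cli: Option<String>, cfg: &ExtractConfig) -> String {
    cli.or_else(|| cfg.package_prefix.clone())
        .unwrap_or_else(|| DEFAULT_PACKAGE_PREFIX.to_string())
}

fn effective_output_dir(cli: Option<PathBuf>, cfg: &ExtractConfig) -> PathBuf {
    cli.or_else(|| cfg.output_dir.clone().map(PathBuf::from))
        .unwrap_or_else(|| PathBuf::from(DEFAULT_OUTPUT_DIR))
}

fn main() {
    let cfg = ExtractConfig {
        package_prefix: Some("acme".into()),
        output_dir: None,
    };
    assert_eq!(effective_prefix(None, &cfg), "acme");
    assert_eq!(effective_prefix(Some("cli".into()), &cfg), "cli");
    assert_eq!(effective_output_dir(None, &cfg), PathBuf::from("./policies/generated"));
}
```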

In `@src/commands/init.rs`:
- Around line 25-32: The exists-then-write sequence (path.exists() followed by
std::fs::write(&path, DEFAULT_CONFIG)) has a TOCTOU race; replace it with an
atomic create-if-not-exists using
OpenOptions::new().write(true).create_new(true).open(&path) and then write
DEFAULT_CONFIG into that file handle, propagating errors as before (return the
same ZiftError::General on already-existing or map the io error), so the init
logic in this function becomes atomic and race-free.
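The atomic create-if-not-exists pattern in isolation; `DEFAULT_CONFIG` here is a placeholder value:

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

const DEFAULT_CONFIG: &str = "# zift configuration\n";

/// Create the config file atomically: `create_new` fails with
/// AlreadyExists in one syscall, closing the exists-then-write
/// TOCTOU window.
fn init_config(path: &Path) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(path)?;
    file.write_all(DEFAULT_CONFIG.as_bytes())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("zift_init_demo.toml");
    let _ = std::fs::remove_file(&path);
    init_config(&path)?; // first call succeeds
    let err = init_config(&path).unwrap_err(); // second call races no one
    assert_eq!(err.kind(), io::ErrorKind::AlreadyExists);
    std::fs::remove_file(&path)
}
```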

In `@src/output/json.rs`:
- Around line 38-46: The aggregation uses lowercased keys for categories but not
for confidences, causing inconsistent JSON keys; in the loop over findings where
by_category and by_confidence are updated (variables f, f.category,
f.confidence, entries via .entry(...).or_default()), normalize the confidence
key the same way as category (e.g., call .to_lowercase() on f.confidence when
building the entry key) so both maps produce consistently-cased keys in the
output.

In `@src/rules/mod.rs`:
- Around line 90-113: Iteration over r.predicates (a HashMap) yields
non-deterministic predicate order; ensure deterministic ordering by sorting
predicates by their capture name before constructing the Vec<(String,
Predicate)>. Modify the loop building predicates (the block that creates
Predicate::Match/Eq/NotMatch/NotEq and pushes into predicates) to first collect
r.predicates.into_iter() (or keys) into a temporary Vec, sort that Vec by
capture_name (or key) and then iterate the sorted collection to build and push
(capture_name, p); keep the same error handling and use the same Predicate enum
and check_predicates logic so behavior is identical except for deterministic
ordering.

In `@src/scanner/matcher.rs`:
- Around line 65-69: The code silently drops invalid UTF-8 by calling
capture.node.utf8_text(source).unwrap_or_default(); change this to explicitly
handle the Result from capture.node.utf8_text(source) (e.g. match or if let
Err(e) = ...) and either return an error from the surrounding function or emit a
warning with context (include capture identifier like capture.index or
capture.name and the error e) instead of producing an empty string; replace the
unwrap_or_default call and ensure callers/signature of the function (in
matcher.rs) propagate or handle the error accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 58dde901-c70e-496a-b479-d3195910e62c

📥 Commits

Reviewing files that changed from the base of the PR and between c7e8a9f and b0f6237.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (44)
  • .claude/commands/address-pr-feedback.md
  • .claude/commands/commit.md
  • .claude/commands/handoff.md
  • .claude/commands/pr.md
  • .claude/commands/review.md
  • Cargo.toml
  • rules/typescript/express-auth-middleware.toml
  • rules/typescript/express-route-middleware.toml
  • rules/typescript/feature-gate-check.toml
  • rules/typescript/has-role-call.toml
  • rules/typescript/jwt-token-check.toml
  • rules/typescript/nestjs-roles-decorator.toml
  • rules/typescript/nestjs-use-guards.toml
  • rules/typescript/ownership-check.toml
  • rules/typescript/permission-check-call.toml
  • rules/typescript/role-check-conditional.toml
  • rules/typescript/role-includes-check.toml
  • rules/typescript/session-auth-check.toml
  • src/cli.rs
  • src/commands/extract.rs
  • src/commands/init.rs
  • src/commands/mod.rs
  • src/commands/report.rs
  • src/commands/rules.rs
  • src/commands/scan.rs
  • src/config.rs
  • src/error.rs
  • src/logging.rs
  • src/main.rs
  • src/output/json.rs
  • src/output/mod.rs
  • src/output/text.rs
  • src/rego/grouping.rs
  • src/rego/mod.rs
  • src/rego/templates.rs
  • src/rego/validator.rs
  • src/rules/embedded.rs
  • src/rules/mod.rs
  • src/scanner/discovery.rs
  • src/scanner/imports.rs
  • src/scanner/matcher.rs
  • src/scanner/mod.rs
  • src/scanner/parser.rs
  • src/types.rs

Comment thread rules/typescript/has-role-call.toml
Comment thread rules/typescript/nestjs-roles-decorator.toml Outdated
Comment thread rules/typescript/ownership-check.toml Outdated
Comment thread rules/typescript/role-check-conditional.toml
Comment thread rules/typescript/role-includes-check.toml
Comment thread src/scanner/imports.rs
Comment thread src/scanner/imports.rs Outdated
Comment thread src/scanner/matcher.rs
Comment thread src/scanner/mod.rs
Comment thread src/scanner/parser.rs Outdated
- Fix UTF-8 byte-slicing panic in grouping.rs by truncating via chars()
- Fix strip_quotes panic on single-char strings with length guard
- Remove SetMetadata from nestjs-roles-decorator rule (different arg semantics)
- Use word-boundary regex instead of substring contains() for enforcement point detection
- Deduplicate enforcement points to prevent double-counting across rules
- Remove unused ParseError variant to fix clippy warning

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
src/scanner/imports.rs (1)

23-39: ⚠️ Potential issue | 🟠 Major

Namespace import query still missing.

The queries only handle named imports (import { foo }) and default imports (import foo). Files using import * as authz from "../policy" followed by authz.authorize(...) won't have their imports detected, causing valid enforcement points to be incorrectly reported as inline findings.


🛠️ Proposed fix to add namespace import support
 const TS_DEFAULT_IMPORT_QUERY: &str = r#"
 (import_statement
   (import_clause
     (identifier) @name)
   source: (string) @source)
 "#;
+
+/// Tree-sitter query for namespace imports: `import * as foo from 'bar'`
+const TS_NAMESPACE_IMPORT_QUERY: &str = r#"
+(import_statement
+  (import_clause
+    (namespace_import
+      (identifier) @name))
+  source: (string) @source)
+"#;

Then update line 56 to include the new query:

-    for query_src in [TS_NAMED_IMPORT_QUERY, TS_DEFAULT_IMPORT_QUERY] {
+    for query_src in [TS_NAMED_IMPORT_QUERY, TS_DEFAULT_IMPORT_QUERY, TS_NAMESPACE_IMPORT_QUERY] {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scanner/imports.rs` around lines 23 - 39, Add a Tree-sitter query
constant to capture namespace imports (e.g., import * as alias from "source") —
name it TS_NAMESPACE_IMPORT_QUERY — that binds the namespace identifier to `@name`
and the string literal to `@source`, similar to TS_NAMED_IMPORT_QUERY and
TS_DEFAULT_IMPORT_QUERY; then update the code that currently uses only
TS_NAMED_IMPORT_QUERY and TS_DEFAULT_IMPORT_QUERY (the place where import
queries are collected/iterated) to include TS_NAMESPACE_IMPORT_QUERY so
namespace imports (import * as alias from "...") are detected and mapped to the
same name/source pairs.
🧹 Nitpick comments (1)
src/rego/templates.rs (1)

226-234: Add ABAC regression test for quoted RHS value.

There’s no test protecting the ABAC default-stub rendering path that produced this quoting issue. Please add one asserting generated output contains == "..." for string literals.

🧪 Suggested test addition
@@
     fn default_stub_rbac() {
         let stub = generate_default_stub(
             AuthCategory::Rbac,
             r#"authorize(user, "audit:read", { type: "audit" })"#,
         );
         assert!(stub.contains("input.user.role in"));
         assert!(stub.contains("audit:read"));
     }
+
+    #[test]
+    fn default_stub_abac_quotes_value_literal() {
+        let stub = generate_default_stub(
+            AuthCategory::Abac,
+            r#"if (user.department == "finance") { allow(); }"#,
+        );
+        assert!(stub.contains(r#"input.user.department == "finance""#));
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rego/templates.rs` around lines 226 - 234, Add an ABAC regression test
alongside default_stub_rbac that calls generate_default_stub with
AuthCategory::Abac and an authorize rule containing a string literal RHS (e.g.,
authorize(user, "audit:read", { type: "audit" })), then assert the rendered stub
contains the exact comparison form == "..." (for example
assert!(stub.contains("== \"audit:read\""))); ensure the test uses the same
helper generate_default_stub and checks string literals are quoted in the output
to prevent regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/rego/templates.rs`:
- Around line 57-58: The rendered ABAC stub ends up with unquoted identifiers
because render_template (which calls strip_quotes) removes quotes added in
generate_default_stub; to fix, change the Rego template in src/rego/templates.rs
so the value placeholder is a literal string (e.g. input.user.{{attribute}} ==
"{{value}}") and then remove the redundant quote-wrapping in
generate_default_stub (stop using format!("\"{v}\"") so values are passed
unquoted into render_template); alternatively adjust
render_template/strip_quotes to not strip quotes for template values, but the
simpler fix is the template->include quotes + remove quoting in
generate_default_stub and update the other ABAC-related templates at the noted
lines (145–149) accordingly.

---

Duplicate comments:
In `@src/scanner/imports.rs`:
- Around line 23-39: Add a Tree-sitter query constant to capture namespace
imports (e.g., import * as alias from "source") — name it
TS_NAMESPACE_IMPORT_QUERY — that binds the namespace identifier to `@name` and the
string literal to `@source`, similar to TS_NAMED_IMPORT_QUERY and
TS_DEFAULT_IMPORT_QUERY; then update the code that currently uses only
TS_NAMED_IMPORT_QUERY and TS_DEFAULT_IMPORT_QUERY (the place where import
queries are collected/iterated) to include TS_NAMESPACE_IMPORT_QUERY so
namespace imports (import * as alias from "...") are detected and mapped to the
same name/source pairs.
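The suggested constant can be sketched as a tree-sitter S-expression query; the node and field names (`import_clause`, `namespace_import`, `source`) are assumed from the tree-sitter JavaScript/TypeScript grammar, and the constant name follows the comment:

```rust
// Hypothetical sketch of the suggested constant. Node names are taken from the
// tree-sitter JS/TS grammar, where `namespace_import` wraps `* as alias`.
const TS_NAMESPACE_IMPORT_QUERY: &str = r#"
(import_statement
  (import_clause
    (namespace_import (identifier) @name))
  source: (string) @source)
"#;

fn main() {
    // Like TS_NAMED_IMPORT_QUERY / TS_DEFAULT_IMPORT_QUERY, the query binds
    // the local alias to @name and the module path string to @source.
    assert!(TS_NAMESPACE_IMPORT_QUERY.contains("@name"));
    assert!(TS_NAMESPACE_IMPORT_QUERY.contains("@source"));
}
```

With this in place, the import-collection loop only needs the new constant appended to the slice of queries it already iterates.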

---

Nitpick comments:
In `@src/rego/templates.rs`:
- Around line 226-234: Add an ABAC regression test alongside default_stub_rbac
that calls generate_default_stub with AuthCategory::Abac and an authorize rule
containing a string literal RHS (e.g., authorize(user, "audit:read", { type:
"audit" })), then assert the rendered stub contains the exact comparison form ==
"..." (for example assert!(stub.contains("== \"audit:read\""))); ensure the test
uses the same helper generate_default_stub and checks string literals are quoted
in the output to prevent regressions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b05efddb-eac0-467c-92ee-84c5a101898b

📥 Commits

Reviewing files that changed from the base of the PR and between b0f6237 and cda04e1.

📒 Files selected for processing (6)
  • rules/typescript/nestjs-roles-decorator.toml
  • src/error.rs
  • src/rego/grouping.rs
  • src/rego/templates.rs
  • src/scanner/imports.rs
  • src/scanner/mod.rs
✅ Files skipped from review due to trivial changes (2)
  • src/error.rs
  • src/rego/grouping.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • rules/typescript/nestjs-roles-decorator.toml

- Return error on canonicalize failure instead of silent fallback
- Use bounds-checked .get() for capture index in matcher
- Filter path traversal components from derived output paths
- Add depth limit and symlink skipping for rule directory traversal

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (1)
src/rego/grouping.rs (1)

58-63: ⚠️ Potential issue | 🟠 Major

Path containment is still bypassable for absolute/prefixed paths.

Filtering only ParentDir leaves RootDir/Windows Prefix components intact. If stem is absolute, output_dir.join(safe_stem) can resolve outside output_dir.

Suggested fix
-    // Filter out path traversal components
-    let safe_stem: PathBuf = stem
-        .components()
-        .filter(|c| !matches!(c, std::path::Component::ParentDir))
-        .collect();
+    // Keep only normal path components to guarantee a relative safe path
+    let safe_stem: PathBuf = stem
+        .components()
+        .filter_map(|c| match c {
+            std::path::Component::Normal(seg) => Some(seg),
+            _ => None, // drop Prefix/RootDir/CurDir/ParentDir
+        })
+        .collect();
#!/bin/bash
# Verify current sanitization only removes ParentDir in derive_output_path
rg -n -C2 'derive_output_path|Component::ParentDir|output_dir\.join' src/rego/grouping.rs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rego/grouping.rs` around lines 58-63: derive_output_path currently only
strips ParentDir but still allows RootDir/Prefix components so an absolute or
prefixed `stem` can escape `output_dir` when you call
`output_dir.join(safe_stem)`. Fix by only collecting safe, relative components
(accept only std::path::Component::Normal, optionally ignore CurDir) and
explicitly skip RootDir and Prefix and ParentDir; build `safe_stem` from those
Normal components so it can never be absolute or prefixed, and handle the case
of an empty result (e.g., use a default file name) before calling
`output_dir.join(...)`. Reference symbols: derive_output_path, stem, safe_stem,
output_dir.join.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/commands/scan.rs`:
- Around line 24-44: Check args.format for OutputFormat::Sarif and return the
ZiftError immediately before calling scanner::scan or loading rules; move the
match on args.format (or an early if args.format == OutputFormat::Sarif) up so
the unsupported-format branch executes before scanner::scan, loaded_rules, and
any expensive work (symbols: args.format, OutputFormat::Sarif, scanner::scan,
loaded_rules, ZiftError::General).

In `@src/rego/grouping.rs`:
- Around line 149-151: The check using trimmed.starts_with("default allow") is
too broad and can match identifiers like "default allow_read"; instead split the
trimmed line on whitespace and ensure the first token is "default" and the
second token is exactly "allow" before treating it as the default-allow line
(use the existing trimmed and has_default_allow variables and update the
conditional that currently does trimmed.starts_with("default allow") to this
stricter token-based check).
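The stricter token-based check described above can be sketched as follows (the helper name is illustrative; the real code operates on the existing `trimmed` variable inline):

```rust
// Sketch of the token-based check: the line counts as a default-allow rule
// only when the first token is exactly "default" and the second is exactly
// "allow", so identifiers like "allow_read" no longer match.
fn is_default_allow_line(line: &str) -> bool {
    let trimmed = line.trim();
    let mut tokens = trimmed.split_whitespace();
    tokens.next() == Some("default") && tokens.next() == Some("allow")
}

fn main() {
    assert!(is_default_allow_line("default allow := false"));
    assert!(!is_default_allow_line("default allow_read := false"));
}
```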

In `@src/rules/mod.rs`:
- Around line 92-111: The code that builds the Predicate (the block using
pred.match_re, pred.eq, pred.not_match, pred.not_eq and producing
Predicate::Match/ Eq/NotMatch/NotEq) must validate that exactly one operator is
present; compute how many of pred.match_re, pred.eq, pred.not_match, pred.not_eq
are Some and if the count != 1 return a ZiftError::RuleParse for r.id with a
message like "predicate '{capture_name}' must specify exactly one operator"
before constructing the Predicate, otherwise proceed to construct the
appropriate Predicate variant as currently done.
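A minimal sketch of that exactly-one-operator validation; the struct and field names mirror the comment, while the error type is simplified to a `String` stand-in for `ZiftError::RuleParse`:

```rust
// Simplified stand-in for the TOML-deserialized predicate table.
struct PredicateToml {
    match_re: Option<String>,
    eq: Option<String>,
    not_match: Option<String>,
    not_eq: Option<String>,
}

// Count how many operator fields are set; anything other than exactly one
// is a rule-parse error for this capture.
fn validate_operators(pred: &PredicateToml, capture_name: &str) -> Result<(), String> {
    let count = [
        pred.match_re.is_some(),
        pred.eq.is_some(),
        pred.not_match.is_some(),
        pred.not_eq.is_some(),
    ]
    .iter()
    .filter(|present| **present)
    .count();
    if count != 1 {
        return Err(format!(
            "predicate '{capture_name}' must specify exactly one operator"
        ));
    }
    Ok(())
}

fn main() {
    let ok = PredicateToml { match_re: None, eq: Some("admin".into()), not_match: None, not_eq: None };
    assert!(validate_operators(&ok, "role").is_ok());
    let ambiguous = PredicateToml { match_re: Some(".*".into()), eq: Some("admin".into()), not_match: None, not_eq: None };
    assert!(validate_operators(&ambiguous, "role").is_err());
}
```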

In `@src/scanner/matcher.rs`:
- Around line 151-156: compute_finding_id currently hashes only (rule_id,
file_path, snippet) so identical snippets in the same file collide; update
compute_finding_id to incorporate the match's unique location (e.g., byte offset
or span start/end or line/column) into the hash input so repeated, distinct
matches in the same file produce different IDs—modify callers that build the
input to pass the match location and change compute_finding_id signature to
accept and include that location (or a token like match_start and match_end) and
ensure dedup_findings still uses the new ID scheme.
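The location-aware ID can be sketched with std's `DefaultHasher` (the real implementation likely uses a different hash; the signature and field order here are illustrative). The point is that the match's byte span participates in the hash, so identical snippets at different offsets get distinct IDs:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::Path;

// Sketch: hash rule id, file path, match span, and snippet together so that
// repeated identical matches in one file no longer collide.
fn compute_finding_id(
    rule_id: &str,
    file_path: &Path,
    start_byte: usize,
    end_byte: usize,
    snippet: &str,
) -> String {
    let mut hasher = DefaultHasher::new();
    rule_id.hash(&mut hasher);
    file_path.hash(&mut hasher);
    start_byte.hash(&mut hasher);
    end_byte.hash(&mut hasher);
    snippet.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

fn main() {
    let a = compute_finding_id("r1", Path::new("src/a.ts"), 10, 25, r#"hasRole("admin")"#);
    let b = compute_finding_id("r1", Path::new("src/a.ts"), 30, 45, r#"hasRole("admin")"#);
    assert_ne!(a, b); // same snippet, different span -> distinct IDs
}
```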

---

Duplicate comments:
In `@src/rego/grouping.rs`:
- Around line 58-63: derive_output_path currently only strips ParentDir but
still allows RootDir/Prefix components so an absolute or prefixed `stem` can
escape `output_dir` when you call `output_dir.join(safe_stem)`. Fix by only
collecting safe, relative components (accept only std::path::Component::Normal,
optionally ignore CurDir) and explicitly skip RootDir and Prefix and ParentDir;
build `safe_stem` from those Normal components so it can never be absolute or
prefixed, and handle the case of an empty result (e.g., use a default file name)
before calling `output_dir.join(...)`. Reference symbols: derive_output_path,
stem, safe_stem, output_dir.join.


📥 Commits

Reviewing files that changed from the base of the PR and between cda04e1 and 8bd3685.

📒 Files selected for processing (4)
  • src/commands/scan.rs
  • src/rego/grouping.rs
  • src/rules/mod.rs
  • src/scanner/matcher.rs

- Split permission functions out of has-role-call.toml into new
  authorize-function-call.toml with permission-oriented Rego template
- Remove inequality operators from ownership-check query to prevent
  semantic mismatch in generated Rego
- Add equality operator constraint to role-check-conditional query
- Use idiomatic OPA array membership pattern in role-includes-check
- Detect permission-style strings (containing ':') in generate_default_stub
  and emit permission-based Rego instead of role-based
- Validate TypeScript rules against both TS and TSX grammars in
  validate and test actions

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
src/rego/templates.rs (1)

57-58: ⚠️ Potential issue | 🟠 Major

ABAC stub currently renders an unquoted value (== admin) instead of a string literal.

Because render_template strips surrounding quotes (Line 14), the value prepared at Lines 156-160 loses quotes and Line 57 becomes input.user.<attr> == admin. In Rego this is an identifier, not a string literal.

🔧 Proposed fix
-    input.user.{{attribute}} == {{value}}
+    input.user.{{attribute}} == "{{value}}"
         AuthCategory::Abac => {
             vars.insert(
                 "attribute".to_string(),
                 literals.first().cloned().unwrap_or("TODO".into()),
             );
             vars.insert(
                 "value".to_string(),
-                literals
-                    .get(1)
-                    .map(|v| format!("\"{v}\""))
-                    .unwrap_or("\"TODO\"".into()),
+                literals.get(1).cloned().unwrap_or("TODO".into()),
             );
         }
#!/bin/bash
# Show the exact ABAC template + render path that causes quote loss.
sed -n '9,18p' src/rego/templates.rs
sed -n '52,60p' src/rego/templates.rs
sed -n '150,162p' src/rego/templates.rs

python - <<'PY'
import re
template = 'input.user.{{attribute}} == {{value}}'
vars = {'attribute':'role', 'value':'"admin"'}
def strip_quotes(s):
    s = s.strip()
    if len(s) >= 2 and ((s.startswith('"') and s.endswith('"')) or (s.startswith("'") and s.endswith("'"))):
        return s[1:-1]
    return s
out = re.sub(r"\{\{(\w+)\}\}", lambda m: strip_quotes(vars.get(m.group(1), m.group(0))), template)
print(out)  # expected current behavior: input.user.role == admin
PY

Also applies to: 156-160

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rego/templates.rs` around lines 57-58: the ABAC template
input.user.{{attribute}} == {{value}} ends up with an unquoted identifier
because render_template strips surrounding quotes from substitutions; update
render_template (or the code path that prepares the ABAC template values) so
that values intended as string literals are preserved or re-quoted: either stop
removing surrounding quotes for the {{value}} placeholder when rendering the
ABAC template, or detect the ABAC equality pattern (input.user.{{attribute}} ==
{{value}}) and wrap the substituted value in quotes (e.g., ensure "admin"
remains "admin") before returning the rendered string; change the logic in
render_template and the code that prepares the vars so that the {{value}}
substitution yields a valid Rego string literal rather than an identifier.
🧹 Nitpick comments (1)
rules/typescript/authorize-function-call.toml (1)

27-49: Add one explicit JavaScript fixture to exercise the JS path.

All current tests rely on default language selection (TypeScript first). Add a language = "javascript" case so this dual-language rule is explicitly covered.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rules/typescript/authorize-function-call.toml` around lines 27-49: add an
explicit JavaScript test case to the rule's test matrix: within the
authorize-function-call.toml file add a [[rule.tests]] block with language =
"javascript" and one of the existing inputs (e.g., the
hasPermission("orders:read") or authorize("admin") snippet) and the
corresponding expect_match value; locate the existing [[rule.tests]] entries
that use inputs with hasPermission, authorize, and hasRole to mirror one of
those blocks but include language = "javascript" so the JS path is explicitly
exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rules/typescript/authorize-function-call.toml`:
- Around line 1-49: The new rule "ts-authorize-function-call" (rule id
ts-authorize-function-call) was created but not added to the EMBEDDED_RULES
list, so it won't be loaded by the default CLI; update the EMBEDDED_RULES
constant to include "authorize-function-call" (or the rule's identifier string
used by the list) alongside the existing "permission-check-call" entry and
ensure the rule is present in the same collection/array initialization where
EMBEDDED_RULES is defined so it gets embedded at build/runtime (refer to the
EMBEDDED_RULES symbol and the existing permission-check-call registration to
mirror formatting and placement).

In `@src/commands/rules.rs`:
- Line 76: The code directly indexes rule.languages[0] when computing lang (via
test.language.unwrap_or(rule.languages[0])), which panics if languages is empty;
change this to safely access the first language (e.g., use rule.languages.get(0)
or .first()) and handle the None case by returning a clear error or skipping the
test with a logged message that includes the rule identifier (reference
test.language and rule.languages in rules.rs), or update parse_rule/RuleToml
validation to reject empty languages so the code can assume at least one
language exists.
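The panic-free lookup the comment asks for can be sketched like this; `Rule` and `RuleTest` are simplified stand-ins for the real types, and the function name is illustrative:

```rust
// Simplified stand-ins for the rule/test structs in rules.rs.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Language { TypeScript, JavaScript }

struct Rule { id: String, languages: Vec<Language> }
struct RuleTest { language: Option<Language> }

// Prefer the test's explicit language, else fall back to the rule's first
// declared language; an empty languages list becomes a clear error that
// names the offending rule instead of an index panic.
fn test_language(rule: &Rule, test: &RuleTest) -> Result<Language, String> {
    match test.language.or_else(|| rule.languages.first().copied()) {
        Some(lang) => Ok(lang),
        None => Err(format!("rule '{}' declares no languages", rule.id)),
    }
}

fn main() {
    let empty = Rule { id: "ts-authorize-function-call".into(), languages: vec![] };
    let test = RuleTest { language: None };
    assert!(test_language(&empty, &test).is_err()); // no panic on empty languages

    let js = Rule { id: "ok".into(), languages: vec![Language::JavaScript] };
    assert_eq!(test_language(&js, &test), Ok(Language::JavaScript));
}
```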

---

Duplicate comments:
In `@src/rego/templates.rs`:
- Around line 57-58: The ABAC template input.user.{{attribute}} == {{value}}
ends up with an unquoted identifier because render_template strips surrounding
quotes from substitutions; update render_template (or the code path that
prepares the ABAC template values) so that values intended as string literals
are preserved or re-quoted: either stop removing surrounding quotes for the
{{value}} placeholder when rendering the ABAC template, or detect the ABAC
equality pattern (input.user.{{attribute}} == {{value}}) and wrap the
substituted value in quotes (e.g., ensure "admin" remains "admin") before
returning the rendered string; change the logic in render_template and the code
that prepares the vars so that the {{value}} substitution yields a valid Rego
string literal rather than an identifier.

---

Nitpick comments:
In `@rules/typescript/authorize-function-call.toml`:
- Around line 27-49: Add an explicit JavaScript test case to the rule's test
matrix: within the authorize-function-call.toml file add a [[rule.tests]] block
with language = "javascript" and one of the existing inputs (e.g., the
hasPermission("orders:read") or authorize("admin") snippet) and the
corresponding expect_match value; locate the existing [[rule.tests]] entries
that use inputs with hasPermission, authorize, and hasRole to mirror one of
those blocks but include language = "javascript" so the JS path is explicitly
exercised.


📥 Commits

Reviewing files that changed from the base of the PR and between 8bd3685 and 230d6a2.

📒 Files selected for processing (7)
  • rules/typescript/authorize-function-call.toml
  • rules/typescript/has-role-call.toml
  • rules/typescript/ownership-check.toml
  • rules/typescript/role-check-conditional.toml
  • rules/typescript/role-includes-check.toml
  • src/commands/rules.rs
  • src/rego/templates.rs
✅ Files skipped from review due to trivial changes (1)
  • rules/typescript/role-check-conditional.toml
🚧 Files skipped from review as they are similar to previous changes (3)
  • rules/typescript/role-includes-check.toml
  • rules/typescript/has-role-call.toml
  • rules/typescript/ownership-check.toml

- Fix ABAC template rendering bare identifiers instead of quoted strings
- Fail fast for unsupported SARIF format before scanning
- Fix overbroad `default allow` prefix matching in grouping
- Enforce exactly one predicate operator per capture in rule parsing
- Include line numbers in finding ID hash to prevent collisions
- Register authorize-function-call rule in embedded rules
- Guard against panic on empty languages array in rule tests
- Prefix digit-leading segments in Rego package names with underscore
- Replace unimplemented! with Result in get_language()

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
src/rego/grouping.rs (1)

72-76: ⚠️ Potential issue | 🟠 Major

Filter to Component::Normal to prevent output-path escape.

On Line 74, only ParentDir is removed. RootDir/Prefix can survive, and output_dir.join(safe_stem) may ignore output_dir when safe_stem is absolute.

Suggested fix
-    let safe_stem: PathBuf = stem
-        .components()
-        .filter(|c| !matches!(c, std::path::Component::ParentDir))
-        .collect();
+    let safe_stem: PathBuf = stem
+        .components()
+        .filter_map(|c| match c {
+            std::path::Component::Normal(seg) => Some(seg),
+            _ => None, // drop ParentDir, CurDir, RootDir, Prefix
+        })
+        .collect();
#!/bin/bash
# Verify current sanitization logic and whether Finding.file can be absolute upstream.

rg -n -A20 -B5 --type rust 'fn derive_output_path' src/rego/grouping.rs

# Inspect where Finding.file is populated across the codebase.
rg -n -C3 --type rust '\bFinding\s*\{' src
rg -n -C3 --type rust '\bfile:\s*[^,]+' src

# Look for absolute/canonicalized path construction that could feed Finding.file.
rg -n -C3 --type rust 'canonicalize\(|current_dir\(|absolut|to_path_buf\(\)' src
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rego/grouping.rs` around lines 72-76: the current sanitization in
derive_output_path collects Path components but only filters ParentDir, allowing
RootDir/Prefix to survive and make safe_stem absolute; instead build safe_stem
as a fresh PathBuf and iterate stem.components(), pushing only
std::path::Component::Normal segments (ignore RootDir, Prefix, CurDir,
ParentDir, etc.) so the resulting safe_stem is always relative before calling
output_dir.join(safe_stem); update the code around variables stem, safe_stem and
output_dir accordingly.
🧹 Nitpick comments (3)
src/rules/mod.rs (1)

226-298: Unit tests adequately cover happy paths.

Consider adding error-case tests (invalid regex, multiple predicates, empty languages) in a future iteration for more comprehensive coverage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/mod.rs` around lines 226-298: add unit tests that cover error and
edge cases in the tests module: create new #[test] functions that call
parse_rule with malformed inputs (e.g., invalid regex in rule.predicates,
conflicting/multiple predicate keys, empty or missing languages) and assert that
parse_rule returns Err (or the expected error variant), and add tests for
merge_rules behavior when given duplicate ids with incompatible fields;
reference parse_rule, merge_rules, SAMPLE_RULE, and PatternRule to locate where
to add these negative tests.
src/commands/rules.rs (1)

64-67: Consider returning an error instead of calling exit(1).

Using std::process::exit(1) bypasses Rust's error handling and makes the function harder to test. Returning Err(ZiftError::General(...)) would allow main() to handle exits uniformly.

♻️ Proposed refactor
         if errors == 0 {
             println!("All {} rules validated successfully.", loaded.len());
         } else {
             eprintln!("{errors} rule(s) failed validation.");
-            std::process::exit(1);
+            return Err(crate::error::ZiftError::General(format!(
+                "{errors} rule(s) failed validation"
+            )));
         }
         println!("{passed} passed, {failed} failed.");
         if failed > 0 {
-            std::process::exit(1);
+            return Err(crate::error::ZiftError::General(format!(
+                "{failed} rule test(s) failed"
+            )));
         }

Also applies to: 139-142

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/rules.rs` around lines 64-67: the code currently calls
std::process::exit(1) after printing validation failures (the
eprintln!("{errors} rule(s) failed validation."); + std::process::exit(1)) which
prevents normal error propagation; change these exits to return
Err(ZiftError::General(format!(...))) with the same failure message (e.g.,
format!("{errors} rule(s) failed validation.")) and update the surrounding
function signature to return Result<..., ZiftError> so callers (including main)
can handle termination; apply the same replacement for the second occurrence
referenced (lines ~139-142) and update any call sites to propagate or map the
error instead of expecting the function to exit.
src/rego/grouping.rs (1)

191-223: Add regression tests for new edge-case guards.

Please add tests for (1) digit-leading package segments (src/2024/auth.ts → app._2024.auth) and (2) output-path sanitization for non-normal components. This protects Lines 37-48 and Lines 72-76 from regressions.

Based on learnings, the digit-leading segment guard in derive_package_name is required and should be regression-tested. As per coding guidelines "**/*.rs: Review code for missing test coverage on new code paths".

Also applies to: 224-285

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rego/grouping.rs` around lines 191-223: add two regression tests
covering edge-case guards: one that verifies derive_package_name handles
digit-leading path segments by prefixing an underscore (e.g.,
Path::new("src/2024/auth.ts") → "app._2024.auth") and another that verifies
derive_output_path sanitizes non-normal path components so output paths remain
under the intended output dir (e.g., inputs with "." or ".." or hyphenated
components are normalized to safe components). Locate and extend the existing
test module that contains derive_package_simple/derive_package_hyphenated and
derive_output_path (functions derive_package_name and derive_output_path) and
add assertions that exercise the digit-leading branch and the sanitization
branch to prevent regressions.
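The digit-leading guard those tests should exercise can be sketched as a per-segment sanitizer; the function name is illustrative, and the hyphen handling mirrors the existing derive_package_hyphenated test:

```rust
// Sketch of segment sanitization for Rego package names: Rego identifiers
// cannot start with a digit, so a digit-leading segment is prefixed with an
// underscore, and non-identifier characters (e.g. '-') are replaced with '_'.
fn sanitize_package_segment(seg: &str) -> String {
    let cleaned: String = seg
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect();
    match cleaned.chars().next() {
        Some(c) if c.is_ascii_digit() => format!("_{cleaned}"),
        _ => cleaned,
    }
}

fn main() {
    assert_eq!(sanitize_package_segment("2024"), "_2024");   // digit-leading branch
    assert_eq!(sanitize_package_segment("my-auth"), "my_auth"); // hyphen rewrite
    assert_eq!(sanitize_package_segment("auth"), "auth");    // untouched
}
```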
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/rego/grouping.rs`:
- Around line 72-76: The current sanitization in derive_output_path collects
Path components but only filters ParentDir, allowing RootDir/Prefix to survive
and make safe_stem absolute; instead build safe_stem as a fresh PathBuf and
iterate stem.components(), pushing only std::path::Component::Normal segments
(ignore RootDir, Prefix, CurDir, ParentDir, etc.) so the resulting safe_stem is
always relative before calling output_dir.join(safe_stem); update the code
around variables stem, safe_stem and output_dir accordingly.

---

Nitpick comments:
In `@src/commands/rules.rs`:
- Around line 64-67: The code currently calls std::process::exit(1) after
printing validation failures (the eprintln!("{errors} rule(s) failed
validation."); + std::process::exit(1)) which prevents normal error propagation;
change these exits to return Err(ZiftError::General(format!(...))) with the same
failure message (e.g., format!("{errors} rule(s) failed validation.")) and
update the surrounding function signature to return Result<..., ZiftError> so
callers (including main) can handle termination; apply the same replacement for
the second occurrence referenced (lines ~139-142) and update any call sites to
propagate or map the error instead of expecting the function to exit.

In `@src/rego/grouping.rs`:
- Around line 191-223: Add two regression tests covering edge-case guards: one
that verifies derive_package_name handles digit-leading path segments by
prefixing an underscore (e.g., Path::new("src/2024/auth.ts") → "app._2024.auth")
and another that verifies derive_output_path sanitizes non-normal path
components so output paths remain under the intended output dir (e.g., inputs
with "." or ".." or hyphenated components are normalized to safe components).
Locate and extend the existing test module that contains
derive_package_simple/derive_package_hyphenated and derive_output_path
(functions derive_package_name and derive_output_path) and add assertions that
exercise the digit-leading branch and the sanitization branch to prevent
regressions.

In `@src/rules/mod.rs`:
- Around line 226-298: Add unit tests that cover error and edge cases in the
tests module: create new #[test] functions that call parse_rule with malformed
inputs (e.g., invalid regex in rule.predicates, conflicting/multiple predicate
keys, empty or missing languages) and assert that parse_rule returns Err (or the
expected error variant), and add tests for merge_rules behavior when given
duplicate ids with incompatible fields; reference parse_rule, merge_rules,
SAMPLE_RULE, and PatternRule to locate where to add these negative tests.


📥 Commits

Reviewing files that changed from the base of the PR and between 230d6a2 and c1ae25b.

📒 Files selected for processing (9)
  • src/commands/rules.rs
  • src/commands/scan.rs
  • src/rego/grouping.rs
  • src/rego/templates.rs
  • src/rules/embedded.rs
  • src/rules/mod.rs
  • src/scanner/matcher.rs
  • src/scanner/mod.rs
  • src/scanner/parser.rs
✅ Files skipped from review due to trivial changes (1)
  • src/scanner/mod.rs
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/commands/scan.rs
  • src/rules/embedded.rs
  • src/scanner/parser.rs
  • src/rego/templates.rs
  • src/scanner/matcher.rs

boorad added 2 commits April 22, 2026 20:35
- Add tracing::warn for out-of-range capture indices instead of silent skip
- Strengthen path traversal protection: keep only Normal path components
- Add canonicalization check in extract to verify output stays in output dir
- Validator: use bare identifier for non-quoted placeholders so templates
  with identifier-position vars (e.g. input.user.{{attr}}) validate correctly
- Matcher: execute_query now returns Result and errors on invalid capture
  index instead of silently skipping (surfaces rule bugs)
- Extract: canonicalize parent directory before writing to prevent TOCTOU
  issues with symlinked intermediate directories

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
src/commands/extract.rs (1)

70-93: ⚠️ Potential issue | 🔴 Critical

Don’t canonicalize the target file before it exists.

rego_file.output_path.canonicalize() at Line 80 fails for newly generated files, so extract errors out on a fresh output directory before writing anything. Canonicalize an existing path such as parent() for the containment check, then write the file.

🔧 Proposed fix
-        let canonical_file = rego_file.output_path.canonicalize().map_err(|e| {
+        let parent = rego_file.output_path.parent().ok_or_else(|| {
+            ZiftError::General(format!(
+                "output path '{}' has no parent directory",
+                rego_file.output_path.display()
+            ))
+        })?;
+        let canonical_parent = parent.canonicalize().map_err(|e| {
             ZiftError::General(format!(
-                "failed to resolve output path '{}': {e}",
-                rego_file.output_path.display()
+                "failed to resolve output path parent '{}': {e}",
+                parent.display()
             ))
         })?;
-        if !canonical_file.starts_with(&canonical_output_dir) {
+        if !canonical_parent.starts_with(&canonical_output_dir) {
             return Err(ZiftError::General(format!(
                 "output path '{}' escapes output directory '{}'",
                 rego_file.output_path.display(),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/extract.rs` around lines 70-93: the code canonicalizes
rego_file.output_path which fails for new files; instead canonicalize the
output_dir and the resolved parent directory (rego_file.output_path.parent()) to
verify containment, then write the file; update the block that computes
canonical_file to canonicalize parent (e.g., use
rego_file.output_path.parent().ok_or(...).canonicalize()), perform the
starts_with check against canonical_output_dir, and only after that call
std::fs::write(&rego_file.output_path, &rego_file.content)?.
src/rules/mod.rs (1)

88-89: ⚠️ Potential issue | 🟠 Major

Reject empty languages arrays during rule parsing.

A rule with languages = [] still loads successfully here. rules validate then skips query compilation for it, and scans silently never apply it, so malformed external rules pass unnoticed. Fail fast in parse_rule when the list is empty.

🔧 Proposed fix
     let r = file.rule;
+
+    if r.languages.is_empty() {
+        return Err(ZiftError::RuleParse {
+            rule_id: r.id.clone(),
+            message: "rule must declare at least one language".into(),
+        });
+    }
 
     let mut predicates = Vec::new();

Also applies to: 134-152

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/mod.rs` around lines 88-89: the parser currently accepts rules
with an empty languages array (e.g., when reading file.rule), which causes those
rules to be ignored later; in parse_rule you should validate that the rule's
languages list is non-empty and return a parse error (or Err) if it's empty so
loading fails fast; locate the code paths that construct/return the parsed Rule
(references: parse_rule, file.rule and the block around the similar check at
lines ~134-152) and add a check that rejects languages == [] with a clear error
message.
src/scanner/matcher.rs (1)

101-101: ⚠️ Potential issue | 🟠 Major

Line-based finding IDs still collide for repeated matches on the same line.

Hashing line_start/line_end fixes cross-line duplicates, but it still collapses identical same-line matches like hasRole("admin") || hasRole("admin") into one finding. Use byte offsets or columns from matched instead.

🔧 Proposed fix
-        let id = compute_finding_id(&compiled.rule.id, file_path, line_start, line_end, &code_snippet);
+        let id = compute_finding_id(
+            &compiled.rule.id,
+            file_path,
+            matched.start_byte(),
+            matched.end_byte(),
+            &code_snippet,
+        );
@@
 fn compute_finding_id(
     rule_id: &str,
     file_path: &Path,
-    line_start: usize,
-    line_end: usize,
+    start_byte: usize,
+    end_byte: usize,
     snippet: &str,
 ) -> String {
@@
-    hasher.update(line_start.to_le_bytes());
-    hasher.update(line_end.to_le_bytes());
+    hasher.update(start_byte.to_le_bytes());
+    hasher.update(end_byte.to_le_bytes());

Also applies to: 157-170

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scanner/matcher.rs` at line 101, The current compute_finding_id call uses
line_start/line_end which still collapses distinct same-line matches; change to
use precise byte or column offsets from the match object (e.g.,
matched.start()/matched.end() or matched.byte_start/byte_end) so identical text
on the same line produces distinct IDs. Update the call site(s) that invoke
compute_finding_id (the call at compute_finding_id(&compiled.rule.id, file_path,
line_start, line_end, &code_snippet) and the similar block around lines 157-170)
to pass the match-derived offsets instead of line numbers, and adjust
compute_finding_id’s parameter names/signature accordingly to accept and hash
the byte/column start and end values. Ensure both occurrences are updated so IDs
include the match-level offsets.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/commands/rules.rs`:
- Around line 41-42: The call to ts_parser::get_language(*lang, is_tsx_jsx) must
not use the ? operator because an unsupported language for one external rule
should not abort the whole run; instead, match on its Result inside the loops in
the RulesAction::Validate and RulesAction::Test blocks, treat Err(_) as a
per-rule failure by printing a FAIL for that rule/language and incrementing the
failed counter, then continue to the next variant/rule; keep all other logic the
same so only that error is handled locally rather than returned.

---

Duplicate comments:
In `@src/commands/extract.rs`:
- Around line 70-93: The code canonicalizes rego_file.output_path which fails
for new files; instead canonicalize the output_dir and the resolved parent
directory (rego_file.output_path.parent()) to verify containment, then write the
file; update the block that computes canonical_file to canonicalize parent
(e.g., use rego_file.output_path.parent().ok_or(...).canonicalize()), perform
the starts_with check against canonical_output_dir, and only after that call
std::fs::write(&rego_file.output_path, &rego_file.content)?.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 59bd5f4f-127b-497c-afb3-3042098575d7

📥 Commits

Reviewing files that changed from the base of the PR and between 230d6a2 and 08bbf5d.

📒 Files selected for processing (10)
  • src/commands/extract.rs
  • src/commands/rules.rs
  • src/commands/scan.rs
  • src/rego/grouping.rs
  • src/rego/templates.rs
  • src/rules/embedded.rs
  • src/rules/mod.rs
  • src/scanner/matcher.rs
  • src/scanner/mod.rs
  • src/scanner/parser.rs
✅ Files skipped from review due to trivial changes (1)
  • src/rules/embedded.rs

Comment thread src/commands/rules.rs

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (4)
src/scanner/matcher.rs (1)

201-292: Good test coverage for core matching behavior.

The tests cover positive matches, false positive rejection, and deduplication. Consider adding tests for filter_findings (confidence/category filtering) and check_predicates with NotMatch/NotEq variants to increase coverage of edge cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scanner/matcher.rs` around lines 201 - 292, Tests currently miss coverage
for filter_findings and predicate negations; add unit tests to exercise
confidence/category filtering and check_predicates behavior for NotMatch and
NotEq. Create new tests in the tests module that call parse_and_match or
construct Findings directly and then invoke filter_findings with various
confidence levels and categories to assert inclusion/exclusion, and call
check_predicates (or trigger rules compiled by compile_rule that use
NotMatch/NotEq predicates) to assert they correctly invert matches; reference
functions dedup_findings, filter_findings, and check_predicates in your
assertions so the tests fail if logic changes.
src/scanner/mod.rs (1)

148-154: Silent failure on invalid min_confidence config value.

If config.scan.min_confidence contains an invalid string (e.g., "invalid"), .parse().ok() silently returns None, effectively disabling the confidence filter without any warning. Consider logging a warning when the parse fails so users know their config is ignored.

♻️ Suggested improvement
     let min_confidence = args.confidence.or_else(|| {
         config
             .scan
             .min_confidence
             .as_deref()
-            .and_then(|s| s.parse().ok())
+            .and_then(|s| {
+                s.parse().map_err(|_| {
+                    tracing::warn!("invalid min_confidence '{}' in config, ignoring", s);
+                }).ok()
+            })
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scanner/mod.rs` around lines 148 - 154, The current min_confidence
computation silently ignores parse errors because it uses .parse().ok(); update
the logic around min_confidence (where args.confidence is combined with
config.scan.min_confidence) to attempt parsing with .parse() and, on Err, emit a
warning log that includes the invalid string and that the config value was
ignored, then fall back to None; specifically locate the block assigning
min_confidence and replace the closure to match on
config.scan.min_confidence.as_deref().map(|s| s.parse::<f32>()) (or equivalent),
log on Err (using the tracing macros used elsewhere), and return
None on parse failure.
src/commands/extract.rs (2)

19-32: Consider documenting the default-value coupling with cli.rs.

The config fallback logic compares against hardcoded strings ("app", "./policies/generated") that must match the default_value attributes in ExtractArgs. If those defaults drift, config overrides silently stop working.

A brief comment noting this dependency would help future maintainers, or extract the defaults into shared constants.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/extract.rs` around lines 19 - 32, The fallback logic for
package_prefix and output_dir in extract.rs currently compares against hardcoded
literals ("app" and "./policies/generated") tied to the ExtractArgs defaults
(args.package_prefix and args.output_dir), which can break silently if those
defaults change; update this by either (a) adding a short comment above the
package_prefix and output_dir blocks referencing the dependency on ExtractArgs
defaults (args.package_prefix, args.output_dir) so maintainers know to change
both places, or (b) move the default values into shared constants (e.g.,
DEFAULT_PACKAGE_PREFIX, DEFAULT_OUTPUT_DIR) in the CLI/ExtractArgs module and
use those constants here when calling .filter(...) so the default is
single-sourced and avoids drift.

76-81: Hoist output_dir.canonicalize() out of the loop.

canonical_output_dir is recomputed on every iteration but output_dir doesn't change. Move this before the loop to avoid redundant filesystem calls.

♻️ Proposed refactor
     // Group and generate files
     let rego_files = rego::group_findings(&findings, package_prefix, &output_dir);

+    // Canonicalize output_dir once for path containment checks
+    let canonical_output_dir = output_dir.canonicalize().map_err(|e| {
+        ZiftError::General(format!(
+            "failed to resolve output dir '{}': {e}",
+            output_dir.display()
+        ))
+    })?;
+
     // Write files and validate
     let mut total_files = 0;
     let mut validation_warnings = 0;
     for rego_file in &rego_files {
         // ... existing code ...
         if let Some(parent) = rego_file.output_path.parent() {
             std::fs::create_dir_all(parent)?;
         }
-        // Verify the resolved output path stays within the output directory
-        // by canonicalizing the parent (which now exists) and checking containment
-        // BEFORE writing, to prevent TOCTOU issues with symlinked directories.
-        let canonical_output_dir = output_dir.canonicalize().map_err(|e| {
-            ZiftError::General(format!(
-                "failed to resolve output dir '{}': {e}",
-                output_dir.display()
-            ))
-        })?;
+        // Verify the resolved output path stays within the output directory
         let parent = rego_file.output_path.parent().ok_or_else(|| {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/extract.rs` around lines 76 - 81, Move the call to
output_dir.canonicalize() out of the repeated loop: compute canonical_output_dir
once before entering the loop by calling output_dir.canonicalize().map_err(...)
(preserving the existing ZiftError::General message), then remove the same
canonicalize call from inside the loop and use the precomputed
canonical_output_dir variable where needed; this avoids redundant filesystem
calls and preserves the existing error handling and variable name
canonical_output_dir in src/commands/extract.rs.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: de492fef-16f4-4f0a-923f-537d316df2d0

📥 Commits

Reviewing files that changed from the base of the PR and between 08bbf5d and 23c9bdf.

📒 Files selected for processing (5)
  • src/commands/extract.rs
  • src/commands/rules.rs
  • src/rego/validator.rs
  • src/scanner/matcher.rs
  • src/scanner/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/rego/validator.rs
  • src/commands/rules.rs
