feat: implement zift CLI scanner and Rego policy generation #1
Adapted from ea repo for the zift project — includes commit, pr, review, handoff, and address-pr-feedback commands with Rust/cargo tooling references instead of bun/TypeScript. Skipped ea-specific commands (deploy, jira, validate-integration).
Add the zift CLI tool for scanning codebases to detect embedded authorization logic. Uses tree-sitter for AST-based pattern matching against TypeScript and JavaScript source files.
- CLI with clap derive API: scan, extract (stub), report, rules, init commands
- 12 built-in detection rules for RBAC, ABAC, middleware, ownership, and feature gate patterns
- Rule engine with TOML-defined tree-sitter queries and regex predicates
- File discovery using the ignore crate (respects .gitignore)
- SHA-256 content hashing for finding deduplication
- Text and JSON output formatters with SARIF placeholder
- Per-language query compilation cache for performance
- Configuration via .zift.toml with sensible defaults
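The hash-based deduplication step can be sketched roughly as follows. This is a stdlib-only illustration: the `Finding` shape is hypothetical, and `DefaultHasher` stands in for the SHA-256 digest the tool actually uses, purely to keep the sketch dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hypothetical finding shape; the real struct lives in the scanner.
struct Finding {
    file: String,
    snippet: String,
}

// Keep only the first finding per (file, snippet) content digest.
// The real tool hashes content with SHA-256; DefaultHasher keeps this
// sketch dependency-free while showing the same dedup-by-digest idea.
fn dedup(findings: Vec<Finding>) -> Vec<Finding> {
    let mut seen = HashSet::new();
    findings
        .into_iter()
        .filter(|f| {
            let mut h = DefaultHasher::new();
            f.file.hash(&mut h);
            f.snippet.hash(&mut h);
            seen.insert(h.finish()) // false (filtered out) if already seen
        })
        .collect()
}
```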
Implement the extract command to generate OPA Rego policies from scanner output. Findings are grouped by source file into per-file .rego packages with proper import statements and package naming.
- Rego template engine with {{variable}} interpolation from tree-sitter captures
- Category-specific default stubs (RBAC, ABAC, ownership, feature gate, etc.)
- Confidence-based wrapping: high passes through, medium adds a TODO, low comments out
- Per-rule rego_template sections in 7 detection rules for precise policy generation
- Extract command reads findings from file or stdin (JSON), filters by confidence, and writes .rego files grouped by source
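The {{variable}} interpolation above can be sketched in a few lines. A minimal stdlib-only sketch: the function name `render_template` is borrowed from the review discussion below, and the real engine additionally strips quotes from captured values before substituting them.

```rust
use std::collections::HashMap;

// Replace each {{name}} placeholder with the matching tree-sitter capture.
// Minimal sketch; the real engine also calls strip_quotes() on captures.
fn render_template(template: &str, captures: &HashMap<String, String>) -> String {
    let mut out = template.to_string();
    for (name, value) in captures {
        // format! doubles braces, so this builds the literal "{{name}}"
        out = out.replace(&format!("{{{{{name}}}}}"), value);
    }
    out
}
```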
Integrate Microsoft's regorus (Rust OPA engine) to validate generated Rego policies at two points: the extract command now checks each .rego file before writing and shows [OK]/[WARN] status, and rules validate now checks rego_template sections in addition to tree-sitter queries.
Functions imported from policy-engine modules (paths containing authz, opa, policy, rego, enforce) are now recognized as enforcement points rather than flagged as inline authorization logic. The scan summary shows enforcement point count and externalization percentage. Reduces false positives in codebases already using OPA — e.g., EA repo drops from 87 findings to 17, with 84% externalized.
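The import-path classification described above can be sketched as below. The marker list comes from the description; the function names and the exact formula for the externalization percentage are assumptions for illustration.

```rust
// Substrings that mark an import as coming from a policy-engine module
// (list taken from the PR description above).
const POLICY_MARKERS: [&str; 5] = ["authz", "opa", "policy", "rego", "enforce"];

// Calls into modules matching these paths count as enforcement points
// rather than inline authorization logic.
fn is_enforcement_import(import_path: &str) -> bool {
    let lower = import_path.to_lowercase();
    POLICY_MARKERS.iter().any(|m| lower.contains(m))
}

// One plausible definition of the summary's externalization percentage:
// enforcement points over all detected authorization sites.
fn externalized_pct(enforcement_points: usize, inline_findings: usize) -> f64 {
    let total = enforcement_points + inline_findings;
    if total == 0 {
        return 0.0;
    }
    enforcement_points as f64 * 100.0 / total as f64
}
```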
📝 Walkthrough
Adds a Rust CLI tool "zift" implementing Tree-sitter based code scanning (rules, enforcement-point detection), Rego stub/template generation and validation, many TypeScript/JavaScript pattern rules, config/error/logging infrastructure, outputs (text/json), CLI commands, and several .claude workflow docs.
Sequence Diagram(s)
sequenceDiagram
participant User as User
participant CLI as CLI (arg parsing)
participant Config as ZiftConfig
participant Rules as RulesLoader
participant Scanner as Scanner
participant Parser as Parser (tree-sitter)
participant Matcher as Matcher (queries)
participant Rego as Rego (templates/validator)
participant Output as Output (text/json)
User->>CLI: invoke zift scan --path=...
CLI->>Config: load_config(path)
Config-->>CLI: ZiftConfig
CLI->>Rules: load_rules(extra_dir?, config)
Rules-->>CLI: Vec<PatternRule>
CLI->>Scanner: scan(root, rules, args, config)
Scanner->>Scanner: discover_files(root, exclude)
Scanner->>Parser: parse_source(file, lang, tsx?)
Parser-->>Scanner: Tree
Scanner->>Matcher: compile_rule(rule, ts_lang)
Matcher-->>Scanner: CompiledRule
Scanner->>Matcher: execute_query(compiled, tree, source)
Matcher-->>Scanner: Vec<Finding>
Scanner->>Scanner: detect policy imports & enforcement points
Scanner-->>Scanner: ScanResult { findings, enforcement_points }
CLI->>Rego: group_findings(findings, prefix, outdir)
Rego-->>CLI: Vec<RegoFile>
CLI->>Rego: validate_template/render_template(...)
Rego-->>CLI: ValidationResult
CLI->>Output: print(findings, format)
Output-->>User: Report (text/json)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Security and Crash Risk Review
I've identified several critical security vulnerabilities and crash risks that must be addressed before merging:
🛑 Critical Security Issues (3)
- Path Traversal in File Writing - User-controlled file paths can write outside intended directories, potentially overwriting system files
- Path Traversal in Rego Path Generation - Malicious finding paths can bypass output directory restrictions
- Symlink/Directory Traversal - Recursive rule loading lacks symlink protection and depth limits, enabling DoS attacks
⚠️ Crash Risks (3)
- Array Out of Bounds - Unchecked array indexing in matcher that will panic on invalid capture indices
- UTF-8 String Slicing - Byte-based slicing on multi-byte characters causes panics
- String Slicing Underflow - Single-character quote strings cause panic in strip_quotes
🔍 Logic Errors (1)
- Silent Path Canonicalization Failure - unwrap_or silently falls back without proper error handling
All issues have actionable fix suggestions. The security vulnerabilities are particularly concerning for a tool that processes untrusted input and writes files to the filesystem. Please address these before merging.
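One common mitigation for the traversal findings above is to reject any candidate output path that is absolute or contains parent-directory components before writing. A minimal stdlib sketch (not the reviewer's exact suggested fix; canonicalizing the joined path and checking it is still under the output root is a good second layer):

```rust
use std::path::{Component, Path};

// Accept only relative paths made of normal components, so `..` segments
// and absolute paths cannot escape the intended output directory.
fn is_safe_relative(path: &Path) -> bool {
    !path.is_absolute()
        && path
            .components()
            .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}
```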
Actionable comments posted: 16
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (5)
rules/typescript/feature-gate-check.toml-20-27 (1)
20-27: ⚠️ Potential issue | 🟡 Minor: Add negative test case to validate the rule does not produce false positives.
The rule currently has only one positive test case. Add a negative test to verify that non-matching patterns are correctly rejected (e.g., if (user.name === "admin") should not match, or checking a different property like user.tier with a non-matching predicate value). Note: the concern about quote handling in template interpolation is not an issue; render_template explicitly calls strip_quotes() on captured values before interpolation, so "enterprise" becomes enterprise in the final Rego output.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rules/typescript/feature-gate-check.toml` around lines 20 - 27, add a negative test that asserts the rule.rego_template does not allow non-matching inputs: create a test case that invokes the template (using the template rendering path that substitutes "{{plan_value}}") with an input where input.user.plan is a different value (e.g., "starter", or use a different property like input.user.tier) and assert allow == false; target the rule.rego_template and the resulting allow decision (default allow := false, allow if { input.user.plan in {"{{plan_value}}"} }) so the test ensures a non-matching plan_value does not produce a true decision.
src/commands/scan.rs-9-9 (1)
9-9: ⚠️ Potential issue | 🟡 Minor: Handle non-NotFound canonicalization failures explicitly.
At line 9, all canonicalize() errors are swallowed. This can hide permission or filesystem errors and degrade diagnostics.
🛠️ Narrow fallback to only missing-path cases
- let path = args.path.canonicalize().unwrap_or(args.path.clone());
+ let path = match args.path.canonicalize() {
+     Ok(p) => p,
+     Err(e) if e.kind() == std::io::ErrorKind::NotFound => args.path.clone(),
+     Err(e) => return Err(e.into()),
+ };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/commands/scan.rs` at line 9, the current assignment let path = args.path.canonicalize().unwrap_or(args.path.clone()) swallows all canonicalize errors; change it to explicitly match on args.path.canonicalize() and only fall back to args.path.clone() when the error.kind() == std::io::ErrorKind::NotFound, otherwise propagate or return the error (e.g., return Err(e) or use ?), so permission and other filesystem errors are not hidden; update the code around the path variable where args.path.canonicalize() is called to implement this conditional handling.
.claude/commands/address-pr-feedback.md-94-94 (1)
94-94: ⚠️ Potential issue | 🟡 Minor: Use clearer wording than “Needs input.”
“Needs user input” reads more naturally in this checklist context.
✏️ Suggested edit
-- **Needs input**: Any ambiguous items requiring user decision
+- **Needs user input**: Any ambiguous items requiring user decision
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/commands/address-pr-feedback.md at line 94, replace the checklist phrase "Needs input" with "Needs user input" to improve clarity; locate the exact string "Needs input" in .claude/commands/address-pr-feedback.md (the checklist item) and update it to "Needs user input" wherever it appears in that checklist section so the wording reads naturally for users.
src/scanner/discovery.rs-39-49 (1)
39-49: ⚠️ Potential issue | 🟡 Minor: Silent failures when adding invalid exclude patterns.
Invalid override patterns and build failures are silently ignored with let _ = and the if let Ok() pattern. Users won't know their exclusion patterns are malformed. Consider logging a warning when patterns fail to parse:
🛡️ Suggested fix to log pattern errors
  if !exclude_patterns.is_empty() {
      let mut overrides = OverrideBuilder::new(root);
      for pattern in exclude_patterns {
          // Negate the pattern so it becomes an exclusion
-         let _ = overrides.add(&format!("!{pattern}"));
+         if let Err(e) = overrides.add(&format!("!{pattern}")) {
+             tracing::warn!("invalid exclude pattern '{}': {}", pattern, e);
+         }
      }
-     if let Ok(ov) = overrides.build() {
-         builder.overrides(ov);
+     match overrides.build() {
+         Ok(ov) => { builder.overrides(ov); }
+         Err(e) => { tracing::warn!("failed to build overrides: {}", e); }
      }
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scanner/discovery.rs` around lines 39 - 49, the current loop silently ignores failures when adding exclude patterns and when building overrides: detect errors from OverrideBuilder::add while iterating over exclude_patterns and log a warning including the invalid pattern and error; likewise, when overrides.build() returns Err, log a warning with the build error and the root or patterns and avoid silently proceeding; use the same logging facility used elsewhere (e.g., log::warn! or the crate's logger) and still only call builder.overrides(ov) on Ok(ov).
src/output/text.rs-13-16 (1)
13-16: ⚠️ Potential issue | 🟡 Minor: Still report enforcement coverage when there are no findings.
This early return hides enforcement_points, so a repo that is fully externalized just prints “No authorization patterns found.” without the coverage summary. That makes text output disagree with the JSON summary for the same scan.
Suggested fix
  if findings.is_empty() {
      writeln!(writer, "No authorization patterns found.")?;
+     if enforcement_points > 0 {
+         writeln!(
+             writer,
+             "{enforcement_points} enforcement points detected (already using OPA/policy engine)."
+         )?;
+     }
      return Ok(());
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/output/text.rs` around lines 13 - 16, The function currently returns early when findings.is_empty(), which prevents printing enforcement coverage info (enforcement_points) and causes text output to omit the coverage summary; change the logic so that if findings.is_empty() you still write "No authorization patterns found." to writer but do NOT return—continue to call the same enforcement coverage generation/printing code that uses enforcement_points (the same codepath used for non-empty findings) so the coverage summary is always emitted; update the block around findings.is_empty() in src/output/text.rs (the writer/writeln and the surrounding control flow) to remove the early return and ensure enforcement_points printing runs afterward.
🧹 Nitpick comments (11)
rules/typescript/feature-gate-check.toml (1)
29-35: Add a negative test case to validate predicate filtering.
Only the positive case is tested. Consider adding a test where the property name doesn't match the predicate (e.g., if (user.name === "admin")) to ensure non-matching properties are correctly excluded.
📝 Example negative test case
  expect_match = true
+
+[[rule.tests]]
+input = """
+if (user.name === "admin") {
+  grantAccess();
+}
+"""
+expect_match = false
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rules/typescript/feature-gate-check.toml` around lines 29 - 35, add a negative test case to the existing [[rule.tests]] entries to verify predicate filtering: create a new [[rule.tests]] block with an input that uses a non-matching property name (e.g., `if (user.name === "admin")`) and set expect_match = false so the rule does not trigger; reference the existing rule.tests/input/expect_match pattern to mirror the positive case and ensure non-matching properties are excluded by the predicate filter.
src/commands/init.rs (1)
25-32: Minor TOCTOU race between exists check and write.
The path.exists() check followed by fs::write() creates a small race window where another process could create the file between the check and write. For a CLI init command this is low risk, but using OpenOptions::new().create_new(true) would make this atomic.
♻️ Atomic create-if-not-exists alternative
- if path.exists() {
-     return Err(ZiftError::General(format!(
-         "config file already exists: {}",
-         path.display()
-     )));
- }
-
- std::fs::write(&path, DEFAULT_CONFIG)?;
+ use std::fs::OpenOptions;
+ use std::io::Write;
+
+ let mut file = OpenOptions::new()
+     .write(true)
+     .create_new(true)
+     .open(&path)
+     .map_err(|e| {
+         if e.kind() == std::io::ErrorKind::AlreadyExists {
+             ZiftError::General(format!("config file already exists: {}", path.display()))
+         } else {
+             e.into()
+         }
+     })?;
+ file.write_all(DEFAULT_CONFIG.as_bytes())?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/commands/init.rs` around lines 25 - 32, the exists-then-write sequence (path.exists() followed by std::fs::write(&path, DEFAULT_CONFIG)) has a TOCTOU race; replace it with an atomic create-if-not-exists using OpenOptions::new().write(true).create_new(true).open(&path) and then write DEFAULT_CONFIG into that file handle, propagating errors as before (return the same ZiftError::General on already-existing or map the io error), so the init logic in this function becomes atomic and race-free.
rules/typescript/express-auth-middleware.toml (1)
7-14: Consider extending the query to match invoked middleware patterns.
The current query only matches identifiers passed directly to use() (e.g., app.use(requireAuth)), but misses invoked middleware like app.use(requireAuth()) or app.use(passport.authenticate("jwt")). These are common patterns in Express apps.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rules/typescript/express-auth-middleware.toml` around lines 7 - 14, the query only captures bare identifier arguments to use() and misses invoked middleware forms; update the Tree-sitter query (the query variable that uses captures `@method`, `@middleware_name`, `@match`) to also match arguments that are call_expression nodes and member_expression call_expression nodes (e.g., app.use(requireAuth()) and app.use(passport.authenticate("jwt"))). Specifically add alternative patterns for the arguments: a direct call_expression (with its callee captured) and a member_expression whose parent is a call_expression, and capture those callee identifiers/member_expressions as `@middleware_name`/`@middleware_call` so invoked middleware is recognized by the same rule.
rules/typescript/session-auth-check.toml (1)
8-15: Constrain the receiver to reduce false positives.
At lines 8-15, matching by method name alone can flag unrelated APIs that also expose isAuthenticated/isAuthorized. Adding receiver captures/predicates (e.g., req, request, ctx, session) will improve precision.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rules/typescript/session-auth-check.toml` around lines 8 - 15, the rule currently matches any call_expression with a member_expression by method name (captured as `@method`) which yields false positives; modify the tree-sitter pattern to also capture the member_expression's object as a receiver (e.g., add a capture like `@receiver` on the object node within member_expression) and add a new predicate (e.g., rule.predicates.receiver) that matches common receiver identifiers such as ^(req|request|ctx|session|ctx.request)$ so the rule only triggers when the method (isAuthenticated|isLoggedIn|isAuthorized) is invoked on one of those receivers; keep the existing method predicate (rule.predicates.method) and ensure both predicates must match.
rules/typescript/express-route-middleware.toml (1)
8-16: Rule currently misses common middleware argument shapes.
At line 13, the query only matches middleware when it is a plain identifier in the second argument slot. This misses patterns like auth.requireAuth, requireAuth(), and other common middleware forms, reducing match coverage.
♻️ Suggested query broadening
  (call_expression
    function: (member_expression
      property: (property_identifier) @method)
    arguments: (arguments
      (string)
-     (identifier) @middleware_name
+     [
+       (identifier) @middleware_name
+       (member_expression
+         property: (property_identifier) @middleware_name)
+       (call_expression
+         function: (identifier) @middleware_name)
+     ]
      .
      (_))
  ) @match
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rules/typescript/express-route-middleware.toml` around lines 8 - 16, the current call_expression query only binds middleware when the second argument is a plain (identifier) `@middleware_name`; update the pattern so `@middleware_name` also captures member expressions and invocation forms (e.g., auth.requireAuth, requireAuth(), or auth.requireAuth()) by allowing the second argument to be (identifier | member_expression | call_expression) and optionally a call_expression whose function is a member_expression, keeping the existing captures like (member_expression property:(property_identifier) `@method`) and the outer (call_expression function: ... ) `@match` so the rule will match both direct identifiers, dotted properties, and common invoked middleware factories.
src/commands/extract.rs (1)
17-32: Fragile coupling to hardcoded CLI default values.
The config fallback logic compares against hardcoded strings ("app" and "./policies/generated") that must stay in sync with the #[arg(default_value = ...)] attributes in ExtractArgs. If someone changes the CLI defaults without updating these checks, the config override silently stops working.
Consider defining these defaults as constants shared between cli.rs and here, or using a different pattern (e.g., Option without defaults in CLI, defaulting in the command).
♻️ Example: extract shared constants
// In cli.rs or a shared constants module:
pub const DEFAULT_PACKAGE_PREFIX: &str = "app";
pub const DEFAULT_OUTPUT_DIR: &str = "./policies/generated";
// Then use in both places
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/commands/extract.rs` around lines 17 - 32, the fallback logic in execute() for ExtractArgs currently compares CLI values to hardcoded defaults ("app" and "./policies/generated"), which is fragile; replace these literal checks by referencing shared constants or by removing CLI defaults and treating those fields as Option so config wins when present. Specifically, introduce shared DEFAULT_PACKAGE_PREFIX and DEFAULT_OUTPUT_DIR constants (used by cli.rs and here) or change ExtractArgs.package_prefix/output_dir to Option and adjust execute() to use config.extract.package_prefix.unwrap_or(args.package_prefix) and config.extract.output_dir.map(PathBuf::from).unwrap_or_else(|| args.output_dir.clone()) so the comparisons to hardcoded strings are eliminated and the fallback behavior is deterministic.
rules/typescript/jwt-token-check.toml (1)
6-6: Description and predicate scope mismatch.
The description says "JWT token verification or authorization header access" but the method predicate includes sign, which is for creating tokens, not verifying them. Consider either:
- Updating the description to include "signing/creation"
- Removing sign from the predicate if only verification is intended
♻️ Option 1: Update description
-description = "JWT token verification or authorization header access"
+description = "JWT token verification, decoding, or signing"
Also applies to: 20-21
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rules/typescript/jwt-token-check.toml` at line 6, the rule's description string ("JWT token verification or authorization header access") is inconsistent with the predicate which includes the "sign" method (used for token creation); either update the description to mention signing/creation (e.g., add "signing/creation" or "signing") or remove "sign" from the method predicate so the rule only targets verification/authorization checks; locate the rule with the description line and the predicate that lists "sign" and apply one of these two fixes consistently (also update the similar occurrence referenced around the other predicate block).
src/output/json.rs (1)
38-46: Inconsistent casing between category and confidence aggregation.
Category keys are lowercased (.to_lowercase()) but confidence keys are not. This creates inconsistent JSON output where categories might be "rbac" while confidences are "High". Consider making both consistent.
♻️ Suggested fix for consistency
  *by_confidence
-     .entry(f.confidence.to_string())
+     .entry(f.confidence.to_string().to_lowercase())
      .or_default() += 1;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/output/json.rs` around lines 38 - 46, the aggregation uses lowercased keys for categories but not for confidences, causing inconsistent JSON keys; in the loop over findings where by_category and by_confidence are updated (variables f, f.category, f.confidence, entries via .entry(...).or_default()), normalize the confidence key the same way as category (e.g., call .to_lowercase() on f.confidence when building the entry key) so both maps produce consistently-cased keys in the output.
src/cli.rs (1)
88-104: Deep scan options lack --deep requirement enforcement at parse time.
The --provider, --model, --max-cost, and --api-key options are documented as requiring --deep, but clap doesn't enforce this constraint. Users can specify --provider anthropic without --deep, which would be silently ignored or cause confusion.
Consider using clap's requires attribute to enforce the dependency:
♻️ Enforce --deep requirement
  // -- Deep scan options --
  /// LLM provider (requires --deep)
- #[arg(long)]
+ #[arg(long, requires = "deep")]
  pub provider: Option<LlmProvider>,

  /// Model to use (requires --deep)
- #[arg(long)]
+ #[arg(long, requires = "deep")]
  pub model: Option<String>,

  /// Maximum spend limit for LLM calls (requires --deep)
- #[arg(long)]
+ #[arg(long, requires = "deep")]
  pub max_cost: Option<f64>,

  /// API key (or set ZIFT_API_KEY / provider-specific env vars)
- #[arg(long, env = "ZIFT_API_KEY")]
+ #[arg(long, env = "ZIFT_API_KEY", requires = "deep")]
  pub api_key: Option<String>,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cli.rs` around lines 88 - 104, the CLI deep-scan flags (fields provider, model, max_cost, api_key in the CLI struct) are documented as requiring --deep but clap won't enforce that; update each field's #[arg(...)] attribute to include requires = "deep" (e.g., add requires = "deep" to the attributes for provider, model, max_cost and api_key) so clap validates that those options can only be used when --deep is present while preserving existing attributes like long and env.
src/rules/mod.rs (1)
90-113: Non-deterministic predicate evaluation order.
The r.predicates HashMap iteration order is non-deterministic. If multiple predicates are defined for a rule, they will be pushed into the Vec<(String, Predicate)> in an arbitrary order that may vary between runs. While the current check_predicates logic uses short-circuit AND semantics (all must pass), the order could affect debugging reproducibility and any future order-dependent logic.
♻️ Consider sorting predicates for deterministic order
  let mut predicates = Vec::new();
- for (capture_name, pred) in r.predicates {
+ let mut pred_entries: Vec<_> = r.predicates.into_iter().collect();
+ pred_entries.sort_by(|a, b| a.0.cmp(&b.0));
+ for (capture_name, pred) in pred_entries {
      let p = if let Some(re) = pred.match_re {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rules/mod.rs` around lines 90 - 113, iteration over r.predicates (a HashMap) yields non-deterministic predicate order; ensure deterministic ordering by sorting predicates by their capture name before constructing the Vec<(String, Predicate)>. Modify the loop building predicates (the block that creates Predicate::Match/Eq/NotMatch/NotEq and pushes into predicates) to first collect r.predicates.into_iter() (or keys) into a temporary Vec, sort that Vec by capture_name (or key) and then iterate the sorted collection to build and push (capture_name, p); keep the same error handling and use the same Predicate enum and check_predicates logic so behavior is identical except for deterministic ordering.
src/scanner/matcher.rs (1)
65-69: Silent fallback for invalid UTF-8 capture text.
Using unwrap_or_default() silently converts invalid UTF-8 to an empty string. This could cause predicates to fail unexpectedly or produce empty capture values without any diagnostic. Consider logging a warning or returning an error for malformed source.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scanner/matcher.rs` around lines 65 - 69, The code silently drops invalid UTF-8 by calling capture.node.utf8_text(source).unwrap_or_default(); change this to explicitly handle the Result from capture.node.utf8_text(source) (e.g. match or if let Err(e) = ...) and either return an error from the surrounding function or emit a warning with context (include capture identifier like capture.index or capture.name and the error e) instead of producing an empty string; replace the unwrap_or_default call and ensure callers/signature of the function (in matcher.rs) propagate or handle the error accordingly.
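The explicit handling this comment asks for follows the usual Result pattern. A stdlib sketch, with std::str::from_utf8 standing in for tree-sitter's utf8_text and a hypothetical capture_name parameter for diagnostic context:

```rust
// Surface invalid UTF-8 instead of silently substituting an empty string.
// `capture_name` is illustrative; the real code reads tree-sitter captures.
fn capture_text<'a>(bytes: &'a [u8], capture_name: &str) -> Result<&'a str, String> {
    std::str::from_utf8(bytes)
        .map_err(|e| format!("capture '{capture_name}' is not valid UTF-8: {e}"))
}
```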
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rules/typescript/has-role-call.toml`:
- Around line 15-24: The current predicate regex (matching
hasRole|checkRole|isRole|requireRole|hasPermission|checkPermission|isAuthorized|authorize)
incorrectly maps permission/authorization functions to a role-based Rego
template that uses input.user.role and {{role_value}}; update the config so
role-only functions use the existing role template and permission/authorization
functions use a different template that checks input.user.permissions (or
similar). Concretely, split the predicate matching into two distinct sets (e.g.,
keep hasRole|checkRole|isRole|requireRole for the role template, and create a
new predicate group for hasPermission|checkPermission|isAuthorized|authorize)
and add a corresponding [rule.rego_template] for the permission group that tests
membership against input.user.permissions (or a configurable permission field)
instead of input.user.role.
In `@rules/typescript/nestjs-roles-decorator.toml`:
- Around line 8-17: The rule erroneously matches both Roles and SetMetadata in
the decorator_name predicate, causing `@SetMetadata`("roles","admin") to be parsed
like `@Roles` and extract the wrong string; update the predicate for
decorator_name used by the call_expression pattern so it only matches "Roles"
(remove "SetMetadata") so the call_expression pattern that captures the string
argument only applies to `@Roles` and will no longer extract the first argument of
`@SetMetadata` as a role.
In `@rules/typescript/ownership-check.toml`:
- Line 11: The operator list in the config allows both equality and inequality
(operator key) but the Rego template that emits checks always produces a
positive equality against hardcoded fields (resource.owner and user.id), causing
mismatches; update the template that generates the Rego rule so it uses the
actual matched operands and the actual operator from the match (emit "==" or
"===" for equality and emit "!=" or "!==" as a negated check for inequality), do
not hardcode resource.owner/user.id—use the captured LHS/RHS identifiers from
the parsed expression, and ensure the generated allow/deny rule preserves
default-deny semantics and mirrors the original operator polarity.
In `@rules/typescript/role-check-conditional.toml`:
- Around line 10-13: The binary_expression pattern in
role-check-conditional.toml currently matches any operator and thus treats
inequality as equality; update the pattern to constrain the operator to equality
only (match only '==' and '===' operators) by adding an operator node check
alongside the existing (member_expression property: (property_identifier) `@prop`)
and (string) `@role_value` bindings, and add a unit test that includes a
conditional using '!==', e.g. `if (user.role !== "admin")`, to assert the
pattern does NOT match (preventing the template from emitting a positive allow
for inequality cases).
In `@rules/typescript/role-includes-check.toml`:
- Around line 24-30: The generated Rego template incorrectly checks a scalar
field `input.user.role` against a set using `in`, which loses array-membership
semantics from patterns like `roles.includes("...")`; update the
`rule.rego_template` so the policy tests membership against the
`input.user.roles` array (or uses an array-aware check) instead of
`input.user.role`, and substitute `{{role_value}}` into an array-membership
expression (e.g., use `some i; input.user.roles[i] == "{{role_value}}"` or the
Rego `role in input.user.roles` pattern) so templates produced by `template`
correctly represent `roles.includes(...)` semantics.
In `@src/commands/rules.rs`:
- Around line 34-38: The validation currently always calls
ts_parser::get_language(*lang, false) which only tests the base TypeScript
grammar; update the loop that builds tree_sitter::Query::new to try both
TypeScript variants for TypeScript rules: when iterating rule.languages, if the
language is the TypeScript identifier used in your codebase (the same enum/const
used by the scanner), call ts_parser::get_language(*lang, false) and
ts_parser::get_language(*lang, true) (or otherwise select the TSX variant) and
attempt tree_sitter::Query::new with each variant, treating the rule as valid if
either variant succeeds; only emit an error from the validation logic if
Query::new fails for both variants (for non-TypeScript languages keep the single
existing get_language call).
- Around line 67-76: Add an optional boolean field is_tsx_jsx to the RuleTest
struct (defaulting to false and exposed in TOML deserialization), then use that
field when choosing the parser variant: replace the hardcoded false passed to
ts_parser::get_language(lang, false) and to ts_parser::parse_source(..., lang,
false) with test.is_tsx_jsx so the Parser created in rules.rs (the parser/tree
creation path using parser and tree variables) will parse with the TSX/JSX
grammar when the test metadata requests it.
In `@src/rego/grouping.rs`:
- Around line 123-126: The truncation logic for first_line (from
finding.code_snippet.lines()) is slicing bytes with first_line[..77], which can
panic on non-ASCII UTF-8; replace the byte-slice with a Unicode-safe truncation
using first_line.chars().take(77).collect::<String>() (or an equivalent
chars()-based helper) and then build truncated accordingly (add "..." only when
the char-count exceeded 77) so variables first_line and truncated remain correct
and safe for multi-byte characters.
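The character-based truncation described above can be sketched as follows; the helper name `truncate_chars` is illustrative, but the 77-character limit and the "add `...` only when truncated" behavior mirror the comment:

```rust
/// Truncate to at most `max` characters without slicing mid-codepoint.
/// Sketch of the fix for src/rego/grouping.rs; byte-slicing like
/// `&s[..77]` panics when byte 77 falls inside a multi-byte character.
fn truncate_chars(s: &str, max: usize) -> String {
    if s.chars().count() <= max {
        s.to_string()
    } else {
        // take() counts chars, not bytes, so this is UTF-8 safe.
        let mut out: String = s.chars().take(max).collect();
        out.push_str("...");
        out
    }
}

fn main() {
    let snippet = "héllo wörld, a long enough code snippet to truncate";
    println!("{}", truncate_chars(snippet, 10));
}
```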
- Around line 29-43: The generated package segment needs to ensure no segment
starts with a digit: after computing sanitized (from stem) split sanitized on
'.' and transform each segment so that if a segment's first character is a digit
you prefix it with '_' (e.g., "2024" -> "_2024"); then rejoin with '.' and use
that adjusted sanitized when building the final package string with prefix;
apply the same rule to prefix if prefix contains dot-separated segments (or
ensure prefix segments are already validated) so the final output from the block
that computes sanitized/prefix never yields a segment beginning with a digit.
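The digit-prefixing rule above amounts to a small transform over dot-separated segments; `fix_digit_segments` is a hypothetical helper name for illustration:

```rust
/// Prefix any dot-separated segment that begins with a digit with '_',
/// since a Rego package segment must not start with a digit.
/// Sketch of the sanitized/prefix adjustment described above.
fn fix_digit_segments(sanitized: &str) -> String {
    sanitized
        .split('.')
        .map(|seg| {
            if seg.chars().next().is_some_and(|c| c.is_ascii_digit()) {
                format!("_{seg}") // "2024" -> "_2024"
            } else {
                seg.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(".")
}

fn main() {
    println!("{}", fix_digit_segments("policies.2024.roles")); // policies._2024.roles
}
```

The same transform can be applied to the `prefix` before joining it with the sanitized stem.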
In `@src/rego/templates.rs`:
- Around line 105-135: The Rbac and FeatureGate branches in templates.rs are
incorrectly treating every string literal as a role/plan; stop inferring
roles/plans from arbitrary literals by removing the block that formats literals
into a set and instead default to the safe stub (e.g., "{\"TODO\"}") unless the
caller explicitly marked the literals as roles/plans; update the
AuthCategory::Rbac and AuthCategory::FeatureGate branches to insert the TODO
stub into vars (roles/plans) and, if you add a new explicit flag/parameter
(e.g., explicit_roles or explicit_plans), only format and insert the literals
when that flag is true; keep use of vars.insert("roles"/"plans") but eliminate
the automatic literals -> set conversion to avoid overly permissive policies.
In `@src/rego/validator.rs`:
- Around line 31-38: The current two-pass substitution in validate_template
(validator.rs) naively replaces all {{...}} with a quoted "placeholder", causing
invalid code when a placeholder appears in identifier positions (e.g., after a
dot); change validate_template to perform context-aware replacement: detect
placeholder occurrences and if the placeholder is in an identifier position
(e.g., preceded by a dot operator or at a location where an unquoted identifier
is required) replace it with an unquoted safe identifier (e.g., placeholder_id)
or bracket-index form, otherwise replace with a quoted placeholder string;
implement this by scanning template with a regex that captures surrounding
character (or a small token-aware pass) and update the replacement logic in the
same function (look for re_quoted, re_bare, rendered, full_policy, validate_rego
references) and add a regression test that feeds an ABAC-like template such as
"input.user.{{attribute}} == {{value}}" to validate_template to assert it
produces valid Rego.
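The context-aware substitution can be done with a single scan over the template, checking the character before each placeholder. This sketch shows only the replacement pass (the real `validate_template` also compiles the result through regorus); `substitute_placeholders` is an illustrative name:

```rust
/// Replace each {{...}} placeholder context-sensitively: after a dot the
/// placeholder sits in identifier position, so emit a bare identifier;
/// everywhere else emit a quoted string. Sketch of the validator fix.
fn substitute_placeholders(template: &str) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        let before = &rest[..start];
        out.push_str(before);
        let after = &rest[start..];
        if let Some(end) = after.find("}}") {
            if before.trim_end().ends_with('.') {
                out.push_str("placeholder_id"); // identifier position
            } else {
                out.push_str("\"placeholder\""); // value position
            }
            rest = &after[end + 2..];
        } else {
            // Unterminated placeholder: pass through unchanged.
            out.push_str(after);
            return out;
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    // The ABAC-like template from the regression test suggested above:
    let t = "input.user.{{attribute}} == {{value}}";
    println!("{}", substitute_placeholders(t));
    // → input.user.placeholder_id == "placeholder"
}
```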
In `@src/scanner/imports.rs`:
- Around line 98-101: The current is_enforcement_point function uses substring
matching (code_snippet.contains) which yields false positives (e.g.,
authorizeResult, comments, string literals); update is_enforcement_point to
detect imported identifiers as standalone tokens or use AST/token-based
detection instead of raw contains: for a quick fix use a regex/tokenizer to
match word-boundary/identifier boundaries against each entry in policy_imports,
or better yet parse the snippet into an AST or tokens and check for Identifier
nodes matching the policy_imports set; ensure behavior aligns with scan()
expectations so only true identifier uses are treated as enforcement points.
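The "quick fix" variant of this check — identifier-boundary matching instead of raw `contains` — can be written without a regex crate by testing the characters on either side of each candidate match; `contains_identifier` is a hypothetical helper name:

```rust
/// Return true only if `ident` appears as a standalone identifier in
/// `code`, not as a substring of a longer name (e.g. `authorizeResult`).
/// Std-only sketch of the word-boundary check suggested above.
fn contains_identifier(code: &str, ident: &str) -> bool {
    let is_ident_char = |c: char| c.is_alphanumeric() || c == '_' || c == '$';
    let mut search = 0;
    while let Some(pos) = code[search..].find(ident) {
        let start = search + pos;
        let end = start + ident.len();
        // Both neighbors must be non-identifier characters (or absent).
        let before_ok = code[..start].chars().next_back().map_or(true, |c| !is_ident_char(c));
        let after_ok = code[end..].chars().next().map_or(true, |c| !is_ident_char(c));
        if before_ok && after_ok {
            return true;
        }
        search = end;
    }
    false
}

fn main() {
    assert!(contains_identifier("authorize(user, action)", "authorize"));
    assert!(!contains_identifier("let authorizeResult = check();", "authorize"));
}
```

Note this still matches inside comments and string literals; the AST-based alternative mentioned above avoids that.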
- Around line 23-39: The parser only handles TS_NAMED_IMPORT_QUERY and
TS_DEFAULT_IMPORT_QUERY so namespace imports like `import * as authz from
"../policy"` are ignored; update find_policy_imports() to also handle a new
TS_NAMESPACE_IMPORT_QUERY that matches an import_clause with a namespace_import
(capture the namespace identifier as `@name` and the source as `@source`) and add a
unit test that parses a file containing `import * as authz from "../policy"` and
subsequent `authz.authorize(...)` to ensure policy_imports is populated;
reference TS_NAMED_IMPORT_QUERY, TS_DEFAULT_IMPORT_QUERY,
TS_NAMESPACE_IMPORT_QUERY, find_policy_imports(), and policy_imports when making
the change.
In `@src/scanner/matcher.rs`:
- Around line 157-176: The project must declare Rust 1.76 to allow the let-chain
syntax used in filter_findings; open Cargo.toml and in the [package] section add
rust-version = "1.76" so the compiler toolchain requirement is enforced for the
repo (this will ensure code using the if let Some(min) = min_confidence && ...
expression in filter_findings compiles consistently).
In `@src/scanner/mod.rs`:
- Around line 119-140: The enforcement_points counter is incremented before
deduplication, causing duplicate enforcement calls to be double-counted; change
the logic in the loop that uses imports::is_enforcement_point to collect skipped
enforcement findings into a temporary Vec (e.g. skipped_enforcement), push
non-enforcement findings into all_findings as before, then after the outer loop
run matcher::dedup_findings on both all_findings and skipped_enforcement and set
enforcement_points = deduped_skipped_enforcement.len() (or add that length)
instead of incrementing enforcement_points inline; reference enforcement_points,
imports::is_enforcement_point, all_findings, and matcher::dedup_findings to
locate and update the code.
In `@src/scanner/parser.rs`:
- Around line 6-13: The get_language function currently panics for unsupported
Language variants; change its signature from returning tree_sitter::Language to
returning Result<tree_sitter::Language, YourErrorType> (e.g., a custom error
enum or anyhow::Error), update the match in get_language so each supported
branch returns Ok(<language>.into()) and replace the unimplemented! branch with
Err(...) containing a clear error message about the unsupported Language
variant, and then update callers (e.g., places that call get_language and any
use in discover_files or parser initialization) to handle the Result (propagate
with ? or map the error) instead of relying on a panic. Ensure to reference the
Language enum and get_language when adding the new error case so unsupported
languages are handled gracefully.
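The shape of the fallible `get_language` can be sketched like this; the `Language` variants, error type, and the `&'static str` stand-in for `tree_sitter::Language` are all illustrative:

```rust
use std::fmt;

/// Illustrative stand-in for the scanner's Language enum.
#[derive(Debug, Clone, Copy)]
enum Language {
    TypeScript,
    JavaScript,
    Python, // not supported in this sketch
}

#[derive(Debug)]
struct UnsupportedLanguage(Language);

impl fmt::Display for UnsupportedLanguage {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unsupported language: {:?}", self.0)
    }
}

/// Returns Err for unsupported variants instead of panicking via
/// unimplemented!(); callers propagate with `?`. The &'static str here
/// stands in for tree_sitter::Language.
fn get_language(lang: Language) -> Result<&'static str, UnsupportedLanguage> {
    match lang {
        Language::TypeScript => Ok("tree-sitter-typescript"),
        Language::JavaScript => Ok("tree-sitter-javascript"),
        other => Err(UnsupportedLanguage(other)),
    }
}

fn main() {
    assert!(get_language(Language::TypeScript).is_ok());
    // No panic: the unsupported case is now a recoverable error.
    assert!(get_language(Language::Python).is_err());
}
```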
---
Minor comments:
In @.claude/commands/address-pr-feedback.md:
- Line 94: Replace the checklist phrase "Needs input" with "Needs user input" to
improve clarity; locate the exact string "Needs input" in
.claude/commands/address-pr-feedback.md (the checklist item) and update it to
"Needs user input" wherever it appears in that checklist section so the wording
reads naturally for users.
In `@rules/typescript/feature-gate-check.toml`:
- Around line 20-27: Add a negative test that asserts the rule.rego_template
does not allow non-matching inputs: create a test case that invokes the template
(using template/template rendering path that substitutes "{{plan_value}}") with
an input where input.user.plan is a different value (e.g., "starter" or use a
different property like input.user.tier) and assert allow == false; target the
rule.rego_template/template and the resulting allow decision (default allow :=
false, allow if { input.user.plan in {"{{plan_value}}"} }) so the test ensures a
non-matching plan_value does not produce a true decision.
In `@src/commands/scan.rs`:
- Line 9: The current assignment let path =
args.path.canonicalize().unwrap_or(args.path.clone()) swallows all canonicalize
errors; change it to explicitly match on args.path.canonicalize() and only fall
back to args.path.clone() when the error.kind() == std::io::ErrorKind::NotFound,
otherwise propagate or return the error (e.g., return Err(e) or use ?), so
permission and other filesystem errors are not hidden; update the code around
the path variable where args.path.canonicalize() is called to implement this
conditional handling.
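The conditional fallback described above is a three-arm match on `canonicalize`; `resolve_path` is a hypothetical name for the extracted logic:

```rust
use std::io;
use std::path::PathBuf;

/// Canonicalize the path, falling back to the original only when it does
/// not exist; other I/O errors (e.g. permission denied) are propagated
/// instead of being silently swallowed. Sketch of the scan.rs fix.
fn resolve_path(path: PathBuf) -> io::Result<PathBuf> {
    match path.canonicalize() {
        Ok(p) => Ok(p),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(path),
        Err(e) => Err(e),
    }
}

fn main() -> io::Result<()> {
    // A nonexistent path falls back to itself rather than erroring.
    let p = resolve_path(PathBuf::from("no/such/scan/target"))?;
    println!("{}", p.display());
    Ok(())
}
```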
In `@src/output/text.rs`:
- Around line 13-16: The function currently returns early when
findings.is_empty(), which prevents printing enforcement coverage info
(enforcement_points) and causes text output to omit the coverage summary; change
the logic so that if findings.is_empty() you still write "No authorization
patterns found." to writer but do NOT return—continue to call the same
enforcement coverage generation/printing code that uses enforcement_points (the
same codepath used for non-empty findings) so the coverage summary is always
emitted; update the block around findings.is_empty() in src/output/text.rs (the
writer/writeln and the surrounding control flow) to remove the early return and
ensure enforcement_points printing runs afterward.
In `@src/scanner/discovery.rs`:
- Around line 39-49: The current loop silently ignores failures when adding
exclude patterns and when building overrides: detect errors from
OverrideBuilder::add while iterating over exclude_patterns and log a warning
including the invalid pattern and error; likewise, when overrides.build()
returns Err, log a warning with the build error and the root or patterns and
avoid silently proceeding—use the same logging facility used elsewhere (e.g.,
log::warn! or the crate's logger) and still only call builder.overrides(ov) on
Ok(ov).
---
Nitpick comments:
In `@rules/typescript/express-auth-middleware.toml`:
- Around line 7-14: The query only captures bare identifier arguments to use()
and misses invoked middleware forms; update the Tree-sitter query (the query
variable that uses captures `@method`, `@middleware_name`, `@match`) to also match
arguments that are call_expression nodes and member_expression call_expression
nodes (e.g., app.use(requireAuth()) and app.use(passport.authenticate("jwt"))).
Specifically add alternative patterns for the arguments: a direct
call_expression (with its callee captured) and a member_expression whose parent
is a call_expression, and capture those callee identifiers/member_expression as
`@middleware_name`/`@middleware_call` so invoked middleware is recognized by the
same rule.
In `@rules/typescript/express-route-middleware.toml`:
- Around line 8-16: The current call_expression query only binds middleware when
the second argument is a plain (identifier) `@middleware_name`; update the pattern
so `@middleware_name` also captures member expressions and invocation forms (e.g.,
auth.requireAuth, requireAuth(), or auth.requireAuth()) by allowing the second
argument to be (identifier | member_expression | call_expression) and optionally
a call_expression whose function is a member_expression, keeping the existing
captures like (member_expression property:(property_identifier) `@method`) and the
outer (call_expression function: ... ) `@match` so the rule will match both direct
identifiers, dotted properties, and common invoked middleware factories.
In `@rules/typescript/feature-gate-check.toml`:
- Around line 29-35: Add a negative test case to the existing [[rule.tests]]
entries to verify predicate filtering: create a new [[rule.tests]] block with an
input that uses a non-matching property name (e.g., `if (user.name ===
"admin")`) and set expect_match = false so the rule does not trigger; reference
the existing rule.tests/input/expect_match pattern to mirror the positive case
and ensure non-matching properties are excluded by the predicate filter.
In `@rules/typescript/jwt-token-check.toml`:
- Line 6: The rule's description string ("JWT token verification or
authorization header access") is inconsistent with the predicate which includes
the "sign" method (used for token creation); either update the description to
mention signing/creation (e.g., add "signing/creation" or "signing") or remove
"sign" from the method predicate so the rule only targets
verification/authorization checks — locate the rule with the description line
and the predicate that lists "sign" and apply one of these two fixes
consistently (also update the similar occurrence referenced around the other
predicate block).
In `@rules/typescript/session-auth-check.toml`:
- Around line 8-15: The rule currently matches any call_expression with a
member_expression by method name (captured as `@method`) which yields false
positives; modify the tree-sitter pattern to also capture the
member_expression's object as a receiver (e.g., add a capture like `@receiver` on
the object node within member_expression) and add a new predicate (e.g.,
rule.predicates.receiver) that matches common receiver identifiers such as
^(req|request|ctx|session|ctx.request)$ so the rule only triggers when the
method (isAuthenticated|isLoggedIn|isAuthorized) is invoked on one of those
receivers; keep the existing method predicate (rule.predicates.method) and
ensure both predicates must match.
In `@src/cli.rs`:
- Around line 88-104: The CLI deep-scan flags (fields provider, model, max_cost,
api_key in the CLI struct) are documented as requiring --deep but clap won't
enforce that; update each field's #[arg(...)] attribute to include requires =
"deep" (e.g., add requires = "deep" to the attributes for provider, model,
max_cost and api_key) so clap validates that those options can only be used when
--deep is present while preserving existing attributes like long and env.
In `@src/commands/extract.rs`:
- Around line 17-32: The fallback logic in execute() for ExtractArgs currently
compares CLI values to hardcoded defaults ("app" and "./policies/generated"),
which is fragile; replace these literal checks by referencing shared constants
or by removing CLI defaults and treating those fields as Option so config wins
when present. Specifically, introduce shared DEFAULT_PACKAGE_PREFIX and
DEFAULT_OUTPUT_DIR constants (used by cli.rs and here) or change
ExtractArgs.package_prefix/output_dir to Option and adjust execute() to use
config.extract.package_prefix.unwrap_or(args.package_prefix) and
config.extract.output_dir.map(PathBuf::from).unwrap_or_else(||
args.output_dir.clone()) so the comparisons to hardcoded strings are eliminated
and the fallback behavior is deterministic.
In `@src/commands/init.rs`:
- Around line 25-32: The exists-then-write sequence (path.exists() followed by
std::fs::write(&path, DEFAULT_CONFIG)) has a TOCTOU race; replace it with an
atomic create-if-not-exists using
OpenOptions::new().write(true).create_new(true).open(&path) and then write
DEFAULT_CONFIG into that file handle, propagating errors as before (return the
same ZiftError::General on already-existing or map the io error), so the init
logic in this function becomes atomic and race-free.
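The atomic create-if-not-exists pattern can be sketched as below; the `DEFAULT_CONFIG` contents and the mapping back to `ZiftError` are placeholders:

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

const DEFAULT_CONFIG: &str = "# placeholder for the real default .zift.toml\n";

/// create_new(true) makes creation atomic: the open itself fails with
/// AlreadyExists instead of racing between an exists() check and a
/// write. Sketch of the init fix; ZiftError mapping is elided.
fn write_default_config(path: &Path) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(path)?;
    file.write_all(DEFAULT_CONFIG.as_bytes())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("zift-init-demo.toml");
    let _ = std::fs::remove_file(&path);
    write_default_config(&path)?;
    // A second call fails atomically instead of clobbering the file.
    assert_eq!(
        write_default_config(&path).unwrap_err().kind(),
        io::ErrorKind::AlreadyExists
    );
    std::fs::remove_file(&path)
}
```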
In `@src/output/json.rs`:
- Around line 38-46: The aggregation uses lowercased keys for categories but not
for confidences, causing inconsistent JSON keys; in the loop over findings where
by_category and by_confidence are updated (variables f, f.category,
f.confidence, entries via .entry(...).or_default()), normalize the confidence
key the same way as category (e.g., call .to_lowercase() on f.confidence when
building the entry key) so both maps produce consistently-cased keys in the
output.
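The normalization amounts to applying `.to_lowercase()` on both entry keys; this minimal sketch uses string pairs as stand-ins for the real finding struct:

```rust
use std::collections::HashMap;

/// Aggregate findings with consistently lowercased keys in BOTH maps,
/// not just by_category. Sketch with (category, confidence) string
/// pairs standing in for the Finding type.
fn summarize(findings: &[(&str, &str)]) -> (HashMap<String, u32>, HashMap<String, u32>) {
    let mut by_category: HashMap<String, u32> = HashMap::new();
    let mut by_confidence: HashMap<String, u32> = HashMap::new();
    for (category, confidence) in findings {
        *by_category.entry(category.to_lowercase()).or_default() += 1;
        // Same normalization for confidence as for category.
        *by_confidence.entry(confidence.to_lowercase()).or_default() += 1;
    }
    (by_category, by_confidence)
}

fn main() {
    let (_, by_conf) = summarize(&[("RBAC", "High"), ("rbac", "high")]);
    assert_eq!(by_conf.get("high"), Some(&2));
}
```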
In `@src/rules/mod.rs`:
- Around line 90-113: Iteration over r.predicates (a HashMap) yields
non-deterministic predicate order; ensure deterministic ordering by sorting
predicates by their capture name before constructing the Vec<(String,
Predicate)>. Modify the loop building predicates (the block that creates
Predicate::Match/Eq/NotMatch/NotEq and pushes into predicates) to first collect
r.predicates.into_iter() (or keys) into a temporary Vec, sort that Vec by
capture_name (or key) and then iterate the sorted collection to build and push
(capture_name, p); keep the same error handling and use the same Predicate enum
and check_predicates logic so behavior is identical except for deterministic
ordering.
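The deterministic-ordering change is a collect-sort-iterate step before building the predicate list; this sketch uses a `String` stand-in for the `Predicate` enum:

```rust
use std::collections::HashMap;

/// Sort predicate entries by capture name before building the ordered
/// Vec, so evaluation order is stable across runs even though HashMap
/// iteration order is not. Sketch of the rules/mod.rs change.
fn ordered_predicates(raw: HashMap<String, String>) -> Vec<(String, String)> {
    let mut entries: Vec<(String, String)> = raw.into_iter().collect();
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}

fn main() {
    let mut raw = HashMap::new();
    raw.insert("method".to_string(), "match".to_string());
    raw.insert("arg".to_string(), "eq".to_string());
    let names: Vec<_> = ordered_predicates(raw).into_iter().map(|(k, _)| k).collect();
    assert_eq!(names, vec!["arg", "method"]);
}
```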
In `@src/scanner/matcher.rs`:
- Around line 65-69: The code silently drops invalid UTF-8 by calling
capture.node.utf8_text(source).unwrap_or_default(); change this to explicitly
handle the Result from capture.node.utf8_text(source) (e.g. match or if let
Err(e) = ...) and either return an error from the surrounding function or emit a
warning with context (include capture identifier like capture.index or
capture.name and the error e) instead of producing an empty string; replace the
unwrap_or_default call and ensure callers/signature of the function (in
matcher.rs) propagate or handle the error accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 58dde901-c70e-496a-b479-d3195910e62c
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (44)
- .claude/commands/address-pr-feedback.md
- .claude/commands/commit.md
- .claude/commands/handoff.md
- .claude/commands/pr.md
- .claude/commands/review.md
- Cargo.toml
- rules/typescript/express-auth-middleware.toml
- rules/typescript/express-route-middleware.toml
- rules/typescript/feature-gate-check.toml
- rules/typescript/has-role-call.toml
- rules/typescript/jwt-token-check.toml
- rules/typescript/nestjs-roles-decorator.toml
- rules/typescript/nestjs-use-guards.toml
- rules/typescript/ownership-check.toml
- rules/typescript/permission-check-call.toml
- rules/typescript/role-check-conditional.toml
- rules/typescript/role-includes-check.toml
- rules/typescript/session-auth-check.toml
- src/cli.rs
- src/commands/extract.rs
- src/commands/init.rs
- src/commands/mod.rs
- src/commands/report.rs
- src/commands/rules.rs
- src/commands/scan.rs
- src/config.rs
- src/error.rs
- src/logging.rs
- src/main.rs
- src/output/json.rs
- src/output/mod.rs
- src/output/text.rs
- src/rego/grouping.rs
- src/rego/mod.rs
- src/rego/templates.rs
- src/rego/validator.rs
- src/rules/embedded.rs
- src/rules/mod.rs
- src/scanner/discovery.rs
- src/scanner/imports.rs
- src/scanner/matcher.rs
- src/scanner/mod.rs
- src/scanner/parser.rs
- src/types.rs
- Fix UTF-8 byte-slicing panic in grouping.rs by truncating via chars()
- Fix strip_quotes panic on single-char strings with length guard
- Remove SetMetadata from nestjs-roles-decorator rule (different arg semantics)
- Use word-boundary regex instead of substring contains() for enforcement point detection
- Deduplicate enforcement points to prevent double-counting across rules
- Remove unused ParseError variant to fix clippy warning
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/scanner/imports.rs (1)
23-39: ⚠️ Potential issue | 🟠 Major: Namespace import query still missing.
The queries only handle named imports (`import { foo }`) and default imports (`import foo`). Files using `import * as authz from "../policy"` followed by `authz.authorize(...)` won't have their imports detected, causing valid enforcement points to be incorrectly reported as inline findings.
🛠️ Proposed fix to add namespace import support

 const TS_DEFAULT_IMPORT_QUERY: &str = r#"
 (import_statement
   (import_clause (identifier) @name)
   source: (string) @source)
 "#;
+
+/// Tree-sitter query for namespace imports: `import * as foo from 'bar'`
+const TS_NAMESPACE_IMPORT_QUERY: &str = r#"
+(import_statement
+  (import_clause
+    (namespace_import
+      (identifier) @name))
+  source: (string) @source)
+"#;

Then update line 56 to include the new query:

-for query_src in [TS_NAMED_IMPORT_QUERY, TS_DEFAULT_IMPORT_QUERY] {
+for query_src in [TS_NAMED_IMPORT_QUERY, TS_DEFAULT_IMPORT_QUERY, TS_NAMESPACE_IMPORT_QUERY] {

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scanner/imports.rs` around lines 23 - 39, Add a Tree-sitter query constant to capture namespace imports (e.g., import * as alias from "source") — name it TS_NAMESPACE_IMPORT_QUERY — that binds the namespace identifier to `@name` and the string literal to `@source`, similar to TS_NAMED_IMPORT_QUERY and TS_DEFAULT_IMPORT_QUERY; then update the code that currently uses only TS_NAMED_IMPORT_QUERY and TS_DEFAULT_IMPORT_QUERY (the place where import queries are collected/iterated) to include TS_NAMESPACE_IMPORT_QUERY so namespace imports (import * as alias from "...") are detected and mapped to the same name/source pairs.
🧹 Nitpick comments (1)
src/rego/templates.rs (1)
226-234: Add ABAC regression test for quoted RHS value.
There's no test protecting the ABAC default-stub rendering path that produced this quoting issue. Please add one asserting generated output contains `== "..."` for string literals.
🧪 Suggested test addition

 @@ fn default_stub_rbac() {
     let stub = generate_default_stub(
         AuthCategory::Rbac,
         r#"authorize(user, "audit:read", { type: "audit" })"#,
     );
     assert!(stub.contains("input.user.role in"));
     assert!(stub.contains("audit:read"));
 }
+
+#[test]
+fn default_stub_abac_quotes_value_literal() {
+    let stub = generate_default_stub(
+        AuthCategory::Abac,
+        r#"if (user.department == "finance") { allow(); }"#,
+    );
+    assert!(stub.contains(r#"input.user.department == "finance""#));
+}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rego/templates.rs` around lines 226 - 234, Add an ABAC regression test alongside default_stub_rbac that calls generate_default_stub with AuthCategory::Abac and an authorize rule containing a string literal RHS (e.g., authorize(user, "audit:read", { type: "audit" })), then assert the rendered stub contains the exact comparison form == "..." (for example assert!(stub.contains("== \"audit:read\""))); ensure the test uses the same helper generate_default_stub and checks string literals are quoted in the output to prevent regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/rego/templates.rs`:
- Around line 57-58: The rendered ABAC stub ends up with unquoted identifiers
because render_template (which calls strip_quotes) removes quotes added in
generate_default_stub; to fix, change the Rego template in src/rego/templates.rs
so the value placeholder is a literal string (e.g. input.user.{{attribute}} ==
"{{value}}") and then remove the redundant quote-wrapping in
generate_default_stub (stop using format!("\"{v}\"") so values are passed
unquoted into render_template); alternatively adjust
render_template/strip_quotes to not strip quotes for template values, but the
simpler fix is the template->include quotes + remove quoting in
generate_default_stub and update the other ABAC-related templates at the noted
lines (145–149) accordingly.
---
Duplicate comments:
In `@src/scanner/imports.rs`:
- Around line 23-39: Add a Tree-sitter query constant to capture namespace
imports (e.g., import * as alias from "source") — name it
TS_NAMESPACE_IMPORT_QUERY — that binds the namespace identifier to `@name` and the
string literal to `@source`, similar to TS_NAMED_IMPORT_QUERY and
TS_DEFAULT_IMPORT_QUERY; then update the code that currently uses only
TS_NAMED_IMPORT_QUERY and TS_DEFAULT_IMPORT_QUERY (the place where import
queries are collected/iterated) to include TS_NAMESPACE_IMPORT_QUERY so
namespace imports (import * as alias from "...") are detected and mapped to the
same name/source pairs.
---
Nitpick comments:
In `@src/rego/templates.rs`:
- Around line 226-234: Add an ABAC regression test alongside default_stub_rbac
that calls generate_default_stub with AuthCategory::Abac and an authorize rule
containing a string literal RHS (e.g., authorize(user, "audit:read", { type:
"audit" })), then assert the rendered stub contains the exact comparison form ==
"..." (for example assert!(stub.contains("== \"audit:read\""))); ensure the test
uses the same helper generate_default_stub and checks string literals are quoted
in the output to prevent regressions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b05efddb-eac0-467c-92ee-84c5a101898b
📒 Files selected for processing (6)
- rules/typescript/nestjs-roles-decorator.toml
- src/error.rs
- src/rego/grouping.rs
- src/rego/templates.rs
- src/scanner/imports.rs
- src/scanner/mod.rs
✅ Files skipped from review due to trivial changes (2)
- src/error.rs
- src/rego/grouping.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- rules/typescript/nestjs-roles-decorator.toml
- Return error on canonicalize failure instead of silent fallback
- Use bounds-checked .get() for capture index in matcher
- Filter path traversal components from derived output paths
- Add depth limit and symlink skipping for rule directory traversal
Actionable comments posted: 4
♻️ Duplicate comments (1)
src/rego/grouping.rs (1)
58-63: ⚠️ Potential issue | 🟠 Major: Path containment is still bypassable for absolute/prefixed paths.
Filtering only `ParentDir` leaves `RootDir`/Windows `Prefix` components intact. If `stem` is absolute, `output_dir.join(safe_stem)` can resolve outside `output_dir`.
Suggested fix

-// Filter out path traversal components
-let safe_stem: PathBuf = stem
-    .components()
-    .filter(|c| !matches!(c, std::path::Component::ParentDir))
-    .collect();
+// Keep only normal path components to guarantee a relative safe path
+let safe_stem: PathBuf = stem
+    .components()
+    .filter_map(|c| match c {
+        std::path::Component::Normal(seg) => Some(seg),
+        _ => None, // drop Prefix/RootDir/CurDir/ParentDir
+    })
+    .collect();

#!/bin/bash
# Verify current sanitization only removes ParentDir in derive_output_path
rg -n -C2 'derive_output_path|Component::ParentDir|output_dir\.join' src/rego/grouping.rs

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rego/grouping.rs` around lines 58 - 63, derive_output_path currently only strips ParentDir but still allows RootDir/Prefix components so an absolute or prefixed `stem` can escape `output_dir` when you call `output_dir.join(safe_stem)`. Fix by only collecting safe, relative components (accept only std::path::Component::Normal, optionally ignore CurDir) and explicitly skip RootDir and Prefix and ParentDir; build `safe_stem` from those Normal components so it can never be absolute or prefixed, and handle the case of an empty result (e.g., use a default file name) before calling `output_dir.join(...)`. Reference symbols: derive_output_path, stem, safe_stem, output_dir.join.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/commands/scan.rs`:
- Around line 24-44: Check args.format for OutputFormat::Sarif and return the
ZiftError immediately before calling scanner::scan or loading rules; move the
match on args.format (or an early if args.format == OutputFormat::Sarif) up so
the unsupported-format branch executes before scanner::scan, loaded_rules, and
any expensive work (symbols: args.format, OutputFormat::Sarif, scanner::scan,
loaded_rules, ZiftError::General).
In `@src/rego/grouping.rs`:
- Around line 149-151: The check using trimmed.starts_with("default allow") is
too broad and can match identifiers like "default allow_read"; instead split the
trimmed line on whitespace and ensure the first token is "default" and the
second token is exactly "allow" before treating it as the default-allow line
(use the existing trimmed and has_default_allow variables and update the
conditional that currently does trimmed.starts_with("default allow") to this
stricter token-based check).
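The stricter token-based check is a two-token test on the trimmed line; `is_default_allow` is an illustrative name for the extracted condition:

```rust
/// Token-based check for the `default allow` line: comparing whole
/// tokens avoids matching identifiers like `default allow_read`, which
/// a starts_with("default allow") check would accept. Sketch only.
fn is_default_allow(line: &str) -> bool {
    let mut tokens = line.trim().split_whitespace();
    tokens.next() == Some("default") && tokens.next() == Some("allow")
}

fn main() {
    assert!(is_default_allow("default allow := false"));
    assert!(!is_default_allow("default allow_read := false"));
}
```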
In `@src/rules/mod.rs`:
- Around line 92-111: The code that builds the Predicate (the block using
pred.match_re, pred.eq, pred.not_match, pred.not_eq and producing
Predicate::Match/ Eq/NotMatch/NotEq) must validate that exactly one operator is
present; compute how many of pred.match_re, pred.eq, pred.not_match, pred.not_eq
are Some and if the count != 1 return a ZiftError::RuleParse for r.id with a
message like "predicate '{capture_name}' must specify exactly one operator"
before constructing the Predicate, otherwise proceed to construct the
appropriate Predicate variant as currently done.
In `@src/scanner/matcher.rs`:
- Around line 151-156: compute_finding_id currently hashes only (rule_id,
file_path, snippet) so identical snippets in the same file collide; update
compute_finding_id to incorporate the match's unique location (e.g., byte offset
or span start/end or line/column) into the hash input so repeated, distinct
matches in the same file produce different IDs—modify callers that build the
input to pass the match location and change compute_finding_id signature to
accept and include that location (or a token like match_start and match_end) and
ensure dedup_findings still uses the new ID scheme.
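The widened hash input can be sketched like this. The PR description says finding IDs use SHA-256; std has no SHA-256, so this sketch substitutes `DefaultHasher` purely for illustration — only the *inputs* to the hash are the point:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Include the match span in the finding ID so two identical snippets
/// at different locations in the same file get distinct IDs. Uses
/// DefaultHasher for illustration; the real code hashes with SHA-256.
fn compute_finding_id(
    rule_id: &str,
    file_path: &str,
    snippet: &str,
    match_start: usize,
    match_end: usize,
) -> u64 {
    let mut h = DefaultHasher::new();
    (rule_id, file_path, snippet, match_start, match_end).hash(&mut h);
    h.finish()
}

fn main() {
    // Same snippet, same file, different spans -> different IDs.
    let a = compute_finding_id("rbac-1", "src/app.ts", "hasRole(\"admin\")", 10, 26);
    let b = compute_finding_id("rbac-1", "src/app.ts", "hasRole(\"admin\")", 90, 106);
    assert_ne!(a, b);
}
```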
---
Duplicate comments:
In `@src/rego/grouping.rs`:
- Around line 58-63: derive_output_path currently only strips ParentDir but
still allows RootDir/Prefix components so an absolute or prefixed `stem` can
escape `output_dir` when you call `output_dir.join(safe_stem)`. Fix by only
collecting safe, relative components (accept only std::path::Component::Normal,
optionally ignore CurDir) and explicitly skip RootDir and Prefix and ParentDir;
build `safe_stem` from those Normal components so it can never be absolute or
prefixed, and handle the case of an empty result (e.g., use a default file name)
before calling `output_dir.join(...)`. Reference symbols: derive_output_path,
stem, safe_stem, output_dir.join.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cc942e90-1cf8-4472-a1b6-87922ad1d4c8
📒 Files selected for processing (4)
- src/commands/scan.rs
- src/rego/grouping.rs
- src/rules/mod.rs
- src/scanner/matcher.rs
- Split permission functions out of has-role-call.toml into new authorize-function-call.toml with permission-oriented Rego template
- Remove inequality operators from ownership-check query to prevent semantic mismatch in generated Rego
- Add equality operator constraint to role-check-conditional query
- Use idiomatic OPA array membership pattern in role-includes-check
- Detect permission-style strings (containing ':') in generate_default_stub and emit permission-based Rego instead of role-based
- Validate TypeScript rules against both TS and TSX grammars in validate and test actions
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/rego/templates.rs (1)
57-58: ⚠️ Potential issue | 🟠 Major: ABAC stub currently renders an unquoted value (`== admin`) instead of a string literal.
Because `render_template` strips surrounding quotes (Line 14), the value prepared at Lines 156-160 loses quotes and Line 57 becomes `input.user.<attr> == admin`. In Rego this is an identifier, not a string literal.
🔧 Proposed fix

-    input.user.{{attribute}} == {{value}}
+    input.user.{{attribute}} == "{{value}}"

 AuthCategory::Abac => {
     vars.insert(
         "attribute".to_string(),
         literals.first().cloned().unwrap_or("TODO".into()),
     );
     vars.insert(
         "value".to_string(),
-        literals
-            .get(1)
-            .map(|v| format!("\"{v}\""))
-            .unwrap_or("\"TODO\"".into()),
+        literals.get(1).cloned().unwrap_or("TODO".into()),
     );
 }

#!/bin/bash
# Show the exact ABAC template + render path that causes quote loss.
sed -n '9,18p' src/rego/templates.rs
sed -n '52,60p' src/rego/templates.rs
sed -n '150,162p' src/rego/templates.rs
python - <<'PY'
import re
template = 'input.user.{{attribute}} == {{value}}'
vars = {'attribute': 'role', 'value': '"admin"'}
def strip_quotes(s):
    s = s.strip()
    if len(s) >= 2 and ((s.startswith('"') and s.endswith('"')) or (s.startswith("'") and s.endswith("'"))):
        return s[1:-1]
    return s
out = re.sub(r"\{\{(\w+)\}\}", lambda m: strip_quotes(vars.get(m.group(1), m.group(0))), template)
print(out)  # expected current behavior: input.user.role == admin
PY

Also applies to: 156-160
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rego/templates.rs` around lines 57 - 58, The ABAC template input.user.{{attribute}} == {{value}} ends up with an unquoted identifier because render_template strips surrounding quotes from substitutions; update render_template (or the code path that prepares the ABAC template values) so that values intended as string literals are preserved or re-quoted: either stop removing surrounding quotes for the {{value}} placeholder when rendering the ABAC template, or detect the ABAC equality pattern (input.user.{{attribute}} == {{value}}) and wrap the substituted value in quotes (e.g., ensure "admin" remains "admin") before returning the rendered string; change the logic in render_template and the code that prepares the vars so that the {{value}} substitution yields a valid Rego string literal rather than an identifier.
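A minimal Rust reproduction of the quote-stripping behavior described above, in place of the review's Python snippet. `render_template` and `strip_quotes` here are simplified stand-ins for the zift implementation, not the actual code:

```rust
use std::collections::HashMap;

// Illustrative: drop one layer of surrounding quotes, as the review says
// the real render_template does at line 14.
fn strip_quotes(s: &str) -> &str {
    let t = s.trim();
    if t.len() >= 2
        && ((t.starts_with('"') && t.ends_with('"'))
            || (t.starts_with('\'') && t.ends_with('\'')))
    {
        &t[1..t.len() - 1]
    } else {
        t
    }
}

// Naive {{var}} interpolation with quote stripping on each substitution.
fn render_template(template: &str, vars: &HashMap<&str, &str>) -> String {
    let mut out = template.to_string();
    for (k, v) in vars {
        out = out.replace(&format!("{{{{{k}}}}}"), strip_quotes(v));
    }
    out
}

fn main() {
    let mut vars = HashMap::new();
    vars.insert("attribute", "role");
    vars.insert("value", "\"admin\"");

    // Current template: the quotes on the substituted value are stripped,
    // leaving a bare Rego identifier.
    let broken = render_template("input.user.{{attribute}} == {{value}}", &vars);
    assert_eq!(broken, "input.user.role == admin");

    // Quoting the placeholder in the template restores a string literal.
    let fixed = render_template("input.user.{{attribute}} == \"{{value}}\"", &vars);
    assert_eq!(fixed, "input.user.role == \"admin\"");
    println!("{fixed}");
}
```

This is why the proposed fix quotes the placeholder in the template and stops pre-quoting the value: the two quoting layers otherwise cancel out.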
🧹 Nitpick comments (1)
rules/typescript/authorize-function-call.toml (1)
Lines 27-49: Add one explicit JavaScript fixture to exercise the JS path.

All current tests rely on default language selection (TypeScript first). Add a `language = "javascript"` case so this dual-language rule is explicitly covered.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rules/typescript/authorize-function-call.toml` around lines 27 - 49, Add an explicit JavaScript test case to the rule's test matrix: within the authorize-function-call.toml file add a [[rule.tests]] block with language = "javascript" and one of the existing inputs (e.g., the hasPermission("orders:read") or authorize("admin") snippet) and the corresponding expect_match value; locate the existing [[rule.tests]] entries that use inputs with hasPermission, authorize, and hasRole to mirror one of those blocks but include language = "javascript" so the JS path is explicitly exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rules/typescript/authorize-function-call.toml`:
- Around line 1-49: The new rule "ts-authorize-function-call" (rule id
ts-authorize-function-call) was created but not added to the EMBEDDED_RULES
list, so it won't be loaded by the default CLI; update the EMBEDDED_RULES
constant to include "authorize-function-call" (or the rule's identifier string
used by the list) alongside the existing "permission-check-call" entry and
ensure the rule is present in the same collection/array initialization where
EMBEDDED_RULES is defined so it gets embedded at build/runtime (refer to the
EMBEDDED_RULES symbol and the existing permission-check-call registration to
mirror formatting and placement).
In `@src/commands/rules.rs`:
- Line 76: The code directly indexes rule.languages[0] when computing lang (via
test.language.unwrap_or(rule.languages[0])), which panics if languages is empty;
change this to safely access the first language (e.g., use rule.languages.get(0)
or .first()) and handle the None case by returning a clear error or skipping the
test with a logged message that includes the rule identifier (reference
test.language and rule.languages in rules.rs), or update parse_rule/RuleToml
validation to reject empty languages so the code can assume at least one
language exists.
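The `.first()` approach suggested above can be sketched as follows. `Rule` and `effective_language` are simplified stand-ins for the zift types referenced in the review, not the actual implementation:

```rust
// Illustrative rule shape; the real RuleToml/Rule structs differ.
#[derive(Debug)]
struct Rule {
    id: String,
    languages: Vec<String>,
}

// rule.languages[0] would panic on an empty vec; .first() returns an
// Option we can turn into a clear per-rule error instead.
fn effective_language<'a>(
    test_language: Option<&'a str>,
    rule: &'a Rule,
) -> Result<&'a str, String> {
    test_language
        .or_else(|| rule.languages.first().map(String::as_str))
        .ok_or_else(|| format!("rule '{}' declares no languages", rule.id))
}

fn main() {
    let rule = Rule {
        id: "ts-authorize-function-call".into(),
        languages: vec!["typescript".into()],
    };
    assert_eq!(effective_language(None, &rule), Ok("typescript"));

    // Empty languages now yields an error naming the rule, not a panic.
    let empty = Rule { id: "broken-rule".into(), languages: vec![] };
    assert!(effective_language(None, &empty).is_err());
    println!("ok");
}
```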
---
Duplicate comments:
In `@src/rego/templates.rs`:
- Around line 57-58: The ABAC template input.user.{{attribute}} == {{value}}
ends up with an unquoted identifier because render_template strips surrounding
quotes from substitutions; update render_template (or the code path that
prepares the ABAC template values) so that values intended as string literals
are preserved or re-quoted: either stop removing surrounding quotes for the
{{value}} placeholder when rendering the ABAC template, or detect the ABAC
equality pattern (input.user.{{attribute}} == {{value}}) and wrap the
substituted value in quotes (e.g., ensure "admin" remains "admin") before
returning the rendered string; change the logic in render_template and the code
that prepares the vars so that the {{value}} substitution yields a valid Rego
string literal rather than an identifier.
---
Nitpick comments:
In `@rules/typescript/authorize-function-call.toml`:
- Around line 27-49: Add an explicit JavaScript test case to the rule's test
matrix: within the authorize-function-call.toml file add a [[rule.tests]] block
with language = "javascript" and one of the existing inputs (e.g., the
hasPermission("orders:read") or authorize("admin") snippet) and the
corresponding expect_match value; locate the existing [[rule.tests]] entries
that use inputs with hasPermission, authorize, and hasRole to mirror one of
those blocks but include language = "javascript" so the JS path is explicitly
exercised.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e7e327b9-dbe1-4bf5-89e8-6a10c626d77a
📒 Files selected for processing (7)
- rules/typescript/authorize-function-call.toml
- rules/typescript/has-role-call.toml
- rules/typescript/ownership-check.toml
- rules/typescript/role-check-conditional.toml
- rules/typescript/role-includes-check.toml
- src/commands/rules.rs
- src/rego/templates.rs
✅ Files skipped from review due to trivial changes (1)
- rules/typescript/role-check-conditional.toml
🚧 Files skipped from review as they are similar to previous changes (3)
- rules/typescript/role-includes-check.toml
- rules/typescript/has-role-call.toml
- rules/typescript/ownership-check.toml
- Fix ABAC template rendering bare identifiers instead of quoted strings
- Fail fast for unsupported SARIF format before scanning
- Fix overbroad `default allow` prefix matching in grouping
- Enforce exactly one predicate operator per capture in rule parsing
- Include line numbers in finding ID hash to prevent collisions
- Register authorize-function-call rule in embedded rules
- Guard against panic on empty languages array in rule tests
- Prefix digit-leading segments in Rego package names with underscore
- Replace unimplemented! with Result in get_language()
♻️ Duplicate comments (1)
src/rego/grouping.rs (1)
Lines 72-76: ⚠️ Potential issue | 🟠 Major

Filter to `Component::Normal` to prevent output-path escape.

On line 74, only `ParentDir` is removed. `RootDir`/`Prefix` can survive, and `output_dir.join(safe_stem)` may ignore `output_dir` when `safe_stem` is absolute.

Suggested fix

```diff
-    let safe_stem: PathBuf = stem
-        .components()
-        .filter(|c| !matches!(c, std::path::Component::ParentDir))
-        .collect();
+    let safe_stem: PathBuf = stem
+        .components()
+        .filter_map(|c| match c {
+            std::path::Component::Normal(seg) => Some(seg),
+            _ => None, // drop ParentDir, CurDir, RootDir, Prefix
+        })
+        .collect();
```

```bash
#!/bin/bash
# Verify current sanitization logic and whether Finding.file can be absolute upstream.
rg -n -A20 -B5 --type rust 'fn derive_output_path' src/rego/grouping.rs

# Inspect where Finding.file is populated across the codebase.
rg -n -C3 --type rust '\bFinding\s*\{' src
rg -n -C3 --type rust '\bfile:\s*[^,]+' src

# Look for absolute/canonicalized path construction that could feed Finding.file.
rg -n -C3 --type rust 'canonicalize\(|current_dir\(|absolut|to_path_buf\(\)' src
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rego/grouping.rs` around lines 72 - 76, The current sanitization in derive_output_path collects Path components but only filters ParentDir, allowing RootDir/Prefix to survive and make safe_stem absolute; instead build safe_stem as a fresh PathBuf and iterate stem.components(), pushing only std::path::Component::Normal segments (ignore RootDir, Prefix, CurDir, ParentDir, etc.) so the resulting safe_stem is always relative before calling output_dir.join(safe_stem); update the code around variables stem, safe_stem and output_dir accordingly.
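The `Component::Normal` filtering suggested above behaves as sketched below. `sanitize_stem` is an illustrative stand-in for the logic inside zift's `derive_output_path`, not the actual function:

```rust
use std::path::{Component, Path, PathBuf};

// Keep only Normal components so the joined path can never escape the
// output directory (ParentDir, CurDir, RootDir, Prefix are all dropped).
fn sanitize_stem(stem: &Path) -> PathBuf {
    stem.components()
        .filter_map(|c| match c {
            Component::Normal(seg) => Some(seg),
            _ => None,
        })
        .collect()
}

fn main() {
    let out = Path::new("policies/generated");

    // Relative traversal and absolute inputs both collapse to safe stems.
    assert_eq!(
        sanitize_stem(Path::new("../../etc/passwd")),
        PathBuf::from("etc/passwd")
    );
    assert_eq!(
        sanitize_stem(Path::new("/etc/passwd")),
        PathBuf::from("etc/passwd")
    );

    // join() now always stays under the output directory, because the
    // sanitized stem is guaranteed to be relative.
    assert_eq!(
        out.join(sanitize_stem(Path::new("/src/auth.ts"))),
        PathBuf::from("policies/generated/src/auth.ts")
    );
    println!("ok");
}
```

The key point is that `Path::join` replaces its base entirely when handed an absolute path, which is why filtering only `ParentDir` is insufficient.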
🧹 Nitpick comments (3)
src/rules/mod.rs (1)
Lines 226-298: Unit tests adequately cover happy paths.

Consider adding error-case tests (invalid regex, multiple predicates, empty languages) in a future iteration for more comprehensive coverage.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rules/mod.rs` around lines 226 - 298, Add unit tests that cover error and edge cases in the tests module: create new #[test] functions that call parse_rule with malformed inputs (e.g., invalid regex in rule.predicates, conflicting/multiple predicate keys, empty or missing languages) and assert that parse_rule returns Err (or the expected error variant), and add tests for merge_rules behavior when given duplicate ids with incompatible fields; reference parse_rule, merge_rules, SAMPLE_RULE, and PatternRule to locate where to add these negative tests.

src/commands/rules.rs (1)
Lines 64-67: Consider returning an error instead of calling `exit(1)`.

Using `std::process::exit(1)` bypasses Rust's error handling and makes the function harder to test. Returning `Err(ZiftError::General(...))` would allow `main()` to handle exits uniformly.

♻️ Proposed refactor

```diff
 if errors == 0 {
     println!("All {} rules validated successfully.", loaded.len());
 } else {
     eprintln!("{errors} rule(s) failed validation.");
-    std::process::exit(1);
+    return Err(crate::error::ZiftError::General(format!(
+        "{errors} rule(s) failed validation"
+    )));
 }
```

```diff
 println!("{passed} passed, {failed} failed.");
 if failed > 0 {
-    std::process::exit(1);
+    return Err(crate::error::ZiftError::General(format!(
+        "{failed} rule test(s) failed"
+    )));
 }
```

Also applies to: lines 139-142
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/commands/rules.rs` around lines 64 - 67, The code currently calls std::process::exit(1) after printing validation failures (the eprintln!("{errors} rule(s) failed validation."); + std::process::exit(1)) which prevents normal error propagation; change these exits to return Err(ZiftError::General(format!(...))) with the same failure message (e.g., format!("{errors} rule(s) failed validation.")) and update the surrounding function signature to return Result<..., ZiftError> so callers (including main) can handle termination; apply the same replacement for the second occurrence referenced (lines ~139-142) and update any call sites to propagate or map the error instead of expecting the function to exit.src/rego/grouping.rs (1)
191-223: Add regression tests for new edge-case guards.Please add tests for (1) digit-leading package segments (
src/2024/auth.ts→app._2024.auth) and (2) output-path sanitization for non-normal components. This protects Lines 37-48 and Lines 72-76 from regressions.Based on learnings, the digit-leading segment guard in
derive_package_nameis required and should be regression-tested. As per coding guidelines "**/*.rs: Review code for missing test coverage on new code paths".Also applies to: 224-285
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rego/grouping.rs` around lines 191 - 223, Add two regression tests covering edge-case guards: one that verifies derive_package_name handles digit-leading path segments by prefixing an underscore (e.g., Path::new("src/2024/auth.ts") → "app._2024.auth") and another that verifies derive_output_path sanitizes non-normal path components so output paths remain under the intended output dir (e.g., inputs with "." or ".." or hyphenated components are normalized to safe components). Locate and extend the existing test module that contains derive_package_simple/derive_package_hyphenated and derive_output_path (functions derive_package_name and derive_output_path) and add assertions that exercise the digit-leading branch and the sanitization branch to prevent regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/rego/grouping.rs`:
- Around line 72-76: The current sanitization in derive_output_path collects
Path components but only filters ParentDir, allowing RootDir/Prefix to survive
and make safe_stem absolute; instead build safe_stem as a fresh PathBuf and
iterate stem.components(), pushing only std::path::Component::Normal segments
(ignore RootDir, Prefix, CurDir, ParentDir, etc.) so the resulting safe_stem is
always relative before calling output_dir.join(safe_stem); update the code
around variables stem, safe_stem and output_dir accordingly.
---
Nitpick comments:
In `@src/commands/rules.rs`:
- Around line 64-67: The code currently calls std::process::exit(1) after
printing validation failures (the eprintln!("{errors} rule(s) failed
validation."); + std::process::exit(1)) which prevents normal error propagation;
change these exits to return Err(ZiftError::General(format!(...))) with the same
failure message (e.g., format!("{errors} rule(s) failed validation.")) and
update the surrounding function signature to return Result<..., ZiftError> so
callers (including main) can handle termination; apply the same replacement for
the second occurrence referenced (lines ~139-142) and update any call sites to
propagate or map the error instead of expecting the function to exit.
In `@src/rego/grouping.rs`:
- Around line 191-223: Add two regression tests covering edge-case guards: one
that verifies derive_package_name handles digit-leading path segments by
prefixing an underscore (e.g., Path::new("src/2024/auth.ts") → "app._2024.auth")
and another that verifies derive_output_path sanitizes non-normal path
components so output paths remain under the intended output dir (e.g., inputs
with "." or ".." or hyphenated components are normalized to safe components).
Locate and extend the existing test module that contains
derive_package_simple/derive_package_hyphenated and derive_output_path
(functions derive_package_name and derive_output_path) and add assertions that
exercise the digit-leading branch and the sanitization branch to prevent
regressions.
In `@src/rules/mod.rs`:
- Around line 226-298: Add unit tests that cover error and edge cases in the
tests module: create new #[test] functions that call parse_rule with malformed
inputs (e.g., invalid regex in rule.predicates, conflicting/multiple predicate
keys, empty or missing languages) and assert that parse_rule returns Err (or the
expected error variant), and add tests for merge_rules behavior when given
duplicate ids with incompatible fields; reference parse_rule, merge_rules,
SAMPLE_RULE, and PatternRule to locate where to add these negative tests.
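The exit-to-`Result` refactor suggested for `src/commands/rules.rs` above can be sketched roughly as follows. `ZiftError` is modeled here as a plain enum and `report_validation` is a hypothetical helper, not the real zift function:

```rust
// Illustrative error type; the real ZiftError lives in src/error.rs
// and has more variants.
#[derive(Debug, PartialEq)]
enum ZiftError {
    General(String),
}

// Instead of printing and calling std::process::exit(1), return an Err
// so main() maps the failure to an exit code in one place and the
// function stays unit-testable.
fn report_validation(errors: usize, total: usize) -> Result<(), ZiftError> {
    if errors == 0 {
        println!("All {total} rules validated successfully.");
        Ok(())
    } else {
        eprintln!("{errors} rule(s) failed validation.");
        Err(ZiftError::General(format!(
            "{errors} rule(s) failed validation"
        )))
    }
}

fn main() {
    assert_eq!(report_validation(0, 12), Ok(()));
    assert!(report_validation(2, 12).is_err());
    println!("ok");
}
```

With this shape, `main()` (or clap's error path) owns process termination, and tests can assert on the returned `Result` directly.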
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 491061dd-571d-424f-83c8-c6fda69d75bd
📒 Files selected for processing (9)
- src/commands/rules.rs
- src/commands/scan.rs
- src/rego/grouping.rs
- src/rego/templates.rs
- src/rules/embedded.rs
- src/rules/mod.rs
- src/scanner/matcher.rs
- src/scanner/mod.rs
- src/scanner/parser.rs
✅ Files skipped from review due to trivial changes (1)
- src/scanner/mod.rs
🚧 Files skipped from review as they are similar to previous changes (5)
- src/commands/scan.rs
- src/rules/embedded.rs
- src/scanner/parser.rs
- src/rego/templates.rs
- src/scanner/matcher.rs
- Add tracing::warn for out-of-range capture indices instead of silent skip
- Strengthen path traversal protection: keep only Normal path components
- Add canonicalization check in extract to verify output stays in output dir
- Validator: use bare identifier for non-quoted placeholders so templates
with identifier-position vars (e.g. input.user.{{attr}}) validate correctly
- Matcher: execute_query now returns Result and errors on invalid capture
index instead of silently skipping (surfaces rule bugs)
- Extract: canonicalize parent directory before writing to prevent TOCTOU
issues with symlinked intermediate directories
Actionable comments posted: 1
♻️ Duplicate comments (3)
src/commands/extract.rs (1)
Lines 70-93: ⚠️ Potential issue | 🔴 Critical

Don't canonicalize the target file before it exists.

`rego_file.output_path.canonicalize()` at line 80 fails for newly generated files, so `extract` errors out on a fresh output directory before writing anything. Canonicalize an existing path such as `parent()` for the containment check, then write the file.

🔧 Proposed fix

```diff
-    let canonical_file = rego_file.output_path.canonicalize().map_err(|e| {
+    let parent = rego_file.output_path.parent().ok_or_else(|| {
+        ZiftError::General(format!(
+            "output path '{}' has no parent directory",
+            rego_file.output_path.display()
+        ))
+    })?;
+    let canonical_parent = parent.canonicalize().map_err(|e| {
         ZiftError::General(format!(
-            "failed to resolve output path '{}': {e}",
-            rego_file.output_path.display()
+            "failed to resolve output path parent '{}': {e}",
+            parent.display()
         ))
     })?;
-    if !canonical_file.starts_with(&canonical_output_dir) {
+    if !canonical_parent.starts_with(&canonical_output_dir) {
         return Err(ZiftError::General(format!(
             "output path '{}' escapes output directory '{}'",
             rego_file.output_path.display(),
```
Verify each finding against the current code and only fix it if needed. In `@src/commands/extract.rs` around lines 70 - 93, The code canonicalizes rego_file.output_path which fails for new files; instead canonicalize the output_dir and the resolved parent directory (rego_file.output_path.parent()) to verify containment, then write the file; update the block that computes canonical_file to canonicalize parent (e.g., use rego_file.output_path.parent().ok_or(...).canonicalize()), perform the starts_with check against canonical_output_dir, and only after that call std::fs::write(&rego_file.output_path, &rego_file.content)?.

src/rules/mod.rs (1)
Lines 88-89: ⚠️ Potential issue | 🟠 Major

Reject empty `languages` arrays during rule parsing.

A rule with `languages = []` still loads successfully here. `rules validate` then skips query compilation for it, and scans silently never apply it, so malformed external rules pass unnoticed. Fail fast in `parse_rule` when the list is empty.

🔧 Proposed fix

```diff
     let r = file.rule;
+
+    if r.languages.is_empty() {
+        return Err(ZiftError::RuleParse {
+            rule_id: r.id.clone(),
+            message: "rule must declare at least one language".into(),
+        });
+    }

     let mut predicates = Vec::new();
```

Also applies to: lines 134-152
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rules/mod.rs` around lines 88 - 89, The parser currently accepts rules with an empty languages array (e.g., when reading file.rule), which causes those rules to be ignored later; in parse_rule you should validate that the rule's languages list is non-empty and return a parse error (or Err) if it's empty so loading fails fast; locate the code paths that construct/return the parsed Rule (references: parse_rule, file.rule and the block around the similar check at lines ~134-152) and add a check that rejects languages == [] with a clear error message.

src/scanner/matcher.rs (1)
Line 101: ⚠️ Potential issue | 🟠 Major

Line-based finding IDs still collide for repeated matches on the same line.

Hashing `line_start`/`line_end` fixes cross-line duplicates, but it still collapses identical same-line matches like `hasRole("admin") || hasRole("admin")` into one finding. Use byte offsets or columns from `matched` instead.

🔧 Proposed fix

```diff
-    let id = compute_finding_id(&compiled.rule.id, file_path, line_start, line_end, &code_snippet);
+    let id = compute_finding_id(
+        &compiled.rule.id,
+        file_path,
+        matched.start_byte(),
+        matched.end_byte(),
+        &code_snippet,
+    );
@@
 fn compute_finding_id(
     rule_id: &str,
     file_path: &Path,
-    line_start: usize,
-    line_end: usize,
+    start_byte: usize,
+    end_byte: usize,
     snippet: &str,
 ) -> String {
@@
-    hasher.update(line_start.to_le_bytes());
-    hasher.update(line_end.to_le_bytes());
+    hasher.update(start_byte.to_le_bytes());
+    hasher.update(end_byte.to_le_bytes());
```

Also applies to: lines 157-170
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scanner/matcher.rs` at line 101, The current compute_finding_id call uses line_start/line_end which still collapses distinct same-line matches; change to use precise byte or column offsets from the match object (e.g., matched.start()/matched.end() or matched.byte_start/byte_end) so identical text on the same line produces distinct IDs. Update the call site(s) that invoke compute_finding_id (the call at compute_finding_id(&compiled.rule.id, file_path, line_start, line_end, &code_snippet) and the similar block around lines 157-170) to pass the match-derived offsets instead of line numbers, and adjust compute_finding_id’s parameter names/signature accordingly to accept and hash the byte/column start and end values. Ensure both occurrences are updated so IDs include the match-level offsets.
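The byte-offset identity scheme suggested above can be sketched as follows. zift hashes with SHA-256, but std's `DefaultHasher` stands in here so the example is self-contained; the function shape is illustrative, not the real `compute_finding_id`:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash rule id, file, byte range, and snippet into a finding ID.
// Byte offsets (unlike line numbers) distinguish repeated matches
// that sit on the same source line.
fn compute_finding_id(
    rule_id: &str,
    file_path: &str,
    start_byte: usize,
    end_byte: usize,
    snippet: &str,
) -> u64 {
    let mut h = DefaultHasher::new();
    rule_id.hash(&mut h);
    file_path.hash(&mut h);
    start_byte.hash(&mut h);
    end_byte.hash(&mut h);
    snippet.hash(&mut h);
    h.finish()
}

fn main() {
    // Two identical calls on one line: same rule, file, and snippet,
    // but different byte ranges, so the IDs no longer collide.
    let src = r#"if (hasRole("admin") || hasRole("admin")) {}"#;
    let first = src.find("hasRole").unwrap();
    let second = src.rfind("hasRole").unwrap();
    let len = "hasRole(\"admin\")".len();

    let a = compute_finding_id("ts-has-role-call", "src/auth.ts", first, first + len, "hasRole(\"admin\")");
    let b = compute_finding_id("ts-has-role-call", "src/auth.ts", second, second + len, "hasRole(\"admin\")");
    assert_ne!(a, b);
    println!("ok");
}
```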
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/commands/rules.rs`:
- Around line 41-42: The call to ts_parser::get_language(*lang, is_tsx_jsx) must
not use the ? operator because an unsupported language for one external rule
should not abort the whole run; instead, match on its Result inside the loops in
the RulesAction::Validate and RulesAction::Test blocks, treat Err(_) as a
per-rule failure by printing a FAIL for that rule/language and incrementing the
failed counter, then continue to the next variant/rule; keep all other logic the
same so only that error is handled locally rather than returned.
---
Duplicate comments:
In `@src/commands/extract.rs`:
- Around line 70-93: The code canonicalizes rego_file.output_path which fails
for new files; instead canonicalize the output_dir and the resolved parent
directory (rego_file.output_path.parent()) to verify containment, then write the
file; update the block that computes canonical_file to canonicalize parent
(e.g., use rego_file.output_path.parent().ok_or(...).canonicalize()), perform
the starts_with check against canonical_output_dir, and only after that call
std::fs::write(&rego_file.output_path, &rego_file.content)?.
In `@src/rules/mod.rs`:
- Around line 88-89: The parser currently accepts rules with an empty languages
array (e.g., when reading file.rule), which causes those rules to be ignored
later; in parse_rule you should validate that the rule's languages list is
non-empty and return a parse error (or Err) if it's empty so loading fails fast;
locate the code paths that construct/return the parsed Rule (references:
parse_rule, file.rule and the block around the similar check at lines ~134-152)
and add a check that rejects languages == [] with a clear error message.
In `@src/scanner/matcher.rs`:
- Line 101: The current compute_finding_id call uses line_start/line_end which
still collapses distinct same-line matches; change to use precise byte or column
offsets from the match object (e.g., matched.start()/matched.end() or
matched.byte_start/byte_end) so identical text on the same line produces
distinct IDs. Update the call site(s) that invoke compute_finding_id (the call
at compute_finding_id(&compiled.rule.id, file_path, line_start, line_end,
&code_snippet) and the similar block around lines 157-170) to pass the
match-derived offsets instead of line numbers, and adjust compute_finding_id’s
parameter names/signature accordingly to accept and hash the byte/column start
and end values. Ensure both occurrences are updated so IDs include the
match-level offsets.
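The extract-command fix above (create the parent, canonicalize it, check containment, then write) can be sketched as below. `checked_write` is a hypothetical stand-alone helper with a `String` error type, not the real zift code, which uses `ZiftError`:

```rust
use std::fs;
use std::path::Path;

// Canonicalize the parent directory (which exists after create_dir_all)
// rather than the not-yet-written target file, then verify containment
// before writing.
fn checked_write(output_dir: &Path, target: &Path, content: &str) -> Result<(), String> {
    let parent = target.parent().ok_or("output path has no parent")?;
    fs::create_dir_all(parent).map_err(|e| e.to_string())?;

    let canonical_dir = output_dir.canonicalize().map_err(|e| e.to_string())?;
    let canonical_parent = parent.canonicalize().map_err(|e| e.to_string())?;
    if !canonical_parent.starts_with(&canonical_dir) {
        return Err(format!("'{}' escapes output directory", target.display()));
    }
    fs::write(target, content).map_err(|e| e.to_string())
}

fn main() {
    let base = std::env::temp_dir().join("zift_extract_demo");
    fs::create_dir_all(&base).unwrap();

    // New file in a fresh subdirectory: succeeds even though the .rego
    // file itself does not exist yet (canonicalizing the file would fail).
    let ok = checked_write(&base, &base.join("src/auth.rego"), "package app.auth\n");
    assert!(ok.is_ok());

    // A path outside the output dir is rejected before anything is written.
    let escape = checked_write(&base, &std::env::temp_dir().join("zift_elsewhere/x.rego"), "x");
    assert!(escape.is_err());
    println!("ok");
}
```

Canonicalizing the existing parent also resolves symlinked intermediate directories, which is what closes the TOCTOU gap the commit message mentions.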
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 59bd5f4f-127b-497c-afb3-3042098575d7
📒 Files selected for processing (10)
- src/commands/extract.rs
- src/commands/rules.rs
- src/commands/scan.rs
- src/rego/grouping.rs
- src/rego/templates.rs
- src/rules/embedded.rs
- src/rules/mod.rs
- src/scanner/matcher.rs
- src/scanner/mod.rs
- src/scanner/parser.rs
✅ Files skipped from review due to trivial changes (1)
- src/rules/embedded.rs
🧹 Nitpick comments (4)
src/scanner/matcher.rs (1)
Lines 201-292: Good test coverage for core matching behavior.

The tests cover positive matches, false positive rejection, and deduplication. Consider adding tests for `filter_findings` (confidence/category filtering) and `check_predicates` with `NotMatch`/`NotEq` variants to increase coverage of edge cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scanner/matcher.rs` around lines 201 - 292, Tests currently miss coverage for filter_findings and predicate negations; add unit tests to exercise confidence/category filtering and check_predicates behavior for NotMatch and NotEq. Create new tests in the tests module that call parse_and_match or construct Findings directly and then invoke filter_findings with various confidence levels and categories to assert inclusion/exclusion, and call check_predicates (or trigger rules compiled by compile_rule that use NotMatch/NotEq predicates) to assert they correctly invert matches; reference functions dedup_findings, filter_findings, and check_predicates in your assertions so the tests fail if logic changes.

src/scanner/mod.rs (1)
Lines 148-154: Silent failure on invalid `min_confidence` config value.

If `config.scan.min_confidence` contains an invalid string (e.g., `"invalid"`), `.parse().ok()` silently returns `None`, effectively disabling the confidence filter without any warning. Consider logging a warning when the parse fails so users know their config is ignored.

♻️ Suggested improvement

```diff
 let min_confidence = args.confidence.or_else(|| {
     config
         .scan
         .min_confidence
         .as_deref()
-        .and_then(|s| s.parse().ok())
+        .and_then(|s| {
+            s.parse()
+                .map_err(|_| {
+                    tracing::warn!("invalid min_confidence '{}' in config, ignoring", s);
+                })
+                .ok()
+        })
 });
```
Verify each finding against the current code and only fix it if needed. In `@src/scanner/mod.rs` around lines 148 - 154, The current min_confidence computation silently ignores parse errors because it uses .parse().ok(); update the logic around min_confidence (where args.confidence is combined with config.scan.min_confidence) to attempt parsing with .parse() and, on Err, emit a warning log that includes the invalid string and that the config value was ignored, then fall back to None; specifically locate the block assigning min_confidence and replace the closure to match on config.scan.min_confidence.as_deref().map(|s| s.parse::<f32>()) (or equivalent), log on Err (using the module logger or processLogger used elsewhere), and return None on parse failure.

src/commands/extract.rs (2)
Lines 19-32: Consider documenting the default-value coupling with `cli.rs`.

The config fallback logic compares against hardcoded strings (`"app"`, `"./policies/generated"`) that must match the `default_value` attributes in `ExtractArgs`. If those defaults drift, config overrides silently stop working.

A brief comment noting this dependency would help future maintainers, or extract the defaults into shared constants.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/commands/extract.rs` around lines 19 - 32, The fallback logic for package_prefix and output_dir in extract.rs currently compares against hardcoded literals ("app" and "./policies/generated") tied to the ExtractArgs defaults (args.package_prefix and args.output_dir), which can break silently if those defaults change; update this by either (a) adding a short comment above the package_prefix and output_dir blocks referencing the dependency on ExtractArgs defaults (args.package_prefix, args.output_dir) so maintainers know to change both places, or (b) move the default values into shared constants (e.g., DEFAULT_PACKAGE_PREFIX, DEFAULT_OUTPUT_DIR) in the CLI/ExtractArgs module and use those constants here when calling .filter(...) so the default is single-sourced and avoids drift.
Lines 76-81: Hoist `output_dir.canonicalize()` out of the loop.

`canonical_output_dir` is recomputed on every iteration but `output_dir` doesn't change. Move this before the loop to avoid redundant filesystem calls.

♻️ Proposed refactor

```diff
 // Group and generate files
 let rego_files = rego::group_findings(&findings, package_prefix, &output_dir);

+// Canonicalize output_dir once for path containment checks
+let canonical_output_dir = output_dir.canonicalize().map_err(|e| {
+    ZiftError::General(format!(
+        "failed to resolve output dir '{}': {e}",
+        output_dir.display()
+    ))
+})?;
+
 // Write files and validate
 let mut total_files = 0;
 let mut validation_warnings = 0;
 for rego_file in &rego_files {
     // ... existing code ...
     if let Some(parent) = rego_file.output_path.parent() {
         std::fs::create_dir_all(parent)?;
     }

-    // Verify the resolved output path stays within the output directory
-    // by canonicalizing the parent (which now exists) and checking containment
-    // BEFORE writing, to prevent TOCTOU issues with symlinked directories.
-    let canonical_output_dir = output_dir.canonicalize().map_err(|e| {
-        ZiftError::General(format!(
-            "failed to resolve output dir '{}': {e}",
-            output_dir.display()
-        ))
-    })?;
+    // Verify the resolved output path stays within the output directory
     let parent = rego_file.output_path.parent().ok_or_else(|| {
```
Verify each finding against the current code and only fix it if needed. In `@src/commands/extract.rs` around lines 76 - 81, Move the call to output_dir.canonicalize() out of the repeated loop: compute canonical_output_dir once before entering the loop by calling output_dir.canonicalize().map_err(...) (preserving the existing ZiftError::General message), then remove the same canonicalize call from inside the loop and use the precomputed canonical_output_dir variable where needed; this avoids redundant filesystem calls and preserves the existing error handling and variable name canonical_output_dir in src/commands/extract.rs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/commands/extract.rs`:
- Around line 19-32: The fallback logic for package_prefix and output_dir in
extract.rs currently compares against hardcoded literals ("app" and
"./policies/generated") tied to the ExtractArgs defaults (args.package_prefix
and args.output_dir), which can break silently if those defaults change; update
this by either (a) adding a short comment above the package_prefix and
output_dir blocks referencing the dependency on ExtractArgs defaults
(args.package_prefix, args.output_dir) so maintainers know to change both
places, or (b) move the default values into shared constants (e.g.,
DEFAULT_PACKAGE_PREFIX, DEFAULT_OUTPUT_DIR) in the CLI/ExtractArgs module and
use those constants here when calling .filter(...) so the default is
single-sourced and avoids drift.
- Around line 76-81: Move the call to output_dir.canonicalize() out of the
repeated loop: compute canonical_output_dir once before entering the loop by
calling output_dir.canonicalize().map_err(...) (preserving the existing
ZiftError::General message), then remove the same canonicalize call from inside
the loop and use the precomputed canonical_output_dir variable where needed;
this avoids redundant filesystem calls and preserves the existing error handling
and variable name canonical_output_dir in src/commands/extract.rs.
In `@src/scanner/matcher.rs`:
- Around line 201-292: Tests currently miss coverage for filter_findings and
predicate negations; add unit tests to exercise confidence/category filtering
and check_predicates behavior for NotMatch and NotEq. Create new tests in the
tests module that call parse_and_match or construct Findings directly and then
invoke filter_findings with various confidence levels and categories to assert
inclusion/exclusion, and call check_predicates (or trigger rules compiled by
compile_rule that use NotMatch/NotEq predicates) to assert they correctly invert
matches; reference functions dedup_findings, filter_findings, and
check_predicates in your assertions so the tests fail if logic changes.
In `@src/scanner/mod.rs`:
- Around line 148-154: The current min_confidence computation silently ignores
parse errors because it uses .parse().ok(); update the logic around
min_confidence (where args.confidence is combined with
config.scan.min_confidence) to attempt parsing with .parse() and, on Err, emit a
warning log that includes the invalid string and that the config value was
ignored, then fall back to None; specifically locate the block assigning
min_confidence and replace the closure to match on
config.scan.min_confidence.as_deref().map(|s| s.parse::<f32>()) (or equivalent),
log on Err (using the module logger or processLogger used elsewhere), and return
None on parse failure.
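The warn-on-invalid-config suggestion for `min_confidence` can be sketched as below. This models confidence as an `f32` (following the prompt's `parse::<f32>()` example) and substitutes `eprintln!` for `tracing::warn!` so the sketch needs only the standard library; the real zift type and logging may differ:

```rust
// CLI value wins; otherwise parse the config string, warning (instead of
// silently returning None) when it is not a valid number.
fn parse_min_confidence(cli: Option<f32>, config: Option<&str>) -> Option<f32> {
    cli.or_else(|| {
        config.and_then(|s| match s.parse::<f32>() {
            Ok(v) => Some(v),
            Err(_) => {
                // Surface the bad value instead of silently disabling the filter.
                eprintln!("warning: invalid min_confidence '{s}' in config, ignoring");
                None
            }
        })
    })
}

fn main() {
    assert_eq!(parse_min_confidence(Some(0.9), Some("0.5")), Some(0.9)); // CLI wins
    assert_eq!(parse_min_confidence(None, Some("0.5")), Some(0.5));
    assert_eq!(parse_min_confidence(None, Some("invalid")), None); // warns, then None
    println!("ok");
}
```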
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: de492fef-16f4-4f0a-923f-537d316df2d0
📒 Files selected for processing (5)
- src/commands/extract.rs
- src/commands/rules.rs
- src/rego/validator.rs
- src/scanner/matcher.rs
- src/scanner/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/rego/validator.rs
- src/commands/rules.rs
Summary
Initial implementation of the zift CLI tool — scans codebases for embedded authorization logic and generates Rego policies for OPA.
Changes
- `scan`, `extract`, `rules`, `init` commands; bare `zift .` defaults to scan
- `extract` command groups findings by source file into per-file `.rego` packages with template interpolation from tree-sitter captures
- `regorus` crate to validate generated policies and rule templates
- `--rules-dir` or config

Scan output example (EA codebase)
Testing
🤖 Generated with Claude Code