feat(audit)!: consolidate audit log JSON entries into command_evaluations#333
Conversation
intent(audit): let audit consumers filter by the real binary (e.g. `helmfile`) without re-implementing shell tokenisation in jq, which currently mishandles inline env prefixes like `FOO=x helmfile ...` and matches command names in `--prompt`-style argument bodies. decision(audit): expose runok's existing `ExtractedCommand` shape (env / argv / redirects / pipe) verbatim. The data is already computed during evaluation; serialising it is the cheap path. decision(audit): top-level `parsed` for single commands, per-branch `sub_evaluations[].parsed` for compound. Single inputs stay one-hop reachable from `.parsed.argv[0]`; compound inputs already forced consumers to walk `sub_evaluations`, so duplicating the data at the top would only add noise. rejected(audit): unifying single + compound by always populating `sub_evaluations` with one entry — simpler jq but breaks every existing consumer that branches on `sub_evaluations == null`. rejected(audit): emitting `binary` / `subcommand` as separate fields — subcommand position varies per CLI (`git -C dir status` has it at argv[3]) and runok would have to ship a per-CLI shaping table that drifts from upstream. constraint(audit): backwards compat is preserved via `skip_serializing_if` on every new optional field; old jq filters that don't reference `.parsed` see no schema change.
There was a problem hiding this comment.
Code Review
This pull request introduces a structured parsed field to audit log entries, providing shell-level tokenization results—including environment variables, arguments, redirects, and pipeline status—directly in the JSON output. This allows audit consumers to filter by binary or arguments without re-implementing shell parsing. The implementation updates the command extraction logic using tree-sitter, adds new serializable models, and includes comprehensive integration tests and documentation. Feedback was provided to refactor the new roundtrip tests into an existing parameterized test suite to adhere to the repository's style guide regarding test parameterization and naming.
intent(audit): collapse the single-vs-compound branching consumers had to write (top-level `parsed` / `matched_rules` for non-compound inputs vs. `sub_evaluations[].parsed` for compound) so a one-line jq filter on the actual binary works for both shapes. decision(audit): drop top-level `matched_rules`, `sub_evaluations`, and the just-added optional `parsed`. Surface a flat `command_evaluations: [CommandEvaluation]` instead — one entry for non-compound (`eval_type: "primary"`), one entry per branch for compound (`eval_type: "compound"`), and an empty array when the input has no runnable command (comment-only / parse failure). decision(audit): inline `env` / `argv` / `redirects` / `pipe` directly onto `CommandEvaluation` rather than nesting them under a `parsed` sub-object — consumers were already going to read `command_evaluations[].argv[0]`, so the extra hop bought nothing. rejected(audit): keep both shapes side by side under a feature flag. Both shapes describe the same data with different conventions; carrying them long-term doubles the surface for consumers and the tests that pin the wire format. constraint(audit): runok 0.x — breaking the audit JSON shape is permitted, but `releases/next.md` carries the migration so jq consumers know to pivot through `.command_evaluations[]`.
intent(audit): catch typos in `"primary"` / `"compound"` literals at compile time. The audit log JSON shape stays identical thanks to `#[serde(rename_all = "lowercase")]`, so consumers see no change. decision(audit): mirror the precedent set by `SerializableAction`, which also exposes a Rust enum and serialises as lowercase strings.
…d-field # Conflicts: # src/rules/command_parser.rs
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #333 +/- ##
==========================================
+ Coverage 88.74% 88.77% +0.03%
==========================================
Files 53 53
Lines 12163 12314 +151
==========================================
+ Hits 10794 10932 +138
- Misses 1369 1382 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
intent(releases): match the PR-link convention the rest of the Highlights / Bug Fixes entries already follow, so readers can jump from the changelog to the PR for the full discussion.
…d-field # Conflicts: # docs/src/content/docs/releases/next.md
intent(releases): keep the release notes focused on the schema change runok actually owns. The `jq` Before/After examples drifted into telling consumers how to query their own audit pipeline, which the linked `cli/audit` reference already covers.
…ation intent(audit): pre-#333 logs (top-level \`matched_rules\` plus \`sub_evaluations\`) were being skipped by \`runok audit\` with \`missing field 'command_evaluations'\`, hiding everything users recorded before they upgraded. decision(audit): teach \`AuditEntry\` to deserialise via a shadow \`RawAuditEntry\` that reads both shapes, projecting the legacy one into \`command_evaluations\` (compound -> per-branch \`Compound\` records, non-compound -> a synthesised \`Primary\` record) before construction. constraint(audit): only the read path changes; serialisation still emits the current schema, so old logs surface in the new shape without runok itself ever writing the legacy form again.
Purpose
jq. Filtering on the rawcommandstring mismatches on inline env prefixes (FOO=x helmfile ...) and on binary names buried inside argument bodies such as--prompta && betc.) commands, so consumers must branch on whether to read top-levelmatched_rulesorsub_evaluations[].matched_rulesApproach
AuditEntry.matched_rulesandsub_evaluations; collapse them into a singlecommand_evaluations: [CommandEvaluation]command, action, matched_rules, eval_type, env, argv, redirects, pipeflat onCommandEvaluationso the shell-level parse data sits at the same level as the rule evaluation resulteval_type: "primary"); compound input produces one element per branch (eval_type: "compound"); input with no runnable command (comment-only, parse failure) produces an empty array{ "command": "FOO=x echo hi && BAR=y cat /tmp/f", "command_evaluations": [ { "command": "echo hi", "action": { "type": "allow" }, "eval_type": "compound", "env": [{ "name": "FOO", "value": "x" }], "argv": ["echo", "hi"] }, { "command": "cat /tmp/f", "action": { "type": "allow" }, "eval_type": "compound", "env": [{ "name": "BAR", "value": "y" }], "argv": ["cat", "/tmp/f"] } ] }Design decisions
Whether single and compound entries share one shape
command_evaluations: []shapejqpath (.command_evaluations[]) covers both cases, eliminating the single/compound branchmatched_rules/sub_evaluationsfields directlyparsedfor single inputs, scatter per-branch parse data undersub_evaluations[].parsedfor compoundORbetween top-level andsub_evaluationspaths to handle both casesWhether runok names
binary/subcommandargv[0]themselvesargv[0]binary/subcommandfieldsjqbecomesparsed.subcommandgit -C dir statusputs it atargv[3]); runok would need a per-CLI tableBreaking changes
matched_rulesandsub_evaluationskeys onAuditEntryJSON are removed; their contents move into eachcommand_evaluations[]element