ci: harden Security workflow — Kani fails loud + deep-fuzz pool starvation fix#44
Conversation
The Kani Formal Verification job was failing open: every command had `|| echo "...skipping"` and the verify step had `continue-on-error: true`, so the job stayed green while verifying nothing. Combined with a stale toolchain pin, it had silently stopped running entirely: cargo install --locked kani-verifier -> rustc 1.85.1 is not supported (home 0.5.12 requires rustc 1.88) There are 11 real proof harnesses (byte_storage.rs, encryption/core.rs), so this is meaningful verification that was providing false assurance. - toolchain "1.85" -> stable (Kani's installer needs >=1.88; Kani downloads its own pinned verification toolchain in `cargo kani setup`). - cargo install ... --force (idempotent on the runner's persistent cargo cache). - Drop all `|| echo` swallows and `continue-on-error` — a failed install, setup, or proof now turns CI red. - Add workflow_dispatch so the schedule-only job can be run on demand (and so this fix can actually be verified now).
|
Warning Review limit reached
More reviews will be available in 10 minutes and 49 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds a ChangesSecurity Workflow Enhancement
🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/security.yml:
- Around line 210-213: The Kani job currently relies on default permissions;
update the GitHub Actions job named "kani" to declare explicit least-privilege
permissions by adding a permissions block (e.g., permissions: contents: read)
under the job definition so it only has read access to repository contents;
modify the "kani" job in the workflow to include that permissions block and
remove reliance on defaults.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f5d88ab1-8e99-45ba-86a0-aece08314e09
📒 Files selected for processing (1)
.github/workflows/security.yml
Per CodeRabbit/zizmor review on #44: the kani job relied on default permissions. It only checks out and runs verification, so scope it to contents: read.
|
@coderabbitai Added |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
gh pr view 44 --repo cachekit-io/cachekit-core --json headRefName,commits | jq .
cat .github/workflows/security.ymlLength of output: 14313
One note on the last point: scoping No remaining issues. LGTM. 🐇 ✏️ Learnings added
🧠 Learnings used |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/security.yml (1)
221-221:⚠️ Potential issue | 🟡 MinorFix dtolnay/rust-toolchain pin comment in security workflow
.github/workflows/security.ymlline 221 pinsdtolnay/rust-toolchainto a full 40-char SHA, but the trailing comment is# master, which violates the requirement for a version comment. That commit (3c5f7ea28cd621ae0bf5283f0e981fb97b8a7af9) corresponds to adding the Rust 1.94.1 patch release, so update the comment accordingly (e.g.# Rust 1.94.1).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/security.yml at line 221, Update the trailing comment for the pinned action reference `uses: dtolnay/rust-toolchain@3c5f7ea28cd621ae0bf5283f0e981fb97b8a7af9` in the security workflow to reflect the actual release this commit represents (replace `# master` with `# Rust 1.94.1` or equivalent), so the version pin comment accurately documents the pinned commit.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In @.github/workflows/security.yml:
- Line 221: Update the trailing comment for the pinned action reference `uses:
dtolnay/rust-toolchain@3c5f7ea28cd621ae0bf5283f0e981fb97b8a7af9` in the security
workflow to reflect the actual release this commit represents (replace `#
master` with `# Rust 1.94.1` or equivalent), so the version pin comment
accurately documents the pinned commit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 64c866c6-df20-4a6d-8d22-423ab183d825
📒 Files selected for processing (1)
.github/workflows/security.yml
The weekly deep-fuzz matrix (16 targets × 8h) shared the `cachekit` ARC pool with required PR CI and, at ~4 concurrent, monopolised it for ~32h every Saturday — starving the required `security` and `stable / ubuntu-latest` checks so no PR could merge until it drained. - Scheduled run: 1h/target (~16 runner-hours, drains in a few hours overnight). - On-demand: `workflow_dispatch` with `run_deep_fuzz=true` runs the full matrix at `fuzz_seconds` per target (default 8h). A plain dispatch does NOT trigger deep-fuzz, so verifying Kani on demand stays cheap. - Duration is driven by FUZZ_SECONDS (schedule=3600, dispatch=input); the timeout buffer logic is unchanged.
|
Expanded scope (same file, all Security-workflow CI reliability): Deep-fuzz pool starvation — the weekly 16×8h matrix monopolised the shared
@coderabbitai please re-review. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
gh pr view 44 --repo cachekit-io/cachekit-core --json headRefName,commits | jq .
gh pr diff 44 --repo cachekit-io/cachekit-core
cat .github/workflows/security.ymlLength of output: 22970
One observation worth noting: a No remaining issues. LGTM. 🐇 ✏️ Learnings added
🧠 Learnings used |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/security.yml:
- Around line 208-220: The workflow currently assigns FUZZ_SECONDS from
inputs.fuzz_seconds and uses it in shell arithmetic and as a libFuzzer argument
without validation, which can break timeout arithmetic or pass invalid values to
cargo fuzz run; add a validation step before using FUZZ_SECONDS that ensures
inputs.fuzz_seconds is a non-empty string of digits and greater than zero
(falling back to the existing default '3600' if invalid), and only then export
FUZZ_SECONDS for use in timeout "$((FUZZ_SECONDS + 180))" and the cargo fuzz run
-- -max_total_time="$FUZZ_SECONDS" invocation; reference the FUZZ_SECONDS
variable and the timeout / cargo fuzz run commands so the check runs immediately
after setting FUZZ_SECONDS and before those commands execute.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5f6d78c6-bb7b-4014-8356-572287ccada4
📒 Files selected for processing (1)
.github/workflows/security.yml
Per CodeRabbit review on #44: fuzz_seconds (workflow_dispatch string input) flowed unguarded into $((FUZZ_SECONDS + 180)) and -max_total_time. Reject non-positive-integer values and cap at 32400s (the 540min job timeout) before use, so a bad input fails fast with a clear message instead of breaking the arithmetic or the fuzzer.
|
@coderabbitai Added a positive-integer + range guard (1..32400s, the 540min job cap) for |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
gh pr diff 44 --repo cachekit-io/cachekit-core
cat .github/workflows/security.ymlLength of output: 20686
One edge note (not a defect): No remaining issues. LGTM. 🐇 ✏️ Learnings added
🧠 Learnings used |
|
@coderabbitai approve |
✅ Action performedComments resolved and changes approved. |
What
Make the Kani Formal Verification job fail loudly and fix its toolchain.
The problem (two bugs, compounding)
The job was failing open — green CI while verifying nothing:
|| echo "...skipping"continue-on-error: true…and on top of that it had silently stopped running at all:
There are 11 real proof harnesses (
src/byte_storage.rs,src/encryption/core.rs), so this is meaningful verification that was providing false assurance.Fix
toolchain: "1.85"→stablehome@0.5.12). Kani downloads its own pinned verification toolchain incargo kani setup, so the host toolchain doesn't need to be the project MSRV.cargo install --locked --force kani-verifier|| echo+continue-on-error: trueworkflow_dispatchschedule-only; this lets it be run on demand (and lets this fix be verified now, not next Saturday).ci:type — non-releasing. After merge I'll dispatch the Security workflow to confirm the 11 proofs actually run and pass.Summary by CodeRabbit