feat(cicd_rules): :known_fake_action_sha rule — block partial-prefix-corruption fakes #397
Merged
…fix-corruption fakes

Complement to the existing `:unpinned_action` rule. Where `:unpinned_action` catches non-SHA pins (`@v1`, `@main`, etc.), this catches the opposite failure mode: SHA-shaped pins that don't actually exist on the upstream action repo (`gh api commits/<sha> -> 422`).

## How these are getting in

Estate audit on 2026-05-30 found 67 fake action SHA pairs across ~50 repos (11% fabrication rate across 372 unique pins). Universal pattern: **partial-prefix corruption**. The first 8-20 hex chars match a real release's SHA exactly, then the suffix is fabricated. Examples:

    fake: 7ab2955eb728f5440978d7b4f723a50dea1f3608
    real: 7ab2955eb728f5440978d5824358023be3a2802d (setup-zig v2.2.0)

    fake: 49933ea5288caeca8642195f2b846b8bbe245a93
    real: 49933ea5288caeca8642d1e84afbd3f7d6820020 (setup-node v4.4.0)

    fake: 909cc5acb0135c37a79510dd77767e217930de55
    real: 909cc5acb0fdd60627fb858598759246509fa755 (setup-deno v2.0.2)

The corruption pattern is consistent enough that it's almost certainly a single AI-hallucination event that got copy-pasted across the estate.

## What this rule does

Static-list pattern enumerating the 25 known fake SHAs from the 2026-05-30 audit. Caught at scan time, blocked from new code, and the reason field directs to the substitution map in `project_estate_fake_action_sha_punch_list_2026_05_30.md` (memory).

## Tested

Verified via inline Elixir tests (in transcript):

- 5/5 known fakes correctly matched
- 3/3 real SHAs correctly NOT matched (no false positives on real v4.6.2 upload-artifact, real v4.4.0 setup-node, real checkout v4)

## Out of scope

For proactive detection of FUTURE fakes (not in the 25-entry static list), a `mix hypatia.verify_action_shas` task is the right shape: it requires network access to `gh api` and can't be a static-regex rule. Documented as follow-up in the punch-list memory.
Provenance: discovered while wiring hyperpolymath/snifs#30 build-mode CI gate; the static-rule design intent was previously captured in `feedback_verify_action_sha_pins` memory ("Consider adding a complementary rule that VERIFIES SHA pins resolve upstream — would have caught all four fakes above").
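
To make the static-list idea concrete, here is a minimal sketch in Python rather than hypatia's Elixir. The names `KNOWN_FAKE_SHAS`, `USES_PIN`, and `find_known_fakes` are hypothetical, the set holds only the three fakes quoted above (not the full 25-entry list), and the regex is an assumption about how `uses:` pins typically appear in workflow files.

```python
import re

# Illustrative subset: the three fake SHAs quoted in the examples above.
KNOWN_FAKE_SHAS = {
    "7ab2955eb728f5440978d7b4f723a50dea1f3608",
    "49933ea5288caeca8642195f2b846b8bbe245a93",
    "909cc5acb0135c37a79510dd77767e217930de55",
}

# Assumed pin shape: `uses: owner/repo[/path]@<40 lowercase hex chars>`.
USES_PIN = re.compile(r"uses:\s*([\w.-]+/[\w./-]+)@([0-9a-f]{40})\b")


def find_known_fakes(workflow_text):
    """Return (line_number, action, sha) for each pin whose SHA is a known fake."""
    hits = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        for action, sha in USES_PIN.findall(line):
            if sha in KNOWN_FAKE_SHAS:
                hits.append((lineno, action, sha))
    return hits
```

Because the check is a set lookup on the full 40-character SHA, a real SHA that shares a prefix with a fake is never flagged, which matches the "no false positives" result reported above.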
hyperpolymath added a commit that referenced this pull request on May 30, 2026
…via gh api (#399)

## Summary

Companion to the static-list `:known_fake_action_sha` rule (PR #397). Where that rule blocks the 25 fakes known at audit time, this mix task catches FUTURE fakes by actually calling `gh api repos/<org>/<action>/commits/<sha>` against every unique action pin in the estate.

## Why this is needed

The static rule covers KNOWN fakes. But the next AI-hallucination event will introduce DIFFERENT SHAs that don't match the static list. To catch those, we have to actually verify each pin against upstream. That needs network access, so it can't be a pure-pattern scan.

Together, #397 + this PR close the "hypatia is a waste of time if these issues persist" concern:

- **#397 (static)**: instant detection of the 25 known fakes
- **this task (dynamic)**: periodic audit catches new fabrications

## Design

Per hypatia's CLAUDE.md scanner-hygiene guidance, `System.cmd("gh", ["api", ...])` shell-out instead of introducing an HTTP-client dep. Reuses existing `gh` auth: no new auth surface, 5000 req/hr rate limit (plenty for 372 pins).

### Cache

`data/verified-action-shas.json` stores `{ref}@{sha}` → `"real"|"fake"` so subsequent runs only verify NEW pins:

- Cold: ~5 minutes (372 pins)
- Warm: ~5 seconds (deltas only)

### `--paranoid` mode

Re-verifies cached real SHAs. Run periodically to catch the rare case where upstream commits a revert that breaks a previously-verified SHA.

### Throttling

50ms inter-call sleep gives ~20 req/sec, well under rate limit even without auth cap.

## Usage

```bash
mix hypatia.verify_action_shas                 # default text output
mix hypatia.verify_action_shas --paranoid      # re-verify cache
mix hypatia.verify_action_shas --format json   # tooling integration
```

## Exit codes

- 0: verification complete, zero fakes
- 2: fakes found (CI can use as hard gate)
- 1: hard failure (bad args, gh missing, cache unwritable)

## Out of scope (follow-up)

- Wire into `LearningScheduler` GenServer for periodic auto-audits
- Plumb through `Hypatia.Safety.RateLimiter` for fleet-wide concurrent runs

## Provenance

Discovered the need during the snifs#30 build-mode arc / 2026-05-30 estate audit. Design intent captured in `feedback_verify_action_sha_pins` memory ("Consider adding a complementary rule that VERIFIES SHA pins resolve upstream").

## Test plan

- [ ] `mix hypatia.verify_action_shas --format json` runs to completion
- [ ] Reports the known 25 fakes when run before round-2 sweep merges
- [ ] Reports zero fakes after round-2 sweep + #397 land
- [ ] Cache persists across runs (warm path << cold path)
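
The cache, `--paranoid`, throttling, and exit-code behaviour described above can be sketched as follows. This is a Python illustration under stated assumptions, not the mix task itself: `gh_commit_exists`, `verify_pins`, and `exit_code` are hypothetical names, each pin is assumed to be an `(owner/repo, sha)` tuple, and any non-zero `gh` exit is treated as "fake", which is a simplification because a network failure would look the same.

```python
import json
import subprocess
import time
from pathlib import Path


def gh_commit_exists(action, sha):
    """True if `gh api repos/<org>/<action>/commits/<sha>` exits with status 0."""
    result = subprocess.run(
        ["gh", "api", f"repos/{action}/commits/{sha}"],
        capture_output=True,
    )
    return result.returncode == 0


def verify_pins(pins, cache_path, check=gh_commit_exists, paranoid=False, delay=0.05):
    """Verify (action, sha) pins against a JSON cache; return the pins judged fake."""
    cache_file = Path(cache_path)
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    fakes = []
    for action, sha in pins:
        key = f"{action}@{sha}"
        status = cache.get(key)
        # Call upstream for unseen pins, or for cached "real" pins in paranoid mode.
        if status is None or (paranoid and status == "real"):
            status = "real" if check(action, sha) else "fake"
            cache[key] = status
            time.sleep(delay)  # throttle between upstream calls
        if status == "fake":
            fakes.append((action, sha))
    cache_file.write_text(json.dumps(cache, indent=2, sort_keys=True))
    return fakes


def exit_code(fakes):
    """Map the result to the documented exit codes: 0 when clean, 2 when fakes exist."""
    return 2 if fakes else 0
```

Passing `check` as a parameter keeps the upstream call swappable, so the cache logic can be exercised without network access or `gh` auth.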
hyperpolymath added a commit that referenced this pull request on May 30, 2026
## Summary

Wires `mix hypatia.verify_action_shas` (#399) into `LearningScheduler` so it runs as a daily auto-audit, not just on manual invocation. Completes the auto-defense story alongside `:known_fake_action_sha` static rule (#397).

## How fake SHAs get caught now

| Mechanism | Latency | Surface |
|---|---|---|
| `:known_fake_action_sha` rule (#397) | instant | static scan; catches the 25 known fakes immediately |
| `mix hypatia.verify_action_shas` (#399) | manual | one-off audit; catches ANY 422-returning SHA |
| **This PR: LearningScheduler integration** | daily | automatic estate-wide audit, no manual run needed |

## Design

### Cadence

24h via mtime check on the verifier's cache file (`data/verified-action-shas.json`). Subsequent runs are ~5 sec (only checks new pins), cheap enough that daily costs nothing meaningful.

### Isolation

Spawned as `System.cmd` subprocess via `Task.start`, NOT inline in the GenServer:

- The mix task's `exit({:shutdown, 2})` on fakes-found would crash `LearningScheduler` if run in-process. Subprocess isolates that.
- `Task.start` (not `Task.async`): fire-and-forget; the learning cycle never waits on the gh-api walk.
- Exceptions in the verification path are logged but never bubble up.

### Logging

- Clean (zero fakes): info-level confirmation
- Fakes found (exit 2): warning with clipped sample of output
- Other exit codes: warning with stderr clip

## What's complete now

The auto-defense story is closed:

- New fake committed → static rule catches it at next scan
- New fake somehow bypasses static scan → daily audit catches it within 24h
- Cache keeps the daily audit cheap (~5 sec warm)
- Zero manual intervention required for ongoing protection

## Provenance

Closes the "hypatia is a waste of time" concern raised during the 2026-05-30 snifs#30 / fake-SHA arc.

## Test plan

- [ ] Module compiles + LearningScheduler still starts cleanly
- [ ] First cycle after startup runs the verification (cache absent → `action_sha_verify_stale?` returns true)
- [ ] Subsequent cycles within 24h skip the verification (cache fresh)
- [ ] After 24h, next cycle runs verification again
- [ ] Verification exit 2 (fakes found) logs warning but doesn't crash scheduler
- [ ] Verification crash logs warning but doesn't crash scheduler
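
The cadence, isolation, and logging design above can be illustrated with a short Python sketch. It is an analogue, not the `LearningScheduler` code: a daemon thread stands in for `Task.start`, `subprocess.run` stands in for `System.cmd`, and `verify_stale`, `classify`, and `run_audit_in_background` are hypothetical names.

```python
import os
import subprocess
import threading
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # daily cadence


def verify_stale(cache_path, now=None, max_age=MAX_AGE_SECONDS):
    """True when the cache file is missing or its mtime is at least max_age old."""
    now = time.time() if now is None else now
    try:
        return now - os.path.getmtime(cache_path) >= max_age
    except OSError:
        return True  # no cache yet, so the audit is due


def classify(returncode):
    """Map the verifier's exit code to a (log level, message) pair."""
    if returncode == 0:
        return ("info", "zero fakes")
    if returncode == 2:
        return ("warning", "fakes found")
    return ("warning", f"verifier failed with exit code {returncode}")


def run_audit_in_background(command, log):
    """Run the verifier as a subprocess on a daemon thread; never raise to the caller."""
    def worker():
        try:
            result = subprocess.run(command, capture_output=True, text=True)
            log(*classify(result.returncode))
        except Exception as exc:  # log and swallow, mirroring the isolation goal
            log("warning", f"verification crashed: {exc}")

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The caller would check `verify_stale(...)` once per cycle and, if true, start the background audit without joining the thread, so the cycle itself never blocks on the upstream walk.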
hyperpolymath added a commit that referenced this pull request on May 30, 2026
## Summary

Hypatia carried 3 fake SHA pins in its own workflows. Self-clean so hypatia passes its own `:known_fake_action_sha` rule (#397) when scanning itself.

## Substitutions (version-faithful where possible)

| Action | Sites | Fake SHA | Real replacement | Version |
|---|---|---|---|---|
| `Swatinem/rust-cache` | 7 (ci.yml ×4, tests.yml ×3, release.yml ×1) | `ad397744...b8db` | `9d47c6ad4b02e050fd481d890b2ea34778fd09d6` | v2.7.8 (intent preserved) |
| `haskell-actions/hlint-run` | 1 (ci.yml) | `75c62c3b...6ce2` | `0b0024319753ba0c8b2fa21b7018ed252aed8181` | v2.4.9 (intent preserved) |
| `haskell-actions/hlint-setup` | 1 (ci.yml) | `17f0f409...ddfa` | `fe9cd1cd1af94a23900c06738e73f6ddb092966a` | **v2.4.10 (bumped; original `# v2.7.0` was doubly fictional)** |

### Note on hlint-setup

The original `haskell-actions/hlint-setup@17f0f4093d35cfdbf02aab186d51d0bb8b92ddfa # v2.7.0` was fake at TWO levels: the SHA doesn't exist, AND the version `v2.7.0` was never released. `hlint-setup`'s tag history only goes up to v2.4.10. Bumped to v2.4.10 (current latest) rather than try to preserve a version that never existed.

## Verification

All three real SHAs return 200 from `gh api repos/<org>/<action>/commits/<sha>`. Verified pre-commit.

## Why one PR for both action families

The round-2 estate sweep in flight (~46 repos, version-faithful substitution map) handles `rust-cache@ad397744...` but NOT `hlint-run`/`hlint-setup`, which aren't in the substitution map. Filing both fixes in one PR for hypatia means:

1. Hypatia is fully self-clean immediately (passes its own rule)
2. When the in-flight sweep reaches hypatia, it'll see `corrected_not_emitted` (no-op), a clean handoff
3. No leftover hlint fixes deferred to a separate PR

## Provenance

Estate audit 2026-05-30 found 67 fake action SHA pairs; the in-flight round-2 sweep handles 26 substitutions across 46 repos; ~25 niche single-repo fakes including hlint were documented as deferred. This PR moves the 2 hlint fakes from "deferred" to "done" since they're in hypatia's own repo and there's symbolic value to hypatia being self-clean. See `project_estate_fake_action_sha_punch_list_2026_05_30.md` for the full substitution map context.

## Test plan

- [ ] `gh api` returns 200 for all 3 new SHAs (verified)
- [ ] No remaining fake-SHA occurrences in hypatia's workflows (verified via grep)
- [ ] CI passes (rust-cache + hlint behaviour is identical, just real SHAs)
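
A substitution pass like the one described above might look like the following Python sketch. It is an assumption about the mechanics, not the sweep tooling itself: `SUBSTITUTIONS` and `apply_substitutions` are hypothetical names, and the map holds only the `hlint-setup` pair, since that is the only fake SHA quoted in full above.

```python
import re

# fake SHA -> (real SHA, version comment); only the pair quoted in full above.
SUBSTITUTIONS = {
    "17f0f4093d35cfdbf02aab186d51d0bb8b92ddfa": (
        "fe9cd1cd1af94a23900c06738e73f6ddb092966a",
        "v2.4.10",
    ),
}

# A 40-hex pin, optionally followed by a trailing `# version` comment.
PIN = re.compile(r"@([0-9a-f]{40})\b([ \t]*#[^\n]*)?")


def apply_substitutions(workflow_text):
    """Rewrite known fake pins to the real SHA and refresh the version comment.

    Returns (new_text, number_of_pins_rewritten).
    """
    count = 0

    def replace(match):
        nonlocal count
        entry = SUBSTITUTIONS.get(match.group(1))
        if entry is None:
            return match.group(0)  # not a known fake: leave untouched
        count += 1
        real_sha, version = entry
        return f"@{real_sha} # {version}"

    return PIN.sub(replace, workflow_text), count
```

Running the pass a second time rewrites nothing, which is the kind of no-op handoff the in-flight sweep is expected to see once a repo is already clean.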
## Summary

Adds a new `Hypatia.Rules.CicdRules` entry that flags GitHub Action pins where the SHA is fabricated. Complement to the existing `:unpinned_action` rule.

## Problem

Estate audit on 2026-05-30 found 67 fake action SHA pairs across ~50 repos (11% fabrication rate across 372 unique pins). Verified via `gh api repos/<org>/<action>/commits/<sha>` returning 422.

Universal pattern: partial-prefix corruption. The first 8-20 hex chars match a real release's SHA; the suffix is fabricated. Examples:

| Fake | Real |
|---|---|
| `7ab2955eb728f5440978d7b4f723a50dea1f3608` | `7ab2955eb728f5440978d5824358023be3a2802d` |
| `49933ea5288caeca8642195f2b846b8bbe245a93` | `49933ea5288caeca8642d1e84afbd3f7d6820020` |
| `909cc5acb0135c37a79510dd77767e217930de55` | `909cc5acb0fdd60627fb858598759246509fa755` |

Almost certainly a single AI-hallucination event that propagated across the estate via copy-customise of templates. The fakes slip past visual review because of the matching prefix.
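
The shared-prefix property also suggests how a fake can be traced back to the release it was corrupted from. The Python sketch below is illustrative only: `shared_prefix_len` and `nearest_real` are hypothetical helpers, and the candidate list is assumed to come from an action's real release SHAs.

```python
import os.path


def shared_prefix_len(a, b):
    """Number of leading characters the two SHA strings have in common."""
    return len(os.path.commonprefix([a, b]))


def nearest_real(fake_sha, real_shas):
    """Return (real_sha, shared_prefix_length) for the closest real SHA."""
    best = max(real_shas, key=lambda real: shared_prefix_len(fake_sha, real))
    return best, shared_prefix_len(fake_sha, best)


REAL_SHAS = [
    "7ab2955eb728f5440978d5824358023be3a2802d",
    "49933ea5288caeca8642d1e84afbd3f7d6820020",
    "909cc5acb0fdd60627fb858598759246509fa755",
]

for fake in (
    "7ab2955eb728f5440978d7b4f723a50dea1f3608",
    "49933ea5288caeca8642195f2b846b8bbe245a93",
    "909cc5acb0135c37a79510dd77767e217930de55",
):
    real, length = nearest_real(fake, REAL_SHAS)
    print(f"{fake[:12]}... shares {length} leading chars with {real[:12]}...")
```

A long shared prefix with a real release is a strong hint about which version the pin was meant to reference, which is useful when building a version-faithful substitution map.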
## What this rule does

Adds `:known_fake_action_sha` to `@blocked_patterns` enumerating the 25 known fakes from the audit. Caught at scan time; blocks new code from re-introducing them. Reason field links to the substitution map in `project_estate_fake_action_sha_punch_list_2026_05_30` memory entry. `applies_to: ["*.yml", "*.yaml"]` to scope to workflow files.

## Tested inline (transcript)

- 5/5 known fakes correctly matched
- 3/3 real SHAs correctly NOT matched

## Out of scope

For proactive detection of FUTURE fakes (beyond the static 25-entry list), a `mix hypatia.verify_action_shas` task is the right shape: it needs network access to `gh api` and can't be a static-regex rule. Documented as a follow-up in the punch-list memory.

## Provenance

Discovered while wiring `hyperpolymath/snifs#30` build-mode CI gate. The static-rule design intent was previously captured in `feedback_verify_action_sha_pins` memory:

> Consider adding a complementary rule that VERIFIES SHA pins resolve upstream

This PR ships the static-list version of that idea.

## Test plan