feat(cicd_rules): :known_fake_action_sha rule — block partial-prefix-corruption fakes by hyperpolymath · Pull Request #397 · hyperpolymath/hypatia

hyperpolymath · 2026-05-30T17:00:21Z

Summary

Adds a new Hypatia.Rules.CicdRules entry that flags GitHub Action pins where the SHA is fabricated. Complement to the existing :unpinned_action rule.

Problem

Estate audit on 2026-05-30 found 67 fake action SHA pairs across ~50 repos (11% fabrication rate across 372 unique pins). Verified via gh api repos/<org>/<action>/commits/<sha> returning 422.

Universal pattern: partial-prefix corruption. The first 8-20 hex chars match a real release's SHA; the suffix is fabricated. Examples:

| Fake | Real | Real version |
|---|---|---|
| `7ab2955eb728f5440978d`7b4f723a50dea1f3608 | `7ab2955eb728f5440978d`5824358023be3a2802d | setup-zig v2.2.0 |
| `49933ea5288caeca8642`195f2b846b8bbe245a93 | `49933ea5288caeca8642`d1e84afbd3f7d6820020 | setup-node v4.4.0 |
| `909cc5acb0`135c37a79510dd77767e217930de55 | `909cc5acb0`fdd60627fb858598759246509fa755 | setup-deno v2.0.2 |

Almost certainly a single AI-hallucination event that propagated across the estate via copy-customise of templates. The fakes slip past visual review because of the matching prefix.
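
To make the "matching prefix" point concrete, here is a minimal Python sketch (not part of Hypatia; the helper names are hypothetical) that measures how many leading hex characters each fake shares with its real counterpart, using the three pairs listed above.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading characters on which the two strings agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# (fake, real) pairs copied from the examples in this PR.
PAIRS = [
    ("7ab2955eb728f5440978d7b4f723a50dea1f3608",
     "7ab2955eb728f5440978d5824358023be3a2802d"),  # setup-zig v2.2.0
    ("49933ea5288caeca8642195f2b846b8bbe245a93",
     "49933ea5288caeca8642d1e84afbd3f7d6820020"),  # setup-node v4.4.0
    ("909cc5acb0135c37a79510dd77767e217930de55",
     "909cc5acb0fdd60627fb858598759246509fa755"),  # setup-deno v2.0.2
]


def looks_like_prefix_corruption(fake: str, real: str, min_prefix: int = 8) -> bool:
    """True if the SHAs differ but agree on at least `min_prefix` leading chars.

    The threshold of 8 is an assumption taken from the lower bound quoted above.
    """
    return fake != real and shared_prefix_len(fake, real) >= min_prefix


prefix_lengths = [shared_prefix_len(fake, real) for fake, real in PAIRS]
```

A reviewer comparing only the first handful of characters, as the abbreviated SHAs in most UIs encourage, would see no difference in any of these pairs.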

What this rule does

Adds :known_fake_action_sha to @blocked_patterns enumerating the 25 known fakes from the audit. Caught at scan time; blocks new code from re-introducing them. Reason field links to the substitution map in project_estate_fake_action_sha_punch_list_2026_05_30 memory entry.

applies_to: ["*.yml", "*.yaml"] scopes the rule to workflow files.
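
The real rule is an Elixir entry in @blocked_patterns and its source is not shown on this page. As a rough illustration of the static-list idea only, the Python sketch below scans workflow text for `uses:` pins whose SHA appears in a blocklist. The function names and the regular expression are assumptions, and the blocklist holds just the three fakes quoted in this PR rather than all 25.

```python
import re

# Three of the fabricated SHAs quoted in this PR; the actual rule enumerates 25.
KNOWN_FAKE_SHAS = {
    "7ab2955eb728f5440978d7b4f723a50dea1f3608",
    "49933ea5288caeca8642195f2b846b8bbe245a93",
    "909cc5acb0135c37a79510dd77767e217930de55",
}

# Matches `uses: owner/repo[/sub/path]@<40 hex chars>` on a workflow line.
USES_PIN = re.compile(r"uses:\s*([\w.\-]+/[\w.\-/]+)@([0-9a-f]{40})\b")


def applies_to(filename: str) -> bool:
    """Mirror the *.yml / *.yaml scoping described above."""
    return filename.endswith((".yml", ".yaml"))


def find_known_fakes(workflow_text: str):
    """Return (line_number, action, sha) for every pin with a blocklisted SHA."""
    hits = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_PIN.search(line)
        if match and match.group(2) in KNOWN_FAKE_SHAS:
            hits.append((lineno, match.group(1), match.group(2)))
    return hits
```

Because the match is on the full 40-character SHA, the real setup-node v4.4.0 pin is not flagged even though it shares a 20-character prefix with a blocklisted fake.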

Tested inline (transcript)

✓ expect=true  got=true   :: known fake setup-beam
✓ expect=true  got=true   :: known fake upload-artifact (partial-prefix)
✓ expect=false got=false  :: REAL upload-artifact v4.6.2 (no false-positive)
✓ expect=false got=false  :: REAL setup-node v4.4.0 (no false-positive)
✓ expect=true  got=true   :: fake setup-node (partial-prefix corruption)
✓ expect=true  got=true   :: fake codeql
✓ expect=false got=false  :: REAL checkout v4 (no false-positive)

Out of scope

For proactive detection of FUTURE fakes (beyond the static 25-entry list), a mix hypatia.verify_action_shas task is the right shape — needs network access to gh api, can't be a static-regex rule. Documented as a follow-up in the punch-list memory.

Provenance

Discovered while wiring hyperpolymath/snifs#30 build-mode CI gate. The static-rule design intent was previously captured in feedback_verify_action_sha_pins memory:

Consider adding a complementary rule that VERIFIES SHA pins resolve upstream — would have caught all four fakes above.

This PR ships the static-list version of that idea.

Test plan

Rule compiles + registers (verified locally)
No false-positives on real action SHAs in current estate workflows
When scanning the 50 repos still carrying these fakes, the rule flags them
After round-2 sweep (admin-merge style, in flight as PID 607915) completes, the rule's positives should drop to zero estate-wide

…fix-corruption fakes

Complement to the existing `:unpinned_action` rule. Where `:unpinned_action` catches non-SHA pins (`@v1`, `@main`, etc.), this catches the opposite failure mode: SHA-shaped pins that don't actually exist on the upstream action repo (`gh api commits/<sha> -> 422`).

## How these are getting in

Estate audit on 2026-05-30 found 67 fake action SHA pairs across ~50 repos (11% fabrication rate across 372 unique pins). Universal pattern: **partial-prefix corruption**. The first 8-20 hex chars match a real release's SHA exactly, then the suffix is fabricated. Examples:

    fake: 7ab2955eb728f5440978d7b4f723a50dea1f3608
    real: 7ab2955eb728f5440978d5824358023be3a2802d (setup-zig v2.2.0)

    fake: 49933ea5288caeca8642195f2b846b8bbe245a93
    real: 49933ea5288caeca8642d1e84afbd3f7d6820020 (setup-node v4.4.0)

    fake: 909cc5acb0135c37a79510dd77767e217930de55
    real: 909cc5acb0fdd60627fb858598759246509fa755 (setup-deno v2.0.2)

The corruption pattern is consistent enough that it's almost certainly a single AI-hallucination event that got copy-pasted across the estate.

## What this rule does

Static-list pattern enumerating the 25 known fake SHAs from the 2026-05-30 audit. Caught at scan time, blocked from new code, and the reason field directs to the substitution map in `project_estate_fake_action_sha_punch_list_2026_05_30.md` (memory).

## Tested

Verified via inline Elixir tests (in transcript):

- 5/5 known fakes correctly matched
- 3/3 real SHAs correctly NOT matched (no false-positives on real v4.6.2 upload-artifact, real v4.4.0 setup-node, real checkout v4)

## Out of scope

For proactive detection of FUTURE fakes (not in the 25-entry static list), a `mix hypatia.verify_action_shas` task is the right shape: it requires network access to `gh api` and can't be a static-regex rule. Documented as follow-up in the punch-list memory.

Provenance: discovered while wiring hyperpolymath/snifs#30 build-mode CI gate; the static-rule design intent was previously captured in `feedback_verify_action_sha_pins` memory ("Consider adding a complementary rule that VERIFIES SHA pins resolve upstream — would have caught all four fakes above").

…via gh api (#399)

## Summary

Companion to the static-list `:known_fake_action_sha` rule (PR #397). Where that rule blocks the 25 fakes known at audit time, this mix task catches FUTURE fakes by actually calling `gh api repos/<org>/<action>/commits/<sha>` against every unique action pin in the estate.

## Why this is needed

The static rule covers KNOWN fakes. But the next AI-hallucination event will introduce DIFFERENT SHAs that don't match the static list. To catch those, we have to actually verify each pin against upstream. That requires network access, so it can't be a pure-pattern scan.

Together, #397 + this PR close the "hypatia is a waste of time if these issues persist" concern:

- **#397 (static)**: instant detection of the 25 known fakes
- **this task (dynamic)**: periodic audit catches new fabrications

## Design

Per hypatia's CLAUDE.md scanner-hygiene guidance, the task shells out via `System.cmd("gh", ["api", ...])` instead of introducing an HTTP-client dep. It reuses existing `gh` auth: no new auth surface, and a 5000 req/hr rate limit (plenty for 372 pins).

### Cache

`data/verified-action-shas.json` stores `{ref}@{sha}` → `"real"|"fake"` so subsequent runs only verify NEW pins:

- Cold: ~5 minutes (372 pins)
- Warm: ~5 seconds (deltas only)

### `--paranoid` mode

Re-verifies cached real SHAs. Run periodically to catch the rare case where upstream commits a revert that breaks a previously-verified SHA.

### Throttling

A 50ms inter-call sleep caps the walk at ~20 req/sec; at 372 pins, a full cold run stays well under the 5000 req/hr limit.

## Usage

```bash
mix hypatia.verify_action_shas                 # default text output
mix hypatia.verify_action_shas --paranoid      # re-verify cache
mix hypatia.verify_action_shas --format json   # tooling integration
```

## Exit codes

- 0: verification complete, zero fakes
- 2: fakes found (CI can use as hard gate)
- 1: hard failure (bad args, gh missing, cache unwritable)

## Out of scope (follow-up)

- Wire into `LearningScheduler` GenServer for periodic auto-audits
- Plumb through `Hypatia.Safety.RateLimiter` for fleet-wide concurrent runs

## Provenance

Discovered the need during the snifs#30 build-mode arc / 2026-05-30 estate audit. Design intent captured in `feedback_verify_action_sha_pins` memory ("Consider adding a complementary rule that VERIFIES SHA pins resolve upstream").

## Test plan

- [ ] `mix hypatia.verify_action_shas --format json` runs to completion
- [ ] Reports the known 25 fakes when run before round-2 sweep merges
- [ ] Reports zero fakes after round-2 sweep + #397 land
- [ ] Cache persists across runs (warm path << cold path)
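
The verify-and-cache loop described in this commit message can be sketched as follows. This is a Python illustration under stated assumptions, not the Elixir mix task: `repo_of`, `gh_commit_exists`, `verify_pins` and `exit_code` are hypothetical names, the cache is a plain dict rather than the JSON file, and any non-zero `gh api` exit is treated as "fake" even though a network failure would look the same.

```python
import subprocess
import time


def repo_of(action_ref: str) -> str:
    """Drop any sub-path: `github/codeql-action/init` -> `github/codeql-action`."""
    return "/".join(action_ref.split("/")[:2])


def gh_commit_exists(action_ref: str, sha: str) -> bool:
    """Shell out to `gh api`, which exits non-zero on an HTTP error such as 422.

    Caveat (assumption of this sketch): a network failure also exits non-zero,
    so a production version would need to tell the two cases apart.
    """
    result = subprocess.run(
        ["gh", "api", f"repos/{repo_of(action_ref)}/commits/{sha}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def verify_pins(pins, cache, check=gh_commit_exists, paranoid=False, delay=0.05):
    """Verify each unique (ref, sha) pin, consulting and updating `cache`.

    Cached "fake" verdicts are always trusted; cached "real" verdicts are
    re-checked only when `paranoid` is set. Returns the list of fake keys.
    """
    fakes = []
    for ref, sha in sorted(set(pins)):
        key = f"{ref}@{sha}"
        cached = cache.get(key)
        if cached == "fake" or (cached == "real" and not paranoid):
            verdict = cached
        else:
            verdict = "real" if check(ref, sha) else "fake"
            cache[key] = verdict
            time.sleep(delay)  # throttle between upstream calls
        if verdict == "fake":
            fakes.append(key)
    return fakes


def exit_code(fakes) -> int:
    """0 when clean, 2 when fakes were found (1 is reserved for hard failures)."""
    return 2 if fakes else 0
```

Passing the checker in as a parameter keeps the loop testable without network access; the warm path is simply the case where every key is already in the cache.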

## Summary

Wires `mix hypatia.verify_action_shas` (#399) into `LearningScheduler` so it runs as a daily auto-audit, not just on manual invocation. Completes the auto-defense story alongside `:known_fake_action_sha` static rule (#397).

## How fake SHAs get caught now

| Mechanism | Latency | Surface |
|---|---|---|
| `:known_fake_action_sha` rule (#397) | instant | static scan; catches the 25 known fakes immediately |
| `mix hypatia.verify_action_shas` (#399) | manual | one-off audit; catches ANY 422-returning SHA |
| **This PR (LearningScheduler integration)** | daily | automatic estate-wide audit, no manual run needed |

## Design

### Cadence

24h via mtime check on the verifier's cache file (`data/verified-action-shas.json`). Subsequent runs are ~5 sec (only checks new pins), cheap enough that daily costs nothing meaningful.

### Isolation

Spawned as `System.cmd` subprocess via `Task.start`, NOT inline in the GenServer:

- The mix task's `exit({:shutdown, 2})` on fakes-found would crash `LearningScheduler` if run in-process. Subprocess isolates that.
- `Task.start` (not `Task.async`): fire-and-forget; the learning cycle never waits on the gh-api walk.
- Exceptions in the verification path are logged but never bubble up.

### Logging

- Clean (zero fakes): info-level confirmation
- Fakes found (exit 2): warning with clipped sample of output
- Other exit codes: warning with stderr clip

## What's complete now

The auto-defense story is closed:

- New fake committed → static rule catches it at next scan
- New fake somehow bypasses static scan → daily audit catches it within 24h
- Cache keeps the daily audit cheap (~5 sec warm)
- Zero manual intervention required for ongoing protection

## Provenance

Closes the "hypatia is a waste of time" concern raised during the 2026-05-30 snifs#30 / fake-SHA arc.

## Test plan

- [ ] Module compiles + LearningScheduler still starts cleanly
- [ ] First cycle after startup runs the verification (cache absent → `action_sha_verify_stale?` returns true)
- [ ] Subsequent cycles within 24h skip the verification (cache fresh)
- [ ] After 24h, next cycle runs verification again
- [ ] Verification exit 2 (fakes found) logs warning but doesn't crash scheduler
- [ ] Verification crash logs warning but doesn't crash scheduler
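
The cadence and isolation design above can be approximated in Python as a rough analogue. It assumes a background thread stands in for `Task.start` and `subprocess.run` for `System.cmd`; `verify_is_stale`, `run_audit` and `start_audit_if_stale` are hypothetical names, not the scheduler's actual functions.

```python
import logging
import os
import subprocess
import threading
import time

log = logging.getLogger("action_sha_audit")
DAY_SECONDS = 24 * 60 * 60


def verify_is_stale(cache_path, now=None, max_age=DAY_SECONDS) -> bool:
    """True when the cache file is missing or was last modified >= max_age ago."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(cache_path)
    except OSError:
        return True  # no cache yet, so the first cycle always verifies
    return now - mtime >= max_age


def run_audit(cmd=("mix", "hypatia.verify_action_shas")):
    """Run the audit in a subprocess and log the outcome; never raise."""
    try:
        result = subprocess.run(list(cmd), capture_output=True, text=True)
        if result.returncode == 0:
            log.info("action-SHA audit clean")
        elif result.returncode == 2:
            log.warning("fake action SHAs found: %s", result.stdout[:500])
        else:
            log.warning("audit exited %d: %s", result.returncode, result.stderr[:500])
    except Exception as exc:  # e.g. the executable is missing
        log.warning("audit could not run: %s", exc)


def start_audit_if_stale(cache_path):
    """Fire-and-forget: start the audit on a daemon thread, return immediately."""
    if not verify_is_stale(cache_path):
        return None
    worker = threading.Thread(target=run_audit, daemon=True)
    worker.start()
    return worker
```

Keeping the audit in a separate process means its exit status is just a number to log, which is the same reason the PR gives for not running the mix task inline.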

## Summary

Hypatia carried 3 fake SHA pins in its own workflows. Self-clean so hypatia passes its own `:known_fake_action_sha` rule (#397) when scanning itself.

## Substitutions (version-faithful where possible)

| Action | Sites | Fake SHA | Real replacement | Version |
|---|---|---|---|---|
| `Swatinem/rust-cache` | 7 (ci.yml ×4, tests.yml ×3, release.yml ×1) | `ad397744...b8db` | `9d47c6ad4b02e050fd481d890b2ea34778fd09d6` | v2.7.8 (intent preserved) |
| `haskell-actions/hlint-run` | 1 (ci.yml) | `75c62c3b...6ce2` | `0b0024319753ba0c8b2fa21b7018ed252aed8181` | v2.4.9 (intent preserved) |
| `haskell-actions/hlint-setup` | 1 (ci.yml) | `17f0f409...ddfa` | `fe9cd1cd1af94a23900c06738e73f6ddb092966a` | **v2.4.10 (bumped; original `# v2.7.0` was doubly fictional)** |

### Note on hlint-setup

The original `haskell-actions/hlint-setup@17f0f4093d35cfdbf02aab186d51d0bb8b92ddfa # v2.7.0` was fake at TWO levels: the SHA doesn't exist, AND the version `v2.7.0` was never released. `hlint-setup`'s tag history only goes up to v2.4.10. Bumped to v2.4.10 (current latest) rather than try to preserve a version that never existed.

## Verification

All three real SHAs return 200 from `gh api repos/<org>/<action>/commits/<sha>`. Verified pre-commit.

## Why one PR for both action families

The round-2 estate sweep in flight (~46 repos, version-faithful substitution map) handles `rust-cache@ad397744...` but NOT `hlint-run`/`hlint-setup`, which aren't in the substitution map. Filing both fixes in one PR for hypatia means:

1. Hypatia is fully self-clean immediately (passes its own rule)
2. When the in-flight sweep reaches hypatia, it'll see `corrected_not_emitted` (no-op), a clean handoff
3. No leftover hlint fixes deferred to a separate PR

## Provenance

Estate audit 2026-05-30 found 67 fake action SHA pairs; the in-flight round-2 sweep handles 26 substitutions across 46 repos; ~25 niche single-repo fakes including hlint were documented as deferred. This PR moves the 2 hlint fakes from "deferred" to "done" since they're in hypatia's own repo and there's symbolic value to hypatia being self-clean. See `project_estate_fake_action_sha_punch_list_2026_05_30.md` for the full substitution map context.

## Test plan

- [ ] `gh api` returns 200 for all 3 new SHAs (verified)
- [ ] No remaining fake-SHA occurrences in hypatia's workflows (verified via grep)
- [ ] CI passes (rust-cache + hlint behaviour is identical, just real SHAs)
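
A substitution of this kind can be sketched as a small rewrite over workflow lines. The Python below is illustrative only: `SUBSTITUTIONS` and `fix_line` are hypothetical names, and the map contains just the hlint-setup entry, since that is the only fake SHA this page gives in full.

```python
import re

# fake SHA -> (real SHA, version comment). Only the hlint-setup pin is spelled
# out in full above, so it is the only entry in this sketch.
SUBSTITUTIONS = {
    "17f0f4093d35cfdbf02aab186d51d0bb8b92ddfa":
        ("fe9cd1cd1af94a23900c06738e73f6ddb092966a", "v2.4.10"),
}

# `@<40 hex chars>` optionally followed by a `# vX.Y.Z` style comment.
PIN = re.compile(r"@([0-9a-f]{40})(\s*#\s*\S+)?")


def fix_line(line: str) -> str:
    """Replace a known fake pin (and its version comment) with the real one."""
    def swap(match):
        replacement = SUBSTITUTIONS.get(match.group(1))
        if replacement is None:
            return match.group(0)  # not a known fake: leave untouched
        real_sha, version = replacement
        return f"@{real_sha} # {version}"
    return PIN.sub(swap, line)
```

Running `fix_line` on an already-corrected line is a no-op, which is the property that lets a later estate sweep pass over the repository without emitting changes.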

hyperpolymath enabled auto-merge (squash) May 30, 2026 17:00

hyperpolymath mentioned this pull request May 30, 2026

feat(mix): hypatia.verify_action_shas — proactive fake-SHA detection via gh api #399

Merged

4 tasks

hyperpolymath merged commit f5e0941 into main May 30, 2026
1 of 31 checks passed

hyperpolymath deleted the claude/cicd-rule-known-fake-action-shas branch May 30, 2026 17:09

hyperpolymath mentioned this pull request May 30, 2026

feat(scheduler): periodic action-SHA verification (24h cadence) #400

Merged

6 tasks

hyperpolymath mentioned this pull request May 30, 2026

fix(ci): hypatia self-clean — replace 3 fake action SHA pins #401

Merged

3 tasks

hyperpolymath merged 1 commit into main from claude/cicd-rule-known-fake-action-shas
