
Governance baseline gate never consumes .hypatia-baseline.json content — affects 15+ repos #566


@hyperpolymath

Summary

The validate-hypatia-baseline job in standards' shared governance-reusable.yml is structurally incapable of passing for any repo with real findings, regardless of what is recorded in .hypatia-baseline.json. This is not a regression. The "consume the baseline" feature (commit 91d8b88, "feat: consume .hypatia-baseline.json in governance gate (#166)") appears to have been implemented for one job only (the banned-language-file glob exemptions in the "Language / package anti-pattern policy" job) and never generalized to this one, even though this job's name and error message imply that it diffs findings against the baseline.

Discovered while investigating a Copilot-diagnosed "failing job" report on standards PR #448 (job 84509134825) — Copilot correctly described the symptom (143 findings, exit 1) but didn't catch that main itself has failed this exact check on every run for 5+ days (2026-06-27 onward), independent of any PR content.

Root cause (with evidence)

standards/.github/workflows/governance-reusable.yml, the validate-hypatia-baseline job:

```shell
HYPATIA_FORMAT=json "$HOME/hypatia/hypatia-cli.sh" scan . > hypatia-findings.json
FINDING_COUNT=$(jq '. | length' hypatia-findings.json 2>/dev/null || echo 0)
if [ "$FINDING_COUNT" -gt 0 ]; then
  echo "::error::Baseline validation failed. Found $FINDING_COUNT findings not in baseline."
  exit 1
fi
```

This never reads .hypatia-baseline.json at all — it fails on any raw scan finding (severity >= medium) with no diffing against previously-accepted findings.

On the scanner side, hypatia/lib/hypatia/scanner_suppression.ex confirms .hypatia-baseline.json is listed only in @universal_excludes — i.e. the scanner excludes the file itself from being scanned as content, exactly like .git/, node_modules/, etc. Its content is never read anywhere to suppress prior findings. The module's own docstring says findings there are meant to be "tracked and reviewed" — but no code path does that reviewing. This is a design gap, not wiring drift: the mechanism for "only fail on NEW findings" was never built for this job.

Confirmed blast radius

15 of 288 top-level repos carry .hypatia-baseline.json; 13 have real curated content (someone did the work of building an exemption list expecting it to filter):

| Repo | Baseline severity sample |
| --- | --- |
| ambientops | critical |
| bofig | critical |
| coq-jr | critical |
| developer-ecosystem | high |
| gitbot-fleet | critical |
| hypatia | low |
| hypatia-attest | low |
| idaptik | high |
| neurophone | [] empty |
| proof-of-work | critical |
| proof-of-work-attest | critical |
| proven | high |
| somethings-fishy | critical |
| standards | [] empty (fixed locally in #449) |
| stapeln | info |

For every one of these repos, the Governance workflow fails this job whenever the scanner finds anything at medium+ severity, regardless of what is already accepted in the repo's baseline. This is likely a significant, previously-unattributed source of red CI across the estate. (Nested/grouped-directory repos are not in this top-level count and weren't scanned for this issue, so the true total is likely larger.)
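The empty-versus-curated split above could be re-checked with something like the following. This is only a sketch: it assumes the repos are checked out as sibling directories under one root (a hypothetical layout), and that an "empty" baseline is a JSON value of length 0 such as []. The function names baseline_status and survey are illustrative and do not exist in any repo.

```shell
# Sketch only. ASSUMPTIONS: one checkout directory per repo under a common
# root, and an "empty" baseline is JSON of length 0 (for example []).

# Print missing, invalid, empty, or curated for one repo directory.
baseline_status() {
  local file="$1/.hypatia-baseline.json"
  local n
  if [ ! -f "$file" ]; then
    echo "missing"
    return 0
  fi
  # jq exits non-zero on unparseable JSON; report that separately.
  if ! n=$(jq 'length' "$file" 2>/dev/null); then
    echo "invalid"
    return 0
  fi
  # A zero-byte file yields no output from jq, so treat it as empty too.
  if [ -z "$n" ] || [ "$n" = "0" ]; then
    echo "empty"
  else
    echo "curated"
  fi
}

# Print "<repo><TAB><status>" for every directory directly under the root.
survey() {
  local root="$1" repo
  for repo in "$root"/*/; do
    [ -d "$repo" ] || continue
    printf '%s\t%s\n' "$(basename "$repo")" "$(baseline_status "${repo%/}")"
  done
}
```

Running survey against the checkout root and filtering out "missing" gives the list of repos that carry a baseline file at all. It only looks one directory deep, so nested repos are not covered.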

Immediate mitigation done

standards PR #449 removes the (empty, non-functional) .hypatia-baseline.json trigger file, which disables the gate for that repo only, pending this proper fix. Narrowly scoped — does not touch the shared governance-reusable.yml. Left un-armed for owner review since it changes security-gate enforcement for the whole repo, even though the gate was already non-functional.

Recommended fix

In the validate-hypatia-baseline job, mirror the pattern already correct in the sibling "Language / package anti-pattern policy" job: read .hypatia-baseline.json, and only fail on findings whose (rule_module, rule_type, file) (or equivalent identity) is not present in the baseline. Then:

  1. Land the CI fix in governance-reusable.yml (standards).
  2. Re-run governance on all 15 repos above; for each, decide whether currently-uncaptured findings should be triaged (fixed) or added to that repo's baseline as accepted debt — this is a per-repo judgment call, likely worth a repo-by-repo pass.
  3. Consider whether standards' own now-empty baseline should be populated with the current scan results (143 findings, 54 critical) as accepted debt, or whether those findings should be triaged directly. This is deliberately deferred here, not decided.
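A minimal sketch of the diffing step in the recommended fix, written in the same bash + jq style as the current job. It rests on assumptions the scanner source would need to confirm: that hypatia-findings.json and .hypatia-baseline.json are both JSON arrays of objects, and that each object carries rule_module, rule_type, and file keys usable as an identity. The function names new_findings and gate are illustrative only.

```shell
# Sketch only. ASSUMPTIONS: both files are JSON arrays of objects, and the
# triple (rule_module, rule_type, file) identifies a finding in each of them.

# Print the findings from $1 whose identity triple is absent from baseline $2.
new_findings() {
  local findings="$1" baseline="$2"
  jq -n --slurpfile f "$findings" --slurpfile b "$baseline" '
    def key: [.rule_module, .rule_type, .file];
    [ $b[0][] | key ] as $accepted
    | [ $f[0][] | select(key as $k | any($accepted[]; . == $k) | not) ]
  '
}

# Return non-zero only when at least one finding is not covered by the baseline.
gate() {
  local findings="$1" baseline="$2" count
  count=$(new_findings "$findings" "$baseline" | jq 'length')
  if [ "$count" -gt 0 ]; then
    echo "::error::Baseline validation failed. Found $count findings not in baseline."
    return 1
  fi
  echo "No findings outside the baseline."
}
```

In the job this would be called as gate hypatia-findings.json .hypatia-baseline.json after the scan. With an empty baseline ([]) every finding counts as new, so the two repos with empty baselines would behave exactly as they do today until their baselines are populated or their findings fixed.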

Related: #369 (RFC: wire code-scanning alerts into the dispatch pipeline) — same family of "detection without a routed work-order" gap identified in the estate-wide issue-backlog RCA (2026-06-24): fan-out without fan-in.
