fix(ci): un-stale the Hypatia scanner cache + tolerate header-less verbatim LICENSE #441

Merged
hyperpolymath merged 4 commits into main from claude/hypatia-scan-cache-licence-hzy9m5
Jun 27, 2026

@hyperpolymath

Owner

Summary

Three focused CI/governance fixes for the Hypatia scan + licence pipeline.

1. Source-pin the Hypatia scanner cache (two workflows)

hypatia-scan-reusable.yml and the validate-hypatia-baseline job in
governance-reusable.yml both cached ~/.mix, ~/.hex and ~/hypatia
under the static key hypatia-scanner-v2-${{ runner.os }}-build, which has
no source-derived component. Because the clone/build steps are guarded by
if [ ! -d ] / if [ ! -x ], once that key was populated the first scanner
build ever cached was restored and reused on every subsequent run: the
clone/build never re-ran, so scanner fixes (e.g. the SD022 path-drift
precision fix, hypatia#545) never took effect in CI.
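
The staleness mechanism can be reproduced in isolation. The sketch below is illustrative only: `ensure_scanner` is a hypothetical helper, and writing a small stub script stands in for the real clone and escript build, which this PR does not show.

```shell
# $1: scanner directory (plays the role of ~/hypatia)
# $2: version the upstream would provide if a build actually ran
ensure_scanner() {
  dir="$1"
  upstream_version="$2"
  if [ ! -d "$dir" ]; then
    mkdir -p "$dir"                      # stand-in for: git clone
  fi
  if [ ! -x "$dir/scanner" ]; then
    # stand-in for: build the scanner from the checked-out source
    printf '#!/bin/sh\necho %s\n' "$upstream_version" > "$dir/scanner"
    chmod +x "$dir/scanner"
  fi
}
```

Calling `ensure_scanner` twice on the same directory, first with `v1` and then with `v2`, leaves the `v1` stub in place. Both guards are satisfied by the restored directory, so the second call never rebuilds.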

Fix: resolve Hypatia's current main tip with git ls-remote ... HEAD
before the cache step and fold that SHA into the key
(hypatia-scanner-v3-${{ runner.os }}-<sha>). When Hypatia main advances
the key changes, the cache misses, and the clone + escript build re-run.
No restore-keys, by design — a partial restore would repopulate
~/hypatia and the guards would skip the rebuild, reintroducing the
staleness.
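
A minimal sketch of the key derivation, under stated assumptions: the function names `resolve_head_sha` and `scanner_cache_key` are hypothetical, and the repository URL is a parameter because the PR elides it (`git ls-remote ... HEAD`). Only the key format comes from the description above.

```shell
# Print the commit SHA that HEAD points at in a remote repository.
# $1: repository URL or path (elided in the PR, so passed in here)
resolve_head_sha() {
  git ls-remote "$1" HEAD | awk 'NR == 1 { print $1 }'
}

# Build the source-pinned cache key.
# $1: runner OS, $2: resolved scanner SHA
scanner_cache_key() {
  if [ -z "$2" ]; then
    # An empty SHA would silently recreate a source-independent key.
    echo "error: empty scanner SHA, refusing to build a cache key" >&2
    return 1
  fi
  printf 'hypatia-scanner-v3-%s-%s\n' "$1" "$2"
}
```

In a workflow step, the result would typically be written to `$GITHUB_OUTPUT` so that the cache step can reference it. Failing on an empty SHA is a design choice in this sketch: if `git ls-remote` fails, it seems safer to stop than to fall back to a key that never changes.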

2. Tolerate a verbatim, header-less LICENSE in the consistency check

scripts/check-licence-consistency.sh required an
SPDX-License-Identifier header inside the LICENSE file and failed
without one. But the estate template ships LICENSE as plain, unmodified
MPL-2.0 text with no header (SPDX identifiers belong in source files, not
in the canonical upstream licence text), so the licence-consistency
governance job went red on every PR in template-based repos — hypatia,
hermeneia and gitbot-fleet all carry a header-less verbatim MPL-2.0
LICENSE.

The script now establishes the licence identity from either the SPDX
header (when present) or the body-text classification (for a verbatim,
header-less file). The manifest cross-check runs against that identity, so
manifest mismatch is still caught with or without a header. When a header
is present, the body-vs-header drift checks are unchanged (SPDX=MPL-2.0 +
body=PMPL still fails). Only a header-less file whose body matches no
known template is now reported as an error.
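
The decision logic can be sketched as follows. This is not the contents of check-licence-consistency.sh, which the PR does not show: the function names are hypothetical, and the body classification uses assumed marker phrases (the MPL-2.0 title line, and the literal string `PMPL` as a stand-in for the PMPL template).

```shell
# Classify the licence body, ignoring any SPDX header line.
# The marker phrases are assumptions made for this illustration.
classify_body() {
  if grep -v 'SPDX-License-Identifier' "$1" | grep -q 'Mozilla Public License Version 2.0'; then
    echo 'MPL-2.0'
  elif grep -v 'SPDX-License-Identifier' "$1" | grep -q 'PMPL'; then
    echo 'PMPL'
  else
    echo 'unknown'
  fi
}

# Print the licence identity of a LICENSE file, or fail with a message.
licence_identity() {
  file="$1"
  [ -f "$file" ] || { echo "error: $file is missing" >&2; return 1; }
  header=$(sed -n 's/^.*SPDX-License-Identifier:[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$file" | head -n 1)
  body=$(classify_body "$file")
  if [ -n "$header" ]; then
    # Header present: a recognised body must agree with it (drift check).
    if [ "$body" != 'unknown' ] && [ "$body" != "$header" ]; then
      echo "error: SPDX=$header but body=$body" >&2
      return 1
    fi
    echo "$header"
  elif [ "$body" != 'unknown' ]; then
    # Header-less verbatim text: the identity comes from the body.
    echo "$body"
  else
    echo "error: no SPDX header and body matches no known template" >&2
    return 1
  fi
}

# Cross-check the licence declared in a manifest against the identity.
# $1: LICENSE path, $2: licence identifier declared in the manifest
check_manifest() {
  identity=$(licence_identity "$1") || return 1
  if [ "$identity" != "$2" ]; then
    echo "error: manifest declares $2 but LICENSE is $identity" >&2
    return 1
  fi
}
```

Because `check_manifest` works from the resolved identity rather than from the header, a manifest mismatch is reported the same way whether or not the file carries an SPDX header.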

This is a script-logic change only: no LICENSE text or SPDX header is
edited (owner licence guardrail honoured).

Verification

  • Ran the modified script against hypatia / standards / hermeneia /
    gitbot-fleet — all four pass (hypatia/hermeneia/fleet via the new
    header-less path; standards via its existing SPDX header).
  • Ran five negative fixtures, all of which still fail loudly:
    PMPL-under-an-MPL-header drift, unidentifiable header-less body,
    manifest mismatch (with and without a header), and a missing LICENSE
    file.
  • scripts/build-registry.sh --check reports the registry + topology are
    already in sync (no regen needed).

Out of scope (noted for follow-up, not in this PR)

  • The estate-wide refs/pull/<n>/merge checkout failure in
    governance-reusable.yml (8/10 governance jobs error at checkout on
    every PR) — diagnosed; fix proposed separately because it changes
    enforcement behaviour across all consumer repos.
  • The validate-hypatia-baseline self-scan reporting ~105 findings on
    standards — separate triage.

🤖 Generated with Claude Code



claude added 3 commits June 27, 2026 18:33
The reusable scan workflow cached ~/.mix, ~/.hex and ~/hypatia under the
keyless key `hypatia-scanner-v2-${{ runner.os }}-build`. Because the clone
and build steps are guarded by `if [ ! -d ]` / `if [ ! -x ]`, once that
key was populated the very first scanner build ever cached was restored
and reused on every subsequent run — the clone/build never re-ran, so
scanner fixes (e.g. the SD022 path-drift precision fix, hypatia#545)
never took effect in CI.

Resolve Hypatia's current main tip with `git ls-remote ... HEAD` before
the cache step and fold that SHA into the key
(`hypatia-scanner-v3-${{ runner.os }}-<sha>`). When Hypatia main advances
the key changes, the cache misses, and the clone + escript build re-run.
No restore-keys, by design: a partial restore would repopulate ~/hypatia
and the guards would skip the rebuild, reintroducing the staleness.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017wGTeLwiBGJ5rETC3QT4Pm
… check

check-licence-consistency.sh required an SPDX-License-Identifier header
inside the LICENSE file and failed without one. But the estate template
ships LICENSE as plain, unmodified MPL-2.0 text with no header (SPDX
identifiers belong in source files, not in the canonical upstream licence
text), so the `licence-consistency` governance job went red on every PR
in repos using the template — hypatia, hermeneia and gitbot-fleet all
carry a header-less verbatim MPL-2.0 LICENSE.

Establish the licence identity from EITHER the SPDX header (when present)
OR the body-text classification (for a verbatim, header-less file). The
manifest cross-check now runs against that identity, so manifest mismatch
is still caught with or without a header. When a header IS present the
body-vs-header drift checks are unchanged (SPDX=MPL-2.0 + body=PMPL still
fails). Only a header-less file whose body matches no known template is
now reported as an error.

This is a script-logic change only — no LICENSE text or SPDX header is
edited (owner licence guardrail honoured).

Verified against hypatia/standards/hermeneia/gitbot-fleet (all pass) plus
five negative fixtures (PMPL-under-MPL-header drift, unidentifiable body,
manifest mismatch with and without header, missing LICENSE) — all still
fail loudly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017wGTeLwiBGJ5rETC3QT4Pm
The `validate-hypatia-baseline` job carried the same keyless
`hypatia-scanner-v2-${{ runner.os }}-build` cache + `if [ ! -d ]` /
`if [ ! -x ]` guards as the standalone scan workflow, so baseline
validation ran against whatever scanner was first cached — a stale
ruleset that never picked up rule changes. Apply the same source-pinned
key (resolve hypatia HEAD via git ls-remote, fold the SHA into the key)
so the scanner rebuilds when hypatia main advances.

Note: this makes the baseline job use the *current* scanner; it does not
by itself resolve the separate pre-existing self-scan finding count on
standards (tracked separately).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017wGTeLwiBGJ5rETC3QT4Pm
@hyperpolymath hyperpolymath marked this pull request as ready for review June 27, 2026 18:42
@hyperpolymath hyperpolymath merged commit e9c8888 into main Jun 27, 2026
@hyperpolymath hyperpolymath deleted the claude/hypatia-scan-cache-licence-hzy9m5 branch June 27, 2026 18:43
hyperpolymath added a commit that referenced this pull request Jun 27, 2026
… enforcement) (#442)

## Summary

Fixes the estate-wide governance failure where **8 of 10 jobs in
`governance-reusable.yml` die at checkout on every PR** with:

```
fatal: couldn't find remote ref refs/pull/<n>/merge
```

(This is the `refs/pull/<n>/merge` item flagged as out-of-scope in #441
— now resolved with owner approval.)

## Root cause

`governance.yml` triggers on `pull_request` and calls
`governance-reusable.yml` via `workflow_call`. Inside a **reusable**
workflow, `github.ref` inherits the caller's PR ref, which on a
`pull_request` event is the *named merge ref* `refs/pull/<n>/merge`. The
8 jobs that pass `ref: ${{ github.ref }}` to `actions/checkout`
therefore ask it to `git fetch refs/pull/<n>/merge` — a named ref
checkout cannot resolve — and fail at the checkout step.

The two jobs that **omit** an explicit `ref:` (`workflow-staleness`,
`validate-hypatia-baseline`) use checkout's default and were unaffected
— exactly matching the observed **8/10**.

**Consequence today:** governance is effectively **ungated on PRs** —
only `push`-to-`main` runs enforce.

## Fix

Pin the 8 caller-repo checkouts to **`ref: ${{ github.sha }}`** — the
concrete event commit (the PR merge commit on `pull_request`, the pushed
commit on `push`). It resolves to the *same* commit
`refs/pull/<n>/merge` points at, but is always fetchable. A short
comment at each site explains why `github.ref` must not be used here.

- Affected jobs: `language-policy`, `package-policy`, `security-policy`,
`quality`, `wellknown`, `workflow-lint`, `trusted-base`,
`licence-consistency`.
- The 3 secondary `repository: hyperpolymath/standards` + `ref: main`
checkouts (which fetch the check scripts) are **untouched**.
- Content/enforcement semantics are preserved: the diff-based step
(`quality`'s trufflehog) already uses explicit `github.sha` /
`github.event.pull_request.base.sha`, independent of the checkout ref.

## Blast radius (owner-acknowledged)

This **re-enables PR-time governance enforcement across every repo**
consuming the reusable workflow. PRs that were silently passing (because
the jobs never ran) may now surface real governance violations. Approved
by owner 2026-06-27.

## Verification

- `python3 -c "import yaml; yaml.safe_load(...)"` → **YAML OK**.
- Structural check: 13 checkout steps total — 2 bare (unchanged), 8 now
`ref: ${{ github.sha }}`, 3 standards-checkout `ref: main` (unchanged).
No `ref: ${{ github.ref }}` remains.
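
The last part of the structural check could be automated along these lines. This is a sketch, not the check that was actually run: `assert_no_github_ref` is a hypothetical helper, and the workflow path is passed in as an argument.

```shell
# Fail if any line in the given workflow file still sets a checkout ref
# to github.ref. Matching lines are printed with their line numbers.
assert_no_github_ref() {
  if grep -nE 'ref:[[:space:]]*\$\{\{[[:space:]]*github\.ref[[:space:]]*\}\}' "$1"; then
    echo "error: $1 still uses github.ref as a checkout ref" >&2
    return 1
  fi
}
```

The pattern requires the closing braces straight after `github.ref`, so other contexts that merely start with the same name, such as `github.ref_name`, do not trigger it.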

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---
_Generated by [Claude
Code](https://claude.ai/code/session_01MJdfXv5E5gwGD2yaJq8jRM)_

Co-authored-by: Claude <noreply@anthropic.com>