feat(governance): add hypatia-scan-reusable.yml — biggest LOC leverage of the reusable trilogy #193

Open
hyperpolymath wants to merge 2 commits into main from feat/hypatia-scan-reusable

@hyperpolymath
Owner

Summary

Fourth and largest-leverage reusable in the workflow-convergence campaign (#168, #174, #187, #190, #192 → this). Targets hypatia-scan.yml, the 416-line estate-wide Hypatia neurosymbolic security scanner.

Drift survey

Full pagination of gh api /search/code against org:hyperpolymath:

| Metric | Value |
|---|---|
| Total deployments | 255 |
| Unique blob SHAs | 30 |
| Structural drift | 11.8% (lowest of all 5 surveyed templates) |
| Top 5 SHAs coverage | 213/255 (83.5%) |
| Top-SHA share | 100 repos (39.2%) |
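
The figures in the table can be reproduced from a per-SHA deployment count. The following is a minimal sketch, assuming the paginated search results have already been reduced to a mapping of blob SHA to number of repos, and assuming "structural drift" means unique blob SHAs as a share of deployments (which reproduces the stated 11.8%). The `drift_metrics` helper and the sample distribution are hypothetical; only the totals 255 / 30 / 213 / 100 come from the survey.

```python
def drift_metrics(sha_counts, top_k=5):
    """Summarise structural drift from a {blob_sha: deployment_count} mapping."""
    total = sum(sha_counts.values())
    unique = len(sha_counts)
    ranked = sorted(sha_counts.values(), reverse=True)
    top_k_total = sum(ranked[:top_k])
    return {
        "total": total,
        "unique_shas": unique,
        # Assumption: drift = unique blob SHAs / total deployments.
        "drift_pct": round(100 * unique / total, 1),
        "top_k_total": top_k_total,
        "top_k_pct": round(100 * top_k_total / total, 1),
        "top_sha_pct": round(100 * ranked[0] / total, 1),
    }


# Hypothetical distribution, chosen only so that it matches the reported totals:
# 5 large clusters (213 repos) plus a long tail of 25 SHAs (42 repos).
counts = [100, 50, 30, 20, 13] + [2] * 17 + [1] * 8
sample = {f"sha{i:02d}": n for i, n in enumerate(counts)}

print(drift_metrics(sample))
```

Grouping by blob SHA rather than by line count keeps byte-identical copies in a single bucket, so the long tail shows up as many small buckets rather than as noise in one number.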

Feature variance: zero

Sampled top 7 + long-tail 10 variants — every single one carries exactly one scan job. Line counts range 207-416, but this is pure propagation lag: older repos run an earlier slimmer version of the same monolithic job; newer repos run the 413-416-line current canonical.

No customisation, no per-repo extras, no missing jobs in the long tail.

Leverage — biggest of the convergence campaign

| PR | Per-repo lines | Wrappers | LOC retired |
|---|---|---|---|
| #187 mirror | 145 → 12 | ~267 | ~35,500 |
| #190 secret-scanner | ~80 → 12 | ~275 | ~19,000 |
| #192 codeql | 49 → 5 | ~245 | ~10,800 |
| #193 hypatia-scan | 416 → 16 | ~235 | ~94,000 |

This single PR's downstream sweep retires more workflow LOC than #187 + #190 + #192 combined.
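
The comparison can be checked with a few lines of arithmetic. This sketch assumes the "LOC retired" column is (lines before − lines after) × wrappers, which reproduces the table's figures to rounding; the `net_loc_retired` helper is hypothetical, and the inputs are the table's own approximate values.

```python
def net_loc_retired(lines_before, lines_after, wrappers):
    """Lines removed per repo, multiplied by the number of repos converted."""
    return (lines_before - lines_after) * wrappers


# (lines before, lines after, approximate wrapper count), taken from the table.
campaign = {
    "#187 mirror": (145, 12, 267),
    "#190 secret-scanner": (80, 12, 275),
    "#192 codeql": (49, 5, 245),
    "#193 hypatia-scan": (416, 16, 235),
}

retired = {pr: net_loc_retired(*row) for pr, row in campaign.items()}
earlier_total = sum(loc for pr, loc in retired.items() if pr != "#193 hypatia-scan")

for pr, loc in retired.items():
    print(f"{pr}: {loc:,}")
print(f"#187 + #190 + #192 combined: {earlier_total:,}")
```

Under that assumption the three earlier sweeps total roughly 65,000 lines against roughly 94,000 for this one. Counting the full 416 lines per repo instead of the net 400 gives about 97,800, which is where a gross figure of ~98,000 would come from.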

Design

Zero inputs except runs-on. The scan job body is byte-identical to the canonical hypatia-scan.yml — no per-repo values to parameterise; every interpolation is ${{ github.* }} or ${{ secrets.* }} which resolve in the caller context.

Caller MUST:

  • Use secrets: inherit — so GITHUB_TOKEN + HYPATIA_DISPATCH_PAT flow through. Without inherit, the Phase-2 gitbot-fleet submission step silently no-ops (it's continue-on-error-guarded), and the DependabotAlerts rule loses read access (HTTP 403).
  • Grant contents: read + security-events: write + pull-requests: write at the call-site permissions block. Called-workflow permissions are CAPPED by caller — without security-events: write at the call site, SARIF upload to Security → Code scanning silently fails.
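
Because both failure modes above are silent, a pre-merge check on each caller wrapper may be worth having. The sketch below is a deliberately naive line matcher, not a YAML parser: it only confirms that each required setting appears on some non-comment line, and does not verify which block the line sits in. The `missing_requirements` helper and its pattern table are hypothetical.

```python
import re

# Required caller-side settings, keyed by a human-readable label.
REQUIRED_LINES = {
    "secrets: inherit": r"secrets:\s*inherit",
    "contents: read": r"contents:\s*read",
    "security-events: write": r"security-events:\s*write",
    "pull-requests: write": r"pull-requests:\s*write",
}


def missing_requirements(wrapper_text):
    """Return the required settings that no non-comment line of the wrapper declares."""
    # Drop trailing comments and indentation; assumes '#' never appears inside a value.
    lines = [line.split("#", 1)[0].strip() for line in wrapper_text.splitlines()]
    missing = []
    for label, pattern in REQUIRED_LINES.items():
        if not any(re.fullmatch(pattern, line) for line in lines):
            missing.append(label)
    return missing


# A wrapper that forgot two of the four requirements.
incomplete = """\
permissions:
  contents: read
  security-events: write
jobs:
  scan:
    uses: hyperpolymath/standards/.github/workflows/hypatia-scan-reusable.yml@<sha>
"""
print(missing_requirements(incomplete))
```

An empty list means all four settings were found; a real sweep would probably want a proper YAML parse so that a permission declared under the wrong key is not counted.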

Caller wrapper shape (post-merge)

# SPDX-License-Identifier: PMPL-1.0-or-later
name: Hypatia Security Scan
on:
  push:
    branches: [ main, master, develop ]
  pull_request:
    branches: [ main, master ]
  schedule:
    - cron: '0 0 * * 0'
  workflow_dispatch:
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
permissions:
  contents: read
  security-events: write
  pull-requests: write
jobs:
  scan:
    uses: hyperpolymath/standards/.github/workflows/hypatia-scan-reusable.yml@<sha>
    secrets: inherit

~16 lines per repo, replacing ~250-416 lines (depending on which propagation snapshot the repo currently carries).

Rollout plan

NOT started in this PR — owner-gated, same as #187 / #190 / #192.

| Wave | Repos | Action |
|---|---|---|
| 1: current-canonical | ~211 | Top 6 SHAs at 413-416 lines. Pure mechanical wrapper. |
| 2: standardize-up older | ~24 | 207-253-line older versions; convert to same wrapper; body of the scan job auto-upgrades on next workflow run. |
| 3: per-repo review | ~2 | hypatia repo itself (345 lines, likely a development snapshot), standards (the canonical source). Exclude from sweep. |

Total expected sweep: ~235 mechanical wrappers (92.2%) + 2 excluded + minor per-repo review.
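
The wave assignment can be expressed as a small rule on repo name and current line count. This is an illustrative sketch only, not the sweep-classifier scripts from the campaign; the `rollout_wave` helper is hypothetical, and routing any line count outside the two surveyed ranges to per-repo review is an assumption.

```python
# Wave 3: handled by hand and excluded from the mechanical sweep.
EXCLUDED_REPOS = {"hypatia", "standards"}


def rollout_wave(repo, line_count):
    """Map a repo's current hypatia-scan.yml line count to a rollout wave (1, 2 or 3)."""
    if repo in EXCLUDED_REPOS:
        return 3
    if 413 <= line_count <= 416:
        return 1  # current canonical: pure mechanical wrapper
    if 207 <= line_count <= 253:
        return 2  # older snapshot: standardize up to the same wrapper
    return 3  # assumption: anything outside the surveyed ranges gets reviewed


# Repo names other than "hypatia" and "standards" are made up for the example.
for repo, lines in [("example-a", 416), ("example-b", 230), ("hypatia", 345)]:
    print(repo, lines, "-> wave", rollout_wave(repo, lines))
```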

Pattern hardening

Note on parallel-session count discrepancy

A separate session's audit memory ([[project_foundational_workflow_survey_2026_05_26]]) recorded "hypatia-scan: 702 copies × 416 lines × HIGH homogeneity". My methodology (gh api /search/code paginated with filename:hypatia-scan.yml path:.github/workflows org:hyperpolymath) returns 255. The HIGH homogeneity finding agrees in both surveys; the 702 figure is likely scheduled-run count or includes historical branches. Either way the leverage doesn't change much — this is the biggest single LOC removal in the campaign.

🤖 Generated with Claude Code

…e of the reusable trilogy

Closes the four-PR reusable-workflow convergence campaign:
  #168, #174, #187 (mirror) → #190 (secret-scanner) → #192 (codeql)
  → this (#193 hypatia-scan).

Drift survey (gh api /search/code paginated over org:hyperpolymath,
blob-SHA grouped):
- 255 deployments
- 30 unique blob SHAs (11.8% drift — LOWEST of all 5 surveyed templates)
- Top 5 SHAs cover 213/255 (83.5%)
- Top SHA alone covers 100 repos (39.2%)

Feature variance: zero. Every sampled variant (top 7 + long-tail 10)
carries exactly one `scan` job; line counts range 207-416 but this is
pure propagation lag — older repos carry an earlier slimmer version of
the same monolithic job, newer repos carry the 413-416-line canonical.

Leverage: 416-line canonical × ~235 mechanical wrappers retired
≈ 98,000 lines of duplicated workflow code removed estate-wide (gross;
~94,000 net of the 16-line wrappers). This is roughly 5× the
secret-scanner sweep (~19k) and 3× the mirror sweep (~35k).

Design: zero inputs except `runs-on`. The scan job body is byte-
identical to the canonical hypatia-scan.yml — no per-repo values to
parameterise; everything is `${{ github.* }}` / `${{ secrets.* }}`
which resolve in the caller context.

Caller MUST use `secrets: inherit` so GITHUB_TOKEN +
HYPATIA_DISPATCH_PAT flow through, AND grant `security-events: write`
+ `pull-requests: write` at the call-site permissions block (called
workflow permissions are capped by caller).

Per-call note: the long-tail 250-line variants (~25 repos) are older
versions of the same workflow, NOT customised. Standardize-up safely
in the wrapper sweep.

Same as #192 (codeql-reusable): auto-merge enabled but zero workflow
runs against the head commit. Pushing an empty commit to re-trigger CI.

hyperpolymath added a commit that referenced this pull request May 26, 2026
…ergence set (#205)

## Summary

5th and final reusable in the workflow convergence campaign (see #199
for the meta-doc). Consolidates the per-repo `scorecard.yml` workflow.

## Drift signal (full pagination + per-repo verified)

- **258** top-level estate deployments
- **626** nested copies in monorepos (asdf-tool-plugins,
developer-ecosystem, ssg-collection, standards, ambientops,
julia-ecosystem, etc. — Layer-2 truncation discovery via #204's helper)
- **46** unique blob SHAs / 17.8% structural drift
- Top SHA covers **100/258 (38.8%)** — highest dominant-cluster of the 5
campaigns
- Top 7 SHAs cover ~80%
- **100% mechanical drift, ZERO feature variance** — SPDX header
(PMPL-1.0 / PMPL-1.0-or-later / MPL-2.0), `upload-sarif` SHA-pin churn,
`permissions: read-all` vs `contents: read` wording

## Design

- One input: `runs-on` (default ubuntu-latest)
- No `secrets: inherit` — Scorecard uses `GITHUB_TOKEN` directly
- Caller MUST grant `security-events: write` + `id-token: write` on the
calling job (called-workflow permissions are capped by caller)
- Caller keeps own `on:` triggers + `concurrency:` group

## Per Layer-3 caveat from the campaign meta-doc

Nested workflows are inert — GitHub Actions only runs
`.github/workflows/` at the repo root. Sweeping the 626 nested copies is
single-source-of-truth cleanup, not security hardening.
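
The root-versus-nested distinction can be checked per path. A minimal sketch, assuming repo-relative POSIX paths as input; the `is_active_workflow` helper and the nested example path are hypothetical.

```python
from pathlib import PurePosixPath


def is_active_workflow(path):
    """True if the file sits directly in .github/workflows/ at the repo root."""
    parts = PurePosixPath(path).parts
    return (
        len(parts) == 3
        and parts[:2] == (".github", "workflows")
        and parts[2].endswith((".yml", ".yaml"))
    )


paths = [
    ".github/workflows/scorecard.yml",              # root level: runs
    "plugins/foo/.github/workflows/scorecard.yml",  # nested copy: inert
]
for p in paths:
    print(p, "->", "active" if is_active_workflow(p) else "inert")
```

Partitioning the sweep list this way separates the copies that actually execute from the ones that are only duplicated files.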

## Campaign convergence set (closes with this PR)

| PR | Template |
|---|---|
| #187 | mirror-reusable.yml |
| #190 | secret-scanner-reusable.yml |
| #192 | codeql-reusable.yml |
| #193 | hypatia-scan-reusable.yml |
| #194 | sweep-classifier scripts |
| #199 | campaign meta-doc |
| #204 | list-workflow-paths.sh (bypass /search/code undercount) |
| **this** | **scorecard-reusable.yml** |

## Test plan

- [ ] Wrapper sweep (~258 top-level + ~626 nested) — owner-gated; not
part of this PR
- [ ] Update classify-* scripts to consume helper TSV — follow-up

🤖 Generated with [Claude Code](https://claude.com/claude-code)