ci: add Coq build oracle workflow (closes #227)#231
Merged
Conversation
Adds `.github/workflows/coq-build.yml` so the standing directive "for proof PRs, the toolchain build is the ONLY merge oracle" (`feedback_proof_pr_build_oracle_is_only_truth`) is actually machine-enforced rather than relying on human discipline. Today's incident (#226 hotfix, #227 issue, #229 follow-up): PR #224 admin-merged 7 min after PR #223 because the existing 17 governance/scanner checks were all green, while `coqc` was not run at all. Result: `tfuneff_lambda_retype_l1_m already exists` on main, blocking every other proof PR until the hotfix landed. The new workflow: - Runs on PR + push-to-main, scoped to `formal/**` paths (cheap, only fires when proofs actually change). - Uses `coqorg/coq:8.18` (matches `formal/Justfile`'s pinned toolchain; matches local dev setup). - Builds `formal/_CoqProject` via the same `coq_makefile -f _CoqProject -o build.mk && make -f build.mk` recipe `formal/Justfile all` uses — drops the `just` dependency since it's two lines. - After a clean build, runs `Print Assumptions preservation_l2_via_l1` and prints the tail of its output. Any new `Admitted.` / `Axiom` feeding the load-bearing Phase D top-level theorem shows up as a workflow-output diff between PRs — the canonical "did this PR introduce a new admit" signal. Branch-protection update to mark this REQUIRED is the owner's call; the workflow file itself ships first so the gate exists at all (one green run on main establishes the check identity for branch-rule configuration). Closes #227 Refs #229 (sibling — captures the broader "no Coq oracle" surface the present workflow now closes) Refs #226 (the hotfix that motivated this — without #227 in place, another #224-style admin-merge could break main again tomorrow) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔍 Hypatia Security ScanFindings: 75 issues detected
View findings[
{
"reason": "Action perpolymath/standards/.github/workflows/governance-reusable.yml@main\n needs attention",
"type": "unpinned_action",
"file": "governance.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in abi-verify.yml",
"type": "unknown",
"file": "abi-verify.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in boj-build.yml",
"type": "unknown",
"file": "boj-build.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in codeql.yml",
"type": "unknown",
"file": "codeql.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in coq-build.yml",
"type": "unknown",
"file": "coq-build.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in governance.yml",
"type": "unknown",
"file": "governance.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in hypatia-scan.yml",
"type": "unknown",
"file": "hypatia-scan.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in instant-sync.yml",
"type": "unknown",
"file": "instant-sync.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in mirror.yml",
"type": "unknown",
"file": "mirror.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in rust-ci.yml",
"type": "unknown",
"file": "rust-ci.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
}
]Powered by Hypatia Neurosymbolic CI/CD Intelligence |
This was referenced May 30, 2026
Merged
hyperpolymath
added a commit
that referenced
this pull request
May 30, 2026
…rexample for TFunEff β (#234) ## Summary Ships a **mechanised proof** that preservation_l2 fails for TFunEff substituends inside fresh-region scopes, plus a design-doc addendum prescribing the conditional Phase 4c closure path. This is the L1/L2 analogue of \`formal/Counterexample.v\` (which witnesses the legacy preservation's soundness gap), providing rigorous justification for the **conditional** preservation_l2 statement that Phase 4c will ship. ## What this PR ships ### \`formal/Counterexample_L2.v\` (new) Three Qed lemmas (**zero axioms, zero admits**) witnessing the soundness gap: - \`e_before_typed\` — input \`EApp outer v2\` types via T_App_L2_Eff at \`TFunEff TUnit TUnit [] []\`. - \`e_step\` — operational step S_App_Fun reduces to \`ERegion r2 v2\`. - \`e_after_untypable\` — no L1 derivation exists for the β-result at the same outer type. **Configuration:** \`\`\`coq T1_inner = TFunEff TUnit TUnit [] [] (* R_in_v = [] *) outer = ELam T1_inner (ERegion r2 (EVar 0)) (* body introduces fresh r2 *) v2 = ELam TUnit EUnit (* a value of type T1_inner *) e_before = EApp outer v2 (* well-typed *) e_after = ERegion r2 v2 (* β-result, NOT well-typed *) \`\`\` **Structural mechanism:** T_Region_L1's \`~ In r (free_regions T)\` premise prevents the fresh r from being in \`R_in_v ⊆ free_regions(TFunEff)\`, so the post-β term's inner TFunEff value cannot re-type at the new threaded R = \`r :: R\`. ### \`formal/SUBST-LEMMA-GENERALIZATION-DESIGN.md\` addendum New §"Phase 4c addendum (2026-05-30) — conditional preservation_l2 for TFunEff β" documents the counterexample and prescribes three resolution paths: 1. **Conditional preservation_l2 (recommended)** — Phase 3b lemma takes the \`regions_introduced_by(e) ⊆ R_in_v\` precondition; Phase 4c β-case requires it; programs outside form a documented soundness-gap class. Aligns with the \`Counterexample.v\` precedent for legacy preservation. 2. **Region-polymorphic TFunEff** — type-system change deferred to a future redesign. 3. **L2 region-transfer combinator** — adds L2 expressiveness in a future PR. ### \`formal/_CoqProject\` Adds \`Counterexample_L2.v\` after \`TypingL2.v\` in the build order. ### \`.machine_readable/6a2/STATE.a2ml\` \`next_action\` shifted to Phase 3b implementation (option-(a) precondition codified by this PR's counterexample); \`last_action\` records the counterexample landing. ## Build + assumption audit - \`coqc -R . Ephapax\` clean rebuild across all 11 .v files. - \`Print Assumptions e_before_typed\` / \`e_step\` / \`e_after_untypable\` → "Closed under the global context" for all three. **Zero axioms, zero admits.** ## Owner-directive compliance (CLAUDE.md 2026-05-27) - ✅ Does not modify \`formal/Semantics.v\` or \`formal/Counterexample.v\` (the legacy soundness-gap artifacts). - ✅ Does not patch \`formal/Typing.v\` (legacy judgment). - ✅ Does not close any residual \`formal/Semantics_L1.v\` admit. - ✅ Adds NEW infrastructure (a new file) orthogonal to legacy admits, mirroring the precedent of \`Counterexample.v\` for the legacy preservation. - ✅ No new \`Axiom\` or \`Admitted\` markers. ## Why this matters Before Phase 3b implementation (~400 lines per PR #230's analysis), this counterexample **mechanically witnesses** the structural constraint the lemma must satisfy. The conditional preservation_l2 design is now rigorously justified rather than asserted on paper. ## Test plan - [x] \`coqc -R . Ephapax\` rebuild across all 11 .v files, zero errors / warnings. - [x] \`Print Assumptions\` audit shows zero axioms on all three counterexample lemmas. - [x] Configuration in \`Counterexample_L2.v\` matches the on-paper analysis in #225 comment 4583148707. - [x] First PR to be gated by the new CI Coq build oracle from PR #231. ## Refs - ephapax#225 — Phase 3b tracking issue (option (a) selected). - ephapax#225 comment 4583148707 — on-paper prototype that motivated this mechanical formalisation. - PR #230 — \`regions_introduced_by\` scaffold (option (a)'s syntactic helper). - PR #228 — Phase 4a (linear T1, no soundness gap). - PR #233 — Phase 4b (ground-non-linear T1, no soundness gap). - PR #231 — CI Coq build oracle (this PR is its first beneficiary). - \`formal/Counterexample.v\` — legacy preservation counterexample (precedent for this file's structure). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 30, 2026
hyperpolymath
added a commit
that referenced
this pull request
May 30, 2026
…243) ## Summary - Expands the \`coq-build.yml\` Print Assumptions audit from 1 theorem to 4 — surfaces \`Admitted\` / \`Axiom\` slippage on EVERY Phase D top-level result, not just \`preservation_l2_via_l1\`. - Single \`coqtop\` session amortises ML startup across four prints. - Tail bumped from 40 to 160 lines to fit four print blocks (~40 lines each). ## What changes The Print Assumptions step now invokes all four Phase D top-level results: | Theorem | File | Landed in | |---|---|---| | \`preservation_l1\` | \`Semantics_L1.v\` | (pre-Phase D — Theorem) | | \`preservation_l2_via_l1\` | \`TypingL2.v\` | (L2-hybrid landing) | | \`preservation_l2_app_eff_beta_linear\` | \`TypingL2.v\` | #228 | | \`preservation_l2_app_eff_beta_ground_nonlinear\` | \`TypingL2.v\` | #233 | ## Why this matters \`coq-build.yml\` shipped in #231 (closed #227) covered only \`preservation_l2_via_l1\`. The original acceptance criteria called for the four-theorem set so that **every** Phase D result the build oracle backs is audited per-PR. Without this, a new \`Admitted.\` in a β-case lemma would slip through with green CI as long as \`preservation_l2_via_l1\`'s assumption set didn't change. ## Relationship to other in-flight work - **#236** (queued, auto-merge SQUASH armed): adds \`container.options: --user root\` to fix EACCES on \`actions/checkout\`. Touches a different section of \`coq-build.yml\` than this PR — no expected conflict either way. Once #236 lands, the workflow becomes actually-runnable; once this lands, the runnable workflow audits four theorems instead of one. - **Ruleset gap** (owner action): the \`Base\` ruleset has \`deletion\`, \`non_fast_forward\`, \`required_signatures\`, \`pull_request\` rules but **no \`required_status_checks\` rule**. Even with this PR + #236 merged, \`Coq build — formal/_CoqProject\` remains informational. The standing rule \`[[feedback_proof_pr_build_oracle_is_only_truth]]\` remains load-bearing until the ruleset adds the check as required. ## Test plan - [ ] CI: \`Coq build — formal/_CoqProject\` reaches the Print Assumptions step (gated on #236 landing first to fix EACCES, then the four-theorem step runs) - [ ] Run log shows four \`Print Assumptions <theorem>.\` outputs in sequence (one consolidated coqtop transcript) - [ ] No build regression — \`coq_makefile\` step still passes (no Coq source changed) ## Refs - Closes the per-theorem coverage gap from #227's original acceptance criteria - Stacks complementarily on #236 (different file region) - Standing rule [[feedback_proof_pr_build_oracle_is_only_truth]] still applies 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath
added a commit
that referenced
this pull request
May 30, 2026
…locker) (#254) ## Summary Closes the path-filter blocker that prevents #244 (the owner action that adds `required_status_checks` to the Base ruleset on `main`) from landing without collateral damage. The `Coq build — formal/_CoqProject` check shipped in #231 is path-filtered to `formal/**` + the workflow file itself. Once the Base ruleset (id `14285235`) gains a `required_status_checks` rule naming that context, every PR that does NOT touch `formal/**` would leave the check "expected but not reported" and be blocked indefinitely. Fix: refactor to the always-on aggregator pattern. The single new context that goes into the ruleset is `coq-build-gate`; it ALWAYS runs and reports the right verdict regardless of which paths changed. ## Mechanics 1. `on:` block drops the `paths:` filter — workflow triggers on every PR and every push to main. 2. New `detect-relevant-changes` job diffs against the PR base (`github.event.pull_request.base.sha` / `github.event.before`) and sets `outputs.relevant = true|false` using the same path list that lived in the on-trigger (`formal/` + this workflow). 3. The existing `coq-build` job gains `needs: detect-relevant-changes` + `if: needs.detect-relevant-changes.outputs.relevant == 'true'`. It skips cleanly on irrelevant PRs. 4. New `coq-build-gate` job (`if: always()`, `needs: [detect, coq-build]`): - SUCCESS immediately if `RELEVANT != 'true'`. - FAILURE if `detect-relevant-changes` itself didn't succeed (fail-safe). - SUCCESS iff `coq-build.result == 'success'` on a relevant change. - FAILURE otherwise. The `coq-build` job itself is untouched: same `coqorg/coq:8.18` container with `--user root`, same `coq_makefile -f _CoqProject -o build.mk && make`, same `Print Assumptions` audit of the four Phase D theorems (`preservation_l1`, `preservation_l2_via_l1`, `preservation_l2_app_eff_beta_linear`, `preservation_l2_app_eff_beta_ground_nonlinear`). ## Ruleset migration (post-merge, separate operation) Once this PR is green on main, the #244 owner action becomes: ```bash # Append required_status_checks naming `coq-build-gate` (NOT the underlying # `Coq build — formal/_CoqProject`, because that one skips on irrelevant PRs # and would re-introduce the exact "expected but not reported" block this # PR exists to avoid). gh api -X PUT repos/hyperpolymath/ephapax/rulesets/14285235 --input /tmp/patch.json ``` ## Test plan - [ ] `coq-build-gate` green on this PR (path is `.github/workflows/coq-build.yml` → relevant=true → coq-build runs → gate aggregates). - [ ] After merge, a no-formal-change PR shows `coq-build-gate` green while `coq-build` is "Skipped". - [ ] After merge, the #244 owner action switches the Base ruleset's required context from absent to `coq-build-gate`. - [ ] An intentional broken-build PR (touching `formal/`) shows `coq-build-gate` red and admin-merge blocked. ## Prior-art panic-attack#90 (already landed) applied the same pattern to `chapel-ci.yml`. This PR is the single-job variant of the same refactor. ## Refs - #244 — owner action this PR unblocks - #231 — `coq-build.yml` ships - #236 — EACCES container fix - #243 — Print Assumptions audit expansion - panic-attack#90 — prior-art (chapel-ci aggregator) Generated with [Claude Code](https://claude.com/claude-code)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #227 — adds
.github/workflows/coq-build.ymlso the standing directive "for proof PRs, the toolchain build is the ONLY merge oracle" is machine-enforced.Today's incident motivating this
PR #224 admin-merged 7 min after PR #223; all 17 governance/scanner checks were green;
coqcwas not run; main broke withtfuneff_lambda_retype_l1_m already exists. Hotfix #226 restored compilation; #227 was filed in parallel asking for the oracle gap to be closed at workflow-level.What the workflow does
formal/**paths (only fires when proofs change).coqorg/coq:8.18(matchesformal/Justfile's pinned toolchain).formal/Justfile alluses:Print Assumptions preservation_l2_via_l1and prints the tail. Any newAdmitted./Axiomslippage shows as a workflow-output diff.What this PR does NOT do
Test plan
formal/_CoqProjectcleanPrint Assumptions preservation_l2_via_l1output appears in the logCoq build — formal/_CoqProjectas a required checkRefs
🤖 Generated with Claude Code