Skip to content

ci: add Coq build oracle workflow (closes #227)#231

Merged
hyperpolymath merged 1 commit into
mainfrom
ci/coq-build-oracle-227
May 30, 2026
Merged

ci: add Coq build oracle workflow (closes #227)#231
hyperpolymath merged 1 commit into
mainfrom
ci/coq-build-oracle-227

Conversation

@hyperpolymath
Copy link
Copy Markdown
Owner

Summary

Closes #227 — adds .github/workflows/coq-build.yml so the standing directive "for proof PRs, the toolchain build is the ONLY merge oracle" is machine-enforced.

Today's incident motivating this

PR #224 admin-merged 7 min after PR #223; all 17 governance/scanner checks were green; coqc was not run; main broke with tfuneff_lambda_retype_l1_m already exists. Hotfix #226 restored compilation; #227 was filed in parallel asking for the oracle gap to be closed at workflow-level.

What the workflow does

  • Triggers on PR + push-to-main, scoped to formal/** paths (only fires when proofs change).
  • Container: coqorg/coq:8.18 (matches formal/Justfile's pinned toolchain).
  • Builds via the same recipe formal/Justfile all uses:
    coq_makefile -f _CoqProject -o build.mk
    make -f build.mk
    
  • After a clean build, runs Print Assumptions preservation_l2_via_l1 and prints the tail. Any new Admitted. / Axiom slippage shows as a workflow-output diff.

What this PR does NOT do

  • Branch-protection / required-check designation is owner's call (admin-only in GitHub Settings). The workflow file ships first so the check identity exists; one green run on main lets you mark it required.

Test plan

  • First CI run on this PR builds formal/_CoqProject clean
  • Print Assumptions preservation_l2_via_l1 output appears in the log
  • (Manual, post-merge) Mark Coq build — formal/_CoqProject as a required check

Refs

🤖 Generated with Claude Code

Adds `.github/workflows/coq-build.yml` so the standing directive "for
proof PRs, the toolchain build is the ONLY merge oracle"
(`feedback_proof_pr_build_oracle_is_only_truth`) is actually
machine-enforced rather than relying on human discipline.

Today's incident (#226 hotfix, #227 issue, #229 follow-up): PR #224
admin-merged 7 min after PR #223 because the existing 17
governance/scanner checks were all green, while `coqc` was not run at
all. Result: `tfuneff_lambda_retype_l1_m already exists` on main,
blocking every other proof PR until the hotfix landed.

The new workflow:

- Runs on PR + push-to-main, scoped to `formal/**` paths (cheap, only
  fires when proofs actually change).
- Uses `coqorg/coq:8.18` (matches `formal/Justfile`'s pinned
  toolchain; matches local dev setup).
- Builds `formal/_CoqProject` via the same `coq_makefile -f
  _CoqProject -o build.mk && make -f build.mk` recipe `formal/Justfile
  all` uses — drops the `just` dependency since it's two lines.
- After a clean build, runs `Print Assumptions
  preservation_l2_via_l1` and prints the tail of its output. Any new
  `Admitted.` / `Axiom` feeding the load-bearing Phase D top-level
  theorem shows up as a workflow-output diff between PRs — the
  canonical "did this PR introduce a new admit" signal.

Branch-protection update to mark this REQUIRED is the owner's call;
the workflow file itself ships first so the gate exists at all (one
green run on main establishes the check identity for branch-rule
configuration).

Closes #227
Refs #229 (sibling — captures the broader "no Coq oracle" surface
the present workflow now closes)
Refs #226 (the hotfix that motivated this — without #227 in place,
another #224-style admin-merge could break main again tomorrow)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hyperpolymath hyperpolymath enabled auto-merge (squash) May 30, 2026 14:36
@github-actions
Copy link
Copy Markdown

🔍 Hypatia Security Scan

Findings: 75 issues detected

Severity Count
🔴 Critical 13
🟠 High 13
🟡 Medium 49

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Action perpolymath/standards/.github/workflows/governance-reusable.yml@main\n needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in abi-verify.yml",
    "type": "unknown",
    "file": "abi-verify.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in boj-build.yml",
    "type": "unknown",
    "file": "boj-build.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in codeql.yml",
    "type": "unknown",
    "file": "codeql.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in coq-build.yml",
    "type": "unknown",
    "file": "coq-build.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in governance.yml",
    "type": "unknown",
    "file": "governance.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in hypatia-scan.yml",
    "type": "unknown",
    "file": "hypatia-scan.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in instant-sync.yml",
    "type": "unknown",
    "file": "instant-sync.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in mirror.yml",
    "type": "unknown",
    "file": "mirror.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in rust-ci.yml",
    "type": "unknown",
    "file": "rust-ci.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

@hyperpolymath hyperpolymath merged commit 6db1f6d into main May 30, 2026
17 of 18 checks passed
@hyperpolymath hyperpolymath deleted the ci/coq-build-oracle-227 branch May 30, 2026 14:37
hyperpolymath added a commit that referenced this pull request May 30, 2026
…rexample for TFunEff β (#234)

## Summary

Ships a **mechanised proof** that preservation_l2 fails for TFunEff
substituends inside fresh-region scopes, plus a design-doc addendum
prescribing the conditional Phase 4c closure path.

This is the L1/L2 analogue of \`formal/Counterexample.v\` (which
witnesses the legacy preservation's soundness gap), providing rigorous
justification for the **conditional** preservation_l2 statement that
Phase 4c will ship.

## What this PR ships

### \`formal/Counterexample_L2.v\` (new)

Three Qed lemmas (**zero axioms, zero admits**) witnessing the soundness
gap:

- \`e_before_typed\` — input \`EApp outer v2\` types via T_App_L2_Eff at
\`TFunEff TUnit TUnit [] []\`.
- \`e_step\` — operational step S_App_Fun reduces to \`ERegion r2 v2\`.
- \`e_after_untypable\` — no L1 derivation exists for the β-result at
the same outer type.

**Configuration:**

\`\`\`coq
T1_inner = TFunEff TUnit TUnit [] [] (* R_in_v = [] *)
outer = ELam T1_inner (ERegion r2 (EVar 0)) (* body introduces fresh r2
*)
v2 = ELam TUnit EUnit (* a value of type T1_inner *)
e_before = EApp outer v2          (* well-typed *)
e_after  = ERegion r2 v2          (* β-result, NOT well-typed *)
\`\`\`

**Structural mechanism:** T_Region_L1's \`~ In r (free_regions T)\`
premise prevents the fresh r from being in \`R_in_v ⊆
free_regions(TFunEff)\`, so the post-β term's inner TFunEff value cannot
re-type at the new threaded R = \`r :: R\`.

### \`formal/SUBST-LEMMA-GENERALIZATION-DESIGN.md\` addendum

New §"Phase 4c addendum (2026-05-30) — conditional preservation_l2 for
TFunEff β" documents the counterexample and prescribes three resolution
paths:

1. **Conditional preservation_l2 (recommended)** — Phase 3b lemma takes
the \`regions_introduced_by(e) ⊆ R_in_v\` precondition; Phase 4c β-case
requires it; programs outside form a documented soundness-gap class.
Aligns with the \`Counterexample.v\` precedent for legacy preservation.
2. **Region-polymorphic TFunEff** — type-system change deferred to a
future redesign.
3. **L2 region-transfer combinator** — adds L2 expressiveness in a
future PR.

### \`formal/_CoqProject\`

Adds \`Counterexample_L2.v\` after \`TypingL2.v\` in the build order.

### \`.machine_readable/6a2/STATE.a2ml\`

\`next_action\` shifted to Phase 3b implementation (option-(a)
precondition codified by this PR's counterexample); \`last_action\`
records the counterexample landing.

## Build + assumption audit

- \`coqc -R . Ephapax\` clean rebuild across all 11 .v files.
- \`Print Assumptions e_before_typed\` / \`e_step\` /
\`e_after_untypable\` → "Closed under the global context" for all three.
**Zero axioms, zero admits.**

## Owner-directive compliance (CLAUDE.md 2026-05-27)

- ✅ Does not modify \`formal/Semantics.v\` or
\`formal/Counterexample.v\` (the legacy soundness-gap artifacts).
- ✅ Does not patch \`formal/Typing.v\` (legacy judgment).
- ✅ Does not close any residual \`formal/Semantics_L1.v\` admit.
- ✅ Adds NEW infrastructure (a new file) orthogonal to legacy admits,
mirroring the precedent of \`Counterexample.v\` for the legacy
preservation.
- ✅ No new \`Axiom\` or \`Admitted\` markers.

## Why this matters

Before Phase 3b implementation (~400 lines per PR #230's analysis), this
counterexample **mechanically witnesses** the structural constraint the
lemma must satisfy. The conditional preservation_l2 design is now
rigorously justified rather than asserted on paper.

## Test plan

- [x] \`coqc -R . Ephapax\` rebuild across all 11 .v files, zero errors
/ warnings.
- [x] \`Print Assumptions\` audit shows zero axioms on all three
counterexample lemmas.
- [x] Configuration in \`Counterexample_L2.v\` matches the on-paper
analysis in #225 comment 4583148707.
- [x] First PR to be gated by the new CI Coq build oracle from PR #231.

## Refs

- ephapax#225 — Phase 3b tracking issue (option (a) selected).
- ephapax#225 comment 4583148707 — on-paper prototype that motivated
this mechanical formalisation.
- PR #230 — \`regions_introduced_by\` scaffold (option (a)'s syntactic
helper).
- PR #228 — Phase 4a (linear T1, no soundness gap).
- PR #233 — Phase 4b (ground-non-linear T1, no soundness gap).
- PR #231 — CI Coq build oracle (this PR is its first beneficiary).
- \`formal/Counterexample.v\` — legacy preservation counterexample
(precedent for this file's structure).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request May 30, 2026
…243)

## Summary

- Expands the \`coq-build.yml\` Print Assumptions audit from 1 theorem
to 4 — surfaces \`Admitted\` / \`Axiom\` slippage on EVERY Phase D
top-level result, not just \`preservation_l2_via_l1\`.
- Single \`coqtop\` session amortises ML startup across four prints.
- Tail bumped from 40 to 160 lines to fit four print blocks (~40 lines
each).

## What changes

The Print Assumptions step now invokes all four Phase D top-level
results:

| Theorem | File | Landed in |
|---|---|---|
| \`preservation_l1\` | \`Semantics_L1.v\` | (pre-Phase D — Theorem) |
| \`preservation_l2_via_l1\` | \`TypingL2.v\` | (L2-hybrid landing) |
| \`preservation_l2_app_eff_beta_linear\` | \`TypingL2.v\` | #228 |
| \`preservation_l2_app_eff_beta_ground_nonlinear\` | \`TypingL2.v\` |
#233 |

## Why this matters

\`coq-build.yml\` shipped in #231 (closed #227) covered only
\`preservation_l2_via_l1\`. The original acceptance criteria called for
the four-theorem set so that **every** Phase D result the build oracle
backs is audited per-PR. Without this, a new \`Admitted.\` in a β-case
lemma would slip through with green CI as long as
\`preservation_l2_via_l1\`'s assumption set didn't change.

## Relationship to other in-flight work

- **#236** (queued, auto-merge SQUASH armed): adds \`container.options:
--user root\` to fix EACCES on \`actions/checkout\`. Touches a different
section of \`coq-build.yml\` than this PR — no expected conflict either
way. Once #236 lands, the workflow becomes actually-runnable; once this
lands, the runnable workflow audits four theorems instead of one.
- **Ruleset gap** (owner action): the \`Base\` ruleset has \`deletion\`,
\`non_fast_forward\`, \`required_signatures\`, \`pull_request\` rules
but **no \`required_status_checks\` rule**. Even with this PR + #236
merged, \`Coq build — formal/_CoqProject\` remains informational. The
standing rule \`[[feedback_proof_pr_build_oracle_is_only_truth]]\`
remains load-bearing until the ruleset adds the check as required.

## Test plan

- [ ] CI: \`Coq build — formal/_CoqProject\` reaches the Print
Assumptions step (gated on #236 landing first to fix EACCES, then the
four-theorem step runs)
- [ ] Run log shows four \`Print Assumptions <theorem>.\` outputs in
sequence (one consolidated coqtop transcript)
- [ ] No build regression — \`coq_makefile\` step still passes (no Coq
source changed)

## Refs

- Closes the per-theorem coverage gap from #227's original acceptance
criteria
- Stacks complementarily on #236 (different file region)
- Standing rule [[feedback_proof_pr_build_oracle_is_only_truth]] still
applies

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request May 30, 2026
…locker) (#254)

## Summary

Closes the path-filter blocker that prevents #244 (the owner action that
adds `required_status_checks` to the Base ruleset on `main`) from
landing
without collateral damage.

The `Coq build — formal/_CoqProject` check shipped in #231 is
path-filtered
to `formal/**` + the workflow file itself. Once the Base ruleset (id
`14285235`) gains a `required_status_checks` rule naming that context,
every
PR that does NOT touch `formal/**` would leave the check "expected but
not
reported" and be blocked indefinitely.

Fix: refactor to the always-on aggregator pattern. The single new
context
that goes into the ruleset is `coq-build-gate`; it ALWAYS runs and
reports
the right verdict regardless of which paths changed.

## Mechanics

1. `on:` block drops the `paths:` filter — workflow triggers on every PR
   and every push to main.
2. New `detect-relevant-changes` job diffs against the PR base
(`github.event.pull_request.base.sha` / `github.event.before`) and sets
`outputs.relevant = true|false` using the same path list that lived in
   the on-trigger (`formal/` + this workflow).
3. The existing `coq-build` job gains `needs: detect-relevant-changes` +
   `if: needs.detect-relevant-changes.outputs.relevant == 'true'`. It
   skips cleanly on irrelevant PRs.
4. New `coq-build-gate` job (`if: always()`, `needs: [detect,
coq-build]`):
   - SUCCESS immediately if `RELEVANT != 'true'`.
- FAILURE if `detect-relevant-changes` itself didn't succeed
(fail-safe).
   - SUCCESS iff `coq-build.result == 'success'` on a relevant change.
   - FAILURE otherwise.

The `coq-build` job itself is untouched: same `coqorg/coq:8.18`
container
with `--user root`, same `coq_makefile -f _CoqProject -o build.mk &&
make`,
same `Print Assumptions` audit of the four Phase D theorems
(`preservation_l1`, `preservation_l2_via_l1`,
`preservation_l2_app_eff_beta_linear`,
`preservation_l2_app_eff_beta_ground_nonlinear`).

## Ruleset migration (post-merge, separate operation)

Once this PR is green on main, the #244 owner action becomes:

```bash
# Append required_status_checks naming `coq-build-gate` (NOT the underlying
# `Coq build — formal/_CoqProject`, because that one skips on irrelevant PRs
# and would re-introduce the exact "expected but not reported" block this
# PR exists to avoid).
gh api -X PUT repos/hyperpolymath/ephapax/rulesets/14285235 --input /tmp/patch.json
```

## Test plan

- [ ] `coq-build-gate` green on this PR (path is
`.github/workflows/coq-build.yml`
      → relevant=true → coq-build runs → gate aggregates).
- [ ] After merge, a no-formal-change PR shows `coq-build-gate` green
while
      `coq-build` is "Skipped".
- [ ] After merge, the #244 owner action switches the Base ruleset's
      required context from absent to `coq-build-gate`.
- [ ] An intentional broken-build PR (touching `formal/`) shows
      `coq-build-gate` red and admin-merge blocked.

## Prior-art

panic-attack#90 (already landed) applied the same pattern to
`chapel-ci.yml`.
This PR is the single-job variant of the same refactor.

## Refs

- #244 — owner action this PR unblocks
- #231 — `coq-build.yml` ships
- #236 — EACCES container fix
- #243 — Print Assumptions audit expansion
- panic-attack#90 — prior-art (chapel-ci aggregator)

Generated with [Claude Code](https://claude.com/claude-code)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ci: add Coq build oracle to workflows — admin-merge can land broken main today

1 participant