Skip to content

CI gap: coqc is the merge oracle for proof PRs but no CI job runs it #229

@hyperpolymath

Description

@hyperpolymath

Context

PRs #223 and #224 both added `tfuneff_lambda_retype_l1_m` from parallel sessions on 2026-05-30. Both passed all CI gates and squash-merged successfully. Resulting main state had two definitions of the same lemma name, and `coqc -R . Ephapax Semantics_L1.v` failed with `Error: tfuneff_lambda_retype_l1_m already exists.`

Fixed in #228 (deduped + Phase 4a).

The seam

For proof PRs, `coqc` is the only authoritative merge oracle (per CLAUDE.md). Yet none of the current CI jobs run it. Every other gate (CodeQL, Hypatia, governance, secret scanners, gitleaks) is orthogonal to proof correctness.

This means:

  • Two PRs that touch `formal/` can each be coqc-clean in isolation.
  • Both can pass all CI gates simultaneously.
  • The squash-merge of one introduces a context that breaks the other's compile, but neither is rebased.
  • Main is silently broken on coqc rebuild until someone tries it.

Proposal

Add a minimal Coq-build job to CI (e.g. `.github/workflows/coq-formal.yml`):

```yaml
name: Coq formal proofs
on:
push:
paths: ['formal/']
pull_request:
paths: ['formal/
']
jobs:
build:
runs-on: ubuntu-latest
container: rocq/coq:8.18.0
steps:
- uses: actions/checkout@v4
- run: cd formal && coqc -R . Ephapax $(cat _CoqProject | grep -v '^#' | grep '\.v$')
```

Constraints to think about (per estate-language policy):

  • Container choice matches the version pinned in this repo (8.18.0).
  • Trigger only on `formal/**` paths to keep CI minutes bounded.
  • Optionally add an `assumptions-audit` follow-up job that runs `Print Assumptions` on every top-level Theorem/Lemma and fails if a new admit/axiom appears.

Why this is worth fixing

The class of bug is "two PRs that are individually correct but jointly break main". It cannot be detected by per-PR static gates. The only defense is to actually build the artifact in CI. The blast radius is large for proof repos because coqc failure on main means no further proof PR can be reviewed until it's hot-fixed.

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions