fp-stability: scale-free pass/fail + Verrou cancellation-origin reporting; auto-installed prebuilt by sbryngelson · Pull Request #1526 · MFlowCode/MFC

sbryngelson · 2026-06-02T00:16:48Z

Summary

Makes ./mfc.sh fp-stability (the Verrou-based floating-point–stability harness) trustworthy, self-installing, and self-explanatory:

Scale-free pass/fail — retires the per-case hand-tuned absolute thresholds for one significant-bits floor.
Cancellation-origin reporting — the report localizes where catastrophic cancellation arises, ranked by decimal digits lost, with fypp-expanded lines honestly flagged as instance-ambiguous.
Self-installing Verrou — first use downloads a pinned, hash-verified prebuilt (with a source build as fallback); no manual ~20-min build.

Toolchain-only — no Fortran/solver and no build-system changes (CMakeLists.txt / build.py untouched relative to master). gfortran is the fp-stability CI compiler.

What it does

Each small 1-D case is run under Verrou and compared to a nearest-rounding reference:

Scale-free PASS/FAIL. A case passes iff it retains sig_bits = -log2(max_dev / max|ref|) >= 24 (single-precision accuracy). Normalizing by field scale collapses the old six-order-of-magnitude threshold spread into a tight ~30-48-bit band, so one global floor replaces six hand-tuned per-case numbers.
Cancellation origins (--check-cancellation). Source lines that lose leading significant digits in near-equal subtraction, ranked worst-first by digits lost (a sweep of --cc-threshold-double levels buckets each site by the highest it survives; bits -> digits, since a double carries ~16). Reported in the console (count + worst severity), as inline GitHub annotations (file:line), and in the step-summary table.
fypp instance-ambiguity flag. fypp #:for/#:def expansion collapses many generated computations onto one .fpp line, so line-level attribution can't identify which instance is responsible. Any cancellation site inside such an expansion is flagged "may represent multiple instances" rather than presented as a single confident culprit. (In the CI suite ~75-80% of sites land inside expansions — which is exactly why per-line pinpointing is not trustworthy here; see the note below.)
Supporting passes: float-proxy (single-precision sensitivity, one run), VPREC mantissa-bit sweep (at what precision a case breaks), and float-max (double->float overflow sites). Each is skippable with a --no-* flag.

Verrou installation

Verrou is a Valgrind fork (not a pip package). fp-stability installs it automatically on first use via toolchain/bootstrap/verrou.sh, which downloads a pinned, hash-verified prebuilt from the companion repo sbryngelson/verrou-dist (built in manylinux_2_28 for broad glibc reach; relocatable via a generated env.sh setting VALGRIND_LIB), falling back to a ~20-min source build if no prebuilt is available. A failed install aborts (never a silent skip). CI consumes the prebuilt + caches it, so the from-source build now happens once in verrou-dist's own CI rather than on every run.

Code organization

The feature outgrew the 1000-line file guideline, so fp_stability.py was split (pure relocation, no behavior change) into fp_stability_metrics.py (leaf: pure parsing/metrics), fp_stability_runners.py (Verrou orchestration), fp_stability_report.py (GitHub emitters), and fp_stability.py (CLI entry, CASES, _run_case) — each under the guideline, no import cycles. 33 toolchain unit tests; ruff + format clean; ./mfc.sh precheck all 7.

History note (re: the branch name)

The branch is named …-dd-confirm-rank after an earlier design that used Verrou's verrou_dd delta-debug to bisect to, confirm, and sensitivity-rank a single "responsible" source line. That machinery was removed in this PR: fypp expansion makes single-line attribution instance-ambiguous by construction (it was the fragile part — the sensitivity leader was almost always the benign time integrator, not the cancellation origin). A later commit also dropped the MCA (mcaquad) pass as redundant with the random-rounding sig-bits gate. The net result keeps file:line attribution via the cancellation-origin lens (more honest, and the one that matters for ill-conditioning) and drops ~900 lines of bisection/ranking. The add/remove pairs are visible in the commit list; the net diff contains none of the removed tooling.

Caveats

The sig-bits floor is a flat regression bar: a clean case degrading from ~47 to ~25 bits would still pass. Auto-measured per-case baselines + a regression delta would add that sensitivity — deferred.
Per-site "digits lost" is a sweep lower bound (no false positives, may under-count). High local cancellation in flux/reconstruction differences is often by-design (the small difference is the physics) and only harms the result if it propagates.
Job pass/fail is governed solely by the sig-bits floor; the cancellation/vprec/float passes localize and characterize, they don't change exit semantics.
Coverage is N small 1-D cases — not a guarantee for unexercised multi-D / viscous / MHD / IGR / bubble paths.

🤖 Generated with Claude Code

dd_line reports a minimal set of source lines, but presented them as a flat, equally-weighted list of confident warnings. Three problems: (1) no check that the reported lines actually reproduce the instability; (2) fypp #:for/#:def expansion collapses many generated computations onto one .fpp line, so a hit can be the wrong instance; (3) a multi-op line did not say which op was at fault. This adds, reusing the verified Verrou --source mechanism (matches file+line+symbol, captured via --gen-source): - Confirmation: perturb only the suspect lines; lines that fail to reproduce the deviation are downgraded from ::warning:: to ::notice:: (unconfirmed). - Per-line ranking: perturb each line alone and rank by the share of float-proxy it reproduces, so the dominant computation is named (e.g. m_time_steppers.fpp:510 = 100%). - Cancellation cross-reference: label dd_line hotspots that coincide with a stage-F catastrophic-cancellation site. - Macro-expansion flag: mark hotspots whose .fpp line sits inside a #:for/#:def expansion as instance-ambiguous. Surfaced in console, the GitHub step summary (ranked, tagged list), and inline annotations. Pure helpers covered by toolchain/mfc/test_fp_stability.py (22 tests, TDD). Verified end-to-end on a serial debug build.

…Tier 2) dd_line attributes to .fpp source lines, but a #:for/#:def expansion collapses many generated computations onto one line, so a macro-ambiguous hotspot cannot be pinned to a single runtime instance. This adds an opt-in precision path that resolves it. Mechanism (validated against gfortran+Verrou): a new build flag --fp-precision-lines strips the fypp line markers from each generated .f90 so the compiler attributes every expanded instance to a distinct physical line, emitting a .linemap.json sidecar mapping each line back to (.fpp file, line, instance). Marker renumbering was tried first but hit gfortran's DWARF line-number ceiling (~300k) and 700-line shadow runs; stripping avoids both and survives the cpp #if layer. fp-stability gains --precision-sim-binary: for the most flagrant macro-ambiguous hotspot, each expanded instance is perturbed alone (Verrou --source) on the precision binary and ranked, naming the responsible instance and showing its concrete generated code. The strip is gated to the simulation target only (pre/post run on CPU). Validated end-to-end: m_weno.fpp:238 (3 #:for instances) resolved to instance #0 = s_cb(i+3)-s_cb(i+1). toolchain/mfc/fp_precision_lines.py is pure + TDD'd (12 tests); normal build path is byte-identical and unaffected.

…gin; surface caps + coverage The per-line --source ranking measures sensitivity (where reduced precision most moves the output), which is structurally dominated by the time integrator / final accumulation: perturbing the last write to q_cons hits the output 1:1, while upstream errors get re-rounded there. Empirically, sod_standard's cancellation concentrates in m_weno.fpp (14 sites) and m_riemann_solvers.fpp (5), with m_time_steppers.fpp at just 1 — yet the time-stepper led the share ranking at 100%. Presenting it as 'most flagrant' conflated sensitivity with where ill-conditioning originates. Reframe: the dd_line/share view is relabeled 'single-precision sensitivity' with an explicit caveat (typically the time integrator, expected/benign, not a cancellation-origin finder); a new per-file cancellation-density line (_cancellation_by_file) headlines where cancellation actually concentrates; console + GitHub summary + inline annotations updated to keep the two signals distinct. Also: no silent caps (truncated dd_line/cancellation/float-max lists now report '…and N more'; annotations emit a dropped-count notice), and a coverage caveat in the summary header (N 1-D cases; a pass is not a guarantee for unexercised multi-D/viscous/MHD/IGR/bubble paths). _cancellation_by_file is pure + TDD'd.

…esolve continuations Two fixes prompted by review: (1) site COUNT is not severity — one catastrophic cancellation outweighs many mild ones; (2) attribution can land on a continuation fragment, so the labelled line was unclear. Severity: Verrou exposes no per-site bit-count, but --cc-threshold-double is itself a severity filter (a site is only reported if it lost >= threshold bits). A second pass at 26 bits identifies SEVERE sites with no false positives (may under-count). Severe sites are listed first and labelled; the count-by-file view is demoted with an explicit 'count != severity' caveat. On sod_standard this surfaces the real origins — flux divergence (m_rhs), divided differences and smoothness indicators (m_weno), HLLC wave speeds (m_riemann) — and correctly omits the time integrator. Continuations: _statement_bounds_in_lines follows free-form '&' continuations (leading or trailing) to the logical-statement start; cancellation sites are de-duplicated and displayed as the full statement at its canonical start line, so a hit on a fragment resolves to the whole expression. Pure helpers TDD'd (60 toolchain tests).

…hand-tuned thresholds Each case had a hand-tuned absolute L-inf threshold spanning 1e-13..2e-7 (six orders), driven by field magnitude and conditioning. Maintaining per-case thresholds is fragile. Normalizing the deviation by the field's peak magnitude removes the scale, so a single global criterion suffices. Pass/fail is now sig_bits = -log2(max_dev / max|ref|) >= MIN_SIG_BITS (24 = single precision retained under random rounding). The per-case 'threshold' field is removed from CASES; pass/fail, the VPREC FAIL marker, console, summary table, and inline annotations all report bits-retained vs the floor. The dd_sym/dd_line oracle keeps its own float-proxy-derived threshold (unchanged). Validated: max_dev spans 1e-14..7e-8 across the 6 cases but sig_bits is a tight 30.3..48.7 band, all >= 24 with margin; classification matches the prior thresholds (6/6 pass). Pure _sig_bits/_stability_pass are TDD'd (67 toolchain tests). A per-case auto-measured baseline + regression delta would add sensitivity for moderate drops; deferred as a heavier change.

… 'severe'), collapse sensitivity Review feedback on the summary: (1) it buried the interesting cancellation origins below the long, mostly-expected sensitivity list; (2) 'severe' is a binary label when an actual magnitude is far more useful; (3) 'bits lost' is not intuitive. Reorder: the cancellation-origins section now leads (right after the results table), ranked worst-first; the single-precision sensitivity list (dominated by the benign time integrator) is collapsed into a <details>. Severity as a number: a sweep of --cc-threshold-double levels [10,20,30,40,48] buckets each site by the highest it survives (_cancellation_severity), giving per-site bits lost (a lower bound). Bits are translated to decimal digits (a double carries ~16; _digits_left) so each entry reads e.g. '>= 12 digits lost (~4 of 16 left)' with the full statement. On sod_standard the worst origins (flux divergence, divided differences, HLLC wave speeds) lose ~14 of 16 digits; the sweep discriminates (23 sites >=10 bits, 11 >=48). 69 toolchain tests.

…vior change) Cleanup pass over the ~680 lines this branch added to fp_stability.py. Extracts shared helpers that had accreted across the six feature commits, with identical behavior (69 toolchain tests + ruff + precheck green; emitted console/summary text unchanged): - _resolve_source / _read_source_lines: the 'abs-path-or-glob-under-src(-then-tree)' + readlines block was repeated in _read_source_line, _macro_context, _statement_at, _get_source_context. A search_whole_tree flag preserves the one difference (only _get_source_context fell back to the whole tree). - _blank_result(name): the 15-field result dict was written verbatim twice. _find_dd_tool(verrou_bin, tool): merges _find_dd_sym/_find_dd_line. _setup_dd_run: shared dd_run.sh/dd_cmp.py setup + threshold-default for dd_sym and dd_line. _capture_gen_source: shared nearest --gen-source capture for confirmation and disambiguation. _more_md: the '…and N more' truncation footer used in three summary sections.

…(no behavior change) The PR had grown fp_stability.py well past the 1000-line soft guideline. Pure relocation — no function body, string, constant, or logic changed — into a clean dependency chain: - fp_stability_metrics.py (474, leaf): regexes/constants + pure parsing, source-reading, sig-bits, cancellation, ranking, statement-bounds helpers. Imports no sibling. - fp_stability_runners.py (530): Verrou subprocess orchestration (run/dd/vprec/cancellation/confirmation/disambiguation). Imports metrics. - fp_stability_report.py (244): GitHub summary + annotation emitters. Imports metrics. - fp_stability.py (715): CLI entry, CASES, _run_case, _blank_result; imports explicitly from the three. No import cycles. Also dropped the unused 'case' parameter from _run_cancellation_check/_run_float_max_check. Verified: 69 toolchain tests, ruff (incl. F-rules: no undefined names), precheck all 7, and a live fp-stability run confirming the cross-module orchestration (sig-bits pass/fail + cancellation sweep) is unchanged. Test import repointed to fp_stability_metrics.

Per the weight review, the precision-build-based per-instance disambiguation was the heaviest piece (its own module + a build flag + CMake plumbing + tests) for the narrowest trigger (fires only when the most-flagrant hotspot is also inside a #:for/#:def expansion). Removed in full: - deleted toolchain/mfc/fp_precision_lines.py and its tests; deleted _disambiguate_instances - reverted CMakeLists.txt and build.py to upstream (no MFC_FP_PRECISION_LINES option, no marker-strip step, no -D flag); dropped the --fp-precision-lines build arg and the --precision-sim-binary fp-stability arg - removed the E3 disambiguation stage, its docstring section, and the per-instance summary display Kept: the lightweight '#:for/#:def-expanded — may represent multiple instances' hotspot warning (cheap, honest, separate from the disambiguation machinery). 57 toolchain tests, ruff, precheck all 7 green; CMakeLists.txt and build.py are byte-identical to upstream.

…asibility guard Following the native convention (run/validate/viz take the case .py as a positional 'input'), fp-stability now does too — './mfc.sh fp-stability my_case.py' analyzes your case instead of the built-in suite; omitting it runs the suite as before. It loads the case via the shared loader (run.input.load), runs it as a single case, and auto-detects the files to diff from the reference run (_autodetect_compare: conserved-var .dat at the final step, prim fallback). Output is forced to serial .dat I/O (parallel_io=F) since the no-MPI binary is run as one process and the suite diffs serial files. Guard (Verrou is ~30x and the suite runs the sim many times): the case must be a small, short, single-process proxy — errors if cells > 100k or work (cells x t_step_stop) > 200k cell-steps, with guidance to coarsen. Validated end-to-end on a real case .py (auto-compare + sig-bits PASS + cancellation digits); guard correctly rejects 1D_sodshocktube (400k cell-steps). 60 toolchain tests, ruff, precheck all 7.

…/cancellation reframe The --help prose was stale: it claimed 'PASS/FAIL against per-case thresholds' (now scale-free sig-bits), didn't mention the case.py positional / its serial-CPU constraints / the feasibility guard, and listed an outdated case set. Rewrote the description to cover: running on a built-in suite or a user case .py (with constraints + guard), the >= 24-bit scale-free pass criterion, and the analysis passes (dd confirmation/ranking, cancellation origins by digits lost). Also updated the module-docstring Usage. The positional INPUT and a case.py example were already shown in --help; this fixes the surrounding prose.

…code, tests, comment rot From a multi-agent PR review: - Silent failures (critical): a crashed cancellation/float-max/MCA run was reported as a clean result ('none detected' / 'no overflows' / 'dev=0.0'). The run helpers now log the failure and return None (cancellation/float-max) or a completed-sample count (MCA, distinct from measured zero); _run_case reports 'run failed' instead of a false all-clear. - Crash (important): np.loadtxt(...)[:,1] raised IndexError on a single-row .dat (reachable via a 1-cell user case), aborting the whole suite. Added _dat_column using np.atleast_2d; also fixed the generated dd_cmp.py oracle. - Dead code: removed _cancellation_by_file (superseded by the digits-lost severity view) and _stability_pass (the orchestrator inlines the comparison), plus their tests. - Tests: smoke tests for the CI-only report emitters (blank + populated result, and unconfirmed->::notice:: downgrade), _digits_left clamp, and #:block/#:call/unbalanced macro cases. - Comment rot: stage-E docstring now says confirmation is set-level (not per-line); dropped the non-existent 'per-file density' reference; fixed the '48 ~ full mantissa' note (53-bit). 45 tests, ruff, precheck all 7.

…ssage Verrou is a compiled Valgrind fork (not a pip/uv/conda-friendly Python package), so it can't be installed into the venv. Add toolchain/bootstrap/verrou.sh — an explicit, opt-in installer that builds Valgrind+Verrou from source into $VERROU_HOME (default ~/.local/verrou), pinned to the same versions as the fp-stability CI workflow (Valgrind 3.26.0 + edf-hpc/verrou@a58d434). It is deliberately opt-in (never auto-built on a bare fp-stability run, since it's a ~20-min source build needing a C toolchain + autotools): idempotent (skips if already present, --force to rebuild), requires Linux (Valgrind has no modern-macOS/Apple-Silicon support), warns but proceeds on non-x86_64 (Valgrind builds on aarch64 etc., but Verrou's FP backends are best-validated on x86_64), and checks build deps up front with guidance. The fp-stability SKIP message now points at it and clarifies Verrou is not a pip package.

Replace the inline Valgrind+Verrou build in the workflow with a call to toolchain/bootstrap/verrou.sh, so the local installer and CI share one pinned recipe (no drift between them). Cache gating and the system-deps step are unchanged; the build step is still skipped on a cache hit. Tightened the verify step to '--tool=verrou --version', and noted that the cache key's pinned versions must track the script.

…ild as fallback verrou.sh now downloads a pinned, hash-verified prebuilt from sbryngelson/verrou-dist@v1 (seconds) and falls back to the ~20-min source build; idempotency check sources env.sh so a relocated prebuilt isn't re-fetched. fp_stability sets VALGRIND_LIB (via _verrou_env, reused by _dd_env) so a relocated tree's valgrind/verrou_dd_* calls resolve — harmless for source builds. CI installs zstd and sources env.sh before verifying.

…ard-fail if it can't Running ./mfc.sh fp-stability with no Verrou present now installs it via the bootstrap (downloads the pinned prebuilt from verrou-dist; source build as fallback) and proceeds, instead of SKIP+exit-0; a failed install is now a hard error. _find_verrou no longer accepts a bare system valgrind on PATH (it has no 'verrou' tool and would only fail at run time) — that case reads as 'Verrou absent' so it gets installed. CI drops the separate Install/Verify Verrou steps; the run does it. Tests added for the discovery logic.

…ROU_HOME tree, more tests verrou.sh try_prebuilt() runs as an if-condition (set -e suppressed), so a failed extract could fall through and the source build would install over a half-written tree; now extract+verify in a staging dir and swap into PREFIX atomically with explicit error checks. _find_verrou now verifies the $VERROU_HOME tree actually runs the verrou tool (with VALGRIND_LIB for a relocated prebuilt) so a broken/stale tree reads as absent and gets reinstalled, not used until it fails per-run. Fix comment rot (fp-stability now auto-installs). Add unit tests for _verrou_env (incl. preserve-user-VALGRIND_LIB), _dd_env PYTHONPATH composition, _install_verrou hard-fail guards, _has_verrou_tool OSError, and the broken-VERROU_HOME case.

…pp flag onto it The dd delta-debug stack (bisection + confirmation positive-control + sensitivity ranking) tried to pinpoint and rank the single most precision-sensitive source line, but fypp #:for/#:def expansion collapses many generated computations onto one .fpp line, so that attribution is instance-ambiguous by construction — the fragile part. Removed it (~900 lines). The cancellation pass stays and now carries the fypp instance-ambiguity flag: each cancellation origin is checked with _macro_context and, if its .fpp line sits inside a #:for/#:def, marked 'may represent multiple instances' in console, annotations, and summary. file:line attribution (cancellation origins, ranked by digits lost) is preserved; only the false-precision line-pinpointing is gone. Verified end-to-end: 27 cancellation sites, 23 flagged fypp-ambiguous.

…ig-bits gate) MCA (--backend=mcaquad) reported a significant-bits estimate that duplicated the metric the core random-rounding suite already produces for PASS/FAIL; on well-conditioned cases the two agree, so it added a slower second opinion without a distinct capability. Removed _run_mca_samples, the MCA pass + --no-mca flag + result keys + summary column. Cancellation/vprec/float-proxy/float-max and the PASS/FAIL core are unchanged. ~50 lines off; it was also the slowest pass (N extra mcaquad runs/case).

…bility, env on PATH probe, comment rot Copilot PR review (all valid): (1) gate explicit --verrou-binary on os.access(X_OK) so the 'not executable' message is honest; (6) pass VALGRIND_LIB via _verrou_env to the PATH-fallback verrou probe too, so a relocated prebuilt on PATH isn't misjudged absent; (3,5) fix two stale comments left by the dd/MCA removals (a 'sensitivity list' that no longer exists; a VPREC '❌ where bits<floor' that never matched the deviation table); (2) inline-code the fp-stability-logs/ path in the truncation footer. Tests updated for the 2-arg _has_verrou_tool stub.

…reation The makedirs(run_dir) sat after the --log-file path and the sim.out open that both need it, so it was a dead no-op that worked only because every caller pre-created run_dir — misleading, and a future caller would hit FileNotFoundError with no hint. Moved it to the top of _run_simulation_verrou (before those uses) and dropped the now-redundant pre-creates in all callers (_run_cancellation_check, _run_float_max_check, _run_float_proxy, _run_vprec_sweep, and _run_case's ref + sample runs). Single, clear ownership. (Claude Code Review finding.)

Kept the behavioral contracts (verrou-absent/broken detection — the case a real bug hit, scale-free pass/fail invariant + zero-scale safety, cancellation severity, output autodetect, blank-result emitter KeyError guard, fypp-ambiguity annotation, VALGRIND_LIB relocation incl. don't-clobber, install no-binary hard-fail) and the subtle edges. Dropped redundant enumerations (5 of 8 macro-context micro-cases, 3 of 5 sig-bits math cases) and trivial-math/empty-input/constant assertions (digits_left, min_sig_bits==24, *_empty, omits-when-absent, the obvious bootstrap-returncode guard). -125 lines; the dropped paths are covered by the kept tests' shared code or the end-to-end CI job.

…hand-synced version) The cache key hardcoded verrou-a58d434-valgrind-3.26.0, synced to the installer's pins only by a comment — if verrou.sh bumped but the key didn't, CI would restore the stale cached tree and silently never exercise the new version. Key off hashFiles('toolchain/bootstrap/verrou.sh') so any pin change (or edit) auto-busts the cache. Also dropped the version literals from the workflow's header comment; the pinned versions now live solely in verrou.sh.

…I only) Replaced the check/cross marks in the results list (PASS/FAIL), the summary table status (PASS/FAIL), and the VPREC 'crash' cell + its comment. No emoji in the toolchain output now.

…nts + output Replaced non-ASCII in the toolchain (em-dash/en-dash -> '-', '->' for arrows, '>=' for >=, '+/-', '~' for approx, 'inf'/'Linf' for the infinity glyph, '...' for ellipsis) across fp_stability*.py, verrou.sh, and the fp-stability command help. Display/comment text only; no logic change. The viz command's pre-existing glyphs are left untouched (not part of this PR).

codecov · 2026-06-03T16:24:19Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.63%. Comparing base (2d0d2fd) to head (0099674).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1526   +/-   ##
=======================================
  Coverage   60.63%   60.63%           
=======================================
  Files          73       73           
  Lines       20219    20219           
  Branches     2937     2937           
=======================================
  Hits        12259    12259           
  Misses       5972     5972           
  Partials     1988     1988

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

sbryngelson added 5 commits June 1, 2026 19:01

sbryngelson changed the title ~~fp-stability: verify, rank, and resolve dd_line hotspots to the responsible instance~~ fp-stability: verify/rank/localize dd_line hotspots, per-instance disambiguation, scale-free pass/fail Jun 2, 2026

sbryngelson added 4 commits June 2, 2026 07:41

sbryngelson changed the title ~~fp-stability: verify/rank/localize dd_line hotspots, per-instance disambiguation, scale-free pass/fail~~ fp-stability: verify/rank dd_line hotspots, cancellation-origin reporting, scale-free pass/fail Jun 2, 2026

sbryngelson added 11 commits June 2, 2026 10:26

Merge branch 'master' into fp-stability-dd-confirm-rank

259d95c

sbryngelson marked this pull request as ready for review June 3, 2026 12:16

Copilot AI review requested due to automatic review settings June 3, 2026 12:16

Copilot started reviewing on behalf of sbryngelson June 3, 2026 12:17 View session

This comment was marked as resolved.

Sign in to view

sbryngelson added 2 commits June 3, 2026 09:02

sbryngelson changed the title ~~fp-stability: verify/rank dd_line hotspots, cancellation-origin reporting, scale-free pass/fail~~ fp-stability: scale-free pass/fail + Verrou cancellation-origin reporting; auto-installed prebuilt Jun 3, 2026

sbryngelson added 4 commits June 3, 2026 09:21

fp-stability: remove emoji from console + GitHub-summary output (ASCI…

c58d44f

…I only) Replaced the check/cross marks in the results list (PASS/FAIL), the summary table status (PASS/FAIL), and the VPREC 'crash' cell + its comment. No emoji in the toolchain output now.

sbryngelson merged commit f7afee9 into MFlowCode:master Jun 3, 2026
86 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fp-stability: scale-free pass/fail + Verrou cancellation-origin reporting; auto-installed prebuilt#1526

fp-stability: scale-free pass/fail + Verrou cancellation-origin reporting; auto-installed prebuilt#1526
sbryngelson merged 26 commits into
MFlowCode:masterfrom
sbryngelson:fp-stability-dd-confirm-rank

sbryngelson commented Jun 2, 2026 •

edited

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

codecov Bot commented Jun 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

Conversation

sbryngelson commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What it does

Verrou installation

Code organization

History note (re: the branch name)

Caveats

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

codecov Bot commented Jun 3, 2026

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

sbryngelson commented Jun 2, 2026 •

edited

Loading