acgetchell · Jun 8, 2026
@@ -5,9 +5,6 @@
 /cobertura.xml
 .DS_Store
 
-# Generated benchmark results (machine-specific)
-docs/PERFORMANCE.md
-
 # Python / uv
 **/__pycache__/
 *.egg-info/

@@ -43,6 +43,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   - Move error and tolerance contracts into first-class modules with prelude exports
   - Update exact benchmarks to distinguish strict Result paths from rounded f64 paths
   - Document and exercise the rounded fallback pattern for RequiresRounding errors
+- [**breaking**] Make exact f64 conversions strict [`89f3720`](https://github.com/acgetchell/la-stack/commit/89f3720ecde9f12d7a0f42e79394836615e8fd97)
+  - Make Matrix and Vector the finite-by-construction public types for exact arithmetic.
+  - Add rounded exact-to-f64 APIs for determinant and solve callers that want explicit lossy conversion.
+  - Return typed Unrepresentable reasons when strict exact-to-f64 conversion would round or become non-finite.
+  - Specialize D4 exact determinants and keep determinant/error-bound zero coefficients from evaluating overflowing absent terms.
+  - Update exact benchmark comparison reporting to compare strict and rounded APIs against legacy v0.4.2 rows.
 
 ### Changed
 
@@ -72,6 +78,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   Use a literal regex pattern for the malformed Criterion JSON diagnostic so
   Windows paths with backslashes do not break pytest's match expression.
 - Align ty with Python 3.13 [`b9e0ba0`](https://github.com/acgetchell/la-stack/commit/b9e0ba08e54a15d8eddd5c5c53edc37bbc03939a)
+- Preserve coordinates for overflowed accumulators [`1d976b3`](https://github.com/acgetchell/la-stack/commit/1d976b346172ad4eca37c68a3ec31817eeca8529)
+
+  - Return matrix-cell metadata when inf-norm row sums or symmetry tolerance scaling overflow.
+  - Avoid reparsing finite-by-construction RHS vectors in LU and LDLT solves.
 
 ## [0.4.2] - 2026-06-04
 

@@ -22,6 +22,20 @@ to an explicit allowlist, and kept with readable version comments for review.
 CI runs `just ci` on Ubuntu, macOS, and Windows to keep platform coverage
 aligned with the local comprehensive validation path.
 
+## Performance checks
+
+Performance-sensitive changes should compare the current tree against the
+latest published release:
+
+```bash
+just performance-local
+```
+
+This writes `target/bench-reports/performance.md` without changing committed
+release docs. Regressions are worth treating as design feedback: if a slowdown
+is intentional, document the correctness, API clarity, or composability benefit
+that justifies it.
+
 For coverage commands and report locations, see [`docs/COVERAGE.md`](docs/COVERAGE.md).
 For benchmark methodology, see [`docs/BENCHMARKING.md`](docs/BENCHMARKING.md).
 For the full set of developer commands, run `just --list`.

@@ -151,40 +151,28 @@ benchmarks on every iteration.
 ### Workflow
 
 ```bash
-# 1. Check out the old release and save its full baseline
-git checkout v0.2.0
-just bench-save-baseline v0.2.0
+# Current in-tree code vs latest published release, all measured locally
+just performance-local
 
-# 2. Switch to current code and run latest la-stack measurements
-git checkout main   # or your feature branch
-just bench-latest   # populates target/criterion/*/new/
-
-# 3. Generate a local comparison report
-just bench-compare v0.2.0
+# Stored GitHub Actions release assets, no local cargo runs
+just performance-github-assets
 ```
 
-You can save multiple baselines and compare against any of them.
+`performance-local` creates isolated temporary worktrees, generates the latest
+published release baseline locally, then benchmarks the current in-tree code on
+the same machine. It uses the current checkout's Rust toolchain for both sides
+unless `RUSTUP_TOOLCHAIN` is already set. `performance-github-assets` compares
+stored GitHub Actions release artifacts and does not run cargo locally.
 
-If the release baseline is already present in `target/criterion/`, skip the
-checkout step and compare directly. For example, to compare current code against
-the saved `v0.4.2` release baseline:
+For local scratch comparisons, you can save multiple baselines and compare
+against any of them. If the release baseline is already present in
+`target/criterion/`, compare directly:
 
 ```bash
 just bench-latest          # gather latest la-stack measurements
 just bench-compare v0.4.2  # compare latest measurements against v0.4.2
 ```
 
-If the release baseline is not present locally, download and restore the release
-asset first:
-
-```bash
-gh release download v0.4.2 --pattern "la-stack-v0.4.2-criterion-baseline.tar.gz"  # fetch archived release baseline
-mkdir -p target                                                                    # ensure Criterion parent directory exists
-tar -C target -xzf la-stack-v0.4.2-criterion-baseline.tar.gz                       # restore target/criterion baseline data
-just bench-latest                                                                  # gather latest la-stack measurements
-just bench-compare v0.4.2                                                          # compare latest measurements against v0.4.2
-```
-
 ### Output
 
 `just bench-compare` writes `target/bench-reports/performance.md` by
@@ -193,6 +181,31 @@ local. The report includes per-dimension tables showing median times,
 percent change, speedup, and last-release nalgebra/faer context where a
 matching `vs_linalg` peer exists.
 
+Release PRs promote one curated comparison into committed docs:
+
+```bash
+just performance-release
+```
+
+This infers the current release tag from `Cargo.toml`, discovers the previous
+stable published release, generates both sides locally in temporary worktrees,
+copies the finished report to `docs/PERFORMANCE.md`, and archives the previous
+committed report under `docs/archive/performance/`. Archive filenames are
+release-pair names such as `v0.4.2-vs-v0.4.1.md`, so the directory and generated
+index stay lexicographically sorted. For explicit release repair, pass both
+tags: `just performance-release v0.4.3 v0.4.2`.
+
+To compare the latest stored GitHub Actions release assets without touching the
+current checkout:
+
+```bash
+just performance-github-assets
+```
+
+The recipe discovers the latest stable published GitHub release and its previous
+stable release automatically. For explicit historical repair, pass both tags:
+`just performance-github-assets v0.4.2 v0.4.1`.
+
 For exact-arithmetic comparisons against v0.4.2 or older baselines, rows such
 as `det_exact_rounded_f64 (vs det_exact_f64)` mean the current rounded API is
 being compared to the historical lossy `*_exact_f64` benchmark. Rows such as
@@ -234,11 +247,12 @@ See `scripts/criterion_dim_plot.py --help` for options.
 At release time, save a local baseline so future work can compare against it:
 
 ```bash
-just bench-save-baseline $TAG
+just bench-save-baseline <tag>
 just bench-save-last
 ```
 
 When the GitHub Release is published, `.github/workflows/release-benchmarks.yml`
 saves a full release baseline and attaches
 `la-stack-$TAG-criterion-baseline.tar.gz` to the release as the durable archive.
-See `docs/RELEASING.md` step 5 for where this fits in the release process.
+See the `just performance-release` step in `docs/RELEASING.md` for where the
+curated `docs/PERFORMANCE.md` comparison fits in the release process.
@@ -0,0 +1,117 @@
+# Exact Arithmetic Performance
+
+**la-stack** v0.4.2 · `7e11f93` (HEAD) · 2026-06-08 20:39:03 UTC
+**Statistic**: median
+
+## Benchmark Results
+
+Comparison against baseline **v0.4.1**:
+
+Negative change = faster. Speedup > 1.00x = improvement.
+
+### D=2
+
+| Benchmark | v0.4.1 | Current | Change | Speedup |
+|-----------|-------:|--------:|-------:|--------:|
+| det | 0.6 ns | 0.9 ns | +61.1% | 0.62x |
+| det_direct | 0.7 ns | 1.0 ns | +44.7% | 0.69x |
+| det_exact | 315.5 ns | 318.4 ns | +0.9% | 0.99x |
+| det_exact_f64 | 555.7 ns | 555.7 ns | -0.0% | 1.00x |
+| det_sign_exact | 0.7 ns | 1.5 ns | +128.2% | 0.44x |
+| solve_exact | 7.05 µs | 7.06 µs | +0.2% | 1.00x |
+| solve_exact_f64 | 7.50 µs | 7.67 µs | +2.3% | 0.98x |
+
+### D=3
+
+| Benchmark | v0.4.1 | Current | Change | Speedup |
+|-----------|-------:|--------:|-------:|--------:|
+| det | 1.3 ns | 1.8 ns | +36.3% | 0.73x |
+| det_direct | 4.7 ns | 2.2 ns | **-51.9%** | 2.08x |
+| det_exact | 936.9 ns | 924.3 ns | **-1.3%** | 1.01x |
+| det_exact_f64 | 1.18 µs | 1.19 µs | +1.1% | 0.99x |
+| det_sign_exact | 2.4 ns | 4.2 ns | +78.1% | 0.56x |
+| solve_exact | 27.02 µs | 27.41 µs | +1.5% | 0.99x |
+| solve_exact_f64 | 28.06 µs | 27.98 µs | -0.3% | 1.00x |
+
+### D=4
+
+| Benchmark | v0.4.1 | Current | Change | Speedup |
+|-----------|-------:|--------:|-------:|--------:|
+| det | 2.4 ns | 3.3 ns | +36.8% | 0.73x |
+| det_direct | 2.4 ns | 4.1 ns | +70.2% | 0.59x |
+| det_exact | 2.33 µs | 2.33 µs | -0.0% | 1.00x |
+| det_exact_f64 | 2.59 µs | 2.58 µs | -0.7% | 1.01x |
+| det_sign_exact | 5.3 ns | 6.9 ns | +30.5% | 0.77x |
+| solve_exact | 67.14 µs | 67.99 µs | +1.3% | 0.99x |
+| solve_exact_f64 | 67.86 µs | 68.51 µs | +1.0% | 0.99x |
+
+### D=5
+
+| Benchmark | v0.4.1 | Current | Change | Speedup |
+|-----------|-------:|--------:|-------:|--------:|
+| det | 21.6 ns | 24.5 ns | +13.7% | 0.88x |
+| det_direct | 2.3 ns | 4.7 ns | +104.8% | 0.49x |
+| det_exact | 5.04 µs | 4.99 µs | -1.0% | 1.01x |
+| det_exact_f64 | 5.32 µs | 5.31 µs | -0.1% | 1.00x |
+| det_sign_exact | 4.97 µs | 4.99 µs | +0.3% | 1.00x |
+| solve_exact | 134.99 µs | 136.04 µs | +0.8% | 0.99x |
+| solve_exact_f64 | 137.11 µs | 138.97 µs | +1.4% | 0.99x |
+
+### Near-singular 3x3
+
+| Benchmark | v0.4.1 | Current | Change | Speedup |
+|-----------|-------:|--------:|-------:|--------:|
+| det_sign_exact | 871.8 ns | 877.6 ns | +0.7% | 0.99x |
+| det_exact | 907.3 ns | 904.4 ns | -0.3% | 1.00x |
+| solve_exact | 4.31 µs | 4.25 µs | **-1.5%** | 1.02x |
+| solve_exact_f64 | 4.29 µs | 4.32 µs | +0.7% | 0.99x |
+
+### Large entries 3x3
+
+| Benchmark | v0.4.1 | Current | Change | Speedup |
+|-----------|-------:|--------:|-------:|--------:|
+| det_sign_exact | 3.14 µs | 3.09 µs | **-1.3%** | 1.01x |
+| det_exact | 3.19 µs | 3.11 µs | **-2.3%** | 1.02x |
+| solve_exact | 84.77 µs | 83.89 µs | **-1.0%** | 1.01x |
+| solve_exact_f64 | 84.62 µs | 83.92 µs | -0.8% | 1.01x |
+
+### Hilbert 4x4
+
+| Benchmark | v0.4.1 | Current | Change | Speedup |
+|-----------|-------:|--------:|-------:|--------:|
+| det_sign_exact | 5.3 ns | 6.9 ns | +30.4% | 0.77x |
+| det_exact | 2.39 µs | 2.31 µs | **-3.2%** | 1.03x |
+| solve_exact | 51.69 µs | 52.27 µs | +1.1% | 0.99x |
+| solve_exact_f64 | 52.90 µs | 53.26 µs | +0.7% | 0.99x |
+
+### Hilbert 5x5
+
+| Benchmark | v0.4.1 | Current | Change | Speedup |
+|-----------|-------:|--------:|-------:|--------:|
+| det_sign_exact | 5.03 µs | 4.88 µs | **-2.9%** | 1.03x |
+| det_exact | 5.07 µs | 4.96 µs | **-2.1%** | 1.02x |
+| solve_exact | 105.35 µs | 102.72 µs | **-2.5%** | 1.03x |
+| solve_exact_f64 | 104.99 µs | 103.94 µs | -1.0% | 1.01x |
+
+## How to Update
+
+Local performance reports are generated in isolated temporary worktrees:
+
+```bash
+# Local development: compare the current tree with the latest release
+just performance-local
+
+# Release PR: update docs/PERFORMANCE.md and archive the previous report
+just performance-release
+
+# GitHub Actions release assets
+just performance-github-assets
+
+# Explicit repair
+just performance-release <current-tag> <previous-tag>
+```
+
+`just performance-local` writes `target/bench-reports/performance.md`.
+`just performance-github-assets` writes `target/bench-reports/github-assets-performance.md`.
+
+See `docs/BENCHMARKING.md` for the full comparison workflow.
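
Reviewer note on reading the tables in `docs/PERFORMANCE.md`: the Change and Speedup columns appear to be two views of the same ratio of medians. The sketch below assumes Change is the signed percent difference of the current median relative to the baseline and Speedup is baseline divided by current. The function name `speedup_from_change` is hypothetical and is not part of the repository's scripts. Because the printed medians are rounded, the sketch starts from the Change column rather than from the displayed times.

```python
def speedup_from_change(change_percent: float) -> float:
    """Convert a signed percent change in time into a speedup factor.

    Assumes change = (current - baseline) / baseline * 100 and
    speedup = baseline / current, so speedup = 1 / (1 + change / 100).
    """
    return 1.0 / (1.0 + change_percent / 100.0)


# Rows taken from the tables above: (label, Change %, reported Speedup).
rows = [
    ("D=2 det", 61.1, 0.62),
    ("D=2 det_sign_exact", 128.2, 0.44),
    ("D=3 det_direct", -51.9, 2.08),
]

for label, change, reported in rows:
    derived = round(speedup_from_change(change), 2)
    print(f"{label}: derived {derived:.2f}x, reported {reported:.2f}x")
```

Under these assumptions the derived factors match the reported Speedup values for the sampled rows, which is a quick sanity check when reviewing a regenerated report.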
@@ -20,6 +20,7 @@ Set these variables to avoid repeating the version string:
 # tag has the leading v, version does not
 TAG=vX.Y.Z
 VERSION=${TAG#v}
+PREVIOUS_TAG=vA.B.C
 ```
 
 Verify your git remotes:
@@ -100,7 +101,24 @@ just plot-vs-linalg-readme
 Review the updated table in `README.md` and the plot in `docs/assets/` for
 accuracy.
 
-5. Save benchmark baselines for this release
+5. Update the release performance comparison
+
+```bash
+# Infers TAG from Cargo.toml, compares it against the previous stable published
+# release, writes docs/PERFORMANCE.md, and archives the previous docs/PERFORMANCE.md
+# under docs/archive/performance/.
+just performance-release
+```
+
+Review `docs/PERFORMANCE.md` for the latest release-to-release comparison. Older
+committed comparisons are archived under `docs/archive/performance/` with
+lexicographically sorted filenames such as `v0.4.2-vs-v0.4.1.md`. Iterative
+local reports still live under `target/bench-reports/`. For an explicit release
+repair, run `just performance-release <current-tag> <previous-tag>`. To compare
+the stored GitHub Actions release assets instead of running cargo locally, use
+`just performance-github-assets`.
+
+6. Save benchmark baselines for this release
 
 ```bash
 # Save a named full baseline for this release
@@ -125,15 +143,15 @@ uploads a short-lived Actions artifact for debugging the run.
 
 See `docs/BENCHMARKING.md` for the full comparison workflow.
 
-6. Validate the release branch
+7. Validate the release branch
 
 ```bash
 just ci
 just citation-check
 cargo publish --locked --dry-run
 ```
 
-7. Stage and commit release artifacts
+8. Stage and commit release artifacts
 
 ```bash
 git add Cargo.toml Cargo.lock CITATION.cff pyproject.toml CHANGELOG.md README.md docs/
@@ -143,11 +161,11 @@ git commit -m "chore(release): release $TAG
 - Bump version to $TAG
 - Update citation and utility package metadata
 - Update changelog with latest changes
-- Update benchmark comparison table
+- Update benchmark comparison table and release performance report
 - Update documentation for release"
 ```
 
-8. Push the branch and open a PR
+9. Push the branch and open a PR
 
 ```bash
 git push -u origin "release/$TAG"

@@ -0,0 +1,6 @@
+# Archived Performance Reports
+
+Older release-to-release benchmark comparisons are archived here.
+`docs/PERFORMANCE.md` contains the latest curated comparison.
+
+- [v0.4.1-vs-v0.4.0](v0.4.1-vs-v0.4.0.md)
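
Reviewer note on the archive filename ordering: plain string sorting keeps release-pair names such as `v0.4.1-vs-v0.4.0.md` and `v0.4.2-vs-v0.4.1.md` in release order while every version component is a single digit. The sketch below shows what happens with a two-digit component. The name `v0.4.10-vs-v0.4.9.md` and the helper `release_pair_key` are hypothetical; how the repository's index generator actually orders entries is not shown in this diff.

```python
import re


def release_pair_key(filename: str) -> tuple[tuple[int, ...], ...]:
    """Parse 'v0.4.2-vs-v0.4.1.md' into ((0, 4, 2), (0, 4, 1))."""
    versions = re.findall(r"v(\d+)\.(\d+)\.(\d+)", filename)
    return tuple(tuple(int(part) for part in version) for version in versions)


names = [
    "v0.4.2-vs-v0.4.1.md",
    "v0.4.10-vs-v0.4.9.md",  # hypothetical future release pair
    "v0.4.1-vs-v0.4.0.md",
]

# Plain string order: '-' sorts before '0', so v0.4.10 lands before v0.4.2.
plain_order = sorted(names)

# Version-aware order: compare numeric components instead of characters.
version_order = sorted(names, key=release_pair_key)

print(plain_order)
print(version_order)
```

If the archive is expected to outlive single-digit patch versions, a numeric key along these lines would keep the generated index in release order.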