FlorianPfaff · May 21, 2026
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
@@ -53,6 +53,11 @@ jobs:
         env:
           PYTHONPATH: ${{ github.workspace }}/src
 
+      - name: Check backend API matrix documentation
+        run: poetry run python scripts/check_backend_api_matrix.py
+        env:
+          PYTHONPATH: ${{ github.workspace }}/src
+
   package:
     runs-on: ubuntu-latest
     steps:

diff --git a/docs/backend-api-matrix.md b/docs/backend-api-matrix.md
@@ -13,8 +13,12 @@ To inspect the current matrix from a checkout or installed environment, run:
 ```bash
 pyrecest backends --format markdown
 python scripts/render_backend_api_matrix.py
+python scripts/check_backend_api_matrix.py
 ```
 
+The documentation table is checked against `src/pyrecest/_backend/capabilities.py`
+in CI so the user-facing matrix cannot silently drift from the executable metadata.
+
 ## Support Levels
 
 | Level         | Meaning                                                                                                                     |

diff --git a/docs/scientific-validation.md b/docs/scientific-validation.md
@@ -0,0 +1,72 @@
+# Scientific Validation
+
+PyRecEst tests should cover more than importability and API shape. Recursive
+Bayesian estimation code also needs validation against mathematical invariants,
+known statistical diagnostics, and backend-specific numerical behavior.
+
+This page defines the validation ladder used when adding or changing filters,
+distributions, samplers, trackers, or evaluation helpers.
+
+## Validation Layers
+
+| Layer | Purpose | Examples |
+|-------|---------|----------|
+| API smoke tests | Confirm public entry points exist and have stable capabilities. | Protocol capability matrices, import checks, CLI smoke tests. |
+| Deterministic algebraic checks | Verify identities that should hold without randomness. | Gaussian multiplication, Kalman covariance symmetry, normalized innovation squared consistency. |
+| Numerical invariant checks | Catch invalid estimates even when exact values are not known. | Positive semidefinite covariances, nonnegative probabilities, normalized weights, unit-norm directional states. |
+| Monte Carlo checks | Verify statistical behavior across repeated randomized runs. | NEES/NIS coverage, sampling moment convergence, resampling effective sample size behavior. |
+| Scenario regression checks | Preserve known behavior for complete workflows. | Scenario zoo expected outputs, benchmark regressions, tracker association edge cases. |
+
+## Core Invariants
+
+When a change affects a Kalman-style Gaussian estimator, check at least:
+
+- covariance matrices remain symmetric after prediction and update;
+- covariance matrices remain positive semidefinite up to numerical tolerance;
+- normalized innovation squared values are nonnegative and agree with the
+  innovation covariance solve;
+- rejected gated measurements leave the posterior unchanged;
+- diagnostics report the covariance scale and update action used by robust
+  updates.
+
+When a change affects particle or grid methods, check at least:
+
+- weights remain finite and normalized;
+- resampling never creates invalid particle shapes;
+- likelihood-only updates handle zero or underflowing likelihoods explicitly;
+- deterministic seeds are recorded for reproducibility.
+
+When a change affects circular, spherical, or manifold-valued states, check at
+least:
+
+- wrapped coordinates remain in the documented convention;
+- unit-vector or quaternion states remain normalized;
+- antipodal or periodic equivalences are tested where the distribution assumes
+  them;
+- moment and point estimates are invariant under representation-specific
+  symmetries.
+
+## Test Placement
+
+Use the most specific existing test directory when possible:
+
+- `tests/filters/` for filter and tracker invariants;
+- `tests/distributions/` for density, sampling, and conversion invariants;
+- `tests/protocols/` for API capability snapshots;
+- `tests/scenarios/` or scenario fixtures for complete reproducible workflows.
+
+Mark slower randomized coverage with `@pytest.mark.numerical_stress` so the fast
+matrix can remain focused while scheduled or manual runs exercise the heavier
+statistical checks.
+
+## Backend Expectations
+
+For APIs listed as `supported` in the backend API matrix, add or update focused
+tests that run under the NumPy, PyTorch, and JAX CI matrix. For APIs listed as
+`partial`, test the portable subset and document what is intentionally excluded.
+For `unsupported` APIs, prefer raising a clear unsupported-backend exception or
+`NotImplementedError`, and cover that failure path with a focused test.
+
+Backend-specific tolerances are acceptable, but they should be explicit in the
+test and justified by dtype, device, tracing, or bridge behavior rather than by
+an unexplained broad tolerance.
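
Review note on the particle and grid invariants in `docs/scientific-validation.md`: the bullets "weights remain finite and normalized" and "handle zero or underflowing likelihoods explicitly" could be illustrated with a small check. The following is a minimal sketch in plain Python, not PyRecEst code; `normalize_log_weights` and `check_weight_invariants` are hypothetical helper names, and the log-sum-exp shift is one common way to avoid underflow, assumed here for illustration.

```python
import math
from typing import List, Sequence


def normalize_log_weights(log_weights: Sequence[float]) -> List[float]:
    """Turn unnormalized log-weights into normalized linear weights.

    Hypothetical helper: subtracts the largest log-weight before
    exponentiating, so tiny likelihoods do not all underflow to zero.
    """
    if any(math.isnan(lw) or lw == math.inf for lw in log_weights):
        raise ValueError("log-weights must be finite or -inf")
    peak = max(log_weights)
    if peak == -math.inf:
        # Every likelihood is exactly zero: fail loudly instead of dividing by zero.
        raise ValueError("all likelihoods are zero; the update is undefined")
    shifted = [math.exp(lw - peak) for lw in log_weights]
    total = sum(shifted)  # at least 1.0, because the peak entry maps to exp(0)
    return [value / total for value in shifted]


def check_weight_invariants(weights: Sequence[float], tol: float = 1e-12) -> None:
    """Assert the invariants listed above: finite, nonnegative, normalized."""
    assert all(math.isfinite(w) and w >= 0.0 for w in weights)
    assert abs(sum(weights) - 1.0) <= tol


# exp(-1000.0) underflows to 0.0 in double precision, so exponentiating
# these values directly would produce an all-zero weight vector.
weights = normalize_log_weights([-1000.0, -1001.0, -math.inf])
check_weight_invariants(weights)
```

A test in `tests/filters/` could apply the same two assertions to the weights of a real particle filter after a likelihood-only update; the tolerance would then be the explicit, justified value that the Backend Expectations section asks for.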
diff --git a/docs/tutorials/backend-portable-workflows.md b/docs/tutorials/backend-portable-workflows.md
@@ -0,0 +1,95 @@
+# Backend-Portable Workflows
+
+This tutorial shows the conventions for code that should run under the NumPy,
+PyTorch, and JAX backends without changing the estimator logic.
+
+## 1. Select the backend before importing PyRecEst
+
+Set `PYRECEST_BACKEND` before Python imports `pyrecest`:
+
+```bash
+PYRECEST_BACKEND=numpy python my_filter.py
+PYRECEST_BACKEND=pytorch python my_filter.py
+PYRECEST_BACKEND=jax JAX_ENABLE_X64=True python my_filter.py
+```
+
+For JAX workflows that compare numerical values against NumPy or PyTorch, enable
+64-bit mode when the test tolerance assumes double precision.
+
+## 2. Import arrays from `pyrecest.backend`
+
+Use the backend facade for arrays, matrices, and common numerical helpers:
+
+```python
+from pyrecest.backend import array, diag, eye
+from pyrecest.distributions import GaussianDistribution
+from pyrecest.filters import KalmanFilter
+
+
+initial = GaussianDistribution(
+    array([0.0, 1.0]),
+    diag(array([1.0, 0.25])),
+    check_validity=False,
+)
+kf = KalmanFilter(initial)
+
+system_matrix = array([[1.0, 1.0], [0.0, 1.0]])
+process_noise = diag(array([0.05, 0.01]))
+measurement_matrix = array([[1.0, 0.0]])
+measurement_noise = array([[0.25]])
+
+kf.predict_linear(system_matrix, process_noise)
+diagnostics = kf.update_linear(
+    array([0.9]),
+    measurement_matrix,
+    measurement_noise,
+    return_diagnostics=True,
+)
+
+print(kf.get_point_estimate())
+print(diagnostics["nis"])
+```
+
+Avoid importing NumPy, PyTorch, or JAX directly inside reusable estimator code
+unless the API is intentionally backend-specific.
+
+## 3. Keep shapes explicit
+
+Backend differences usually appear first as shape, dtype, or scalar-conversion
+issues. Prefer explicit one-dimensional vectors and two-dimensional matrices:
+
+| Quantity | Recommended shape |
+|----------|-------------------|
+| State mean | `(n,)` |
+| State covariance | `(n, n)` |
+| Measurement vector | `(m,)` |
+| Measurement matrix | `(m, n)` |
+| Measurement covariance | `(m, m)` |
+
+For a one-dimensional measurement, use `array([z])` rather than a scalar and
+`array([[r]])` rather than `array([r])`.
+
+## 4. Test the same script under each target backend
+
+Treat the backend matrix as a contract for the APIs it lists, not as a promise
+that every advanced helper is portable. For a compact smoke test, run:
+
+```bash
+for backend in numpy pytorch jax; do
+  PYRECEST_BACKEND="$backend" python my_filter.py
+done
+```
+
+If the workflow depends on backend metadata, inspect it directly:
+
+```bash
+pyrecest backends --format markdown
+python scripts/check_backend_api_matrix.py
+```
+
+## 5. Document intentional backend restrictions
+
+When an API cannot preserve backend semantics, update
+`src/pyrecest/_backend/capabilities.py`, the backend API matrix, and a focused
+test in the same patch. If an operation copies through NumPy or SciPy, document
+whether gradients, device placement, or JAX tracing are preserved.
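
Review note on steps 1 and 4 of the tutorial: because `PYRECEST_BACKEND` must be set before `pyrecest` is imported, each backend needs a fresh interpreter. The shell loop in step 4 does this already; the sketch below shows the same idea from Python for test harnesses. `run_under_backend` is a hypothetical helper, and the probe only echoes the environment variable so the example runs without PyRecEst installed. In practice the probe would be replaced by the workflow script.

```python
import os
import subprocess
import sys

BACKENDS = ("numpy", "pytorch", "jax")


def run_under_backend(backend: str, code: str) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter with the backend selected up front."""
    env = dict(os.environ, PYRECEST_BACKEND=backend)
    if backend == "jax":
        # Mirrors the tutorial's command line for double-precision comparisons.
        env["JAX_ENABLE_X64"] = "True"
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=False,
    )


# Stand-in for `my_filter.py`: it only reports which backend was requested.
PROBE = "import os; print(os.environ['PYRECEST_BACKEND'])"

results = {backend: run_under_backend(backend, PROBE) for backend in BACKENDS}
failed = [backend for backend, result in results.items() if result.returncode != 0]
```

Collecting return codes per backend, rather than stopping at the first failure, makes it easier to see whether a problem is backend-specific.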
diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md
@@ -10,6 +10,8 @@ user code.
   Gaussian factors, and compare the result with the information-form update.
 - [Write a filter loop](write-a-filter-loop.md): run a Kalman predict/update
   loop and inspect the posterior state.
+- [Backend-portable workflows](backend-portable-workflows.md): write a compact
+  Kalman workflow that can be smoke-tested under NumPy, PyTorch, and JAX.
 - [Robust Kalman updates](robust-kalman-update.md): use NIS gating and
   heavy-tailed measurement updates for outlier-prone measurements.
 - [Run a tracker](run-a-tracker.md): initialize a labeled multi-Bernoulli

diff --git a/mkdocs.yml b/mkdocs.yml
@@ -30,6 +30,7 @@ nav:
       - Ecosystem Positioning: ecosystem.md
       - Public Identity: public-identity.md
       - Reproducible Experiments: reproducible-experiments.md
+      - Scientific Validation: scientific-validation.md
       - Representation Conversion: representation-conversion.md
       - Model Objects: models.md
       - Public Protocols: protocols.md
@@ -40,6 +41,7 @@ nav:
       - Overview: tutorials/index.md
       - Use a Distribution: tutorials/use-a-distribution.md
       - Write a Filter Loop: tutorials/write-a-filter-loop.md
+      - Backend-Portable Workflows: tutorials/backend-portable-workflows.md
       - Robust Kalman Updates: tutorials/robust-kalman-update.md
       - Run a Tracker: tutorials/run-a-tracker.md
       - Evaluate a Simulation: tutorials/evaluate-a-simulation.md

diff --git a/scripts/check_backend_api_matrix.py b/scripts/check_backend_api_matrix.py
@@ -0,0 +1,156 @@
+#!/usr/bin/env python
+"""Check that the documented backend API matrix matches capability metadata.
+
+The checker intentionally loads ``src/pyrecest/_backend/capabilities.py`` from a
+file path instead of importing ``pyrecest``. That keeps it usable in lightweight
+documentation jobs that may not install the package's numerical dependencies.
+"""
+
+from __future__ import annotations
+
+import argparse
+import importlib.util
+import sys
+from pathlib import Path
+from types import ModuleType
+
+
+BACKEND_COLUMNS = ("numpy", "pytorch", "jax")
+
+
+def _repo_root() -> Path:
+    return Path(__file__).resolve().parents[1]
+
+
+def load_capability_module(source_path: Path | None = None) -> ModuleType:
+    """Load the backend capability module without importing the package."""
+    capabilities_path = source_path or _repo_root() / "src" / "pyrecest" / "_backend" / "capabilities.py"
+    spec = importlib.util.spec_from_file_location("_pyrecest_backend_capabilities", capabilities_path)
+    if spec is None or spec.loader is None:
+        raise RuntimeError(f"Cannot load backend capability metadata from {capabilities_path}")
+
+    module = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(module)
+    return module
+
+
+def parse_documented_matrix(path: Path) -> dict[str, dict[str, str]]:
+    """Parse the public API backend matrix table from ``docs/backend-api-matrix.md``."""
+    rows: dict[str, dict[str, str]] = {}
+    in_public_api_table = False
+
+    for raw_line in path.read_text(encoding="utf-8").splitlines():
+        line = raw_line.strip()
+        if not line.startswith("|"):
+            if in_public_api_table:
+                break
+            continue
+
+        cells = [cell.strip() for cell in line.strip("|").split("|")]
+        if cells == ["API", "NumPy", "PyTorch", "JAX", "Notes"]:
+            in_public_api_table = True
+            continue
+        if not in_public_api_table:
+            continue
+        if len(cells) == 5 and all(set(cell) <= {"-", ":"} for cell in cells):
+            continue
+        if len(cells) != 5:
+            continue
+
+        api_name = cells[0].strip("`")
+        rows[api_name] = {
+            "numpy": cells[1],
+            "pytorch": cells[2],
+            "jax": cells[3],
+            "notes": cells[4],
+        }
+
+    return rows
+
+
+def _normalize_expected_row(row: dict[str, str]) -> dict[str, str]:
+    return {
+        "numpy": row.get("numpy", "unknown"),
+        "pytorch": row.get("pytorch", "unknown"),
+        "jax": row.get("jax", "unknown"),
+        "notes": row.get("notes", ""),
+    }
+
+
+def validate_documented_matrix(
+    documented: dict[str, dict[str, str]],
+    capabilities: dict[str, dict[str, str]],
+    support_levels: tuple[str, ...],
+) -> list[str]:
+    """Return validation errors for mismatches between docs and metadata."""
+    errors: list[str] = []
+    documented_names = set(documented)
+    capability_names = set(capabilities)
+
+    for missing in sorted(capability_names - documented_names):
+        errors.append(f"docs/backend-api-matrix.md is missing API row `{missing}`")
+    for extra in sorted(documented_names - capability_names):
+        errors.append(f"docs/backend-api-matrix.md contains unknown API row `{extra}`")
+
+    for api_name in sorted(documented_names & capability_names):
+        documented_row = documented[api_name]
+        expected_row = _normalize_expected_row(capabilities[api_name])
+        for backend_name in BACKEND_COLUMNS:
+            expected = expected_row[backend_name]
+            observed = documented_row[backend_name]
+            if expected not in support_levels:
+                errors.append(f"metadata row `{api_name}` has invalid {backend_name} support level `{expected}`")
+            if observed != expected:
+                errors.append(
+                    f"docs/backend-api-matrix.md row `{api_name}` has {backend_name}={observed!r}; expected {expected!r}"
+                )
+        if documented_row["notes"] != expected_row["notes"]:
+            errors.append(
+                f"docs/backend-api-matrix.md row `{api_name}` has notes {documented_row['notes']!r}; expected {expected_row['notes']!r}"
+            )
+
+    return errors
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument(
+        "--docs",
+        type=Path,
+        default=_repo_root() / "docs" / "backend-api-matrix.md",
+        help="Path to docs/backend-api-matrix.md.",
+    )
+    parser.add_argument(
+        "--source",
+        type=Path,
+        default=_repo_root() / "src" / "pyrecest" / "_backend" / "capabilities.py",
+        help="Path to src/pyrecest/_backend/capabilities.py.",
+    )
+    return parser
+
+
+def main(argv: list[str] | None = None) -> int:
+    args = build_parser().parse_args(argv)
+    module = load_capability_module(args.source)
+    documented = parse_documented_matrix(args.docs)
+
+    if not documented:
+        print(f"No public API matrix table found in {args.docs}", file=sys.stderr)
+        return 1
+
+    errors = validate_documented_matrix(
+        documented,
+        dict(module.API_BACKEND_CAPABILITIES),
+        tuple(module.BACKEND_SUPPORT_LEVELS),
+    )
+    if errors:
+        for error in errors:
+            print(error, file=sys.stderr)
+        return 1
+
+    print(f"Backend API matrix is synchronized with {args.source}")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
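
Review note on `load_capability_module`: the path-based load follows the standard `importlib` recipe except that the module is not registered in `sys.modules` before execution. That is fine for a file containing only plain constants, but it can matter if the metadata file ever uses constructs that look up their own module by name (dataclasses with postponed annotations are one example). The sketch below shows the variant with registration. It is an illustration under assumptions, not the script's code: `load_module_from_path` is a hypothetical name, and the temporary file is a stand-in with the two attribute names that `main()` reads.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Load one .py file as a module without importing its parent package."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise RuntimeError(f"Cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so code that resolves its own module by name works.
    sys.modules[name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialized module behind on failure.
        sys.modules.pop(name, None)
        raise
    return module


# Hypothetical stand-in for capabilities.py; the real file's contents may differ.
SOURCE = '''
BACKEND_SUPPORT_LEVELS = ("supported", "partial", "unsupported")
API_BACKEND_CAPABILITIES = {
    "KalmanFilter": {
        "numpy": "supported",
        "pytorch": "supported",
        "jax": "partial",
        "notes": "",
    },
}
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "capabilities.py"
    path.write_text(SOURCE, encoding="utf-8")
    module = load_module_from_path("_demo_capabilities", path)

levels = tuple(module.BACKEND_SUPPORT_LEVELS)
capabilities = dict(module.API_BACKEND_CAPABILITIES)
```

The private-looking module name avoids shadowing a real `pyrecest` import, which is the same reason the script uses `_pyrecest_backend_capabilities`.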