Skip to content

chore(rust): bump rand 0.8 -> 0.10, rand_xoshiro 0.6 -> 0.8#391

Merged
igerber merged 3 commits intomainfrom
rust-rand-bump
Apr 26, 2026
Merged

chore(rust): bump rand 0.8 -> 0.10, rand_xoshiro 0.6 -> 0.8#391
igerber merged 3 commits intomainfrom
rust-rand-bump

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented Apr 26, 2026

Summary

Bumps two coupled Rust crates that the bootstrap kernel depends on:

  • rand 0.8 → 0.10
  • rand_xoshiro 0.6 → 0.8

These move together because rand_xoshiro 0.8 requires the rand_core shipped by rand 0.9+. Bumping only one would fail to compile.

Also bumps the crate's declared rust-version from 1.84 to 1.85 (both new dep versions require Rust 1.85+).

API churn

rand 0.9 renamed the gen* method family. Three call sites in rust/src/bootstrap.rs:

Old New
rng.gen::<bool>() rng.random::<bool>()
rng.gen::<f64>() rng.random::<f64>()
rng.gen_range(0..6) rng.random_range(0..6)

Bit-identity: per-weight-type breakdown

Verified by direct side-by-side comparison against main on the same (seed=42, n_bootstrap=2, n_units=4) call:

Weight type Byte stream vs main
Rademacher identical
Mammen identical
Webb shifted

Why Webb shifted but the other two didn't: random::<bool>() and random::<f64>() are byte-stable across rand 0.80.10. random_range (the integer range sampler used only by Webb) is not — rand 0.9 reworked the internal algorithm to use improved rejection sampling. So Xoshiro256PlusPlus::seed_from_u64(seed) followed by random_range(0..6) consumes RNG bytes differently than gen_range(0..6) did.

What this means:

  • Webb weight distributional properties: unchanged (still uniform over 6-point support)
  • Aggregate inference (SE, p-values, CI): converges identically for any reasonable n_bootstrap
  • Bit-for-bit Webb seeded reproducibility against pre-bump baselines: broken

Internal validation (no regressions)

Local on macOS (--features accelerate), DIFF_DIFF_BACKEND=rust:

Sweep Result
tests/test_rust_backend.py (full) 95 passed
pytest -k "bootstrap or wild" 372 passed
pytest -k "webb" 15 passed (incl. test_path_sup_t_seed_reproducibility[webb])
All 9 Webb-mentioning test files in full 794 passed

The test suite uses within-build seed-reproducibility (not cross-version baselines), so the Webb byte shift is invisible to existing tests. No baseline regeneration required.

New regression guard

Added TestRustBackend::test_bootstrap_weights_bit_identity_snapshot which pins fixed-seed weights for all three weight types. Future RNG drift across crate versions will fail this test loudly with a localized error.

CHANGELOG

Added an [Unreleased] entry documenting the Webb byte shift so users with external pinned Webb baselines are not surprised on upgrade.

Methodology references

N/A - dependency version bump with mechanical API rename. No identification, weighting, variance/SE family, or default behavior changed.

Validation

  • cargo build --features accelerate: clean
  • maturin develop --release --features accelerate: clean
  • Local pytest results above (1276 unique test passes across the directly-affected surfaces, with overlap across sweeps)
  • CI matrix: Linux/macOS/Windows × Python 3.11/3.13/3.14 × Rust + pure-Python fallback

Security / privacy

No secrets/PII. Pure dependency upgrade.

Related

These two crates are coupled through rand_core and must move together;
rand_xoshiro 0.8 requires the rand_core that ships with rand 0.9+.

API churn handled in bootstrap.rs (rand 0.9 renamed the gen* family):
- rng.gen::<bool>()    -> rng.random::<bool>()
- rng.gen::<f64>()     -> rng.random::<f64>()
- rng.gen_range(0..6)  -> rng.random_range(0..6)

Bit-identity preserved: Xoshiro256PlusPlus::seed_from_u64(seed) produces
the same byte stream in 0.6 and 0.8, so existing bootstrap baselines
remain valid. Verified locally on macOS Accelerate:
- tests/test_rust_backend.py: 95 passed (includes Rust<->Python parity
  tests on bootstrap weights and bootstrap-derived SE/CI quantiles)
- Bootstrap-focused sweep across the broader test suite: 372 passed
- 467 total passing across the directly affected surfaces

Supersedes #382 (rand) and #384 (rand_xoshiro). blas-src 0.10 -> 0.14
(#386) deferred pending an ndarray compatibility check; out of scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

Overall Assessment

⚠️ Needs changes. The highest-severity unmitigated issue is a P1 build-contract regression: rust/Cargo.toml:L5 still declares rust-version = "1.84", but the bumped RNG crates now require Rust 1.85+. I could not run Cargo end-to-end here because crates.io resolution is blocked in this environment, so this finding is based on the local manifest plus upstream published crate metadata. (docs.rs)

Executive Summary

  • P1: The dependency bump breaks the crate’s advertised MSRV. rand_xoshiro 0.8.0 publishes rust-version = "1.85", and rand 0.10 requires Rust 1.85+, but this crate still advertises 1.84 in rust/Cargo.toml:L5,L28-L29. (docs.rs)
  • No methodology defect found. The changed surface is the shared multiplier/wild-bootstrap weight kernel, and the diff keeps the registry-specified Rademacher/Mammen/Webb supports and probabilities unchanged in docs/methodology/REGISTRY.md:L332-L359 and rust/src/bootstrap.rs:L57-L146.
  • The existing Rust-vs-NumPy bootstrap RNG parity limitation is already tracked in TODO.md:L87, so it is informational rather than a blocker for this PR.
  • Current tests cover distributional properties and same-build reproducibility, but they do not pin fixed-seed output against a golden baseline. tests/test_rust_backend.py:L46-L101 and tests/test_rust_backend.py:L667-L715 would not catch a crate upgrade that preserves moments while changing the actual draw stream.

Methodology

  • Severity: P3. Impact: No methodology defect found. This PR affects the shared Rust multiplier-bootstrap generator exposed through diff_diff/bootstrap_utils.py:L162-L192, but the registry requirements for Rademacher, Mammen, and Webb weights remain satisfied in docs/methodology/REGISTRY.md:L332-L359 and rust/src/bootstrap.rs:L57-L146. Concrete fix: None.

Code Quality

  • No findings.

Performance

  • No findings. The patch does not change allocation shape, rayon chunking, or the row-wise generation loops in rust/src/bootstrap.rs:L59-L75, rust/src/bootstrap.rs:L94-L114, and rust/src/bootstrap.rs:L131-L150.

Maintainability

  • Severity: P1. Impact: rust/Cargo.toml:L5 still advertises Rust 1.84, but rand_xoshiro 0.8.0 publishes rust-version = "1.85" and the rand 0.10 upgrade guide says rand crates now require Rust 1.85+. On the declared MSRV, this dependency set will fail before users ever reach the bootstrap code. CI currently installs stable in .github/workflows/rust-test.yml:L46-L47 and .github/workflows/publish.yml:L33-L35, so this support-contract regression can slip through green CI. Concrete fix: either keep rand/rand_xoshiro on versions compatible with 1.84, or intentionally raise rust-version to 1.85 and update CI/docs/release notes to match. (docs.rs)

Tech Debt

  • No new tech-debt findings. The related Rust-vs-NumPy RNG parity limitation is already tracked in TODO.md:L87, so it does not block this PR.

Security

  • No findings.

Documentation/Tests

  • Severity: P2. Impact: The changed risk surface here is fixed-seed stream drift across RNG crate versions, but the current Rust backend tests only assert shape, moments, value support, and reproducibility within the same build (tests/test_rust_backend.py:L46-L101, tests/test_rust_backend.py:L667-L715). They would not fail if rand_xoshiro changed the emitted sequence while preserving the same distribution. Concrete fix: add a regression test that snapshots the first N generated weights, or a stable hash of the full matrix, for rademacher, mammen, and webb under a fixed seed.

Path to Approval

  1. Resolve the MSRV mismatch in rust/Cargo.toml: either keep dependency versions that remain compatible with Rust 1.84, or bump rust-version to 1.85 and treat that as an intentional support-policy change. (docs.rs)
  2. Add one fixed-seed regression test for the Rust bootstrap weight generator across all three weight types so the PR’s “no draw drift” claim is enforced in-tree.

…pshot

Two fixes plus a CHANGELOG correction surfacing a finding from the
review work itself:

1. P1 - MSRV bump (rust/Cargo.toml):
   rand_xoshiro 0.8 and rand 0.10 both declare rust-version = "1.85".
   Cargo.toml previously advertised 1.84, which would silently break
   anyone building from source on Rust 1.84 (CI uses stable so it
   wouldn't be caught there). Bump to "1.85" as a deliberate MSRV
   change, the cost of the upgrade.

2. P2 - bit-identity snapshot test (tests/test_rust_backend.py):
   New TestRustBackend::test_bootstrap_weights_bit_identity_snapshot
   pins fixed-seed (seed=42, n_bootstrap=2, n_units=4) bootstrap
   weights for all three weight types. Existing tests verify only
   shape, moments, and within-build seed-reproducibility, so a future
   crate upgrade that preserves the distribution while changing the
   draw stream would silently slip through. The snapshot test fails
   loudly with a localized error message instead.

3. Honest CHANGELOG entry on the Webb byte shift (NEW finding from
   side-by-side comparison against main):

   The bit-identity claim in the original PR description was correct
   for Rademacher and Mammen but WRONG for Webb. rand 0.9 reworked the
   internal algorithm for random_range (improved rejection sampling),
   so the same xoshiro byte stream produces different bucket selections
   than the old gen_range did. Distributional properties unchanged
   (still uniform over the 6-point support), aggregate inference
   converges identically, but per-seed Webb draws are now different.

   Verified against main on the same call:
     Rademacher: identical
     Mammen: identical
     Webb: DIFFERENT (verified across both rows of the snapshot)

   No internal test regression: 794 tests pass across all 9 webb-
   mentioning test files with DIFF_DIFF_BACKEND=rust. The test suite
   uses within-build seed-reproducibility, not cross-version baselines,
   so the byte shift is invisible to existing tests. Documented in
   CHANGELOG under [Unreleased] so users with external pinned Webb
   baselines aren't surprised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 26, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 06b3b177be43e44ad2c33305a706a1e4f5b0ef2b


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review status: the previous P1 MSRV/build-contract issue is fixed; rust-version now matches the bumped Rust deps in rust/Cargo.toml:L1-L29.
  • The affected methodology surface is the shared multiplier-bootstrap weight generator (generate_bootstrap_weights_batch plus the Rademacher/Mammen/Webb helpers). The PR preserves the registry-specified supports and probabilities in docs/methodology/REGISTRY.md:L332-L359 and rust/src/bootstrap.rs:L54-L150.
  • The only intentional runtime behavior shift is Webb fixed-seed byte-stream drift from the integer-range sampler. That is a valid implementation-level RNG change, not an undocumented methodology deviation, and it is documented in CHANGELOG.md:L8-L11.
  • The prior regression-test gap is addressed by the new fixed-seed snapshot test in tests/test_rust_backend.py:L103-L150.
  • Existing Rust-vs-NumPy same-seed multiplier parity remains pre-existing tracked debt in TODO.md:L87-L87; this PR does not newly introduce or hide that limitation.
  • Verification limit: I could not complete a local Cargo build in this sandbox because offline dependency resolution failed on uncached blas-src; assessment is based on the diff and surrounding repo context.

Methodology

  • Severity: P3 (informational). Impact: No methodology defect found. The changed code in rust/src/bootstrap.rs:L57-L150 still implements the registry contract for multiplier-bootstrap weights from docs/methodology/REGISTRY.md:L332-L359: Rademacher remains {−1,+1} with 1/2 each, Mammen retains its documented two-point distribution, and Webb retains the documented six-point equal-probability support. The Webb seed-stream shift changes reproducibility, not the bootstrap distribution or the estimator’s variance/SE methodology. No REGISTRY.md deviation note is required because the method contract itself is unchanged. Concrete fix: None.

Code Quality

  • No findings. I checked for a partial API migration and found no remaining gen::<...> / gen_range(...) call sites in the Rust sources; the touched module is internally consistent at rust/src/bootstrap.rs:L67-L70, L102-L109, L141-L146.

Performance

  • No findings.

Maintainability

  • No findings. The prior P1 build-contract problem is resolved by rust/Cargo.toml:L1-L29, and the support-policy change is surfaced to users in CHANGELOG.md:L8-L11.

Tech Debt

  • Severity: P3 (informational, tracked). Impact: Backend-specific same-seed multiplier RNG parity remains a pre-existing tracked limitation in TODO.md:L87-L87; diff_diff/bootstrap_utils.py:L162-L192 still dispatches to Rust when available and NumPy otherwise, so same-seed cross-backend identity is not an invariant this PR newly establishes. Concrete fix: None for approval; keep the existing follow-up tracked in TODO.md.

Security

  • No findings.

Documentation/Tests

  • Severity: P3. Impact: tests/test_rust_backend.py:L103-L150 is described as a byte-for-byte / bit-identity snapshot, but the assertion uses np.testing.assert_allclose(..., atol=1e-14, rtol=1e-14). That still resolves the prior missing-snapshot concern, but it is slightly weaker than the wording suggests because a very small float-level drift could pass. Concrete fix: If true bit identity is the contract, switch this guard to np.testing.assert_array_equal or compare a raw-byte/hash snapshot instead.

…ay_equal

Reviewer noted the snapshot test docstring claims "byte-for-byte / bit
identity" but the assertion was assert_allclose(atol=1e-14, rtol=1e-14),
which is slightly weaker than the wording suggests.

Switch to np.testing.assert_array_equal so the assertion strength
matches the stated contract. Snapshot values are either exact f64
(Rademacher = +/-1.0) or computed once via correctly-rounded IEEE 754
sqrt in Rust (Mammen's golden-ratio constants, Webb's sqrt(N/2)
constants), so cross-platform bit-equality holds on conformant
hardware. Inline comment added explaining the safety argument.

If a runner ever fails this test on a hardware/sqrt drift basis (very
unlikely on x86_64 / aarch64 / ARM64 Windows), we'd learn something
useful about that runner before deciding to soften the check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 26, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 90f526166b97eb9499bdb4bdee3046a92d70e020


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review status: the prior MSRV/build-contract issue is resolved in rust/Cargo.toml:L1-L29, and the earlier snapshot-strength concern is resolved by tests/test_rust_backend.py:L103-L152.
  • The only methodology-adjacent surface touched is the Rust multiplier-bootstrap weight generator in rust/src/bootstrap.rs:L57-L150.
  • Cross-check against docs/methodology/REGISTRY.md:L332-L359 shows the Rademacher, Mammen, and Webb support/probability contracts are unchanged; no estimator math, identification logic, weighting formula, or variance/SE formula changed.
  • Webb same-seed byte-stream drift is documented in CHANGELOG.md:L8-L11; given unchanged Webb support/probabilities, this is an implementation-level RNG change, not an undocumented methodology deviation.
  • Pre-existing Rust-vs-NumPy same-seed multiplier-weight parity remains tracked in TODO.md:L87-L87 and is not newly introduced by this PR.
  • Verification scope: static diff/context review only; I did not run a local Cargo/Pytest build in this read-only, network-restricted sandbox.

Methodology

  • Severity: P3 (informational). Impact: The affected method is the Rust multiplier-bootstrap weight generator (generate_rademacher_batch, generate_mammen_batch, generate_webb_batch) in rust/src/bootstrap.rs:L57-L150. The updated calls still implement the registry contract in docs/methodology/REGISTRY.md:L332-L359: Rademacher remains ±1 with 1/2 mass, Mammen retains the documented two-point support/probabilities, and Webb remains uniform on the six-point support. The only behavior shift is Webb fixed-seed byte-stream drift from random_range(0..6), which is documented in CHANGELOG.md:L8-L11; this is not an undocumented methodology deviation or an incorrect SE/inference implementation. Concrete fix: None.

Code Quality

  • No findings. The API rename was applied consistently in the touched Rust module; I found no leftover gen::<...> or gen_range(...) call sites in rust/src/bootstrap.rs:L57-L150.

Performance

  • No findings.

Maintainability

  • No findings. The prior MSRV/build-contract issue is addressed by rust/Cargo.toml:L1-L29.

Tech Debt

  • Severity: P3 (informational, tracked). Impact: Same-seed Rust-vs-NumPy multiplier-weight parity remains a pre-existing tracked limitation in TODO.md:L87-L87, with the shared dispatcher still at diff_diff/bootstrap_utils.py:L162-L192. This PR does not newly introduce or hide that limitation. Concrete fix: None for approval; keep the existing TODO follow-up.

Security

  • No findings.

Documentation/Tests

  • No findings. The new regression guard in tests/test_rust_backend.py:L103-L152 now uses np.testing.assert_array_equal, so the earlier “bit-identity snapshot” wording is aligned with the assertion strength.

@igerber igerber added the ready-for-ci Triggers CI test workflows label Apr 26, 2026
@igerber igerber merged commit 199268b into main Apr 26, 2026
24 of 25 checks passed
@igerber igerber deleted the rust-rand-bump branch April 26, 2026 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant