Releases: milanofthe/rslab
v0.12.0
Block GMRES orthogonalization overhaul, unified thread control, and a Python multi-RHS iterative path.
Block GMRES (issue #3, item 1)
gmres_block now orthogonalizes the whole RHS panel with block-CGS2 (a parallel, panel-wide sweep over the vector dimension) instead of per-RHS Gram-Schmidt. The multi-RHS solve scales across threads (~2.5x at 12 cores on a deep-Krylov solve, where the old per-RHS path was flat), stays bit-identical across thread counts, and is memory-neutral.
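To make the block-CGS2 idea concrete, here is a minimal serial sketch of the projection step, written against the standard library only. It is not rslab's implementation: `Panel`, `cgs_pass`, and `block_cgs2` are hypothetical names, the data is real-valued and stored column-major, and the parallel sweep over the vector dimension is left out. Each pass computes all coefficients H = VᵀW for the whole panel at once and then applies W -= V·H; running the pass twice is what the "2" in CGS2 refers to.

```rust
/// Column-major dense panel: `cols` columns, each of length `rows`.
/// Hypothetical helper type for this sketch, not part of rslab.
#[derive(Clone, Debug)]
struct Panel {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Panel {
    fn from_columns(columns: &[Vec<f64>]) -> Self {
        let rows = columns.first().map_or(0, |c| c.len());
        assert!(columns.iter().all(|c| c.len() == rows));
        Panel { rows, cols: columns.len(), data: columns.concat() }
    }

    fn col(&self, j: usize) -> &[f64] {
        &self.data[j * self.rows..(j + 1) * self.rows]
    }

    fn col_mut(&mut self, j: usize) -> &mut [f64] {
        let r = self.rows;
        &mut self.data[j * r..(j + 1) * r]
    }
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// One classical Gram-Schmidt pass over the whole panel.
/// Computes H = V^T W from the unmodified W, then updates W -= V H.
/// Returns H in row-major order with shape `v.cols x w.cols`.
fn cgs_pass(v: &Panel, w: &mut Panel) -> Vec<f64> {
    assert_eq!(v.rows, w.rows);
    let (k, s) = (v.cols, w.cols);
    let mut h = vec![0.0; k * s];
    for i in 0..k {
        for j in 0..s {
            h[i * s + j] = dot(v.col(i), w.col(j));
        }
    }
    for j in 0..s {
        for i in 0..k {
            let hij = h[i * s + j];
            let vi = v.col(i);
            for (wr, vr) in w.col_mut(j).iter_mut().zip(vi) {
                *wr -= hij * vr;
            }
        }
    }
    h
}

/// Block CGS2: two classical passes. The second pass removes the
/// component along V that rounding error leaves behind after the first;
/// its coefficients are added to those of the first pass.
fn block_cgs2(v: &Panel, w: &mut Panel) -> Vec<f64> {
    let mut h = cgs_pass(v, w);
    let correction = cgs_pass(v, w);
    for (a, b) in h.iter_mut().zip(&correction) {
        *a += b;
    }
    h
}
```

In this sketch, `v` is assumed to hold orthonormal columns (the Krylov basis built so far) and `w` the new RHS panel; after `block_cgs2(&v, &mut w)` the columns of `w` are orthogonal to `v`. A block GMRES iteration would still need to orthonormalize the columns of `w` against each other, which is omitted here.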
Thread control
- Default thread policy is now Threads::Auto { max: 4 }: predict per matrix, capped at 4 (the Pareto-optimal throughput-per-core point). The numeric factor is unchanged.
- New Threads::Ambient: run the factorization on the current rayon pool (no new spawn), for re-factor-in-loop workloads.
- New rslab::with_threads(n, f): one bounded pool for solver-in-the-loop use. Factor once, then run the multi-RHS GMRES loop capped at n cores.
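The difference between the capped and ambient policies can be sketched as a small resolver. This is an illustration only: `ThreadPolicy` and `resolve_threads` are hypothetical stand-ins rather than rslab's `Threads` type or any function in the crate, and the per-matrix prediction is taken as an input because the notes do not describe how it is computed.

```rust
/// Hypothetical stand-in for a thread policy; not rslab's `Threads` type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ThreadPolicy {
    /// Use a per-matrix prediction, capped at `max`.
    Auto { max: usize },
    /// Reuse the pool the caller is already running on.
    Ambient,
}

/// Resolve a policy to a worker count.
/// `predicted` is the assumed per-matrix estimate and `ambient` the size
/// of the caller's current pool; both are supplied by the caller here.
fn resolve_threads(policy: ThreadPolicy, predicted: usize, ambient: usize) -> usize {
    match policy {
        // Never fewer than one worker, never more than the cap.
        ThreadPolicy::Auto { max } => predicted.clamp(1, max.max(1)),
        // No new pool is created, so the ambient size is used as-is.
        ThreadPolicy::Ambient => ambient.max(1),
    }
}
```

Under these assumptions, `resolve_threads(ThreadPolicy::Auto { max: 4 }, 12, 8)` yields 4, while `ThreadPolicy::Ambient` with the same inputs yields 8, the size of the pool already in use.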
Python bindings
- Ldlt.gmres_block(B, tol, maxit, restart) and Lu.gmres_block(...): preconditioned multi-RHS iterative solve from Python (real and complex).
- Docstrings raised to full numpydoc across the package.
Benches
- block_gmres_scaling with thread-ladder and RHS-sweep modes.
v0.1.0
Full Changelog: https://github.com/milanofthe/rslab/commits/v0.1.0