Releases · milanofthe/rslab · GitHub

Release list

v0.12.0 Latest

milanofthe released this 02 Jul 13:08

88977fb

Block GMRES orthogonalization overhaul, unified thread control, and a Python multi-RHS iterative path.

Block GMRES (issue #3, item 1)

gmres_block now orthogonalizes the whole RHS panel with block-CGS2 (classical Gram-Schmidt applied twice, run as a parallel, panel-wide sweep over the vector dimension) instead of per-RHS Gram-Schmidt. The multi-RHS solve now scales across threads (~2.5x at 12 cores on a deep-Krylov solve, where the old per-RHS path was flat), stays bit-identical across thread counts, and is memory-neutral.
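For readers unfamiliar with the technique, the following is a minimal, serial sketch of textbook block-CGS2 in pure Python. It is not rslab's implementation, which is written in Rust and parallelized; the names `project_out` and `block_cgs2` are hypothetical and chosen for illustration only. The key property is that each pass computes all inner products against the existing basis from the unmodified panel, so the work is one panel-wide sweep rather than a chain of dependent per-vector steps.

```python
import math


def dot(x, y):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def project_out(Q, W):
    """One classical Gram-Schmidt pass over the whole panel.

    Q and W are lists of column vectors. All coefficients
    H[i][j] = q_i . w_j are computed from the unmodified panel first,
    then the update W <- W - Q H is applied.
    """
    H = [[dot(q, w) for w in W] for q in Q]
    W_new = []
    for j, w in enumerate(W):
        w_new = list(w)
        for i, q in enumerate(Q):
            h = H[i][j]
            w_new = [wk - h * qk for wk, qk in zip(w_new, q)]
        W_new.append(w_new)
    return W_new, H


def block_cgs2(Q, W):
    """Orthogonalize panel W against orthonormal basis Q with two CGS passes.

    Returns (V, H): V is an orthonormal panel orthogonal to Q, and H holds
    the accumulated projection coefficients onto Q from both passes.
    """
    W1, H1 = project_out(Q, W)
    W2, H2 = project_out(Q, W1)  # second pass removes rounding residue
    H = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(H1, H2)]

    # Orthonormalize the columns inside the panel, again with two passes each.
    V = []
    for w in W2:
        for _ in range(2):
            coeffs = [dot(v, w) for v in V]
            for c, v in zip(coeffs, V):
                w = [wk - c * vk for wk, vk in zip(w, v)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            raise ValueError("panel is rank-deficient relative to the basis")
        V.append([wk / norm for wk in w])
    return V, H


# Small example: one existing basis vector, a panel of two right-hand sides.
Q = [[1.0, 0.0, 0.0, 0.0]]
W = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 1.0, 0.0]]
V, H = block_cgs2(Q, W)
```

In a parallel implementation the inner products and the panel update are the parts that would typically be split over the vector dimension. How rslab partitions that work, and how it keeps results bit-identical across thread counts, is not described in these notes.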

Thread control

Default thread policy is now Threads::Auto { max: 4 }: the thread count is predicted per matrix and capped at 4 (the Pareto-optimal throughput-per-core point). The numeric factor is unchanged.
New Threads::Ambient: runs the factorization on the current rayon pool (no new spawn), for re-factor-in-loop workloads.
New rslab::with_threads(n, f): one bounded pool for solver-in-the-loop workloads. Factor once, then run the multi-RHS GMRES loop capped at n cores.

Python bindings

Ldlt.gmres_block(B, tol, maxit, restart) and Lu.gmres_block(...): preconditioned multi-RHS iterative solve from Python (real and complex).
Docstrings across the package now follow the full numpydoc format.
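As an illustration of the call shape and of the numpydoc layout, here is a small hypothetical wrapper. Only the method name and the positional order `(B, tol, maxit, restart)` come from the notes above. The wrapper name `solve_panel`, the default values, the parameter descriptions, and the validation are assumptions, and the return type of `gmres_block` is not documented here, so it is passed through untouched.

```python
def solve_panel(factor, B, tol=1e-10, maxit=200, restart=30):
    """Run a preconditioned multi-RHS GMRES solve through a factor object.

    Parameters
    ----------
    factor : object
        Any object exposing ``gmres_block(B, tol, maxit, restart)``,
        for example an ``Ldlt`` or ``Lu`` factorization.
    B : array_like
        Right-hand sides, one per column.
    tol : float, optional
        Presumably the convergence tolerance; the default here is an
        illustrative assumption, not a library default.
    maxit : int, optional
        Presumably the iteration limit (illustrative default).
    restart : int, optional
        Presumably the GMRES restart length (illustrative default).

    Returns
    -------
    object
        Whatever ``factor.gmres_block`` returns, passed through unchanged.

    Raises
    ------
    ValueError
        If ``tol`` is not positive, or ``maxit`` or ``restart`` is below 1.
    """
    if tol <= 0:
        raise ValueError("tol must be positive")
    if maxit < 1 or restart < 1:
        raise ValueError("maxit and restart must be at least 1")
    # Arguments are forwarded positionally, in the order given in the notes.
    return factor.gmres_block(B, tol, maxit, restart)
```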

Benches

block_gmres_scaling with thread-ladder and RHS-sweep modes.

v0.1.0

milanofthe released this 28 Jun 15:39

5af99e3

Full Changelog: https://github.com/milanofthe/rslab/commits/v0.1.0
