Releases: milanofthe/rslab

v0.12.0

@milanofthe released this 02 Jul 13:08

Block GMRES orthogonalization overhaul, unified thread control, and a Python multi-RHS iterative path.

Block GMRES (issue #3, item 1)

  • gmres_block now orthogonalizes the whole RHS panel with block-CGS2 (block classical Gram-Schmidt with a second reorthogonalization pass), a parallel, panel-wide sweep over the vector dimension, instead of per-RHS Gram-Schmidt. The multi-RHS solve now scales across threads (~2.5x at 12 cores on a deep-Krylov solve, where the old per-RHS path was flat), stays bit-identical across thread counts, and is memory-neutral.
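
For readers unfamiliar with block-CGS2, the following is a minimal serial sketch of the idea, not rslab's implementation. It assumes a column-major panel stored as `Vec<Vec<f64>>`, and the names `Panel`, `dot`, `block_cgs_pass`, and `block_cgs2` are hypothetical. Each pass computes all projection coefficients for the whole panel against the current basis before updating any column, and the second pass repeats the projection to clean up rounding error.

```rust
/// Column-major dense panel: `panel[j][i]` is entry (i, j).
/// Hypothetical layout chosen for illustration only.
type Panel = Vec<Vec<f64>>;

/// Dot product with a fixed, sequential summation order.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// One block classical Gram-Schmidt pass: H = Q^T W, then W -= Q H.
/// Returns H with `h[j][i]` = coefficient of basis vector i for column j.
fn block_cgs_pass(q: &Panel, w: &mut Panel) -> Vec<Vec<f64>> {
    // All coefficients come from the same, not yet updated W. This is what
    // makes the step "classical" and lets it run as one panel-wide sweep.
    let h: Vec<Vec<f64>> = w
        .iter()
        .map(|wj| q.iter().map(|qi| dot(qi, wj)).collect())
        .collect();
    // Subtract the projections onto every basis vector from every column.
    for (wj, hj) in w.iter_mut().zip(&h) {
        for (qi, &hij) in q.iter().zip(hj) {
            for (wv, &qv) in wj.iter_mut().zip(qi) {
                *wv -= hij * qv;
            }
        }
    }
    h
}

/// Block-CGS2: run the pass twice and sum the coefficients.
fn block_cgs2(q: &Panel, w: &mut Panel) -> Vec<Vec<f64>> {
    let mut h = block_cgs_pass(q, w);
    let correction = block_cgs_pass(q, w);
    for (hj, cj) in h.iter_mut().zip(&correction) {
        for (a, b) in hj.iter_mut().zip(cj) {
            *a += b;
        }
    }
    h
}
```

The sketch covers only the projection of the new panel against the existing basis. Orthonormalizing the panel's own columns is a separate step not shown here. In a threaded version, keeping each reduction's summation order independent of the thread count is one way to get results that are identical across thread counts.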

Thread control

  • Default thread policy is now Threads::Auto { max: 4 }, which predicts a thread count per matrix and caps it at 4 (the Pareto-optimal throughput-per-core point). The numeric factor is unchanged.
  • New Threads::Ambient — run the factorization on the current rayon pool (no new spawn), for re-factor-in-loop workloads.
  • New rslab::with_threads(n, f) — one bounded pool for solver-in-the-loop: factor once, then run the multi-RHS GMRES loop capped at n cores.
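
As a rough illustration of the capping semantics described above, here is a toy model in plain Rust. It is not rslab's `Threads` type: `ThreadPolicy`, `resolve`, and the `predicted` and `ambient` inputs are hypothetical names, and the real predictor and pool handling are not shown in these notes.

```rust
/// Hypothetical stand-in for a thread policy; not rslab's `Threads` type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ThreadPolicy {
    /// Use the per-matrix prediction, but never more than `max` threads.
    Auto { max: usize },
    /// Reuse whatever pool the caller is already running on.
    Ambient,
}

/// Resolve a policy to a concrete thread count.
/// `predicted` models a per-matrix estimate and `ambient` models the size
/// of the caller's current pool. Both are assumed inputs.
fn resolve(policy: ThreadPolicy, predicted: usize, ambient: usize) -> usize {
    let n = match policy {
        ThreadPolicy::Auto { max } => predicted.min(max),
        ThreadPolicy::Ambient => ambient,
    };
    // Always run on at least one thread.
    n.max(1)
}
```

Under this model, a matrix predicted to benefit from 12 threads resolves to 4 under `Auto { max: 4 }`, while a small matrix predicted at 2 threads stays at 2.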

Python bindings

  • Ldlt.gmres_block(B, tol, maxit, restart) and Lu.gmres_block(...) — preconditioned multi-RHS iterative solve from Python (real and complex).
  • Docstrings raised to full numpydoc across the package.

Benches

  • block_gmres_scaling with thread-ladder and RHS-sweep modes.

v0.1.0

@milanofthe released this 28 Jun 15:39