v0.10.1-beta — Channel-parallel graph path (multicore)
Pre-release
Performance
Channel-parallel graph evolve
- ADR-0184 D3: `evolve_batched` and batched adjoint now split channels across cores via `std::thread::scope`
- Gated behind `parallel` feature (C≥2); each worker has its own `ScratchPool`
- Graph Krylov ML path was previously single-threaded (`parallel` feature only covered Strang2D/3D grid kernels)
- Bit-identical to serial (0-ULP; forward + ascending-index gradient reduction per ADR-0184 D4/D5)
- Measured ~5–6× speedup on i7-12700K (612–1942% CPU utilization)
- Closes the "speed" half of the memory↔speed contradiction for graph diffusion / SSM layers
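
The channel split and the ordered reduction described above can be illustrated with a minimal sketch. This is not the crate's implementation: `Scratch`, `evolve_channel`, `evolve_serial`, and `evolve_parallel` are hypothetical names, and the toy ring-graph smoothing kernel merely stands in for the real Krylov evolve. It only shows how `std::thread::scope` can hand disjoint channel blocks to workers that each own a scratch buffer, and why summing per-channel partials in ascending index order on the calling thread reproduces the serial result bit for bit.

```rust
use std::thread;

/// Hypothetical per-worker scratch buffer (a stand-in for a ScratchPool).
struct Scratch {
    buf: Vec<f64>,
}

/// Toy per-channel kernel (assumption): four smoothing steps on a ring graph.
/// Returns the channel's scalar contribution to a reduced "gradient".
fn evolve_channel(x: &mut [f64], scratch: &mut Scratch) -> f64 {
    let n = x.len();
    scratch.buf.resize(n, 0.0);
    for _ in 0..4 {
        for i in 0..n {
            let left = x[(i + n - 1) % n];
            let right = x[(i + 1) % n];
            scratch.buf[i] = x[i] + 0.1 * (left - 2.0 * x[i] + right);
        }
        x.copy_from_slice(&scratch.buf);
    }
    x.iter().map(|v| v * v).sum()
}

/// Serial reference: channels processed and reduced in ascending index order.
fn evolve_serial(state: &mut [f64], n: usize) -> f64 {
    assert!(n > 0 && state.len() % n == 0);
    let mut scratch = Scratch { buf: Vec::new() };
    let mut grad = 0.0f64;
    for ch in state.chunks_mut(n) {
        grad += evolve_channel(ch, &mut scratch);
    }
    grad
}

/// Channel-parallel variant: each worker gets a contiguous block of channels
/// and its own scratch; the reduction still runs serially afterwards.
fn evolve_parallel(state: &mut [f64], n: usize, workers: usize) -> f64 {
    assert!(n > 0 && state.len() % n == 0);
    let channels = state.len() / n;
    // Channels per worker, rounded up; at least 1 so chunk sizes are nonzero.
    let per_worker = ((channels + workers.max(1) - 1) / workers.max(1)).max(1);
    let mut partials = vec![0.0f64; channels];

    thread::scope(|s| {
        let blocks = state.chunks_mut(per_worker * n);
        let outs = partials.chunks_mut(per_worker);
        for (block, out) in blocks.zip(outs) {
            s.spawn(move || {
                let mut scratch = Scratch { buf: Vec::new() }; // worker-local
                for (ch, g) in block.chunks_mut(n).zip(out.iter_mut()) {
                    *g = evolve_channel(ch, &mut scratch);
                }
            });
        }
    }); // all workers are joined here

    // Ascending-index reduction: same summation order as the serial loop.
    let mut grad = 0.0f64;
    for g in &partials {
        grad += *g;
    }
    grad
}

fn main() {
    let n = 8; // nodes per channel
    let mut a: Vec<f64> = (0..6 * n).map(|i| (i as f64 * 0.37).sin()).collect();
    let mut b = a.clone();
    let serial = evolve_serial(&mut a, n);
    let parallel = evolve_parallel(&mut b, n, 4);
    println!("bit-identical: {}", serial.to_bits() == parallel.to_bits());
}
```

Because floating-point addition is not associative, letting workers accumulate into a shared sum (or reducing per-worker subtotals) would generally change the rounding. Storing one partial per channel and summing them in index order on the calling thread avoids that at the cost of a small extra buffer.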
See CHANGELOG.md for full technical details and gate validation.