v0.10.1-beta — Channel-parallel graph path (multicore)

Pre-release

@github-actions github-actions released this 27 Jun 17:35

Performance

Channel-parallel graph evolve

  • ADR-0184 D3: evolve_batched and the batched adjoint now split channels across cores via std::thread::scope
  • Gated behind the parallel feature and used only when the channel count C ≥ 2; each worker has its own ScratchPool
  • The graph Krylov ML path was previously single-threaded (the parallel feature covered only the Strang2D/3D grid kernels)
  • Bit-identical to the serial path (0 ULP; forward pass plus ascending-index gradient reduction per ADR-0184 D4/D5)
  • Measured ~5–6× speedup on an i7-12700K (612–1942% CPU)
  • Closes the "speed" half of the memory↔speed contradiction for graph diffusion / SSM layers
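
The pattern described above can be illustrated with a minimal sketch. It is not the crate's implementation: the ScratchPool struct shown here, the toy kernel evolve_channel, and the functions evolve_serial and evolve_parallel are hypothetical stand-ins. It only shows how channels might be split across scoped threads, each with its own scratch buffer, while a fixed ascending-index reduction keeps the result bit-identical to a serial loop.

```rust
use std::thread;

/// Hypothetical stand-in for a per-worker scratch buffer (not the crate's type).
struct ScratchPool {
    buf: Vec<f64>,
}

impl ScratchPool {
    fn new(n: usize) -> Self {
        ScratchPool { buf: vec![0.0; n] }
    }
}

/// Toy per-channel kernel (assumption): one explicit diffusion step on a path
/// graph. Returns a per-channel scalar standing in for a gradient contribution.
fn evolve_channel(x: &mut [f64], scratch: &mut ScratchPool) -> f64 {
    let n = x.len();
    for i in 0..n {
        let left = if i > 0 { x[i - 1] } else { x[i] };
        let right = if i + 1 < n { x[i + 1] } else { x[i] };
        scratch.buf[i] = x[i] + 0.25 * (left - 2.0 * x[i] + right);
    }
    x.copy_from_slice(&scratch.buf[..n]);
    x.iter().map(|v| v * v).sum()
}

/// Serial reference: channels are processed and reduced in ascending order.
fn evolve_serial(data: &mut [f64], n: usize) -> f64 {
    assert!(n > 0 && data.len() % n == 0);
    let mut scratch = ScratchPool::new(n);
    let mut total = 0.0;
    for ch in data.chunks_mut(n) {
        total += evolve_channel(ch, &mut scratch);
    }
    total
}

/// Channel-parallel variant: contiguous blocks of channels go to scoped threads.
fn evolve_parallel(data: &mut [f64], n: usize, workers: usize) -> f64 {
    assert!(n > 0 && data.len() % n == 0);
    let channels = data.len() / n;
    let w = workers.max(1);
    let per_worker = ((channels + w - 1) / w).max(1);
    // One slot per channel, so no worker ever reduces across channels itself.
    let mut partial = vec![0.0f64; channels];

    thread::scope(|s| {
        let blocks = data.chunks_mut(per_worker * n);
        let outs = partial.chunks_mut(per_worker);
        for (block, out) in blocks.zip(outs) {
            s.spawn(move || {
                // Each worker owns its scratch, so nothing is shared mutably.
                let mut scratch = ScratchPool::new(n);
                for (ch, slot) in block.chunks_mut(n).zip(out.iter_mut()) {
                    *slot = evolve_channel(ch, &mut scratch);
                }
            });
        }
    });

    // Reduce in ascending channel index, matching the serial summation order.
    let mut total = 0.0;
    for p in &partial {
        total += *p;
    }
    total
}
```

Because every channel is computed by the same deterministic kernel and the per-channel results are summed in the same order as in the serial loop, the floating-point result does not depend on the number of workers or on thread scheduling.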

See CHANGELOG.md for full technical details and gate validation.