v0.5.0: Phase 6 — Deployment, Portability & Testing Infrastructure

@bwiemz released this 18 Mar 20:14

Multi-Backend KIR Foundation (M47a)

  • Kernel IR — 40+ instruction SSA-form intermediate representation
  • PTX Backend — KIR to PTX lowering with typed register allocation
  • GpuTarget — CUDA/ROCm/Metal/WebGPU with per-backend feature capability tables
  • GpuBackend trait — alloc/free/copy/launch/sync interface for all backends
  • target(backend) — conditional compilation per GPU target
  • --target — CLI flag for backend selection
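As a rough illustration of the alloc/free/copy/launch/sync interface listed above, here is a minimal Rust sketch. Only the names GpuTarget and GpuBackend come from these notes. The method signatures, the BufferId handle, the parse helper, the accepted target spellings, and the host-memory MockBackend are assumptions made for the example and are not the project's actual API.

```rust
use std::collections::HashMap;

/// The four targets named in the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GpuTarget { Cuda, Rocm, Metal, WebGpu }

impl GpuTarget {
    /// Parse a backend name such as one given to `--target`.
    /// The accepted spellings are assumptions, not the real CLI grammar.
    pub fn parse(name: &str) -> Option<GpuTarget> {
        match name.to_ascii_lowercase().as_str() {
            "cuda" => Some(GpuTarget::Cuda),
            "rocm" => Some(GpuTarget::Rocm),
            "metal" => Some(GpuTarget::Metal),
            "webgpu" => Some(GpuTarget::WebGpu),
            _ => None,
        }
    }
}

/// Hypothetical opaque handle to a device allocation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct BufferId(u64);

/// Hypothetical shape of a backend interface with the five operations
/// mentioned in the notes (copy is split into two directions here).
pub trait GpuBackend {
    fn target(&self) -> GpuTarget;
    fn alloc(&mut self, bytes: usize) -> BufferId;
    /// Returns false if the buffer was unknown or already freed.
    fn free(&mut self, buf: BufferId) -> bool;
    fn copy_to_device(&mut self, buf: BufferId, data: &[u8]) -> Result<(), String>;
    fn copy_to_host(&self, buf: BufferId) -> Result<Vec<u8>, String>;
    fn launch(&mut self, kernel: &str, grid: u32, block: u32) -> Result<(), String>;
    /// Waits for queued launches and returns how many were drained.
    fn sync(&mut self) -> usize;
}

/// Host-memory stand-in so the trait can be exercised without a GPU.
pub struct MockBackend {
    target: GpuTarget,
    next_id: u64,
    buffers: HashMap<BufferId, Vec<u8>>,
    pending: Vec<String>,
}

impl MockBackend {
    pub fn new(target: GpuTarget) -> Self {
        MockBackend { target, next_id: 0, buffers: HashMap::new(), pending: Vec::new() }
    }
}

impl GpuBackend for MockBackend {
    fn target(&self) -> GpuTarget { self.target }

    fn alloc(&mut self, bytes: usize) -> BufferId {
        let id = BufferId(self.next_id);
        self.next_id += 1;
        self.buffers.insert(id, vec![0u8; bytes]); // zero-initialised
        id
    }

    fn free(&mut self, buf: BufferId) -> bool {
        self.buffers.remove(&buf).is_some()
    }

    fn copy_to_device(&mut self, buf: BufferId, data: &[u8]) -> Result<(), String> {
        let dst = self.buffers.get_mut(&buf).ok_or_else(|| "unknown buffer".to_string())?;
        if data.len() > dst.len() {
            return Err("copy exceeds allocation".to_string());
        }
        dst[..data.len()].copy_from_slice(data);
        Ok(())
    }

    fn copy_to_host(&self, buf: BufferId) -> Result<Vec<u8>, String> {
        self.buffers.get(&buf).cloned().ok_or_else(|| "unknown buffer".to_string())
    }

    fn launch(&mut self, kernel: &str, grid: u32, block: u32) -> Result<(), String> {
        if grid == 0 || block == 0 {
            return Err("empty launch configuration".to_string());
        }
        self.pending.push(kernel.to_string()); // the mock only records the launch
        Ok(())
    }

    fn sync(&mut self) -> usize {
        let drained = self.pending.len();
        self.pending.clear();
        drained
    }
}
```

Keeping the trait free of backend-specific types means a mock like this one could back unit tests, while a real CUDA or Metal implementation would wrap the vendor API behind the same calls.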

vmap AST Transform (M39b)

  • VmapTransformer — FnDef-to-FnDef AST rewriting producing _batched variants
  • Matmul/reduction/transpose rewriting with batch status propagation
  • nsl_vmap_check_batch runtime FFI
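To make "batch status propagation" concrete, the sketch below rewrites a toy expression tree. It is not the project's VmapTransformer: the Expr type, the vmap_expr function, and the convention that the batch dimension is inserted at axis 0 (so axis arguments of batched operands shift by one) are all assumptions chosen for illustration. Matmul is assumed to broadcast over leading dimensions, so only its status changes.

```rust
use std::collections::HashSet;

/// Whether a value carries the extra leading batch dimension.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BatchStatus { Unbatched, Batched }

impl BatchStatus {
    /// A binary op is batched if either operand is.
    fn join(self, other: BatchStatus) -> BatchStatus {
        if self == BatchStatus::Batched || other == BatchStatus::Batched {
            BatchStatus::Batched
        } else {
            BatchStatus::Unbatched
        }
    }

    /// Axis shift caused by a batch dimension at axis 0 (assumed convention).
    fn axis_offset(self) -> usize {
        match self { BatchStatus::Batched => 1, BatchStatus::Unbatched => 0 }
    }
}

/// Toy expression tree standing in for a function body.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Var(String),
    MatMul(Box<Expr>, Box<Expr>),
    Transpose(Box<Expr>, usize, usize), // swap two axes
    Sum(Box<Expr>, usize),              // reduce over one axis
}

/// Rewrite `e` for batched execution and report the result's batch status.
/// `batched` names the inputs that carry a batch dimension.
pub fn vmap_expr(e: &Expr, batched: &HashSet<String>) -> (Expr, BatchStatus) {
    match e {
        Expr::Var(name) => {
            let status = if batched.contains(name) {
                BatchStatus::Batched
            } else {
                BatchStatus::Unbatched
            };
            (e.clone(), status)
        }
        Expr::MatMul(a, b) => {
            let (a2, sa) = vmap_expr(a, batched);
            let (b2, sb) = vmap_expr(b, batched);
            (Expr::MatMul(Box::new(a2), Box::new(b2)), sa.join(sb))
        }
        Expr::Transpose(x, i, j) => {
            let (x2, s) = vmap_expr(x, batched);
            let off = s.axis_offset();
            (Expr::Transpose(Box::new(x2), *i + off, *j + off), s)
        }
        Expr::Sum(x, axis) => {
            let (x2, s) = vmap_expr(x, batched);
            // Reducing a non-batch axis keeps the batch dimension intact.
            (Expr::Sum(Box::new(x2), *axis + s.axis_offset()), s)
        }
    }
}
```

For example, with x marked as batched, Sum(MatMul(x, w), 1) becomes Sum(MatMul(x, w), 2) and is reported as Batched, while an expression that only touches unbatched inputs is returned unchanged.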

Testing Infrastructure

  • Snapshot testing (insta) — 7 PTX/KIR/fusion snapshots catching silent codegen regressions
  • Differential oracle testing — same script with/without --disable-fusion, assert numerical equivalence
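The differential oracle idea can be sketched in a few lines of Rust. In the real setup the two sides would presumably be runs of the same script with and without --disable-fusion. How those runs are invoked is not shown in these notes, so two in-process closures stand in for them here. The function names and the tolerance formula (|a - b| <= atol + rtol * |a|, with the unfused result as reference) are assumptions for illustration.

```rust
/// Index of the first element where `got` differs from `want` beyond tolerance,
/// or None if the slices agree. A length mismatch reports the shorter length.
/// NaN never compares as close, so a NaN on either side counts as a mismatch.
pub fn first_mismatch(want: &[f32], got: &[f32], rtol: f32, atol: f32) -> Option<usize> {
    if want.len() != got.len() {
        return Some(want.len().min(got.len()));
    }
    want.iter()
        .zip(got)
        .position(|(w, g)| !((w - g).abs() <= atol + rtol * w.abs()))
}

/// True when every element agrees within tolerance.
pub fn allclose(want: &[f32], got: &[f32], rtol: f32, atol: f32) -> bool {
    first_mismatch(want, got, rtol, atol).is_none()
}

/// Run a reference and a candidate computation and compare their outputs.
pub fn differential_check<F, G>(reference: F, candidate: G, rtol: f32, atol: f32) -> Result<(), String>
where
    F: Fn() -> Vec<f32>,
    G: Fn() -> Vec<f32>,
{
    let want = reference();
    let got = candidate();
    match first_mismatch(&want, &got, rtol, atol) {
        None => Ok(()),
        Some(i) => Err(format!(
            "outputs diverge at index {} (reference len {}, candidate len {})",
            i,
            want.len(),
            got.len()
        )),
    }
}

/// Stand-in for the unfused path: multiply, materialise, then add.
pub fn unfused(a: &[f32], b: &[f32], c: &[f32]) -> Vec<f32> {
    let prod: Vec<f32> = a.iter().zip(b).map(|(x, y)| x * y).collect();
    prod.iter().zip(c).map(|(p, z)| p + z).collect()
}

/// Stand-in for the fused path: a single fused multiply-add per element.
pub fn fused(a: &[f32], b: &[f32], c: &[f32]) -> Vec<f32> {
    a.iter()
        .zip(b)
        .zip(c)
        .map(|((x, y), z)| x.mul_add(*y, *z))
        .collect()
}
```

A tolerance is used rather than bitwise equality because a fused multiply-add rounds once where the separate multiply and add round twice, so correct fused and unfused results can differ in the last bits.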

Full Changelog: v0.4.0...v0.5.0