krnl was developed to replace core functionality in autograph:
- Only targets Vulkan, more portable than Metal / DX12.
    - Metal is supported via MoltenVK.
- GPGPU kernels implemented inline in Rust:
    - Kernels can be defined in the same file, near where they are invoked.
    - Modules allow sharing code between host and device.
    - Kernel bindings are type safe, checked at compile time.
    - Simple iterator patterns can be implemented without unsafe.
    - Supports specialization constants provided at runtime.
- DeviceInfo includes useful properties:
    - Max / default threads per group.
    - Max / min threads per subgroup.
- With DebugPrintf, kernel panics produce errors on the host.
- krnlc generates a device crate and invokes spirv-builder.
    - spirv-builder / spirv-tools are compiled once on install.
    - Significantly streamlines and accelerates workflow.
- Kernels are compressed to reduce package and binary size.
- Device operations readily execute:
    - Calls block until the kernel / transfer can be queued.
    - An operation can be queued while another is executing.
    - This reduces latency and improves repeatability, reliability, and performance.
- Device buffers can be copied by the host if host visible.
- Large buffer copies are streamed rather than allocating a large temporary:
    - Reuses a few small buffers for transfers.
    - Overlaps host and device copies.
    - Performance significantly closer to CUDA.
    - Also streams between devices.
- Device buffers can be up to i32::MAX bytes (~2 GB, up from 256 MB).
- Scalar / ScalarBufferBase replaces Float / FloatBuffer:
    - Streamlined conversions between buffers.
    - Buffers can be sliced.
- Supports wasm (without the device feature).
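
The queueing behavior described in the bullet on device operations can be modeled on the host with a bounded channel. The following is only a conceptual sketch using the Rust standard library; it is not krnl's implementation, and `ToyQueue`, `submit`, and `finish` are hypothetical names chosen for illustration. A channel of capacity 1 lets one operation wait in the queue while another executes, and `submit` blocks only until the operation can be queued, not until it completes.

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

/// A queued operation. Hypothetical stand-in for a kernel launch or transfer.
type Op = Box<dyn FnOnce() + Send + 'static>;

/// Toy model of a device queue (assumption: not part of krnl's API).
struct ToyQueue {
    sender: SyncSender<Op>,
    worker: JoinHandle<()>,
}

impl ToyQueue {
    fn new() -> Self {
        // Capacity 1: one operation may sit in the queue while the worker
        // thread is still executing the previous one.
        let (sender, receiver) = sync_channel::<Op>(1);
        let worker = thread::spawn(move || {
            // Execute operations in submission order until the queue closes.
            for op in receiver {
                op();
            }
        });
        Self { sender, worker }
    }

    /// Blocks until `op` can be queued, then returns without waiting for it to run.
    fn submit(&self, op: impl FnOnce() + Send + 'static) {
        self.sender.send(Box::new(op)).expect("worker thread is alive");
    }

    /// Closes the queue and waits for all queued operations to finish.
    fn finish(self) {
        let ToyQueue { sender, worker } = self;
        drop(sender); // closing the channel ends the worker's loop
        worker.join().expect("worker thread panicked");
    }
}
```

In this model the caller is throttled only when the single queue slot is occupied, which keeps the worker busy without letting an unbounded backlog build up.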
MSRV: 1.70.0
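
The streamed copy strategy listed above (a few small reusable buffers, with the two copy stages overlapped) can be sketched on the host alone. This is a minimal illustration using only the standard library, under the assumption that a second thread stands in for the device side; `streamed_copy`, `chunk_size`, and `num_staging` are hypothetical names, and krnl's actual transfer code is not shown here.

```rust
use std::sync::mpsc;
use std::thread;

/// Copies `src` into `dst` through a small pool of reusable staging buffers,
/// instead of allocating one temporary as large as `src`.
fn streamed_copy(src: &[u8], dst: &mut [u8], chunk_size: usize, num_staging: usize) {
    assert_eq!(src.len(), dst.len());
    assert!(chunk_size > 0 && num_staging > 0);

    // `full` carries filled staging buffers forward; `free` returns them for reuse.
    let (full_tx, full_rx) = mpsc::channel::<(usize, usize, Vec<u8>)>();
    let (free_tx, free_rx) = mpsc::channel::<Vec<u8>>();
    for _ in 0..num_staging {
        free_tx.send(vec![0u8; chunk_size]).unwrap();
    }

    thread::scope(|scope| {
        // Stage 1: fill staging buffers from the source, one chunk at a time.
        scope.spawn(move || {
            let mut offset = 0;
            for chunk in src.chunks(chunk_size) {
                // Wait for a staging buffer to become free again.
                let mut staging = free_rx.recv().unwrap();
                staging[..chunk.len()].copy_from_slice(chunk);
                full_tx.send((offset, chunk.len(), staging)).unwrap();
                offset += chunk.len();
            }
            // Dropping `full_tx` here ends the loop in stage 2.
        });

        // Stage 2: drain filled buffers into the destination while stage 1
        // is already filling the next one.
        for (offset, len, staging) in full_rx {
            dst[offset..offset + len].copy_from_slice(&staging[..len]);
            // Hand the buffer back; stage 1 may already have finished.
            let _ = free_tx.send(staging);
        }
    });
}
```

For example, `streamed_copy(&src, &mut dst, 1024, 2)` moves a buffer of any size using only two 1 KiB staging buffers, so peak temporary memory stays constant regardless of the copy size.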