cuTile Rust v0.2.0: low-precision inference support and paper artifacts #164

elibol · 2026-06-16T21:51:42Z

elibol
Jun 16, 2026
Maintainer

cuTile Rust 0.2.0 is now available.

This release focuses on low-precision inference support. Targeting CUDA 13.3, it adds NVFP4 packing and unpacking and block-scaled matrix multiply, along with runnable NVFP4 and MXFP8 examples. It also introduces the new cutile-kernels crate, a source of high-performance inference kernels written in cuTile Rust.
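For readers new to 4-bit formats, here is a minimal CPU-side sketch of what "packing and unpacking" means for FP4 E2M1 values, the element type that NVFP4 builds on. It is plain Rust and does not use the cuTile Rust API. The function names, the low-nibble-first byte layout, the nearest-value rounding with ties toward the smaller magnitude, and the f32 scale are all assumptions made for illustration; real NVFP4 stores a compact per-block scale rather than an f32, and the release's own kernels may differ on every one of these points.

```rust
/// The eight non-negative magnitudes representable in FP4 E2M1
/// (1 sign bit, 2 exponent bits, 1 mantissa bit), indexed by the low 3 bits.
const E2M1_MAGNITUDES: [f32; 8] = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0];

/// Encode one value as a 4-bit code (sign in bit 3).
/// Assumption: round to the nearest magnitude, ties toward the smaller one;
/// values beyond 6.0 saturate to 6.0.
fn encode_e2m1(x: f32) -> u8 {
    let sign = if x.is_sign_negative() { 0x8 } else { 0x0 };
    let mag = x.abs();
    let mut best = 0usize;
    for (i, &m) in E2M1_MAGNITUDES.iter().enumerate() {
        if (mag - m).abs() < (mag - E2M1_MAGNITUDES[best]).abs() {
            best = i;
        }
    }
    sign | best as u8
}

/// Decode a 4-bit code back to f32.
fn decode_e2m1(code: u8) -> f32 {
    let mag = E2M1_MAGNITUDES[(code & 0x7) as usize];
    if code & 0x8 != 0 { -mag } else { mag }
}

/// Hypothetical helper: pick a scale so the largest magnitude in a block maps to 6.0.
fn block_scale(block: &[f32]) -> f32 {
    let amax = block.iter().fold(0.0f32, |a, &v| a.max(v.abs()));
    if amax == 0.0 { 1.0 } else { amax / 6.0 }
}

/// Pack two 4-bit codes per byte after dividing by the block scale.
/// Assumption: the first element of each pair goes in the low nibble.
fn pack_fp4(values: &[f32], scale: f32) -> Vec<u8> {
    values
        .chunks(2)
        .map(|pair| {
            let lo = encode_e2m1(pair[0] / scale);
            let hi = if pair.len() == 2 { encode_e2m1(pair[1] / scale) } else { 0 };
            lo | (hi << 4)
        })
        .collect()
}

/// Unpack `len` values and multiply the block scale back in.
fn unpack_fp4(packed: &[u8], len: usize, scale: f32) -> Vec<f32> {
    packed
        .iter()
        .flat_map(|&b| [decode_e2m1(b & 0xF) * scale, decode_e2m1(b >> 4) * scale])
        .take(len)
        .collect()
}
```

Calling `pack_fp4` on a block with the scale returned by `block_scale`, then `unpack_fp4` with the same scale, gives back the quantized approximation of the block. A block-scaled matrix multiply applies such per-block scales to the operands as part of the multiply.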

This release also accompanies our paper, Fearless Concurrency on the GPU: https://arxiv.org/abs/2606.15991. The paper artifacts are included in the repository under cutile-benchmarks/paper/.

Our results show that cuTile Rust adds safety without measurable runtime overhead. On NVIDIA B200, cuTile Rust reaches 7 TB/s for element-wise operations and 2 PFlop/s for GEMM, about 91% of peak memory bandwidth and 92% of dense f16 peak, respectively. Safe Rust persistent GEMM reaches 2.07 PFlop/s at M=N=K=8192, within 0.3% of the corresponding low-level Tile IR variant.
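To put the GEMM figure in concrete terms, the sketch below converts the quoted throughput into a time per matrix multiply. It uses the conventional count of 2·M·N·K floating-point operations for a dense GEMM (one multiply and one add per inner-product term). The function names are illustrative, and the computed time is derived arithmetic rather than a number reported in this post.

```rust
/// Conventional FLOP count for a dense M x N x K GEMM:
/// one multiply and one add per inner-product term.
fn gemm_flops(m: u64, n: u64, k: u64) -> u64 {
    2 * m * n * k
}

/// Time in seconds to execute `flops` operations at a sustained rate.
fn seconds_at(flops: u64, flops_per_sec: f64) -> f64 {
    flops as f64 / flops_per_sec
}

/// Time in milliseconds for one M=N=K=8192 GEMM at the quoted 2.07 PFlop/s.
fn quoted_gemm_millis() -> f64 {
    let flops = gemm_flops(8192, 8192, 8192); // 1_099_511_627_776 FLOPs
    seconds_at(flops, 2.07e15) * 1e3
}
```

Under these assumptions, one 8192-cubed GEMM takes roughly half a millisecond at the quoted rate.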

We also evaluated Grout, a Qwen3 inference engine built with cuTile Rust in collaboration with Hugging Face. In batch-1 Qwen3 decode, Grout reaches 171 tokens/s for Qwen3-4B on NVIDIA GeForce RTX 5090 and 82 tokens/s for Qwen3-32B on B200, showing strong performance on memory-bound inference tasks, consistent with our HBM roofline analysis.
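As a rough way to read the decode numbers, a batch-1 decode step has to stream approximately all model weights from memory once per generated token, so tokens/s multiplied by weight bytes gives a lower bound on the memory bandwidth being sustained. The sketch below does that arithmetic. It assumes nominal parameter counts of 4e9 and 32e9 and 2 bytes per weight (16-bit weights); this post does not state the weight precision or exact parameter counts, so the outputs are only illustrative and are not figures from the paper.

```rust
/// Bytes occupied by the weights. Both arguments are assumptions of this
/// sketch, not values taken from the announcement.
fn weight_bytes(params: f64, bytes_per_param: f64) -> f64 {
    params * bytes_per_param
}

/// Bandwidth (bytes/s) implied by a decode rate, if every token reads
/// all weights exactly once and nothing else.
fn implied_bandwidth(tokens_per_sec: f64, weight_bytes: f64) -> f64 {
    tokens_per_sec * weight_bytes
}

/// Roofline upper bound on batch-1 decode rate for a given memory bandwidth.
fn roofline_tokens_per_sec(bandwidth_bytes_per_sec: f64, weight_bytes: f64) -> f64 {
    bandwidth_bytes_per_sec / weight_bytes
}

/// Implied bandwidth in TB/s for the two quoted decode rates,
/// under the 16-bit-weight assumption: (Qwen3-4B, Qwen3-32B).
fn implied_tb_per_sec() -> (f64, f64) {
    let small = implied_bandwidth(171.0, weight_bytes(4e9, 2.0)) / 1e12;
    let large = implied_bandwidth(82.0, weight_bytes(32e9, 2.0)) / 1e12;
    (small, large)
}
```

Comparing the implied bandwidth with a device's peak memory bandwidth is the essence of a roofline check: the closer the two values, the closer the decode loop is to being limited by memory traffic alone.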

cuTile Rust remains early-stage research software, but 0.2.0 is a meaningful step toward writing practical inference kernels in idiomatic Rust while preserving Rust's ownership discipline across the GPU launch boundary.

Release notes: https://github.com/NVlabs/cutile-rs/releases/tag/v0.2.0
Crates.io: https://crates.io/crates/cutile
