v1.17.0-release

@Anerudhan Anerudhan released this 20 Dec 00:10
· 24 commits to main since this release
b372d39

cuDNN Frontend v1.17.0 Release Notes

cuDNN Frontend v1.17.0 is the recommended version for cuDNN 9.17.0 and later releases.

New Features 🚀

Open-Source Kernels

  • Native Sparse Attention: The Native Sparse Attention (NSA) module implements native sparse attention as described in the paper "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention". Usage samples for the Blackwell architecture are in test/python/fe_api/nsa.

  • Gemm/Swiglu: Gemm_Swiglu now supports block-scaled FP8/FP4 datatypes.
    API changes:

    • Output tensors have been renamed from "C" and "Glu" to "AB12" and "C", respectively.
    • The "use_2cta_intrs" option has been removed; it is now inferred automatically from the tile shape.
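
The rename above reuses the name "C" with a new meaning, so a naive in-place rename can clobber one output with the other. The following is a minimal sketch of a migration helper for code that looks up outputs by name. It assumes outputs are held in a plain dict keyed by the old names; `OUTPUT_RENAMES`, `rename_outputs`, and the placeholder values are hypothetical and not part of the cuDNN Frontend API.

```python
# Hypothetical migration helper; not part of the cuDNN Frontend API.
# Old output name -> new output name, as listed in the release notes above.
OUTPUT_RENAMES = {"C": "AB12", "Glu": "C"}


def rename_outputs(old_outputs):
    """Return a new dict with pre-1.17.0 output names mapped to the new ones.

    The result is built in a single pass from the old dict, so the old "C"
    entry is moved to "AB12" before anything is stored under the new "C".
    """
    return {OUTPUT_RENAMES.get(name, name): value
            for name, value in old_outputs.items()}


# Placeholder strings stand in for whatever tensor objects the caller holds.
old = {"C": "old_C_tensor", "Glu": "old_Glu_tensor"}
new = rename_outputs(old)
print(new)  # {'AB12': 'old_C_tensor', 'C': 'old_Glu_tensor'}
```

Building a fresh dict rather than mutating the old one avoids any dependence on the order in which the two renames are applied.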

Enhancements ✨

Scaled Dot-Product Attention (SDPA)

Additional Improvements

  • Tensor properties: Added the vector dimension and vectorization count to the tensor properties.
  • Graph wrapper: Fixed an issue in the native graph wrapper that caused a BufferError with non-PyTorch tensors.

Benchmarking 📊

  • Updated the GB200 and GB300 benchmark results to reflect the SDPA improvements added in cuDNN 9.17.0.

Samples