v1.18.0-release

@Anerudhan Anerudhan released this 27 Jan 23:08
· 23 commits to main since this release
b8c0656

cuDNN Frontend v1.18.0 Release Notes

cuDNN Frontend v1.18.0 is the recommended version for cuDNN 9.18.1 and later releases.

General Improvements 🚀

  • The frontend no longer uses the v0.x API internally; it now calls the cuDNN backend API directly.
  • Reduced execution overhead by caching the results of repeated graph queries.
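
To illustrate the general idea behind caching repeated queries, here is a minimal sketch in plain Python. It is not the frontend's implementation: `_query_backend`, `cached_workspace_size`, `STATS`, and the shape-tuple key are all hypothetical names chosen for this example, and the "workspace size" formula is made up.

```python
from functools import lru_cache

# Counts how often the (hypothetical) expensive query actually runs.
STATS = {"backend_calls": 0}


def _query_backend(graph_key: tuple) -> int:
    """Stand-in for an expensive per-graph query. Not a cuDNN call."""
    STATS["backend_calls"] += 1
    batch, seq_len, head_dim = graph_key
    return batch * seq_len * head_dim * 2  # illustrative number only


@lru_cache(maxsize=None)
def cached_workspace_size(graph_key: tuple) -> int:
    """Answer repeated queries for the same key from the cache."""
    return _query_backend(graph_key)


if __name__ == "__main__":
    key = (8, 1024, 128)
    for _ in range(1000):
        cached_workspace_size(key)
    # The stand-in query ran once; the other 999 calls were cache hits.
    print(STATS["backend_calls"], cached_workspace_size.cache_info().hits)
```

The key must capture everything that can change the answer; here that is assumed to be the shape tuple alone.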

Open-Source Kernels

New open-source kernel for Grouped GEMM and SwiGLU fusion.
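
For readers unfamiliar with the operation, the following pure-Python reference sketches what a grouped GEMM followed by SwiGLU computes. It says nothing about how the kernel is written. The function names and the `[gate | up]` column layout of each weight matrix are assumptions made for this example; the real kernel's layout may differ.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matmul(a, b):
    """Plain (M x K) @ (K x N) product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def grouped_gemm_swiglu(tokens, group_sizes, weights):
    """Reference semantics only (hypothetical helper, not a cuDNN API).

    tokens:      (T x K) rows, sorted so each group's rows are contiguous
    group_sizes: number of rows that belong to each group
    weights:     one (K x 2N) matrix per group; ASSUMED layout is
                 [gate | up]: first N columns gate, last N columns up
    returns:     (T x N) rows holding silu(gate) * up
    """
    out, start = [], 0
    for size, w in zip(group_sizes, weights):
        rows = tokens[start:start + size]
        start += size
        n = len(w[0]) // 2
        for r in matmul(rows, w):
            # SwiGLU: gate half goes through SiLU, then scales the up half.
            out.append([silu(g) * u for g, u in zip(r[:n], r[n:])])
    return out
```

A fused kernel would apply the activation while the GEMM result is still on chip, which avoids writing the intermediate (T x 2N) matrix to memory.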

Enhancements ✨

Scaled Dot-Product Attention (SDPA)

  • New Features: Added dynamic shape support for fprop. This reduces how often graphs must be rebuilt across different batch sizes and sequence lengths.

  • Support Surface:

    • Now allows deterministic bprop for SDPA
    • Added support for bprop for ragged tensors on A100
  • More samples:

    • Open-sourced our SDPA test harness, which showcases additional testing for determinism and FP8 sizes for MLA.
    • Added samples to showcase chunked prefill.
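
As background for the chunked prefill samples, the sketch below shows the technique in plain Python, independent of cuDNN: the prompt is processed in chunks against a growing KV cache, and the result matches full causal attention. It is not taken from the released samples; every function name here is made up for this example.

```python
import math


def attend(q, keys, values):
    """Softmax attention of one query vector over the given keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    w = [math.exp(s - m) for s in scores]
    total = sum(w)
    dim = len(values[0])
    return [sum(wi * v[d] for wi, v in zip(w, values)) / total
            for d in range(dim)]


def causal_attention(Q, K, V):
    """Full causal prefill: token i attends to tokens 0..i."""
    return [attend(Q[i], K[:i + 1], V[:i + 1]) for i in range(len(Q))]


def chunked_prefill(Q, K, V, chunk_size):
    """Process the prompt chunk by chunk, growing a KV cache as we go."""
    k_cache, v_cache, out = [], [], []
    for start in range(0, len(Q), chunk_size):
        end = min(start + chunk_size, len(Q))
        k_cache.extend(K[start:end])
        v_cache.extend(V[start:end])
        for i in range(start, end):
            # Query i sees all earlier chunks plus its own chunk up to i.
            out.append(attend(Q[i], k_cache[:i + 1], v_cache[:i + 1]))
    return out
```

Each chunk only needs its own queries plus the cache built so far, which is what lets a long prompt be split across several attention calls.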

Mixture of Experts (MoE)

  • New API: Added support for moe_grouped_matmul. See the C++ sample and the documentation for the API reference.

Matmul

Convolution

Additional Improvements

Benchmarking 📊

  • Updated the benchmark results for the SDPA improvements added in cuDNN 9.18.1