cuDNN v8 API tracking issue #58414

@zasdfgbnm

Description

cuDNN v8 introduces a completely new API, and the old v7 API will be dropped at some point in the future. PyTorch needs to adopt the new API, which requires a complete rewrite of PyTorch's cuDNN convolution bindings.

The new cuDNN v8 implementation of convolution will be based on cudnn-frontend, NVIDIA's header-only C++ wrapper around the cuDNN backend API and the officially recommended way to use cuDNN v8.
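For context, here is a minimal sketch of what a convolution forward looks like through the cudnn-frontend builder API (tensor/conv descriptors → operation graph → heuristics → execution plan → execute). The function name, fixed NCHW/fp32 assumptions, and lack of error handling are simplifications for illustration; this is not the actual Conv_v8.cpp code.

```cpp
#include <array>
#include <cudnn.h>
#include <cudnn_frontend.h>

// Sketch: run a 2D convolution forward through the v8 API via cudnn-frontend.
// x/w/y are assumed to be contiguous NCHW fp32 tensors already on the device.
void conv2d_forward_v8(cudnnHandle_t handle,
                       void* x_ptr, const int64_t* x_dim, const int64_t* x_stride,
                       void* w_ptr, const int64_t* w_dim, const int64_t* w_stride,
                       void* y_ptr, const int64_t* y_dim, const int64_t* y_stride,
                       const int64_t* pad, const int64_t* stride, const int64_t* dilation,
                       void* workspace_ptr) {
  auto make_tensor = [](int64_t id, const int64_t* dim, const int64_t* str) {
    return cudnn_frontend::TensorBuilder()
        .setDim(4, dim)
        .setStrides(4, str)
        .setId(id)
        .setAlignment(16)              // alignment of the data pointer, in bytes
        .setDataType(CUDNN_DATA_FLOAT)
        .build();
  };

  // Convolution descriptor: 2 spatial dims, cross-correlation, fp32 compute.
  auto conv_desc = cudnn_frontend::ConvDescBuilder()
      .setDataType(CUDNN_DATA_FLOAT)
      .setMathMode(CUDNN_CROSS_CORRELATION)
      .setNDims(2)
      .setStrides(2, stride)
      .setPrePadding(2, pad)
      .setPostPadding(2, pad)
      .setDilation(2, dilation)
      .build();

  // Single conv-forward operation: y = conv(x, w).
  auto op = cudnn_frontend::OperationBuilder(
                CUDNN_BACKEND_OPERATION_CONVOLUTION_FORWARD_DESCRIPTOR)
      .setxDesc(make_tensor('x', x_dim, x_stride))
      .setwDesc(make_tensor('w', w_dim, w_stride))
      .setyDesc(make_tensor('y', y_dim, y_stride))
      .setcDesc(conv_desc)
      .setAlpha(1.0f)
      .setBeta(0.0f)
      .build();

  std::array<cudnn_frontend::Operation const*, 1> ops = {&op};
  auto op_graph = cudnn_frontend::OperationGraphBuilder()
      .setHandle(handle)
      .setOperationGraph(ops.size(), ops.data())
      .build();

  // Ask the heuristics for candidate engine configs and build a plan from the first one.
  auto heuristics = cudnn_frontend::EngineHeuristicsBuilder()
      .setOperationGraph(op_graph)
      .setHeurMode(CUDNN_HEUR_MODE_INSTANT)
      .build();
  auto& configs = heuristics.getEngineConfig(heuristics.getEngineConfigCount());
  auto plan = cudnn_frontend::ExecutionPlanBuilder()
      .setHandle(handle)
      .setEngineConfig(configs[0], op_graph.getTag())
      .build();

  // Bind device pointers to the tensor ids used above and execute the plan.
  void* data_ptrs[] = {x_ptr, w_ptr, y_ptr};
  int64_t uids[] = {'x', 'w', 'y'};
  auto variant_pack = cudnn_frontend::VariantPackBuilder()
      .setWorkspacePointer(workspace_ptr)
      .setDataPointers(3, data_ptrs)
      .setUids(3, uids)
      .build();
  cudnnBackendExecute(handle, plan.get_raw_desc(), variant_pack.get_raw_desc());
}
```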

To make the work easy to write and review, this adoption will be incremental. Below is a roadmap for how we plan to do it:

Stage 0: build a sketch

The purpose of this stage is to start the adoption process and add some basic infrastructure to PyTorch so that future work can be done incrementally and in parallel.

In #51390 and #50827, I have already refactored our convolution bindings into separate files: ConvPlaceholders.cpp, ConvShared.cpp, ConvShared.h, Conv_v7.cpp, and Conv_v8.cpp. Future work will mostly happen in Conv_v8.cpp.

In #51390, a new build flag called USE_EXPERIMENTAL_CUDNN_V8_API is added. When PyTorch is built with USE_EXPERIMENTAL_CUDNN_V8_API=1, convolution forward and transposed convolution backward will use a basic implementation of the cuDNN v8 convolution API. This basic implementation provides most features of convolution forward, except the cuDNN benchmark. It is not meant to be fast or ready to use. Some basic correctness checking with unit tests is done to make sure there are no obvious mistakes, but we won't run any benchmarks, and we won't run correctness checks on any real model either. Some non-obvious unit test failures are allowed.
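To make the "no cuDNN benchmark yet" point concrete: the basic implementation only queries the v8 heuristics and takes the first engine config that can be turned into a valid execution plan, roughly as in the sketch below (the function name is illustrative, not the actual Conv_v8.cpp code). Benchmark support would instead time the candidate plans and cache the fastest one.

```cpp
#include <stdexcept>
#include <string>
#include <cudnn_frontend.h>

// Sketch: heuristics-only engine selection. `configs` is the list returned by
// cudnn_frontend::EngineHeuristicsBuilder for a given operation graph. Instead
// of benchmarking, simply return the first config that cuDNN accepts for this
// problem; unsupported configs make ExecutionPlanBuilder::build() throw.
cudnn_frontend::ExecutionPlan pick_first_working_plan(
    cudnnHandle_t handle,
    cudnn_frontend::EngineConfigList& configs,
    const std::string& op_graph_tag) {
  for (auto& cfg : configs) {
    try {
      return cudnn_frontend::ExecutionPlanBuilder()
          .setHandle(handle)
          .setEngineConfig(cfg, op_graph_tag)
          .build();
    } catch (cudnn_frontend::cudnnException&) {
      // This engine config is not supported for this problem; try the next one.
    }
  }
  throw std::runtime_error("no usable engine config found for this convolution");
}
```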

At this stage, we don't recommend that anyone use USE_EXPERIMENTAL_CUDNN_V8_API=1.

Stage 1: feature complete

The purpose of this stage is to have a complete implementation. At the end of this stage, the v8 implementation should support all features that PyTorch supports, and its microbenchmark performance should be comparable to or faster than the v7 API. All unit tests should pass when PyTorch is built with USE_EXPERIMENTAL_CUDNN_V8_API=1. We will also add a new CircleCI pipeline to test PyTorch with USE_EXPERIMENTAL_CUDNN_V8_API=1.

Here is a list of features to implement based on #51390:

Here is a list of issues to resolve based on #51390:

  • The heuristic/benchmark cache of engine configs is not thread-safe (see the sketch after this list).
  • NHWC/NDHWC tests are failing.
  • Use uintptr_t to compute pointer alignment.
  • The code is still not well organized. For example, ATen/cudnn/Descriptors.h still contains lots of v7-only things.
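For the first and third items, here is a minimal sketch of the intended direction (the key type and names are placeholders, not the actual PyTorch code): guard the engine-config/plan cache with a mutex, and compute pointer alignment via uintptr_t so the pointer-to-integer cast is well defined.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Sketch of a thread-safe cache mapping a convolution "parameter key" to a
// cached value (e.g. a chosen engine config or execution plan). The real
// PyTorch cache is keyed by a hashed parameter struct; std::string is used
// here only to keep the sketch self-contained.
template <typename Value>
class ThreadSafeCache {
 public:
  bool find(const std::string& key, Value& out) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = map_.find(key);
    if (it == map_.end()) return false;
    out = it->second;
    return true;
  }
  void update(const std::string& key, const Value& value) {
    std::lock_guard<std::mutex> guard(mutex_);
    map_[key] = value;  // insert or overwrite under the lock
  }
 private:
  std::mutex mutex_;
  std::unordered_map<std::string, Value> map_;
};

// Alignment of a data pointer in bytes, computed via uintptr_t. This is the
// value a v8 tensor descriptor would receive through setAlignment(); the cap
// at 16 bytes is an assumption made for this sketch.
inline uint64_t get_alignment(const void* ptr) {
  uint64_t alignment = 1;
  uintptr_t address = reinterpret_cast<uintptr_t>(ptr);
  while (address % (alignment * 2) == 0 && alignment * 2 <= 16) {
    alignment *= 2;
  }
  return alignment;
}
```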

At the end of this stage, interested early users might want to try USE_EXPERIMENTAL_CUDNN_V8_API=1, but v8 API support is still not considered ready to use because validation is still limited to unit tests. Testing on real applications is required before we consider enabling USE_EXPERIMENTAL_CUDNN_V8_API by default.

Stage 2: thorough testing

We will enable USE_EXPERIMENTAL_CUDNN_V8_API=1 internally to run PyTorch with the cuDNN v8 API on real models, to make sure there are no performance or functional issues. We will also seek further performance improvements.

Stage 3: enable by default [WE ARE HERE]

At this stage, the USE_EXPERIMENTAL_CUDNN_V8_API flag will be removed from PyTorch, and PyTorch will use the cuDNN v8 API whenever it is available.
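A rough sketch of what "whenever available" could look like once the flag is gone: the choice between the v7 and v8 convolution paths becomes a compile-time check against the cuDNN headers PyTorch is built with. The exact version threshold and dispatch mechanism are still to be decided; this is only an illustration.

```cpp
#include <cudnn.h>

// For cuDNN 8.x, CUDNN_VERSION is CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 +
// CUDNN_PATCHLEVEL, e.g. 8200 for cuDNN 8.2.0.
#if defined(CUDNN_VERSION) && CUDNN_VERSION >= 8000
// Built against cuDNN v8 headers: compile the cudnn-frontend (Conv_v8.cpp) path.
constexpr bool kUseCudnnV8Api = true;
#else
// Older cuDNN: keep the legacy v7 (Conv_v7.cpp) path.
constexpr bool kUseCudnnV8Api = false;
#endif
```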

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @csarofeen @ptrblck @xwang233 @ngimel
