Closed
Labels: comp:gpu (GPU related issues), stat:awaiting tensorflower (Status - Awaiting response from tensorflower), type:feature (Feature requests)
Description
Please make sure that this is a feature request. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:feature_template
System information
- TensorFlow version (you are using): TensorFlow using cuDNN 7.2 and above
- Are you willing to contribute it (Yes/No): Yes
Describe the feature and the current behavior/state.
Now, if users want, cuDNN can convert fp32 inputs to fp16 internally, allowing tensor cores to be used even when the inputs are fp32. Previously, users had to explicitly convert fp32 to fp16 themselves using AMP. Link. Enabling this is as simple as setting the CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION enum value and passing it to the cudnnSetConvolutionMathType() function:

```
cudnnSetConvolutionMathType(cudnnConvDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION)
```

and this is the counterpart for RNNs:

```
cudnnSetRNNMatrixMathType(cudnnRnnDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION)
```

We should let users choose whether they want this option, since it involves converting fp32 to fp16 (and thus losing precision).
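For context, here is a minimal sketch of where this call would sit in a cuDNN convolution setup. This is illustrative only (it assumes cuDNN 7.2+, a CUDA-capable GPU, and omits tensor/filter descriptor setup and the actual convolution launch); the descriptor names are placeholders, not TensorFlow internals:

```cpp
#include <cudnn.h>
#include <cstdio>

int main() {
  cudnnHandle_t handle;
  cudnnCreate(&handle);

  // Create a convolution descriptor as usual.
  cudnnConvolutionDescriptor_t convDesc;
  cudnnCreateConvolutionDescriptor(&convDesc);

  // Opt in to implicit fp32 -> fp16 down-conversion so tensor cores
  // can be used even though the inputs are fp32. Without this, fp32
  // convolutions fall back to CUDNN_DEFAULT_MATH (no tensor cores).
  cudnnStatus_t status =
      cudnnSetConvolutionMathType(convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION);
  if (status != CUDNN_STATUS_SUCCESS) {
    // Older cuDNN versions (< 7.2) do not know this enum value.
    std::printf("allow-conversion math type not supported: %s\n",
                cudnnGetErrorString(status));
  }

  // ... set up tensor/filter descriptors and run the convolution ...

  cudnnDestroyConvolutionDescriptor(convDesc);
  cudnnDestroy(handle);
  return 0;
}
```

Since the conversion changes numerics, TensorFlow would presumably gate this behind an opt-in flag or option rather than enabling it by default.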