Closed
Labels: comp:gpu (GPU related issues), stat:awaiting tensorflower (Status - Awaiting response from tensorflower), type:feature (Feature requests)
Description
Please make sure that this is a feature request. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:feature_template
System information
- TensorFlow version (you are using): TensorFlow using cuDNN 7.2 and above
- Are you willing to contribute it (Yes/No): Yes
Describe the feature and the current behavior/state.
Now, if users want, cuDNN can convert fp32 inputs to fp16 internally, allowing tensor cores to be used even when the inputs are fp32. Previously, users had to explicitly convert fp32 to fp16 themselves using AMP. Link. Enabling this is as simple as setting the CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION enum value and passing it to the cudnnSetConvolutionMathType() function:

```
cudnnSetConvolutionMathType(cudnnConvDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION)
```

and this is the counterpart for RNNs:

```
cudnnSetRNNMatrixMathType(cudnnRnnDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION)
```

We should let users choose whether they want this option, since it involves converting fp32 to fp16 (and thus losing precision).
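For context, here is a minimal sketch of where this call would sit in a cuDNN convolution setup. This is illustrative only (it assumes cuDNN 7.2+, a CUDA-capable GPU, and omits tensor/filter descriptor setup and the actual convolution launch); the descriptor names are placeholders, not TensorFlow internals:

```cpp
#include <cudnn.h>
#include <cstdio>

int main() {
  cudnnHandle_t handle;
  cudnnCreate(&handle);

  // Create a convolution descriptor as usual.
  cudnnConvolutionDescriptor_t convDesc;
  cudnnCreateConvolutionDescriptor(&convDesc);

  // Opt in to implicit fp32 -> fp16 down-conversion so tensor cores
  // can be used even though the inputs are fp32. Without this, fp32
  // convolutions fall back to CUDNN_DEFAULT_MATH (no tensor cores).
  cudnnStatus_t status =
      cudnnSetConvolutionMathType(convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION);
  if (status != CUDNN_STATUS_SUCCESS) {
    // Older cuDNN versions (< 7.2) do not know this enum value.
    std::printf("allow-conversion math type not supported: %s\n",
                cudnnGetErrorString(status));
  }

  // ... set up tensor/filter descriptors and run the convolution ...

  cudnnDestroyConvolutionDescriptor(convDesc);
  cudnnDestroy(handle);
  return 0;
}
```

Since the conversion changes numerics, TensorFlow would presumably gate this behind an opt-in flag or option rather than enabling it by default.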