BFloat16 CUDA GEMM ops unsupported on Nvidia P100 (SM_60) on CUDA 11.3 #57773
Labels
module: bfloat16
module: cublas — Problem related to cuBLAS support
module: cuda — Related to torch.cuda, and CUDA support in general
triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🐛 Bug in CUDA?
As per #50442, BFloat16 CUDA GEMM ops are supposed to be supported on Nvidia SM_53 GPUs & above.
However, with an Nvidia P100 (SM_60 GPU) & CUDA 11.3, such ops produce an error stating that they're unsupported.
Please confirm which Nvidia Compute-capability GPUs support them. Thanks!
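For reference, the failure can be checked outside the test suite with a single BFloat16 GEMM call. This is only a sketch, assuming a CUDA device is present; the check_bf16_addmm helper is hypothetical and not part of the PyTorch test suite:

```python
import torch

def check_bf16_addmm():
    """Hypothetical helper: try one BFloat16 addmm on the GPU and report the result.

    Returns "no-cuda" when no GPU is present, "ok" when the GEMM succeeds,
    or "unsupported" when the op raises a RuntimeError (as observed on
    P100/SM_60 with CUDA 11.3).
    """
    if not torch.cuda.is_available():
        return "no-cuda"
    dev = torch.device("cuda")
    a = torch.randn(4, 8, device=dev, dtype=torch.bfloat16)
    b = torch.randn(8, 4, device=dev, dtype=torch.bfloat16)
    bias = torch.zeros(4, 4, device=dev, dtype=torch.bfloat16)
    try:
        out = torch.addmm(bias, a, b)  # bfloat16 GEMM dispatched to cuBLAS
        return "ok" if out.shape == (4, 4) else "unexpected-shape"
    except RuntimeError:
        return "unsupported"

print(check_bf16_addmm())
```

On an SM_75 GPU (e.g. the Tesla T4 used on CI) this should print "ok"; on the affected P100/CUDA 11.3 setup it is expected to print "unsupported".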
To Reproduce
Steps to reproduce the behavior:
Run python test_ops.py in pytorch/test. The test test_supported_backward_addbmm_cuda_bfloat16 fails with the error described above. The corresponding tests for addmm, baddbmm & bmm also fail with the same error. For addmm, this test had even passed on SM_52 with CUDA 11.1. The other three tests are only enabled in common_methods_invocations.py if the SM version is above 52.
Expected behavior
These BFloat16 CUDA GEMM tests should pass on SM_60 GPUs (GPUs above SM_52), as they do with SM_75 (Tesla T4) on CI. FWIW, addmm's corresponding test passes on SM_52 with CUDA 11.1, while baddbmm's fails on SM_52. However, both are failing on the Nvidia P100 with CUDA 11.3.
Environment
PyTorch version: 1.9.0a0+gitebd2c0a
Is debug build: False
CUDA used to build PyTorch: 11.3
OS: Ubuntu 18.04.1 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA Tesla P100-PCIE-12GB
Nvidia driver version: 465.19.01
cuDNN version: Could not collect
Additional context
cc @csarofeen @ptrblck @xwang233 @ngimel @zasdfgbnm
cc @mruberry @anjali411 — some OpInfos check for Nvidia SM_53 or above, so they might have to be modified based on the updated info.
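As a hedged sketch of how such an OpInfo gate might query the device's SM version (the actual condition in common_methods_invocations.py may differ; the bf16_gemm_gate helper and its min_sm parameter are assumptions for illustration):

```python
import torch

def bf16_gemm_gate(min_sm=53):
    """Hypothetical sketch: decide whether BFloat16 GEMM tests should be
    enabled, based on the device's compute capability (per the SM_53
    threshold referenced in #50442). Returns None when no CUDA device
    is available.
    """
    if not torch.cuda.is_available():
        return None
    # get_device_capability returns (major, minor), e.g. (6, 0) for a P100
    major, minor = torch.cuda.get_device_capability(0)
    return major * 10 + minor >= min_sm

print(bf16_gemm_gate())
```

If the supported-SM floor turns out to differ from SM_53 on some CUDA versions, only the min_sm threshold in such a gate would need updating.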