🐛 Bug
After upgrading PyTorch to 1.7, the cudatoolkit package was automatically updated to 11.0, and the same code now runs much slower than before. When I downgrade cudatoolkit back to 10.2, the speed returns to normal. Should I also update the cuDNN version installed on Ubuntu?
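As a sanity check (not part of the original report, but using only standard torch APIs), the toolkit and cuDNN versions a given environment actually uses can be printed from Python:

```python
import torch

print(torch.__version__)               # e.g. 1.7.0
print(torch.version.cuda)              # '10.2' or '11.0', depending on the cudatoolkit package
print(torch.backends.cudnn.version())  # e.g. 8003 for the cuDNN 8.0.3 bundled with the cu110 build
print(torch.cuda.get_device_name(0))   # e.g. 'TITAN V'
```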
To Reproduce
I run the same code on the same device in the same environment, changing only the cudatoolkit version; with 11.0 it is much slower.
cudatoolkit 10.2
Speed: 13.4/1.3/14.6 ms inference/NMS/total per 640x640 image at batch-size 1
cudatoolkit 11.0
Speed: 27.0/1.2/28.2 ms inference/NMS/total per 640x640 image at batch-size 1
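The numbers above come from the model's own profiling output; a minimal timing sketch along these lines (the model here is a placeholder, not the original network) reproduces the measurement pattern, with `torch.cuda.synchronize()` calls so the asynchronous GPU work is actually counted:

```python
import time
import torch
import torchvision

# Placeholder model and input, matching the 640x640, batch-size-1 setup above.
model = torchvision.models.resnet50().cuda().eval()
img = torch.randn(1, 3, 640, 640, device="cuda")

with torch.no_grad():
    for _ in range(10):          # warm-up so cuDNN algorithm selection is excluded
        model(img)
    torch.cuda.synchronize()     # wait for all queued GPU work before starting the clock
    t0 = time.time()
    for _ in range(100):
        model(img)
    torch.cuda.synchronize()     # wait again so the timed span covers the GPU work
    print(f"{(time.time() - t0) / 100 * 1000:.1f} ms inference per 640x640 image at batch-size 1")
```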
Expected behavior
CUDA 11.0 should be no slower than 10.2.
Environment
Please copy and paste the output from our environment collection script (or fill out the checklist below manually). You can get the script and run it with:
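```
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
```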
Collecting environment information...
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 11.0
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2
Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 9.1.85
GPU models and configuration:
GPU 0: TITAN V
GPU 1: TITAN V
GPU 2: TITAN V
GPU 3: TITAN V
GPU 4: TITAN V
GPU 5: TITAN V
Nvidia driver version: 450.80.02
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.7.0
[pip3] torchvision==0.8.1
[conda] blas 1.0 mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] cudatoolkit 11.0.221 h6bb024c_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] libblas 3.8.0 20_mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] libcblas 3.8.0 20_mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] liblapack 3.8.0 20_mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] liblapacke 3.8.0 20_mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] mkl 2020.2 256 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl-service 2.3.0 py38he904b0f_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl_fft 1.2.0 py38h23d657b_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl_random 1.1.1 py38h0573a6f_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] numpy 1.19.1 py38hbc911f0_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] numpy-base 1.19.1 py38hfa32c7d_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] pytorch 1.7.0 py3.8_cuda11.0.221_cudnn8.0.3_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch
[conda] torchvision 0.8.1 py38_cu110 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch
Additional context