The speed of pytorch with cudatoolkit 11.0 is slower than cudatoolkit 10.2 #47908

@klyjm

🐛 Bug

When I updated PyTorch to 1.7, cudatoolkit was automatically updated to 11.0, and the same code now runs much slower than before. After I changed cudatoolkit back to 10.2, the speed returned to normal. Maybe I should update the cuDNN version in Ubuntu?

To Reproduce

I ran the same code on the same device in the same environment, changing only the cudatoolkit version. With cudatoolkit 11.0 it is much slower.

cudatoolkit 10.2
Speed:  13.4/1.3/14.6 ms inference/NMS/total per 640x640 image at batch-size 1
cudatoolkit 11.0
Speed: 27.0/1.2/28.2 ms inference/NMS/total per 640x640 image at batch-size 1
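
For anyone trying to reproduce figures like these, GPU timings are sensitive to warm-up and to asynchronous kernel launches. The following is a minimal sketch of a timing helper, not the code used for the numbers above. `benchmark_ms` is a hypothetical helper name, and the single convolution layer is only an assumed stand-in for the real model, which is not shown in this report.

```python
import time


def benchmark_ms(fn, sync=None, warmup=10, iters=50):
    """Return the mean wall-clock time of fn() in milliseconds.

    sync is an optional callable (for example torch.cuda.synchronize) that
    blocks until queued GPU work has finished, so the timer does not stop
    before the kernels have actually run.
    """
    for _ in range(warmup):  # warm-up runs are excluded from the timing
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) * 1000.0 / iters


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # Assumed stand-in workload: one conv layer on a 640x640 image, batch size 1.
        model = torch.nn.Conv2d(3, 16, kernel_size=3).cuda().eval()
        x = torch.randn(1, 3, 640, 640, device="cuda")
        with torch.no_grad():
            ms = benchmark_ms(lambda: model(x), sync=torch.cuda.synchronize)
        print(f"{ms:.2f} ms per forward pass")
```

Running the same script in both conda environments, with only cudatoolkit changed, should give directly comparable numbers.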

Expected behavior

The speed with cudatoolkit 11.0 should be no slower than with 10.2.
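
To put a number on the regression, the reported timings can be turned into slowdown ratios. This is plain arithmetic on the figures from this report; the dictionary names are only illustrative.

```python
# Timings in ms per 640x640 image at batch size 1, copied from the report.
cu102 = {"inference": 13.4, "nms": 1.3, "total": 14.6}
cu110 = {"inference": 27.0, "nms": 1.2, "total": 28.2}

# Ratio > 1 means cudatoolkit 11.0 is slower than 10.2 for that stage.
slowdown = {stage: cu110[stage] / cu102[stage] for stage in cu102}

for stage, ratio in slowdown.items():
    print(f"{stage}: {ratio:.2f}x")
# inference: 2.01x
# nms: 0.92x
# total: 1.93x
```

The slowdown is confined to the inference stage, which takes about twice as long, while NMS is essentially unchanged.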

Environment

Collecting environment information...
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 11.0
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 9.1.85
GPU models and configuration:
GPU 0: TITAN V
GPU 1: TITAN V
GPU 2: TITAN V
GPU 3: TITAN V
GPU 4: TITAN V
GPU 5: TITAN V

Nvidia driver version: 450.80.02
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.7.0
[pip3] torchvision==0.8.1
[conda] blas                      1.0                         mkl    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] cudatoolkit               11.0.221             h6bb024c_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] libblas                   3.8.0                    20_mkl    https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] libcblas                  3.8.0                    20_mkl    https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] liblapack                 3.8.0                    20_mkl    https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] liblapacke                3.8.0                    20_mkl    https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] mkl                       2020.2                      256    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl-service               2.3.0            py38he904b0f_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl_fft                   1.2.0            py38h23d657b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl_random                1.1.1            py38h0573a6f_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] numpy                     1.19.1           py38hbc911f0_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] numpy-base                1.19.1           py38hfa32c7d_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] pytorch                   1.7.0           py3.8_cuda11.0.221_cudnn8.0.3_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch
[conda] torchvision               0.8.1                py38_cu110    https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch
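
The output above mentions two different cuDNN versions: the system library `libcudnn.so.7.6.5` and the `cudnn8.0.3` encoded in the conda `pytorch` build string. The sketch below pulls both out of the strings in this report so they can be compared side by side. The function names are hypothetical, and the regular expressions only assume the naming pattern visible here. Conda binaries generally ship their own CUDA runtime and cuDNN, so the system copy is probably not the one PyTorch loads. In a live session, `torch.version.cuda` and `torch.backends.cudnn.version()` report what is actually in use.

```python
import re


def bundled_versions(build_string):
    """Extract (cuda, cudnn) versions from a conda pytorch build string."""
    m = re.search(r"cuda(\d+(?:\.\d+)*)_cudnn(\d+(?:\.\d+)*)", build_string)
    return (m.group(1), m.group(2)) if m else None


def system_cudnn_version(lib_path):
    """Extract the version suffix from a libcudnn shared-library path."""
    m = re.search(r"libcudnn\.so\.(\d+(?:\.\d+)*)", lib_path)
    return m.group(1) if m else None


# Both strings are copied from the environment output in this report.
cuda, cudnn = bundled_versions("py3.8_cuda11.0.221_cudnn8.0.3_0")
system_cudnn = system_cudnn_version("/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5")

print(f"bundled CUDA {cuda}, bundled cuDNN {cudnn}, system cuDNN {system_cudnn}")
```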

Additional context

cc @ngimel @VitalyFedyunin

Labels

module: cuda (Related to torch.cuda, and CUDA support in general)
module: performance (Issues related to performance, either of kernel code or framework glue)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
