Performance difference between the conda and pip version in io.read_image #6782
Comments
@Leon5x can you provide the exact numbers you get when running the repro code? Thanks!
Do you mean the exact script output?
For torchvision installed from pip:
Looking at your repro code, the times you labelled "Torchvision PIL" and "Torchvision accimage" are both the time taken by vision/torchvision/io/image.py, lines 235 to 254 in 0610b13
So it is expected to see the same numbers for "Torchvision PIL" and "Torchvision accimage". As for the difference between the conda and pip binaries, this is something we need to explore in detail on our side. Thanks for reporting!
Alright, thanks for the explanation. Then I guess I misunderstood the backend. What is the accimage/PIL backend used for, then? I also uploaded my image files here if needed: https://easyupload.io/8163b6 Good luck with the further search.
I would guess that torchvision and PIL are linked against libjpeg-turbo when installed with
As for accimage, I don't think it's supported anymore (our README probably needs an update).
@Leon5x this selects a way to load images for datasets: vision/torchvision/datasets/folder.py Lines 262 to 268 in 0610b13
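To illustrate the point about the backend only affecting dataset loading, here is a minimal sketch of that kind of dispatch. It is not the torchvision source: the loaders are hypothetical stand-ins that only tag the path, and the module-level variable is an assumed simplification of a process-wide backend setting.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for real decoders: they only tag the path so the
# dispatch can be observed without PIL or accimage installed.
def pil_loader(path: str) -> str:
    return f"PIL:{path}"


def accimage_loader(path: str) -> str:
    return f"accimage:{path}"


_LOADERS: Dict[str, Callable[[str], str]] = {
    "PIL": pil_loader,
    "accimage": accimage_loader,
}
_image_backend = "PIL"  # process-wide setting (assumed simplification)


def set_image_backend(backend: str) -> None:
    """Select which loader dataset-style code will use."""
    global _image_backend
    if backend not in _LOADERS:
        raise ValueError(f"unknown backend: {backend!r}")
    _image_backend = backend


def get_image_backend() -> str:
    return _image_backend


def default_loader(path: str) -> str:
    # Dataset-style loading consults the global backend on every call.
    return _LOADERS[get_image_backend()](path)


def read_image(path: str) -> str:
    # Stand-in for a decoder that never looks at the backend setting,
    # so switching the backend cannot change its result or its timing.
    return f"native:{path}"
```

In this sketch, calling `set_image_backend("accimage")` changes what `default_loader` returns but leaves `read_image` untouched, which is why two benchmark runs that differ only in the backend setting would report the same time.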
In one conda environment the .so is linked against a library in a different environment, which is super weird. On this PC I have tried installing pil-simd from source, so that seems to have broken something. In the pip environment it is linked against the libjpeg in the Pillow.libs folder. No libjpeg-turbo in sight. But I guess it doesn't matter.
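For anyone who wants to repeat this check, a small sketch of one way to list the JPEG libraries a shared object resolves to on Linux. The helper names are hypothetical, and the path you pass in (the compiled image extension inside the installed torchvision package) is something you would need to locate yourself.

```python
import subprocess
from typing import Dict


def jpeg_libs_from_ldd(ldd_output: str) -> Dict[str, str]:
    """Map each linked library whose name contains 'jpeg' to its resolved path."""
    libs: Dict[str, str] = {}
    for line in ldd_output.splitlines():
        name, sep, rest = line.strip().partition(" => ")
        if not sep or "jpeg" not in name.lower():
            continue
        # Drop the trailing load address, e.g. " (0x00007f...)"; an
        # unresolved entry keeps the literal text "not found".
        libs[name] = rest.rsplit(" (", 1)[0].strip()
    return libs


def run_ldd(shared_object: str) -> str:
    """Run `ldd` on a shared object and return its raw output (Linux only)."""
    result = subprocess.run(
        ["ldd", shared_object], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Running `jpeg_libs_from_ldd(run_ldd(path))` in both the conda and the pip environment should show whether the extension picks up a libjpeg from the environment itself, from a bundled `.libs` folder, or from somewhere unexpected.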
Okay. Then I just find it weird that the function is not in the datasets module if it only sets the backend for the datasets. It is also misleading that the documentation shows these functions on the first torchvision doc page without mentioning that they only set the backend for the datasets. It might help to change the function description to include that information.
IMO global settings like the backend are most often confusing... (especially in a multithreaded context)
🐛 Describe the bug
There is a big performance difference between the conda and pip versions of torchvision when reading jpg images with the function torchvision.io.read_image.
When benchmarking reading 1000 images from a folder, the pip version is more than 2x faster than the version installed from conda!
For the test I created 2 new conda environments using:
conda create --name tvpip python=3.10
In one environment I installed torchvision using conda:
conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
and in the other using pip:
pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113
Then I used the following code to benchmark torchvision.io.read_image, Pillow and accimage:
Findings:
I hope you can reproduce the behavior or give some insights why this might be the case. Thanks already.
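A minimal sketch of a timing harness for this kind of comparison, offered as an illustration and not as the script used for the numbers in this report. The function name `time_reader` and the `repeats` parameter are hypothetical; the reader is passed in as a callable so the same harness can be run unchanged in each environment.

```python
import time
from typing import Any, Callable, Iterable, List


def time_reader(
    read_fn: Callable[[str], Any], paths: Iterable[str], repeats: int = 3
) -> float:
    """Return the best wall-clock time in seconds for reading every path once."""
    path_list: List[str] = list(paths)
    timings: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        for path in path_list:
            read_fn(path)
        timings.append(time.perf_counter() - start)
    # Taking the minimum reduces noise from other processes and cold caches.
    return min(timings)
```

In each environment one would pass `torchvision.io.read_image` together with the same list of JPEG paths and compare the returned times. Because the first pass also warms the file system cache, the minimum over several repeats mostly reflects decoding cost and not disk reads.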
Versions
Environment pip
Collecting environment information...
PyTorch version: 1.12.1+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.10.6 (main, Oct 7 2022, 20:19:58) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-50-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1080 Ti
Nvidia driver version: 515.76
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.23.4
[pip3] torch==1.12.1+cu113
[pip3] torchvision==0.13.1+cu113
[conda] numpy 1.23.4 pypi_0 pypi
[conda] torch 1.12.1+cu113 pypi_0 pypi
[conda] torchvision 0.13.1+cu113 pypi_0 pypi
Environment conda
Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.10.6 (main, Oct 7 2022, 20:19:58) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-50-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1080 Ti
Nvidia driver version: 515.76
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.23.1
[pip3] torch==1.12.1
[pip3] torchvision==0.13.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py310h7f8727e_0
[conda] mkl_fft 1.3.1 py310hd6ae3a3_0
[conda] mkl_random 1.2.2 py310h00e6091_0
[conda] numpy 1.23.1 py310h1794996_0
[conda] numpy-base 1.23.1 py310hcba007f_0
[conda] pytorch 1.12.1 py3.10_cuda11.3_cudnn8.3.2_0 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchvision 0.13.1 py310_cu113 pytorch