

Performance difference between the conda and pip version in io.read_image #6782

Open
Leon5x opened this issue Oct 17, 2022 · 9 comments

@Leon5x

Leon5x commented Oct 17, 2022

🐛 Describe the bug

There is a big performance difference when reading JPEG images with torchvision.io.read_image, depending on whether torchvision was installed from conda or from pip.
When benchmarking reading 1000 images from a folder, the pip version is more than 2x faster than the version installed from conda!
For the test I created 2 new conda environments using
conda create --name tvpip python=3.10
In one environment I installed torchvision using conda:
conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
and in the other using pip:
pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113

Then I used the following code to benchmark torchvision.io.read_image, Pillow and accimage:

import os, torchvision
from time import time as t

f = "test"
files = os.listdir(f)
test_images = len(files)

def test(files, fct):
    s = t()
    for file in files:
        image = fct(os.path.join(f,file))
    return t()-s

torchvision.set_image_backend("PIL")
time_needed = test(files, torchvision.io.read_image)
print(f"Torchvision {torchvision.get_image_backend():13s} Loading {test_images} files took {time_needed:.1f}s")

torchvision.set_image_backend("accimage")
time_needed = test(files, torchvision.io.read_image)
print(f"Torchvision {torchvision.get_image_backend():13s} Loading {test_images} files took {time_needed:.1f}s")

from PIL import Image
s = t()
for file in files:
    image = Image.open(os.path.join(f,file)).convert("RGB")
time_needed = t() - s
print(f"{'Pillow':25s} Loading {test_images} files took {time_needed:.1f}s")

import accimage
time_needed = test(files, accimage.Image)
print(f"{'AccImage':25s} Loading {test_images} files took {time_needed:.1f}s")
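As a side note, time.time() is a wall-clock that can jump with system clock adjustments; a monotonic clock is safer for benchmarks. A minimal sketch of the same timing loop using time.perf_counter (the bench name is made up for this sketch, not from the script above):

```python
from time import perf_counter

def bench(paths, load):
    """Time how long `load` takes over all `paths`, using a monotonic clock."""
    start = perf_counter()
    for p in paths:
        load(p)
    return perf_counter() - start

# usage with any loader, e.g.:
#   elapsed = bench(files, torchvision.io.read_image)
```
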

Findings:

  • In the conda environment torchvision.io.read_image takes 4.6 s; in the pip environment it takes 1.9 s. They should be the same. I couldn't figure out where the speed difference comes from; from the timings it looks like pip is somehow using pillow-simd or libjpeg-turbo.
  • When using the accimage backend with torchvision (torchvision.set_image_backend), the time to load the images doesn't change at all, which suggests the same backend is used. That behavior is the same in the pip and conda environments.
  • Installing pillow-simd and accimage in the environment before installing torchvision doesn't change anything apart from the Pillow time.
  • When installing accimage in the conda environment, the time for torchvision.io.read_image with the accimage backend doesn't change, which, in my understanding, it should.

I hope you can reproduce the behavior or offer some insight into why this might be the case. Thanks in advance.

Versions

Environment pip

Collecting environment information...
PyTorch version: 1.12.1+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.10.6 (main, Oct 7 2022, 20:19:58) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-50-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1080 Ti
Nvidia driver version: 515.76
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.23.4
[pip3] torch==1.12.1+cu113
[pip3] torchvision==0.13.1+cu113
[conda] numpy 1.23.4 pypi_0 pypi
[conda] torch 1.12.1+cu113 pypi_0 pypi
[conda] torchvision 0.13.1+cu113 pypi_0 pypi

Environment conda

Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.10.6 (main, Oct 7 2022, 20:19:58) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-50-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1080 Ti
Nvidia driver version: 515.76
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.23.1
[pip3] torch==1.12.1
[pip3] torchvision==0.13.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py310h7f8727e_0
[conda] mkl_fft 1.3.1 py310hd6ae3a3_0
[conda] mkl_random 1.2.2 py310h00e6091_0
[conda] numpy 1.23.1 py310h1794996_0
[conda] numpy-base 1.23.1 py310hcba007f_0
[conda] pytorch 1.12.1 py3.10_cuda11.3_cudnn8.3.2_0 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchvision 0.13.1 py310_cu113 pytorch

@vfdev-5
Collaborator

vfdev-5 commented Oct 18, 2022

@Leon5x can you provide your exact numbers when running the repro code? Thanks!

@Leon5x
Author

Leon5x commented Oct 18, 2022

Do you mean the exact script output?
For torchvision installed from conda:

Torchvision PIL           Loading 1000 files took 4.6s, making it 0.00461 s/file
Torchvision accimage      Loading 1000 files took 4.6s, making it 0.00462 s/file
Pillow                    Loading 1000 files took 4.9s, making it 0.00493 s/file
AccImage                  Loading 1000 files took 1.7s, making it 0.00173 s/file

For torchvision installed from pip:

Torchvision PIL           Loading 1000 files took 1.9s, making it 0.00191 s/file
Torchvision accimage      Loading 1000 files took 1.9s, making it 0.00188 s/file
Pillow                    Loading 1000 files took 2.0s, making it 0.00199 s/file
AccImage                  Loading 1000 files took 1.7s, making it 0.00175 s/file

@vfdev-5
Collaborator

vfdev-5 commented Oct 18, 2022

Looking at your repro code: you labelled as "Torchvision PIL" and "Torchvision accimage" the time taken by torchvision.io.read_image, which uses neither PIL nor accimage:

def read_image(path: str, mode: ImageReadMode = ImageReadMode.UNCHANGED) -> torch.Tensor:
    """
    Reads a JPEG or PNG image into a 3 dimensional RGB or grayscale Tensor.
    Optionally converts the image to the desired format.
    The values of the output tensor are uint8 in [0, 255].

    Args:
        path (str): path of the JPEG or PNG image.
        mode (ImageReadMode): the read mode used for optionally converting the image.
            Default: ``ImageReadMode.UNCHANGED``.
            See ``ImageReadMode`` class for more information on various
            available modes.

    Returns:
        output (Tensor[image_channels, image_height, image_width])
    """
    if not torch.jit.is_scripting() and not torch.jit.is_tracing():
        _log_api_usage_once(read_image)
    data = read_file(path)
    return decode_image(data, mode)

torchvision.io.read_image loads and decodes data using torch io ops: https://github.com/pytorch/vision/tree/main/torchvision/csrc/io/image/cpu

So it is expected to see the same numbers for "Torchvision PIL" and "Torchvision accimage".

As for the difference between the conda and pip binaries, this is something to explore in detail on our side. Thanks for reporting!

@Leon5x
Author

Leon5x commented Oct 18, 2022

Alright, thanks for the explanation. Then I guess I misunderstood the backend. What is the accimage/PIL backend used for, then?
My misunderstanding comes from the torchvision documentation:
torchvision.set_image_backend(backend) - Specifies the package used to load images.
So probably that should also be changed?

I also uploaded my image files here if needed: https://easyupload.io/8163b6

Good luck in the further search.

@NicolasHug
Member

I would guess that torchvision and PIL are linked against libjpeg-turbo when installed with pip. You can check by using ldd on their respective .so files e.g.

ldd [...]/site-packages/Pillow-9.0.1-py3.9-linux-x86_64.egg/PIL/_imaging.cpython-39-x86_64-linux-gnu.so

As for Accimage, I don't think it's supported anymore (our README probably needs an update)
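Before running ldd you need the actual paths of the compiled extensions, which vary by install. A small hedged helper for locating the .so files inside an installed package (the shared_objects name is made up for this sketch):

```python
import glob
import importlib
import os

def shared_objects(module_name):
    """Return paths of compiled extension (.so) files shipped inside a package.

    Helper for finding the files to inspect with `ldd`.
    """
    mod = importlib.import_module(module_name)
    pkg_dir = os.path.dirname(mod.__file__)
    # search the package directory recursively for shared objects
    return sorted(glob.glob(os.path.join(pkg_dir, "**", "*.so*"), recursive=True))

# e.g. run `ldd` on each entry of shared_objects("PIL") and
# shared_objects("torchvision"), then grep the output for "jpeg"
# to see which libjpeg variant is actually linked.
```
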

@vfdev-5
Collaborator

vfdev-5 commented Oct 18, 2022

torchvision.set_image_backend(backend) - Specifies the package used to load images.

@Leon5x this selects a way to load images for datasets:

def default_loader(path: str) -> Any:
    from torchvision import get_image_backend

    if get_image_backend() == "accimage":
        return accimage_loader(path)
    else:
        return pil_loader(path)
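In other words, set_image_backend only switches the loader used by the datasets; torchvision.io.read_image never consults it. A standalone toy sketch of that dispatch (simplified, with stand-in loaders; not the real torchvision code):

```python
_backend = "PIL"

def set_image_backend(name):
    # mirrors torchvision's check: only "PIL" and "accimage" are valid
    if name not in ("PIL", "accimage"):
        raise ValueError(f"Invalid backend {name!r}")
    global _backend
    _backend = name

def get_image_backend():
    return _backend

def default_loader(path):
    # the dispatch happens here, in the *dataset* loading path
    if get_image_backend() == "accimage":
        return ("accimage", path)   # stand-in for accimage_loader(path)
    return ("PIL", path)            # stand-in for pil_loader(path)
```
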

@Leon5x
Author

Leon5x commented Oct 19, 2022

I would guess that torchvision and PIL are linked against libjpeg-turbo when installed with pip. You can check by using ldd on their respective .so files e.g.

In one conda environment the .so is linked against a library in a different environment, which is super weird. On this PC I had tried installing pillow-simd from source, so that seems to have broken something. In the pip environment it is linked against a libjpeg in the Pillow.libs folder; no libjpeg-turbo in sight. But I guess it doesn't matter.

@Leon5x
Author

Leon5x commented Oct 19, 2022

@Leon5x this selects a way to load images for datasets:

Okay. Then I find it odd that the function isn't in the datasets module if it only sets the backend for the datasets. It's also misleading that the documentation lists these functions on the main torchvision doc page without mentioning that they only affect dataset loading. It might help to update the function description with that information.
Anyways thanks for the explanation!

@vadimkantorov

IMO global settings like a backend are most often confusing... (especially in a multithreading context)
