Potential improvements to jpeg decoding on GPU #3848
A minimal version of jpeg decoding on GPUs was implemented in #3792. This issue tracks a list of potential future improvements.
I have just tested the new v0.10.0 release with beta support for nvjpeg, but I found it to be about 2x slower than CPU decoding. Using:

```python
import os

import numpy as np
import torch
from torchvision.io.image import decode_jpeg

# folder: directory containing the .jpg files
images_bytes = [
    np.frombuffer(open(os.path.join(folder, a), 'rb').read(), dtype=np.uint8)
    for a in os.listdir(folder) if a.endswith('jpg')
]

# %%timeit -n 1 -r 100
for img_bytes in images_bytes:
    z = torch.from_numpy(img_bytes)
    z = decode_jpeg(z, device='cuda')  # CPU baseline: z = decode_jpeg(z)
```

Full benchmarking code: https://github.com/cceyda/image-checker/blob/master/examples/benchmark_jpeg_decode_extended.ipynb

I also kept getting an error with CUDA 11.1.
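One caveat when timing a loop like the one above: `decode_jpeg(..., device='cuda')` launches asynchronous CUDA work, so a plain wall-clock loop can under-count the GPU time. Below is a minimal sketch of a synchronized measurement; the function name and `runs` parameter are illustrative, not from the original notebook:

```python
import time

import torch
from torchvision.io.image import decode_jpeg

def time_gpu_decode(images_bytes, runs=100):
    # Warm up once so CUDA context creation and library
    # initialization are not counted in the measurement.
    decode_jpeg(torch.from_numpy(images_bytes[0]), device='cuda')
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(runs):
        for img_bytes in images_bytes:
            decode_jpeg(torch.from_numpy(img_bytes), device='cuda')
    # Wait for all queued GPU work to finish before stopping the clock.
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs
```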
hi @cceyda , the GPU benchmarks you're reporting should be using something like the Timer-based script in nvjpeg_bench.py. Also, please note that this issue is for tracking potential improvements to the GPU decoding. Could you please submit the bug failure as a separate issue? It would be easier to keep track of it.
Even with the benchmark code I adapted from nvjpeg_bench.py below:

```python
import torch
from torch.utils.benchmark import Timer
from torchvision.io.image import decode_jpeg, read_file

img_path = './grace_hopper_517x606.jpg'
data = read_file(img_path)
img = decode_jpeg(data)

def sumup(name, mean, median, throughput, fps):
    print(
        f"{name:<10} mean: {mean:.3f} ms, median: {median:.3f} ms, "
        f"Throughput = {throughput:.3f} Megapixel / sec, "
        f"{fps:.3f} fps"
    )

print(f"img.shape = {img.shape}")
print(f"data.shape = {data.shape}")

height, width = img.shape[-2:]
num_pixels = height * width
num_runs = 100

# .to(device) added to account for the time spent moving the
# CPU-decoded image to the GPU.
stmt = "a = decode_jpeg(data, device='{}')\na = a.to(device='cuda:0')"
setup = 'from torchvision.io.image import decode_jpeg'
globals = {'data': data}

for device in ('cpu', 'cuda'):
    t = Timer(stmt=stmt.format(device), setup=setup, globals=globals).timeit(num_runs)
    sumup(device, t.mean * 1000, t.median * 1000, num_pixels / 1e6 / t.median, 1 / t.median)
```

Server 1 ENV:

Server 1 results: (CUDA ~2x slower)

The CUDA 11.1 bug disappeared mysteriously; ipykernel must have been reconnecting to an old kernel despite restarts 🤷 I'll open a separate issue if I re-encounter & isolate it. So I also ran the benchmarks on an A100 with CUDA 11.1.

Server 2 ENV:

Server 2 results: (CUDA ~5x slower)

(Nothing else was running on the GPU during the benchmarks.)
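One thing worth noting about these numbers: decoding a single small image per call leaves the GPU mostly idle, so per-call launch and transfer overhead dominates, which is one reason the GPU path can lose to CPU decoding here. Batched decoding is the scenario where the GPU is expected to win. A sketch, assuming a torchvision release (after this thread; see #8496 below) where `decode_jpeg` accepts a list of encoded tensors, and with hypothetical file paths:

```python
import torch
from torchvision.io import decode_jpeg, read_file

# Hypothetical paths; assumes a torchvision build where decode_jpeg
# accepts a list of encoded tensors for batched GPU decoding.
paths = ['img0.jpg', 'img1.jpg', 'img2.jpg']
encoded = [read_file(p) for p in paths]
decoded = decode_jpeg(encoded, device='cuda')  # list of CHW uint8 CUDA tensors
```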
Thanks for the details, we will keep that in mind.
Sounds good!
Just note that while the code runs on the A100, we haven't implemented full A100 support yet, so we can't take advantage of the dedicated hardware instructions that the A100 has. We'll look into it in the future, and this is one of the items of this issue, but I don't have access to an A100 ATM.
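For benchmarks like the ones above, it can help to record which hardware they actually ran on. A quick check using standard PyTorch APIs (the A100 reports compute capability 8.0, i.e. sm_80):

```python
import torch

# Confirms whether the benchmark ran on Ampere (A100) hardware.
major, minor = torch.cuda.get_device_capability(0)
print(torch.cuda.get_device_name(0), f"sm_{major}{minor}")
```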
I think most of these items have been addressed in #8496, so I'll close this issue. Feel free to open follow-up issues for any feedback on the jpeg GPU decoder.