Cuda refactor #197

scotts · 2024-08-28T12:58:24Z

Currently, VideoDecoder.cpp has many #ifdef ENABLE_CUDA blocks directly inside functions. This refactor applies the following principles:

1. The logic in VideoDecoder.cpp should not need #ifdef ENABLE_CUDA inside of it.
2. CUDA code should be localized to explicit CUDA source files.
3. Generic code should reach CUDA code only through a small, explicit API. Functions in this API throw an exception if they are called in a build that was not compiled with CUDA support.
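
To make principle (3) concrete, here is a minimal sketch of what such an entry point could look like. The names `Frame`, `convertFrameOnGpu`, and `convertFrame` are hypothetical and are not taken from the PR; only the ENABLE_CUDA macro comes from the existing code.

```cpp
#include <stdexcept>

namespace cuda_api_sketch {

// Hypothetical stand-in for whatever frame type the decoder passes around.
struct Frame {
  int width = 0;
  int height = 0;
  bool onDevice = false;
};

#ifdef ENABLE_CUDA
// In a CUDA build, the real definition lives in a CUDA-only source file.
void convertFrameOnGpu(Frame& frame);
#else
// In a non-CUDA build, the same entry point exists but always throws.
inline void convertFrameOnGpu(Frame& /*frame*/) {
  throw std::runtime_error(
      "convertFrameOnGpu was called, but this build has no CUDA support");
}
#endif

// Generic code: no #ifdef here, it only calls the explicit API.
inline void convertFrame(Frame& frame, bool useGpu) {
  if (useGpu) {
    convertFrameOnGpu(frame);
    return;
  }
  // CPU path (placeholder): the frame stays in host memory.
  frame.onDevice = false;
}

} // namespace cuda_api_sketch
```

With this shape, the only preprocessor check sits next to the API declaration, and a caller that asks for the GPU in a CPU-only build gets a clear runtime error instead of a compile failure.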

Note: this was originally #193, but I abandoned that because #196 temporarily moved GPU support into a feature branch.


ahmadsharif1 · 2024-08-30T14:26:28Z

I like the change to the NVTX macro. The other code changes are fine too, but I don't see what problem is solved by moving the code to a new set of files. It makes the code harder to read: cmd+f within VideoDecoder.cpp no longer finds it, so you have to search across files. It may also create merge conflicts with other changes to VideoDecoder.cpp when we merge this branch back into main. My vote is to avoid large code movement on this branch so that it stays mergeable into main. We can revisit this once the branch is merged back.

In other words, I don't agree with principles (1) and (2).

I agree with principle (3), but then the API should be a generic one. I suspect we will eventually land on an API like:

1. an API to decode to an AVFrame
2. an API to do memory allocation on- or off-device
3. an API to color convert from an AVFrame to whatever output destination, with minimal copying

Items (2) and (3) of this list probably need to be generic and accept a torch::Device.

Item (3) could then be a trampoline function that does the CUDA conversion if needed. That way it's not specific to CUDA anymore, and we can add support for other hardware later.

Also, it seems a bit weird to have more ifdefs inside CUDA-specific files. If a file is already CUDA-specific, why does it need ifdefs? It seems better to put the ifdefs in the trampoline code, so that the CUDA files can assume CUDA is available and aren't compiled at all when it is not. That may require some template magic or another way to write the trampoline functions, though.
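
A rough sketch of that trampoline idea, under stated assumptions: `DeviceKind` is a self-contained stand-in for torch::Device, `DecodedFrame` stands in for AVFrame, and all function names are hypothetical. The only ifdef is in the trampoline itself.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

namespace trampoline_sketch {

// Stand-in for torch::Device so the sketch compiles without libtorch.
enum class DeviceKind { CPU, CUDA };

// Stand-in for an AVFrame produced by the decoder.
struct DecodedFrame {
  int width;
  int height;
};

// Stand-in for the output destination.
struct OutputBuffer {
  DeviceKind device;
  std::vector<unsigned char> data;
};

// CPU conversion (placeholder): produce a zeroed RGB buffer.
inline void convertOnCpu(const DecodedFrame& frame, OutputBuffer& out) {
  out.data.assign(static_cast<std::size_t>(frame.width) * frame.height * 3, 0);
}

#ifdef ENABLE_CUDA
// Defined in a CUDA-only source file that contains no ifdefs and is
// compiled only when CUDA is available.
void convertOnCuda(const DecodedFrame& frame, OutputBuffer& out);
#endif

// Trampoline: generic signature, dispatches on the requested device.
inline void convertFrame(const DecodedFrame& frame, OutputBuffer& out) {
  switch (out.device) {
    case DeviceKind::CPU:
      convertOnCpu(frame, out);
      return;
    case DeviceKind::CUDA:
#ifdef ENABLE_CUDA
      convertOnCuda(frame, out);
      return;
#else
      throw std::runtime_error("CUDA device requested in a non-CUDA build");
#endif
  }
  throw std::logic_error("unknown device kind");
}

} // namespace trampoline_sketch
```

Adding another backend would mean one more enumerator and one more case in the trampoline; callers keep using the same `convertFrame` signature.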

I implemented this type of API in an internal diff to improve batch decoding performance. Currently we make inefficient copies because creating the output tensor is entangled with color conversion. By decoupling memory allocation, decoding, and color conversion, we can cut the decode time in half for some frames.
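
As an illustration of why the decoupling helps (this is not the internal diff, just a sketch with hypothetical names): if conversion writes into caller-provided memory, a batch can be allocated once and each frame converted directly into its slice, with no per-frame temporary and no extra copy. The sketch assumes every frame in the batch has the same dimensions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace batch_sketch {

constexpr std::size_t kChannels = 3;

// Stand-in for a decoded frame; `fill` fakes the pixel content.
struct DecodedFrame {
  int width;
  int height;
  std::uint8_t fill;
};

inline std::size_t frameBytes(int width, int height) {
  return static_cast<std::size_t>(width) * height * kChannels;
}

// Color conversion only: writes into memory the caller already owns.
inline void convertInto(const DecodedFrame& frame, std::uint8_t* dst) {
  const std::size_t n = frameBytes(frame.width, frame.height);
  for (std::size_t i = 0; i < n; ++i) {
    dst[i] = frame.fill; // placeholder for the real color conversion
  }
}

// Allocation happens once for the whole batch; each frame is converted
// in place into its own slice of the batch buffer.
inline std::vector<std::uint8_t> convertBatch(
    const std::vector<DecodedFrame>& frames, int width, int height) {
  const std::size_t perFrame = frameBytes(width, height);
  std::vector<std::uint8_t> batch(frames.size() * perFrame);
  for (std::size_t i = 0; i < frames.size(); ++i) {
    convertInto(frames[i], batch.data() + i * perFrame);
  }
  return batch;
}

} // namespace batch_sketch
```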

Lastly, fbcode uses Buck TARGETS files that would also need to be updated with this change, but we can worry about those later.

scotts · 2024-09-03T19:00:19Z

@ahmadsharif1, I think you make a good point that we should avoid changes on this branch until we're ready to merge into main. I also think whatever we do to refactor the organization of CUDA code should follow the APIs you proposed - we should wait until we have those APIs.

On how to avoid the ifdefs: I think we can do that by pushing the decision up to compilation and linking. We build two kinds of libraries: ones that always include CUDA and ones that never do. We decide which libraries to link against at build time. We wouldn't have a trampoline, but we would need a clear API for both variants to implement.

Closing; we'll revisit once we can follow those APIs and we're working on main.
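
A small sketch of the link-time approach, with hypothetical file and function names. One header declares the API, and each library variant supplies its own definitions in a separate source file. The three files are shown together here, with the CUDA variant commented out because a single translation unit can hold only one definition.

```cpp
#include <string>

// ---- device_interface.h (shared by both library variants) ----
namespace link_sketch {
std::string backendName();
bool supportsGpuDecoding();
} // namespace link_sketch

// ---- device_interface_cpu.cpp (built only into the CPU-only library) ----
namespace link_sketch {
std::string backendName() { return "cpu"; }
bool supportsGpuDecoding() { return false; }
} // namespace link_sketch

// ---- device_interface_cuda.cpp (built only into the CUDA library) ----
// namespace link_sketch {
// std::string backendName() { return "cuda"; }
// bool supportsGpuDecoding() { return true; }
// } // namespace link_sketch

// ---- generic code: no ifdefs, behavior depends on what was linked ----
namespace link_sketch {
inline std::string describeBuild() {
  return "decoder backend: " + backendName();
}
} // namespace link_sketch
```

The build system would then list exactly one of the two implementation files per library target, so neither file needs a preprocessor check.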

scotts added 9 commits August 26, 2024 19:28

5b9eb09 refactor CUDA code into its own sets of files
afc2d53 refactor CUDA code into its own sets of files
a1f5451 Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor
f4a49ba Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor
8b6bcb3 Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor
e5d2c54 Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor
94b16c0 Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor
1126541 Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor
ebda7c4 Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor

facebook-github-bot added the CLA Signed label Aug 28, 2024

scotts marked this pull request as draft August 28, 2024 12:58

scotts marked this pull request as ready for review August 29, 2024 20:00

scotts added 2 commits August 29, 2024 13:25

addad1d Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor
b0e1517 Merge branch 'cuda_refactor' of github.com:scotts/torchcodec into cuda_refactor

scotts requested a review from ahmadsharif1 August 29, 2024 20:33

scotts closed this Sep 3, 2024
