Add CI for linux cuda #187

Merged

ahmadsharif1 merged 29 commits into meta-pytorch:main from ahmadsharif1:cuda1

Aug 21, 2024

Contributor

ahmadsharif1 commented Aug 15, 2024 (edited)

This diff adds GPU CI to TorchCodec.

It installs the NVFFMPEG headers, builds FFMPEG from source, builds TorchCodec, and then runs the GPU tests.

This will be useful to catch GPU decoding regressions.

I also added an explicit dependency on the NPP image processing library because the build was failing inside the container environment without it.

Add CI for linux cuda

bbad58a

facebook-github-bot added the CLA Signed label

ahmadsharif1 added 22 commits

August 15, 2024 13:37

ahmadsharif1 assigned NicolasHug

ahmadsharif1 marked this pull request as ready for review

August 20, 2024 15:44

NicolasHug reviewed

.github/workflows/test_linux_cuda.yaml Outdated

Comment on lines 32 to 34

+                      conda create --yes --name test
+                      conda activate test
+                      conda install --yes pip cmake pkg-config nasm

Contributor

NicolasHug Aug 20, 2024

Also, use --quiet wherever possible in conda calls; otherwise the logs get really long.

.github/workflows/test_linux_cuda.yaml Outdated

+                      # This fails because conda reactivate fails inside this env
+                      # conda install --yes conda-forge::compilers
+                      pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu124

Contributor

NicolasHug Aug 20, 2024

We shouldn't need to install torchaudio, and we shouldn't need to install torchvision at this point either.

Contributor Author

ahmadsharif1 Aug 20, 2024

Torchvision is needed by gpu_benchmark.py

.github/workflows/test_linux_cuda.yaml

+                      # We skip certain tests because they are not relevant to GPU decoding and they always fail with
+                      # a custom FFMPEG build.
+                      pytest -k "not (test_get_metadata or get_ffmpeg_version)"
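As background on the `-k` expression above: pytest treats each identifier in the expression as a substring to look for in a test's name (it also consults other keywords such as parent names and markers). The following is only a simplified Python model of that selection, not pytest's implementation, and the test names in it are hypothetical.

```python
# Simplified model of:
#   pytest -k "not (test_get_metadata or get_ffmpeg_version)"
# Assumption: only the test function name is matched; real pytest also
# looks at parent names, markers, and other keywords.
EXCLUDED_SUBSTRINGS = ("test_get_metadata", "get_ffmpeg_version")


def is_selected(test_name: str) -> bool:
    """Return True if the test would still run under the -k expression."""
    return not any(part in test_name for part in EXCLUDED_SUBSTRINGS)


# Hypothetical test names, for illustration only.
test_names = [
    "test_get_metadata",
    "test_get_metadata_fails",
    "test_get_ffmpeg_version",
    "test_decode_frame_on_gpu",
]

selected = [name for name in test_names if is_selected(name)]
print(selected)
```

Because matching is by substring, a name such as `test_get_metadata_fails` is deselected as well, so a broad `-k` filter can hide more tests than intended.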

Contributor

NicolasHug Aug 20, 2024

So, fun fact: it's currently hard to tell whether the GPU tests are being run at all, because they're guarded by an "if" block. I'll share pointers on how to address this offline.

Contributor Author

ahmadsharif1 Aug 20, 2024

This is accurate and a real problem. Should I address this in a different diff?
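One possible way to make this visible, sketched here under assumptions rather than taken from the TorchCodec code base: replace the plain "if" guard with an explicit skip decorator, so that a test that does not run is reported as skipped instead of silently disappearing. The sketch uses the standard-library `unittest` skip decorators, which pytest also honors; `cuda_available` and `TestGpuDecoding` are hypothetical names.

```python
import unittest


def cuda_available() -> bool:
    """Return True only if torch imports cleanly and reports a CUDA device."""
    try:
        import torch  # assumption: torch may or may not be installed
    except Exception:
        return False
    return bool(torch.cuda.is_available())


class TestGpuDecoding(unittest.TestCase):  # hypothetical test class
    @unittest.skipUnless(cuda_available(), "CUDA is not available")
    def test_decode_on_gpu(self):
        # Placeholder body; a real test would decode frames on the GPU.
        self.assertTrue(cuda_available())


def count_skipped() -> int:
    """Run the suite and return how many tests were reported as skipped."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestGpuDecoding)
    result = unittest.TestResult()
    suite.run(result)
    return len(result.skipped)


print(f"skipped GPU tests: {count_skipped()}")
```

With this shape, a CI job on a GPU runner could assert that the skipped count is zero, which would turn "the GPU tests never ran" into a visible failure.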

ahmadsharif1 added 3 commits

August 20, 2024 09:07

eddb128

567baac

61717a6

ahmadsharif1 commented

.github/workflows/test_linux_cuda.yaml

+                      conda activate test
+                      conda install --quiet --yes pip cmake pkg-config nasm
+                      pip install --quiet --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu124

Contributor Author

ahmadsharif1 Aug 20, 2024

Torchvision is required for the GPU benchmark, which uses functional transforms from torchvision. See gpu_benchmark.py for details.

https://github.com/pytorch/torchcodec/blob/f4065f1b477148cfb0ef94167fb0bf3a63803e55/benchmarks/decoders/gpu_benchmark.py#L8

Contributor

NicolasHug Aug 20, 2024

But we don't need it to build, right? So we might prefer to install it later. This is OK though, since I guess this job is just a temporary setup.

Contributor Author

ahmadsharif1 Aug 20, 2024

That's a good point. Should I move it below after the build step so it doesn't accidentally leak as a build-time dep? WDYT?

ahmadsharif1 added 3 commits

August 20, 2024 09:27

a1882e0

7e02734

NicolasHug approved these changes

Contributor

NicolasHug left a comment

Thanks @ahmadsharif1. Let's follow up on https://github.com/pytorch/torchcodec/pull/187/files#r1723560554 in a separate PR.

Eventually we'll want to make sure wheels can be built (and tested) and integrate that with our existing wheel.yml workflows, but this is a great first step to test GPU-related changes.

ahmadsharif1 merged commit cafea81 into meta-pytorch:main

ahmadsharif1 deleted the cuda1 branch

August 21, 2024 13:59

NicolasHug added a commit to NicolasHug/torchcodec that referenced this pull request

Revert "Add CI for linux cuda (meta-pytorch#187)"

a822a5d

This reverts commit cafea81.
