force building torchvision with CUDA support if CUDA is included as dependency by setting $FORCE_CUDA #2931

Merged

Member

@boegel boegel commented May 23, 2023

(created using eb --new-pr)

If torchvision was built without CUDA support, then the included sanity check command will fail with:

NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend

See also https://discuss.pytorch.org/t/notimplementederror-could-not-run-torchvision-nms-with-arguments-from-the-cuda-backend-this-could-be-because-the-operator-doesnt-exist-for-this-backend/132352
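A check of this kind can be sketched as follows. This is only an illustration, not the sanity check command from the easyblock: the function name `nms_cuda_check` and the example box and score values are made up here. It calls `torchvision.ops.nms` on CUDA tensors, which is the operator named in the error above, and skips itself when PyTorch, torchvision, or a GPU is not available.

```python
import importlib.util


def nms_cuda_check():
    """Run torchvision's NMS operator on the GPU (hypothetical helper).

    Returns a short status string; raises NotImplementedError if
    torchvision was built without CUDA support.
    """
    for pkg in ("torch", "torchvision"):
        if importlib.util.find_spec(pkg) is None:
            return "skipped: %s is not installed" % pkg

    import torch
    import torchvision

    if not torch.cuda.is_available():
        return "skipped: no CUDA device visible"

    # One box in (x1, y1, x2, y2) format with one score, both on the GPU.
    boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0]], device="cuda")
    scores = torch.tensor([0.9], device="cuda")

    # With a CPU-only torchvision build, this call is where the
    # "Could not run 'torchvision::nms'" error is raised.
    keep = torchvision.ops.nms(boxes, scores, 0.5)
    return "ok: kept %d box(es)" % keep.numel()


if __name__ == "__main__":
    print(nms_cuda_check())
```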

Setting $FORCE_CUDA bypasses the checks that are done to determine whether or not torchvision should be built with CUDA support.
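The effect of `$FORCE_CUDA` can be illustrated with a small sketch. It assumes, as far as I understand torchvision's `setup.py`, that CUDA extensions are built either when a GPU and a CUDA installation are detected at build time, or when `FORCE_CUDA` is set to `1`. The function names `should_build_with_cuda` and `build_env_for` are hypothetical and do not come from torchvision or from the easyblock; the dependency list is a plain list of names used for illustration only.

```python
import os


def should_build_with_cuda(cuda_available, cuda_home, env=None):
    """Approximate the build-time decision (assumed logic, not copied code)."""
    env = os.environ if env is None else env
    force_cuda = env.get("FORCE_CUDA", "0") == "1"
    # Detection fails on build hosts without a visible GPU,
    # even when the CUDA toolkit itself is installed.
    detected = cuda_available and cuda_home is not None
    return detected or force_cuda


def build_env_for(dependencies, base_env=None):
    """Return a build environment; force CUDA if it is a listed dependency."""
    env = dict(os.environ if base_env is None else base_env)
    if "CUDA" in dependencies:
        env["FORCE_CUDA"] = "1"
    return env


if __name__ == "__main__":
    # No GPU visible at build time, but CUDA is a dependency.
    env = build_env_for(["Python", "PyTorch", "CUDA"], base_env={})
    print(should_build_with_cuda(False, "/path/to/cuda", env=env))
```

With the variable set, the result no longer depends on whether a GPU happens to be visible on the build host, which is the situation the description above refers to.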

@boegel boegel added the bug fix label May 23, 2023
@boegel boegel added this to the next release (4.7.2) milestone May 23, 2023
@boegel boegel force-pushed the 20230523153228_new_pr_torchvision branch from 20e2199 to d48fdc9 Compare May 23, 2023 14:35
@boegel
Member Author

boegel commented May 23, 2023

Test report by @boegel

Overview of tested easyconfigs (in order)

  • SUCCESS torchvision-0.13.1-foss-2022a.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
node3133.skitty.os - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz (skylake_avx512), Python 3.6.8
See https://gist.github.com/boegel/bef4a5353ae4659fdb03ec03457c5a2e for a full test report.

@boegel
Member Author

boegel commented May 23, 2023

Test report by @boegel

Overview of tested easyconfigs (in order)

  • SUCCESS torchvision-0.12.0-foss-2021a-PyTorch-1.11.0-CUDA-11.3.1.eb
  • SUCCESS torchvision-0.13.1-foss-2022a-CUDA-11.7.0.eb

Build succeeded for 2 out of 2 (2 easyconfigs in total)
node3307.joltik.os - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), 1 x NVIDIA Tesla V100-SXM2-32GB, 530.30.02, Python 3.6.8
See https://gist.github.com/boegel/44524441c2faccc9f67dfd9f0ceb9b54 for a full test report.

@boegel boegel changed the title to "force building torchvision with CUDA support if CUDA is included as dependency by setting $FORCE_CUDA" May 23, 2023
Contributor

@akesandgren akesandgren left a comment

LGTM

@akesandgren
Contributor

Going in, thanks @boegel!

@akesandgren akesandgren merged commit 411f81c into easybuilders:develop May 24, 2023
59 checks passed
@boegel boegel deleted the 20230523153228_new_pr_torchvision branch May 24, 2023 07:38
2 participants