@peterbell10
Contributor

`find_package(CUDA)` is deprecated in newer versions of CMake (the FindCUDA module has been deprecated since CMake 3.10). This adds the `GLOO_USE_CUDA_TOOLKIT` option to build with `enable_language(CUDA)` and `find_package(CUDAToolkit)`, which are the modern CMake replacements.
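
For readers unfamiliar with the two approaches, here is a minimal sketch of how an option like this might switch between them. It is not the actual gloo change: the project name, the `example_cuda` target, and the `example.cu` source file are hypothetical, and the `7.0` minimum version is an assumption for illustration.

```cmake
# Hypothetical standalone project; names below are illustrative only.
# CMake 3.17 or newer is needed for find_package(CUDAToolkit).
cmake_minimum_required(VERSION 3.18)
project(cuda_toolkit_sketch LANGUAGES CXX)

option(GLOO_USE_CUDA_TOOLKIT
       "Build CUDA sources with enable_language(CUDA) and FindCUDAToolkit" OFF)

if(GLOO_USE_CUDA_TOOLKIT)
  # Modern path: CUDA is a first-class language, so .cu files are
  # compiled through the regular add_library() command.
  enable_language(CUDA)
  find_package(CUDAToolkit 7.0 REQUIRED)
  add_library(example_cuda example.cu)
  # FindCUDAToolkit provides imported targets such as CUDA::cudart.
  target_link_libraries(example_cuda PUBLIC CUDA::cudart)
else()
  # Legacy path: the deprecated FindCUDA module and its macro.
  find_package(CUDA 7.0 REQUIRED)
  cuda_add_library(example_cuda example.cu)
endif()
```

With a layout like this, configuring with `-DGLOO_USE_CUDA_TOOLKIT=ON` selects the modern path, and leaving the option at its default keeps the existing behaviour.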

cc @malfet

Contributor

@malfet malfet left a comment

Sounds good to me, but I wonder how you plan to test it.

@facebook-github-bot

@malfet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@peterbell10
Contributor Author

I have pytorch/pytorch#83199 set up to build with my fork of gloo and with `GLOO_USE_CUDA_TOOLKIT` set.

@peterbell10
Contributor Author

I notice there's no build with CUDA 11 here. Maybe a new CI job for CUDA 11 with `GLOO_USE_CUDA_TOOLKIT` defined would be useful?

@peterbell10
Contributor Author

@malfet I've got the CI job running. The build log shows:

-- Found CUDAToolkit: /usr/local/cuda/include (found suitable version "11.7.99", minimum required is "7.0")

and the CUDA source files are also reported as `Building CUDA object`, whereas with `cuda_add_library` they show up as `Building NVCC (Device) object`.

@facebook-github-bot

@malfet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@malfet
Contributor

malfet commented Sep 20, 2022

Got internal approval, landing...

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Dec 8, 2022
With CUDA-10.2 gone we can finally do it!

This PR mostly contains build-system-related changes; invasive functional ones are to follow.
Among many expected tweaks to the build system, here are a few unexpected ones:
 - Force the onnx_proto project to be updated to C++17 to avoid a `duplicate symbols` error when compiled by gcc-7.5.0, as the storage rule for `constexpr` changed in C++17, but gcc does not seem to follow it
 - Do not use `std::apply` on CUDA but rely on the built-in variant, as it results in test failures when the CUDA runtime picks the host rather than the device function when `std::apply` is invoked from CUDA code.
 - `std::decay_t` -> `::std::decay_t` and `std::move` -> `::std::move`, as VC++ for some reason claims that the `std` symbol is ambiguous
 - Disable use of `std::aligned_alloc` on Android, as its `libc++` does not implement it.

Some prerequisites:
 - #89297
 - #89605
 - #90228
 - #90389
 - #90379
 - #89570
 - pytorch/gloo#336
 - pytorch/gloo#343
 - pytorch/builder@919676f

Fixes #56055

Pull Request resolved: #85969
Approved by: https://github.com/ezyang, https://github.com/kulinseth
kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Dec 20, 2022
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Jan 3, 2023