Add option to build with CUDAToolkit and enable_language(CUDA) #336
malfet
left a comment
Sounds good to me, but I wonder how you plan to test it.
@malfet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
I have pytorch/pytorch#83199 set up to build with my fork of gloo and with
I notice there's no build with CUDA 11 here. Maybe a new CI job for CUDA 11 with
90de735 to 5bc9db0
5bc9db0 to f355214
0374dca to 5f69b55
@malfet I've got the CI job running. The build log shows: and also files show
@malfet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Got internal approval, landing...
With CUDA-10.2 gone we can finally do it!

This PR mostly contains build-system-related changes; invasive functional ones are to follow.

Among many expected tweaks to the build system, here are a few unexpected ones:
- Force the onnx_proto project to be updated to C++17 to avoid a `duplicate symbols` error when compiled by gcc-7.5.0, as the storage rule for `constexpr` changed in C++17, but gcc does not seem to follow it
- Do not use `std::apply` on CUDA but rely on the built-in variant, as it results in test failures when the CUDA runtime picks the host rather than the device function when `std::apply` is invoked from CUDA code
- `std::decay_t` -> `::std::decay_t` and `std::move` -> `::std::move`, as VC++ for some reason claims that the `std` symbol is ambiguous
- Disable use of `std::aligned_alloc` on Android, as its `libc++` does not implement it

Some prerequisites:
- #89297
- #89605
- #90228
- #90389
- #90379
- #89570
- pytorch/gloo#336
- pytorch/gloo#343
- pytorch/builder@919676f

Fixes #56055
Pull Request resolved: #85969
Approved by: https://github.com/ezyang, https://github.com/kulinseth
`find_package(CUDA)` is deprecated in newer versions of CMake. This adds the `GLOO_USE_CUDA_TOOLKIT` option to build with `enable_language(CUDA)` and `find_package(CUDAToolkit)`, which are the modern CMake replacements.

cc @malfet
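For readers unfamiliar with the two approaches, here is a minimal sketch of how such an option might switch between them. It is not the actual gloo `CMakeLists.txt`: only the option name `GLOO_USE_CUDA_TOOLKIT` comes from the description above, while the project name, the `sketch_cuda` target, and the `kernels.cu` source file are hypothetical placeholders.

```cmake
# find_package(CUDAToolkit) was introduced in CMake 3.17.
cmake_minimum_required(VERSION 3.17)
project(cuda_toolkit_sketch LANGUAGES CXX)  # hypothetical project name

# Option name taken from the PR description; the default is an assumption.
option(GLOO_USE_CUDA_TOOLKIT
  "Build with enable_language(CUDA) and find_package(CUDAToolkit)" OFF)

if(GLOO_USE_CUDA_TOOLKIT)
  # Modern path: CUDA is a first-class language, and the toolkit
  # libraries are exposed as imported targets such as CUDA::cudart.
  enable_language(CUDA)
  find_package(CUDAToolkit REQUIRED)
  add_library(sketch_cuda STATIC kernels.cu)  # kernels.cu is a placeholder
  target_link_libraries(sketch_cuda PUBLIC CUDA::cudart)
else()
  # Legacy path: the deprecated FindCUDA module and its cuda_add_library macro.
  find_package(CUDA REQUIRED)
  cuda_add_library(sketch_cuda STATIC kernels.cu)
endif()
```

With a layout like this, the modern path would be selected at configure time by passing `-DGLOO_USE_CUDA_TOOLKIT=ON` to `cmake`. Keeping the legacy branch behind the same option lets existing builds continue unchanged while the new path is tested.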