-
Notifications
You must be signed in to change notification settings - Fork 901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to Thrust 1.16 #10489
Update to Thrust 1.16 #10489
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-22.06 #10489 +/- ##
=============================================
Coverage 86.33% 86.33%
=============================================
Files 140 140
Lines 22300 22300
=============================================
Hits 19253 19253
Misses 3047 3047 Continue to review full report at Codecov.
|
## Description This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.16 when that is updated in rapids-cmake. ## Context I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, Thrust reduced the number of internal header inclusions: > [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior. I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries. It looks like rmm may be able to build with Thrust 1.16 even without these changes, but I think this changeset may help prevent future problems arising from inconsistency and reliance on `detail` headers. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Mark Harris (https://github.com/harrism) - Vyas Ramasubramani (https://github.com/vyasr) - Conor Hoekstra (https://github.com/codereport) URL: #1011
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me assuming that you don't find anything surprising in new benchmarks. Please request my review again if benchmarks reveal an unexpected issue.
Java CI is failing. I see that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Java approval
@gpucibot merge |
Fixes `thrust.patch` to patch the CUB source for `sort` to minimize the inlining of the comparator functor. The build was updated in #10489 to thrust-1.16 which includes change to thrust sort using CUB's `DeviceMergeSort`. This means the previous patch does not apply to the new thrust/cub source. This dramatically increased the build for `sort.cu` and other related source files as can be seen in this Build Metrics Report from #10489: https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/8633/Build_20Metrics_20Report/ This PR moves the `pragma unroll` changes into the appropriate CUB source files reducing the build time back to the previous levels (or close to it I hope). Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Nghia Truong (https://github.com/ttnghia) - Vyas Ramasubramani (https://github.com/vyasr) URL: #10577
PR #10489 updated from Thrust 1.15 to Thrust 1.16. However, this appears to be causing conflicts with other repositories -- [cuSpatial](rapidsai/cuspatial#511 (comment)) and cuGraph have reported issues where their builds are finding Thrust 1.16 from libcudf instead of Thrust 1.15 which is [currently pinned by rapids-cmake](https://github.com/rapidsai/rapids-cmake/blob/06a657281cdd83781e49afcdbb39abc491eeab17/rapids-cmake/cpm/versions.json#L26). This PR is intended to unblock local builds and CI builds for other RAPIDS packages until we are able to identify the root cause (which may be due to CMake include path orderingsrapids-cmake). Last time Thrust was updated, [rapids-cmake was updated](rapidsai/rapids-cmake#138) one day before [libcudf was updated](#9912). That may explain why we didn't notice this problem with the 1.15 update. The plan I currently have in mind is: 1. Merge this PR to roll back libcudf to Thrust 1.15 (and revert the patch for Thrust 1.16 [10577](#10577)). This will hopefully unblock CI for cugraph and cuspatial. 2. Try to work out whatever issues with CMake / include paths may exist. 3. Prepare all rapids-cmake repos for Thrust 1.16 compatibility. I've [done this for RMM already](rapidsai/rmm#1011), and I am working on [PR 4675](rapidsai/cuml#4675) to cuML now. I am planning to make the same fixes for `#include`s in cuCollections, raft, cuSpatial, and cuGraph so they will be compatible with Thrust 1.16. 4. Try to upgrade libcudf to Thrust 1.16 again (and re-apply the updated patch). If (2) has been resolved, I hope we won't see any issues in other RAPIDS libraries 5. Upgrade rapids-cmake to Thrust 1.16. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Mark Harris (https://github.com/harrism) URL: #10586
Fixes `thrust.patch` to patch the CUB source for `sort` to minimize the inlining of the comparator functor. The build was updated in rapidsai#10489 to thrust-1.16 which includes change to thrust sort using CUB's `DeviceMergeSort`. This means the previous patch does not apply to the new thrust/cub source. This dramatically increased the build for `sort.cu` and other related source files as can be seen in this Build Metrics Report from rapidsai#10489: https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/8633/Build_20Metrics_20Report/ This PR moves the `pragma unroll` changes into the appropriate CUB source files reducing the build time back to the previous levels (or close to it I hope). Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Nghia Truong (https://github.com/ttnghia) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#10577
PR rapidsai#10489 updated from Thrust 1.15 to Thrust 1.16. However, this appears to be causing conflicts with other repositories -- [cuSpatial](rapidsai/cuspatial#511 (comment)) and cuGraph have reported issues where their builds are finding Thrust 1.16 from libcudf instead of Thrust 1.15 which is [currently pinned by rapids-cmake](https://github.com/rapidsai/rapids-cmake/blob/06a657281cdd83781e49afcdbb39abc491eeab17/rapids-cmake/cpm/versions.json#L26). This PR is intended to unblock local builds and CI builds for other RAPIDS packages until we are able to identify the root cause (which may be due to CMake include path orderingsrapids-cmake). Last time Thrust was updated, [rapids-cmake was updated](rapidsai/rapids-cmake#138) one day before [libcudf was updated](rapidsai#9912). That may explain why we didn't notice this problem with the 1.15 update. The plan I currently have in mind is: 1. Merge this PR to roll back libcudf to Thrust 1.15 (and revert the patch for Thrust 1.16 [10577](rapidsai#10577)). This will hopefully unblock CI for cugraph and cuspatial. 2. Try to work out whatever issues with CMake / include paths may exist. 3. Prepare all rapids-cmake repos for Thrust 1.16 compatibility. I've [done this for RMM already](rapidsai/rmm#1011), and I am working on [PR 4675](rapidsai/cuml#4675) to cuML now. I am planning to make the same fixes for `#include`s in cuCollections, raft, cuSpatial, and cuGraph so they will be compatible with Thrust 1.16. 4. Try to upgrade libcudf to Thrust 1.16 again (and re-apply the updated patch). If (2) has been resolved, I hope we won't see any issues in other RAPIDS libraries 5. Upgrade rapids-cmake to Thrust 1.16. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Mark Harris (https://github.com/harrism) URL: rapidsai#10586
## Description This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake. ## Context I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, version 1.16 of Thrust reduced the number of internal header inclusions: > [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior. I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries. This changeset also makes it more obvious where cugraph depends on `thrust/detail` headers. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Brad Rees (https://github.com/BradReesWork) - Seunghwa Kang (https://github.com/seunghwak) URL: #2310
## Description This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake. ## Context I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, Thrust reduced the number of internal header inclusions: > [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior. I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries. This changeset also removes dependence on `thrust/detail` headers. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - William Hicks (https://github.com/wphicks) URL: #4675
Thrust 1.16 removed internal header inclusions that libcudf relied on. This PR adds missing `#include`s that were found automatically by a script I wrote. See notes on #10489. This was previously applied in #10489 but the script became more sophisticated (and libcudf has changed) since I last applied it, so more missing `#include`s were found. Required for #11437 to upgrade to Thrust 1.17. This change has been separated from #11437 to minimize that PR's diff. Some additional changes will be needed on that PR but we don't want to hold off on fixing these includes, as recommended by @davidwendt. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Nghia Truong (https://github.com/ttnghia) - Robert Maynard (https://github.com/robertmaynard) URL: #11457
Updates the bundled version of Thrust to 1.17.0. I will run benchmarks and include results in a comment below. Depends on #11457. Supersedes #10489, #10577, #10586. Closes #10841. **This should be merged concurrently with rapidsai/rapids-cmake#231 Authors: - Bradley Dice (https://github.com/bdice) Approvers: - David Wendt (https://github.com/davidwendt) - Nghia Truong (https://github.com/ttnghia) - Robert Maynard (https://github.com/robertmaynard) URL: #11437
## Description This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake. ## Context I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, Thrust reduced the number of internal header inclusions: > [rapidsai#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior. I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries. This changeset also removes dependence on `thrust/detail` headers. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - William Hicks (https://github.com/wphicks) URL: rapidsai#4675
This PR updates the version of Thrust from 1.15 to 1.16 (changelog). This update is needed to fix compilation with GCC 11, because of some warnings-as-errors present in Thrust 1.15 with GCC 11 (such as this one from Thrust's copy of cub: NVIDIA/cub#418).
Notably, Thrust reduced the number of internal header inclusions:
This change illuminated many missing includes in libcudf, so I added
#include <thrust/...>
for all thrust features used in each file (with help from a Python script).I included raw benchmarks that I recorded below.
Benchmarks:
Additional benchmarking from @randerzander and @GregoryKimball indicate that sort and quantile benchmarks show improvements for large data sizes, as much as 34% reduction in time for "Rank nulls 67108864." The benchmark "Quantiles nulls 67108864" shows roughly a 6% reduction in runtime. Small sizes sometimes showed slowdowns, like "Rank nulls 1024" going from 98 microseconds to 177 microseconds. However, these small data sizes are typically not the cases we are optimizing for. I discussed these results in detail with @GregoryKimball and we decided the benchmarks were "green light."