Use lower bounds to avoid traversals in MST #631
if (radius < _lower_bounds(i - n + 1))
  return;
What happened to the version with the scan you showed me? I thought you said it was faster.
I thought so too. But I repeated the experiments on Summit yesterday and could not reproduce the speedup. Maybe I ran so many experiments that different results got mixed up in my head. So, the current patch does not actually seem to help on GPU. On CPU it's better than what I showed you for two reasons: a) we don't do a parallel_scan, and b) more importantly, radius may have already been updated by another thread that converged, so even more threads drop out.
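To make point b) concrete, here is a minimal sketch of the early exit in plain C++, without Kokkos. It is not the ArborX code: `shortest_edges`, `component_of`, and the `search` callback are hypothetical names, and the indexing is simplified compared to `_lower_bounds(i - n + 1)`. It only illustrates how a radius that was already tightened by an earlier point lets later points skip the traversal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns how many points actually ran the (expensive) neighbor search.
// component_of[i]: component label of point i (assumed layout)
// lower_bounds[i]: distance below which point i is known to have no
//                  neighbor in another component
// radii[c]:        shortest outgoing edge found so far for component c
// search(i, r):    hypothetical stand-in for the tree traversal; returns the
//                  distance to the nearest foreign point closer than r, else r
template <typename Search>
std::size_t shortest_edges(std::vector<int> const &component_of,
                           std::vector<float> const &lower_bounds,
                           std::vector<float> &radii, Search search)
{
  assert(component_of.size() == lower_bounds.size());
  std::size_t traversals = 0;
  for (std::size_t i = 0; i < component_of.size(); ++i)
  {
    float &radius = radii[component_of[i]];
    // Early exit: the component already has an edge shorter than anything
    // point i could contribute, so the traversal is skipped.
    if (radius < lower_bounds[i])
      continue;
    ++traversals;
    float const d = search(i, radius);
    if (d < radius)
      radius = d; // later points of this component see the tighter radius
  }
  return traversals;
}
```

In this serial loop the radius shrinks as points are processed, so the check fires more often than it would if every point only saw the radius from the start of the iteration.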
a52eb33 to c82c9c8
I switched to using lower bounds only for Serial. I investigated using them on an Nvidia A100 (I used OACISS Saturn, where I get consistent timings; on Perlmutter, the timings vary a lot from run to run). The summary is:
In short, there is no benefit on GPU, so I disabled it there. On CPU (AMD EPYC 7763) in Serial, it speeds up the MST construction by 25%.
c82c9c8 to e5b7dad
@@ -482,6 +513,16 @@ struct MinimumSpanningTree
        Kokkos::view_alloc(Kokkos::WithoutInitializing, "ArborX::MST::radii"),
        n);

    bool const use_lower_bounds =
        (std::is_same<ExecutionSpace, Kokkos::Serial>{});
Kokkos::Serial is undefined when the serial backend is not enabled. It is worrisome that the CI passed...
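One way to avoid naming a backend type that may not exist is to hide the comparison behind a trait whose specialization is guarded by the backend's enable macro. The sketch below uses stand-ins so that it compiles without Kokkos: `SKETCH_ENABLE_SERIAL` plays the role of `KOKKOS_ENABLE_SERIAL`, the `sketch::` types play the role of Kokkos execution spaces, and `is_serial` is a hypothetical helper, not something the PR defines.

```cpp
#include <type_traits>

// Stand-ins so the sketch compiles without Kokkos. In a real build the macro
// would be KOKKOS_ENABLE_SERIAL and the types would be Kokkos execution spaces.
#define SKETCH_ENABLE_SERIAL

namespace sketch
{
#ifdef SKETCH_ENABLE_SERIAL
struct Serial
{
};
#endif
struct Cuda
{
};
} // namespace sketch

// Primary template: no execution space is treated as serial by default.
template <typename ExecutionSpace>
struct is_serial : std::false_type
{
};

// The specialization names Serial only when the backend exists, so the code
// still compiles when the serial backend is disabled.
#ifdef SKETCH_ENABLE_SERIAL
template <>
struct is_serial<sketch::Serial> : std::true_type
{
};
#endif

template <typename ExecutionSpace>
constexpr bool use_lower_bounds()
{
  return is_serial<ExecutionSpace>::value;
}
```

With the macro undefined, `use_lower_bounds<...>()` is simply `false` for every remaining space and nothing refers to the missing type.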
@dalg24 Did you mean to push seemingly unrelated stuff in
No. Will fix.
1ed7ef4 to a40981a
This does not compile:
I was sure that having So I did the final salute by explicitly stripping the check out of the functor completely: I introduced tags and dispatch based on the execution space. I did not guard recomputing the lower bounds on GPU, as it takes < 0.1%.
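For readers unfamiliar with the pattern, tag dispatch on the execution space can be sketched as follows in plain C++. This is only an illustration under assumptions, not the code in the PR: `Serial` and `Device` are stand-in types, and `LowerBoundsEnabled`, `LowerBoundsDisabled`, `FindEdges`, and `run` are hypothetical names. In Kokkos the tag would typically be passed through the execution policy, and the functor would provide one `operator()` overload per tag.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Stand-ins for execution spaces (assumptions, not Kokkos types).
struct Serial
{
};
struct Device
{
};

// Tags selecting which operator() overload is compiled into the loop.
struct LowerBoundsEnabled
{
};
struct LowerBoundsDisabled
{
};

template <typename ExecutionSpace>
using LowerBoundsTag =
    std::conditional_t<std::is_same<ExecutionSpace, Serial>::value,
                       LowerBoundsEnabled, LowerBoundsDisabled>;

struct FindEdges
{
  std::vector<float> const &lower_bounds;
  float radius;
  std::size_t *traversals; // counts points that would run the traversal

  // Variant with the lower-bound check.
  void operator()(LowerBoundsEnabled, std::size_t i) const
  {
    if (radius < lower_bounds[i])
      return;
    ++*traversals;
  }

  // Variant without the check: it contains no branch on lower bounds at all.
  void operator()(LowerBoundsDisabled, std::size_t) const { ++*traversals; }
};

template <typename ExecutionSpace>
std::size_t run(std::vector<float> const &lower_bounds, float radius)
{
  assert(radius >= 0.f);
  std::size_t traversals = 0;
  FindEdges f{lower_bounds, radius, &traversals};
  for (std::size_t i = 0; i < lower_bounds.size(); ++i)
    f(LowerBoundsTag<ExecutionSpace>{}, i); // overload chosen at compile time
  return traversals;
}
```

Because the overload is selected at compile time, the variant used on the device carries no trace of the check, which is the effect described above.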