Fence ArborX calls in the benchmark and (optionally) DBSCAN #450
The last kernel in an ArborX call may be a Kokkos::parallel_for. Because ArborX does not guarantee a fence() at the end, that kernel can be launched and ArborX can return before it finishes. When the ending timer is then read, the kernel has not completed yet, so the measured time is wrong.
I also added fences before the ArborX calls in the loop so that any setup we add inside the loop, now or in the future, does not creep into the timers.
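The timing problem described above can be reproduced with a standard-library analogy. This is only a sketch, not the benchmark code from this PR: `launch_async_work` and `time_call_ms` are hypothetical names, `std::async` stands in for an asynchronous kernel launch, and `pending.wait()` plays the role that `Kokkos::fence()` plays in the benchmark.

```cpp
#include <chrono>
#include <future>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for a library call whose last kernel is launched
// asynchronously: it returns right away while the work keeps running.
std::future<void> launch_async_work(std::chrono::milliseconds duration)
{
  return std::async(std::launch::async,
                    [duration] { std::this_thread::sleep_for(duration); });
}

// Times one call. With fence_after == true the timer is stopped only after
// the work has completed; with false it is stopped as soon as the call returns.
double time_call_ms(std::chrono::milliseconds work, bool fence_after)
{
  auto const start = Clock::now();
  std::future<void> pending = launch_async_work(work);
  if (fence_after)
    pending.wait(); // analogous to calling Kokkos::fence() before the end timer
  auto const stop = Clock::now();

  pending.wait(); // drain outstanding work so it cannot leak into a later timing
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_call_ms(std::chrono::milliseconds(50), true)` should report roughly the full 50 ms, while passing `false` typically reports only the launch overhead. The final `pending.wait()` serves the same purpose as a fence placed before the next timed call: work left over from one iteration is not charged to the next.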
As a side effect, this PR also fixes #446. It is still not clear what caused that hang, though. One guess is that in this mode the loop issued kernel launches continuously, and something went wrong once too many of them had been issued.
So far, I have not observed any significant changes to timings on my workstation, but I've launched some jobs on Summit for observation.
There's no difference on Summit, either 🤷
Otherwise, this looks reasonable.
1053e07 to 27eea17
I am not a big fan of the timing lambdas you introduced.
Is there anything else you guys want addressed here?
I still don't like your lambda but the rest is fine. Make sure it does not get copy/pasted elsewhere in the future.