-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Autotuning for TeamPolicy team sizes and vector lengths #3206
Conversation
… OpenMP. To be done: OMPTarget, HIP, and Threads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main objection is how we choose to do the tuning. I'm not convinced it's correct to call the configuration list a "ratio tuning variable"?
The other concern is connecting different invocations of the "same" kernel. Getting metaphysical, how do we determine 'same'?
core/src/Kokkos_Parallel.hpp
Outdated
} | ||
|
||
ExecPolicy inner_policy = policy; | ||
Kokkos::Tools::Experimental::begin_parallel_for(inner_policy, functor, str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't been following the profiling changes well enough. This seems like a major behavioral change that isn't intrinsic to the autotuning piece.
I'm not sure what the final plan for the PR is. Putting namespace 'Experimental' inside the most important Kokkos function seems problematic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not intrinsic, but it is required (in my opinion). And if we're putting Tuning in the most important Kokkos function, and Tuning is Experimental, we're putting Experimental in that function in some sense. That's a question I'd like @crtrott to answer on, because if the answer is "this is a problem" I just need to close the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be in the public namespace (actually that is probably also true for beginParallelFor come to speak of it - what reason do we have to expose that call to users ?? [and yes I know David that is WELL before your time ...]). Generally I don't have a problem with this use of experimental here otherwise. But again I don't see a reason to expose this new function to users - and rather want to remove the other ones from the public namespace too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@crtrott: Kokkos::Tools::Impl
or Kokkos::Tools::Experimental::Impl
? I'm in favor of the first
&kpID); | ||
} | ||
#ifdef KOKKOS_ENABLE_TUNING | ||
size_t context_id = Kokkos::Tools::Experimental::get_new_context_id(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we getting a new context ID every time we execute a parallel_for? Doesn't this mean that the "same" parallel will be a different context every time? How do we 'tune' if we can't connect parallel_for's to each other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might just need to redo #2422, since everybody seems to think that a context is some permanent thing. A context is a just a way of saying when values go in and out of scope. You know to connect parallel_x's to each other if you are providing the same output types to the same set of input types
Yikes. I used to just tune the team size, which is ratio data, and didn't change when I moved to tuning both. Good catch. I think the rest of your comments are good and I'll chase them, but thanks so much for catching that |
Getting errors in Jenkins:
I don't recognize this, anybody know what's up? |
Conflicts: core/src/Cuda/Kokkos_Cuda_Parallel.hpp core/src/OpenMP/Kokkos_OpenMP_Team.hpp
Current state of play:
|
34ec427
to
f52d1f0
Compare
Retest this please |
2 similar comments
Retest this please |
Retest this please |
size_t get_current_context_id(); | ||
} // namespace Experimental | ||
|
||
namespace Impl { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm double Impl namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a follow up issue to change how the tuning stuff is enabled? I.e. add a runtime option which enables it and remove the compile time thing?
|
Apologies for what will be a large pull request. The end goal of all of this is to make it so that we can tune the team sizes and vector lengths as I mentioned in #3155. My ask of reviewers is that you look at this series of steps, and see if they're the right steps, and also that I've done them correctly. So far I've only done Serial, CUDA, and OpenMP (HPX, Threads, HIP, and OMPTarget remain).
parallel_x
calls. I don't want to bloat this function, so what I've done is addbegin/end
functions for each pattern in Kokkos_Profiling.hpp that receive the arguments to that function (by non-const reference, they can change the behavior of the call)There are a couple of details I'm still figuring out (how to make this all work when Tuning is turned off), but before I went ham on all of this I wanted to make sure this was an acceptable path forward on this