Use runtime values in KokkosExp_MDRangePolicy.hpp #3626
Retest this please.
Retest this please.
Please resolve conflicts now that #3481 has been merged.
Done.
That was fast :)
This doesn't break anything I need, as far as I can tell (commenting that the merge from my PR seems good)
OK: one question and one naming request.
Question: should we move the partial specializations into the execution-space-specific files? That would also play better with the addition of external execution spaces. In fact, shouldn't there now be a mechanism we can use to include the header files for the defined execution spaces?
Renaming: some of the names in the init variants feel very misleading. For example, max_tile_size is not actually the maximum tile size a user could request. Can we find better names for these?
core/src/KokkosExp_MDRangePolicy.hpp
Outdated
  static constexpr Iterate value = Iterate::Right;
};

#ifdef KOKKOS_ENABLE_CUDA
Should these be in the execution-space-specific files, so that there is less #ifdef-ing in this one? Just declare the default version above the inclusion of the ExecSpace headers and put the specializations into those headers, maybe even into Kokkos_Cuda.hpp etc.?
Done.
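For readers following along, the layout suggested above can be sketched roughly as follows. This is only an illustration: `HostLikeSpace`, `DeviceLikeSpace`, and `default_direction_sketch` are hypothetical names, not the Kokkos types or the trait touched by this PR, and the `Iterate::Left` value for the device-like space is an assumption made for the example.

```cpp
// Hypothetical stand-ins for execution spaces (not the Kokkos types).
struct HostLikeSpace {};
struct DeviceLikeSpace {};

enum class Iterate { Default, Left, Right };

// Generic header: the primary template is declared before any
// execution-space-specific header is included.
template <typename ExecSpace>
struct default_direction_sketch {
  static constexpr Iterate value = Iterate::Right;
};

// Execution-space-specific header: the specialization lives next to the
// space it describes, so the generic header needs no #ifdef for it.
// (Iterate::Left here is an assumption for illustration.)
template <>
struct default_direction_sketch<DeviceLikeSpace> {
  static constexpr Iterate value = Iterate::Left;
};
```

With this arrangement, an external execution space could add its own specialization in its own header without touching the generic file.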
core/src/KokkosExp_MDRangePolicy.hpp
Outdated
  const index_type default_tile_size = 2;
  init_helper(max_threads, max_tile_size, default_tile_size, max_threads);
}

#if defined(KOKKOS_ENABLE_CUDA)
Same for this: should this live in the execution-space-specific headers?
Done.
core/src/KokkosExp_MDRangePolicy.hpp
Outdated
template <typename ExecutionSpace>
void init(const ExecutionSpace&) {
  const index_type max_threads = std::numeric_limits<int>::max();
  const index_type max_tile_size = 0;
???
The largest tile size was the maximum of the extent and 1. The other execution spaces use a value independent of the extent, though. To make both of these cases work, I decided to treat the case max_tile_size == 0 differently below.
core/src/KokkosExp_MDRangePolicy.hpp
Outdated
void init(const Kokkos::Cuda& space) {
  const index_type max_threads =
      space.impl_internal_space_instance()->m_maxThreadsPerSM;
  const index_type max_tile_size = 16;
That doesn't make much sense to me. Why don't we allow for 32x8?
Or is this just what's left for the largest dimension?
Yes, this is what we are using for the largest dimension.
core/src/KokkosExp_MDRangePolicy.hpp
Outdated
  const index_type max_threads =
      space.impl_internal_space_instance()->m_maxThreadsPerSM;
  const index_type max_tile_size = 16;
  const index_type default_tile_size = 2;
What is the default tile size?
The tile size for the dimensions that are not the largest one.
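To make the roles of the two values concrete, here is a simplified sketch of the selection rule as described in this thread: the largest dimension gets max_tile_size (or the extent, at least 1, when max_tile_size == 0), and every other dimension gets default_tile_size. The function `pick_default_tiles` and its `largest_dim` parameter are hypothetical names for illustration; this is not the code in the PR and it ignores user-provided tile sizes.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of Kokkos: choose one tile size per dimension.
template <std::size_t Rank>
std::array<std::int64_t, Rank> pick_default_tiles(
    const std::array<std::int64_t, Rank>& extents,
    std::size_t largest_dim,          // dimension that receives the large tile
    std::int64_t max_tile_size,       // 0 means "derive from the extent"
    std::int64_t default_tile_size) { // used for all other dimensions
  std::array<std::int64_t, Rank> tiles{};
  for (std::size_t i = 0; i < Rank; ++i) {
    if (i == largest_dim) {
      // Special case discussed above: 0 selects max(extent, 1).
      tiles[i] = (max_tile_size == 0)
                     ? std::max<std::int64_t>(extents[i], 1)
                     : max_tile_size;
    } else {
      tiles[i] = default_tile_size;
    }
  }
  return tiles;
}
```

With the values quoted in the snippets above, a rank-3 policy on the CUDA-like path would get tiles of 16 in the largest dimension and 2 in the other two, while the generic path (max_tile_size == 0) would use the full extent of the largest dimension.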
Done.
This is #3603.
I changed |
Ready from my side. I addressed the issues in @crtrott's blocking review.
Also, this has three approvals already.
This pull request queries the maximum number of threads at runtime and uses that value to set up the default tile sizes. It also unifies the implementation a little better.
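As a rough illustration of the idea, the sketch below obtains a thread limit from the execution space instance at runtime and checks a set of tile sizes against it. All names (`HostLikeSpace`, `DeviceLikeSpace`, `max_threads_sketch`, `fits_thread_limit`) are hypothetical, and checking the product of the tile sizes is an assumption about how such a limit might be used, not a description of the code in this PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for execution space instances (not Kokkos types).
struct HostLikeSpace {};
struct DeviceLikeSpace {
  std::int64_t max_threads_per_sm;  // value reported by the device at runtime
};

// Host-like spaces: effectively no limit, mirroring the
// std::numeric_limits<int>::max() default in the snippet above.
inline std::int64_t max_threads_sketch(const HostLikeSpace&) {
  return std::numeric_limits<int>::max();
}

// Device-like spaces: read the limit from the instance instead of
// hard-coding it at compile time.
inline std::int64_t max_threads_sketch(const DeviceLikeSpace& space) {
  return space.max_threads_per_sm;
}

// Returns true if the threads needed for one tile stay within the limit.
template <typename Space, std::size_t Rank>
bool fits_thread_limit(const Space& space,
                       const std::array<std::int64_t, Rank>& tiles) {
  std::int64_t product = 1;
  for (std::int64_t t : tiles) product *= t;
  return product <= max_threads_sketch(space);
}
```

Overloading on the space type keeps the generic code free of preprocessor branches, which matches the direction of the review comments above.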