More extensive disentangling of Policy Traits (Replaces #3564) #3829
Right now, the size of the code for AnalyzePolicy scales quadratically with the number of traits we analyze. With this change, the scaling is linear and adding a new trait doesn't have to touch any other code.
The details of the runtime storage that may be associated with a policy trait are now implemented only as part of the trait-specific specializations. Traits are now fully disjoint, with the exception of defaults in `AnalyzePolicy<void>` and `PolicyTraitsWithDefaults`.
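As a rough illustration of the linear scaling described above, here is a minimal sketch, not the code in this PR. Everything in the `toy` namespace (the tags, the `Schedule` and `IndexType` wrappers, `PolicyDefaults`, and this simplified `AnalyzePolicy`) is a hypothetical stand-in. The point is only that each trait contributes a single partial specialization that knows about nothing but itself, and defaults sit in one place at the end of the recursion.

```cpp
#include <cassert>
#include <type_traits>

namespace toy {

// Hypothetical tags and trait wrappers, standing in for real policy traits.
struct Static {};
struct Dynamic {};
template <class T> struct Schedule {};
template <class T> struct IndexType {};

// Defaults live in exactly one place: the end of the recursion.
struct PolicyDefaults {
  using schedule_type = Static;
  using index_type    = int;
};

// Primary template: walks the trait list one trait at a time.
template <class... Traits>
struct AnalyzePolicy;

// Base case: no traits left, so only the defaults remain.
template <>
struct AnalyzePolicy<> : PolicyDefaults {};

// One partial specialization per trait. Each one handles only its own
// trait and forwards the rest of the list, so adding a new trait means
// adding one more specialization and touching nothing else.
template <class T, class... Rest>
struct AnalyzePolicy<Schedule<T>, Rest...> : AnalyzePolicy<Rest...> {
  using schedule_type = T;  // hides the inherited default
};

template <class T, class... Rest>
struct AnalyzePolicy<IndexType<T>, Rest...> : AnalyzePolicy<Rest...> {
  using index_type = T;  // hides the inherited default
};

}  // namespace toy
```

With this shape, `toy::AnalyzePolicy<toy::IndexType<long>>` picks up `schedule_type` from the defaults, while a trait that appears in the list overrides only its own alias.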
- made implementation of `require` and `prefer` much easier
- introduced the notion of a `TraitSpecification`, which helper templates use to make these things easier
- moved specification of defaults to a place more directly associated with the trait itself
Reproducer: https://godbolt.org/z/r7GT6c
Toy example of workaround strategy: https://godbolt.org/z/YWjx83
Basically, we need to linearize the series of base classes to get MSVC to do empty base optimization. It makes things a little uglier, but I've tried to isolate as much of the workaround as possible.
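For readers who don't want to open the links, the linearization idea can be sketched roughly as follows. This is an independent illustration, not a copy of either Godbolt example, and all names in the `toy` namespace are hypothetical. Instead of one class deriving from several empty bases side by side, each empty layer is a template that derives from the next layer, so every class in the hierarchy has exactly one base.

```cpp
#include <cassert>

namespace toy {

// End of the chain: an empty root class.
struct Root {};

// Each hypothetical trait layer is empty and derives from the next layer,
// so the hierarchy is a single chain rather than a fan of sibling bases.
template <class Next> struct WorkTagLayer : Next {};
template <class Next> struct ScheduleLayer : Next {};
template <class Next> struct IndexTypeLayer : Next {};

// The policy sits on top of the chain and carries the only data member.
struct LinearPolicy : WorkTagLayer<ScheduleLayer<IndexTypeLayer<Root>>> {
  int chunk_size;
};

// With single inheritance all the way down, the empty layers are expected
// to add no storage on the common compilers.
static_assert(sizeof(LinearPolicy) == sizeof(int),
              "empty layers in a linear chain should not add storage");

}  // namespace toy
```

The cost is the nesting in the base-class spelling, which is the "a little uglier" part; hiding that nesting behind a helper alias is one way to keep it isolated.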
Retest this please.
Functionally, this all looks good. I'm temporarily putting a block on this: I've seen some bad results on compile times and binary sizes (not verified, and likely just me screwing up the methodology).
Frustratingly, the results are compiler-dependent. With Clang, this performs just fine. With Intel 19, it does pretty poorly, unfortunately. Daisy and I are discussing.
@DavidPoliakoff's results appear to be in error, and we've talked about them. For Intel 19.0.5 with OpenMP and Serial enabled and
Both #3829 and #3832 reduce the size of the compiled output. The build times for all three are within the noise; about 2:10 seconds for
I'd cosign that, actually. I could see a methodology error that leads to different build times, but not binary sizes. I frankly don't have the time to reproduce, so go with Daisy's timings and chalk it up to methodology errors on my part. My bad.
template <class Policy, class ScheduleType>
constexpr auto require(Policy const& p, Kokkos::Schedule<ScheduleType>) {
Note that this is new functionality. (I do not have any objection; I just wanted to highlight it.)
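To make the shape of such an overload concrete, here is a minimal sketch of what a `require` taking a schedule tag could do: return a copy of the policy whose schedule trait has been replaced. This is an assumption about the general idea only. The `toy::RangePolicy`, `toy::Schedule`, and `toy::require` below are hypothetical and far simpler than the Kokkos types; the body of the overload in this PR is not shown in the snippet above.

```cpp
#include <cassert>
#include <type_traits>

namespace toy {

// Hypothetical schedule tags and wrapper.
struct Static {};
struct Dynamic {};
template <class T> struct Schedule {};

// A deliberately tiny policy with one compile-time trait and some
// runtime state that must survive the trait change.
template <class ScheduleType = Static>
struct RangePolicy {
  using schedule_type = ScheduleType;
  int begin;
  int end;
  constexpr RangePolicy(int b, int e) : begin(b), end(e) {}
};

// Sketch of a require overload: keep the runtime state, swap the
// schedule trait, and return the new policy type by value.
template <class OldSchedule, class NewSchedule>
constexpr auto require(RangePolicy<OldSchedule> const& p,
                       Schedule<NewSchedule>) {
  return RangePolicy<NewSchedule>(p.begin, p.end);
}

}  // namespace toy
```

A call such as `toy::require(policy, toy::Schedule<toy::Dynamic>{})` then yields a `toy::RangePolicy<toy::Dynamic>` with the same bounds as the input.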
Retest this please.
Retest this please.
(Builds on #3829) More disentangling of AnalyzePolicy
- made implementation of `require` and `prefer` much easier
- introduced the notion of a `TraitSpecification`, which helper templates use to make these things easier
- moved `require` and `prefer` overloads to be with the traits they interact with