[adams2019] Add caching to autoscheduler #5697
…lex/add_autosched_caching
Are the environment vars controlling this (HL_USE_MEMOIZED_FEATURES=1, HL_MEMOIZE_BLOCKS=1, etc.) meant to be a long-term API, or just a short-term expedient?
@steven-johnson Short-term; thoughts on other APIs would be great. I'm also not set on having caching off by default; it could just as easily be turned on by default.
I have no opinion on whether it is on or off by default. I am concerned about how heavily our autoschedulers rely on env vars as a de facto 'API' for controlling a lot of things. I don't have an alternative suggestion at this time, but I'd love to eventually have something that doesn't require setting what are (effectively) mutable globals to do this sort of thing.
Apologies for how delayed I am in updating this - the start-of-the-semester craziness hit pretty hard. @abadams Please let me know if the comments added to
The failure seems to be an unrelated build failure.
src/autoschedulers/adams2019/Cache.h
Outdated
If cache_features is enabled (i.e. HL_DISABLE_MEMOIZED_FEATURES!=1) then this function caches
the feautizations of its children, and if called again, reuses those cached feauturizations.
The features are saved in a LoopNest's member, std::map<> feature_cache. Some features do not
persist, and the FeaturesIntermediates strucct (see Featurization.h) is used to cache useful
strucct
Fixed in dccd642
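For readers following this thread, the feature caching that the quoted Cache.h comment describes can be sketched roughly as below. This is only an illustration under stated assumptions, not the PR's code: `LoopNestSketch`, `Features`, `compute_features`, `features_for`, and the integer stage key are hypothetical stand-ins. Only the idea of a `std::map<>` member named `feature_cache` comes from the comment, and the optional re-check mirrors the verification mode mentioned in the PR description (`HL_VERIFY_MEMOIZED_FEATURES=1`).

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for a featurization of one stage.
using Features = std::vector<double>;

// Hypothetical, heavily simplified loop nest node (not Halide's LoopNest).
struct LoopNestSketch {
    std::map<int, Features> feature_cache;  // key: assumed stage id
    int computations = 0;                   // counts real (uncached) computations

    // Placeholder for the expensive featurization step.
    Features compute_features(int stage_id) {
        ++computations;
        return {stage_id * 1.0, stage_id * 2.0};
    }

    // Returns features for a stage, reusing the cached copy when allowed.
    Features features_for(int stage_id, bool cache_features, bool verify) {
        if (!cache_features) {
            return compute_features(stage_id);
        }
        auto it = feature_cache.find(stage_id);
        if (it == feature_cache.end()) {
            // Miss: compute once and remember the result.
            it = feature_cache.emplace(stage_id, compute_features(stage_id)).first;
        } else if (verify) {
            // Slow path: recompute and check that the cached copy still matches.
            Features fresh = compute_features(stage_id);
            assert(it->second == fresh);
            (void)fresh;
        }
        return it->second;
    }
};
```

With caching on, the second request for the same stage costs a map lookup instead of a recomputation; with verification on, every hit pays for a full recomputation, which is why that mode is described as quite slow.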
@@ -13,6 +13,45 @@ namespace Halide {
namespace Internal {
namespace Autoscheduler {
This needs a big-picture overview comment as well as the list of changes below. E.g. there seem to be two types of caching: feature caching and block caching, but I'm still confused about what block caching is from reading the text below. Say what kinds of caching exist, what values are cached, what the key is, and why this saves work.
Hopefully addressed by dccd642? Let me know what you think.
@rootjalex It would be good to get this in for the 12.0 release. It just needs a few more comment tweaks, I think.
@abadams I think the tilings are generally faster to save than to regenerate. The speedup that Luke sees on the GPU autoscheduler is much larger than what we see here, though, which I assume is because he's generating more tiling options.
The thing I'm still confused about is what is being cached in the blocks case. Reading the code, it looks like it's the set of child LoopNest nodes for things scheduled compute_root. So what you're saving is loop nest construction time. Is that correct? The comment made me think it was just saving the tile sizes, which didn't sound useful.
Those LoopNests are the ones generated from the tiling options. I think it's a combination of saving LoopNest construction as well as tiling generation.
I just updated the comment; hopefully it makes the cache description clearer?
In the blocks case, it's saving all the compute_root level loop nests. Most importantly, this includes their featurizations, which is the main motivation. Maybe update the comments to clarify this.
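To make the blocks case concrete, here is a minimal sketch of memoizing a whole set of compute_root-level child loop nests, featurizations included, so that a later request skips both tiling generation and loop nest construction. Everything named here (`ChildNest`, `Block`, `BlockCache`, `get_or_generate`, and the integer key) is an assumption made for illustration; the PR's actual key and data layout are not shown in this thread.

```cpp
#include <map>
#include <memory>
#include <vector>

// Hypothetical child loop nest: a tiling plus its already-computed features.
struct ChildNest {
    std::vector<int> tiling;
    std::vector<double> features;
};

// One cached "block": all child nests enumerated for one key.
using Block = std::vector<std::shared_ptr<const ChildNest>>;

class BlockCache {
    std::map<int, Block> blocks_;  // key: assumed id of the Func being scheduled

public:
    int hits = 0;
    int misses = 0;

    // Returns the cached block, or runs `generate` once and stores its result.
    template<typename Generator>
    const Block &get_or_generate(int func_id, Generator generate) {
        auto it = blocks_.find(func_id);
        if (it != blocks_.end()) {
            ++hits;
            return it->second;
        }
        ++misses;
        return blocks_.emplace(func_id, generate()).first->second;
    }
};
```

Sharing the nests as pointers to const objects means a hit copies only a handful of pointers, which matches the point above that the main saving is the featurizations carried by the cached nests rather than the tile sizes alone.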
The changes I requested look good.
* add feature caching and block caching to adams2019 autoscheduler
* added caching verification for feautures
* add caching docstrings
Adds caching of features and schedule enumerations to the adams2019 autoscheduler. Backported from @aekul's autoscheduler work.
On my machine, I see a 2x speedup in the time needed to autoschedule lens blur and local laplacian, about a 1.5x speedup for resnet50, and anywhere between 1x and 1.5x for other pipelines. Caching is only useful for larger pipelines; it has little to no effect on smaller ones.
In the initial version of this PR, caching was disabled by default and was enabled by setting these parameters:
HL_USE_MEMOIZED_FEATURES=1
HL_MEMOIZE_BLOCKS=1
Additionally, in order to test feature caching, setting the following value enables feature caching verification (this will be quite slow):
HL_VERIFY_MEMOIZED_FEATURES=1
In the current version, caching of schedule enumerations and features is enabled by default. To disable it, set the corresponding disable environment variables; the Cache.h comment quoted in the review above names HL_DISABLE_MEMOIZED_FEATURES=1 for feature caching.
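As a rough illustration of how such environment variables might be consumed, the sketch below reads them once into a plain struct so that the rest of the code does not consult global state repeatedly. The helper `env_is_one`, the struct `CachingOptions`, and the function `read_caching_options` are hypothetical names, not the PR's API; the only names taken from this page are `HL_DISABLE_MEMOIZED_FEATURES` and `HL_VERIFY_MEMOIZED_FEATURES`, and treating exactly "1" as the trigger value is an assumption based on the `!=1` wording in the Cache.h comment.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: true only when the variable is set to exactly "1".
inline bool env_is_one(const char *name) {
    const char *value = std::getenv(name);
    return value != nullptr && std::string(value) == "1";
}

// Hypothetical options struct, filled in once at startup.
struct CachingOptions {
    bool cache_features;   // on unless explicitly disabled
    bool verify_features;  // off unless explicitly requested (slow)
};

inline CachingOptions read_caching_options() {
    CachingOptions opts;
    opts.cache_features = !env_is_one("HL_DISABLE_MEMOIZED_FEATURES");
    opts.verify_features = env_is_one("HL_VERIFY_MEMOIZED_FEATURES");
    return opts;
}
```

Passing a struct like this down explicitly is one possible way to limit the "mutable globals" concern raised earlier in the conversation, since the environment is then read in a single place.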
Tests were also added to verify these caching methods. Note that there will likely be no speedup on these tests, as the pipelines are not large enough.
This PR was originally #5654 before it was split into two.