Let scan tuning policy choose warpspeed or not #8158
Draft
bernhardmgruber wants to merge 11 commits into NVIDIA:main
CI Workflow Results: 🟥 Finished in 35m 35s: Pass: 3%/255 | Total: 1d 02h | Max: 23m 29s | Hits: 89%/8237
Force-pushed: 7246814 to 6b84735
The motivation for this change is that users can now pick the underlying implementation and enforce that choice. If the chosen implementation is not viable (e.g. it exceeds 48 KiB of shared memory, needs too many registers, or the PTX ISA version is too low), compilation fails.
This is also what we need for tuning. We want to enforce a certain scan implementation and fail to compile if the tuning framework selects parameters that are not viable.
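The enforcement described above can be illustrated with a minimal, host-only C++ sketch. Every name in it (`scan_algorithm`, `scan_policy`, `tile_smem_bytes`, `is_viable`, `select_scan`) is hypothetical and is not the CUB API; the shared memory model is a deliberate simplification, and only the 48 KiB limit is taken from the description. It shows a policy that names its implementation and a dispatch step that refuses to compile rather than silently falling back.

```cpp
#include <cstddef>

// Hypothetical names throughout; this is an illustration, not the CUB API.
enum class scan_algorithm { lookback, warpspeed };

// Static shared memory limit mentioned in the description (48 KiB).
constexpr std::size_t max_static_smem_bytes = 48 * 1024;

// A hypothetical tuning policy that states which implementation it wants.
template <scan_algorithm Algorithm, int BlockThreads, int ItemsPerThread>
struct scan_policy
{
  static constexpr scan_algorithm algorithm = Algorithm;
  static constexpr int block_threads        = BlockThreads;
  static constexpr int items_per_thread     = ItemsPerThread;
};

// Simplified model (an assumption): one tile of T is staged in shared memory.
template <typename Policy, typename T>
constexpr std::size_t tile_smem_bytes()
{
  return sizeof(T) * Policy::block_threads * Policy::items_per_thread;
}

// A policy is viable under this model if its tile fits into the limit.
template <typename Policy, typename T>
constexpr bool is_viable()
{
  return tile_smem_bytes<Policy, T>() <= max_static_smem_bytes;
}

// The dispatch honors the policy's choice. There is no fallback:
// a non-viable policy triggers the static_assert and fails to compile.
template <typename Policy, typename T>
constexpr scan_algorithm select_scan()
{
  static_assert(is_viable<Policy, T>(),
                "chosen scan implementation exceeds the shared memory limit");
  return Policy::algorithm;
}
```

Under these assumptions, `select_scan<scan_policy<scan_algorithm::warpspeed, 128, 8>, int>()` compiles and yields `scan_algorithm::warpspeed`, while instantiating it with `scan_policy<scan_algorithm::warpspeed, 512, 32>` and `double` is rejected at compile time. A tuning framework could treat such a compile failure as "this parameter combination is not viable" and skip it.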
cub.bench.scan.exclusive.sum.base on SM75;80;86;90;100;120

Fixes: #8028