test only smaller block_k for mm_plus_mm #96385
Conversation
Dr. CI (automated): 🔗 See artifacts and rendered test results at hud.pytorch.org/pr/96385. Note: links to docs will display an error until the docs builds have completed.
❗ There is 1 active merge-blocking SEV.
❌ 1 new failure as of commit fe000f8.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Nice, thank you for figuring this out!
Just a few comments inline:
torch/_inductor/kernel/mm_plus_mm.py
Outdated
# Splitting this into two loops causes an internal triton LLVM error
# https://github.com/openai/triton/issues/967
This comment is stale now, right?
Yep, deleted
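(For context on the two-loops remark: the kernel accumulates both products in a single reduction loop over K. Below is a minimal standalone sketch of that structure, not the actual inductor template; it assumes row-major contiguous float32 inputs with M, N, and K divisible by the block sizes. Both products must share the same K, which is why the template can fuse them into one loop.)

import torch
import triton
import triton.language as tl

@triton.jit
def mm_plus_mm_kernel(A, B, C, D, O, M, N, K,
                      BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr,
                      BLOCK_K: tl.constexpr):
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    rm = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    rn = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    # One fused reduction loop: acc += a@b + c@d per K-tile. Splitting this
    # into two separate K-loops is what used to trip openai/triton#967.
    for k0 in range(0, K, BLOCK_K):
        rk = k0 + tl.arange(0, BLOCK_K)
        a = tl.load(A + rm[:, None] * K + rk[None, :])
        b = tl.load(B + rk[:, None] * N + rn[None, :])
        c = tl.load(C + rm[:, None] * K + rk[None, :])
        d = tl.load(D + rk[:, None] * N + rn[None, :])
        acc += tl.dot(a, b)
        acc += tl.dot(c, d)
    tl.store(O + rm[:, None] * N + rn[None, :], acc)

def mm_plus_mm(a, b, c, d, BLOCK_M=32, BLOCK_N=32, BLOCK_K=32):
    M, K = a.shape
    _, N = b.shape
    out = torch.empty((M, N), device=a.device, dtype=torch.float32)
    grid = (triton.cdiv(M, BLOCK_M), triton.cdiv(N, BLOCK_N))
    mm_plus_mm_kernel[grid](a, b, c, d, out, M, N, K,
                            BLOCK_M=BLOCK_M, BLOCK_N=BLOCK_N, BLOCK_K=BLOCK_K)
    return out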
torch/_inductor/kernel/mm_plus_mm.py
Outdated
# rematerialize rm and rn to save registers
rm = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
rn = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
#rm = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
#rn = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
Is this rematerialization a bad idea now, or is it temporary? Probably we should either delete it or drop in a comment describing why it's temporary.
Didn't see any difference with or without it; deleted.
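(For reference, "rematerializing" rm and rn means recomputing them in the epilogue rather than keeping the pre-loop values live across the whole K-loop, so the register allocator can reuse those registers inside the loop. In the sketch after the first thread above, the epilogue of mm_plus_mm_kernel with the trick applied would read as follows; this is a fragment of that sketch, illustrative only:)

# ... end of the K-loop in mm_plus_mm_kernel above ...
# rematerialize rm and rn to save registers: recomputing them here means
# they are dead during the loop instead of pinned in registers across it
rm = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
rn = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
tl.store(O + rm[:, None] * N + rn[None, :], acc)

As the thread concludes, it made no measurable difference for this kernel, so the PR deletes it.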
torch/_inductor/kernel/mm_plus_mm.py
Outdated
(mat1, mat2, mat3, mat4),
layout,
**mm_options(config, k, layout),
if config.kwargs['BLOCK_K'] < k:
Maybe add a comment with a pointer to the triton issue so we can revisit someday.
Added
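(The filter under discussion, sketched in context. mm_configs, mm_plus_mm_template, maybe_append_choice, and choices are assumed names modeled on the surrounding inductor code; only the BLOCK_K guard is the point:)

for config in mm_configs():
    # see https://github.com/triton-lang/triton/issues/1298
    # configs whose BLOCK_K does not actually tile the reduction
    # (BLOCK_K >= k) hit the triton bug above, so skip them
    if config.kwargs["BLOCK_K"] < k:
        mm_plus_mm_template.maybe_append_choice(
            choices,
            (mat1, mat2, mat3, mat4),
            layout,
            **mm_options(config, k, layout),
        )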
@pytorchbot merge -f "dla102 test flaky"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Trim number of tested mm_plus_mm configs to work around triton-lang/triton#1298
Pull Request resolved: pytorch/pytorch#96385
Approved by: https://github.com/bertmaher, https://github.com/jansel
cc @soumith @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire
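(For end-user context, mm_plus_mm is the inductor lowering for the pattern mm(a, b) + mm(c, d). A minimal usage sketch follows; whether the fused kernel is actually chosen depends on inductor's settings and autotuning, so this only illustrates the pattern being matched:)

import torch

def f(a, b, c, d):
    # inductor can pattern-match this into a single fused mm_plus_mm kernel,
    # which is where the BLOCK_K-filtered autotune configs above are used
    return a @ b + c @ d

a, b = torch.randn(64, 32, device="cuda"), torch.randn(32, 64, device="cuda")
c, d = torch.randn(64, 32, device="cuda"), torch.randn(32, 64, device="cuda")

compiled = torch.compile(f)
torch.testing.assert_close(compiled(a, b, c, d), f(a, b, c, d))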