[logging] dynamo_timed the synchronize in CachingAutotuner make_launchers #157747
Conversation
Summary: There's some evidence that some very long compile times are actually attributable to the sync. This should make it easier to say for sure. [ghstack-poisoned]
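For context, a minimal sketch of the kind of change described here, assuming the context-manager form of dynamo_timed from torch._dynamo.utils; the event name and the standalone wrapper function are illustrative, not the PR's exact code:

```python
# A minimal sketch (assumption: dynamo_timed is usable as a context manager
# with an event-name key; "CachingAutotuner.synchronize" is illustrative).
import torch
from torch._dynamo.utils import dynamo_timed

def timed_synchronize() -> None:
    # Wrapping the synchronize in dynamo_timed makes the wait show up as
    # its own event in compile-time logs and the perfetto trace, instead
    # of being folded into whatever surrounding phase happens to contain it.
    with dynamo_timed("CachingAutotuner.synchronize"):
        torch.cuda.synchronize()

if torch.cuda.is_available():
    timed_synchronize()
```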
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/157747
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit a8098df with merge base 28aae93.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This is already recorded in ODS - do we want it here too? (One advantage of being here is that it'll be in the perfetto trace)
I was only shooting for perfetto. But I didn't know it was already recorded in ODS. Where is that happening? Is there some underlying counter that captures all cuda.synchronize calls?
It's in c10/cuda/CUDAFunctions.cpp::device_synchronize(): all STATIC_SCOPED_WAIT_COUNTERS automatically get logged to ODS.
@aorenste, ah thanks. This PR is in response to https://fburl.com/workplace/it2q0qzs. I thought it would be easier to see how much overhead there is in this specific cuda synchronize.
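As a rough way to see how large that overhead can be, here is a hypothetical standalone micro-benchmark (not part of this PR) that times a synchronize after queuing asynchronous kernel launches:

```python
import time
import torch

# Hypothetical micro-benchmark: kernel launches return immediately, so the
# cost of all queued work is paid at the synchronize; its wall time is the
# hidden wait this PR surfaces as a separate timed event.
if torch.cuda.is_available():
    x = torch.randn(4096, 4096, device="cuda")
    for _ in range(20):
        y = x @ x  # launched asynchronously; returns before the GPU finishes
    t0 = time.perf_counter()
    torch.cuda.synchronize()  # blocks until every queued kernel completes
    print(f"synchronize waited {time.perf_counter() - t0:.4f}s")
```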
@pytorchbot merge |
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge |
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot rebase |
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.

Successfully rebased.
@pytorchbot merge |
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov