Conversation

@eqy eqy commented Oct 2, 2025

This was originally landed in #159915 but was implicitly reverted in #161957, perhaps due to an incorrect rebase.

This avoids failures that look like `1 / 25165824 (0.0%) mismatches`.

cc @ptrblck @msaroufim @jerryzh168 @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben
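To illustrate the failure mode being fixed: a hypothetical, stdlib-only sketch (not the actual PR diff, which edits per-test tolerances in `test_max_autotune.py`) of why a single element out of ~25 million failing a tight tolerance check fails the whole comparison, and why a small tolerance bump makes it pass. The `count_mismatches` helper and the specific values are illustrative assumptions.

```python
# Hypothetical sketch: a lone element just outside a tight absolute
# tolerance produces a "1 / N (0.0%) mismatches" style failure.

def count_mismatches(actual, expected, atol, rtol):
    """Count elements where |a - e| > atol + rtol * |e| (allclose-style rule)."""
    return sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )

expected = [0.0] * 8
actual = list(expected)
actual[0] = 2e-4  # one element slightly off, e.g. from reduced-precision accumulation

# Tight tolerance: the single outlier fails the comparison.
print(count_mismatches(actual, expected, atol=1e-4, rtol=0.0))  # 1

# Bumped tolerance: the same data passes.
print(count_mismatches(actual, expected, atol=3e-4, rtol=0.0))  # 0
```

In the real tests the comparison is done by `torch.testing.assert_close`, which reports mismatch counts in exactly this `k / N (x%)` form.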

@eqy eqy added module: cuda Related to torch.cuda, and CUDA support in general open source topic: not user facing topic category Blackwell Specific failures or issues related to sm100 + Cuda arches ciflow/b200 labels Oct 2, 2025

pytorch-bot bot commented Oct 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/164480

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit f514aca with merge base c632952:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.


eqy commented Oct 2, 2025

CC @Aidyn-A, who re-encountered this issue.

@eqy eqy added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 3, 2025

eqy commented Oct 9, 2025

@pytorchmergebot merge

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot

Merge failed

Reason: Command `git -C /home/runner/work/pytorch/pytorch commit --author="Eddie Yan <eddiey@nvidia.com>" -m "[CUDA][Inductor][B200] re-bump tolerances for test_baddmm in test_max_autotune.py (#164480)

Was originally landed in #159915 but was implicitly reverted #161957 perhaps due to incorrect rebase

Avoid failures that look like 1 / 25165824 (0.0%) mismatches

Pull Request resolved: #164480
Approved by: https://github.com/Skylion007, https://github.com/Aidyn-A
` returned non-zero exit code 1

On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean


Aidyn-A commented Oct 9, 2025

It seems the same change has landed in #164022.
One concerning thing is that they increased the tolerances for mm_plus_mm too much. We will need to follow up on that.

@eqy please consider closing this PR.

@eqy eqy closed this Oct 9, 2025
