Conversation

@eqy eqy commented Oct 2, 2025

This was originally landed in #159915 but was implicitly reverted in #161957, perhaps due to an incorrect rebase.

This avoids failures that look like `1 / 25165824 (0.0%) mismatches`.

cc @ptrblck @msaroufim @jerryzh168 @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben
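To illustrate the failure mode being fixed: a hypothetical, stdlib-only sketch (not the actual PR diff, which edits per-test tolerances in `test_max_autotune.py`) of why a single element out of ~25 million failing a tight tolerance check fails the whole comparison, and why a small tolerance bump makes it pass. The `count_mismatches` helper and the specific values are illustrative assumptions.

```python
# Hypothetical sketch: a lone element just outside a tight absolute
# tolerance produces a "1 / N (0.0%) mismatches" style failure.

def count_mismatches(actual, expected, atol, rtol):
    """Count elements where |a - e| > atol + rtol * |e| (allclose-style rule)."""
    return sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )

expected = [0.0] * 8
actual = list(expected)
actual[0] = 2e-4  # one element slightly off, e.g. from reduced-precision accumulation

# Tight tolerance: the single outlier fails the comparison.
print(count_mismatches(actual, expected, atol=1e-4, rtol=0.0))  # 1

# Bumped tolerance: the same data passes.
print(count_mismatches(actual, expected, atol=3e-4, rtol=0.0))  # 0
```

In the real tests the comparison is done by `torch.testing.assert_close`, which reports mismatch counts in exactly this `k / N (x%)` form.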

@eqy eqy added module: cuda Related to torch.cuda, and CUDA support in general open source topic: not user facing topic category Blackwell Specific failures or issues related to sm100 + Cuda arches ciflow/b200 labels Oct 2, 2025

pytorch-bot bot commented Oct 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/164480

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit f514aca with merge base c632952:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.


eqy commented Oct 2, 2025

CC @Aidyn-A, who re-encountered this issue.

@eqy eqy added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 3, 2025

eqy commented Oct 9, 2025

@pytorchmergebot merge

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot

Merge failed

Reason: Command `git -C /home/runner/work/pytorch/pytorch commit --author="Eddie Yan <eddiey@nvidia.com>" -m "[CUDA][Inductor][B200] re-bump tolerances for test_baddmm in test_max_autotune.py (#164480)

Was originally landed in #159915 but was implicitly reverted #161957 perhaps due to incorrect rebase

Avoid failures that look like 1 / 25165824 (0.0%) mismatches

Pull Request resolved: #164480
Approved by: https://github.com/Skylion007, https://github.com/Aidyn-A
` returned non-zero exit code 1

On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean


Aidyn-A commented Oct 9, 2025

It seems the same change has landed in #164022.
One concerning thing is that they increased the tolerances for mm_plus_mm too much. We will need to follow up on that.

@eqy please consider closing this PR.

@eqy eqy closed this Oct 9, 2025
