
Conversation


@lw lw commented Jan 16, 2025

Stack from ghstack (oldest at bottom):

Add a FLOP formula for torch._scaled_mm so that FlopCounterMode accounts for it, which will make it work correctly with the partitioner's AutoAC.

[ghstack-poisoned]
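For context, a minimal sketch (not from the PR) of what this enables: FlopCounterMode attributing FLOPs to torch._scaled_mm, which the partitioner's AutoAC can then use when deciding what to recompute. This assumes a CUDA/ROCm device with FP8 support (e.g. H100 or MI300); shapes and dtypes mirror the test added in this PR.

```python
# Minimal sketch, assuming a GPU with FP8 support; mirrors the test in this PR.
import torch
from torch.utils.flop_counter import FlopCounterMode

dtype = torch.float8_e4m3fnuz if torch.version.hip else torch.float8_e4m3fn
a = torch.randn((48, 80), device="cuda").to(dtype)       # (M, K)
b = torch.randn((112, 80), device="cuda").to(dtype).t()  # (K, N) after the transpose

with FlopCounterMode(display=False) as mode:
    torch._scaled_mm(
        a, b,
        scale_a=torch.ones((), device="cuda"),
        scale_b=torch.ones((), device="cuda"),
        out_dtype=torch.bfloat16,
    )

# With this PR, the call above is attributed FLOPs instead of being counted as zero.
print(mode.get_total_flops())
```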
@lw lw requested a review from albanD as a code owner January 16, 2025 16:50

pytorch-bot bot commented Jan 16, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/144973

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit b8ea47e with merge base 62ce3e6:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

lw added a commit that referenced this pull request Jan 16, 2025
This will make it work correctly with the partitioner's AutoAC

ghstack-source-id: de015a1
Pull Request resolved: #144973
@lw lw added the ciflow/trunk (Trigger trunk jobs on your pull request), topic: not user facing (topic category), and ciflow/rocm (Trigger "default" config CI on ROCm) labels Jan 16, 2025
Comment on lines +844 to +853
def test_scaled_mm(self):
    dtype = torch.float8_e4m3fnuz if torch.version.hip else torch.float8_e4m3fn
    with FlopCounterMode() as mode:
        torch._scaled_mm(
            torch.randn((3 * 16, 5 * 16), device="cuda").to(dtype),
            torch.randn((7 * 16, 5 * 16), device="cuda").to(dtype).t(),
            scale_a=torch.ones((), device="cuda"),
            scale_b=torch.ones((), device="cuda"),
            out_dtype=torch.bfloat16,
        )
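(For reference: under the usual dense-matmul count of 2 * M * K * N FLOPs, and assuming scaled_mm is counted the same way, this call with M = 48, K = 80, N = 112 should be attributed 2 * 48 * 80 * 112 = 860,160 FLOPs.)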
Collaborator

Similar change suggested in previous PR.

#144872 (review)

@albanD albanD requested a review from Chillee January 16, 2025 16:58
Collaborator

@jeffdaily jeffdaily left a comment

Nevermind, I just noticed you're not putting this test in test_matmul_cuda.py. Your change here looks good. Thanks for supporting MI300.

"mm_flop",
"normalize_tuple",
"register_flop_formula",
"scaled_mm_flop",
Collaborator

Adding a new entry to this file and to the allowlist below is NEVER ok.
If you want this function to be public, you need to properly expose and document it.
My guess is that in this case you don't want it to be public, so you should prepend an _ to it!

Contributor Author

I'm following the convention of all other functions in that file. Did I miss some aspect of it, or are you saying that all those functions are also "bad" and I should break the convention instead?
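For readers skimming the thread: register_flop_formula is the decorator in torch.utils.flop_counter that associates an op with a FLOP-counting function, and it is the mechanism this PR uses for torch._scaled_mm. A rough sketch of the pattern, using aten.matmul purely as an illustrative target rather than an op that already has a formula registered:

```python
# A sketch of the registration pattern (illustrative target op; not from this PR).
import torch
from torch.utils.flop_counter import register_flop_formula

@register_flop_formula(torch.ops.aten.matmul)
def matmul_flop(a_shape, b_shape, *args, out_shape=None, **kwargs) -> int:
    # Formulas receive shapes rather than tensors: a is (M, K), b is (K, N).
    m, k = a_shape
    k2, n = b_shape
    assert k == k2
    # Same convention as mm_flop: one multiply and one add per output element per K.
    return 2 * m * n * k
```

FlopCounterMode converts the op's tensor arguments to shapes and calls the registered formula whenever that op runs under the mode.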

[ghstack-poisoned]
lw added a commit that referenced this pull request Jan 16, 2025
This will make it work correctly with the partitioner's AutoAC

ghstack-source-id: fffd3fd
Pull Request resolved: #144973
Contributor Author

lw commented Jan 17, 2025

@pytorchbot merge -f "Still failing because of torchtune, unrelated"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort and instead consider -i/--ignore-current to continue the merge while ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.


Labels

ciflow/rocm (Trigger "default" config CI on ROCm), ciflow/trunk (Trigger trunk jobs on your pull request), Merged, topic: not user facing (topic category)
