Conversation

@drisspg drisspg commented Oct 23, 2023

Summary

Adds the option to use fast_accumulation_mode for the fp8 matmul in scaled_mm

Information can be found here: https://docs.nvidia.com/cuda/cublas/#cublasltmatmuldescattributes-t
Defaults to 0 (off).

cc @yanbing-j @vkuzo @albanD @kadeng
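The flag maps onto cuBLASLt's fast-accumulation matmul descriptor attribute, which keeps fp8 partial sums in reduced precision for speed at some cost in accuracy. As a rough, hardware-free illustration of that tradeoff (a sketch, not the PR's code; the helper name and the float16 stand-in for a low-precision accumulator are illustrative), compare a dot product accumulated in float32 against one whose running sum is rounded to float16 after every step:

```python
# Illustrative sketch: why a low-precision accumulator is faster but less
# accurate. Real fp8 matmuls with fast accumulation keep partial sums in
# reduced precision on the tensor cores; here we mimic that in NumPy by
# rounding the running sum to float16 after each add. No GPU required.
import numpy as np

rng = np.random.default_rng(0)
k = 4096
a = rng.standard_normal(k).astype(np.float32)
b = rng.standard_normal(k).astype(np.float32)

# Reference result: accumulate in float64.
ref = float(np.dot(a.astype(np.float64), b.astype(np.float64)))

def dot_accum(a, b, acc_dtype):
    """Dot product whose running sum is rounded to acc_dtype each step."""
    s = acc_dtype(0)
    for x, y in zip(a, b):
        s = acc_dtype(s + acc_dtype(x) * acc_dtype(y))
    return float(s)

full = dot_accum(a, b, np.float32)  # higher-precision accumulator
fast = dot_accum(a, b, np.float16)  # mimics a fast, low-precision accumulator

print(f"fp32-accumulator error: {abs(full - ref):.6f}")
print(f"fp16-accumulator error: {abs(fast - ref):.6f}")
```

The low-precision accumulator's error grows with the reduction length, which is why the option ships off by default (matching the 0 default above); see the linked cuBLASLt documentation for the actual descriptor attribute.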


pytorch-bot bot commented Oct 23, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/111847

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 78c31e8 with merge base 93a9b13:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@drisspg drisspg requested review from malfet and ipiszy October 23, 2023 21:39

@ipiszy ipiszy left a comment


Thanks @drisspg!

ScalarType bias_dtype,
void* result_ptr,
const void* result_scale_ptr,
const void *result_scale_ptr,

nit: void* to keep it consistent with lines above and below?


drisspg commented Oct 23, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 23, 2023
@pytorchmergebot

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team: raised by workflow job.

@drisspg drisspg added the topic: not user facing and module: floatx (formerly float8) labels Oct 23, 2023

drisspg commented Oct 23, 2023

@pytorchbot merge

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot

Merge failed

Reason: 1 jobs have failed, first few of them are: linux-binary-manywheel / manywheel-py3_8-cuda11_8-test / test

Details for Dev Infra team: raised by workflow job.

@malfet

malfet commented Oct 24, 2023

@pytorchbot merge -i

@pytorchmergebot

Merge started

Your change will be merged while ignoring the following 0 checks:

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

xuhancn pushed a commit to xuhancn/pytorch that referenced this pull request Nov 7, 2023
# Summary
Adds the option to use fast_accumulation_mode for the fp8 matmul in scaled_mm

Information can be found here: https://docs.nvidia.com/cuda/cublas/#cublasltmatmuldescattributes-t
defaults to 0 (off)

Pull Request resolved: pytorch#111847
Approved by: https://github.com/ipiszy, https://github.com/malfet
Skylion007 pushed a commit to Skylion007/pytorch that referenced this pull request Nov 14, 2023
Labels
- ciflow/trunk: Trigger trunk jobs on your pull request
- Merged
- module: floatx (formerly float8): For torch.float8_e5m2 and torch.float8_e4m3 and other sub 8-bit float types
- topic: not user facing: topic category
4 participants