Enable BFloat16 for `logaddexp` & `logaddexp2` on CUDA #57908
Conversation
💊 **CI failures summary and remediations**

As of commit 700d612 (more details on the Dr. CI page):

💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI.
**Compilation fails on Windows**

Hey @zasdfgbnm, while enabling BFloat16 for `logaddexp` & `logaddexp2`, compilation fails on Windows. Can you please help fix this? Thank you!
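For reference, a minimal sketch of the pattern that appears to break the Windows build, assuming the kernel calls `::isinf` directly on the `BFloat16` operand (the helper name here is hypothetical):

```cpp
#include <c10/util/BFloat16.h>

// Hypothetical reduction of the failure: at::BFloat16 has no direct ::isinf
// overload, so MSVC rejects (or ambiguously resolves) this call when
// scalar_t = at::BFloat16, even though gcc/clang builds accept it.
template <typename scalar_t>
__device__ bool check_inf(scalar_t a) {
  return ::isinf(a); // breaks on Windows for scalar_t = at::BFloat16
}
```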
Let me try something.
I think `::isinf` doesn't resolve for `BFloat16` on MSVC, and you can work around this by casting to the accumulate type first:

```cpp
#include <ATen/AccumulateType.h>

using accscalar_t = at::acc_type<scalar_t, /*is_cuda=*/true>;
::isinf(static_cast<accscalar_t>(a));
```

This workaround is already being used in https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/IGammaKernel.cu#L403
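For context, here is a sketch of how that workaround could slot into the `logaddexp` CUDA kernel; the surrounding structure is an assumption modeled on other ATen binary kernels, not a quote of this PR's diff:

```cpp
#include <ATen/AccumulateType.h>
#include <ATen/Dispatch.h>
#include <ATen/native/cuda/Loops.cuh>

void logaddexp_kernel_cuda(at::TensorIteratorBase& iter) {
  AT_DISPATCH_FLOATING_TYPES_AND2(
      at::ScalarType::BFloat16, at::ScalarType::Half,
      iter.dtype(), "logaddexp_cuda", [&]() {
        // Cast to the accumulate type up front so ::isinf (and the math
        // below) sees float/double, sidestepping the missing BFloat16
        // overload on Windows.
        using accscalar_t = at::acc_type<scalar_t, /*is_cuda=*/true>;
        at::native::gpu_kernel(
            iter, [] GPU_LAMBDA(scalar_t a_, scalar_t b_) -> scalar_t {
              const auto a = static_cast<accscalar_t>(a_);
              const auto b = static_cast<accscalar_t>(b_);
              if (::isinf(a) && a == b) {
                return a; // avoid inf - inf = nan in the branch below
              }
              const auto m = ::max(a, b);
              return m + ::log1p(::exp(-::abs(a - b)));
            });
      });
}
```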
**Codecov Report**

```
@@            Coverage Diff            @@
##           master   #57908     +/-   ##
=========================================
  Coverage   76.83%   76.83%
=========================================
  Files        1984     1984
  Lines      197144   197144
=========================================
+ Hits       151471   151480         +9
+ Misses      45673    45664         -9
```
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thanks!
Summary: Enabled BFloat16 for `logaddexp` & `logaddexp2` on CUDA, with a [workaround](pytorch#57908 (comment)) suggested by @zasdfgbnm.

Pull Request resolved: pytorch#57908
Reviewed By: mruberry
Differential Revision: D28344976
Pulled By: ngimel
fbshipit-source-id: edef654b5819b236fbd9996f962115beb6e147e1
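With this merged, a quick smoke test in C++ might look like the following; it assumes a CUDA-enabled libtorch build (before this change, the BFloat16 dispatch on CUDA would raise a "not implemented" error):

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  auto opts = torch::TensorOptions()
                  .device(torch::kCUDA)
                  .dtype(torch::kBFloat16);
  auto a = torch::randn({4}, opts);
  auto b = torch::randn({4}, opts);
  // Both ops now dispatch for BFloat16 on CUDA.
  std::cout << torch::logaddexp(a, b) << "\n";
  std::cout << torch::logaddexp2(a, b) << "\n";
}
```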