
Enable BFloat16 for logaddexp & logaddexp2 on CUDA #57908


Closed
imaginary-person wants to merge 5 commits

Conversation

@imaginary-person (Contributor) commented May 9, 2021

Enabled BFloat16 for logaddexp & logaddexp2 on CUDA, with a workaround suggested by @zasdfgbnm.
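For context, "enabling BFloat16" here boils down to adding the dtype to the kernel's dispatch macro. A minimal before/after sketch, assuming the standard AT_DISPATCH naming and that Half was already dispatched (the exact macro arguments and lambda body in the merged diff may differ):

// Sketch only: the PR's actual dispatch call may differ in detail.
// Before: dispatch over the default floating types plus Half.
AT_DISPATCH_FLOATING_TYPES_AND(at::ScalarType::Half,
    iter.dtype(), "logaddexp_cuda", [&]() { /* gpu_kernel(...) */ });

// After: BFloat16 added alongside Half.
AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16,
    iter.dtype(), "logaddexp_cuda", [&]() { /* gpu_kernel(...) */ });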

@facebook-github-bot (Contributor) commented May 9, 2021

💊 CI failures summary and remediations

As of commit 700d612 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

@imaginary-person (Contributor, Author)

Compilation fails on Windows

@imaginary-person marked this pull request as ready for review May 10, 2021 19:58
@imaginary-person (Contributor, Author)

Hey @zasdfgbnm, enabling BFloat16 for logaddexp & logaddexp2 on CUDA makes the Windows build fail with the following error:

C:/Users/circleci/project/aten/src/ATen/native/cuda/LogAddExpKernel.cu(13): error:
calling a __host__ function("isinf< ::c10::BFloat16> ") from a __host__ __device__
function("at::native::logaddexp_kernel_cuda(::at::TensorIteratorBase &)::[lambda()
(instance 1)]::operator ()() const::[lambda() (instance 6)]::operator ()() const::
[lambda( ::c10::BFloat16, ::c10::BFloat16) (instance 1)]::operator () const")
is not allowed

Can you please help fix this? Thank you!

@zasdfgbnm (Collaborator)

Let me try something

@zasdfgbnm (Collaborator)

I think ::isinf doesn't work for half either; see: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/test/cuda_half_test.cu#L70

You can work around this by doing:

#include <ATen/AccumulateType.h>

// acc_type maps BFloat16 to float on CUDA, and ::isinf(float) has a proper
// __device__ overload, so the __host__ isinf<c10::BFloat16> is never hit.
using accscalar_t = at::acc_type<scalar_t, /*is_cuda=*/true>;
::isinf(static_cast<accscalar_t>(a));

This workaround is already being used in https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/IGammaKernel.cu#L403
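Putting the pieces together, inside the dispatch lambda the fixed kernel would look roughly like the sketch below: all arithmetic and the ::isinf check happen in the accumulation type, which is float for BFloat16 on CUDA. This is a sketch under that assumption, not the exact merged diff.

#include <ATen/AccumulateType.h>

// Sketch: accscalar_t is float for BFloat16 (and Half) on CUDA, so the
// ::isinf call below resolves to a __device__ overload.
using accscalar_t = at::acc_type<scalar_t, /*is_cuda=*/true>;
gpu_kernel(iter, [] GPU_LAMBDA (scalar_t a_, scalar_t b_) -> scalar_t {
  const auto a = static_cast<accscalar_t>(a_);
  const auto b = static_cast<accscalar_t>(b_);
  if (::isinf(a) && a == b) {
    return a_;  // guard the a == b == +/-inf case; avoids inf - inf below
  }
  // Numerically stable log(exp(a) + exp(b)) = max + log1p(exp(-|a - b|)).
  const auto m = ::max(a, b);
  return m + ::log1p(::exp(-::fabs(a - b)));
});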

@bdhirsh requested a review from ngimel May 10, 2021 22:02
@bdhirsh added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) May 10, 2021
@codecov (bot) commented May 11, 2021

Codecov Report

Merging #57908 (700d612) into master (e8fb167) will increase coverage by 0.00%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #57908   +/-   ##
=======================================
  Coverage   76.83%   76.83%           
=======================================
  Files        1984     1984           
  Lines      197144   197144           
=======================================
+ Hits       151471   151480    +9     
+ Misses      45673    45664    -9     

@facebook-github-bot (Contributor)

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ngimel (Collaborator) commented May 11, 2021

Thanks!

@facebook-github-bot (Contributor)

@ngimel merged this pull request in 067147a.

krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
Summary:
Enabled BFloat16 for `logaddexp` & `logaddexp2` on CUDA, with a [workaround](pytorch#57908 (comment)) suggested by zasdfgbnm.

Pull Request resolved: pytorch#57908

Reviewed By: mruberry

Differential Revision: D28344976

Pulled By: ngimel

fbshipit-source-id: edef654b5819b236fbd9996f962115beb6e147e1
Labels
cla signed, Merged, open source, triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
Projects
None yet

6 participants