malfet
Contributor

@malfet malfet commented Jul 10, 2020

For the Vec256<bfloat16>::blendv() operator to work correctly, float32 -nan (0xffffffff) must be converted to bfloat16 -nan (0xffff).
But cvtfp32_bf16 converts -nan to nan (0x7fc0).
TODO: Fix float32 +-nan conversion, i.e. float32 nan (0x7fffffff) must be converted to bfloat16 nan (0x7fff).
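
To make the bit patterns above concrete, here is a minimal Python sketch of a float32-to-bfloat16 conversion on raw bit patterns. It is not the code from this PR: the function name `fp32_to_bf16_bits` and the `preserve_nan_sign` flag are hypothetical, and round-to-nearest-even for non-NaN values is an assumption about how the conversion rounds. With the flag off, every NaN collapses to 0x7fc0 as described above; with it on, the sign bit survives, which matches the mapping the TODO asks for.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def is_nan_bits(bits: int) -> bool:
    """A float32 is NaN when the exponent is all ones and the mantissa is non-zero."""
    return (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0


def fp32_to_bf16_bits(bits: int, preserve_nan_sign: bool) -> int:
    """Hypothetical helper: convert a float32 bit pattern to a bfloat16 bit pattern.

    Non-NaN values are rounded to nearest-even on the 16 dropped bits
    (an assumption about the rounding mode, labeled here as such).
    """
    if is_nan_bits(bits):
        if preserve_nan_sign:
            # Keep the sign bit and set all remaining bits:
            # 0xffffffff -> 0xffff, 0x7fffffff -> 0x7fff.
            return ((bits >> 16) & 0x8000) | 0x7FFF
        # Collapse every NaN to a single positive quiet NaN.
        return 0x7FC0
    lsb = (bits >> 16) & 1
    return ((bits + 0x7FFF + lsb) >> 16) & 0xFFFF


print(format(fp32_to_bf16_bits(0xFFFFFFFF, preserve_nan_sign=False), "#06x"))  # 0x7fc0
print(format(fp32_to_bf16_bits(0xFFFFFFFF, preserve_nan_sign=True), "#06x"))   # 0xffff
print(format(fp32_to_bf16_bits(0x7FFFFFFF, preserve_nan_sign=True), "#06x"))   # 0x7fff
print(format(fp32_to_bf16_bits(float32_bits(-1.0), preserve_nan_sign=True), "#06x"))  # 0xbf80
```

The all-ones float32 pattern matters because it is what a vector comparison produces for a "true" lane, so the conversion is effectively converting a mask, not just a number.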

Closes #41238

@dr-ci

dr-ci bot commented Jul 10, 2020

💊 CI failures summary and remediations

As of commit f42106f (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



This comment has been revised 5 times.

@mruberry
Collaborator

This is cool, and the spelling correction is a nice bonus, but can you elaborate on why this solves the issue?

Also, as for testing, would you like to wait until #41249 goes in? That's the new test suite I'm developing that identified this issue. It should be available in a few days.

@malfet
Contributor Author

malfet commented Jul 13, 2020

@mruberry I've provided the explanation in PR description, haven't I? And as you can see, I'm not adding the test but simply waiting for you to land it on your end.

Contributor

@VitalyFedyunin VitalyFedyunin left a comment

Looks good, but can you please add a test with the cases that were incorrect before?

@mruberry
Collaborator

This can be tested by adding an OpInfo for sign now that #42965 is in. Ping me if you'd like an example.

@malfet malfet force-pushed the malfet/fix-bfloat16-sign branch from 7492199 to d640a68 Compare August 24, 2020 18:29
@malfet malfet requested a review from VitalyFedyunin August 25, 2020 02:40
Contributor

@facebook-github-bot facebook-github-bot left a comment

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@malfet
Contributor Author

malfet commented Aug 25, 2020

I've added the test, although in the old format

Collaborator

Nice catch

Collaborator

device_type = torch.device(device).type
... include_half=(device_type=='cuda')
... include_bfloat16=(device_type=='cpu')

Collaborator

@mruberry mruberry left a comment

Thanks @malfet!

For the `Vec256<bfloat16>::blendv()` operator to work correctly, float32 -nan (0xffffffff) must be converted to bfloat16 -nan (0xffff).
TODO: Fix float32 +-nan conversion, i.e. float32 nan (0x7fffffff) must be converted to bfloat16 nan (0x7fff).
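
Why the mask must be 0xffff rather than 0x7fc0 can be sketched with a small model of a byte-granular blend. This assumes the bfloat16 `blendv()` is built on an instruction like `_mm256_blendv_epi8`, which picks each output byte from the second operand when the top bit of the corresponding mask byte is set; that this is how `Vec256<bfloat16>::blendv()` is implemented is an assumption, and `blend_bytes_16` is a hypothetical helper that models a single 16-bit lane.

```python
def blend_bytes_16(a: int, b: int, mask: int) -> int:
    """Model one 16-bit lane of a byte-wise blend.

    For each of the two bytes, take the byte from b if the most
    significant bit of the matching mask byte is set, otherwise from a.
    """
    out = 0
    for shift in (0, 8):
        mask_byte = (mask >> shift) & 0xFF
        src = b if mask_byte & 0x80 else a
        out |= ((src >> shift) & 0xFF) << shift
    return out


zero = 0x0000  # bfloat16 bit pattern of 0.0
one = 0x3F80   # bfloat16 bit pattern of 1.0

# All-ones mask: both bytes come from b, so the lane is selected whole.
print(format(blend_bytes_16(zero, one, 0xFFFF), "#06x"))  # 0x3f80
# Mask 0x7fc0: only the low byte (0xc0) has its top bit set, so the
# result mixes a's high byte with b's low byte and matches neither input.
print(format(blend_bytes_16(zero, one, 0x7FC0), "#06x"))  # 0x0080
```

Under this model, a "true" mask lane that was converted to 0x7fc0 selects only half of the 16-bit value, which is consistent with the wrong results reported for `torch.sign` on bfloat16.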
@malfet malfet force-pushed the malfet/fix-bfloat16-sign branch from d640a68 to f42106f Compare December 12, 2020 23:24
Contributor

@facebook-github-bot facebook-github-bot left a comment

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@codecov

codecov bot commented Dec 13, 2020

Codecov Report

Merging #41280 (f42106f) into master (33b7970) will increase coverage by 0.00%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #41280   +/-   ##
=======================================
  Coverage   80.58%   80.59%           
=======================================
  Files        1873     1873           
  Lines      202711   202711           
=======================================
+ Hits       163360   163368    +8     
+ Misses      39351    39343    -8     

@malfet malfet deleted the malfet/fix-bfloat16-sign branch December 14, 2020 17:07
@facebook-github-bot
Contributor

@malfet merged this pull request in 8397a62.

Development

Successfully merging this pull request may close these issues.

torch.sign(bfloat16) on CPU is wrong on largish tensors
