Enable Add, sub, mul, and div on CPU for bfloat16 type. #22851
Conversation
@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
a few things plus I can't reproduce the "= {0};" change.
I'd kind of like to see some property-based tests for these. Maybe something like: if a floating-point value contains no significant information in the part of the representation that bfloat16 truncates, you should get equivalent results adding the floats and adding their corresponding bfloat16s.
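A minimal sketch of the kind of property-based test suggested here, using the hypothesis library (not something this PR adds; the helper and strategy bounds are illustrative). The float32 sum is cast back to bfloat16 before comparing, since the exact sum of two bfloat16-representable values need not itself be representable:

# Sketch only: bfloat16 addition should agree with float32 addition (rounded back
# to bfloat16) on inputs that carry no information in the truncated mantissa bits.
import struct

import torch
from hypothesis import given, strategies as st

def truncate_to_bfloat16(x):
    # Zero the low 16 bits of the float32 encoding, i.e. the mantissa bits
    # that bfloat16 does not store.
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    return struct.unpack('<f', struct.pack('<I', bits & 0xFFFF0000))[0]

@given(st.floats(-100.0, 100.0), st.floats(-100.0, 100.0))
def test_bfloat16_add_matches_float32(a, b):
    a, b = truncate_to_bfloat16(a), truncate_to_bfloat16(b)
    f32 = torch.tensor([a]) + torch.tensor([b])
    bf16 = torch.tensor([a], dtype=torch.bfloat16) + torch.tensor([b], dtype=torch.bfloat16)
    # Round the float32 sum to bfloat16 before comparing, since the exact sum
    # may need more mantissa bits than bfloat16 has.
    assert torch.equal(bf16, f32.to(torch.bfloat16))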
test/test_torch.py
Outdated
@@ -1707,6 +1707,11 @@ def test_add(self):
    expected = torch.zeros(2, 3, device=device).bool()
    self.assertEqual(res, expected)

    # bfloat16
    m1 = torch.tensor([1, 2], dtype=torch.bfloat16)
nit: wouldn't torch.tensor([1., 2.], dtype=torch.bfloat16) be the canonical representation?
test/test_torch.py
Outdated
    self.assertRaises(RuntimeError, lambda: m1 - m2)
elif (dtype == torch.bfloat16):
    # bfloat16 has a lower precision so we have to have a separate check for it
    self.assertEqual(m1 - m2, torch.tensor([1.1016, 2.1094], dtype=dtype))
I still don't understand why we do this instead of just assertClose or whatever with a different atol/rtol.
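For reference, a sketch of the kind of tolerance-based check this comment is asking about; the inputs and the 1e-2 tolerances below are illustrative, not values from the PR:

import torch

# Compare the bfloat16 result against a float32 reference with an explicit
# tolerance instead of hard-coding the rounded bfloat16 values. bfloat16 keeps
# roughly 2-3 decimal digits, so a 1e-2 tolerance is comfortable here.
m1 = torch.tensor([2.21, 4.21], dtype=torch.bfloat16)  # illustrative inputs
m2 = torch.tensor([1.11, 2.11], dtype=torch.bfloat16)
expected = torch.tensor([1.10, 2.10])                   # float32 reference
assert torch.allclose((m1 - m2).float(), expected, rtol=1e-2, atol=1e-2)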
test/test_torch.py
Outdated
if device == 'cpu':
    a1 = torch.tensor([0.1, 0.1], dtype=torch.bfloat16, device=device)
    a2 = torch.tensor([1.1, 0.1], dtype=torch.bfloat16, device=device)
    self.assertEqual(a1 * a2, torch.tensor([0.1089, 0.0099], dtype=torch.bfloat16, device=device))
same as above, also there should be a comment here.
test/test_torch.py
Outdated
@@ -1851,6 +1877,10 @@ def test_div(self):
        res2[i, 3] = res2[i, 3] / 2
    self.assertEqual(res1, res2)

    a1 = torch.tensor([4.2, 6.2], dtype=torch.bfloat16)
    a2 = torch.tensor([2., 2.], dtype=torch.bfloat16)
    self.assertEqual(a1 / a2, torch.tensor([2.0938, 3.0938], dtype=torch.bfloat16))
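For readers wondering where 2.0938 comes from: bfloat16 keeps 8 exponent bits but only 7 explicit mantissa bits, so 4.2 and 6.2 are stored as 4.1875 and 6.1875, and dividing those by 2 gives exactly 2.09375 and 3.09375, which print to four decimals as 2.0938 and 3.0938. A quick hand-checked illustration (assuming round-to-nearest conversion):

import torch

a1 = torch.tensor([4.2, 6.2], dtype=torch.bfloat16)
a2 = torch.tensor([2., 2.], dtype=torch.bfloat16)
print(a1.float())         # tensor([4.1875, 6.1875])  -- what bfloat16 actually stores
print((a1 / a2).float())  # tensor([2.0938, 3.0938])  -- exactly 2.09375 and 3.09375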
same as above, also there should be a comment here.
as above.
    self.assertRaises(RuntimeError, lambda: m1 - m2)
else:
    # bfloat16 has a lower precision so we have to have specified precision
    self.assertEqual(m1 - m2, torch.tensor([1.11, 2.11], dtype=dtype), 0.01)
umm, isn't this setting lower precision for all dtypes except half and bool?
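As a fragment-style sketch of the concern raised here (reusing the test's m1, m2, and dtype; this is not the change the PR actually made), the relaxed tolerance could be limited to the bfloat16 branch:

expected = torch.tensor([1.11, 2.11], dtype=dtype)
if dtype == torch.bfloat16:
    # bfloat16 carries only ~3 decimal digits, so allow a loose tolerance
    self.assertEqual(m1 - m2, expected, 0.01)
else:
    # every other dtype keeps the default (strict) comparison
    self.assertEqual(m1 - m2, expected)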
Updated
if device == 'cpu':
    a1 = torch.tensor([0.1, 0.1], dtype=torch.bfloat16, device=device)
    a2 = torch.tensor([1.1, 0.1], dtype=torch.bfloat16, device=device)
    self.assertEqual(a1 * a2, torch.tensor([0.11, 0.01], dtype=torch.bfloat16, device=device), 0.01)
same here.
no, because this test doesn't test all dtypes
test/test_torch.py
Outdated
@@ -1851,6 +1872,10 @@ def test_div(self):
        res2[i, 3] = res2[i, 3] / 2
    self.assertEqual(res1, res2)

    a1 = torch.tensor([4.2, 6.2], dtype=torch.bfloat16)
    a2 = torch.tensor([2., 2.], dtype=torch.bfloat16)
    self.assertEqual(a1 / a2, torch.tensor([2.1, 3.1], dtype=torch.bfloat16), 0.1)
same here.
no, because this test doesn't test all dtypes
Summary: Enable Add, sub, mul, and div on CPU for bfloat16 type. Tested via unit tests.
Pull Request resolved: pytorch/pytorch#22851
Differential Revision: D16256757
Pulled By: izdeby
fbshipit-source-id: 8b62f7581fc0ca0d2cff48ab40d877a9fcf70a5b
Enable Add, sub, mul, and div on CPU for bfloat16 type.
Tested via unit tests.