
Enable Add, sub, mul, and div on CPU for bfloat16 type. #22851


Closed
izdeby wants to merge 3 commits from the bfloat16MathOps branch

Conversation

izdeby
Contributor

@izdeby izdeby commented Jul 14, 2019

Enable Add, sub, mul, and div on CPU for bfloat16 type.
Tested via unit tests.
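
For reference, a minimal usage sketch of what this enables on CPU (illustrative values, chosen to be exactly representable in bfloat16; not taken from the PR itself):

import torch

a = torch.tensor([1.5, 2.5], dtype=torch.bfloat16)  # CPU tensors
b = torch.tensor([0.5, 2.0], dtype=torch.bfloat16)

# All four binary ops now have CPU kernels for bfloat16.
print(a + b)  # [2.0, 4.5]
print(a - b)  # [1.0, 0.5]
print(a * b)  # [0.75, 5.0]
print(a / b)  # [3.0, 1.25]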

@izdeby izdeby changed the title [WIP] Bfloat16 math ops [WIP] Enable Add, sub, mul, and div on CPU for bfloat16 type. Jul 14, 2019
Contributor

@facebook-github-bot facebook-github-bot left a comment

@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@pytorchbot pytorchbot added the module: cpu (CPU specific problem, e.g., perf, algorithm), module: internals (Related to internal abstractions in c10 and ATen), and module: operators labels Jul 15, 2019
Contributor

@facebook-github-bot facebook-github-bot left a comment

@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@izdeby izdeby changed the title [WIP] Enable Add, sub, mul, and div on CPU for bfloat16 type. Enable Add, sub, mul, and div on CPU for bfloat16 type. Jul 15, 2019
Contributor

@facebook-github-bot facebook-github-bot left a comment

@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@izdeby izdeby requested a review from gchanan July 15, 2019 20:13
Contributor

@gchanan gchanan left a comment

A few things, plus I can't reproduce the "= {0};" change.

@izdeby izdeby force-pushed the bfloat16MathOps branch from 4507218 to 01fd8d7 Compare July 15, 2019 22:02
Contributor

@facebook-github-bot facebook-github-bot left a comment

@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ezyang
Contributor

ezyang commented Jul 16, 2019

I'd kind of like to see some property-based tests for these. Maybe something like: if a floating-point value contains no significant information in the truncated part of the representation, you should get equivalent results from adding the floats and from adding their corresponding bfloat16s.
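
A sketch of such a property-based check (assuming the hypothesis library; the bit-truncation helper, bounds, and test name are illustrative, not from this PR):

import struct

import torch
from hypothesis import given, strategies as st


def drop_truncated_bits(x):
    # Zero the low 16 bits of the float32 bit pattern, keeping only the bits
    # that bfloat16 represents exactly (sign, 8-bit exponent, 7-bit mantissa).
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


@given(st.floats(-1e3, 1e3), st.floats(-1e3, 1e3))
def test_bfloat16_add_matches_float32_add(a, b):
    a, b = drop_truncated_bits(a), drop_truncated_bits(b)
    f32_sum = torch.tensor([a]) + torch.tensor([b])
    bf16_sum = torch.tensor([a], dtype=torch.bfloat16) + torch.tensor([b], dtype=torch.bfloat16)
    # Both inputs carry no information in the truncated part, so adding them as
    # bfloat16 should give the float32 sum rounded to bfloat16.
    assert bf16_sum.float().item() == f32_sum.to(torch.bfloat16).float().item()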

@izdeby izdeby changed the title Enable Add, sub, mul, and div on CPU for bfloat16 type. [WIP] Enable Add, sub, mul, and div on CPU for bfloat16 type. Jul 31, 2019
@izdeby izdeby force-pushed the bfloat16MathOps branch from 01fd8d7 to 9c54ae1 Compare July 31, 2019 21:15
@izdeby izdeby requested review from gchanan and VitalyFedyunin July 31, 2019 21:15
@izdeby izdeby requested a review from ezyang August 5, 2019 19:27
Contributor

@facebook-github-bot facebook-github-bot left a comment

@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@@ -1707,6 +1707,11 @@ def test_add(self):
expected = torch.zeros(2, 3, device=device).bool()
self.assertEqual(res, expected)

# bfloat16
m1 = torch.tensor([1, 2], dtype=torch.bfloat16)
Contributor

nit: wouldn't torch.tensor([1., 2.], dtype=torch.bfloat16) be the canonical representation?

self.assertRaises(RuntimeError, lambda: m1 - m2)
elif (dtype == torch.bfloat16):
# bfloat16 has lower precision, so we need a separate check for it
self.assertEqual(m1 - m2, torch.tensor([1.1016, 2.1094], dtype=dtype))
Contributor

I still don't understand why we do this instead of just assertClose or whatever with a different atol/rtol.
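
For illustration, that alternative might look something like this (a sketch with made-up inputs, not code from this PR):

import torch

m1 = torch.tensor([2.34, 4.44], dtype=torch.bfloat16)
m2 = torch.tensor([1.23, 2.33], dtype=torch.bfloat16)
expected = m1.float() - m2.float()   # reference result computed in float32
actual = (m1 - m2).float()           # bfloat16 subtraction under test
# bfloat16 keeps roughly 3 significant decimal digits, so a loose rtol/atol is enough.
assert torch.allclose(actual, expected, rtol=1e-2, atol=1e-2)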

if device == 'cpu':
a1 = torch.tensor([0.1, 0.1], dtype=torch.bfloat16, device=device)
a2 = torch.tensor([1.1, 0.1], dtype=torch.bfloat16, device=device)
self.assertEqual(a1 * a2, torch.tensor([0.1089, 0.0099], dtype=torch.bfloat16, device=device))
Contributor

same as above, also there should be a comment here.

@@ -1851,6 +1877,10 @@ def test_div(self):
res2[i, 3] = res2[i, 3] / 2
self.assertEqual(res1, res2)

a1 = torch.tensor([4.2, 6.2], dtype=torch.bfloat16)
a2 = torch.tensor([2., 2.], dtype=torch.bfloat16)
self.assertEqual(a1 / a2, torch.tensor([2.0938, 3.0938], dtype=torch.bfloat16))
Contributor

same as above, also there should be a comment here.

Contributor

@gchanan gchanan left a comment

as above.

@izdeby izdeby requested a review from gchanan August 5, 2019 22:06
Contributor

@facebook-github-bot facebook-github-bot left a comment

@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

self.assertRaises(RuntimeError, lambda: m1 - m2)
else:
# bfloat16 has lower precision, so we have to pass an explicit precision
self.assertEqual(m1 - m2, torch.tensor([1.11, 2.11], dtype=dtype), 0.01)
Contributor

umm, isn't this setting lower precision for all dtypes except half and bool?

Contributor Author

Updated
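
One way to scope the looser precision to bfloat16 only, as the review asks; a standalone sketch with illustrative values and structure, not the exact code that landed:

import torch

def check_sub(dtype):
    m1 = torch.tensor([2.34, 4.44], dtype=dtype)
    m2 = torch.tensor([1.23, 2.33], dtype=dtype)
    reference = torch.tensor([1.11, 2.11])
    if dtype == torch.bfloat16:
        # Only bfloat16 gets the loose tolerance; other dtypes keep a tight one.
        assert torch.allclose((m1 - m2).float(), reference, rtol=0, atol=1e-2)
    else:
        assert torch.allclose((m1 - m2).float(), reference, rtol=0, atol=1e-5)

check_sub(torch.float32)
check_sub(torch.bfloat16)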

if device == 'cpu':
a1 = torch.tensor([0.1, 0.1], dtype=torch.bfloat16, device=device)
a2 = torch.tensor([1.1, 0.1], dtype=torch.bfloat16, device=device)
self.assertEqual(a1 * a2, torch.tensor([0.11, 0.01], dtype=torch.bfloat16, device=device), 0.01)
Contributor

same here.

Contributor Author

No, because this test doesn't test all dtypes.

@@ -1851,6 +1872,10 @@ def test_div(self):
res2[i, 3] = res2[i, 3] / 2
self.assertEqual(res1, res2)

a1 = torch.tensor([4.2, 6.2], dtype=torch.bfloat16)
a2 = torch.tensor([2., 2.], dtype=torch.bfloat16)
self.assertEqual(a1 / a2, torch.tensor([2.1, 3.1], dtype=torch.bfloat16), 0.1)
Contributor

same here.

Contributor Author

No, because this test doesn't test all dtypes.

@izdeby izdeby requested a review from gchanan August 5, 2019 22:35
Contributor

@facebook-github-bot facebook-github-bot left a comment

@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ezyang ezyang removed their request for review August 6, 2019 13:28
Contributor

@facebook-github-bot facebook-github-bot left a comment

@izdeby has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Aug 8, 2019
Summary:
Enable Add, sub, mul, and div on CPU for bfloat16 type.
Tested via unit tests.
Pull Request resolved: pytorch/pytorch#22851

Differential Revision: D16256757

Pulled By: izdeby

fbshipit-source-id: 8b62f7581fc0ca0d2cff48ab40d877a9fcf70a5b
@facebook-github-bot
Contributor

@izdeby merged this pull request in 5b9f55f.

@facebook-github-bot facebook-github-bot deleted the bfloat16MathOps branch July 13, 2020 17:53
Labels
Merged, module: cpu (CPU specific problem, e.g., perf, algorithm), module: internals (Related to internal abstractions in c10 and ATen)
8 participants