Expand the coverage of test_addmm and test_addmm_sizes #43831
Conversation
test/test_torch.py
Outdated
self.assertEqual(res1, res2, atol=prec, rtol=0)
}[dtype]

if False and dtype.is_complex:  # bug to be fixed in another PR
To be fixed in #43827
Codecov Report

@@            Coverage Diff            @@
##             master    #43831   +/-   ##
==========================================
  Coverage          ?    40.16%
==========================================
  Files             ?       378
  Lines             ?     46728
  Branches          ?         0
==========================================
  Hits              ?     18767
  Misses            ?     27961
  Partials          ?         0

Continue to review the full report at Codecov.
💊 CI failures summary and remediations: as of commit c9b002c, ci.pytorch.org reported 1 failure (more details on the Dr. CI page).
test/test_torch.py
Outdated
@dtypesIfCUDA(*torch.testing.get_all_complex_dtypes(), *torch.testing.get_all_fp_dtypes(include_bfloat16=False))
@dtypes(*torch.testing.get_all_complex_dtypes(), *torch.testing.get_all_fp_dtypes())
def test_addmm(self, device, dtype):
    prec = {
can you use @precisionOverride for this?
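For context, a minimal sketch of how @precisionOverride is typically used in these device-generic tests (the dtypes, tolerance values, and test body below are illustrative, not the exact ones from this PR). The decorator sets self.precision per dtype, so the test body no longer needs its own {dtype: tolerance} lookup:

```python
import torch
from torch.testing._internal.common_utils import TestCase, run_tests
from torch.testing._internal.common_device_type import (
    dtypes, precisionOverride, instantiate_device_type_tests)


class TestAddmmSketch(TestCase):
    # precisionOverride makes the framework set self.precision to the entry
    # matching the active dtype, replacing a hand-rolled prec dict in the body.
    @precisionOverride({torch.float: 1e-4, torch.bfloat16: 0.6})
    @dtypes(torch.float, torch.bfloat16)
    def test_addmm_tolerance(self, device, dtype):
        M = torch.randn(10, 25, device=device).to(dtype)
        m1 = torch.randn(10, 50, device=device).to(dtype)
        m2 = torch.randn(50, 25, device=device).to(dtype)
        res = torch.addmm(M, m1, m2)
        # float32 reference, cast back to the test dtype for comparison
        ref = torch.addmm(M.float(), m1.float(), m2.float()).to(dtype)
        self.assertEqual(res, ref, atol=self.precision, rtol=0)


instantiate_device_type_tests(TestAddmmSketch, globals())

if __name__ == '__main__':
    run_tests()
```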
Fixed. Note that the precision for bfloat16 is bumped to 0.6, because we are now accumulating in float32 instead of a bfloat16 scalar.
AssertionError: False is not true : Tensors failed to compare as equal! With rtol=0.016 and atol=0.1, found 2 element(s) (out of 250) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.53125 (8.375 vs. 7.84375), which occurred at index (6, 4).
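For a rough sense of why tolerances this large are plausible for bfloat16 (a standalone illustration, not code from the PR, with made-up shapes): bfloat16 keeps only 8 mantissa bits, so rounding the inputs alone already produces differences on the order of 0.1 or more in a matmul of this size:

```python
import torch

torch.manual_seed(0)
# Illustrative shapes; the output has a few hundred elements, like the test's.
a = torch.randn(10, 50)
b = torch.randn(50, 25)

ref = a @ b                                      # float32 reference
approx = (a.bfloat16() @ b.bfloat16()).float()   # bfloat16 inputs

# bfloat16 carries a relative error of roughly 2**-8 per value, and the
# outputs here have magnitudes up to ~20, so max differences can reach a
# few tenths even before any accumulation-order effects.
print((ref - approx).abs().max())
```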
Thank you! Waiting for CI.
Not for this PR, but we should also test that with beta=0, nans and infs in M are not propagated; right now I think we are doing it only for empty inputs.
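A sketch of what such a test could look like (not part of this PR; names and sizes are made up). torch.addmm documents that with beta=0 the input matrix is ignored, so nan/inf in it must not leak into the result:

```python
import torch

def check_beta_zero_ignores_nan_inf(device="cpu", dtype=torch.float):
    # Poison M with nan/inf; with beta=0 it must be ignored entirely.
    M = torch.full((10, 25), float("nan"), device=device, dtype=dtype)
    M[0, 0] = float("inf")
    m1 = torch.randn(10, 50, device=device, dtype=dtype)
    m2 = torch.randn(50, 25, device=device, dtype=dtype)

    out = torch.addmm(M, m1, m2, beta=0, alpha=1)

    assert torch.isfinite(out).all(), "nan/inf from M leaked into the result"
    assert torch.allclose(out, torch.mm(m1, m2))

check_beta_zero_ignores_nan_inf()
```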
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ailzhang do you know why xla does not inherit precisionOverride here for bfloat16 (0.6)? Is it ok if we disable bfloat16 on xla?
@precisionOverride({torch.double: 1e-8, torch.float: 1e-4, torch.bfloat16: 0.6,
                    torch.half: 1e-1, torch.cfloat: 1e-4, torch.cdouble: 1e-8})
@dtypesIfCUDA(*torch.testing.get_all_complex_dtypes(), *torch.testing.get_all_fp_dtypes(include_bfloat16=False))
@dtypes(*torch.testing.get_all_complex_dtypes(), *torch.testing.get_all_fp_dtypes())
can you please not include bfloat16 here, and include it only in @dtypesIfCPU so that bfloat16 is not run on XLA?
cc: @JackCaoG made some changes to the bfloat16 test skip; do you have an idea what the best workaround is here?
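If it helps, a sketch of the decorator arrangement being suggested (shortened; the dtype helpers are the ones this PR already uses, and the test body is elided): keep bfloat16 out of the generic @dtypes list so device-generic backends such as XLA skip it, and add it back only through @dtypesIfCPU:

```python
import torch
from torch.testing._internal.common_device_type import (
    dtypes, dtypesIfCPU, dtypesIfCUDA, precisionOverride)


class TestAddmmDtypeSelection:
    # Generic list (what XLA and other backends pick up): no bfloat16.
    # CPU adds bfloat16 back; CUDA keeps its own bfloat16-free list.
    @precisionOverride({torch.double: 1e-8, torch.float: 1e-4, torch.bfloat16: 0.6,
                        torch.half: 1e-1, torch.cfloat: 1e-4, torch.cdouble: 1e-8})
    @dtypesIfCUDA(*torch.testing.get_all_complex_dtypes(),
                  *torch.testing.get_all_fp_dtypes(include_bfloat16=False))
    @dtypesIfCPU(*torch.testing.get_all_complex_dtypes(),
                 *torch.testing.get_all_fp_dtypes(include_bfloat16=True))
    @dtypes(*torch.testing.get_all_complex_dtypes(),
            *torch.testing.get_all_fp_dtypes(include_bfloat16=False))
    def test_addmm(self, device, dtype):
        ...  # test body unchanged
```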
@ailzhang the change was to auto-skip all of the float16 tests; we still want to run bfloat16 tests. My guess is that pt/xla has its own precision override for each type and ignores PyTorch's precision.
@JackCaoG @ailzhang so what do you guys suggest we do? Currently the bfloat16 test is failing with
tbh, rtol 0.001 and atol 0.001 seem incredibly low for bfloat16 even under the best of circumstances; typically errors are much larger, and here the test definitely won't pass with such tolerances. Is there a way to override tolerances on the xla side?
@ngimel Let me submit a PR on the pt/xla side to disable this test on our end. Ideally we should take the PyTorch precision override if it is bigger than pt/xla's; I will investigate this a bit too.
Looks like GitHub automatically closed this by mistake? I will reopen and rebase.
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.