[BE] unsupported backward failing on single sample #59455
Conversation
💊 CI failures summary and remediations: as of commit 53dcda9, ci.pytorch.org reported 1 failure (more details on the Dr. CI page). This comment was automatically generated by Dr. CI.
Force-pushed 88d5212 to 51f4595 (compare)
@mruberry this is another idea to enable the unsupported backward test. Please kindly take a look. I think this is a better approach and should be easier to get merged.
test/test_ops.py
Outdated
This will probably require a slight tweak to handle functions like to_sparse(), which return a sparse tensor, but it should be a simple update after #59445 is in.
#59445 is merged and I've rebased over it. Since it doesn't introduce additional failures, I am planning to merge this and then fix anything needed to avoid further merge conflicts.
This doesn't test to_sparse: x.to_sparse().sum().backward() raises a RuntimeError on .sum() and thus passes this test, but that's not what's supposed to be tested.
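The masking problem described above is generic: a test that merely asserts "an exception is raised somewhere in the forward/backward chain" can pass because a downstream op fails before the op under test is ever exercised. A minimal, PyTorch-free sketch of the pitfall (all function names here are hypothetical stand-ins, not PyTorch APIs):

```python
def chain_raises(*fns):
    """Return True if running fns in sequence raises a RuntimeError anywhere."""
    try:
        x = None
        for fn in fns:
            x = fn(x)
        return False
    except RuntimeError:
        return True

# Op under test: its backward support is what we actually want to verify.
def to_sparse_like(x):
    return "sparse"

# Downstream op that itself fails on sparse input -- this masks the real check.
def sum_like(x):
    raise RuntimeError("sum not implemented for sparse input")

# The assertion passes, but the failure came from sum_like, not from the
# backward of to_sparse_like -- the op under test was never reached.
assert chain_raises(to_sparse_like, sum_like)
```

This is why the reviewer notes the test "passes" without testing what it should: the expected-failure assertion cannot distinguish which op in the chain raised.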
Good point. Will create a follow-up issue along with the rest of the tweaks needed.
This is a surprising change
yup. surprising to me too
This is a great test and the idea seems correct. For the backward, I think this should call [...]. There are also a few test issues that still need to be addressed, like test_unsupported_backward_einsum_cuda_bfloat16 and some ROCm tests. Testing against the "master" builds, too, is definitely the right idea. I erroneously added "all" -- my mistake.
fixing tests and adding skips
fix test issues
Force-pushed c20f66d to beaee2b (compare)
Force-pushed beaee2b to 53dcda9 (compare)
Codecov Report
@@ Coverage Diff @@
## master #59455 +/- ##
=======================================
Coverage 76.43% 76.43%
=======================================
Files 2038 2038
Lines 203064 203064
=======================================
+ Hits 155217 155218 +1
+ Misses 47847 47846 -1
Rebased, and it looks like it is working well (the ROCm failure seems irrelevant).
Will try to merge it and monitor HUD. If any failure occurs, I guess the best option is to forward-fix by skipping tests.
@walterddr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Cool! Let's try to sneak this in
@walterddr merged this pull request in 26beda8.
Summary: Echoing pytorch#58260 (comment): similar to `test_unsupported_dtype`, which only checks that an exception is raised on the first sample, we should do the same for unsupported backward. The goal of both tests is to remind developers to 1. add a new dtype to the supported list if all samples run without failure, and 2. replace the skip mechanism, which indefinitely ignores tests without warning.

Pull Request resolved: pytorch#59455

Test Plan: CI.

Reviewed By: mruberry

Differential Revision: D28927169

Pulled By: walterddr

fbshipit-source-id: 2993649fc17a925fa331e27c8ccdd9b24dd22c20
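The single-sample pattern the summary describes can be sketched without PyTorch. The idea: assert the expected failure on the first sample only, so that if backward starts succeeding, the test fails loudly and prompts a support-list update instead of being skipped indefinitely. All names below are hypothetical illustrations, not the actual test_ops.py implementation:

```python
def check_unsupported_backward(samples, run_backward):
    """Assert that backward fails on the FIRST sample only.

    If backward unexpectedly succeeds, the dtype may now be supported
    and should be added to the supported list rather than left here.
    """
    first = next(iter(samples))
    try:
        run_backward(first)
    except RuntimeError:
        return  # expected: backward is unsupported for this dtype
    raise AssertionError(
        "backward unexpectedly succeeded; add this dtype to the supported list?"
    )

# Stand-in for an op whose backward is genuinely unsupported:
def fake_backward(sample):
    raise RuntimeError("backward not implemented for this dtype")

check_unsupported_backward([1, 2, 3], fake_backward)  # passes: failure expected
```

Checking only the first sample keeps the test cheap while still catching the interesting transition: the moment an op's backward stops raising, the developer is forced to update the dtype support list.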