Refactor gradient setter in gradient_check (#5699)
Conversation
Force-pushed 8a02560 to 39f7b20
Force-pushed 39f7b20 to 3ec3a0b
Rebased.
Force-pushed 3ec3a0b to 12589fd
chainer/gradient_check.py (Outdated)
@@ -71,6 +100,7 @@ def numerical_grad(
    """
    assert eps > 0
    assert isinstance(inputs, (tuple, list))
Use one space after `,`.
Fixed.
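For context, a minimal, hedged sketch of how `numerical_grad` is typically called, showing why the added assertions (`eps > 0`, tuple/list inputs) matter. The function name and signature follow Chainer's public API; the squaring function and data here are made up for illustration:

```python
import numpy as np
from chainer import gradient_check

x = np.random.uniform(-1, 1, (3, 2)).astype(np.float32)
gy = np.ones((3, 2), dtype=np.float32)

def f():
    # numerical_grad calls this closure repeatedly while perturbing
    # x in place, so it takes no arguments.
    return (x ** 2,)

# `inputs` must be a tuple or list and `eps` must be positive,
# which is exactly what the new assertions enforce.
gx, = gradient_check.numerical_grad(f, (x,), (gy,), eps=1e-3)
np.testing.assert_allclose(gx, 2 * x, atol=1e-2)
```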
chainer/gradient_check.py (Outdated)
'Output gradients: {}'.format(
    ', '.join(str(y.shape) for y in outputs),
    ', '.join(str(None if gy is None else gy.shape)
              for gy in grad_outputs)))
Why are the `shape`s printed in this error?
I thought it would be helpful for those who read this message to spot the cause.
I see your intention. Could you make the error message clearer about the fact that the shapes are being printed?
Fixed the messages.
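For illustration, a sketch of the kind of labeled message being asked for, reusing the shape-formatting code from the diff above. The wrapper function and the exact wording are hypothetical, not the PR's final text:

```python
def _check_grad_outputs_shapes(outputs, grad_outputs):
    # Hypothetical helper: label the printed values so readers can
    # tell at a glance that these are shapes.
    if len(outputs) != len(grad_outputs):
        raise ValueError(
            'Output gradients and outputs do not match.\n'
            'Output shapes: {}\n'
            'Output gradient shapes: {}'.format(
                ', '.join(str(y.shape) for y in outputs),
                ', '.join(str(None if gy is None else gy.shape)
                          for gy in grad_outputs)))
```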
# Keep output arrays to save computation in numerical gradients
y0_data = tuple([y.array for y in ys])

# If y_grad is not given, generate the all-1 gradients.
The current behavior is compatible with `Variable.backward`. Do you want to change it to be compatible with `chainer.grad`?
Are you referring to the `Variable.backward` behavior where it fills the gradient with 1 only for outputs with shape `()`? Actually, it's not clear to me why `Variable.backward` behaves like that.
Also, considering the purpose of `gradient_check`, I think there's no reason for its users to expect the same behavior.
Random is a better default.
That's also fine with me, but wouldn't that break compatibility?
In that case, ()-shaped grads should be initialized randomly as well, for consistency.
After discussion, I reverted the behavior change. That can be separated from this PR.
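To make the discussed default concrete, a minimal sketch of what "generate the all-1 gradients" means when `y_grad` is not given. This is illustrative only; `_make_default_grad_outputs` is a hypothetical name, not the PR's code:

```python
import numpy as np

def _make_default_grad_outputs(y0_data):
    # Fill every output gradient with ones, matching each output's
    # shape and dtype.
    return tuple(np.ones_like(y) for y in y0_data)

grads = _make_default_grad_outputs((np.zeros((2, 3), np.float32),))
assert grads[0].shape == (2, 3) and grads[0].dtype == np.float32
```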
chainer/gradient_check.py (Outdated)
'Output gradients: {}\n'.format(
    ', '.join(str(y.shape) for y in outputs),
    ', '.join(str(None if gy is None else gy.shape)
              for gy in grad_outputs)))
I'm wondering if `dtype` can be checked in this function, too.
I thought I had encountered an error when doing this, but now I can't find it.
Fixed so that dtypes are compared as well.
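A hedged sketch of what comparing dtypes as well as shapes could look like; the helper name and the message wording are assumptions, not the merged code:

```python
def _check_grad_outputs(outputs, grad_outputs):
    # Hypothetical check: every given gradient must match its
    # output's shape *and* dtype; None entries are skipped.
    for y, gy in zip(outputs, grad_outputs):
        if gy is None:
            continue
        if y.shape != gy.shape or y.dtype != gy.dtype:
            raise ValueError(
                'Shapes and/or dtypes of outputs and output gradients '
                'do not match.\n'
                'Output: shape={}, dtype={}\n'
                'Output gradient: shape={}, dtype={}'.format(
                    y.shape, y.dtype, gy.shape, gy.dtype))
```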
Force-pushed 0f5e22d to d075ae4
Force-pushed d075ae4 to 758f4fa
PTAL. Currently the error message looks like this:
(I wonder if there is any canonical way to present shapes and dtypes 🤔)
LGTM
# If no input has a gradient, we don't need to compare with numeric
# gradient.
if len(self.x_data) + len(self.params) == self.no_grads.count(True):
    return
I agree with the deletion of the early return.
- I'd like `detect_nondifferentiable=True` to detect a random function regardless of the length of inputs.
- I observed that the early return gave no significant speed-up for the tests under `tests/chainer_tests/(functions|links)_tests`.
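For reference, a self-contained sketch of the condition that was deleted; the helper name is made up, but the logic mirrors the snippet above:

```python
import numpy as np

def _all_inputs_have_no_grad(x_data, params, no_grads):
    # True only when every input and parameter is flagged as having
    # no gradient to check (the deleted early-return condition).
    return len(x_data) + len(params) == no_grads.count(True)

# Example: two integer inputs and no parameters -> nothing to compare.
assert _all_inputs_have_no_grad(
    (np.zeros(2, np.int32), np.zeros(3, np.int32)), (), [True, True])
```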
Jenkins, test this please.
Jenkins CI test (for commit 20a53f2, target branch master) failed with status FAILURE.
The Jenkins failure (
Merge after #5698.
Allow `y_grad=None` for any target function, not only loss functions.
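A hypothetical usage sketch of the advertised change: `check_backward` is Chainer's public API, and passing `y_grad=None` for a non-scalar output is the behavior this PR enables. The concrete function and tolerances are illustrative:

```python
import numpy as np
import chainer.functions as F
from chainer import gradient_check

x = np.random.uniform(-1, 1, (3, 2)).astype(np.float32)

# F.tanh is not a loss function (its output is not a scalar), yet
# y_grad=None is accepted; the output gradients are generated
# internally by gradient_check.
gradient_check.check_backward(F.tanh, x, None, eps=1e-3,
                              atol=1e-3, rtol=1e-3)
```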