
Refactor gradient setter in gradient_check #5699

Merged
2 commits merged into chainer:master from refactor-gradient-setter on Dec 10, 2018

Conversation

@niboshi (Member) commented Nov 23, 2018

Merge after #5698.

Allow y_grad=None for any target function, not only loss functions.
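For reference, a minimal usage sketch of gradient_check.check_backward (not part of the PR; the function choices, shapes, and tolerances below are hypothetical):

```python
import numpy as np
import chainer.functions as F
from chainer import gradient_check

x = np.random.uniform(-1, 1, (3, 4)).astype(np.float32)

# Explicit output gradient: its shape and dtype must match the output.
gy = np.random.uniform(-1, 1, (3, 4)).astype(np.float32)
gradient_check.check_backward(F.tanh, x, gy, atol=1e-4, rtol=1e-4)

# y_grad=None: previously supported only for loss (scalar-output) functions
# such as F.sum; this PR relaxes that restriction.
gradient_check.check_backward(F.sum, x, None, atol=1e-4, rtol=1e-4)
```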

@niboshi niboshi force-pushed the refactor-gradient-setter branch 3 times, most recently from 8a02560 to 39f7b20 Compare November 23, 2018 18:51
@toslunar toslunar self-requested a review November 25, 2018 01:37
@niboshi (Member Author) commented Nov 26, 2018

Rebased.
PTAL

@@ -71,6 +100,7 @@ def numerical_grad(

"""
assert eps > 0
assert isinstance(inputs, (tuple, list))
Member:
Use one space after ,.

Member Author:
Fixed.

'Output gradients: {}'.format(
', '.join(str(y.shape) for y in outputs),
', '.join(str(None if gy is None else gy.shape)
for gy in grad_outputs)))
Member:
Why are the shapes printed in this error message?

Member Author:
I thought it would be helpful for those who read this message to spot the cause.

Member:
I see your intention. Could you make the error message clearer about the fact that the printed values are shapes?

Member Author:
Fixed the messages.

# Keep output arrays to save computation in numerical gradients
y0_data = tuple([y.array for y in ys])

# If y_grad is not given, generate the all-1 gradients.
Member:
The current behavior is compatible with Variable.backward. Do you want to change it to be compatible with chainer.grad?

@niboshi (Member Author), Nov 26, 2018:
Are you referring to the Variable.backward behavior where it fills grad with 1 only for outputs with shape ()? Actually, it's not clear to me why Variable.backward behaves that way.
Also, considering the purpose of gradient_check, I don't think its users would expect the same behavior.

Member:
random is a better default

Member Author:
That's also fine with me, but wouldn't that break compatibility?
In that case, ()-shaped grads should also be initialized randomly, for consistency.

Member Author:
After discussion, I reverted the behavior change. It can be handled separately from this PR.
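For illustration only, a minimal sketch (not the PR's code) of what generating "all-1 gradients" for y_grad=None means, assuming the outputs are NumPy arrays:

```python
import numpy as np

def _ones_like_outputs(outputs):
    # One all-ones array per output, matching each output's shape and dtype.
    return tuple(np.ones_like(y) for y in outputs)
```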


'Output gradients: {}\n'.format(
', '.join(str(y.shape) for y in outputs),
', '.join(str(None if gy is None else gy.shape)
for gy in grad_outputs)))
Member:
I'm wondering if dtype can be checked in this function, too.

Member Author:
I thought I had encountered an error when doing this, but now I can't find it.
Fixed so that dtypes are compared as well.
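An illustrative sketch (not the PR's exact code) of validating that each output gradient matches its output in both shape and dtype, producing an error message like the one shown below:

```python
def _check_grad_outputs(outputs, grad_outputs):
    # outputs and grad_outputs are sequences of arrays; a gradient may be None.
    mismatch = any(
        gy is not None and (y.shape != gy.shape or y.dtype != gy.dtype)
        for y, gy in zip(outputs, grad_outputs))
    if mismatch:
        raise ValueError(
            'Shapes and/or dtypes of outputs and output gradients do not '
            'match.\n'
            'Output shapes and dtypes         : {}\n'
            'Output gradient shapes and dtypes: {}'.format(
                ', '.join(
                    '{}:{}'.format(y.shape, y.dtype) for y in outputs),
                ', '.join(
                    'None' if gy is None
                    else '{}:{}'.format(gy.shape, gy.dtype)
                    for gy in grad_outputs)))
```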

@toslunar toslunar added the st:needs-discussion State indicating that discussions are needed before proceeding. label Dec 3, 2018
@niboshi niboshi force-pushed the refactor-gradient-setter branch 3 times, most recently from 0f5e22d to d075ae4 Compare December 7, 2018 05:11
@niboshi (Member Author) commented Dec 7, 2018

PTAL.

Currently the error message looks like this:

E   ValueError: Shapes and/or dtypes of outputs and output gradients do not match.            
E   Output shapes and dtypes         : (2, 2, 3):float16, (2, 3, 3):float16, (2, 2, 3):float16                     
E   Output gradient shapes and dtypes: (12,):float16, (18,):float16, (12,):float16

(I wonder if there is any canonical way to present shapes and dtypes🤔)

@toslunar (Member) left a comment:
LGTM

# If no input has a gradient, we don't need to compare with numeric
# gradient.
if len(self.x_data) + len(self.params) == self.no_grads.count(True):
return
Member:
I agree with deleting the early return.

  • I'd like detect_nondifferentiable=True to detect a random function regardless of the number of inputs.
  • I observed that the early return gave no significant speed-up for the tests under tests/chainer_tests/(functions|links)_tests.

@toslunar toslunar removed the st:needs-discussion State indicating that discussions are needed before proceeding. label Dec 7, 2018
@toslunar (Member) commented Dec 7, 2018

Jenkins, test this please.

@chainer-ci (Member) commented:
Jenkins CI test (for commit 20a53f2, target branch master) failed with status FAILURE.
(For contributors, please wait until the reviewer confirms the details of the error.)

@toslunar (Member) commented Dec 7, 2018

The Jenkins failure (TestDeconvolutionND_param_19.test_forward_consistency_cudnn) seems unrelated to the PR.

@toslunar toslunar added this to the v6.0.0b2 milestone Dec 10, 2018
@toslunar toslunar merged commit 87f497b into chainer:master Dec 10, 2018
@niboshi niboshi deleted the refactor-gradient-setter branch December 10, 2018 12:42
@kmaehashi kmaehashi added the cat:code-fix Code refactoring that does not change the behavior. label Jan 24, 2019