
[NNVM] Fix grads for sum and expand_like #1455

Merged 4 commits on Jul 27, 2018
Conversation

@sgrechanik-h (Contributor) commented Jul 19, 2018:

  • Fix gradients for the sum operation when the result is a scalar and when keepdims=True.
  • Add a function that tests gradients by comparing them with numerically computed gradients (see the sketch below); it is used to check the gradients of the sum operation.
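
As a rough illustration of what such a numerical check does, here is a minimal numpy sketch of the central-difference idea (the function below is illustrative only; the actual check_gradients_numeric in this PR operates on NNVM symbols rather than plain Python functions):

    import numpy as np

    def numeric_grad(f, x, eps=1e-5):
        # Central-difference estimate of df/dx for a scalar-valued f.
        grad = np.zeros_like(x)
        it = np.nditer(x, flags=['multi_index'])
        while not it.finished:
            idx = it.multi_index
            orig = x[idx]
            x[idx] = orig + eps
            f_plus = f(x)
            x[idx] = orig - eps
            f_minus = f(x)
            x[idx] = orig
            grad[idx] = (f_plus - f_minus) / (2 * eps)
            it.iternext()
        return grad

    # The symbolic gradient is then compared against the estimate, e.g.:
    # np.testing.assert_allclose(symbolic_grad, numeric_grad(f, x), rtol=1e-3, atol=1e-3)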

@sgrechanik-h (Contributor, Author) commented:

Strange: it looks like the rounding function behaves differently on GPU and CPU, and I'm not sure whether this is related to the changes or just a coincidence. Is it possible to rerun the tests?

@tqchen (Member) commented Jul 19, 2018:

@kazum @kevinthesun @srkreddy1238 can you please review this PR?

from .. import symbol
from .. import graph

def check_gradients_numeric(y, inputs=None, dtype='float32', print_graph=False,
@kevinthesun (Contributor) commented Jul 20, 2018:

Do we really need to implement another gradient-test interface? Since there is already a backward interface for testing operator gradients, can we reuse or enhance that interface?

@sgrechanik-h (Contributor, Author) replied:

Do you mean the helper function from test_top_level1.py? Yes, I think it's a good idea to move this helper into nnvm.testing and extend it with the ability to compute gradients numerically. I'll try to do this in a couple of days.

Contributor:

Yes. That would be great.

@nishi-t (Contributor) commented Jul 20, 2018:

#1460 may fix the CI error.

return std::vector<NodeEntry>{
  MakeNode("expand_like", n->attrs.name + "_grad",
           {ograds[0], n->inputs[0]},
           {{"axis", axis.str()},
            {"keepdims", std::to_string(param.keepdims)},
            {"exclude", std::to_string(param.exclude)}})};
Contributor:

Passing keepdims to expand_like looks a bit strange, since expand_like is not actually a reduce operation. Whether or not the parameter is specified, the output dimension does not change.

An alternative I came up with is to squeeze the output of sum beforehand, as follows. This might be less efficient, but it makes it easier to understand what we want to do.

  if (param.keepdims) {
    NodeEntry squeezed = MakeNode("squeeze", n->attrs.name + "_grad_sqz",
                                  {ograds[0]},
                                  {{"axis", axis.str()}});
    return std::vector<NodeEntry>{
      MakeNode("expand_like", n->attrs.name + "_grad",
               {squeezed, n->inputs[0]},
               {{"axis", axis.str()},
                {"exclude", std::to_string(exclude)}})};
  }

@sgrechanik-h (Contributor, Author) replied:

Well, expand_like is not a reduction operation, but it can be considered dual to reduction: given the same parameters as a reduction operation, it expands exactly the dimensions that the reduction reduces.

Your solution requires adding an exclude parameter to the squeeze operation; otherwise it won't work when exclude=True. Another approach, probably even less efficient but reliable, is to use any reduction operator instead of squeeze.
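
To make the trade-off concrete, here is a hedged numpy sketch of the expansion being discussed (grad_of_sum is a hypothetical name; it expands the output gradient of sum(x, axis, ...) back to the input shape, handling keepdims and exclude):

    import numpy as np

    def grad_of_sum(ograd, x_shape, axis=None, exclude=False, keepdims=False):
        # Expand the output gradient of sum back to the shape of its input.
        ndim = len(x_shape)
        axes = tuple(range(ndim)) if axis is None else tuple(axis)
        if exclude:
            axes = tuple(i for i in range(ndim) if i not in axes)
        if not keepdims:
            # Reinsert the reduced dimensions as size-1 axes so that
            # broadcasting can restore the original shape.
            for a in sorted(axes):
                ograd = np.expand_dims(ograd, a)
        # With keepdims=True the size-1 axes are already present, so a
        # plain broadcast suffices and no squeeze is needed.
        return np.broadcast_to(ograd, x_shape)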

Contributor:

> Well, expand_like is not a reduction operation, but it can be considered dual to reduction: given the same parameters as a reduction operation, it expands exactly the dimensions that the reduction reduces.

I'd agree that they are dual, but I don't think the keepdims parameter is necessary in expand operators. I'll send a comment about it later.

> Your solution requires adding an exclude parameter to the squeeze operation; otherwise it won't work when exclude=True.

Having the exclude option in squeeze sounds good to me. But if you prefer not to, I'm fine with broadcasting in expand_like. :)

@@ -15,6 +15,9 @@
@reg.register_compute("expand_like")
def compute_expand_like(attrs, inputs, _):
    """Compute definition of expand_like"""
    if attrs.get_bool("keepdims"):
Contributor:

We can replace this line with if len(inputs[0].shape) == len(inputs[1].shape):. I'd suggest removing keepdims from the expand_like arguments, since it only implies that inputs[0] was reduced with keepdims=True.
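
For illustration, a quick numpy check (illustrative only) of why comparing the number of dimensions detects the keepdims case:

    import numpy as np

    x = np.ones((2, 3, 4))
    print(np.sum(x, axis=1).ndim)                 # 2: the reduced axis is dropped
    print(np.sum(x, axis=1, keepdims=True).ndim)  # 3: same ndim as x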

@sgrechanik-h (Contributor, Author) replied:

Looks elegant and works, I'll go with this one.

@sgrechanik-h changed the title from "[NNVM] Fix grads for sum; Grad testing by comparing with numeric gradients" to "[NNVM] Fix grads for sum and expand_like" on Jul 26, 2018.
@sgrechanik-h (Contributor, Author) commented:

Merging numerical gradient testing into the existing testing functions turned out to be more difficult than I thought, so I'll create a separate PR for it when it's finished.

This PR is now just a fix for sum, expand_like, and their gradients in some special cases.
@kazum @kevinthesun could you please review this again?

@kazum (Contributor) left a comment:

Added minor comments. Other parts look good to me.

@@ -15,6 +15,10 @@
@reg.register_compute("expand_like")
def compute_expand_like(attrs, inputs, _):
    """Compute definition of expand_like"""
    if len(inputs[0].shape) == len(inputs[1].shape):
        # If the shape is not changed then it is just a broadcasting
Contributor:

not shape but dimension?

@@ -391,11 +398,17 @@ def forward(x, y):

def backward(head_grads, x, y):
    odim = len(out_shape)

    keepdims = len(x.shape) == len(y.shape)
Contributor:

Remove a trailing space.

@kazum (Contributor) left a comment:

Looks good to me, thanks!

@tqchen merged commit a8ca691 into apache:master on Jul 27, 2018.
@tqchen (Member) commented Jul 27, 2018:

Thanks @sgrechanik-h @kazum @kevinthesun, this is now merged.

tqchen pushed a commit to tqchen/tvm that referenced this pull request Aug 4, 2018
sergei-mironov pushed a commit to sergei-mironov/tvm that referenced this pull request Aug 8, 2018