Update torch.nn.init and torch.nn.utils.clip_grad #6173
Conversation
@pytorchbot test this please
torch/nn/utils/clip_grad.py
Outdated
    Gradients are modified in-place.
    Arguments:
        parameters (Iterable[Variable]): an iterable of Variables that will have
Can we have some tests please?
torch/nn/utils/clip_grad.py
Outdated
clip_value (float or int): maximum allowed value of the gradients | ||
The gradients are clipped in the range [-clip_value, clip_value] | ||
""" | ||
parameters = list(filter(lambda p: p.grad is not None, parameters)) |
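For readers skimming the thread, here is a minimal sketch of the kind of value-clipping utility under review, assembled from the fragments quoted in this conversation rather than copied from the merged code; details such as the single-tensor shortcut are assumptions.

```python
import torch


def clip_grad_value_(parameters, clip_value):
    """Clip the gradients of an iterable of parameters at the given value.

    Gradients are modified in-place, element-wise, into [-clip_value, clip_value].
    """
    if isinstance(parameters, torch.Tensor):
        parameters = [parameters]  # allow passing a single tensor (assumption)
    clip_value = float(clip_value)
    for p in filter(lambda p: p.grad is not None, parameters):
        p.grad.data.clamp_(min=-clip_value, max=clip_value)
```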
Might be good to have this conform to the _ suffix convention, as it's an in-place operation.
@pytorchbot test this please
Well, there are already other clip functions which don't have a _ suffix. If we change this one, the rest of them should change too (and we'd also need to add the BC-compat code...)
The only check that failed comes from "short-perf-test-cpu", which is unrelated to the PyTorch tests.
@pytorchbot retest this please
Thanks @ezyang!
I think it is reasonable to change the name to have the _ suffix. We did this for the init methods, and there are really just two grad clip methods, including the one added in this PR...
Good point @ssnl, let's get it done.
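For context on the convention being referenced: in torch, functions and Tensor methods that mutate their input in place carry a trailing underscore, while the underscore-free variants return a new tensor. A tiny illustration:

```python
import torch

t = torch.tensor([-3.0, 0.5, 4.0])

clipped = t.clamp(-1.0, 1.0)  # out-of-place: t is unchanged, a new tensor is returned
print(t)        # still [-3.0, 0.5, 4.0]
print(clipped)  # [-1.0, 0.5, 1.0]

t.clamp_(-1.0, 1.0)           # in-place: the trailing underscore mutates t itself
print(t)        # now [-1.0, 0.5, 1.0]
```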
@@ -1,12 +1,12 @@
-def clip_grad_norm(parameters, max_norm, norm_type=2):
+def clip_grad_norm_(parameters, max_norm, norm_type=2):
torch/nn/utils/clip_grad.py
Outdated
    """
    warnings.warn("torch.nn.utils.clip_grad_norm is now deprecated in favor "
                  "of torch.nn.utils.clip_grad_norm_.",
                  category=DeprecationWarning, stacklevel=2)
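The hunk above shows only the warning call. For context, a backward-compatible wrapper along the lines being discussed would keep the old name and forward to the renamed in-place function; a minimal sketch follows (the commit list at the end notes that the DeprecationWarning category was later removed, so the exact warning category is an assumption):

```python
import warnings

from torch.nn.utils import clip_grad_norm_  # the renamed in-place function


def clip_grad_norm(parameters, max_norm, norm_type=2):
    """Deprecated alias kept for backward compatibility; use clip_grad_norm_ instead."""
    warnings.warn("torch.nn.utils.clip_grad_norm is now deprecated in favor "
                  "of torch.nn.utils.clip_grad_norm_.",
                  stacklevel=2)
    return clip_grad_norm_(parameters, max_norm, norm_type)
```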
The tests are stuck even though their console output shows them being done.
LGTM, but @apaszke might want to take an extra look.
Sounds good @ssnl, and thanks for reviewing these changes.
Almost good to go! Three minor things that could be improved.
test/test_nn.py
Outdated
    grads = torch.arange(-50, 50).view(10, 10).div(5), torch.ones(10).mul(2)
    for p, g in zip(l.parameters(), grads):
        p._grad = Variable(g.clone().view_as(p.data))
torch/nn/utils/clip_grad.py
Outdated
            The gradients are clipped in the range [-clip_value, clip_value]
    """
    clip_value = float(clip_value)
    for p in list(filter(lambda p: p.grad is not None, parameters)):
Hi @tonybeltramelli, are you planning to finish this PR soon? If not, I can fix the minor things for you :)
@ssnl sorry for the delay! I just pushed these minor fixes.
torch/nn/utils/clip_grad.py
Outdated
        Total norm of the parameters (viewed as a single vector).
    """
-   parameters = list(filter(lambda p: p.grad is not None, parameters))
+   parameters = filter(lambda p: p.grad is not None, parameters)
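One note on this hunk, independent of the hidden comments: in Python 3, filter returns a one-shot iterator, so if the surviving parameters are traversed more than once (for example, once to accumulate the total norm and once to rescale the gradients), the result has to be materialized first. A small standalone illustration of the pitfall:

```python
params = [0, 1, 2, 3]

it = filter(lambda x: x > 1, params)
first_pass = list(it)    # [2, 3]
second_pass = list(it)   # [] -- the iterator is already exhausted

safe = list(filter(lambda x: x > 1, params))
assert list(safe) == [2, 3]
assert list(safe) == [2, 3]  # a list can be iterated any number of times
```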
test/test_nn.py
Outdated
    grads = torch.arange(-50, 50).view(10, 10).div(5), torch.ones(10).mul(2)
    for p, g in zip(l.parameters(), grads):
        p._grad = Variable(g.clone().view_as(p.data))
    clip_grad_value_(l.parameters(), clip_value)
    for p in filter(lambda p: p.grad is not None, l.parameters()):
        self.assertLessEqual(p.grad.data.max(), clip_value)
        self.assertGreaterEqual(p.grad.data.min(), -clip_value)
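Putting the quoted fragments together, a self-contained version of the value-clipping test might read as the sketch below. It is assembled from the hunks above, written without the Variable wrapper the reviewers want removed, and uses an assumed clip_value of 2.5, so it is not the exact code that was merged:

```python
import torch
import torch.nn as nn
from torch.nn.utils import clip_grad_value_


def test_clip_grad_value():
    l = nn.Linear(10, 10)
    clip_value = 2.5  # assumed threshold; the real test's value is not shown in this thread

    # Gradients that deliberately exceed the clip range on both sides.
    grads = torch.arange(-50., 50.).view(10, 10).div_(5), torch.ones(10).mul_(2)
    for p, g in zip(l.parameters(), grads):
        p.grad = g.clone().view_as(p)

    clip_grad_value_(l.parameters(), clip_value)

    for p in filter(lambda p: p.grad is not None, l.parameters()):
        assert p.grad.max().item() <= clip_value
        assert p.grad.min().item() >= -clip_value
```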
@tonybeltramelli the code looks good, but I'd really like to get rid of the Variable wrapper.
@apaszke doesn't want Variable in the commit.
If we can merge this, I'll remove the Variable wrapper in #6641.
@ssnl is planning to fix the Variable problem in a codemod, so this is OK to go in.
@apaszke Thanks, and sorry about that. @ssnl and @ezyang, thank you, and sorry for my slow response time this week!
@tonybeltramelli No worries. The tests are that way because we haven't gotten around to updating them (fully). It was already quite some work for me to update part of those in #6641 ...
@ssnl Makes total sense, PyTorch is becoming a beast! :)
My local tests on a fresh build are failing; I wonder how CI is passing.
@ngimel My local test script imports fine. Do you have prior binary installs that were not properly cleaned?
Ah, right, I think it's a prior install that I forgot to clean. Sorry for the noise.
Introducing two updates.

1. Add param to He initialization scheme in torch.nn.init
Problem solved: The function calculate_gain can take an argument to specify the type of non-linearity used. However, it wasn't possible to pass this argument directly to the He / Kaiming weight initialization function.

2. Add util to clip gradient value in torch.nn.utils.clip_grad
Problem solved: DL libraries typically provide users with easy access to functions for clipping the gradients both using the norm and a fixed value. However, clip_grad.py only had a function to clip the gradient norm.

* add param to He initialization scheme in torch.nn.init
* add util to clip gradient value in torch/nn/utils/clip_grad.py
* update doc in torch.nn.utils.clip_grad
* update and add test for torch.nn.utils.clip_grad
* update function signature in torch.nn.utils.clip_grad to match suffix_ convention
* ensure backward compatibility in torch.nn.utils.clip_grad
* remove DeprecationWarning in torch.nn.utils.clip_grad
* extend test and implementation of torch.nn.utils.clip_grad
* update test and implementation of torch.nn.utils.clip_grad
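To make the summary above concrete, here is a short usage sketch of the two additions: passing the non-linearity to the He/Kaiming initializer and clipping gradients element-wise by value. The function names follow the current torch.nn API and may differ slightly from the names in use at the time of this PR:

```python
import torch
import torch.nn as nn
from torch.nn import init
from torch.nn.utils import clip_grad_value_

model = nn.Linear(128, 64)

# 1. He/Kaiming initialization, telling the initializer which non-linearity
#    follows the layer so that the appropriate gain is applied internally.
init.kaiming_uniform_(model.weight, nonlinearity='relu')
init.zeros_(model.bias)

# 2. After the backward pass, clip every gradient element to [-1.0, 1.0].
loss = model(torch.randn(32, 128)).sum()
loss.backward()
clip_grad_value_(model.parameters(), clip_value=1.0)
```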