Handle None gradients in nn.utils.clip_grad_norm #5650

Closed
monajalal opened this issue Mar 8, 2018 · 7 comments

monajalal commented Mar 8, 2018

I get this error:

python train.py --batch-size 20 --rnn_type GRU --cuda --gpu 1 --lr 0.0001 --mdl RNN --clip_norm 1 --opt Adam
/scratch/sjn-p2/anaconda/anaconda2/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
There are 2 CUDA devices
Setting torch GPU to 1
Using device:1 
Stored Environment:['term_len', 'word_index', 'glove', 'max_len', 'train', 'dev', 'test', 'index_word']
Loaded environment
Creating Model...
Setting Pretrained Embeddings
Initialized GRU model
Starting training
Namespace(aggregation='mean', attention_width=5, batch_size=20, clip_norm=1, cuda=True, dataset='Restaurants', dev=1, dropout_prob=0.5, embedding_size=300, epochs=50, eval=1, gpu=1, hidden_layer_size=300, l2_reg=0.0, learn_rate=0.0001, log=1, maxlen=0, mode='term', model_type='RNN', opt='Adam', pretrained=1, rnn_direction='uni', rnn_layers=1, rnn_size=300, rnn_type='GRU', seed=1111, term_model='mean', toy=False, trainable=1)
/scratch2/debate_tweets/sentiment/pytorch_sentiment_rnn/models/rnn.py:51: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  decoded = self.softmax(decoded)
Traceback (most recent call last):
  File "train.py", line 343, in <module>
    exp.train()
  File "train.py", line 326, in train
    loss = self.train_batch(i)
  File "train.py", line 303, in train_batch
    coeff = clip_gradient(self.mdl, self.args.clip_norm)
  File "train.py", line 35, in clip_gradient
    modulenorm = p.grad.data.norm()
AttributeError: 'NoneType' object has no attribute 'data'
[jalal@goku pytorch_sentiment_rnn]$ 


This is for the train.py file in https://github.com/vanzytay/pytorch_sentiment_rnn. I have followed all the steps in the README up to this point. What do you think should be fixed?

When submitting a bug report, please include the following information (where relevant):

  • OS: CentOS Linux release 7.4.1708 (Core)
  • PyTorch version: 0.3.1.post2
  • How you installed PyTorch (conda, pip, source): conda install -c pytorch pytorch
  • Python version: Python 2.7.14 |Anaconda custom (64-bit)| (default, Dec 7 2017, 17:05:42)
  • CUDA/cuDNN version: CUDA Version 8.0.61
  • GPU models and configuration: GP102 [GeForce GTX 1080 Ti], driver=nvidia latency=0
  • GCC version (if compiling from source): [GCC 7.2.0] on linux2

zou3519 (Contributor) commented Mar 8, 2018

What this says is that p.grad is None. It's possible that p (whatever it is) wasn't used in the gradient computation, or that no backward pass was run.
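
For illustration, here is a minimal sketch (not from the thread, written against a recent PyTorch API) of how a parameter ends up with a None grad when it doesn't participate in the loss:

    import torch
    import torch.nn as nn

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.used = nn.Linear(4, 1)
            self.unused = nn.Linear(4, 1)  # never called in forward()

        def forward(self, x):
            return self.used(x)

    net = Net()
    net(torch.randn(2, 4)).sum().backward()

    print(net.used.weight.grad is None)    # False: populated by backward()
    print(net.unused.weight.grad is None)  # True: p.grad.data here raises AttributeError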

monajalal (Author) commented:

Well, I understand that; however, this seems like a problem with PyTorch, since people who used the repo following the provided commands didn't hit this error. It's possibly a recent change in PyTorch.

zou3519 (Contributor) commented Mar 9, 2018

The last commit in that repo is from Jan 24, 2017. PyTorch has definitely changed a lot since then. If you have specific questions about how to use PyTorch, please ask on our forums: https://discuss.pytorch.org/

soumith (Member) commented Mar 9, 2018

Closed via @zou3519's comment.

soumith closed this as completed Mar 9, 2018
apaszke reopened this Mar 10, 2018
apaszke (Contributor) commented Mar 10, 2018

I think the error is still legitimate. We should handle None .grad attributes correctly in clip_grad_norm (by treating their norm as 0). Right now we fail with the posted stack trace.
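
As a hypothetical sketch of that behavior (not the actual PyTorch source), parameters whose .grad is None would simply be skipped, which is the same as counting their norm as 0:

    def clip_grad_norm_none_safe(parameters, max_norm, norm_type=2):
        # Sketch only: skip parameters whose .grad is None instead of
        # raising AttributeError; a missing grad contributes norm 0.
        grads = [p.grad.data for p in parameters if p.grad is not None]
        total_norm = 0.0
        for g in grads:
            total_norm += g.norm(norm_type) ** norm_type
        total_norm = total_norm ** (1.0 / norm_type)
        clip_coef = float(max_norm) / (total_norm + 1e-6)
        if clip_coef < 1:
            for g in grads:
                g.mul_(clip_coef)
        return total_norm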

apaszke changed the title from "modulenorm = p.grad.data.norm() AttributeError: 'NoneType' object has no attribute 'data'" to "Handle None gradients in nn.utils.clip_grad_norm" on Mar 10, 2018
zou3519 (Contributor) commented Mar 12, 2018

@apaszke I think we do handle None .grad attributes correctly in clip_grad_norm: https://github.com/pytorch/pytorch/blob/master/torch/nn/utils/clip_grad.py#L18.

The traceback @monajalal posted shows that the code uses its own clip_gradient function:

Traceback (most recent call last):
  File "train.py", line 343, in <module>
    exp.train()
  File "train.py", line 326, in train
    loss = self.train_batch(i)
  File "train.py", line 303, in train_batch
    coeff = clip_gradient(self.mdl, self.args.clip_norm)
  File "train.py", line 35, in clip_gradient
    modulenorm = p.grad.data.norm()

@monajalal If you replace clip_gradient with torch.nn.utils.clip_grad_norm, this particular error should go away.
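
Concretely, something along these lines in train.py (a sketch; self.mdl and self.args.clip_norm are the names from the posted traceback):

    from torch.nn.utils import clip_grad_norm

    # before: coeff = clip_gradient(self.mdl, self.args.clip_norm)
    # clip_grad_norm skips parameters whose grad is None, so the
    # AttributeError goes away; it returns the total norm of the grads.
    total_norm = clip_grad_norm(self.mdl.parameters(), self.args.clip_norm)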

soumith closed this as completed Mar 12, 2018