Bug in new autograd backward (with LSTM Cell) #1450
This is how I implement the decoder of a sequence-to-sequence model:

```python
import torch
from torch import nn
from torch.autograd import Variable
from torch.nn import functional as F

def decoder(input_, embedding, lstm, projection, states):
    """Unroll the LSTM Cell, return the flattened logits."""
    emb = embedding(input_.t())
    hs = []
    for i in range(input_.size(1)):
        h, c = lstm(emb[i], states)
        hs.append(h)
        states = (h, c)
    lstm_out = torch.stack(hs, dim=0)
    logit = projection(lstm_out.contiguous().view(-1, lstm.hidden_size))
    return logit

embedding = nn.Embedding(4, 64, padding_idx=0).cuda()
lstm = nn.LSTMCell(64, 64).cuda()
projection = nn.Linear(64, 4).cuda()
input_ = Variable(torch.LongTensor([[1, 2, 3], [3, 2, 1]])).cuda()
states = (Variable(torch.zeros(2, 64)).cuda(),
          Variable(torch.zeros(2, 64)).cuda())
target = Variable(torch.LongTensor([[3, 2, 1], [2, 3, 1]])).cuda()

logit = decoder(input_, embedding, lstm, projection, states)
loss = F.cross_entropy(logit, target.t().contiguous().view(-1))
loss.backward()
# RuntimeError: No grad accumulator for a saved leaf!
```
I'm not sure about the new autograd mechanics, but this code worked in the previous version.
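The error message mentions a "saved leaf", which suggests the problem may be related to how the new autograd distinguishes leaf Variables (created directly by the user) from non-leaf ones (produced by any autograd op, including a device transfer like `.cuda()`). A minimal CPU-only sketch of that distinction, purely illustrative and not a fix for the bug above:

```python
import torch
from torch.autograd import Variable

# A leaf Variable: created directly by the user, has no grad_fn.
x = Variable(torch.zeros(3), requires_grad=True)

# Any autograd op produces a *non-leaf* Variable with a grad_fn;
# calling .cuda() on a Variable behaves the same way, so the
# on-GPU copy is a non-leaf while the CPU original stays the leaf.
y = x * 2

print(x.is_leaf, y.is_leaf)  # True False
```

If this is the mechanism at play, then in the script above the `.cuda()` copies of the state Variables are non-leaves, which may interact badly with what backward tries to save.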