In the time_sequence_prediction example, the two-layer LSTM is defined and used as follows:
def forward(self, input, future = 0):
    outputs = []
    h_t = Variable(torch.zeros(input.size(0), 51).double(), requires_grad=False)
    c_t = Variable(torch.zeros(input.size(0), 51).double(), requires_grad=False)
    h_t2 = Variable(torch.zeros(input.size(0), 1).double(), requires_grad=False)
    c_t2 = Variable(torch.zeros(input.size(0), 1).double(), requires_grad=False)

    for i, input_t in enumerate(input.chunk(input.size(1), dim=1)):
        h_t, c_t = self.lstm1(input_t, (h_t, c_t))
        h_t2, c_t2 = self.lstm2(c_t, (h_t2, c_t2))
        outputs += [c_t2]
    for i in range(future):  # if we should predict the future
        h_t, c_t = self.lstm1(c_t2, (h_t, c_t))
        h_t2, c_t2 = self.lstm2(c_t, (h_t2, c_t2))
        outputs += [c_t2]
    outputs = torch.stack(outputs, 1).squeeze(2)
    return outputs
Notice that the cell state, not the hidden state, of the 1st layer is used as the input to the 2nd layer. Is that correct? My understanding of stacked LSTMs is that the hidden states of the lower layers are the inputs to the higher layers. Am I missing something? Also, the examples from TensorFlow, Chainer, and Theano all use the hidden state, not the cell state, as the input.
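For comparison, here is a minimal sketch of what the conventional stacked-LSTM wiring would look like, where the hidden state h_t of the first cell feeds the second cell (and the second cell's hidden state h_t2 is collected as the output). This is not the repository's code; the Sequence class name, the 51/1 hidden sizes, and the chunking over dim 1 are taken from the snippet above, and the .double() handling of the original is omitted for brevity.

import torch
import torch.nn as nn

class Sequence(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm1 = nn.LSTMCell(1, 51)
        self.lstm2 = nn.LSTMCell(51, 1)

    def forward(self, input, future=0):
        outputs = []
        h_t = torch.zeros(input.size(0), 51)
        c_t = torch.zeros(input.size(0), 51)
        h_t2 = torch.zeros(input.size(0), 1)
        c_t2 = torch.zeros(input.size(0), 1)

        for input_t in input.chunk(input.size(1), dim=1):
            h_t, c_t = self.lstm1(input_t, (h_t, c_t))
            # hidden state of layer 1 is the input to layer 2
            h_t2, c_t2 = self.lstm2(h_t, (h_t2, c_t2))
            outputs += [h_t2]
        for _ in range(future):  # optionally predict beyond the given sequence
            # feed the last prediction (layer 2 hidden state) back in
            h_t, c_t = self.lstm1(h_t2, (h_t, c_t))
            h_t2, c_t2 = self.lstm2(h_t, (h_t2, c_t2))
            outputs += [h_t2]
        return torch.stack(outputs, 1).squeeze(2)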