Hi,
I came across your paper while searching for seq2seq examples with an aligned one-to-one mapping from inputs to labels. I appreciate the clarity of the code and am grateful that you made it available.
I haven't been able to run and step through the code yet, but one thing I noticed that seems to differ from the paper is this line, where the initial hidden state for the decoder is extracted from the encoder (https://github.com/liah-chan/sequence2sequenceNER/blob/master/scripts/train.py#L92):
last_hidden = (encoder_hidden[0][1].unsqueeze(0), encoder_hidden[1][1].unsqueeze(0))
encoder_hidden is the tuple (hidden, cell), each with shape (num_layers * num_directions, batch, hidden_size). It seems that encoder_hidden[0][1] would therefore be the hidden state of the backward direction at word 0, whereas the paper states in §4.3 that the forward state at word = seq_length is the initial hidden state:

> On the decoder side, we use a single layer LSTM that generates label predictions step by step from the start to the end of the sentence. The last hidden state of the forward encoder RNN (→h_t) is used as the initial decoder hidden state.

Was your intent to use the forward or backward state for initialization? And do you think it would make a real difference, given that the decoder also receives the aligned forward/backward output states at each step?
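For concreteness, here is a minimal, runnable sketch of the indexing I mean (the model and tensor sizes are made up for illustration; only PyTorch's documented state layout for a bidirectional LSTM matters):

```python
import torch
import torch.nn as nn

# Toy 1-layer bidirectional LSTM encoder (sizes invented for illustration).
batch, seq_len, input_size, hidden_size = 4, 7, 16, 32
encoder = nn.LSTM(input_size, hidden_size, num_layers=1,
                  bidirectional=True, batch_first=True)

x = torch.randn(batch, seq_len, input_size)
outputs, (h_n, c_n) = encoder(x)

# h_n: (num_layers * num_directions, batch, hidden_size) = (2, 4, 32).
# Per the PyTorch docs, index 0 is the forward direction (its state after
# the last word) and index 1 is the backward direction (its state after
# word 0, since it reads the sentence right to left).
forward_last = h_n[0]   # the paper's →h_t
backward_last = h_n[1]  # what encoder_hidden[0][1] selects in train.py#L92

# Initializing the unidirectional decoder from the forward state would be:
last_hidden = (h_n[0].unsqueeze(0), c_n[0].unsqueeze(0))
```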
Thanks!