
Input-feeding Approach #33

Closed
TinaB19 opened this issue Jun 12, 2017 · 4 comments

TinaB19 commented Jun 12, 2017

Thank you very much for the awesome work. I need a clarification about the decoder part of seq2seq-translation.
# Combine embedded input word and last context, run through RNN
rnn_input = torch.cat((word_embedded, last_context.unsqueeze(0)), 2)

Is the above code an implementation of the input-feeding approach from the paper "Effective Approaches to Attention-based Neural Machine Translation"?

spro (Owner) commented Jun 12, 2017

Yes it is, though looking back at it, I'm missing one layer between the context vector c_t and the softmax layer that creates the "attentional hidden state" ~h_t, which is what they use for input feeding.
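
For concreteness, here is a minimal runnable sketch of that missing layer, assuming rnn_output is the decoder GRU output h_t and context is the attention context c_t; the layer names and sizes are only illustrative, not the tutorial's actual code:

import torch
import torch.nn as nn
import torch.nn.functional as F

hidden_size, output_size, batch_size = 256, 10000, 4
rnn_output = torch.randn(batch_size, hidden_size)  # decoder GRU output h_t
context = torch.randn(batch_size, hidden_size)     # attention context c_t

concat = nn.Linear(hidden_size * 2, hidden_size)   # W_c
out = nn.Linear(hidden_size, output_size)          # W_s

# attentional hidden state ~h_t = tanh(W_c [c_t ; h_t])  (Luong et al., 2015)
attentional_hidden = torch.tanh(concat(torch.cat((rnn_output, context), 1)))
# the prediction is made from ~h_t, and ~h_t is also the vector that
# input feeding concatenates with the next input word embedding
output = F.log_softmax(out(attentional_hidden), dim=1)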

TinaB19 (Author) commented Jun 12, 2017

It would be great if you could add it to the tutorial later. Thank you very much.

TinaB19 (Author) commented Jun 13, 2017

I just saw seq2seq-translation-batched, which has:

concat_input = torch.cat((rnn_output, context), 1)
concat_output = F.tanh(self.concat(concat_input))

So I guess in this case we should concatenate concat_output with embedded at the next time step and then feed that into the gru. Is this correct?

spro (Owner) commented Jun 14, 2017

Correct.
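
For anyone reading along, here is a minimal sketch of one decoder time step with input feeding. It reuses the variable names from the batched tutorial (embedded, context, concat_output), but the module and argument names are only illustrative, and attn is an assumed attention module that returns a [batch, hidden] context vector:

import torch
import torch.nn as nn
import torch.nn.functional as F

class InputFeedingDecoderStep(nn.Module):
    def __init__(self, hidden_size, output_size):
        super().__init__()
        self.embedding = nn.Embedding(output_size, hidden_size)
        # the GRU input is the word embedding concatenated with ~h_{t-1}
        self.gru = nn.GRU(hidden_size * 2, hidden_size)
        self.concat = nn.Linear(hidden_size * 2, hidden_size)
        self.out = nn.Linear(hidden_size, output_size)

    def forward(self, input_token, last_concat_output, last_hidden, encoder_outputs, attn):
        embedded = self.embedding(input_token).unsqueeze(0)            # [1, batch, hidden]
        # input feeding: concatenate the previous attentional hidden state ~h_{t-1}
        rnn_input = torch.cat((embedded, last_concat_output.unsqueeze(0)), 2)
        rnn_output, hidden = self.gru(rnn_input, last_hidden)
        rnn_output = rnn_output.squeeze(0)                             # [batch, hidden]
        context = attn(rnn_output, encoder_outputs)                    # [batch, hidden]
        # attentional hidden state ~h_t = tanh(W_c [c_t ; h_t])
        concat_output = torch.tanh(self.concat(torch.cat((rnn_output, context), 1)))
        output = F.log_softmax(self.out(concat_output), dim=1)
        # concat_output (~h_t) is returned so it can be fed back in at the next step
        return output, concat_output, hidden

The classes in the tutorial differ in their details, so treat this only as an illustration of how concat_output flows back into the GRU input at the next time step.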

spro closed this as completed Jun 19, 2017