#Sequence to Sequence Learning with Neural Networks - NIPS 2014
###Written by Mingdong
##Task
Machine translation from English to French.
##Method
###Overview
The approach is probabilistic at its core: a stacked (4-layer) LSTM is used to parameterize the conditional probability of the output sequence given the input sequence.
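In the paper's formulation, the encoder LSTM compresses the input sequence $(x_1, \dots, x_T)$ into a fixed-dimensional vector $v$ (its last hidden state), and the decoder LSTM factorizes the conditional probability as

$$
p(y_1, \dots, y_{T'} \mid x_1, \dots, x_T) = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \dots, y_{t-1}),
$$

where each factor is a softmax over the output vocabulary.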
###Stacked LSTM
As shown in Figure 1, the LSTM first reads in the input sequence followed by <EOS>; it then starts emitting the translated sequence until it predicts another <EOS>.
Note that each unit shown here is actually 4 LSTM units stacked vertically (refer to the author's talk for more details).
The input words are embedding vectors, and the output words are predicted by a very large softmax: a 1000 × 80,000 projection from the 1000-dimensional hidden state onto the 80,000-word output vocabulary.
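A minimal PyTorch sketch of this encoder-decoder structure (an illustrative reconstruction, not the authors' implementation; the 4 layers and the 1000 / 80,000 / 160,000 sizes follow the paper, while the class and variable names are my own):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab=160000, tgt_vocab=80000,
                 emb_dim=1000, hid_dim=1000, layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # Encoder and decoder are two separate stacked LSTMs.
        self.encoder = nn.LSTM(emb_dim, hid_dim, num_layers=layers, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, num_layers=layers, batch_first=True)
        # The "super huge softmax": a 1000 x 80000 output projection.
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        # src: (batch, src_len) token ids, already reversed (see Tricks below)
        # tgt: (batch, tgt_len) token ids, shifted right for teacher forcing
        _, state = self.encoder(self.src_emb(src))   # keep only the final (h, c)
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)                     # (batch, tgt_len, tgt_vocab) logits
```

At inference time the decoder is fed its own previous prediction until it emits <EOS>; the paper does this with a simple left-to-right beam search.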
##Contribution
Improves performance over the encoder-decoder work of Cho et al. and Bahdanau et al. by using a stacked LSTM.
###Tricks
- Reverse the input sequence: this reduces the long-term dependency problem, because the first source words end up close to the first target words (see the sketch below).
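As a sketch, the reversal is a one-line preprocessing step (a hypothetical helper, not from the paper's code):

```python
def preprocess(src_tokens, tgt_tokens):
    # Reverse only the source: "a b c" -> "c b a", so the first source
    # words sit close to the first target words, shortening many of the
    # input-output dependencies the LSTM has to bridge.
    return src_tokens[::-1] + ["<EOS>"], tgt_tokens + ["<EOS>"]
```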
##Drawback
It does not use more sophisticated techniques such as attention, so it has to rely on more parameters (the stacked LSTM).
The stacked LSTM and the large softmax make training very time-consuming: 8 GPUs for 10 days.
##Reference
The author's talk is very informative.
##Cite
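```bibtex
@inproceedings{sutskever2014sequence,
  title     = {Sequence to Sequence Learning with Neural Networks},
  author    = {Sutskever, Ilya and Vinyals, Oriol and Le, Quoc V.},
  booktitle = {Advances in Neural Information Processing Systems 27 (NIPS 2014)},
  year      = {2014}
}
```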