# Sequence to Sequence Learning with Neural Networks - NIPS 2014

### Written by Mingdong

## Task

Machine translation from English to French.

## Method

### Overview

The paper treats translation as estimating a conditional probability: a stacked LSTM (4 layers) is used to parameterize the probability of the target sentence given the source sentence.
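Concretely, the target sentence probability is factorized with the chain rule, with every target word conditioned on the fixed-size vector `v` produced after reading the input (this is the formulation in the paper):

```latex
p(y_1, \dots, y_{T'} \mid x_1, \dots, x_T) = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \dots, y_{t-1})
```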

### Stacked LSTM

*Figure 1 (from the paper): the LSTM encoder-decoder.*

As shown in Figure 1, the LSTM first reads the input sequence followed by `<EOS>`. It then starts emitting the translated sequence, one word at a time, until it predicts another `<EOS>`.

Note that each unit in the figure actually stands for 4 vertically stacked LSTM layers (refer to the author's talk for more details).

The input words are mapped to embedding vectors, and the output words are predicted by a very large softmax layer (1000 hidden units × 80,000 target words), as sketched below.
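A minimal PyTorch sketch of this encoder-decoder, not the authors' implementation: layer and vocabulary sizes follow the numbers reported in the paper (1000-dimensional states and embeddings, 4 layers, 160,000-word source and 80,000-word target vocabularies), while the class and variable names (`Seq2Seq`, `src_emb`, etc.) are purely illustrative.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab=160_000, tgt_vocab=80_000,
                 emb=1000, hidden=1000, layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        # 4-layer stacked LSTMs for the encoder and the decoder.
        self.encoder = nn.LSTM(emb, hidden, num_layers=layers)
        self.decoder = nn.LSTM(emb, hidden, num_layers=layers)
        # The "super huge softmax": hidden (1000) x target vocab (80,000) weights.
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # src, tgt: (seq_len, batch) tensors of token ids; src ends with <EOS>
        # and is assumed to be already reversed (see the trick below).
        _, state = self.encoder(self.src_emb(src))            # encode the source
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)   # condition on the final encoder state
        return self.out(dec_out)                              # logits over the target vocabulary
```

Training would then maximize the log-likelihood of the reference translation with a cross-entropy loss over these logits, and decoding would feed the model's own predictions back in until `<EOS>` is produced.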

## Contribution

Improves translation performance by using a stacked LSTM, compared to the work of Cho et al. and Bahdanau et al.

### Tricks

- Reverse the input sequence: this shortens the distance between corresponding source and target words, easing the long-term dependency problem (toy snippet below).
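A toy illustration of the trick with made-up tokens: only the source sentence is reversed before encoding; the target side is left unchanged, so the first source words end up close to the first target words the decoder must produce.

```python
# Toy example of the input-reversal trick (illustrative tokens, not real data).
src = ["the", "cat", "sat"]              # source sentence
tgt = ["le", "chat", "s'est", "assis"]   # target sentence stays in its original order
src_reversed = list(reversed(src))       # ["sat", "cat", "the"] is what the encoder reads
```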

## Drawback

It does not use more sophisticated techniques such as attention, so it has to rely on more parameters (the stacked LSTM) to squeeze the whole source sentence into a single fixed-size vector.

The stacked LSTM and the large softmax make training very time consuming: 8 GPUs for about 10 days.

## Reference

The author's talk is very informative.

## Cite

Bibtex