# Recurrent Neural Networks

Traditional (feedforward) artificial neural networks have only a form of "long-term memory": what they learn is stored in their weights during training. Recurrent neural networks (RNNs) add a "short-term memory". By connecting the hidden layer of the previous observation to the hidden layer of the current observation, we allow the network to carry knowledge of earlier time steps forward.

<img src='resources/rnn.png' width=800>
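
To make the recurrent connection concrete, here is a minimal NumPy sketch of a vanilla RNN forward pass. The names `rnn_forward`, `W_xh`, `W_hh`, and `b_h`, the `tanh` activation, and the dimensions are illustrative assumptions, not taken from a specific library.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h, h0=None):
    """Run a vanilla RNN over a sequence xs of shape (T, input_dim).

    Returns the hidden state at every time step, shape (T, hidden_dim).
    """
    hidden_dim = W_hh.shape[0]
    h = np.zeros(hidden_dim) if h0 is None else h0
    hs = []
    for x in xs:
        # The previous hidden state h enters the update for the current step:
        # this recurrent term is the network's "short-term memory".
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        hs.append(h)
    return np.stack(hs)

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
T, input_dim, hidden_dim = 5, 3, 4
xs = rng.normal(size=(T, input_dim))
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

hs = rnn_forward(xs, W_xh, W_hh, b_h)
print(hs.shape)  # (5, 4)
```

Because each hidden state depends on the one before it, changing the first input changes every later hidden state, which a feedforward network applied independently to each observation could not do.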

A task displays long-term dependencies if predicting the desired output at time $t$ depends on input presented at a much earlier time $\tau \ll t$. Bengio *et al*. (1994) proposed that a dynamical system that can learn to store relevant state information must satisfy the following:

1. Ability to store information for an arbitrary duration (information latching).
2. Resistance to noise.
3. Ability to train parameters in reasonable time.

Traditional backpropagation in RNNs is not sufficiently powerful to discover contingencies spanning long temporal intervals: the parameters settle in sub-optimal solutions that capture short-term dependencies but not long-term ones. This is mainly due to the vanishing and exploding gradient problems, in which the gradient shrinks or grows exponentially in magnitude as it is propagated back through the time steps.
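
The exponential behaviour can be illustrated with a simplified sketch. It assumes a linear recurrence (the activation derivative is ignored), so each backward step multiplies the gradient by the transpose of the recurrent matrix. The helper `backprop_norms` and the scaled orthogonal matrix are hypothetical choices made so that the effect is easy to read off.

```python
import numpy as np

def backprop_norms(W_hh, steps, rng):
    """Norm of a unit gradient vector after each backward step
    through a linear recurrence h_t = W_hh @ h_{t-1}."""
    grad = rng.normal(size=W_hh.shape[0])
    grad /= np.linalg.norm(grad)
    norms = []
    for _ in range(steps):
        grad = W_hh.T @ grad  # one step back in time
        norms.append(np.linalg.norm(grad))
    return np.array(norms)

rng = np.random.default_rng(0)
# Orthogonal matrix: every singular value is 1, so scaling it by s
# changes the gradient norm by exactly a factor s per step.
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))

vanishing = backprop_norms(0.9 * Q, steps=50, rng=rng)
exploding = backprop_norms(1.1 * Q, steps=50, rng=rng)

print(vanishing[-1])  # about 0.9**50, roughly 0.005
print(exploding[-1])  # about 1.1**50, roughly 117
```

In a real RNN the derivative of the activation also enters each factor. For `tanh` that derivative is at most 1, which pushes further towards vanishing gradients.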

## Types of RNN

* One-to-Many - image captioning
* Many-to-One - sentiment analysis
* Many-to-Many - neural machine translation, video captioning

<img src='resources/types.png' width=600>
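
The three configurations differ only in where inputs are fed in and where outputs are read out. The sketch below shows the resulting shapes with a toy RNN cell. All names and sizes are hypothetical, the one-to-many variant assumes a zero input after the first step, and the many-to-many variant is the synchronized form with one output per input. Translation systems typically use an encoder-decoder arrangement in which input and output lengths differ.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim = 3, 4, 2
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
W_hy = rng.normal(scale=0.5, size=(output_dim, hidden_dim))

def step(x, h):
    """One RNN step: returns the new hidden state and the output."""
    h = np.tanh(W_xh @ x + W_hh @ h)
    return h, W_hy @ h

def many_to_many(xs):
    """One output per input step."""
    h, ys = np.zeros(hidden_dim), []
    for x in xs:
        h, y = step(x, h)
        ys.append(y)
    return np.stack(ys)

def many_to_one(xs):
    """Read the whole sequence, keep only the final output."""
    return many_to_many(xs)[-1]

def one_to_many(x, steps):
    """Feed a single input at the first step, then keep unrolling."""
    h, ys = np.zeros(hidden_dim), []
    for t in range(steps):
        h, y = step(x if t == 0 else np.zeros(input_dim), h)
        ys.append(y)
    return np.stack(ys)

xs = rng.normal(size=(6, input_dim))
print(one_to_many(xs[0], steps=5).shape)  # (5, 2)
print(many_to_one(xs).shape)              # (2,)
print(many_to_many(xs).shape)             # (6, 2)
```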


## References

* Bengio, Y., Simard, P., and Frasconi, P. (1994). [Learning Long-Term Dependencies with Gradient Descent is Difficult](http://ai.dinfo.unifi.it/paolo//ps/tnn-94-gradient.pdf).
* Pascanu, R., Mikolov, T., and Bengio, Y. (2013). [On the difficulty of training recurrent neural networks](http://proceedings.mlr.press/v28/pascanu13.pdf).
* Hochreiter, S., and Schmidhuber, J. (1997). [Long Short-Term Memory](http://www.bioinf.jku.at/publications/older/2604.pdf).
* Olah, C. (2015). [Understanding LSTM Networks](http://colah.github.io/posts/2015-08-Understanding-LSTMs/).
* Yan, S. (2016). [Understanding LSTM and its diagrams](https://medium.com/mlreview/understanding-lstm-and-its-diagrams-37e2f46f1714).
* Karpathy, A. (2015). [The Unreasonable Effectiveness of Recurrent Neural Networks](http://karpathy.github.io/2015/05/21/rnn-effectiveness/).
* Karpathy, A., Johnson, J., and Fei-Fei, L. (2015). [Visualizing and Understanding Recurrent Networks](https://arxiv.org/pdf/1506.02078.pdf).
* Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R., and Schmidhuber, J. (2015). [LSTM: A Search Space Odyssey](https://arxiv.org/pdf/1503.04069.pdf).