# Recurrent Neural Networks

Recurrent neural networks (RNNs) work much better than feed-forward networks on **sequential** data such as financial series, text and audio.

#### Motivation
* Many supervised learning problems deal with ordered sequences (e.g. predicting stock prices)
    * `input`: ordered sequence of past series values
    * `output`: ordered sequence of future series values
    
**Resources:**
* [academic RNN text generator](http://www.cs.toronto.edu/~ilya/rnn.html)
* [twitter bots](http://tweet-generator-alex.herokuapp.com/)
* [NaNoGenMo](https://github.com/NaNoGenMo/2016)
* [Robot Shakespeare](https://github.com/genekogan/RobotShakespeare)

## 1. Recursive Sequences

* **Ordered sequence**: a list of values ordered by index, `(S1, S2, S3, ..., SP)`
    * the index can be a time stamp or any other ordering of the sequence
    
* Many real ordered sequences are a product of some underlying process or processes
* But underlying process(es) can be a black box, with no model that explains the data
* Instead we broadly model ordered sequences **recursively**:
    * use past values in a sequence to predict future ones
    * we model future values of a sequence mathematically in terms of its predecessors

* **seed**: initial value(s) of a recursive sequence
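* **order**: number of preceding elements that each future value depends on
* **folded and unfolded view**: two ways of writing a recursion, either as a single formula that refers to itself (folded) or expanded level by level (unfolded)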

![](Screen Shot 2017-08-21 at 08.19.33.png)
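
The terms above can be made concrete with a minimal sketch in plain Python. The helper name `generate_recursive` and the two example formulas are illustrative assumptions, not part of the original notes: the seed supplies the initial values, its length is the order, and the loop is the unfolded view of the recursion.

```python
def generate_recursive(seed, step, length):
    """Generate `length` values of a recursive sequence.

    seed:   list of initial values; its length is the order of the recursion
    step:   function mapping the last `order` values to the next value
    length: total number of values to return, seed included
    """
    order = len(seed)
    seq = list(seed)
    while len(seq) < length:
        # Unfolded view: apply the same formula once per level.
        seq.append(step(seq[-order:]))
    return seq[:length]


# Order 1: each value is the previous value plus 2, seeded with 1.
odds = generate_recursive([1], lambda prev: prev[0] + 2, 7)

# Order 2: each value is the sum of the two previous values.
fib = generate_recursive([1, 1], lambda prev: prev[0] + prev[1], 8)

print(odds)  # [1, 3, 5, 7, 9, 11, 13]
print(fib)   # [1, 1, 2, 3, 5, 8, 13, 21]
```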

* **driver**: the input sequence that drives a recursive sequence (e.g. stock market data)
* **hidden sequence**: the sequence being driven; its values are generated recursively from the driver
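
A small sketch of a driven recursion, assuming the simplest case where each hidden value depends on the previous hidden value and the current driver value. The function name `drive_hidden_sequence` and the running-sum update rule are hypothetical, chosen only for illustration.

```python
def drive_hidden_sequence(driver, seed, step):
    """Generate a hidden sequence from a driver sequence.

    Each hidden value is computed from the previous hidden value
    and the current driver value: h_t = step(h_{t-1}, x_t).
    """
    hidden = []
    prev = seed
    for x in driver:
        prev = step(prev, x)
        hidden.append(prev)
    return hidden


# Illustrative update rule: the hidden sequence is the running sum of the driver.
driver = [1, 2, 3, 4]
running_sum = drive_hidden_sequence(driver, seed=0, step=lambda h, x: h + x)

print(running_sum)  # [1, 3, 6, 10]
```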

___
* Given a sequence, how do we model it as recursive so that we can make meaningful predictions?
* How can we inject this structural assumption into a supervised learner?

## 2. Feed Forward Networks
Feed-forward networks are straightforward to implement: we reverse engineer the recursive model from the data. They can perform reasonably well in certain circumstances.

![](Screen Shot 2017-08-21 at 09.38.41.png)

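```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Each target is the next value of the sequence: y[i] follows x[i].
x = np.array([1, 3, 5, 7, 9, 11, 13], dtype="float32")
y = np.array([3, 5, 7, 9, 11, 13, 15], dtype="float32")
```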

```python
# A single linear unit: one weight and one bias, i.e. s_next = w * s + b.
model = Sequential()
model.add(Dense(1, input_shape=(1,), activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x, y, epochs=3000, batch_size=3, verbose=0)
```

```python
model.predict(x)
```

```
array([[  2.98680735],
       [  4.99007607],
       [  6.99334431],
       [  8.99661255],
       [ 10.99988174],
       [ 13.00314999],
       [ 15.00641823]], dtype=float32)
```

```python
model.get_weights()
```

```
[array([[ 1.00163424]], dtype=float32), array([ 1.98517323], dtype=float32)]
```

### Points
* Given an ordered sequence we can make a recursive approximation to it:
    1. guess the architecture of the recursive formula
    2. tune the parameters of that architecture optimally, using the sequence itself

* can be used as a **generative model**
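
As a rough illustration of the generative use, the sketch below hard-codes the weight and bias printed by `model.get_weights()` above and feeds each prediction back in as the next input. The function names `predict_next` and `generate` are hypothetical, and the constants are copied from that one training run, so other runs will give slightly different values.

```python
# Weight and bias copied from the model.get_weights() output above.
WEIGHT = 1.00163424
BIAS = 1.98517323


def predict_next(value):
    """One level of the learned order-1 recursion: s_next = WEIGHT * s + BIAS."""
    return WEIGHT * value + BIAS


def generate(seed, n_steps):
    """Start from the seed and feed each prediction back in as the next input."""
    values = [seed]
    for _ in range(n_steps):
        values.append(predict_next(values[-1]))
    return values


generated = generate(seed=1, n_steps=6)
print([round(v, 2) for v in generated])
```

Because the learned weight is not exactly 1 and the bias is not exactly 2, the generated values drift slowly away from the true sequence `1, 3, 5, ...` as errors are fed back in.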

![](Screen Shot 2017-08-21 at 09.38.41.png)

Adding a ReLU layer and increasing the window size:
![](Screen Shot 2017-08-21 at 11.01.41.png)

Here we have a **window size** of 2

```python
# Window size 2: each input holds the two previous values of the sequence.
model = Sequential()
model.add(Dense(1, input_shape=(2,), activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
```

* Our ultimate goal is to model ordered sequences recursively
* Approach used here: solve for a recursive formula
    1. choose an architecture (order, functional form)
    2. break the recursion into levels
    3. window the sequence into input/output pairs
    4. minimize the loss
    5. use the trained regressor as a generative model
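
The windowing step can be sketched in plain Python as follows. The helper name `window_sequence` is an assumption for illustration; each input is a window of `window_size` consecutive values and the matching output is the value that follows the window.

```python
def window_sequence(seq, window_size):
    """Split a sequence into (input window, next value) pairs."""
    inputs, outputs = [], []
    for i in range(len(seq) - window_size):
        inputs.append(seq[i:i + window_size])   # the previous `window_size` values
        outputs.append(seq[i + window_size])    # the value to predict
    return inputs, outputs


seq = [1, 3, 5, 7, 9, 11, 13, 15]
X, y = window_sequence(seq, window_size=2)

print(X[:3])  # [[1, 3], [3, 5], [5, 7]]
print(y[:3])  # [5, 7, 9]
```

With a window size of 2, each row of `X` has two entries, which matches an input shape of `(2,)`.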

* Recursive formula --> recursive sequence (an approximation to the truth)
* Some applications require the standard train/test paradigm (e.g. long-term stock predictions)

With feed-forward networks we can model the recursion correctly, but we lose the dependence between levels completely once we start tuning parameters.
* Later levels of recursion are not dependent on earlier levels, as they should be.

## 3. RNNs
* FFNs fail because levels become independent,
    * so we need to enforce greater dependency across levels
    
We want to enforce dependency between consecutive levels.
* An RNN has memory because its state depends on the previous state, which depends on the state before it, and so on.
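
A minimal sketch of this state dependency, assuming the common simple-RNN update `h_t = tanh(w_hidden * h_{t-1} + w_input * x_t + bias)` with scalar weights. The update rule, the function name `rnn_states` and the weight values are assumptions for illustration; they are not taken from these notes and the weights are not trained.

```python
import math


def rnn_states(inputs, w_hidden, w_input, bias, h0=0.0):
    """Scalar simple-RNN recurrence: h_t = tanh(w_hidden * h_{t-1} + w_input * x_t + bias)."""
    states = []
    h = h0
    for x in inputs:
        # The new state depends on the current input AND the previous state.
        h = math.tanh(w_hidden * h + w_input * x + bias)
        states.append(h)
    return states


# Hypothetical weights, chosen only for illustration (not trained).
a = rnn_states([1.0, 0.0, 0.0], w_hidden=0.5, w_input=1.0, bias=0.0)
b = rnn_states([0.0, 0.0, 0.0], w_hidden=0.5, w_input=1.0, bias=0.0)

# The last input is 0.0 in both runs, but the final states differ
# because the first input is still carried forward in the state.
print(a[-1], b[-1])
```

A feed-forward network given only the final input would produce the same output for both runs; here the difference in the first input still shows up in the final state.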