<a href="https://colab.research.google.com/github/rahiakela/practical-machine-learning-with-tensorflow/blob/week-7/simple_recurrent_neural_networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Simple Recurrent Neural Networks

The process implemented naively in **NumPy** in this notebook corresponds to an actual **Keras** layer: the **SimpleRNN** layer.

In [0]:
import numpy as np

In [0]:
timesteps = 100              # Number of timesteps in the input sequence
input_features = 32          # Dimensionality of the input feature space
output_features = 64         # Dimensionality of the output feature space

Generate random input data, one feature vector per timestep:

In [4]:
inputs = np.random.random((timesteps, input_features))
inputs

array([[0.10850797, 0.59631578, 0.42225263, ..., 0.44182855, 0.23309342,
        0.81915178],
       [0.90275417, 0.16193396, 0.43976217, ..., 0.44817614, 0.72989648,
        0.12622022],
       [0.90640542, 0.27716227, 0.77275648, ..., 0.62738259, 0.0756512 ,
        0.70929186],
       ...,
       [0.45566999, 0.59696439, 0.76792123, ..., 0.35960914, 0.79664007,
        0.22563386],
       [0.92985983, 0.13164032, 0.5688466 , ..., 0.33605206, 0.80006412,
        0.61724405],
       [0.57641963, 0.82745562, 0.50072044, ..., 0.10151659, 0.68088018,
        0.49593261]])

The initial state is an all-zeros vector:

In [5]:
state_t = np.zeros((output_features,))
state_t

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

Initialize weights randomly

In [0]:
W = np.random.random((output_features, input_features))
U = np.random.random((output_features, output_features))
b = np.random.random((output_features,))

Let's implement the RNN:
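As a reading aid, the recurrence computed by the loop can be written in symbols. The names $x_t$ (for `input_t`) and $o_t$ (for `output_t`) are shorthand introduced here, not notation from the notebook; $W$, $U$ and $b$ are the weights defined above, and the state at each step is simply the previous output:

$$
o_t = \tanh\left(W x_t + U\, o_{t-1} + b\right), \qquad t = 1, \dots, T, \qquad o_0 = \mathbf{0}
$$

Here $T$ stands for `timesteps`, and $o_0$ is the all-zeros initial state.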

In [0]:
successive_outputs = []
for input_t in inputs:
  output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)     # Combines input with the current state to obtain the current output
  successive_outputs.append(output_t)
  state_t = output_t                                                  # Updates state of the network for the next timestep

# The final output is a 2D tensor of shape (timesteps, output_features).
# np.stack adds a new leading axis; np.concatenate would return a flat 1D array.
final_output_sequence = np.stack(successive_outputs, axis=0)
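A quick way to check the shape of the result is to wrap the loop in a function and inspect what it returns. The sketch below is only an illustration: the function name `simple_rnn_forward` and the seeded generator are assumptions made here for reproducibility, not part of the original notebook. It stacks the per-timestep outputs with `np.stack`, which adds a new leading axis, whereas `np.concatenate` would join the 1D vectors into one long 1D array of length `timesteps * output_features`.

```python
import numpy as np


def simple_rnn_forward(inputs, W, U, b):
    """Hypothetical helper: run the naive RNN loop over a (timesteps, features) array."""
    state_t = np.zeros((U.shape[0],))          # all-zeros initial state
    successive_outputs = []
    for input_t in inputs:
        # Combine the current input with the previous state
        output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)
        successive_outputs.append(output_t)
        state_t = output_t                     # the output becomes the next state
    # Stack along a new first axis -> shape (timesteps, output_features)
    return np.stack(successive_outputs, axis=0)


# Same sizes as in the notebook; the seed is an assumption for reproducibility
rng = np.random.default_rng(0)
timesteps, input_features, output_features = 100, 32, 64
inputs = rng.random((timesteps, input_features))
W = rng.random((output_features, input_features))
U = rng.random((output_features, output_features))
b = rng.random((output_features,))

outputs = simple_rnn_forward(inputs, W, U, b)
print(outputs.shape)  # (100, 64)
```

Each row of `outputs` is the output at one timestep, so `outputs[-1]` is the final state of the network.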

## Issues:

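* SimpleRNN is too simplistic for real-life use cases.
* In theory it can retain information from many timesteps back, but in practice it cannot learn long-term dependencies. This is due to the **vanishing gradient** problem: just as a feedforward network with many layers eventually becomes untrainable, the gradient shrinks as it is propagated back through many timesteps.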

The **LSTM** layer was designed to address this problem.
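The vanishing gradient effect described above can be illustrated with a small NumPy experiment. This is a sketch under stated assumptions, not a training procedure: the weights are small and zero-centered, `U` is a random orthogonal matrix (so it does not change the length of a vector by itself), and the starting gradient of all ones stands in for a hypothetical loss equal to the sum of the final outputs. None of these choices come from the original notebook. The backward loop applies the transpose of the step Jacobian, `diag(1 - output_t**2) @ U`, once per timestep and records the gradient norm.

```python
import numpy as np

rng = np.random.default_rng(0)
timesteps, input_features, output_features = 100, 32, 64

# Assumptions for this sketch: small zero-centered W, orthogonal U, zero bias
W = rng.normal(scale=0.1, size=(output_features, input_features))
U, _ = np.linalg.qr(rng.normal(size=(output_features, output_features)))
b = np.zeros((output_features,))
inputs = rng.random((timesteps, input_features))

# Forward pass, keeping the output of every timestep
state_t = np.zeros((output_features,))
outputs = []
for input_t in inputs:
    state_t = np.tanh(W @ input_t + U @ state_t + b)
    outputs.append(state_t)

# Backward pass: propagate a gradient from the last output back through time
grad = np.ones((output_features,))   # gradient of a hypothetical loss w.r.t. the last output
grad_norms = []
for output_t in reversed(outputs):
    # Gradient w.r.t. the previous state: U^T @ (tanh'(pre-activation) * grad),
    # where tanh' = 1 - output_t**2
    grad = U.T @ (grad * (1.0 - output_t ** 2))
    grad_norms.append(np.linalg.norm(grad))

# Norm after 1, 10 and 100 steps back in time
print(grad_norms[0], grad_norms[9], grad_norms[-1])
```

Because the derivative of `tanh` is never greater than 1 and the orthogonal `U` preserves vector length, the norm cannot grow from one step to the next in this setup. The contribution of early timesteps to the gradient therefore becomes negligible over a long sequence.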