<a href="https://colab.research.google.com/github/nigoda/machine_learning/blob/main/28_Simple_RNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Simple RNN**

The process naively implemented in NumPy below corresponds to an actual Keras layer, the `SimpleRNN` layer.

In [1]:
import numpy as np

In [2]:
timesteps = 100        # Number of timesteps in the input sequence.
input_feature = 32     # Dimensionality of the input feature space.
out_feature = 64       # Dimensionality of the output feature space.  

Generate the input: random noise of shape `(timesteps, input_feature)`

In [3]:
inputs = np.random.random((timesteps, input_feature))
inputs

array([[0.44603114, 0.1238897 , 0.15387167, ..., 0.36176625, 0.19393007,
        0.1791747 ],
       [0.21220262, 0.79783966, 0.39454305, ..., 0.15504415, 0.37042054,
        0.34735725],
       [0.00120405, 0.56023748, 0.65233289, ..., 0.53356833, 0.85795457,
        0.62226362],
       ...,
       [0.14327761, 0.35689071, 0.28627252, ..., 0.93671422, 0.74434833,
        0.05569425],
       [0.12394122, 0.86452569, 0.56517272, ..., 0.93693043, 0.30697797,
        0.21804504],
       [0.3432727 , 0.13070061, 0.45842125, ..., 0.2545313 , 0.60975184,
        0.18803827]])

The initial state is an all-zero vector

In [4]:
state_t = np.zeros((out_feature))
state_t

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

Initialize the weights randomly

In [5]:
W = np.random.random((out_feature, input_feature))
U = np.random.random((out_feature, out_feature))
b = np.random.random((out_feature))

Let's implement the RNN
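
Written out, each step of the loop computes the update below. The symbols $x_t$ (the input at timestep $t$) and $s_t$ (the state after timestep $t$) are notation assumed here for illustration; $W$, $U$ and $b$ are the arrays defined above, with shapes $64 \times 32$, $64 \times 64$ and $64$.

$$
\text{output}_t = \tanh\left(W x_t + U s_{t-1} + b\right), \qquad s_t = \text{output}_t, \qquad s_0 = 0
$$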

In [12]:
successive_output = []
for input_t in inputs:
  # Combine the input with the current state to obtain the current output
  output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)
  successive_output.append(output_t)
  state_t = output_t  # Update the state of the network for the next timestep

# The final output is a 2D tensor of shape (timesteps, out_feature).
# np.stack adds the new timestep axis; np.concatenate would flatten the
# outputs into a single 1D array of length timesteps * out_feature.
final_output_sequence = np.stack(successive_output, axis=0)
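
As a quick check on the shapes, the loop can be wrapped in a small function. This is only a sketch: `simple_rnn_forward` is a hypothetical helper, not part of Keras, and its `return_sequences` flag is named after the Keras `SimpleRNN` argument that chooses between returning the full output sequence and only the last output.

```python
import numpy as np

def simple_rnn_forward(inputs, W, U, b, return_sequences=True):
    """Run a tanh RNN over `inputs` of shape (timesteps, input_features)."""
    state_t = np.zeros(U.shape[0])  # all-zero initial state
    outputs = []
    for input_t in inputs:
        # The output of each step becomes the state for the next step
        state_t = np.tanh(W @ input_t + U @ state_t + b)
        outputs.append(state_t)
    sequence = np.stack(outputs, axis=0)  # (timesteps, output_features)
    return sequence if return_sequences else sequence[-1]

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
timesteps, input_feature, out_feature = 100, 32, 64
inputs = rng.random((timesteps, input_feature))
W = rng.random((out_feature, input_feature))
U = rng.random((out_feature, out_feature))
b = rng.random(out_feature)

full_sequence = simple_rnn_forward(inputs, W, U, b)
last_output = simple_rnn_forward(inputs, W, U, b, return_sequences=False)
print(full_sequence.shape)  # (100, 64)
print(last_output.shape)    # (64,)
```

The last row of the full sequence is the same vector as the final state, which is what a sequence-to-vector model would pass on to the next layer.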


Issues:

*  Too simplistic for real-life use cases.
*  In practice, SimpleRNN cannot learn long-term dependencies. This is due to the *vanishing gradient problem*: the gradient is backpropagated through every timestep, so over long sequences it shrinks toward zero, much as a feedforward network eventually becomes untrainable when you keep adding layers.
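
The vanishing gradient can be illustrated numerically. The sketch below tracks the Jacobian of the current state with respect to the initial state, which is the same chain of per-step factors that backpropagation through time multiplies together. Note the assumptions: unlike the cells above, `W` is drawn with a small scale and `U` is rescaled to a spectral norm of 0.9, so that `tanh` is not saturated from the first step; these choices are for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
timesteps, input_feature, out_feature = 100, 32, 64

inputs = rng.random((timesteps, input_feature))
W = rng.normal(scale=0.1, size=(out_feature, input_feature))
U = rng.normal(size=(out_feature, out_feature))
U *= 0.9 / np.linalg.norm(U, 2)  # assumption: rescale U to spectral norm 0.9
b = np.zeros(out_feature)

state_t = np.zeros(out_feature)
jacobian_product = np.eye(out_feature)  # d state_t / d state_0, starts as identity
gradient_norms = []
for input_t in inputs:
    state_t = np.tanh(W @ input_t + U @ state_t + b)
    # One-step Jacobian: d state_t / d state_{t-1} = diag(1 - state_t**2) @ U
    step_jacobian = (1.0 - state_t ** 2)[:, None] * U
    jacobian_product = step_jacobian @ jacobian_product
    gradient_norms.append(np.linalg.norm(jacobian_product, 2))

print("after 1 step:   ", gradient_norms[0])
print("after 100 steps:", gradient_norms[-1])
```

Because every one-step Jacobian here has spectral norm at most 0.9, the norm after 100 steps is bounded by 0.9 to the power 100, roughly 3e-5, so the earliest timesteps receive almost no learning signal.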