<table class="tfo-notebook-buttons" align="left">
  <td>
    <a href="https://colab.research.google.com/github/martin-fabbri/colab-notebooks/blob/master/deeplearning.ai/nlp/c3_w2_02_rnn_gru.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>    
  </td>
  <td>
    <a href="https://github.com/martin-fabbri/colab-notebooks/blob/master/deeplearning.ai/nlp/c3_w2_02_rnn_gru.ipynb" target="_parent"><img src="https://raw.githubusercontent.com/martin-fabbri/colab-notebooks/master/assets/github.svg" alt="View On Github"/></a>  </td>
</table>

# Vanilla RNNs, GRUs and the `scan` function

In this notebook, you will learn how to define the forward method for vanilla RNNs and GRUs. Additionally, you will see how to define and use the function `scan` to compute forward propagation for RNNs.

By completing this notebook, you will:

- Be able to define the forward method for vanilla RNNs and GRUs
- Be able to define the `scan` function to perform forward propagation for RNNs
- Understand how forward propagation is implemented for RNNs

In [2]:
import numpy as np
from numpy import random
from time import perf_counter

An implementation of the `sigmoid` function is provided below so you can use it in this notebook.

In [3]:
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))
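As a quick sanity check, the sketch below applies `sigmoid` to a small column vector. The names `z_demo` and `s_demo` are hypothetical and used only for illustration. It shows that the function acts elementwise, preserves the input shape, maps 0 to 0.5, and satisfies the symmetry $\sigma(-x) = 1 - \sigma(x)$.

```python
import numpy as np

def sigmoid(x):
    # Same definition as in the cell above, repeated so this sketch is self-contained.
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical test input: a column vector, like the hidden states used later.
z_demo = np.array([[-2.0], [0.0], [2.0]])
s_demo = sigmoid(z_demo)

print(s_demo.shape)   # (3, 1): the shape is preserved
print(s_demo[1, 0])   # 0.5: sigmoid(0) = 1 / (1 + 1)
print(np.allclose(sigmoid(-z_demo), 1.0 - s_demo))  # True: sigmoid(-x) = 1 - sigmoid(x)
```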

# Part 1: Forward method for vanilla RNNs and GRUs

In this part of the notebook, you'll see the implementation of the forward method for a vanilla RNN and you'll implement that same method for a GRU. For this exercise you'll use a set of random weights and variables with the following dimensions:

- Embedding size (`emb`): 128
- Hidden state size (`h_dim`): 16
- Sequence length (`T`): 256

The weights `w_` and biases `b_` are initialized with dimensions (`h_dim`, `emb + h_dim`) and (`h_dim`, 1), respectively. We expect the hidden state `h_t` to be a column vector of shape (`h_dim`, 1), and the initial hidden state `h_0` is a vector of zeros with that same shape. The input `X` holds `T` column vectors, each of shape (`emb`, 1).

In [5]:
random.seed(10)
emb = 128
T = 256
h_dim = 16
h_0 = np.zeros((h_dim, 1))
w1 = random.standard_normal((h_dim, emb + h_dim))
w2 = random.standard_normal((h_dim, emb + h_dim))
w3 = random.standard_normal((h_dim, emb + h_dim))

b1 = random.standard_normal((h_dim, 1))
b2 = random.standard_normal((h_dim, 1))
b3 = random.standard_normal((h_dim, 1))

X = random.standard_normal((T, emb, 1))
weights = [w1, w2, w3, b1, b2, b3]
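The weight shape (`h_dim`, `emb + h_dim`) is chosen so that one matrix can multiply the hidden state and the input stacked into a single column vector. The following is a minimal sketch of that shape arithmetic, not the notebook's forward method. The `_demo` names are hypothetical, and a separate random generator is assumed so the seeded state above is left untouched.

```python
import numpy as np

# Sizes matching the cell above.
emb, h_dim = 128, 16

# Assumption: an independent generator, so the notebook's seeded global state is not advanced.
rng_demo = np.random.default_rng(0)

h_prev_demo = np.zeros((h_dim, 1))                      # previous hidden state, shape (16, 1)
x_t_demo = rng_demo.standard_normal((emb, 1))           # one input vector, shape (128, 1)
w_demo = rng_demo.standard_normal((h_dim, emb + h_dim)) # weight matrix, shape (16, 144)
b_demo = rng_demo.standard_normal((h_dim, 1))           # bias, shape (16, 1)

# Stack hidden state and input into one column vector of shape (emb + h_dim, 1).
hx_demo = np.vstack((h_prev_demo, x_t_demo))

# The affine map brings the result back to the hidden state shape.
h_next_demo = w_demo @ hx_demo + b_demo

print(hx_demo.shape)      # (144, 1)
print(h_next_demo.shape)  # (16, 1)
```

Because the output has the same shape as the hidden state, the same weights can be applied again at the next time step.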

## 1.1 Forward method for vanilla RNNs

The vanilla RNN cell is quite straightforward. Its most general structure is presented in the next figure:

<img src="" width="400"/>

As you saw in the lecture videos, the computations made in a vanilla RNN cell are equivalent to the following equations:

$$h^{<t>}=g(W_{h}[h^{<t-1>},x^{<t>}] + b_h)$$

$$\hat{y}^{<t>}=g(W_{yh}h^{<t>} + b_y)$$

where $g$ is an activation function and $[h^{<t-1>},x^{<t>}]$ denotes the concatenation of $h^{<t-1>}$ and $x^{<t>}$ into a single vector. In the next cell we provide the implementation of the forward method for a vanilla RNN.