# Homework 10 - Recurrent Neural Networks

We use a pseudo-random number generator to set the values of the inputs and weight matrices.
To make sure that everyone gets the same results, the generator is initialized with a fixed seed.

**DO NOT CHANGE THE SEED!**

The seed will probably change from semester to semester. Please make sure that this is the notebook for the current semester!
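
As a minimal sketch of why the fixed seed matters: two generators created with the same seed produce identical draws, as long as the same calls are made in the same order. The variable names and the contrasting seed `1` below are chosen for illustration only and are not part of the homework.

```python
import numpy as np

# Two generators created with the same seed (2223, as in this notebook)
prng_a = np.random.RandomState(2223)
prng_b = np.random.RandomState(2223)

# Identical call sequences yield identical draws
draws_a = prng_a.randn(3)
draws_b = prng_b.randn(3)
print(np.array_equal(draws_a, draws_b))  # True

# A different seed (hypothetical value, for contrast only) gives other numbers
draws_c = np.random.RandomState(1).randn(3)
print(np.array_equal(draws_a, draws_c))  # False
```

This is also why the order of the `prng` calls inside the generator functions must stay untouched: reordering them would change which numbers end up in which array.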

In [1]:
# For this homework we only need numpy
import numpy as np

# Set the seed for the pseudo-random number generator. DO NOT CHANGE!
seed = 2223

# Forward pass for a simple RNN
In this exercise you will write a Python function to evaluate the forward pass of a simple RNN with the following unfolded computation graph:

![Simple RNN](./data/simple_rnn.png)

Write a Python function, e.g. called $\mathbf{Y} = sim\_rnn(\mathbf{h}_0, \mathbf{X}, \mathbf{W}, \mathbf{U}, \mathbf{V}, \mathbf{b}, \mathbf{v})$, that evaluates the forward pass of the RNN using the following equations:

\begin{align*}
    \mathbf{h}_t &= \tanh(\mathbf{W}\mathbf{h}_{t-1} + \mathbf{U}\mathbf{x}_t + \mathbf{b}) \\
    \mathbf{y}_t &= \mathbf{V}\mathbf{h}_t + \mathbf{v}
\end{align*}

The $i$-th row in the matrix $\mathbf{X}$ corresponds to the input vector $\mathbf{x}_i$ at time-step $i$.
The function should return a matrix $\mathbf{Y}$ where the $i$-th row contains the output at time-step $i$. 

All inputs to the function, as well as its output, are NumPy arrays. $\mathbf{X}$, $\mathbf{W}$, $\mathbf{U}$, $\mathbf{V}$ are matrices and $\mathbf{h}_0$, $\mathbf{b}$, $\mathbf{v}$ are vectors.
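
To make the shapes in the two equations concrete, here is a rough sketch of a single time step on toy data. The helper name `rnn_step`, the toy dimensions, and the seed `0` are assumptions made for this illustration; the loop over all time steps and the assembly of $\mathbf{Y}$ are left to your own function.

```python
import numpy as np

def rnn_step(h_prev, x_t, W, U, V, b, v):
    """One time step of the simple RNN (hypothetical helper, illustration only)."""
    h_t = np.tanh(W @ h_prev + U @ x_t + b)  # new hidden state, shape (n_h,)
    y_t = V @ h_t + v                        # output, shape (n_y,)
    return h_t, y_t

# Toy dimensions chosen for illustration only
n_h, n_x, n_y = 3, 4, 2
rng = np.random.RandomState(0)  # arbitrary seed, unrelated to the homework seed
h_prev = rng.randn(n_h)
x_t = rng.randn(n_x)
W, U, V = rng.randn(n_h, n_h), rng.randn(n_h, n_x), rng.randn(n_y, n_h)
b, v = rng.randn(n_h), rng.randn(n_y)

h_t, y_t = rnn_step(h_prev, x_t, W, U, V, b, v)
print(h_t.shape, y_t.shape)  # (3,) (2,)
```

Because $\tanh$ maps into $(-1, 1)$, every entry of the hidden state stays within that range, which can serve as a quick sanity check.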

You can use the following template for the function:

In [2]:
def sim_rnn(
    h_0: np.ndarray,
    X: np.ndarray,
    W: np.ndarray,
    U: np.ndarray,
    V: np.ndarray,
    b: np.ndarray,
    v: np.ndarray
) -> np.ndarray:
    """Forward pass for the simple RNN"""
    
    # Do your calculations here
    # ....
    Y = np.zeros((X.shape[0], V.shape[0]))  # placeholder with the expected output shape so the template runs
    
    return Y

Now, we create pseudo-random inputs, weight matrices and vectors. 

In [None]:
def gen_random_weights_rnn():
    """
    Generate random weights
    
    DO NOT CHANGE!
    """
    
    # create pseudo-random number generator and set seed
    prng = np.random.RandomState(seed)
    
    # random dimensions
    n_h = int(prng.randint(2, 10, 1))  # the hidden state has a dimension between 2 and 9 (upper bound is exclusive)
    n_x = int(prng.randint(50, 500, 1))  # one input vector has a dimension between 50 and 499 (upper bound is exclusive)
    n_y = 2  # the output will have 2 dimensions

    # number of steps
    n_steps = 15  # the sequence will be 15 steps long

    # initial hidden state
    h_0 = prng.randn(n_h)

    # input and weight matrices and bias vectors
    X = prng.randn(n_steps, n_x)
    W = prng.randn(n_h, n_h)
    U = prng.randn(n_h, n_x)
    b = prng.randn(n_h)
    V = prng.randn(n_y, n_h)
    v = prng.randn(n_y)
    
    return h_0, X, W, U, b, V, v

h_0, X, W, U, b, V, v = gen_random_weights_rnn()

Use your function to calculate the output sequence and store it in the matrix $\mathbf{Y}$.

In [None]:
Y = sim_rnn(h_0, X, W, U, V, b, v)

for i, y_i in enumerate(Y, 1):
    print(f"Output time-step {i}: {y_i}")
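
Before reading off a value, it can help to confirm that the result has one row per time step and one column per output dimension. The helper `check_output_shape` and the toy arrays below are hypothetical and only sketch such a check.

```python
import numpy as np

def check_output_shape(Y, X, V):
    """Return True if Y has one row per time step and one column per output dimension."""
    expected = (X.shape[0], V.shape[0])
    return Y.shape == expected

# Toy stand-ins for the real arrays (illustration only)
X_toy = np.zeros((15, 7))  # 15 time steps, 7 input dimensions
V_toy = np.zeros((2, 4))   # 2 output dimensions, 4 hidden dimensions

print(check_output_shape(np.zeros((15, 2)), X_toy, V_toy))  # True
print(check_output_shape(np.zeros(10), X_toy, V_toy))       # False
```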

Look up the requested time-step in the Moodle question and enter the corresponding output.

# Forward pass for a bidirectional RNN
In this exercise you will extend your function for the simple RNN so that it also evaluates a backward sequence, resulting in the following bidirectional RNN:

![Bidirectional RNN](./data/bidirectional_rnn.png)

Write a Python function, e.g. called $\mathbf{Y} = sim\_bidir\_rnn(\mathbf{h}_0, \mathbf{k}_T, \mathbf{X}, \mathbf{W}_1, \mathbf{U}_1, \mathbf{b}_1, \mathbf{V}_1, \mathbf{W}_2, \mathbf{U}_2, \mathbf{b}_2, \mathbf{V}_2, \mathbf{v})$, that evaluates the forward pass of the bidirectional RNN using the following equations:

\begin{align*}
    \mathbf{h}_t &= \tanh(\mathbf{W}_1\mathbf{h}_{t-1} + \mathbf{U}_1\mathbf{x}_t + \mathbf{b}_1) \\
    \mathbf{k}_t &= \tanh(\mathbf{W}_2\mathbf{k}_{t+1} + \mathbf{U}_2\mathbf{x}_t + \mathbf{b}_2) \\
    \mathbf{y}_t &= \mathbf{V}_1\mathbf{h}_t + \mathbf{V}_2\mathbf{k}_t + \mathbf{v}
\end{align*}

The $i$-th row in the matrix $\mathbf{X}$ corresponds to the input vector $\mathbf{x}_i$ at time-step $i$.
The function should return a matrix $\mathbf{Y}$ where the $i$-th row contains the output at time-step $i$. 

All inputs to the function, as well as its output, are NumPy arrays. $\mathbf{X}$, $\mathbf{W}_1$, $\mathbf{U}_1$, $\mathbf{V}_1$, $\mathbf{W}_2$, $\mathbf{U}_2$, $\mathbf{V}_2$ are matrices and $\mathbf{h}_0$, $\mathbf{k}_T$, $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{v}$ are vectors. $\mathbf{k}_T$ plays a role similar to $\mathbf{h}_0$: it is the initial guess for the second hidden state, given at the end of the sequence at time step $T$.
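
The second hidden state depends on $\mathbf{k}_{t+1}$, so it has to be evaluated from the end of the sequence towards the beginning. The sketch below only illustrates that iteration order on a toy scalar recurrence; the recurrence, the inputs, and the starting value `s_next = 0.0` are assumptions for this example and are not the homework equations. How $\mathbf{k}_T$ enters the real recurrence follows from the definitions above.

```python
import numpy as np

T = 4
x = np.array([1.0, 2.0, 3.0, 4.0])  # toy scalar inputs x_1 ... x_T

# Backward scan: visits rows T-1, ..., 1, 0 (time steps T, ..., 1)
backward_order = list(reversed(range(T)))

# Toy backward recurrence s_t = 0.5 * s_{t+1} + x_t (NOT the homework equations)
s = np.zeros(T)
s_next = 0.0  # hypothetical starting value fed into the last step
for t in backward_order:
    s[t] = 0.5 * s_next + x[t]
    s_next = s[t]

print(backward_order)  # [3, 2, 1, 0]
print(s.tolist())      # [3.25, 4.5, 5.0, 4.0]
```

Note that the results are still stored at row `t`, so the rows of the backward states line up with the rows of the forward states when both are combined into the output.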

You can use the following template for the function:

In [5]:
def sim_bidir_rnn(
    h_0: np.ndarray,
    k_T: np.ndarray,
    X: np.ndarray,
    W_1: np.ndarray,
    U_1: np.ndarray,
    b_1: np.ndarray,
    V_1: np.ndarray,
    W_2: np.ndarray,
    U_2: np.ndarray,
    b_2: np.ndarray,
    V_2: np.ndarray,
    v: np.ndarray
) -> np.ndarray:
    """Forward pass for the bidirectional RNN"""
    
    # Do your calculations here
    # ....
    Y = np.zeros((X.shape[0], V_1.shape[0]))  # placeholder with the expected output shape so the template runs
    
    return Y

Now, we create pseudo-random inputs, weight matrices and vectors. 

In [None]:
def gen_random_weights_bidir_rnn():
    """
    Generate random weights
    
    DO NOT CHANGE!
    """
    
    # create pseudo-random number generator and set seed
    prng = np.random.RandomState(seed)
    
    # random dimensions
    n_h = int(prng.randint(2, 10, 1))  # the hidden state h has a dimension between 2 and 9 (upper bound is exclusive)
    n_k = int(prng.randint(3, 7, 1))  # the hidden state k has a dimension between 3 and 6 (upper bound is exclusive)
    n_x = int(prng.randint(50, 500, 1))  # one input vector has a dimension between 50 and 499 (upper bound is exclusive)
    n_y = 2  # the output will have 2 dimensions

    # number of steps
    n_steps = 15  # the sequence will be 15 steps long

    # initial hidden states
    h_0 = prng.randn(n_h)
    k_T = prng.randn(n_k)


    # input and weight matrices and bias vectors
    X = prng.randn(n_steps, n_x)
    W_1 = prng.randn(n_h, n_h)
    U_1 = prng.randn(n_h, n_x)
    b_1 = prng.randn(n_h)
    V_1 = prng.randn(n_y, n_h)
    
    W_2 = prng.randn(n_k, n_k)
    U_2 = prng.randn(n_k, n_x)
    b_2 = prng.randn(n_k)
    V_2 = prng.randn(n_y, n_k)
    
    v = prng.randn(n_y)
    
    return h_0, k_T, X, W_1, U_1, b_1, V_1, W_2, U_2, b_2, V_2, v

h_0, k_T, X, W_1, U_1, b_1, V_1, W_2, U_2, b_2, V_2, v = gen_random_weights_bidir_rnn()

Use your function to calculate the output sequence and store it in the matrix $\mathbf{Y}$.

In [None]:
Y = sim_bidir_rnn(h_0, k_T, X, W_1, U_1, b_1, V_1, W_2, U_2, b_2, V_2, v)

for i, y_i in enumerate(Y, 1):
    print(f"Output time-step {i}: {y_i}")

Look up the requested time-step in the Moodle question and enter the corresponding output.