# Quiz 7 

### Question 3
The figure below shows a Recurrent Neural Network (RNN) with one input unit $x$, one logistic hidden unit $h$, and one linear output unit $y$.

The RNN is unrolled in time for $T = 0$, $1$, and $2$.

![RNN1](misc/rnn1.png)

The network parameters are: $W_{xh} = 0.5, W_{hh} = -1.0, W_{hy} = -0.7$, $h_{bias}=-0.1$ and $y_{bias}=0.0$. Remember, $\sigma(k) = \frac{1}{1 + e^{-k}}$.

The inputs at various time steps are the following:

$T$ | $x$ (input)
--- | ---
0 | 9
1 | 4
2 | -2


What is the value of the output $y$ at $T=1$?

**How to forward propagate in a recurrent neural network?**

We can think of the network as being unrolled one time step at a time: at each step a new input arrives, and the hidden unit is updated from that input and the previous hidden state. The output at time $t$ is not fed back into the hidden unit, so it plays no role in forward propagation to later steps. To find the output at $T=1$, we therefore update the hidden unit twice (once for $T=0$ and once for $T=1$) using the logistic function, and then apply the linear output unit to the hidden state at $T=1$.

In [27]:
from math import exp
sigma = lambda k: 1 / (1 + exp(-k))
X = [9, 4]
Wxh, Whh, Why = 0.5, -1.0, -0.7
hbias = -0.1

# Hidden state at time 0:
#    depends only on the input at time 0 (no earlier hidden state)
o1 = sigma(X[0] * Wxh + hbias)
# Hidden state at time 1:
#    depends on the hidden state from time 0 and the input at time 1
o2 = sigma((o1 * Whh) + (X[1] * Wxh) + hbias)
# Output at time 1 (y_bias = 0.0, so it is omitted)
output = o2 * Why
print(output)

-0.49940485619669833
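
The same computation can be written as a loop over all time steps. The following is a minimal sketch, not part of the original quiz solution: the names `sigmoid` and `rnn_forward` are hypothetical helpers, and it assumes there is no hidden state before $T=0$, so the recurrent term is skipped at the first step.

```python
from math import exp


def sigmoid(k):
    """Logistic function, as defined in the question."""
    return 1.0 / (1.0 + exp(-k))


def rnn_forward(xs, w_xh, w_hh, w_hy, h_bias, y_bias=0.0):
    """Return the hidden states and outputs for every time step.

    Assumption: there is no hidden state before the first step,
    so the recurrent term is only added from the second step on.
    """
    hs, ys = [], []
    h_prev = None
    for x in xs:
        z = w_xh * x + h_bias
        if h_prev is not None:
            z += w_hh * h_prev
        h_prev = sigmoid(z)           # logistic hidden unit
        hs.append(h_prev)
        ys.append(w_hy * h_prev + y_bias)  # linear output unit
    return hs, ys


# Parameters and inputs from Question 3.
hs, ys = rnn_forward([9, 4, -2], w_xh=0.5, w_hh=-1.0, w_hy=-0.7, h_bias=-0.1)
print(ys[1])  # output at T=1
```

Indexing `ys[1]` reproduces the answer above, up to floating-point rounding, and `ys[2]` gives the output at $T=2$ without any extra code.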


### Question 4
Consider the RNN architecture above.

The network parameters are:

Params | Values
--- | ---
$W_{xh}$ | -0.1
$W_{hh}$ | 0.5
$W_{hy}$ | 0.25
$h_{bias}$ | 0.4
$y_{bias}$ | 0.0

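The inputs $x_i$, hidden states $h_i$, outputs $y_i$, and targets $t_i$ at each time step are given below. The $h_i$ and $y_i$ values are rounded to two decimals and are verified in the code that follows.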

$T$ |  $x_i$ | $h_i$ | $y_i$ | $t_i$
--- |    --- |   --- | ---   | ---
0   | 18     | 0.2   | 0.05 | 0.1
1   | 9      | 0.4   | 0.1   | -0.1
2   | -8     | 0.8   | 0.2   | -0.2

The following sequence of equations is required to calculate the total squared error $E$:

$$
\begin{align}
    z_0 &= W_{xh} x_0 + h_{bias}\Rightarrow h_0 = \sigma(z_0)\\
    z_1 &= W_{xh} x_1 + W_{hh}h_{0} + h_{bias} \Rightarrow  h_1 = \sigma(z_1)\\
    z_2 &= W_{xh} x_2 + W_{hh}h_{1} + h_{bias} \Rightarrow  h_2 = \sigma(z_2)
\end{align}
$$

$$
\begin{align}
y_0 &= W_{hy} h_0 + y_{bias} \Rightarrow E_0 = \frac{1}{2}(t_0 - y_0)^2 \\
y_1 &= W_{hy} h_1 + y_{bias} \Rightarrow E_1 = \frac{1}{2}(t_1 - y_1)^2 \\
y_2 &= W_{hy} h_2 + y_{bias} \Rightarrow E_2 = \frac{1}{2}(t_2 - y_2)^2
\end{align}
$$

$$
E = E_0 + E_1 + E_2
$$

In [12]:
X = [18, 9, -8]
Wxh, Whh, Why = -0.1, 0.5, 0.25
hbias = 0.4
ti = [0.1, -0.1, -0.2]

In [13]:
# Finding the hidden values.
hi = []
for ix, x in enumerate(X):
    if ix >= 1:
        h_i = sigma(Wxh * x + Whh * hi[ix-1] + hbias)
    else:
        h_i = sigma(Wxh * x + hbias)
        
    # Rounding to be consistent with the data given.
    hi.append(round(h_i,2))
    
hi

[0.2, 0.4, 0.8]

In [17]:
# Finding the output values.
yi = [Why * h for h in hi]
yi

[0.05, 0.1, 0.2]
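
With the outputs in hand, the total error $E = E_0 + E_1 + E_2$ follows directly from the equations above. Here is a minimal sketch that uses the rounded $y_i$ and $t_i$ values from the table. The variable names `Ei` and `E` are illustrative and not part of the original notebook.

```python
# Rounded outputs and targets, copied from the table above.
yi = [0.05, 0.1, 0.2]
ti = [0.1, -0.1, -0.2]

# Per-step squared error: E_i = 1/2 * (t_i - y_i)^2.
Ei = [0.5 * (t - y) ** 2 for t, y in zip(ti, yi)]

# The total error is the sum over the three time steps.
E = sum(Ei)
print(Ei, E)
```

With these rounded values the total comes to roughly 0.10125, most of it contributed by the error at $T=2$.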

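Finally, apply the chain rule. $z_1$ affects $E$ through $y_1$ directly and through $y_2$ by way of $h_2$, while $E_0$ does not depend on $z_1$. So $\frac{\partial E}{\partial z_1}$ is given by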
$$
    \frac{\partial E}{\partial z_1} = [y_1 - t_1](W_{hy} [h_1(1-h_1)]) + [y_2 - t_2](W_{hy} [h_2(1-h_2)])(W_{hh} [h_1(1-h_1)]) 
$$

In [21]:
(yi[1] - ti[1]) * (Why * (hi[1] * (1 - hi[1]))) + \
(yi[2] - ti[2]) * (Why * (hi[2] * (1 - hi[2]))) * (Whh * (hi[1] * (1 - hi[1])))

0.01392
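
One way to sanity-check the chain-rule expression is to compare it with a central finite difference in $z_1$. The sketch below is an added check, not part of the original solution: `sigmoid` and `error_from_z1` are hypothetical helpers, and it uses unrounded hidden values rather than the two-decimal table values.

```python
from math import exp


def sigmoid(k):
    return 1.0 / (1.0 + exp(-k))


# Parameters, inputs and targets from Question 4.
X = [18, 9, -8]
Wxh, Whh, Why = -0.1, 0.5, 0.25
hbias, ybias = 0.4, 0.0
ti = [0.1, -0.1, -0.2]


def error_from_z1(z1, h0):
    """Total error E as a function of z_1, holding h_0 fixed."""
    h1 = sigmoid(z1)
    h2 = sigmoid(Wxh * X[2] + Whh * h1 + hbias)
    ys = [Why * h + ybias for h in (h0, h1, h2)]
    return sum(0.5 * (t - y) ** 2 for t, y in zip(ti, ys))


# Unrounded forward pass.
h0 = sigmoid(Wxh * X[0] + hbias)
z1 = Wxh * X[1] + Whh * h0 + hbias
h1 = sigmoid(z1)
h2 = sigmoid(Wxh * X[2] + Whh * h1 + hbias)
y1, y2 = Why * h1 + ybias, Why * h2 + ybias

# Analytic gradient from the chain-rule formula above.
analytic = (y1 - ti[1]) * Why * h1 * (1 - h1) \
    + (y2 - ti[2]) * Why * h2 * (1 - h2) * Whh * h1 * (1 - h1)

# Central finite difference: perturb z_1 and recompute E.
eps = 1e-6
numeric = (error_from_z1(z1 + eps, h0) - error_from_z1(z1 - eps, h0)) / (2 * eps)
print(analytic, numeric)
```

The two numbers should agree to many decimal places. Because this sketch does not round the hidden values, both differ slightly from 0.01392, which was computed from the rounded table values.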