# Building your Recurrent Neural Network - Step by Step


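- $[l]$: the $l$-th layer
- $(i)$: the $i$-th example (sample)
- $\langle t \rangle$: the $t$-th time step
- $a_5^{(2)[3]\langle 4 \rangle}$: the 5th entry of the activation $a$ in layer 3, for the 2nd example, at the 4th time step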

In [3]:
import numpy as np

In [4]:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def softmax(x):
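    """
    Softmax over axis 0. Example for x = [1, 2, 3]:
    t = [e^1, e^2, e^3]
    _sum = e^1 + e^2 + e^3
    softmax(x) = t / _sum = [e^1/_sum, e^2/_sum, e^3/_sum]

    Subtracting np.max(x) before exponentiating does not change the result
    but avoids overflow for large inputs.
    """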
    t = np.exp(x - np.max(x))
    return t / t.sum(axis=0)
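
As a quick check of the `softmax` helper above, here is a minimal sketch. The input values are arbitrary illustrations, not taken from the notebook. It shows that subtracting the maximum leaves the result unchanged, that large inputs stay finite, and that with `axis=0` each column of a 2-D input is normalized on its own.

```python
import numpy as np

def softmax(x):
    # Same helper as in the notebook: shift by the max, exponentiate, normalize over axis 0
    t = np.exp(x - np.max(x))
    return t / t.sum(axis=0)

x = np.array([1.0, 2.0, 3.0])
p = softmax(x)

# Unshifted formula, safe here because the inputs are small
naive = np.exp(x) / np.exp(x).sum()
print(np.allclose(p, naive))         # the shift does not change the result
print(np.isclose(p.sum(), 1.0))      # probabilities sum to 1

# np.exp(1000.0) alone would overflow; the shifted version stays finite
big = np.array([1000.0, 1001.0, 1002.0])
print(np.allclose(softmax(big), p))  # same relative differences, same softmax

# 2-D input: one column per example, each column sums to 1
batch = np.array([[1.0, 0.0],
                  [2.0, 0.0],
                  [3.0, 0.0]])
print(softmax(batch).sum(axis=0))
```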

## 1 - Forward Propagation for basic RNN

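- $T_x$: number of time steps; for example, $T_x = 10$ for a sentence of 10 words
- $n_x$: vocabulary size (the dictionary has $n_x$ words); $x^{(i)\langle t \rangle}$, the input for the $i$-th example at the $t$-th time step, has shape $(n_x,)$
- $m$: number of examples in a mini-batch

- The input $x$ fed into the RNN has shape $(n_x, m, T_x)$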
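
To make the shape convention concrete, the following sketch slices a random input tensor. The sizes are illustrative assumptions, not values from the notebook. Note that the notation counts examples and time steps from 1, while NumPy indexes from 0.

```python
import numpy as np

# Illustrative sizes (assumptions for this sketch only)
n_x, m, T_x = 3, 10, 4
x = np.random.randn(n_x, m, T_x)

# All examples at one time step: the 2-D slice a single RNN cell consumes
xt = x[:, :, 2]
print(xt.shape)     # (3, 10)

# One example at one time step: a single input vector of length n_x
x_i_t = x[:, 1, 2]
print(x_i_t.shape)  # (3,)
```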

![image](https://wx4.sinaimg.cn/mw1024/701c57e5gy1gemde9apnxj211k0eujt1.jpg)
![image](https://wx1.sinaimg.cn/mw1024/701c57e5gy1gemde9kvxdj213m0gwtbh.jpg)

```python
# inputs: a_prev, x
# parameters: Waa, Wax, ba, Wya, by

# 1. compute a_next (matrix products, not element-wise multiplication)
z = np.dot(Waa, a_prev) + np.dot(Wax, x) + ba
a_next = np.tanh(z)

# 2. compute y
y = softmax(np.dot(Wya, a_next) + by)
```
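
In equation form, the two steps above can be written as follows. This is a restatement in the notation from the start of this notebook, where $\hat{y}^{\langle t \rangle}$ is assumed to denote the prediction that the code calls `yt_pred`:

$$
a^{\langle t \rangle} = \tanh\left(W_{aa}\, a^{\langle t-1 \rangle} + W_{ax}\, x^{\langle t \rangle} + b_a\right)
$$

$$
\hat{y}^{\langle t \rangle} = \mathrm{softmax}\left(W_{ya}\, a^{\langle t \rangle} + b_y\right)
$$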

In [9]:
def rnn_cell_forward(xt, a_prev, parameters):
    # parameters
    Wax = parameters['Wax'] # 5 * 3
    Waa = parameters['Waa'] # 5 * 5
    ba = parameters['ba'] # 5 * 1

    Wya = parameters['Wya'] # 2 * 5
    by = parameters['by'] # 2 * 1
    
    # 1. compute a_next
    # current state = tanh(Waa * a_prev (previous state) + Wax * xt (current input) + bias)
    a_next = np.tanh(np.dot(Waa, a_prev) + np.dot(Wax, xt) + ba) # 5 * 10
    
    # 2. compute y
    # prediction = softmax(Wya * a_next (current state) + bias)
    yt_pred = softmax(np.dot(Wya, a_next) + by) # 2 * 10
    
    # 3. store cache for backpropagation
    cache = (a_next, a_prev, xt, parameters)
    
    return a_next, yt_pred, cache

In [10]:
np.random.seed(1)
xt_tmp = np.random.randn(3,10)
a_prev_tmp = np.random.randn(5,10)

parameters_tmp = {}
parameters_tmp['Waa'] = np.random.randn(5,5)
parameters_tmp['Wax'] = np.random.randn(5,3)
parameters_tmp['Wya'] = np.random.randn(2,5)
parameters_tmp['ba'] = np.random.randn(5,1)
parameters_tmp['by'] = np.random.randn(2,1)

a_next_tmp, yt_pred_tmp, cache_tmp = rnn_cell_forward(xt_tmp, a_prev_tmp, parameters_tmp)
print("a_next[4] = \n", a_next_tmp[4])
print("a_next.shape = \n", a_next_tmp.shape)
print("yt_pred[1] =\n", yt_pred_tmp[1])
print("yt_pred.shape = \n", yt_pred_tmp.shape)

a_next[4] = 
 [ 0.59584544  0.18141802  0.61311866  0.99808218  0.85016201  0.99980978
 -0.18887155  0.99815551  0.6531151   0.82872037]
a_next.shape = 
 (5, 10)
yt_pred[1] =
 [0.9888161  0.01682021 0.21140899 0.36817467 0.98988387 0.88945212
 0.36920224 0.9966312  0.9982559  0.17746526]
yt_pred.shape = 
 (2, 10)


## 2 - RNN forward pass

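- $a^{\langle t-1 \rangle}$: hidden state from the previous cell (time step $t-1$)
- $x^{\langle t \rangle}$: input data at time step $t$
- $a^{\langle t \rangle}$: hidden state at time step $t$
- $y^{\langle t \rangle}$: output prediction at time step $t$
- `Waa`, `Wya`, `Wax`, `ba`, `by`: weights and biases, shared across all time steps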
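
A minimal sketch of how these pieces could be combined into a full forward pass, assuming the input has shape $(n_x, m, T_x)$ as described in section 1. The function name `rnn_forward_sketch`, the initial state `a0`, and the choice of $T_x = 4$ are illustrative assumptions, not taken from the notebook. The block repeats `softmax` and `rnn_cell_forward` so that it runs on its own.

```python
import numpy as np

def softmax(x):
    t = np.exp(x - np.max(x))
    return t / t.sum(axis=0)

def rnn_cell_forward(xt, a_prev, parameters):
    # Single time step, as defined in section 1
    Wax, Waa, ba = parameters['Wax'], parameters['Waa'], parameters['ba']
    Wya, by = parameters['Wya'], parameters['by']
    a_next = np.tanh(np.dot(Waa, a_prev) + np.dot(Wax, xt) + ba)
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    cache = (a_next, a_prev, xt, parameters)
    return a_next, yt_pred, cache

def rnn_forward_sketch(x, a0, parameters):
    # Hypothetical helper: runs the cell over every time step with the same parameters
    n_x, m, T_x = x.shape
    n_y, n_a = parameters['Wya'].shape

    a = np.zeros((n_a, m, T_x))       # hidden states for every time step
    y_pred = np.zeros((n_y, m, T_x))  # predictions for every time step
    caches = []

    a_next = a0
    for t in range(T_x):
        # The hidden state produced at step t becomes a_prev at step t + 1
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        a[:, :, t] = a_next
        y_pred[:, :, t] = yt_pred
        caches.append(cache)

    return a, y_pred, caches

np.random.seed(1)
x = np.random.randn(3, 10, 4)   # (n_x, m, T_x)
a0 = np.random.randn(5, 10)     # (n_a, m)
parameters = {
    'Waa': np.random.randn(5, 5),
    'Wax': np.random.randn(5, 3),
    'Wya': np.random.randn(2, 5),
    'ba': np.random.randn(5, 1),
    'by': np.random.randn(2, 1),
}

a, y_pred, caches = rnn_forward_sketch(x, a0, parameters)
print(a.shape)       # (5, 10, 4)
print(y_pred.shape)  # (2, 10, 4)
print(len(caches))   # 4
```

Storing one cache per time step mirrors the single-step function, which returns its cache for use in backpropagation.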