## Recurrent Neural Network

Recurrent neural networks (RNNs) are a class of artificial neural networks designed to learn patterns in sequential data. RNNs are used in many fields, including speech recognition, stock price prediction, and machine translation. The basic idea can be illustrated with a language model, in which the probability of a sentence of $m$ words is calculated by

\begin{equation*}
P(w_1, w_2, \ldots, w_m) = \prod_{i=1}^{m} P(w_i \mid w_1, w_2, \ldots, w_{i-1}).
\end{equation*}
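
As a small illustration of this factorization, the sketch below multiplies per-word conditional probabilities for a three-word sentence. The values in `cond_probs` are made up for the example and do not come from any trained model; the names `cond_probs`, `sentence_prob`, and `sentence_log_prob` are likewise assumptions.

```python
import numpy as np

# Hypothetical values of P(w_i | w_1, ..., w_{i-1}) for a three-word sentence.
# These numbers are invented purely for illustration.
cond_probs = np.array([0.2, 0.5, 0.1])

# Chain rule: the sentence probability is the product of the conditionals.
sentence_prob = np.prod(cond_probs)

# Summing log-probabilities is equivalent and avoids underflow on long sentences.
sentence_log_prob = np.sum(np.log(cond_probs))

print(sentence_prob)
print(np.exp(sentence_log_prob))
```

Working with log-probabilities is a common choice in practice, because the product of many values below one quickly becomes too small to represent in floating point.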

The probability of a sentence is the product of the probabilities of each word given all the words preceding it. An RNN learns sequence patterns by performing the same task on each element, one after another, with the same weights reused at every step and information about the preceding elements carried in the hidden state. The main equations of an RNN are:

\begin{equation*}
h_t = \sigma(U x_t + W h_{t-1} + b_1)
\end{equation*}
\begin{equation*}
y_t = \mathrm{softmax}(V h_t + b_2)
\end{equation*}

where $U$, $W$, and $V$ are weight matrices, $b_1$ and $b_2$ are the bias vectors of the hidden and output layers, $\sigma$ is a nonlinear activation function, $x_t$ is the input at step $t$, $h_t$ is the hidden state at step $t$, and $y_t$ is the output at step $t$.
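
The two equations can be sketched as a single time step in NumPy. This is a minimal sketch under several assumptions: $\sigma$ is taken to be the logistic sigmoid (tanh is a common alternative), the hidden and output layers use separate biases named `b1` and `b2`, all vectors are column vectors, and the toy sizes `n_hidden` and `n_vocab` as well as the helper names `sigmoid`, `softmax`, and `rnn_step` are chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    # Logistic sigmoid; assumed here as the activation sigma.
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the column maximum for numerical stability, then normalise.
    e = np.exp(z - np.max(z, axis=0, keepdims=True))
    return e / np.sum(e, axis=0, keepdims=True)

def rnn_step(x_t, h_prev, U, W, V, b1, b2):
    # h_t = sigma(U x_t + W h_{t-1} + b1)
    h_t = sigmoid(U @ x_t + W @ h_prev + b1)
    # y_t = softmax(V h_t + b2)
    y_t = softmax(V @ h_t + b2)
    return h_t, y_t

# Toy dimensions, chosen only for this example.
n_hidden, n_vocab = 4, 6
rng = np.random.default_rng(0)
U = 0.01 * rng.standard_normal((n_hidden, n_vocab))
W = 0.01 * rng.standard_normal((n_hidden, n_hidden))
V = 0.01 * rng.standard_normal((n_vocab, n_hidden))
b1 = np.zeros((n_hidden, 1))
b2 = np.zeros((n_vocab, 1))

# One-hot input for the word with index 2, and an all-zero initial hidden state.
x_t = np.zeros((n_vocab, 1))
x_t[2] = 1.0
h_prev = np.zeros((n_hidden, 1))

h_t, y_t = rnn_step(x_t, h_prev, U, W, V, b1, b2)
print(h_t.shape, y_t.shape, y_t.sum())
```

The output `y_t` is a probability distribution over the vocabulary, so in the language-model setting it can be read as the predicted distribution for the next word. Feeding `h_t` back in as `h_prev` at the next step is what makes the network recurrent.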


References:

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

https://github.com/dennybritz/rnn-tutorial-rnnlm

In [None]:
import numpy as np 


In [None]:
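# hidden_size and vocab_size must be set before running this cell.
# Small random initial weights for the matrices U, W, V in the equations above.
U = 0.01 * np.random.randn(hidden_size, vocab_size)   # input -> hidden
W = 0.01 * np.random.randn(hidden_size, hidden_size)  # hidden -> hidden
V = 0.01 * np.random.randn(vocab_size, hidden_size)   # hidden -> output

# Biases as column vectors, so they add to U @ x_t and V @ h_t without unintended broadcasting.
b1 = np.zeros((hidden_size, 1))  # hidden-layer bias
b2 = np.zeros((vocab_size, 1))   # output-layer bias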