
A major characteristic of all the NNs seen so far is that they have no memory. Each input shown to them is processed independently, with no state kept between inputs. Such NNs are called **feedforward networks**.

A **recurrent neural network (RNN)** processes a sequence by iterating through its elements and maintaining a **state** that holds information about what it has seen so far. In effect, an RNN is a type of NN that has an internal loop. The state of the RNN is reset between two different, independent sequences (such as two movie reviews), so you still consider one sequence a single data point: a single input to the network. What changes is that this data point is no longer processed in a single step. Instead, the network internally loops over the sequence elements.
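The loop-and-reset behaviour described above can be sketched in a few lines. This is only an illustration: `step` and `process_sequence` are hypothetical names, and the body of `step` is a placeholder rather than the transformation an actual RNN uses.

```python
import numpy as np

def step(input_t, state_t):
    # Hypothetical placeholder: any function of the current input and the
    # current state could stand here.
    return np.tanh(input_t + state_t)

def process_sequence(sequence):
    # The state starts from zeros for every new sequence, so nothing
    # carries over between independent sequences.
    state_t = np.zeros_like(sequence[0])
    for input_t in sequence:  # internal loop over the sequence elements
        state_t = step(input_t, state_t)
    return state_t

seq_a = np.ones((3, 2))   # one data point: 3 elements with 2 features each
seq_b = np.zeros((4, 2))  # another, independent data point

out_a = process_sequence(seq_a)
out_b = process_sequence(seq_b)
```

Because the state is reset inside `process_sequence`, processing `seq_b` is unaffected by having processed `seq_a` first.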

# NumPy implementation of a simple RNN

This RNN takes as input a sequence of vectors, which you will encode as a 2D tensor of shape ```(timesteps, input_features)```. It loops over timesteps, and at each timestep it considers its current state at *t* and the input at *t* (of shape ```(input_features,)```), and combines them to obtain the output at *t*.
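Written as an equation, the combination used by the NumPy code in this section is the following, where the symbols mirror the variable names `W`, `U`, `b`, `input_t`, `state_t`, and `output_t`. The choice of `tanh` is the one made in that code; other activation functions are possible.

$$
\text{output}_t = \tanh\left(W \cdot \text{input}_t + U \cdot \text{state}_t + b\right), \qquad \text{state}_{t+1} = \text{output}_t
$$

With `W` of shape `(output_features, input_features)` and `U` of shape `(output_features, output_features)`, both products are vectors of shape `(output_features,)`, so they can be added to `b`.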

In [1]:
import numpy as np

In [2]:
timesteps = 100 # Number of timesteps in the input sequence
input_features = 32 # Dimensionality of the input feature space
output_features = 64 # Dimensionality of the output feature space

In [3]:
inputs = np.random.random((timesteps, input_features)) # Random noise for the sake of the example

In [4]:
state_t = np.zeros((output_features, )) # Initial state: an all-zero vector

In [5]:
W = np.random.random((output_features, input_features))

In [6]:
U = np.random.random((output_features, output_features))

In [7]:
b = np.random.random((output_features, ))

In [8]:
W.shape

(64, 32)

In [9]:
U.shape

(64, 64)

In [10]:
b.shape

(64,)

In [11]:
successive_outputs = []

In [12]:
for input_t in inputs: # input_t is a vector of shape (input_features,)
    output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b) # Combine the input with the current state to obtain the current output
    successive_outputs.append(output_t)
    state_t = output_t # Update the state of the network for the next timestep
    

In [13]:
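# The final output is a 2D tensor of shape (timesteps, output_features).
# np.stack adds a new leading axis; np.concatenate would instead join the
# 1D outputs end to end into a flat vector of shape (6400,).
final_output_sequence = np.stack(successive_outputs, axis=0)

In [14]:
final_output_sequence.shape

(100, 64)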

In [15]:
len(successive_outputs)

100

In [16]:
successive_outputs[0].shape

(64,)

![](https://dpzbhybb2pdcj.cloudfront.net/chollet/Figures/06fig10_alt.jpg)

In this example, the final output is a 2D tensor of shape `(timesteps, output_features)`, where each row is the output of the loop at timestep *t*. Row *t* of the output tensor contains information about timesteps 0 to *t* of the input sequence, that is, about the entire past. For this reason, in many cases you don't need the full sequence of outputs. You only need the last output (`output_t` at the end of the loop), because it already contains information about the entire sequence.
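To make the difference between the full output sequence and the last output concrete, here is a minimal sketch that wraps the loop above in a function. The function name `simple_rnn_forward` and its `return_sequences` flag are hypothetical names chosen for this example. The weights are random, as in the cells above.

```python
import numpy as np

def simple_rnn_forward(inputs, W, U, b, return_sequences=True):
    """Hypothetical helper: run the simple RNN loop over `inputs`."""
    state_t = np.zeros((U.shape[0],))  # initial state: an all-zero vector
    outputs = []
    for input_t in inputs:
        output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)
        outputs.append(output_t)
        state_t = output_t  # the output becomes the state for the next timestep
    if return_sequences:
        # Full sequence: shape (timesteps, output_features)
        return np.stack(outputs, axis=0)
    # Last output only: shape (output_features,)
    return state_t

rng = np.random.default_rng(0)
timesteps, input_features, output_features = 100, 32, 64
inputs = rng.random((timesteps, input_features))
W = rng.random((output_features, input_features))
U = rng.random((output_features, output_features))
b = rng.random((output_features,))

full_sequence = simple_rnn_forward(inputs, W, U, b, return_sequences=True)
last_output = simple_rnn_forward(inputs, W, U, b, return_sequences=False)
```

Here `last_output` is the same vector as the last row of `full_sequence`, so returning only the last output saves memory without discarding the summary of the whole sequence.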