# Neural Network from Scratch with Autograd

---
## Introduction 
Autograd is a library that automatically differentiates native Python and NumPy code. It handles loops, conditionals, recursion, and closures. It supports reverse-mode differentiation, the same technique that underlies backpropagation in frameworks such as TensorFlow. Its primary purpose is gradient-based optimization, which is how we train neural networks. Here, we will build a neural network from scratch and use Autograd to compute the gradients needed for backpropagation.

---
## Loading Data

In [5]:
from keras.datasets import imdb

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

This function will convert reviews from integer sequences to text.

In [6]:
def sequence_to_text(sequence):
    """Converts an integer sequence into the actual review text."""
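    word_index = imdb.get_word_index()
    reverse_index = {value: key for (key, value) in word_index.items()}
    # Indices are offset by 3 because 0, 1, and 2 are reserved for
    # padding, start-of-sequence, and unknown words.
    return ' '.join([reverse_index.get(i - 3, '?') for i in sequence])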

print("Review")
print("~~~~~~")
print(sequence_to_text(train_data[0]), end="\n\n")

print("Sentiment")
print("~~~~~~~~~")
print(train_labels[0])

Review
~~~~~~
? this film was just brilliant casting location scenery story direction everyone's really suited the part they played and you could just imagine being there robert ? is an amazing actor and now the same being director ? father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it was released for ? and would recommend it to everyone to watch and the fly fishing was amazing really cried at the end it was so sad and you know what they say if you cry at a film it must have been good and this definitely was also ? to the two little boy's that played the ? of norman and paul they were just brilliant children are often left out of the ? list i think because the stars that play them all grown up are such a big profile for the whole film but these children are amazing and should be praised for what they have done
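The `?` characters in the decoded review come from the index offset: values that do not map to a word after subtracting 3 fall back to `'?'`. The sketch below reproduces this with a tiny, made-up word index (`toy_word_index` and `decode` are hypothetical stand-ins, so nothing needs to be downloaded).

```python
# Hypothetical miniature word index; the real one comes from imdb.get_word_index().
toy_word_index = {"the": 1, "film": 2, "brilliant": 3}
reverse_index = {value: key for key, value in toy_word_index.items()}

def decode(sequence):
    """Same lookup as sequence_to_text, but against the toy index."""
    return " ".join(reverse_index.get(i - 3, "?") for i in sequence)

# 1 is the start-of-sequence marker and 2 is the unknown-word marker,
# so both decode to '?'.
decoded = decode([1, 4, 5, 2, 6])
print(decoded)  # ? the film ? brilliant
```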

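Before we can feed the reviews into a neural network, we have to turn each variable-length integer sequence into a fixed-length vector. We use a multi-hot encoding: each review becomes a 10,000-dimensional vector with a 1 at every word index that appears in the review and 0 everywhere else.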

In [9]:
import numpy as np 

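def vectorize_sequences(sequences, dimension=10000):
    """Multi-hot encodes each integer sequence into a vector of length `dimension`."""
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        # Set every position listed in the sequence to 1 (repeats have no extra effect).
        results[i, sequence] = 1
    return results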

# Vectorize our input data 
vectorized_train_data = vectorize_sequences(train_data)
vectorized_test_data = vectorize_sequences(test_data)

# Convert the labels to float32 arrays
vectorized_train_labels = np.asarray(train_labels).astype('float32')
vectorized_test_labels = np.asarray(test_labels).astype('float32')
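To see what this encoding produces, here is the same function applied to two tiny sequences with `dimension=5`. The input values are made up for illustration.

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1
    return results

# Two toy "reviews": the first contains words 0 and 2 (2 appears twice).
toy = vectorize_sequences([[0, 2, 2], [4]], dimension=5)
print(toy)
# [[1. 0. 1. 0. 0.]
#  [0. 0. 0. 0. 1.]]
```

Word order and word counts are discarded; only the presence of a word is recorded.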

Let's check the shape of our data.

In [22]:
print('Input Data')
print('Train Data', vectorized_train_data.shape)
print('Test Data', vectorized_test_data.shape)

print('\nOutput Data')
print('Train Labels', vectorized_train_labels.shape)
print('Test Labels', vectorized_test_labels.shape)

Input Data
Train Data (25000, 10000)
Test Data (25000, 10000)

Output Data
Train Labels (25000,)
Test Labels (25000,)


Now let's create a training and validation set.

In [10]:
# Separate input data
x_val = vectorized_train_data[:10000]
x_train = vectorized_train_data[10000:]

# Separate labels 
y_val = vectorized_train_labels[:10000]
y_train_partial = vectorized_train_labels[10000:]
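As a quick sanity check on this kind of split, the sketch below applies the same slicing to stand-in arrays with 25,000 rows. The `*_demo` names and the stand-in data are hypothetical; the point is that cutting inputs and labels at the same index keeps them aligned.

```python
import numpy as np

# Stand-in arrays with the same number of rows as the vectorized training set.
features = np.arange(25000).reshape(-1, 1)
labels = np.arange(25000)

val_size = 10000  # same cut-off as the slices above
x_val_demo, x_train_demo = features[:val_size], features[val_size:]
y_val_demo, y_train_demo = labels[:val_size], labels[val_size:]

print(x_val_demo.shape, x_train_demo.shape)  # (10000, 1) (15000, 1)

# Row i of the inputs still corresponds to label i after the split.
assert (x_train_demo[:, 0] == y_train_demo).all()
```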

--- 
## Step 1: Initializing Parameters

In [51]:
# Autograd's wrapped NumPy; this rebinds `np` so array operations can be differentiated.
import autograd.numpy as np
from autograd import grad

def init_parameters(sizes):
    """Initialize weights and biases for a network with the given layer sizes."""
    parameters = {}
    for i in range(1, len(sizes)):
        # Small random weights break the symmetry between units.
        parameters['W' + str(i)] = np.random.randn(sizes[i-1], sizes[i]) * 0.01
        # Row-vector bias so it broadcasts over the batch in np.dot(A, W) + b.
        parameters['b' + str(i)] = np.zeros((1, sizes[i]))
    return parameters

Let's create some parameters to see what this looks like.

In [47]:
parameters = init_parameters([10000, 32, 1])
for i in range(1, len(parameters) // 2 + 1):
    print('W' + str(i), parameters['W' + str(i)].shape)
    print('b' + str(i), parameters['b' + str(i)].shape)

W1 (10000, 32)
b1 (1, 32)
W2 (32, 1)
b2 (1, 1)
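These shapes also tell us how many numbers the network has to learn. The helper below, `count_parameters`, is a hypothetical addition for illustration; it counts one weight per connection between consecutive layers plus one bias per unit.

```python
def count_parameters(sizes):
    """Hypothetical helper: total number of weights and biases for the given layer sizes."""
    total = 0
    for i in range(1, len(sizes)):
        total += sizes[i - 1] * sizes[i]  # entries in weight matrix W_i
        total += sizes[i]                 # entries in bias vector b_i
    return total

print(count_parameters([10000, 32, 1]))  # 320065
```

Almost all of the parameters sit in the first layer, because the input has 10,000 dimensions.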


---
## Step 2: Forward Propagation

### Activation Functions
We will start by defining our non-linearities.

In [48]:
def relu(Z):
    return np.maximum(Z, 0)

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))
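A quick check of both functions on a few sample values shows their behaviour: `relu` clips negative inputs to zero, and `sigmoid` squashes any input into the range (0, 1). Plain NumPy is used here so the snippet runs on its own; the sample values are arbitrary.

```python
import numpy as np

def relu(Z):
    return np.maximum(Z, 0)

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

Z = np.array([-2.0, 0.0, 3.0])

print(relu(Z))     # [0. 0. 3.]
print(sigmoid(Z))  # roughly 0.119, 0.5, 0.953
```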

### Single Layer Computation

In [49]:
def feed_forward_step(A, W, b, activation):
    """Performs one step of forward propagation: activation(np.dot(A, W) + b)."""
    Z = np.dot(A, W) + b
    return activation(Z)

Let's test that this works manually.

In [55]:
A0 = x_train[:32]
A1 = feed_forward_step(A0, parameters['W1'], parameters['b1'], relu)
A2 = feed_forward_step(A1, parameters['W2'], parameters['b2'], sigmoid)

print('Layer Dimensions')
print('Input', A0.shape)
print('Hidden', A1.shape)
print('Output', A2.shape)

print('Output')
print(A2[:5])

Layer Dimensions
Input (32, 10000)
Hidden (32, 32)
Output (32, 1)
Output
[[0.72559597]
 [0.72621343]
 [0.72639743]
 [0.72604409]
 [0.72640842]]
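For a check whose result we can verify by hand, here is the same step on a tiny input with two examples, three features, and a single output unit. All the numbers are made up for illustration, and plain NumPy is used so the snippet is self-contained.

```python
import numpy as np

def relu(Z):
    return np.maximum(Z, 0)

def feed_forward_step(A, W, b, activation):
    Z = np.dot(A, W) + b
    return activation(Z)

A = np.array([[1.0, 0.0, 1.0],   # example 1
              [0.0, 1.0, 0.0]])  # example 2
W = np.array([[0.5], [-1.0], [0.5]])  # 3 inputs -> 1 unit
b = np.array([[0.5]])

out = feed_forward_step(A, W, b, relu)

# Example 1: 0.5 + 0.5 + bias 0.5 = 1.5, which relu leaves unchanged.
# Example 2: -1.0 + bias 0.5 = -0.5, which relu clips to 0.
print(out.shape)     # (2, 1)
print(out.tolist())  # [[1.5], [0.0]]
```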


### Forward Propagation

In [57]:
def feed_forward(X, parameters, activations):
    """Performs forward propagation for the given examples."""
    A = X
    for i in range(1, len(parameters) // 2 + 1):
        W = parameters['W' + str(i)]
        b = parameters['b' + str(i)]
        # activations[0] is a placeholder so that index i matches layer i.
        activation = activations[i]
        A = feed_forward_step(A, W, b, activation)
    return A

We can confirm that this is correct by checking our previous example.

In [59]:
activations = [None, relu, sigmoid]
out = feed_forward(A0, parameters, activations)
print(out[:5])

[[0.72559597]
 [0.72621343]
 [0.72639743]
 [0.72604409]
 [0.72640842]]


---
## Step 3: Defining an Objective Function