# Multi-class Classification: Neural Networks (only Forward Propagation)

This notebook covers the last part of Programming Exercise 3 from Andrew Ng's course on Coursera.

The goal is to implement a multi-class classification model, using a neural network, that recognizes handwritten digits (0-9). Only forward propagation is implemented, because the model has already been trained and its Theta values are provided.

The NN has 3 layers:
- L1 (input layer): 401 = 400 (20x20) features + 1 bias
- L2 (hidden layer): 25 units + 1 bias
- L3 (output layer): 10 classification labels
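
The layer sizes above determine the shape of each weight matrix. As a minimal sketch (the names `layer_sizes` and `theta_shapes` are illustrative and not part of the exercise), the shapes can be derived like this:

```python
# Units per layer as described above, bias units excluded
layer_sizes = [400, 25, 10]

# A weight matrix maps one layer (plus its bias unit) to the next layer,
# so its shape is (units in next layer, units in current layer + 1)
theta_shapes = [(n_out, n_in + 1)
                for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
```

With these sizes, `theta_shapes` comes out as `[(25, 401), (10, 26)]`, which is what the provided weights should look like once loaded.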

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## Load Data
The data is loaded the same way as in the previous exercise with the Logistic Regression classifier.

In [8]:
# Load the data (in .mat format)
from scipy.io import loadmat
data = loadmat('ex3data1.mat')

X = data['X']
y = data['y']
X.shape, y.shape

((5000L, 400L), (5000L, 1L))

## Load trained model (theta values)

In [9]:
# Load the pre-trained weights (in .mat format)
from scipy.io import loadmat
thetas = loadmat('ex3weights.mat')
theta1, theta2 = thetas['Theta1'], thetas['Theta2']

In [10]:
print "Theta1 has shape:",theta1.shape
print "Theta2 has shape:",theta2.shape

Theta1 has shape: (25L, 401L)
Theta2 has shape: (10L, 26L)


Dimensions of the activations and weight matrices at each step, for a single sample:

- a1 => 401 (400 + 1 bias)
- theta1 => 25 x 401
- a2 => 26 (25 + 1 bias)
- theta2 => 10 x 26
- a3 => 10
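
These sizes follow from the forward propagation equations. Below is a sketch in row-vector form for a single sample $x$, where $g$ denotes the sigmoid function. The superscript notation is an assumption borrowed from common course notation and is not used elsewhere in this notebook.

$$
\begin{aligned}
a^{(1)} &= [\,1,\; x\,] && \in \mathbb{R}^{1 \times 401} \\
z^{(2)} &= a^{(1)} \, (\Theta^{(1)})^{T} && \in \mathbb{R}^{1 \times 25} \\
a^{(2)} &= [\,1,\; g(z^{(2)})\,] && \in \mathbb{R}^{1 \times 26} \\
z^{(3)} &= a^{(2)} \, (\Theta^{(2)})^{T} && \in \mathbb{R}^{1 \times 10} \\
h = a^{(3)} &= g(z^{(3)}) && \in \mathbb{R}^{1 \times 10}
\end{aligned}
$$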

## Forward Propagation

In [11]:
# sigmoid function definition
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
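
A quick sanity check of the sigmoid can help catch sign errors before it is used in the network. The sketch below repeats the definition so it runs on its own; the variable name `values` is illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# The function works elementwise on arrays:
# large negative inputs approach 0, zero maps to 0.5,
# and large positive inputs approach 1
values = sigmoid(np.array([-10.0, 0.0, 10.0]))
```

The sigmoid is also symmetric around 0.5, that is `sigmoid(-z) == 1 - sigmoid(z)` up to floating point error, so `values[0] + values[2]` should be very close to 1.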

In [33]:
# Function that receives as input an X[1x400], a theta1[25x401] and a theta2[10x26]
# and returns all the intermediate forward propagation steps (a1, z2, a2, z3, h)
def forward_propagate(X, theta1, theta2):
    
    # convert to matrices to use notation of algebra operations
    X = np.matrix(X)
    theta1 = np.matrix(theta1)
    theta2 = np.matrix(theta2)
    
    # set helper variable: number of samples (rows) in X, 1 for a single sample
    num_samples = X.shape[0]

    # add the bias unit at Layer 1 (a1 is a vector of [1x401])
    a1 = np.insert(X, 0, values=np.ones(num_samples), axis=1)

    # Compute z2 (a1[1x401] · theta1.T[401x25] = z2[1x25])
    z2 = a1 * theta1.T

    # Compute a2, and add the bias unit at Layer 2 (a2 is a vector of [1x26])
    a2 = np.insert(sigmoid(z2), 0, values=np.ones(num_samples), axis=1)
    
    # Compute z3 (a2[1x26] · theta2.T[26x10] = z3[1x10])
    z3 = a2 * theta2.T

    # Compute h (or a3) (h is a vector of [1x10])
    h = sigmoid(z3)
    
    return a1, z2, a2, z3, h
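
The same steps can also be written with plain NumPy arrays instead of `np.matrix`, which the NumPy documentation no longer recommends. The following is a sketch under that assumption, not a replacement for the function above: `forward_propagate_arrays` is a hypothetical name, and the inputs are random stand-ins with the same shapes as the exercise data, so the output values themselves are meaningless.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward_propagate_arrays(X, theta1, theta2):
    # Accept a single sample (400,) or a batch (m, 400)
    X = np.atleast_2d(X)
    m = X.shape[0]

    # Prepend a column of ones as the bias unit: (m, 401)
    a1 = np.hstack([np.ones((m, 1)), X])

    # Hidden layer: (m, 401) . (401, 25) -> (m, 25), then add bias -> (m, 26)
    z2 = np.dot(a1, theta1.T)
    a2 = np.hstack([np.ones((m, 1)), sigmoid(z2)])

    # Output layer: (m, 26) . (26, 10) -> (m, 10)
    z3 = np.dot(a2, theta2.T)
    h = sigmoid(z3)
    return a1, z2, a2, z3, h

# Random stand-in inputs (NOT the exercise data), used only to check shapes
rng = np.random.RandomState(0)
demo_out = forward_propagate_arrays(rng.rand(5, 400),
                                    rng.randn(25, 401),
                                    rng.randn(10, 26))
demo_shapes = [arr.shape for arr in demo_out]
```

Because every step is a matrix product over rows, the same call handles one sample or the whole dataset at once; here `demo_shapes` reflects a batch of 5 samples.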

In [35]:
# Execute the forward propagation for a single sample of X
a1, z2, a2, z3, h = forward_propagate(X[0], theta1, theta2)

In [37]:
print "a1 has shape:    ", a1.shape
print "theta1 has shape:", theta1.shape
print "z2 has shape:    ", z2.shape
print "a2 has shape:    ", a2.shape
print "theta2 has shape:", theta2.shape
print "z3 has shape:    ", z3.shape
print "h (a3) has shape:", h.shape

a1 has shape:     (1L, 401L)
theta1 has shape: (25L, 401L)
z2 has shape:     (1L, 25L)
a2 has shape:     (1L, 26L)
theta2 has shape: (10L, 26L)
z3 has shape:     (1L, 10L)
h (a3) has shape: (1L, 10L)


OK! Shapes are correct! :)
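
With the shapes confirmed, the output `h` can be turned into a predicted label by picking the unit with the highest activation. The sketch below assumes the course's labeling convention for this dataset, where labels run from 1 to 10 and 10 stands for the digit 0. `predict_label` and `h_demo` are hypothetical names, and the activations are made up for illustration.

```python
import numpy as np

def predict_label(h):
    # Index of the most active output unit per row (0-based),
    # shifted by 1 because labels are assumed to run from 1 to 10.
    # np.asarray(...).ravel() handles both np.matrix and ndarray inputs.
    return np.asarray(np.argmax(h, axis=1)).ravel() + 1

# Made-up activations for two samples (NOT real model output)
h_demo = np.array([
    [0.10, 0.05, 0.80, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.05],
    [0.0,  0.0,  0.0,  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.90],
])
labels_demo = predict_label(h_demo)
```

For these two rows the most active units are the third and the tenth, so `labels_demo` holds the labels 3 and 10. Comparing such predictions against `y` over all samples would give the training accuracy.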