# Multi-Layer Perceptrons

To solve more complicated problems, we need to add layers to the network. Several questions come up immediately:
 - How many perceptrons do we need?
 - How many layers are sufficient?
 - Should the hidden layers be larger or smaller than the input layer?

This notebook will shed light on some of these questions through examples.

## XOR Problem

Let's start out by creating the data representing the XOR problem.

In [7]:
import numpy as np
import matplotlib.pyplot as plt

%matplotlib notebook

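def calc_decision_boundary(weights, bias):
    """Return [m, b] for the line weights[0]*x + weights[1]*y + bias = 0.

    For a vertical boundary (weights[1] == 0), m is np.inf and b holds the
    x-intercept instead of the y-intercept.
    """
    if weights[1] == 0:
        return np.array([np.inf, -bias / weights[0]])
    # Computing the slope directly avoids a 0/0 when bias == 0.
    m = -weights[0] / weights[1]
    b = -bias / weights[1]
    return np.array([m, b])

def gen_boundary_points(weights, m, b):
    """Generate 100 points along the boundary line for plotting."""
    if weights[1] != 0:
        x = np.linspace(-5, 5, 100)
        y = m * x + b
    else:
        # The slope is undefined, so the line is vertical at x = b.
        x = np.full(100, b)
        y = np.linspace(-5, 5, 100)

    return x, y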

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))
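
For reference, the line drawn by the plotting helpers above can be derived from the perceptron's pre-activation. The symbols $w_0$ and $w_1$ stand for `weights[0]` and `weights[1]`; they are notation introduced here for illustration. The boundary is the set of points where the weighted sum is zero:

$$
w_0 x + w_1 y + \text{bias} = 0
\quad\Longrightarrow\quad
y = -\frac{w_0}{w_1}\,x - \frac{\text{bias}}{w_1},
\qquad w_1 \neq 0
$$

So the slope is $m = -w_0 / w_1$ and the intercept is $b = -\text{bias} / w_1$. When $w_1 = 0$ the slope is undefined and the boundary is the vertical line $x = -\text{bias} / w_0$.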

In [3]:
data = np.array([[0, 0],
                 [0, 1],
                 [1, 0],
                 [1, 1]])
targets = np.array([0, 1, 1, 0])

fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(data[:, 0], data[:, 1], c=targets)

What was the result of using a single perceptron to solve this problem?

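The best a single perceptron can do is 75% accuracy. XOR is not linearly separable, so any single line misclassifies at least one of the four points.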
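
One way to check the 75% figure empirically is a brute-force search over perceptron parameters. The sketch below is illustrative only: the search grid, the step activation (predict 1 when the weighted sum is positive), and the helper name `perceptron_accuracy` are assumptions, not part of the original notebook.

```python
import itertools
import numpy as np

# XOR inputs and labels, redefined so the block runs on its own
data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = np.array([0, 1, 1, 0])

def perceptron_accuracy(weights, bias):
    """Accuracy of a step-activation perceptron on the XOR data (hypothetical helper)."""
    predictions = (data @ weights + bias > 0).astype(int)
    return np.mean(predictions == targets)

# Assumed search grid: 21 evenly spaced values in [-2, 2] for each parameter
grid = np.linspace(-2.0, 2.0, 21)
best_accuracy = max(
    perceptron_accuracy(np.array([w0, w1]), bias)
    for w0, w1, bias in itertools.product(grid, grid, grid)
)
print(best_accuracy)
```

No combination on the grid classifies all four points correctly, which is consistent with the claim above.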

In [4]:
# Classifier Parameters
weights = np.array([1, 1]) 
bias = -0.5

# For visualizing the line
m, b = calc_decision_boundary(weights, bias)

x, y = gen_boundary_points(weights, m, b)
    
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y, c='g')
ax.scatter(data[:, 0], data[:, 1], c=targets)
ax.set_xlim([-0.2, 1.2])
ax.set_ylim([-0.2, 1.2])

## Adding another perceptron

We begin by adding a hidden layer with a single perceptron having a sigmoidal, nonlinear activation function.

Because the hidden layer has only a single unit, it produces a scalar output for each sample, and this changes how we initialize the output perceptron. The output perceptron needs one weight for each incoming value. Since the hidden layer output is of size 1, the output perceptron has only a single weight.

In [5]:
hidden_weights = np.array([1, 1])
hidden_bias = -0.5
out_weights = np.array([1])
out_bias = 0

# For visualizing the line
hidden_m, hidden_b = calc_decision_boundary(hidden_weights, hidden_bias)

hidden_x, hidden_y = gen_boundary_points(hidden_weights, hidden_m, hidden_b)
    
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(hidden_x, hidden_y, c='g')
ax.scatter(data[:, 0], data[:, 1], c=targets)
ax.set_xlim([-0.2, 1.2])
ax.set_ylim([-0.2, 1.2])
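
To make the shapes concrete, here is a minimal sketch of a forward pass through this network using the parameters above. It assumes a linear output unit (no activation on the output), and it redefines `data` and `sigmoid` so that it runs on its own.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Parameters from the cell above
hidden_weights = np.array([1, 1])
hidden_bias = -0.5
out_weights = np.array([1])
out_bias = 0

# Hidden layer: one pre-activation value per sample, shape (4,)
hidden_out = data @ hidden_weights + hidden_bias
hidden_act = sigmoid(hidden_out)

# Output layer (assumed linear): the single weight scales the single hidden activation
out = out_weights[0] * hidden_act + out_bias

print(hidden_out.shape, out.shape)
print(out)
```

Because the sigmoid is monotonic, `out` increases with `hidden_out`. The two class-1 points, (0, 1) and (1, 0), share the same middle value, while the two class-0 points sit on either side of it, so no single threshold on `out` separates the classes.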

A single perceptron in the hidden layer means that we still have only a single decision boundary. Let's add a second perceptron to the hidden layer and see how things change.

In [8]:
hidden_weights = np.array([[1, 1], [1, 1]])
hidden_bias = np.array([-0.5, -1.5])
out_weights = np.array([1.0, -1.0])
out_bias = -0.2

# For visualizing the line
hidden_m0, hidden_b0 = calc_decision_boundary(hidden_weights[0], hidden_bias[0])
hidden_x0, hidden_y0 = gen_boundary_points(hidden_weights[0], hidden_m0, hidden_b0)
hidden_m1, hidden_b1 = calc_decision_boundary(hidden_weights[1], hidden_bias[1])
hidden_x1, hidden_y1 = gen_boundary_points(hidden_weights[1], hidden_m1, hidden_b1)

# Forward propagation
# Broadcast the per-unit bias across all samples (one column per sample).
hidden_out = hidden_weights @ data.T + hidden_bias[:, np.newaxis]
hidden_act = sigmoid(hidden_out)
print(hidden_out)
print(hidden_act)
# Linear output unit: weighted sum of the hidden activations.
out = out_weights @ hidden_act + out_bias
print(out)
    
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(hidden_x0, hidden_y0, c='g')
ax.plot(hidden_x1, hidden_y1, c='g')
ax.scatter(data[:, 0], data[:, 1], c=targets)
ax.set_xlim([-0.2, 1.2])
ax.set_ylim([-0.2, 1.2])

[[-0.5  0.5  0.5  1.5]
 [-1.5 -0.5 -0.5  0.5]]
[[0.37754067 0.62245933 0.62245933 0.81757448]
 [0.18242552 0.37754067 0.37754067 0.62245933]]
[-0.00488486  0.04491866  0.04491866 -0.00488486]
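
The printed outputs are positive for the two class-1 points and negative for the two class-0 points. The sketch below turns them into class labels and compares them with the targets. The decision rule (predict 1 when the output is positive) is an assumption, since the notebook does not state one; the parameters are copied from the cell above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = np.array([0, 1, 1, 0])

# Parameters from the cell above
hidden_weights = np.array([[1, 1], [1, 1]])
hidden_bias = np.array([-0.5, -1.5])
out_weights = np.array([1.0, -1.0])
out_bias = -0.2

# Forward pass: hidden activations have shape (2, 4), output has shape (4,)
hidden_act = sigmoid(hidden_weights @ data.T + hidden_bias[:, np.newaxis])
out = out_weights @ hidden_act + out_bias

# Assumed decision rule: predict class 1 when the output is positive
predictions = (out > 0).astype(int)
print(predictions)                           # [0 1 1 0]
print(np.array_equal(predictions, targets))  # True
```

With two hidden units, the output unit takes the difference of two sigmoid activations, which is largest for the points lying between the two hidden decision boundaries.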

