# Refactoring the multi-layer perceptron
In the demo code from lecture, we implemented a multi-layer perceptron (MLP) with one hidden layer, but the code is brittle: the forward pass, the loss, the gradients, and the weight updates are all packed into a single loop.

For now, we'll stick to the simple model without activation functions (so the whole network is still just a composition of linear transformations), but the refactoring will make it easier to add activation functions later.

## Objectives
- Refactor the code to make it more modular
- Understand the steps in the training loop
- Visualize the learned hidden space

In [None]:
import numpy as np
import matplotlib.pyplot as plt

For reference, here's the original training loop:

In [None]:
# toy MLP example: one sample, one hidden layer, no activation functions
# input, target, and initial weights
x = np.array([2, 3])
y = 1
w1 = np.array([[-0.78, 0.13], [0.85, 0.23]])
w2 = np.array([1.8, 0.40])

iterations = 20
eta = 0.01
loss = np.zeros(iterations)

for i in range(iterations):
    # forward pass
    y_hat = x @ w1 @ w2
    
    # update loss
    loss[i] = 0.5 * (y_hat - y)**2

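    # backpropagate: chain rule for y_hat = (x @ w1) @ w2
    # dL/dw2 = (y_hat - y) * h, with h = x @ w1
    w2_partials = (y_hat - y) * (x @ w1)
    # dL/dw1[i, j] = (y_hat - y) * x[i] * w2[j]
    w1_partials = (y_hat - y) * np.outer(x, w2)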

    w1 = w1 - eta * w1_partials
    w2 = w2 - eta * w2_partials

plt.plot(loss)

# check how well we did
print("Final prediction:", x @ w1 @ w2)
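A finite-difference check is a handy way to confirm backpropagation formulas before refactoring. The sketch below reuses the toy values from the loop above and compares chain-rule gradients for $\frac{1}{2}(\hat{y} - y)^2$ against numerical estimates. The helper names `loss_at` and `numeric_grad` are hypothetical and not part of the lecture code.

```python
import numpy as np

# Same toy values as the training loop above
x = np.array([2.0, 3.0])
y = 1.0
w1 = np.array([[-0.78, 0.13], [0.85, 0.23]])
w2 = np.array([1.8, 0.40])

def loss_at(w1, w2):
    """Squared-error loss of the linear network for the single sample (x, y)."""
    return 0.5 * (x @ w1 @ w2 - y) ** 2

def numeric_grad(f, w, eps=1e-6):
    """Central finite differences, perturbing one entry of w at a time."""
    grad = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        step = np.zeros_like(w)
        step[idx] = eps
        grad[idx] = (f(w + step) - f(w - step)) / (2 * eps)
    return grad

# Analytic gradients from the chain rule, with h = x @ w1
delta = x @ w1 @ w2 - y
w2_analytic = delta * (x @ w1)          # dL/dw2 = delta * h
w1_analytic = delta * np.outer(x, w2)   # dL/dw1[i, j] = delta * x[i] * w2[j]

w1_numeric = numeric_grad(lambda w: loss_at(w, w2), w1)
w2_numeric = numeric_grad(lambda w: loss_at(w1, w), w2)

print("w1 gradients agree:", np.allclose(w1_analytic, w1_numeric, atol=1e-6))
print("w2 gradients agree:", np.allclose(w2_analytic, w2_numeric, atol=1e-6))
```

The same check can be pointed at a `backward` function once it exists, which makes it easy to catch a transposed or mismatched outer product.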

Define a set of functions to perform the actions of the MLP:

- `forward`: forward pass through the network
- `loss`: calculate the loss
- `backward`: backward pass (backpropagation) through the network, calculating the gradients
- `train`: update the weights of the network

In [None]:
def forward(x, w1, w2):
    pass

In [None]:
def loss(y, y_hat):
    pass

In [None]:
def backward():
    pass

In [None]:
def train():
    pass

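Finally, string the various functions together in a new training loop. The results should be the same as in the original, but now we have a more modular implementation. Note that `loss` now names a function, so store the per-iteration loss values under a different name (for example `losses`).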
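One possible way to fill in the stubs and assemble the loop is sketched below. It is not the only valid decomposition. The argument lists of `backward` and `train`, the tuple returned by `forward`, and the array name `losses` are assumptions made for this sketch. The gradient for `w1` is the chain-rule expression `(y_hat - y) * np.outer(x, w2)`.

```python
import numpy as np

def forward(x, w1, w2):
    """Forward pass: return the hidden values and the prediction."""
    h = x @ w1
    return h, h @ w2

def loss(y, y_hat):
    """Squared-error loss for a single sample."""
    return 0.5 * (y_hat - y) ** 2

def backward(x, y, h, y_hat, w2):
    """Chain-rule gradients of the loss with respect to w1 and w2."""
    delta = y_hat - y
    w2_partials = delta * h
    w1_partials = delta * np.outer(x, w2)
    return w1_partials, w2_partials

def train(w1, w2, w1_partials, w2_partials, eta):
    """One gradient-descent step; returns the updated weights."""
    return w1 - eta * w1_partials, w2 - eta * w2_partials

# Same toy setup as the original loop
x = np.array([2, 3])
y = 1
w1 = np.array([[-0.78, 0.13], [0.85, 0.23]])
w2 = np.array([1.8, 0.40])

iterations = 20
eta = 0.01
losses = np.zeros(iterations)  # not named `loss`, which is now a function

for i in range(iterations):
    h, y_hat = forward(x, w1, w2)
    losses[i] = loss(y, y_hat)
    w1_partials, w2_partials = backward(x, y, h, y_hat, w2)
    w1, w2 = train(w1, w2, w1_partials, w2_partials, eta)

print("Final prediction:", forward(x, w1, w2)[1])
```

Returning the hidden values from `forward` avoids recomputing `x @ w1` in `backward`, and it gives a natural place to insert an activation function later.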

## Part 2: Visualizing the hidden space
As we saw in the lecture, no single straight line in the scatter plot of $x_1$ and $x_2$ separates the two classes in the XOR case. However, the output neuron still only applies a linear transformation followed by a threshold, so if the network solves XOR, the classes must be separable by a line in the hidden space.

In [None]:
# assumes binary classification
def plot_data(x, y):
    y_false = y.flatten() == 0
    y_true = y.flatten() == 1
    plt.scatter(x[y_false, 0], x[y_false, 1], color="blue", marker="o", label="Output: 0")
    plt.scatter(x[y_true, 0], x[y_true, 1], color="red", marker="x", label="Output: 1")
    plt.legend()
    plt.xlabel("Input 1")
    plt.ylabel("Input 2")

In [None]:
# XOR gate
x = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([0,1,1,0])
plot_data(x, y)

In [None]:
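# MLP with hard-coded weights to solve XOR
# prepend a column of ones so the first weight of each unit acts as its bias
X = np.hstack((np.ones((x.shape[0],1)),x))

# hidden layer weights: one column per hidden unit, rows are (bias, x1, x2)
w1 = np.array([[0, 1, 1], [-1, 1, 1]]).T

# hidden activations via a step function, then prepend the bias column again
H = (X @ w1 > 0).astype(int)
H = np.hstack((np.ones((H.shape[0],1)),H))

# output weights (bias, h1, h2); predictions are kept separate from the labels in y
w2 = np.array([0, 1, -1])
y_hat = (H @ w2 > 0).astype(int)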

In the cell below, write the code to plot the hidden space. You should see that it's possible to draw a line through the hidden space that separates the two classes. This is a single line of code, but it does require understanding what each variable represents.
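Before plotting, it can help to look at the hidden coordinates numerically. The sketch below is not the plotting solution. It recomputes the hidden activations with the hard-coded weights from the cell above and checks that thresholding $h_1 - h_2$ at zero (the rule encoded by `w2 = [0, 1, -1]`) reproduces the XOR labels. The variable name `separable` is an assumption made for this sketch.

```python
import numpy as np

# XOR inputs and labels
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# Hidden activations, without the extra bias column
X = np.hstack((np.ones((x.shape[0], 1)), x))
w1 = np.array([[0, 1, 1], [-1, 1, 1]]).T
H = (X @ w1 > 0).astype(int)

# Each input pair and the hidden point it is mapped to
for inputs, hidden, label in zip(x, H, y):
    print(inputs, "->", hidden, "label:", label)

# The output unit fires when h1 - h2 > 0
separable = np.array_equal((H[:, 0] - H[:, 1] > 0).astype(int), y)
print("Linear rule in hidden space matches XOR:", separable)
```

Both class-1 inputs land on the same hidden point, so the hidden space contains only three distinct points, which is why a single line can now split the classes.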

## Bonus exercise
Try extending your refactored MLP to handle a whole design matrix instead of a single sample. This mostly comes down to replacing per-sample products with matrix products and keeping track of the array shapes.
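As a starting point, here is a minimal sketch of a batched forward and backward pass. It assumes a design matrix `X` of shape `(n, 2)`, a target vector `y` of shape `(n,)`, and a loss averaged over samples. The function names with the `_batch` suffix and the small data set are hypothetical.

```python
import numpy as np

def forward_batch(X, w1, w2):
    """Forward pass for all samples at once."""
    H = X @ w1            # hidden values, shape (n, 2)
    return H, H @ w2      # predictions, shape (n,)

def loss_batch(y, y_hat):
    """Mean squared-error loss over the batch."""
    return 0.5 * np.mean((y_hat - y) ** 2)

def backward_batch(X, y, H, y_hat, w2):
    """Gradients of the mean loss with respect to w1 and w2."""
    delta = (y_hat - y) / X.shape[0]           # per-sample error, shape (n,)
    w2_partials = H.T @ delta                  # sums delta_k * H[k] over samples
    w1_partials = np.outer(X.T @ delta, w2)    # sums delta_k * outer(X[k], w2)
    return w1_partials, w2_partials

# Hypothetical data set with three samples
X = np.array([[2.0, 3.0], [1.0, -1.0], [0.5, 2.0]])
y = np.array([1.0, 0.0, 0.5])
w1 = np.array([[-0.78, 0.13], [0.85, 0.23]])
w2 = np.array([1.8, 0.40])

H, y_hat = forward_batch(X, w1, w2)
w1_partials, w2_partials = backward_batch(X, y, H, y_hat, w2)
print("loss:", loss_batch(y, y_hat))
print("gradient shapes:", w1_partials.shape, w2_partials.shape)
```

With a single row in `X`, these expressions reduce to the single-sample gradients, which gives a quick sanity check for your own implementation.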