In [1]:
import numpy as np
import pandas as pd

# Lab 4 - Multi-layer Perceptron Forward Pass

## Part I
For this exercise you will implement a simple 2-layer perceptron (forward pass)

For the first part you'll write a function that computes the forward pass of a 2-layer perecptron that predicts the prices of houses, using the usual Boston housing dataset.

In [2]:
boston = pd.read_csv('./BostonHousing.txt')

As usual, consider the MEDV as your target variable. 
* Split the data into training, validation and testing (70,15,15)% (you will need this for the next lab as we will build from this lab)

In [3]:
# your code goes here
from sklearn.model_selection import train_test_split
# The target variable is the last column of our dataset
X = boston.values[:,:-1]
y = boston.values[:,1].reshape(-1, 1)
# Now lets split the 3 results
X_train, X_tv, y_train, y_tv = train_test_split(X, y, test_size=0.3, random_state=0)
X_test, X_val, y_test, y_val = train_test_split(X_tv, y_tv, test_size=0.5, random_state=0)

Now you will write the function that computes the forward pass. 
* I provide here a structure that you can follow for your function, but again, feel free to modify it as you see fit.
* Use the sigmoid function as the activation of the hidden layer.
* Don't forget about the biases!
* *It is up to you to think what should be the activation for the output layer.*

In [4]:
def sigmoid_activation(z:np.ndarray) -> np.ndarray:
    # your code goes here
    return 1 / (1 + np.exp(-z))

In [5]:
def two_layer_perceptron(X:np.ndarray, activation, dim_input:int, dim_hidden:int, dim_output:int):
    """
    Implements the forward pass of a two-layer fully connected perceptron.
    
    Parameters
    ----------
    X : a 2-dimensional array
        the input data
    activation : function
        the activation function to be used for the hidden layer
    dim_input : int
        the dimensionality of the input layer
    dim_hidden : int
        the dimensionality of the hidden layer
    dim_output : int
        the dimensionality of the output layer
    Returns
    -------
    y_pred : float
        the output of the computation of the forward pass of the network
    """
    # your code goes here
    # Initializing weights and biases with random values
    # First we initialize the input to hidden layers of weights and biases
    W_input = np.random.normal(size=(dim_input, dim_hidden))
    b_input = np.random.normal(size=(dim_hidden, 1))
    # Then the hidden to output layers of weights and biases
    W_output = np.random.normal(size=(dim_hidden, dim_output))
    b_output = np.random.normal(size=(dim_output, 1))
    
    # So we calculate the hidden layer and apply the activation function
    hidden_layer_i = X @ W_input + b_input.T
    hidden_layer_o = activation(hidden_layer_i)

    # Finally we calculate the output layer and apply another activation function
    output_layer = hidden_layer_o @ W_output + b_output.T
    # The chossen funciont was power to 2 cause in our predictions all the values are
    # Above 0 and below 100, then, to get a better aproximation we can estimate our
    # Values to the power of 2
    y_pred = np.power(output_layer, 2)

    return y_pred

Calculate the RMSE of the forward pass. 

In [6]:
# your code goes here
def rmse(y_pred:np.ndarray, y_real:np.ndarray) -> float:
    return np.sqrt(np.mean(np.power(y_pred - y_real, 2)))

# Let's set our hidden layer dim and calculate our predictions to every splited
hidden_layer_dim = 7
y_train_pred = two_layer_perceptron(X_train, sigmoid_activation, X.shape[1], hidden_layer_dim, 1)
train_rmse = rmse(y_train_pred, y_train)

y_test_pred = two_layer_perceptron(X_test, sigmoid_activation, X.shape[1], hidden_layer_dim, 1)
test_rmse = rmse(y_test_pred, y_test)

y_val_pred = two_layer_perceptron(X_val, sigmoid_activation, X.shape[1], hidden_layer_dim, 1)
val_rmse =rmse(y_val_pred, y_val)
# After we get our rmse let's show there
print("RMSE de treino:", train_rmse)
print("RMSE de teste:", test_rmse)
print("RMSE de validação:", val_rmse)

RMSE de treino: 24.43998089699049
RMSE de teste: 21.415008469477197
RMSE de validação: 25.951034705141204


  return 1 / (1 + np.exp(-z))


## Part II 

For this exercise you will write a function that calculates the forward pass of a 2-layer perceptron that predicts the exact digit from a hand-written image, using the MNIST dataset. 

In [7]:
from sklearn.datasets import load_digits

In [8]:
digits = load_digits()

In [9]:
X = digits.data
y = digits.target

In [10]:
X.shape

(1797, 64)

Again, you will split the data into training, validation and testing.

In [11]:
# your code goes here:
# The same as the previsous training, validation and testing split
X_train, X_tv, y_train, y_tv = train_test_split(X, y, test_size=0.3, random_state=0)
X_test, X_val, y_test, y_val = train_test_split(X_tv, y_tv, test_size=0.5, random_state=0)

Write a function that calculates the forward pass for this multi-class classification problem.
* You will use the sigmoid activation function for the hidden layer.
* For the output layer you will have to write the softmax activation function (you can check the slides)
* __Note:__ you can easily re-use the function that you coded for Part I if you do a simple modification and also include an input argument for the activation of the output layer.

In [12]:
def softmax_activation(z:np.ndarray) -> np.ndarray:
    # your code goes here:
    exp_z = np.exp(z)
    total = np.sum(exp_z)
    return exp_z/total

In [13]:
# your code goes here: 
def two_layer_perceptron(X:np.ndarray, activation_1, activation_2, dim_input:int, dim_hidden:int, dim_output:int):
    """
    Implements the forward pass of a two-layer fully connected perceptron.
    
    Parameters
    ----------
    X : a 2-dimensional array
        the input data
    activation_1 : function
        the activation function to be used for the hidden layer
    activation_2 : function
        the activation function to be used for the output layer
    dim_input : int
        the dimensionality of the input layer
    dim_hidden : int
        the dimensionality of the hidden layer
    dim_output : int
        the dimensionality of the output layer
    Returns
    -------
    y_pred : float
        the output of the computation of the forward pass of the network
    """
    # your code goes here
    # your code goes here
    # Initializing weights and biases with random values
    # First we initialize the input to hidden layers of weights and biases
    W_input = np.random.normal(size=(dim_input, dim_hidden))
    b_input = np.random.normal(size=(dim_hidden, 1))
    # Then the hidden to output layers of weights and biases
    W_output = np.random.normal(size=(dim_hidden, dim_output))
    b_output = np.random.normal(size=(dim_output, 1))
    
    # So we calculate the hidden layer and apply the first activation function
    hidden_layer_i = X @ W_input + b_input.T
    hidden_layer_o = activation_1(hidden_layer_i)

    # Finally we calculate the output layer and the second activation function
    output_layer = hidden_layer_o @ W_output + b_output.T
    y_pred = activation_2(output_layer)

    return y_pred

Lastly, calculate the error of this forward pass using the cross-entropy loss.

In [14]:
# your code goes here:
# Then we do the same as in the previsous predictions rmse
hidden_layer_dim = 15
y_train_pred = two_layer_perceptron(X_train, sigmoid_activation, softmax_activation, X.shape[1], hidden_layer_dim, 1)
train_rmse = rmse(y_train_pred, y_train)

y_test_pred = two_layer_perceptron(X_test, sigmoid_activation, softmax_activation, X.shape[1], hidden_layer_dim, 1)
test_rmse = rmse(y_test_pred, y_test)

y_val_pred = two_layer_perceptron(X_val, sigmoid_activation, softmax_activation, X.shape[1], hidden_layer_dim, 1)
val_rmse =rmse(y_val_pred, y_val)

print("RMSE de treino:", train_rmse)
print("RMSE de teste:", test_rmse)
print("RMSE de validação:", val_rmse)

RMSE de treino: 5.258337423234244
RMSE de teste: 5.531233309643062
RMSE de validação: 5.423468698988252
