

# Lab 2: Neural networks

In this lab we will build dense neural networks on the MNIST dataset.

We use scikit-learn's `MLPClassifier`; its documentation is at `https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html`

## Load the data and create train-test splits

In [None]:
# Auto-setup when running on Google Colab
if 'google.colab' in str(get_ipython()):
    !pip install openml

# Global imports and settings
%matplotlib inline
import numpy as np
import pandas as pd
import openml as oml
import os
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPClassifier

In [None]:
# Download MNIST data. Takes a while the first time.
mnist = oml.datasets.get_dataset(554)
X, y, _, _ = mnist.get_data(target=mnist.default_target_attribute, dataset_format='array');
X = X.reshape(70000, 28, 28)

# Take some random examples
from random import randint
fig, axes = plt.subplots(1, 5,  figsize=(10, 5))
for i in range(5):
    n = randint(0, len(X) - 1)  # randint includes both endpoints
    axes[i].imshow(X[n], cmap=plt.cm.gray_r)
    axes[i].set_xticks([])
    axes[i].set_yticks([])
    axes[i].set_xlabel("{}".format(y[n]))
plt.show();

In [None]:
# Note: MNIST has a predefined 60000-10000 train-test split, but here we draw a
# smaller random subsample instead: 6000 training and 1000 test images.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=6000, random_state=0, test_size = 1000 )


## Exercise 1: Preprocessing
* Normalize the data: map each feature value from its current representation (an integer between 0 and 255) to a floating-point value between 0 and 1.0.
* Store the floating-point values in `X_train_normalized` and `X_test_normalized`.
* Map the class label to a one-hot-encoded value. Store in `y_train_encoded` and `y_test_encoded`.

In [None]:
from sklearn.preprocessing import OneHotEncoder

# X_train, X_test, y_train, and y_test come from the train-test split above

# Normalize features
X_train_normalized = X_train / 255.0
X_test_normalized = X_test / 255.0

# Perform one-hot encoding for class labels
# (scikit-learn >= 1.2 uses `sparse_output`; older versions use `sparse`)
encoder = OneHotEncoder(categories='auto', sparse_output=False)

# Reshape the labels to be column vectors
y_train_reshaped = y_train.reshape(-1, 1)
y_test_reshaped = y_test.reshape(-1, 1)

# Fit and transform the training labels
y_train_encoded = encoder.fit_transform(y_train_reshaped)

# Transform the testing labels
y_test_encoded = encoder.transform(y_test_reshaped)
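
As a quick sanity check of the two preprocessing steps, the following minimal sketch applies them to a tiny made-up array. The helper name `one_hot` and the toy values are illustrative assumptions, not part of the lab, and the sketch uses plain NumPy rather than `OneHotEncoder`.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Return an (n_samples, n_classes) one-hot matrix for integer labels."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(n_classes)[labels]

# Toy "images": pixel intensities are integers in [0, 255]
pixels = np.array([[0, 128, 255], [64, 32, 16]], dtype=np.uint8)
pixels_normalized = pixels / 255.0  # floats in [0.0, 1.0]

# Toy labels for a 10-class problem
toy_labels = np.array([3, 0])
toy_encoded = one_hot(toy_labels, n_classes=10)

print(pixels_normalized.min(), pixels_normalized.max())
print(toy_encoded)
```

Note that `MLPClassifier` also accepts integer class labels directly, so the one-hot targets are optional for the later exercises.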


## Exercise 2: Create an MLPClassifier model

Implement a `create_model` function which defines the topology of the deep neural net, specifying the following:

* The number of layers in the deep neural net: Use 2 dense layers for now.
* The number of nodes in each layer: these are parameters of your function.
* Any regularization layers.
* The optimizer and learning rate. Make the learning rate a parameter of your function as well.

Consider:
* What should be the shape of the input layer?
* Which activation function will you need for the last layer, since this is a 10-class classification problem?
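
One way to explore these two questions is to fit a small `MLPClassifier` and inspect what it inferred from the data. The sketch below assumes scikit-learn's bundled 8x8 `load_digits` dataset as a lightweight stand-in for MNIST, so the input has 64 features rather than 784.

```python
import warnings
from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Stand-in data (an assumption for speed): 1797 images of 8x8 pixels, 10 classes
X_digits, y_digits = load_digits(return_X_y=True)

clf = MLPClassifier(hidden_layer_sizes=(32, 10), max_iter=20, random_state=0)
with warnings.catch_warnings():
    # 20 iterations may not converge; we only inspect the architecture here
    warnings.simplefilter("ignore", ConvergenceWarning)
    clf.fit(X_digits / 16.0, y_digits)

# The input size is inferred from the data: one weight row per input feature
print("input -> hidden 1 weights:", clf.coefs_[0].shape)
# For multiclass targets, scikit-learn chooses the output activation itself
print("output activation:", clf.out_activation_)
print("number of outputs:", clf.n_outputs_)
```

The `activation` argument of `MLPClassifier` only controls the hidden layers; the output activation is not a constructor parameter.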

In [None]:
from sklearn.neural_network import MLPClassifier

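def create_model(hidden_layer_sizes=(32, 10), activation='relu', learning_rate_init=0.003):
    """Return an untrained MLPClassifier with the given hidden layers and learning rate."""
    # The input and output layer sizes are inferred from the data when fit() is called
    model = MLPClassifier(hidden_layer_sizes=hidden_layer_sizes,
                          activation=activation,
                          learning_rate_init=learning_rate_init)

    return model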

# Example usage:
hidden_layer_sizes = (32, 10)  # Number of nodes in each hidden layer
activation = 'relu'  # Activation function for the hidden layers
learning_rate_init = 0.003  # Learning rate for the optimizer
model = create_model(hidden_layer_sizes, activation, learning_rate_init)


In [None]:
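### Create a 'deep' neural net with two hidden layers
def create_model(layer_1=32, layer_2=10, learning_rate=0.003, activation='relu'):
    return MLPClassifier(hidden_layer_sizes=(layer_1, layer_2),
                         activation=activation,
                         learning_rate_init=learning_rate)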

## Exercise 3: Create a training function
Implement a `train_model` function which trains the model on the given data and returns the fitted model.

In [None]:
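def train_model(model, X, y):
    """
    model: the model to train
    X, y: the training data and labels

    Returns the fitted model.
    """
    trained_model = model.fit(X, y)
    return trained_model

# MLPClassifier expects 2D input, so flatten each 28x28 image to 784 features
trained_model = train_model(model, X_train.reshape(len(X_train), -1), y_train)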
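
One way to check that training is making progress is to inspect the loss recorded after each epoch. The sketch below is a hedged illustration on the bundled `load_digits` data (an assumption chosen for speed, not the lab's MNIST arrays), and `train_and_report` is a hypothetical helper name.

```python
import warnings
from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

def train_and_report(model, X, y):
    """Fit the model on flattened inputs and return it with its loss history."""
    X_flat = X.reshape(len(X), -1)  # MLPClassifier expects 2D input
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ConvergenceWarning)
        model.fit(X_flat, y)
    # loss_curve_ holds one training-loss value per epoch (adam and sgd solvers)
    return model, model.loss_curve_

digits = load_digits()
images, targets = digits.images / 16.0, digits.target  # images: (1797, 8, 8)

demo_model = MLPClassifier(hidden_layer_sizes=(32, 10), max_iter=30, random_state=0)
demo_model, losses = train_and_report(demo_model, images, targets)
print("epochs run:", len(losses))
print("first / last loss:", losses[0], losses[-1])
```

A loss that is still falling at the last epoch suggests that a larger `max_iter` may help.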

## Exercise 4: Evaluate the model

Train the model with a learning rate of 0.003 and evaluate it on the test set.


In [None]:
from sklearn.neural_network import MLPClassifier

# Define the learning rate
learning_rate = 0.003

# Create an instance of MLPClassifier with the specified learning rate
model = MLPClassifier(learning_rate_init=learning_rate, max_iter=1000)
# Flatten each 28x28 training image into a 784-dimensional vector
X_train = X_train.reshape(X_train.shape[0], -1)


# Train the model on the training data
model.fit(X_train, y_train)

# Flatten the test images in the same way
X_test = X_test.reshape(X_test.shape[0], -1)

# Evaluate the trained model on the test set
accuracy = model.score(X_test, y_test)

# Print the test set accuracy
print("Test set accuracy:", accuracy)
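
Overall accuracy can hide which digits are confused with each other. As a hedged sketch, the following computes a confusion matrix and per-class accuracy, assuming the bundled `load_digits` data as a small stand-in for MNIST.

```python
import warnings
import numpy as np
from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in data (an assumption for speed): 8x8 digit images with values 0..16
X_d, y_d = load_digits(return_X_y=True)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_d / 16.0, y_d, test_size=0.25, stratify=y_d, random_state=0)

eval_model = MLPClassifier(learning_rate_init=0.003, max_iter=200, random_state=0)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    eval_model.fit(Xd_train, yd_train)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(yd_test, eval_model.predict(Xd_test))
per_class_accuracy = cm.diagonal() / cm.sum(axis=1)

print("overall accuracy:", eval_model.score(Xd_test, yd_test))
for digit, acc in enumerate(per_class_accuracy):
    print(digit, round(float(acc), 3))
```

Large off-diagonal entries in `cm` point to the pairs of digits the model mixes up most often.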


## Exercise 5: Optimize the model

Try to optimize the model, either manually or with a tuning method. At least optimize the following:
* the number of hidden layers
* the number of nodes in each layer


Try to reach at least 96% accuracy against the test set.

In [None]:
from sklearn.model_selection import GridSearchCV
# X and y are the full MNIST arrays loaded at the top of the notebook

# Re-split the full dataset 80/20 (56000 training and 14000 test images)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# No further preprocessing is applied here (pixel values stay in the 0-255 range)

# Define the parameter grid for hyperparameter tuning
param_grid = {
    'hidden_layer_sizes': [(30,), (60,), (30, 30), (60, 60)],
    'activation': ['relu', 'tanh'],
    'alpha': [0.0001, 0.001, 0.01],
    'learning_rate_init': [0.001, 0.003, 0.01]
}

# Create an instance of MLPClassifier
mlp = MLPClassifier(max_iter=100)

# Create GridSearchCV object
grid_search = GridSearchCV(mlp, param_grid, cv=3, scoring='accuracy', n_jobs=-1)

# Fit the model
X_train = X_train.reshape(X_train.shape[0], -1)
grid_search.fit(X_train, y_train)

# Get the best parameters
best_params = grid_search.best_params_
print("Best parameters:", best_params)

# Get the best model
best_model = grid_search.best_estimator_

# Evaluate the best model on the test set
X_test = X_test.reshape(X_test.shape[0], -1)
test_accuracy = best_model.score(X_test, y_test)
print("Test set accuracy:", test_accuracy)
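
The grid above contains 4 x 2 x 3 x 3 = 72 combinations, so 3-fold cross-validation trains 216 models before the final refit. A randomized search that samples only a few combinations is one cheaper alternative. The sketch below is a minimal illustration under assumptions: it uses the bundled `load_digits` data and `n_iter=4` purely to keep the run short, and the names `search_space` and `random_search` are illustrative.

```python
import warnings
from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import ParameterGrid, RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

search_space = {
    'hidden_layer_sizes': [(30,), (60,), (30, 30), (60, 60)],
    'activation': ['relu', 'tanh'],
    'alpha': [0.0001, 0.001, 0.01],
    'learning_rate_init': [0.001, 0.003, 0.01],
}
# Count how many models an exhaustive grid search would have to train
n_combinations = len(ParameterGrid(search_space))
print("grid combinations:", n_combinations, "-> fits with cv=3:", 3 * n_combinations)

X_s, y_s = load_digits(return_X_y=True)
random_search = RandomizedSearchCV(
    MLPClassifier(max_iter=100, random_state=0),
    param_distributions=search_space,
    n_iter=4,            # sample only 4 of the combinations
    cv=3,
    scoring='accuracy',
    random_state=0,
)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    random_search.fit(X_s / 16.0, y_s)

print("best parameters:", random_search.best_params_)
print("best CV accuracy:", random_search.best_score_)
```

Because only a sample of the grid is evaluated, the best combination found this way is not guaranteed to match the result of the exhaustive search.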
