######  The University of Melbourne, School of Computing and Information Systems
# COMP30027 Machine Learning, 2021 Semester 1

## Week 11 - Practical Workshop

### NOTE: You will need `scikit-learn` version 0.18.1 or newer for its neural network support.


### Exercise 1.
The Multilayer Perceptron is available from (newer builds of) `scikit-learn` as `sklearn.neural_network.MLPClassifier`.


In [3]:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from collections import Counter

### Exercise 1.(a) 
Build a default Multilayer Perceptron to classify the `Iris` data. Evaluate its cross-validation accuracy.

In [None]:
iris = datasets.load_iris()
X = iris.data
y = iris.target
print('X:', X.shape, 'y:', set(y))


clf = MLPClassifier(max_iter=...)

print('cross-val acc:', np.mean(cross_val_score(...)))
clf.fit(X, y)
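
As a point of comparison for the cell above, here is a minimal sketch of one way to cross-validate a Multilayer Perceptron on `Iris`. The values `max_iter=2000`, `cv=5` and `random_state=0` are illustrative assumptions, not settings prescribed by this worksheet.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Assumed settings: a generous iteration budget and a fixed seed for repeatability.
clf = MLPClassifier(max_iter=2000, random_state=0)

# With an integer cv and a classifier, cross_val_score uses stratified folds.
scores = cross_val_score(clf, X, y, cv=5)
print('fold accuracies:', scores)
print('cross-val acc:', np.mean(scores))
```

Printing the per-fold accuracies alongside the mean shows how much the estimate varies from fold to fold.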


### Exercise 1.(b) 
Check the `coefs_` and `n_layers_` attributes of the fitted classifier to examine the resulting neural network.

In [None]:
print(clf.coefs_)
print('parameter shapes:',...)
print('num layers:', clf.n_layers_)
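
The following sketch shows one way to read the shapes of the fitted parameters. It assumes the default `hidden_layer_sizes=(100,)`; `random_state=0` and `max_iter=2000` are illustrative choices. Each weight matrix in `coefs_` has shape (units in previous layer, units in next layer), and `intercepts_` holds the matching bias vectors.

```python
from sklearn import datasets
from sklearn.neural_network import MLPClassifier

iris = datasets.load_iris()
X, y = iris.data, iris.target  # 4 attributes, 3 classes

clf = MLPClassifier(max_iter=2000, random_state=0).fit(X, y)

# One weight matrix per pair of adjacent layers.
weight_shapes = [w.shape for w in clf.coefs_]
# One bias vector per non-input layer.
bias_shapes = [b.shape for b in clf.intercepts_]

print('weight shapes:', weight_shapes)  # [(4, 100), (100, 3)]
print('bias shapes:', bias_shapes)      # [(100,), (3,)]
print('num layers:', clf.n_layers_)     # 3 (input, one hidden, output)
```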

### Exercise 2.
One important issue with this Multilayer Perceptron is that it is sensitive to the scale of the input attribute values.
### Exercise 2.(a) 
Read up on the `StandardScaler`, and re-scale the `Iris` data so that each attribute has a *mean* of 0 and a *variance* of 1. Evaluate and examine the resulting neural network built on the re-scaled data.


In [None]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

clf = MLPClassifier(max_iter=2000)


print('Cross-val standardised features acc:', np.mean(...))
# This is cheating slightly: the mean and variance are estimated from both the training and test data.
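
A quick sanity check can confirm what the scaler does. This sketch, which is not part of the exercise itself, fits a `StandardScaler` on the whole `Iris` data and verifies that every attribute ends up with mean 0 and variance 1.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data

# fit_transform learns the per-attribute mean and standard deviation, then applies them.
X_scaled = StandardScaler().fit_transform(X)

col_means = X_scaled.mean(axis=0)
col_vars = X_scaled.var(axis=0)  # population variance, matching what StandardScaler uses

print('means close to 0:', np.allclose(col_means, 0.0))
print('variances close to 1:', np.allclose(col_vars, 1.0))
```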

### Exercise 2.(c) 
*(Harder)* Calculating the _mean_ and _variance_ on the entire data set (before splitting into train/test sets) is cheating slightly. Write a re-scale function that calculates the scaling factors for the training data, and applies the scaler to the test data. Then, write a wrapper function that uses this to cross-validate.

In [None]:
clf = MLPClassifier(max_iter=2000)

pipeline = Pipeline([('transformer',...), ('estimator', ...)])
# This way we don't cheat: the scaler is fitted on the training folds only.
# Read more on pipelines: https://scikit-learn.org/stable/modules/compose.html

print('cross-val non-cheating standardised features acc:', np.mean(...))
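
The exercise text asks for a hand-written re-scale function and a cross-validation wrapper, whereas the cell above uses a `Pipeline`. The sketch below shows what the manual version might look like. The names `rescale` and `cross_validate_scaled` are hypothetical helpers introduced here, and the fold count and seeds are assumptions.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier


def rescale(X_train, X_test):
    """Standardise both sets using the mean and std of the training set only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant attributes
    return (X_train - mean) / std, (X_test - mean) / std


def cross_validate_scaled(make_clf, X, y, n_splits=5, seed=0):
    """Return per-fold accuracy, re-estimating the scaling factors in each fold."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in folds.split(X, y):
        X_tr, X_te = rescale(X[train_idx], X[test_idx])
        clf = make_clf()  # fresh, unfitted classifier for every fold
        clf.fit(X_tr, y[train_idx])
        scores.append(clf.score(X_te, y[test_idx]))
    return np.array(scores)


iris = datasets.load_iris()
scores = cross_validate_scaled(
    lambda: MLPClassifier(max_iter=2000, random_state=0), iris.data, iris.target
)
print('cross-val non-cheating acc:', scores.mean())
```

Passing a factory (`make_clf`) rather than a fitted object keeps the folds independent of one another.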

### Exercise 3.
You can coerce the Multilayer Perceptron to have hidden layers of specific sizes using the `hidden_layer_sizes` parameter.
### Exercise 3.(a) 
Train a Multilayer Perceptron on the two-class `Abalone` data, and examine the resulting neural network.


In [None]:
def convert_class(raw, num_class=2):
    raw = int(raw)
    if num_class == 2:
        if raw<=10: return 0
        else: return 1
    elif num_class == 3:
        if raw <= 8:
            return 0
        elif 9<=raw<=10:
            return 1
        elif 11<=raw:
            return 2
    elif num_class == 29:
        return raw

def load_abalone(addsex=False, num_class=2):
    X, y = [], []
    with open('abalone.data', 'r') as fin:
        for line in fin:
            line = line.strip()
            if not line:
                continue  # skip blank lines, e.g. at the end of the file
            atts = line.split(",")
            if not addsex:
                X.append(atts[1:-1])
            else:
                sex = atts[0]
                if sex == "M": sex = 0
                elif sex=="I": sex = 1
                elif sex=="F": sex = 2
                else: sex = 3
                
                X.append([sex] + atts[1:-1])
            y.append(convert_class(atts[-1], num_class))
    X = np.array(X, dtype=float)
    return X, y

X, y = load_abalone(addsex=False, num_class=2)
print('X:', X.shape, 'y:', set(y))

clf = ...
clf.fit(X,y)
print(clf.coefs_)

### Exercise 3.(b) 
*(Harder)* Change the size and/or number of hidden layers. How are the resulting weights affected? Can you discern any relationship between the weights for layers of varying sizes?

In [None]:
clf = MLPClassifier(hidden_layer_sizes=..., max_iter=2000)
clf.fit(X, y)
print(clf.coefs_)
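
To see how `hidden_layer_sizes` determines the weight matrices without needing `abalone.data`, the sketch below uses a synthetic two-class data set with 7 numeric attributes as a stand-in. The data, the layer sizes `(10, 5)` and the seeds are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in: 300 instances, 7 attributes, 2 classes.
X_demo, y_demo = make_classification(n_samples=300, n_features=7, random_state=0)

# Two hidden layers with 10 and 5 units respectively.
clf_demo = MLPClassifier(hidden_layer_sizes=(10, 5), max_iter=500, random_state=0)
clf_demo.fit(X_demo, y_demo)

demo_shapes = [w.shape for w in clf_demo.coefs_]
print('weight shapes:', demo_shapes)      # [(7, 10), (10, 5), (5, 1)]
print('num layers:', clf_demo.n_layers_)  # 4
```

Note that for a two-class problem the output layer has a single unit, so the last weight matrix has only one column.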

### Exercise 4. 
The `MLPClassifier` has several tunable parameters, most of which deal with weight optimisation. However, it is often worthwhile to tune the **regularisation parameter (α)**.
### Exercise 4.(a) 
Try varying orders of α between $10$ and $10^{-5}$ for a Multilayer Perceptron built on the two-class `Abalone` data. How much variance in cross-validation accuracy do you observe?


In [None]:
alphas = [np.power(10.0, i) for i in range(-7, 2)]
print(alphas)

for alpha in alphas:
    clf = MLPClassifier(max_iter=2000, alpha=...)
    pipeline = Pipeline([('transformer', scaler), ('estimator', clf)])
    scores = cross_val_score(...)
    print('alpha: {} mean_acc: {} standard_dev_acc: {}'.format(alpha, np.mean(scores), np.std(scores)))
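
As a self-contained illustration of the sweep, this sketch generates one α per order of magnitude with `np.logspace` and cross-validates a scaled pipeline for each. It runs on synthetic data rather than `Abalone`, and the small hidden layer, iteration budget and fold count are assumptions chosen to keep it quick.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the two-class data (7 numeric attributes).
X_demo, y_demo = make_classification(n_samples=200, n_features=7, random_state=0)

# Seven values, one per order of magnitude from 1e-5 up to 10.
alphas = np.logspace(-5, 1, num=7)

results = {}
for alpha in alphas:
    pipeline = Pipeline([
        ('transformer', StandardScaler()),
        ('estimator', MLPClassifier(hidden_layer_sizes=(20,), alpha=alpha,
                                    max_iter=500, random_state=0)),
    ])
    scores = cross_val_score(pipeline, X_demo, y_demo, cv=3)
    results[alpha] = (scores.mean(), scores.std())
    print('alpha: {:g} mean_acc: {:.3f} std_acc: {:.3f}'.format(
        alpha, scores.mean(), scores.std()))
```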

### Exercise 4.(b) 
Read up on the `GridSearchCV` utility, to help you in tuning the performance of the *Multilayer Perceptron*. Split the data into a training-and-tuning partition and a test partition. What is the value of the regularisation parameter that `GridSearchCV` comes up with? How does the test accuracy compare to the default (un-tuned) `MLPClassifier`?

In [None]:
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split

X_train, X_devtest, y_train, y_devtest = train_test_split(X, y, test_size=0.4, random_state=42)
X_dev, X_test, y_dev, y_test = train_test_split(X_devtest, y_devtest, test_size=0.5, random_state=42)

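# Re-create a default (un-tuned) classifier; otherwise `clf` is left over from the alpha loop above.
clf = MLPClassifier(max_iter=2000)
clf.fit(X_train, y_train)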
print('MLP acc without tuning:', clf.score(X_test, y_test))

hidden_sizes = [[100], [10, 10]]
# Map each MLPClassifier argument to the list of values to search over.
param_grid = {'alpha': ..., 'hidden_layer_sizes':...}


gs = GridSearchCV(estimator=...,
                  param_grid=...,
                  scoring='accuracy',
                  cv=3,
                  n_jobs=2,
                  verbose=1)
gs.fit(X_train, y_train)

best_params = gs.best_params_
print('best_params', best_params)

clf = MLPClassifier(max_iter=2000, **best_params)

clf.fit(X_train, y_train)
print('acc with best params:', clf.score(X_test, y_test))
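
`GridSearchCV` can also tune a classifier that sits inside a `Pipeline`, so that scaling is re-fitted within each tuning fold. The sketch below shows this on synthetic data. The grid values, split size and seeds are assumptions, and parameters of a pipeline step are addressed as `<step name>__<parameter>`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the two-class data (7 numeric attributes).
X_demo, y_demo = make_classification(n_samples=300, n_features=7, random_state=0)
X_tune_demo, X_test_demo, y_tune_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42)

pipeline = Pipeline([
    ('transformer', StandardScaler()),
    ('estimator', MLPClassifier(hidden_layer_sizes=(20,), max_iter=500, random_state=0)),
])

# 'estimator__alpha' refers to the alpha parameter of the step named 'estimator'.
param_grid = {'estimator__alpha': [1e-4, 1e-2, 1.0]}

gs_demo = GridSearchCV(pipeline, param_grid=param_grid, scoring='accuracy', cv=3)
gs_demo.fit(X_tune_demo, y_tune_demo)

print('best params:', gs_demo.best_params_)
# By default the best setting is refitted on the whole tuning partition before scoring.
test_acc = gs_demo.score(X_test_demo, y_test_demo)
print('test acc with best params:', test_acc)
```

The test partition is used only once, after tuning has finished, so the reported accuracy is not influenced by the search.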
