In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Neural Nets

The moment you have all been waiting for: let's finally build our own neural network architecture! You already used a neural net in the previous assignment, the Multi-Layer Perceptron (MLP), but you did not define its architecture yourself. An MLP is generally a stack of fully connected (a.k.a. Dense) layers, and that is what we will start with here as well. We will use one of the previous datasets and tune the architecture below to improve the test accuracy.

In [None]:
# Load dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

loader = load_iris()
X = loader.data
y = loader.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
X_train.shape

(100, 4)

### Encode the target data
When doing classification with NNs we generally one-hot encode the target variable. One-hot encoding converts each target label into a binary vector whose length equals the number of categories: every entry is 0 except the one at the position of the class being encoded, which is 1. For example, with ten classes numbered 1 to 10, class 4 becomes [0,0,0,1,0,0,0,0,0,0]. The Iris data used here has three classes labelled 0, 1 and 2, so class 1 becomes [0,1,0].

We sadly do not have time to go into detail on why this is and how you should go about it; scikit-learn's LabelBinarizer below will do the trick for you. For those who are curious: you encode classes as a vector because, given a single number, the neural network will assume that class 1 is closer to class 2 than to class 7. Think of classifying pictures of animals: the network would think that class 1 (dog) is closer to class 2 (horse) than to class 8 (cat), which is of course not the case, since there is no order between images of animals.
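To make the encoding concrete, here is a minimal sketch in plain Python. The helper `one_hot` is a hypothetical name introduced only for illustration; it is not part of scikit-learn, and it assumes the classes are labelled 0, 1, 2, ... as in the Iris targets.

```python
def one_hot(label, num_classes):
    """Return a list of length num_classes with a 1 at index `label` and 0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside the range 0..{num_classes - 1}")
    return [1 if i == label else 0 for i in range(num_classes)]


# The three Iris classes are labelled 0, 1 and 2.
encoded = [one_hot(label, 3) for label in [0, 1, 2]]
print(encoded)
```

Every encoded vector differs from every other one in exactly two positions, so no pair of classes looks "closer" than another pair.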



In [None]:
# Convert labels to onehot encoding
from sklearn.preprocessing import LabelBinarizer
lb = LabelBinarizer()
lb.fit(y_train)
y_train = lb.transform(y_train)
y_test = lb.transform(y_test)
print(y_train[0])

[0 1 0]


### Designing the Neural Network
Designing an NN is a bit more complex than using the algorithms from assignment 1. An NN consists of multiple layers, which can be of different types.

We will build up the NN using Keras' sequential model.

The network below uses standard Dense layers and ReLU activation functions. If you feel adventurous you can change these, but for this exercise we advise keeping them as they are. [Here you find all available layers and activation functions in Keras](https://keras.io/api/layers/). The first number in each layer defines the number of neurons in that layer and, with that, limits how much information the layer can pass on to the next. The first layer acts as the input layer: here you need to define the shape of the data it should expect, so that it can determine how many incoming connections each neuron needs. The last Dense layer should compress the incoming data down to the number of classes, so its number of neurons should equal the number of classes. In the case of classification, the activation function of this last layer determines the choice your model makes. You can choose from several activation functions or even create your own, but for classification softmax is generally used.
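As a rough illustration of how the input shape and the neuron counts translate into connections, the sketch below counts the trainable parameters of a stack of fully connected layers. It assumes the standard dense-layer layout of one weight per incoming value plus one bias per neuron; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(num_inputs, layer_sizes):
    """Count weights and biases per layer for a stack of fully connected layers."""
    counts = []
    previous = num_inputs
    for units in layer_sizes:
        # Each neuron has one weight per incoming value plus one bias.
        counts.append((previous + 1) * units)
        previous = units
    return counts


# 4 Iris features, then layers of 64, 32, 16 and 3 neurons.
per_layer = dense_param_count(4, [64, 32, 16, 3])
print(per_layer, sum(per_layer))
```

For these layer sizes the counts come out as 320, 2080, 528 and 51, or 2979 parameters in total, which can be compared against the figures reported by the model summary.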


In [None]:
from tensorflow import keras

def define_model(input_shape, num_classes):
    # Build the architecture
    model = keras.Sequential(
        [
            keras.layers.Dense(64, activation="relu", input_shape=input_shape),
            keras.layers.Dense(32, activation="relu"),
            keras.layers.Dense(16, activation="relu"),
            keras.layers.Dense(num_classes, activation="softmax"),
        ]
    )

    model.summary()  # prints the layer overview itself; wrapping it in print() would also print "None"
    return model

### Training the network
When training a model we need to define the number of epochs it will run for. We also need an optimizer. You can use a classic SGD optimizer, but here we chose Adam, which is similar but adapts the step size for each parameter, so its default learning rate usually works without manual tuning. Finally, we select a loss function that measures how well the NN is performing. For classification, categorical cross-entropy is generally used.
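To give a feeling for what categorical cross-entropy measures, here is a minimal sketch for a single sample in plain Python. It is an illustration under simplifying assumptions, not Keras' implementation: the functions `softmax` and `categorical_crossentropy` are written here by hand, the raw scores are made up, and the `eps` clipping value is an assumption chosen only to avoid `log(0)`.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probabilities, eps=1e-7):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probabilities))


probs = softmax([2.0, 1.0, 0.1])  # made-up raw outputs for one sample
loss_if_class_0 = categorical_crossentropy([1, 0, 0], probs)
loss_if_class_2 = categorical_crossentropy([0, 0, 1], probs)
print(probs, loss_if_class_0, loss_if_class_2)
```

The loss is small when the network puts high probability on the true class and grows as that probability shrinks, which is why the second loss is larger than the first.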

In [None]:
num_epochs = 99
input_shape = X_train[0].shape
num_classes = y_train.shape[-1]

model = define_model(input_shape, num_classes)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=num_epochs)

### Metric Scores
Finally, we need to determine whether our NN has actually learned something by testing it on data it has never seen before: the test set. Below we calculate the loss (categorical cross-entropy), which is a bit hard to interpret, and the accuracy, which applies to classification and is very easy to interpret: it is the fraction of cases in the test set that the model predicted correctly.
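The accuracy calculation can be sketched by hand as follows. The targets and predicted probabilities are made-up values for illustration, and `argmax` and `accuracy` are hypothetical helpers rather than Keras functions; the predicted class is assumed to be the one with the highest probability.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(one_hot_targets, predicted_probabilities):
    """Fraction of samples whose most probable class matches the true class."""
    correct = sum(
        argmax(target) == argmax(probs)
        for target, probs in zip(one_hot_targets, predicted_probabilities)
    )
    return correct / len(one_hot_targets)


# Made-up example: four samples, three classes.
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
predictions = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.4, 0.3], [0.1, 0.6, 0.3]]
print(accuracy(targets, predictions))
```

Here the third sample is misclassified (true class 2, predicted class 1), so three out of four predictions are correct.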

In [None]:
score = model.evaluate(X_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 0.04529091343283653
Test accuracy: 0.9800000190734863


## **Assignment 1:** Tune the MLP
Tune the hyperparameters until you get a test accuracy of at least 98%. We advise starting with the number of layers and the number of neurons in each layer, and then moving on to the number of epochs. You are free to change anything else, from the optimizer to the layer types, but make sure to save enough time for the next challenge.

## **Assignment 2:** Boston house pricing
Load the Boston dataset and build a neural network that predicts the value of a house. This is a regression problem, so you'll need a different output activation and loss function than in the classification example. Have a look at the [Keras documentation](https://keras.io/api/).

Experiment with different layers, functions, and numbers of epochs until you reach an MSE below 15.0 on the test set.
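For reference, the mean squared error is the average of the squared differences between the true and predicted values. A minimal sketch in plain Python, with made-up house prices and a hypothetical helper named `mse`:

```python
def mse(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up prices for three houses.
y_true_example = [24.0, 21.5, 30.0]
y_pred_example = [22.0, 23.5, 27.0]
print(mse(y_true_example, y_pred_example))
```

The errors here are 2, -2 and 3, so the MSE is (4 + 4 + 9) / 3. Because errors are squared, a few large misses raise the MSE much more than many small ones.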


In [1]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tensorflow import keras
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)  # raw string avoids an invalid-escape warning
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
y = raw_df.values[1::2, 2]
y = y.astype("float32")  # keep the prices as floats; casting to int would truncate the regression target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
X_train.shape


def define_model(input_shape, num_classes):
    # Build the architecture
    model = keras.Sequential(
        [   keras.layers.Dense(1024, activation="relu", input_shape=input_shape),
            keras.layers.Dense(512, activation="relu"),
            keras.layers.Dense(256, activation="relu"),
            keras.layers.Dense(128, activation="relu"),
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(32, activation="relu"),
            keras.layers.Dense(16, activation="relu"),
            keras.layers.Dense(8, activation="relu"),
            keras.layers.Dense(num_classes, activation="linear"),
        ]
    )

    return model

num_epochs = 200
input_shape = X_train[0].shape
num_classes = 1  # regression predicts a single value; y_train is 1-D, so y_train.shape[-1] would be the sample count

model = define_model(input_shape, num_classes)
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mse'])

history = model.fit(X_train, y_train, epochs=num_epochs)

y_pred = model.predict(X_test)

MSE = model.evaluate(X_test, y_test)
print(f'This neural network got an MSE score of {MSE[1]}')


