
In [2]:
import pandas as pd
import numpy as np
from numpy.random import seed
import matplotlib.pyplot as plt

import tensorflow as tf

import torchvision.transforms as transforms
import torchvision.datasets as datasets

from tensorflow import keras
from tensorflow.keras import layers
# from keras.models import Sequential
# from keras.layers import Dense, Activation, Dropout
# from keras.wrappers.scikit_learn import KerasClassifier

from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict, train_test_split
from sklearn.calibration import calibration_curve

In [3]:
def get_dataset_from_github(filename, index_col_str=None, header_str='infer'):
    """Load a CSV file from the 'Datasets' folder of the course repository into a DataFrame."""
    # Raw file URLs have the form <user>/<repo>/<branch>/<path>, without a '/tree/' segment
    data_file_path = "https://raw.githubusercontent.com/FabriceBeaumont/4216_Biomedical_DS_and_AI/main/Datasets/"
    # index_col=None and header='infer' are the pandas defaults, so no case distinction is needed
    return pd.read_csv(data_file_path + filename.lstrip("/"), index_col=index_col_str, header=header_str)

## Biomedical Data Science & AI

## Assignment 9

#### Group members:  Fabrice Beaumont, Fatemeh Salehi, Genivika Mann, Helia Salimi, Jonah

---
### Exercise 1 - Basics of NN

From the MNIST database load the handwritten digits dataset.

In [4]:
# Load the data: 
# Split between train and test sets w.r.t. data 'x' and class vectors 'y'
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

#### 1.1. Normalize your dataset before training your model.

In [5]:
# For normalization we scale the images to contain values in [0, 1]
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255

In [6]:
# Add a channel axis so that all images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
input_shape = x_train[0].shape

print("x_train shape:", x_train.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples


In [7]:
# Check that the number of classes is ten (since there are ten digits 0,...,9)
num_classes = 10

num_classes == len(set(y_train)) == len(set(y_test))

True

In [8]:
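# Keep the class vectors 'y' as integer labels (0,...,9):
# the 'sparse_categorical_crossentropy' loss used below expects class indices,
# not binary class matrices as produced by keras.utils.to_categorical
y_train = y_train.astype("int64")
y_test = y_test.astype("int64")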
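
Integer class indices and binary class matrices carry the same label information, and the sparse and non-sparse cross-entropy losses agree as long as each receives its matching format. The following is a minimal NumPy sketch of that relation; the helper names and the toy probabilities are illustrative assumptions, not part of the assignment or of Keras.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows (a binary class matrix)."""
    return np.eye(num_classes, dtype="float32")[labels]

def categorical_crossentropy(y_onehot, probs):
    """Mean cross-entropy for one-hot targets."""
    return float(np.mean(-np.sum(y_onehot * np.log(probs), axis=1)))

def sparse_categorical_crossentropy(y_int, probs):
    """Mean cross-entropy for integer targets: pick the log-probability of the true class."""
    return float(np.mean(-np.log(probs[np.arange(len(y_int)), y_int])))

# Hypothetical toy data: three samples, four classes
labels = np.array([2, 0, 3])
probs = np.array([
    [0.10, 0.20, 0.60, 0.10],
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
])

loss_dense = categorical_crossentropy(one_hot(labels, 4), probs)
loss_sparse = sparse_categorical_crossentropy(labels, probs)
print(loss_dense, loss_sparse)
```

Feeding one-hot rows to the sparse variant (or integer labels to the non-sparse one) mismatches the expected target shape, which is a common cause of shape-related errors at the start of training.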

#### 1.2. Train a neural network once using **Adam** and once using **AdaGrad** optimizer. 

*Hint*: Set epochs to $20$, neurons of hidden layer to $100$ and use the ReLU as activation function.
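
To make the difference between the two optimizers concrete, here is a rough NumPy sketch of their update rules applied to a toy quadratic loss. The function names, the learning rate of $0.1$ and the toy loss are assumptions chosen for illustration; the Keras implementations have their own defaults and details.

```python
import numpy as np

def adagrad_step(w, grad, state, lr=0.1, eps=1e-8):
    """AdaGrad: scale the step by the root of the accumulated squared gradients."""
    state["g2"] = state.get("g2", np.zeros_like(w)) + grad ** 2
    return w - lr * grad / (np.sqrt(state["g2"]) + eps)

def adam_step(w, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected moving averages of the gradient and of its square."""
    state["t"] = state.get("t", 0) + 1
    state["m"] = beta1 * state.get("m", np.zeros_like(w)) + (1 - beta1) * grad
    state["v"] = beta2 * state.get("v", np.zeros_like(w)) + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

def minimize(step_fn, steps=200):
    """Minimize the toy loss f(w) = sum(w**2), whose gradient is 2*w."""
    w, state = np.array([3.0, -2.0]), {}
    for _ in range(steps):
        w = step_fn(w, 2 * w, state)
    return w

w_adagrad = minimize(adagrad_step)
w_adam = minimize(adam_step)
print("AdaGrad:", w_adagrad, "Adam:", w_adam)
```

AdaGrad only ever accumulates squared gradients, so its effective step size shrinks monotonically, whereas Adam uses exponentially decaying averages and can keep adapting later in training.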

In [12]:
num_epochs = 20
num_hidden_neurons = 100
activation_fct = "relu"
loss = "sparse_categorical_crossentropy"

batch_size = 128
validation_split = 0.1

In [13]:
model = keras.Sequential(
    [
        # Alternative matching the hint (one hidden layer with 'num_hidden_neurons' units);
        # note that Flatten has to come before the first Dense layer:
        # keras.Input(shape=input_shape),
        # layers.Flatten(),
        # layers.Dense(num_hidden_neurons, activation=activation_fct),
        # layers.Dense(num_classes, activation="softmax"),

        # Model actually trained here: a small CNN
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model.summary()


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 1600)              0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 1600)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)               
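
The parameter counts in such a summary can be checked by hand. The sketch below recomputes them from the layer definitions above; the two helper functions are illustrative and not part of Keras.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_features, units):
    """Each unit has one weight per input feature plus one bias."""
    return (in_features + 1) * units

conv1 = conv2d_params(3, 3, 1, 32)     # first Conv2D on the 1-channel input
conv2 = conv2d_params(3, 3, 32, 64)    # second Conv2D on 32 feature maps
dense = dense_params(5 * 5 * 64, 10)   # softmax layer on the flattened 5x5x64 maps

print(conv1, conv2, dense, conv1 + conv2 + dense)
```

Pooling, Flatten and Dropout layers have no trainable parameters, which is why they show a count of 0.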

In [14]:
# Train the model with 'adam'
model.compile(loss=loss, optimizer="adam", metrics=["accuracy"])

history_adam = model.fit(x_train, y_train, batch_size=batch_size, epochs=num_epochs, validation_split=validation_split)


In [None]:
score_adam = model.evaluate(x_test, y_test, verbose=0)
print(f'Test loss: {score_adam[0]} / Test accuracy: {score_adam[1]}')

In [None]:
# Now do the same using AdaGrad
# Start from a freshly initialized copy of the architecture, so that AdaGrad
# does not continue from the weights already trained with Adam
model_adagrad = keras.models.clone_model(model)
model_adagrad.compile(loss=loss, optimizer="adagrad", metrics=["accuracy"])

history_adagrad = model_adagrad.fit(x_train, y_train, batch_size=batch_size, epochs=num_epochs, validation_split=validation_split)

score_adagrad = model_adagrad.evaluate(x_test, y_test, verbose=0)
print(f'Test loss: {score_adagrad[0]} / Test accuracy: {score_adagrad[1]}')

#### 1.3. Plot the *SparseCategoricalCrossentropy* loss for both models. Plot the computed accuracy for both models. Which model performed better while training?

In [None]:
# First check, what information was stored in the histories
print(history_adam.history.keys())

In [None]:
# Now plot the losses
plt.plot(history_adam.history['loss'])
plt.plot(history_adam.history['val_loss'])

plt.plot(history_adagrad.history['loss'])
plt.plot(history_adagrad.history['val_loss'])

plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['Adam_train', 'Adam_validation', 'Adagrad_train', 'Adagrad_validation'], loc='upper left')
plt.show()

#### 1.4. Compute the model accuracy on the test set for both optimizers. Which model performed better?

In [None]:
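# Plot the training and validation accuracies (this completes 1.3);
# with metrics=["accuracy"], the history keys are 'accuracy' and 'val_accuracy'
plt.plot(history_adam.history['accuracy'])
plt.plot(history_adam.history['val_accuracy'])

plt.plot(history_adagrad.history['accuracy'])
plt.plot(history_adagrad.history['val_accuracy'])

plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['Adam_train', 'Adam_validation', 'Adagrad_train', 'Adagrad_validation'], loc='upper left')
plt.show()

# Test-set accuracy for both optimizers, as returned by model.evaluate above
print(f'Adam test accuracy: {score_adam[1]}')
print(f'AdaGrad test accuracy: {score_adagrad[1]}')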

#### 1.5. Familiarize yourself with **Layer Normalization** and explain how it works.
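
Layer Normalization normalizes the activations of each sample across its features (rather than across the batch, as Batch Normalization does) and then applies a learnable scale and shift. A minimal NumPy sketch of the forward pass, with illustrative names and toy activations that are not taken from the assignment:

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each sample over its feature axis, then scale and shift."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Hypothetical activations: a batch of 2 samples with 4 features each
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 10.0, 20.0, 20.0]])
gamma = np.ones(4)   # learnable scale, initialized to 1
beta = np.zeros(4)   # learnable shift, initialized to 0

out = layer_norm(x, gamma, beta)
print(out.mean(axis=-1), out.std(axis=-1))
```

Because the statistics are computed per sample, the result does not depend on the batch size, and the same computation is used during training and inference.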

#### 1.6. Use the same dataset to train a neural network with Layer Normalization.

##### 1.6.a. Compute the SparseCategoricalCrossentropy loss and model accuracy.

##### 1.6.b. Evaluate the model performance using the test dataset.

---
### Exercise 2 - Hyper Parameter Optimization

#### 2.1. What are the main challenges with hyper-parameter optimization for neural networks?

#### 2.2. Inform yourself about variants of **Bayesian-HPO** and explain them in detail.

#### 2.3. Using the same MNIST dataset, optimize the activation function for the output layer and the number of dropout units in the NN model using the following methods.

##### 2.3.a. Grid search
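
Grid search evaluates every combination of a fixed set of candidate values. The sketch below shows the idea in plain Python; `validation_score` is a hypothetical stand-in for "train the MNIST model with these settings and return its validation accuracy", and the candidate values are assumptions.

```python
from itertools import product

def validation_score(activation, dropout_rate):
    """Hypothetical stand-in for training the model and returning validation accuracy."""
    base = {"softmax": 0.95, "sigmoid": 0.90}[activation]
    return base - (dropout_rate - 0.3) ** 2

search_space = {
    "activation": ["softmax", "sigmoid"],
    "dropout_rate": [0.1, 0.3, 0.5],
}

# Grid search: evaluate every combination of the listed values
names = list(search_space)
results = []
for values in product(*search_space.values()):
    params = dict(zip(names, values))
    results.append((validation_score(**params), params))

best_score, best_params = max(results, key=lambda r: r[0])
print(best_score, best_params)
```

The number of evaluations is the product of the list lengths (here $2 \cdot 3 = 6$), so the cost grows quickly with every additional hyper-parameter.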

##### 2.3.b. Random search
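
Random search draws a fixed number of configurations at random instead of enumerating a grid, which allows continuous ranges such as the dropout rate. A minimal sketch in plain Python; `validation_score` is again a hypothetical stand-in for a full training run, and the sampling ranges and trial budget are assumptions.

```python
import random

def validation_score(activation, dropout_rate):
    """Hypothetical stand-in for training the model and returning validation accuracy."""
    base = {"softmax": 0.95, "sigmoid": 0.90}[activation]
    return base - (dropout_rate - 0.3) ** 2

rng = random.Random(0)   # fixed seed so that the sampled trials are reproducible
num_trials = 20

trials = []
for _ in range(num_trials):
    params = {
        "activation": rng.choice(["softmax", "sigmoid"]),
        "dropout_rate": rng.uniform(0.0, 0.8),   # sampled from a continuous range
    }
    trials.append((validation_score(**params), params))

best_score, best_params = max(trials, key=lambda t: t[0])
print(best_score, best_params)
```

The budget `num_trials` is chosen independently of the number of hyper-parameters, which is the main practical difference to a grid.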

##### 2.3.c. Bayesian Hyper-parameter optimization

---
### Exercise 3 - Transfer Learning & CNNs

#### 3.1. Load the *VGG16 pre-trained model* using Keras Applications API. Use the model to classify the dog images in canines.zip after pre-processing each image by doing the following:

##### 3.1.a. Load each image and set the size to 224 x 224 pixels.

##### 3.1.b. Convert the image pixels to a numpy array and reshape it according to the model’s input requirements.
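
In a notebook this step would normally rely on the Keras helpers (for example `keras.applications.vgg16.preprocess_input`). The NumPy sketch below only mirrors what that preprocessing is understood to do for VGG16: add a batch axis, switch from RGB to BGR, and subtract the ImageNet channel means. The function name and the random stand-in image are assumptions for illustration.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by 'caffe'-style preprocessing
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype="float32")

def to_vgg16_batch(image_rgb):
    """Turn one (224, 224, 3) RGB image with values in [0, 255] into a VGG16-style batch."""
    x = np.asarray(image_rgb, dtype="float32")
    x = np.expand_dims(x, axis=0)   # add the batch axis: (1, 224, 224, 3)
    x = x[..., ::-1]                # reorder channels from RGB to BGR
    return x - IMAGENET_MEAN_BGR    # zero-center each channel

# Random stand-in for a loaded and resized image
fake_image = np.random.default_rng(0).integers(0, 256, size=(224, 224, 3))
batch = to_vgg16_batch(fake_image)
print(batch.shape)
```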

##### 3.1.c. Use the model to print out the predicted class and its probability for each image.

#### 3.2. Downscale the given matrix by applying the following pooling operations:

##### 3.2.a. Max Pool

##### 3.2.b. Average Pool
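
Both pooling operations can be sketched with a single NumPy helper. The matrix from the exercise sheet is not reproduced here, so `example` is a made-up $4 \times 4$ matrix, and a $2 \times 2$ window with stride $2$ is assumed.

```python
import numpy as np

def pool2d(matrix, pool_size=2, mode="max"):
    """Non-overlapping pooling (stride = pool_size) over a 2D matrix."""
    h, w = matrix.shape
    h_out, w_out = h // pool_size, w // pool_size
    # Crop rows/columns that do not fill a complete window, then group into blocks
    blocks = matrix[:h_out * pool_size, :w_out * pool_size].reshape(
        h_out, pool_size, w_out, pool_size
    )
    reduce = np.max if mode == "max" else np.mean
    return reduce(blocks, axis=(1, 3))

# Made-up example matrix (not the one from the exercise sheet)
example = np.array([[1, 3, 2, 4],
                    [5, 7, 6, 8],
                    [9, 2, 1, 0],
                    [3, 4, 5, 6]])

max_pooled = pool2d(example, mode="max")
avg_pooled = pool2d(example, mode="average")
print(max_pooled)
print(avg_pooled)
```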

#### 3.3. Load the **CIFAR10 dataset** using Keras datasets API and normalize the images’ pixel values. Train a convolutional neural network to classify the dataset images with the following architecture:

##### 3.3.a. Convolutional Base:
1. An input convolution layer with $32$ filters and a kernel size of $(3,3)$. Adjust your input shape to that of the CIFAR images’ format.
2. Two convolution layers, each with $64$ filters and a kernel size of $(3,3)$.
3. Two Max Pool layers, with a pool size of $2\times 2$.

##### 3.3.b. Two dense layers, with $64$ and $10$ units respectively. Adjust the output of the convolutional base such that it satisfies the input requirements of the dense layers.
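
To see what "adjusting the output of the convolutional base" amounts to, the spatial sizes can be traced by hand. The sketch assumes unpadded ("valid") convolutions with stride $1$ and the layer order conv, pool, conv, pool, conv; the exercise does not fix that order, so it is an assumption.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping pooling (floor division)."""
    return size // pool

size, channels = 32, 3                 # CIFAR10 images are 32x32 RGB
size, channels = conv_out(size), 32    # Conv2D with 32 filters -> 30x30x32
size = pool_out(size)                  # MaxPooling2D          -> 15x15x32
size, channels = conv_out(size), 64    # Conv2D with 64 filters -> 13x13x64
size = pool_out(size)                  # MaxPooling2D          -> 6x6x64
size, channels = conv_out(size), 64    # Conv2D with 64 filters -> 4x4x64

flat_features = size * size * channels  # length of the vector after flattening
print(size, channels, flat_features)
```

Under these assumptions a Flatten layer between the convolutional base and the first dense layer turns each $4 \times 4 \times 64$ feature volume into a vector of length $1024$.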

##### 3.3.c. Use the following parameters to train the network:
1. Sparse categorical cross entropy as your loss function
1. Adam optimizer
1. $10$ epochs
1. ReLU activation for your layers

Compile your model, then plot the accuracy across each epoch.