## Assignment 1
*by Ebba Bergman*

Let's do something very similar to the lab

**Hand in:** This notebook and a PDF of this notebook. No written answers to the questions are required; they are only here to help you learn.

**You are free to discuss the general concepts with other groups but, for the sake of your own learning, we encourage you not to exchange code**

A lot of the code below is inspired by labs developed by Christophe Avenel at NBIS, labs and assignments made by Phil Harrison, as well as https://www.tensorflow.org/guide/keras/functional/.


In [None]:
## First we need to import all of the packages we need

import numpy as np
import tensorflow as tf
import pandas as pd
from PIL import Image
import IPython
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

from tensorflow.keras.preprocessing.image import ImageDataGenerator

import cnn_helper

Note: the cnn_helper was written by Christophe Avenel, and his code (including his lab which this one is based on), is available here: https://github.com/NBISweden/workshop-neural-nets-and-deep-learning/tree/master/session_convolutionalNeuralNetworks/Labs

In [None]:
    
def plot_history(model_history, model_name):
    fig = plt.figure(figsize=(15, 5), facecolor='w')
    ax = fig.add_subplot(131)
    ax.plot(model_history.history['loss'])
    ax.plot(model_history.history['val_loss'])
    ax.set(title=model_name + ': Model loss', ylabel='Loss', xlabel='Epoch')
    ax.legend(['train', 'valid'], loc='upper right')
    
    ax = fig.add_subplot(132)
    ax.plot(np.log(model_history.history['loss']))
    ax.plot(np.log(model_history.history['val_loss']))
    ax.set(title=model_name + ': Log model loss', ylabel='Log loss', xlabel='Epoch')
    ax.legend(['train', 'valid'], loc='upper right')

    ax = fig.add_subplot(133)
    ax.plot(model_history.history['accuracy'])
    ax.plot(model_history.history['val_accuracy'])
    ax.set(title=model_name + ': Model accuracy', ylabel='Accuracy', xlabel='Epoch')
    ax.legend(['train', 'valid'], loc='upper right')
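    # Save before plt.show(): once the figure has been shown and closed, savefig would write an empty image
    plt.savefig("History Plot.png")
    plt.show()
    plt.close()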


## Set up the data, look at it

In [None]:
## Set up where to find our data
data_directory = "./LabData/bloodcells_small/data/"
labels_path =  "./LabData/bloodcells_small/labels.csv"


In [None]:
# This is a dataframe, a way to look at data as tables.
#Google "Python pandas dataframe" to get more information, or to find new commands as you need
# Anything you can do with data frames you could do with loops, but it is sometimes easier to read and write code with dataframes
df_labels = pd.read_csv(labels_path) 

### Q: Look at the labels. Which column do you think contains the true label?

In [None]:
## Let's look at the images - always a good start to the project
# Random images will be displayed here; run this several times to see different images

figure, ax = plt.subplots(2, 3, figsize=(14, 10))
figure.suptitle("Examples of images", fontsize=20)
axes = ax.ravel()

df_images_to_show = df_labels.sample(8)


for i in range(len(axes)):
    row = df_images_to_show.iloc[[i]]
    random_image = Image.open(data_directory + row["Filenames"].values[0])
    axes[i].set_title(row["Class"].values[0], fontsize=14) 
    axes[i].imshow(random_image)
    axes[i].set_axis_off()
    
plt.subplots_adjust(wspace=0.05, hspace=0.05)
plt.show()
plt.close()


### Q: Can you see any difference between the classes? 
### Q: Do you think a human being able to see the difference between classes makes it an easier or more difficult problem for a neural network?

In [None]:
# What's the shape of the image?
image_shape = np.array(random_image).shape
print(image_shape)

In [None]:
# Let's look a little bit into the labels
set_size = len(df_labels) # Number of rows (df_labels.size would count rows * columns)
print(set_size)
print(df_labels.head())

In [None]:
df_labels['Class'].value_counts()

## Divide the data for training, validation and test

In [None]:
## Next, let's divide the rows into a training, a validation and a test set. 
class_column_header = "Class"
df_to_use = df_labels.copy() #We're copying the df_labels so that you can look at it again later if you want

test_set_fraction = 0.1
validation_set_fraction = 0.2

df_test = df_to_use.groupby(class_column_header).sample(frac = test_set_fraction)
# Remove the test rows from df_to_use: after the concatenation every test row occurs more than once,
# and drop_duplicates(keep=False) drops all rows that occur more than once, so only the rows not in df_test remain.
# Adding df_test once would be enough, but better safe than sorry.
df_to_use = pd.concat([df_to_use, df_test, df_test]).drop_duplicates(keep=False)
df_valid = df_to_use.groupby(class_column_header).sample(frac = validation_set_fraction)
df_train = pd.concat([df_to_use, df_valid, df_valid]).drop_duplicates(keep=False) 

In [None]:
print(df_test.head())

In [None]:
## Set up generators that specify how the images are loaded, how many at a time (batch size),
## whether the images should be shuffled, etc.
batch_size = 8

filename_column = 'Filenames'
true_value = "Class"
# create a data generator

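## Note: we tend to get better results if the pixel values are between 0 and 1, so we rescale by 1/255, since the highest possible pixel value for these images is 255
## (samplewise_center and samplewise_std_normalization then shift and scale each image to zero mean and unit standard deviation)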
train_data_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255, samplewise_center=True, samplewise_std_normalization=True)
valid_data_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255, samplewise_center=True, samplewise_std_normalization=True)
test_data_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255, samplewise_center=True, samplewise_std_normalization=True)

train_generator = train_data_generator.flow_from_dataframe(
    df_train, directory=data_directory, x_col=filename_column, y_col=true_value,
    weight_col=None, class_mode='categorical', batch_size=batch_size, target_size = image_shape,  color_mode='grayscale' ,shuffle=True,
)

valid_generator = valid_data_generator.flow_from_dataframe(
    df_valid, directory=data_directory, x_col=filename_column, y_col=true_value,
    weight_col=None, class_mode='categorical', batch_size=batch_size, target_size = image_shape,  color_mode='grayscale' , shuffle=False,
)


test_generator = test_data_generator.flow_from_dataframe(
    df_test, directory=data_directory, x_col=filename_column, y_col=true_value,
    weight_col=None,class_mode='categorical', batch_size=batch_size, target_size = image_shape,  color_mode='grayscale' , shuffle=False,
)


train_steps=train_generator.n//train_generator.batch_size if train_generator.n >= train_generator.batch_size else 1
validation_steps=valid_generator.n//valid_generator.batch_size if valid_generator.n >= valid_generator.batch_size else 1



# CNN

Convolutional Neural Networks revolutionized the field of deep learning. You have seen how convolutions work in the lectures. One of the huge benefits of convolutions is that, because the filters (sometimes called kernels in code) move across the image, the position of an object in the image becomes much less important than when we flattened images for use in traditional Artificial Neural Networks. 
  
For this part of the lab you will try a couple of different architectures and hyperparameters. The **architecture** is basically the structure of the network: how many nodes, how many layers, and the overall shape of these. The **hyperparameters** are most easily defined as all of the parameters that are set *before* the training of the network begins, such as the number of epochs, which activation function to use in each layer, and which optimization method we use together with backpropagation.  
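To make the point about filters moving across the image concrete, here is a minimal NumPy sketch (not the Keras implementation). The toy image, the shift and the 2x2 filter are made up for illustration. Like Keras, it computes a cross-correlation, i.e. the filter is not flipped.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # The same weights are reused at every position in the image
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# A toy 8x8 image with a bright 2x2 blob, and the same blob moved 3 rows down and 2 columns right
image = np.zeros((8, 8))
image[1:3, 1:3] = 1.0
shifted = np.roll(image, shift=(3, 2), axis=(0, 1))

kernel = np.ones((2, 2))  # hypothetical filter that responds to bright 2x2 blobs

response = conv2d_valid(image, kernel)
response_shifted = conv2d_valid(shifted, kernel)

# The strongest response is equally strong in both cases; only its location moves
print(response.max(), np.unravel_index(response.argmax(), response.shape))
print(response_shifted.max(), np.unravel_index(response_shifted.argmax(), response_shifted.shape))
```

A dense layer on the flattened image would need separate weights for every position of the blob, whereas the filter above finds it anywhere with the same four weights.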

In [None]:
## Set up the model architecture
# See https://www.tensorflow.org/guide/keras/functional/ if you want to see the documentation

cnn_inputs = keras.Input(shape=(32,32,1))
x = layers.Conv2D(1, kernel_size=(3, 3), strides=1,padding='same')(cnn_inputs)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Flatten()(x)
cnn_outputs = layers.Dense(5, activation='softmax')(x)

In [None]:
## Define the model as a keras model
cnn_model = keras.Model(inputs=cnn_inputs, outputs=cnn_outputs, name="cnn_Model_1")

In [None]:
## We'll use the same generators as above here, so no need to redefine them
## compile model

cnn_model.compile(optimizer=keras.optimizers.Adam(), loss='categorical_crossentropy', metrics = ['accuracy'])
cnn_model.summary()

In [None]:
## Actually train model
epochs = 5
history = cnn_model.fit(train_generator, # fit_generator is deprecated; fit accepts generators directly
                    steps_per_epoch= train_steps,
                    validation_data= valid_generator,
                    validation_steps= validation_steps,
                    epochs= epochs
        )

In [None]:
## Plot results
plot_history(history, "CNN")

In [None]:
# plot confusion matrix
cnn_helper.plot_confusion_matrix_from_generator(cnn_model, valid_generator)

### Q: What do the curves tell you about the models?

You can see some examples of what the curves can look like at: https://uppsala.instructure.com/courses/23804/pages/deep-learning-plots/edit

# Expanding the models

## Deeper models


Sometimes a deeper and/or more complex model can be helpful. Try adding some more convolution layers and pooling layers to the model. Try changing the filter sizes and the number of filters as well.
More information about the convolutional layer can be found here: https://keras.io/api/layers/convolution_layers/convolution2d/, about max pooling here: https://keras.io/api/layers/pooling_layers/max_pooling2d/, and a different way of building models can be found here: https://www.tensorflow.org/tutorials/images/cnn and here: https://www.tensorflow.org/tutorials/quickstart/advanced
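When you change the number of filters or the filter size, it helps to know how that changes the number of trainable parameters. The sketch below counts them by hand for the small starting architecture used in this lab, assuming the Keras defaults (every layer has a bias) and that pooling and flattening layers have no parameters. The helper functions are made up for this example; `model.summary()` remains the reference.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights of all filters plus one bias per filter."""
    kernel_height, kernel_width = kernel_size
    return kernel_height * kernel_width * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return inputs * units + units

# 32x32x1 input -> Conv2D(1 filter, 3x3, padding='same') -> 2x2 max pooling -> Flatten -> Dense(5)
conv_one = conv2d_params((3, 3), in_channels=1, filters=1)
flat_one = (32 // 2) * (32 // 2) * 1      # pooling halves height and width
total_one_filter = conv_one + dense_params(flat_one, 5)

# The same layout with 5 filters in the convolution layer
conv_five = conv2d_params((3, 3), in_channels=1, filters=5)
flat_five = (32 // 2) * (32 // 2) * 5
total_five_filters = conv_five + dense_params(flat_five, 5)

print(total_one_filter, total_five_filters)
```

Note that most of the parameters sit in the final dense layer, so the size of the feature map going into `Flatten` matters more than the convolution itself in such a small model.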

In [None]:
## Set up the model architecture

# Change the code below so that the new model has roughly the same number of parameters as your best ANN
# Hint: you can both add more Conv2D layers and increase the kernel (filter) size

cnn_inputs = keras.Input(shape=(32,32,1))
x = layers.Conv2D(5, kernel_size=(3, 3), strides=1,padding='same')(cnn_inputs)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Flatten()(x)
cnn_outputs = layers.Dense(5, activation='softmax')(x)

## Define the model 
cnn_model = keras.Model(inputs=cnn_inputs, outputs=cnn_outputs, name="cnn_Model_2")

## Compile the model
cnn_model.compile(optimizer=keras.optimizers.Adam(), loss='categorical_crossentropy', metrics = ['accuracy'])
cnn_model.summary()

In [None]:
## Actually train model
epochs = 10
history = cnn_model.fit(train_generator, # fit_generator is deprecated; fit accepts generators directly
                    steps_per_epoch= train_steps,
                    validation_data= valid_generator,
                    validation_steps= validation_steps,
                    epochs= epochs
                                 )         

In [None]:
## Plot results
plot_history(history, "cnn_model")

# plot confusion matrix
cnn_helper.plot_confusion_matrix_from_generator(cnn_model, valid_generator)

## Try a couple of deeper models and save your best one for further study


### Add all these models beneath this heading

## Data Augmentation


Let's try something else, maybe you would like to add some data augmentation? 
Data augmentation basically means that we randomly alter the incoming images in different ways to make sure that the network can handle those types of variations.

If you want to read more you can look at this article; the sections "Data Augmentations based on basic image manipulations" and "Geometric transformations" are of particular interest here: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0

See https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator for things you can try by adding input parameters to the ImageDataGenerator().


Update the cell below to **include data augmentations only in the training data generator, then run your CNN again**
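To illustrate what "randomly altering the incoming images" means, here is a small sketch in plain NumPy. It is not the Keras API: `ImageDataGenerator` does this kind of thing on the fly through parameters such as `horizontal_flip`, `vertical_flip` and `rotation_range`. The function name and the placeholder image are made up for this example.

```python
import numpy as np

def random_flip_rotate(image, rng):
    """Return a randomly flipped and 90-degree-rotated copy of a square (H, W, C) image."""
    if rng.random() < 0.5:
        image = np.fliplr(image)          # mirror left-right
    if rng.random() < 0.5:
        image = np.flipud(image)          # mirror up-down
    quarter_turns = int(rng.integers(0, 4))
    return np.rot90(image, k=quarter_turns, axes=(0, 1))

rng = np.random.default_rng(seed=0)

# Placeholder for a real 32x32 grayscale image
image = np.arange(32 * 32, dtype=np.float32).reshape(32, 32, 1)

# Every call gives a (possibly) different version of the same image; the label stays the same
augmented = [random_flip_rotate(image, rng) for _ in range(4)]
for version in augmented:
    print(version.shape)
```

Flips and quarter turns only move pixels around, so no information is lost. Whether such variations are realistic for your images is exactly what you need to judge when picking augmentations.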

In [None]:
## Set up generators 
batch_size = 8

filename_column = 'Filenames'
true_value = "Class"
# create a data generator

## Note: we tend to get better results if the pixel values are between 0 and 1, so we rescale by 1/255, since the highest possible pixel value for these images is 255
train_data_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255, samplewise_center=True, samplewise_std_normalization=True,
    ## ADD CODE HERE
)
valid_data_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255, samplewise_center=True, samplewise_std_normalization=True)
test_data_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255, samplewise_center=True, samplewise_std_normalization=True)

train_generator = train_data_generator.flow_from_dataframe(
    df_train, directory=data_directory, x_col=filename_column, y_col=true_value,
    weight_col=None, class_mode='categorical', batch_size=batch_size, target_size = image_shape,  color_mode='grayscale' ,shuffle=True,
)

valid_generator = valid_data_generator.flow_from_dataframe(
    df_valid, directory=data_directory, x_col=filename_column, y_col=true_value,
    weight_col=None, class_mode='categorical', batch_size=batch_size, target_size = image_shape,  color_mode='grayscale' , shuffle=False,
)


test_generator = test_data_generator.flow_from_dataframe(
    df_test, directory=data_directory, x_col=filename_column, y_col=true_value,
    weight_col=None,class_mode='categorical', batch_size=batch_size, target_size = image_shape,  color_mode='grayscale' , shuffle=False,
)


train_steps=train_generator.n//train_generator.batch_size if train_generator.n >= train_generator.batch_size else 1
validation_steps=valid_generator.n//valid_generator.batch_size if valid_generator.n >= valid_generator.batch_size else 1


In [None]:
## Set up the model architecture
### use your best model from above, and rename it here to cnn_model_augmented
cnn_inputs = keras.Input(shape=(32,32,1))
x = layers.Conv2D(1, kernel_size=(3, 3), strides=2,padding='same')(cnn_inputs)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Flatten()(x)
cnn_outputs = layers.Dense(5, activation='softmax')(x)

In [None]:
## Define the model 
cnn_model = keras.Model(inputs=cnn_inputs, outputs=cnn_outputs, name="cnn_Model_2")

In [None]:
## Compile the model

cnn_model.compile(optimizer=keras.optimizers.Adam(), loss='categorical_crossentropy', metrics = ['accuracy'])
cnn_model.summary()

In [None]:
## Actually train model
epochs = 10
history = cnn_model.fit(train_generator, # fit_generator is deprecated; fit accepts generators directly
                    steps_per_epoch= train_steps,
                    validation_data= valid_generator,
                    validation_steps= validation_steps,
                    epochs= epochs
        )

In [None]:
## Plot results
plot_history(history, "Data Augmentation added")

In [None]:
# plot confusion matrix
cnn_helper.plot_confusion_matrix_from_generator(cnn_model, valid_generator)

### Q: Did the data augmentation help? Why or why not? What makes this dataset more or less likely to be helped by data augmentation?


<details>
<summary>
<font size="3" color="green">
<b>Optional hints for <code><font size="4">question above</font></code></b>
</font>
</summary>
1. Are the blood cells at random places in the image?
    
2. Look at some of the images. Are the blood cells centered? What could rotations or zooms change about this?
    
3. Are there color changes you could compensate for?    
</details>

## Regularisation methods


BatchNormalization and Dropout are two different regularisation methods. Try adding both to the best working CNN model.  
  
Read more about BatchNormalization here: https://keras.io/api/layers/normalization_layers/batch_normalization/
Read more about Dropout here: https://keras.io/api/layers/regularization_layers/dropout/
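As a rough illustration of what the two layers compute during training, here is a simplified NumPy sketch. It is not the Keras implementation: the function names are made up, and the batch normalisation part leaves out the moving averages that are used at prediction time.

```python
import numpy as np

def dropout_forward(x, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of the activations and rescale the rest."""
    if not training or rate == 0.0:
        return x                                   # nothing is dropped at prediction time
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

def batchnorm_forward(x, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalise each feature over the batch axis, then scale by gamma and shift by beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(seed=0)
activations = rng.normal(loc=5.0, scale=3.0, size=(8, 4))   # a batch of 8 samples with 4 features

dropped = dropout_forward(activations, rate=0.5, rng=rng)
normalised = batchnorm_forward(activations)

print(dropped)
print(normalised.mean(axis=0), normalised.std(axis=0))
```

Dropout adds noise by removing activations at random, while batch normalisation rescales activations using statistics of the current batch. Keep this in mind for the question below.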

### Q: What are the main similarities and differences between these methods?

In [None]:

# Create the model here
## Set up the model architecture
### use your best model from above

cnn_inputs = keras.Input(shape=(32,32,1))
x = layers.Conv2D(1, kernel_size=(3, 3), strides=2,padding='same')(cnn_inputs)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Flatten()(x)
cnn_outputs = layers.Dense(5, activation='softmax')(x)





In [None]:
## Define the model 
cnn_model = keras.Model(inputs=cnn_inputs, outputs=cnn_outputs, name="cnn_Model")

In [None]:
## Compile the model

cnn_model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics = ['accuracy'])


In [None]:
## Actually train model
epochs = 15
history = cnn_model.fit(train_generator, # fit_generator is deprecated; fit accepts generators directly
                    steps_per_epoch= train_steps,
                    validation_data= valid_generator,
                    validation_steps= validation_steps,
                    epochs= epochs
        )

In [None]:
## Plot results
plot_history(history, "test_Name")

# plot confusion matrix
cnn_helper.plot_confusion_matrix_from_generator(cnn_model, valid_generator)

### Q: Is there such a thing as too much regularisation?

# Visualise your best CNN

Use the code below to visualise some of the weights you have trained. Hint: Weights are present in convolutional filters and dense layers, nowhere else.

### Visualize both one layer with filters, and the output layer
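The shapes of the weight arrays decide how they have to be sliced before plotting. The sketch below uses random stand-in arrays instead of trained weights. The shapes follow the Keras conventions for `Conv2D` and `Dense` kernels, and the 16x16x1 feature map is an assumption that matches the small starting model in this lab.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-ins for what layer.get_weights()[0] returns; the values are random, not trained
conv_kernel = rng.normal(size=(3, 3, 1, 5))    # (height, width, input channels, filters)
dense_kernel = rng.normal(size=(256, 5))       # (inputs, units)

# One 2D image per convolution filter (first input channel)
filter_images = [conv_kernel[:, :, 0, i] for i in range(conv_kernel.shape[-1])]

# One 2D image per output class: undo the flattening of a hypothetical 16x16x1 feature map
class_images = [dense_kernel[:, j].reshape(16, 16) for j in range(dense_kernel.shape[-1])]

print(filter_images[0].shape, class_images[0].shape)
```

Each entry of `filter_images` and `class_images` can be passed to `imshow` in the same way as in the plotting cell below.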

In [None]:
# Pick the layer 
print(cnn_model.layers)
cw1 = cnn_model.layers[1].get_weights() ## Pick the layer whose weights you want to visualise
print(len(cw1)) # 2 arrays: 1 with the weights, 1 with the biases
print(cw1[0].shape) # Weights
print(cw1[1].shape) # Biases
matrix = cw1[0]

In [None]:
# Plot your filters 
figure, ax = plt.subplots(2, 3, figsize=(14, 10))
figure.suptitle("Weights visualized", fontsize=20)
axes = ax.ravel()

# Conv2D kernels have the shape (height, width, input channels, number of filters)
for i in range(min(matrix.shape[-1], len(axes))): # One plot per filter, as long as there are axes left
    image = matrix[:, :, 0, i] # Filter i, first input channel
    axes[i].set_title("Filter " + str(i+1), fontsize=14) 
    axes[i].imshow(image)
    axes[i].set_axis_off()
    
plt.subplots_adjust(wspace=0.05, hspace=0.05)
plt.show()
plt.close()

# Using existing models 

One great thing to do when making a CNN model is to use an architecture that has worked for similar cases. I happen to know that the existing CNN model VGG16 is a good model for these types of images, so try that one next.

There are many ways of visualising neural networks, see https://datascience.stackexchange.com/questions/12851/how-do-you-visualize-neural-network-architectures, but here is one made by Christophe Avenel

<img src="Illustrations/vgg16.png" title="VGG16 model"/>

### VGG16

In [None]:
vgg_model = keras.applications.VGG16(
    include_top=False,
    weights=None,
    input_shape=(32, 32, 1),
    pooling=None,
)


In [None]:

# add new classifier layers
flat1 = layers.Flatten()(vgg_model.layers[-1].output)
class1 = layers.Dense(1024, activation='relu')(flat1)
output = layers.Dense(5, activation='softmax')(class1)


In [None]:

vgg_model = keras.Model(inputs=vgg_model.inputs, outputs=output)

vgg_model.summary() # summary() prints the table itself and returns None, so no print() is needed

### Q: How many parameters does this model have?

In [None]:
## Compile the model

vgg_model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics = ['accuracy'])

### Q: Why do we need new classification layers?

<details>
<summary>
<font size="3" color="green">
<b>Optional hints for <code><font size="4">question above</font></code></b>
</font>
</summary>
1. What is the original network classifying? 

2. What do we want to classify? 
    
<details>
<summary>
<font size="3" color="green">
<b>Optional hints for <code><font size="4">The hint, if you need it</font></code></b>
</font>
</summary>
1. So how do we remove the previous classification and make the new one? Just like the code above naturally! A flattening layer is almost always followed by a dense layer or two to expand the model, and then a final classification layer.
    

</details>
</details>

In [None]:
## Actually train model
epochs = 10
history = vgg_model.fit(train_generator, # fit_generator is deprecated; fit accepts generators directly
                    steps_per_epoch= train_steps,
                    validation_data= valid_generator,
                    validation_steps= validation_steps,
                    epochs= epochs
        )

In [None]:
## Plot results
plot_history(history, "VGG16")

In [None]:
# plot confusion matrix
cnn_helper.plot_confusion_matrix_from_generator(vgg_model, valid_generator)

### Q: What is your worst performing class in this classifier? Is it the same as in the other ones?

### Q: How many layers with 10 filters of size 3*3 would you have to add to the first CNN model we designed to achieve the same number of parameters?  

# Try some more models.

Try other optimizers, learning rates, batch sizes or numbers of epochs. Which would you like to try first and why? Show at least 4 models.
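One way to keep track of what you try is to write the combinations down before training anything. The sketch below only builds such a list; the values and the number of training images are placeholders, not recommendations. It also shows how the batch size changes the number of steps per epoch, using the same formula as the generator cells above.

```python
from itertools import product

learning_rates = [1e-3, 1e-4]      # placeholder values
batch_sizes = [8, 32]              # placeholder values
n_train = 1000                     # hypothetical number of training images

configs = []
for learning_rate, batch_size in product(learning_rates, batch_sizes):
    # Same rule as train_steps above: whole batches per epoch, but at least one
    steps_per_epoch = max(n_train // batch_size, 1)
    configs.append({
        "learning_rate": learning_rate,
        "batch_size": batch_size,
        "steps_per_epoch": steps_per_epoch,
    })

for config in configs:
    print(config)
```

Changing one thing at a time makes it easier to see which change caused a difference in the curves.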


# Finally test your best model

In [None]:
test_steps=test_generator.n//test_generator.batch_size if test_generator.n >= test_generator.batch_size else 1

pred=unknown_model.predict(test_generator, ## replace unknown_model with your best model (predict_generator is deprecated)
steps=test_steps,
verbose=1)

In [None]:
cnn_helper.plot_confusion_matrix_from_generator(unknown_model, test_generator) ## replace unknown_model with your best model

# ANN 

Make a neural network without any convolutions that achieves at least 90% accuracy on the validation set. It will be possible with the techniques you have used above.

In [None]:
## Set up the model architecture
inputs = keras.Input(shape= (32,32,1))
x = layers.Flatten()(inputs) # Dense layers expect one flat vector per image
# Extend your model here (at least)
outputs = layers.Dense(5, activation="softmax")(x)

## Optional

Try using different proportions for training, validation and test. How does this affect your results? Why?
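Keep in mind how the fractions are applied in the splitting cell earlier in the notebook: the test fraction is taken from all rows, and the validation fraction from what is left after that. The sketch below works out the resulting set sizes for a hypothetical dataset of 1000 rows. It ignores the per-class rounding that the grouped sampling introduces, and the function name is made up for this example.

```python
def split_sizes(n_total, test_fraction, validation_fraction):
    """Sizes of the train, validation and test sets when the splits are applied one after the other."""
    n_test = round(n_total * test_fraction)
    n_remaining = n_total - n_test
    n_valid = round(n_remaining * validation_fraction)   # fraction of the remainder, not of the total
    n_train = n_remaining - n_valid
    return n_train, n_valid, n_test

n_train, n_valid, n_test = split_sizes(1000, test_fraction=0.1, validation_fraction=0.2)
print(n_train, n_valid, n_test)
```

With the fractions used in this notebook, the validation set is therefore 18% of all rows rather than 20%.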