# Artificial Intelligence for Beginners
## Introduction
![](https://raw.githubusercontent.com/GNS-Science/AI_workshop_GNS_staff_conference/christof/notebook_images/AI_ML_DL.png)

## Image Classification
![](https://raw.githubusercontent.com/GNS-Science/AI_workshop_GNS_staff_conference/christof/notebook_images/CNN_scheme.png)

## Today's Exercise

Image classification of 3 types of fossils with a deep Convolutional Neural Network (CNN)

1. Design our own model
2. Tune the architecture of a model 
3. Use a pre-trained model and transfer learning

![](https://raw.githubusercontent.com/GNS-Science/AI_workshop_GNS_staff_conference/christof/notebook_images/mf_composite.png)

To follow along, open this link: **http://bit.ly/3JsfxV0** or scan the QR code

![](https://raw.githubusercontent.com/GNS-Science/AI_workshop_GNS_staff_conference/christof/notebook_images/QR_code.png)


## Fossil Classification with Deep Learning
* The micro-fossil data is provided by the Paleontology team, GNS Science.
* We will train and evaluate a deep convolutional neural network (CNN) to classify images of fossils
* To speed up computation we have selected 3 out of 14 different classes

### 1. Simple Convolutional Neural Network

First, we will:
1. Create a simple CNN
2. Train it on a subset of the data (the training dataset)
3. Evaluate its performance on the rest of the dataset (the testing dataset)

**Note**: `nn_utils.py` is a Python script with some helper functions to keep this demo brief. Have a look at it by double-clicking on it in the files tab.

**Let's start!**


#### Load [TensorFlow](https://www.tensorflow.org/) and install the extra packages used to code the CNN

In [None]:
# -- select TensorFlow 2.x (Colab-only magic)
%tensorflow_version 2.x
import tensorflow as tf
print(tf.__version__)
from tensorflow import keras
# -- install Keras Tuner (used for the hyperparameter search) and gdown
!pip install -q -U keras-tuner gdown

#### Get the training data and the Python script with the helper functions from GitHub

In [None]:
!wget -q https://raw.githubusercontent.com/GNS-Science/AI_workshop_GNS_staff_conference/main/ImageClassesM.zip

In [None]:
!wget -q https://raw.githubusercontent.com/GNS-Science/AI_workshop_GNS_staff_conference/main/nn_utils.py

#### Unzip the training data

In [None]:
!unzip ImageClassesM.zip > /dev/null

In [None]:
# You can also upload files directly from your computer
# by using the following code
# from google.colab import files
# uploaded =  files.upload()

#### Load the helper functions module

In [None]:
import nn_utils
# -- function to split a dataset into a training and a testing dataset
from sklearn.model_selection import train_test_split


#### Import the images and store them in a `NumPy` array, along with an array of class labels

In [None]:
# -- read data, format images into an Array and encode the classes
input_dir = "/content/ImageClassesM"
ImageArray, Labels = nn_utils.read_images(images_dir=input_dir)
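
The internals of `nn_utils.read_images` are not shown here, but since `y_train.shape[1]` is later used as the number of classes, the labels are presumably one-hot encoded. A minimal sketch of that encoding in plain Python follows; the class names and the `one_hot` helper are hypothetical and only serve the illustration.

```python
# -- hypothetical class names; the real ones are defined by the dataset
class_names = ["class_a", "class_b", "class_c"]


def one_hot(label, classes):
    """Return a list with 1.0 at the position of `label` and 0.0 elsewhere."""
    return [1.0 if name == label else 0.0 for name in classes]


labels = ["class_b", "class_a", "class_c", "class_b"]
encoded = [one_hot(label, class_names) for label in labels]

# -- each row has exactly one 1.0, at the index of its class
for label, row in zip(labels, encoded):
    print(label, row)
```

Each image then gets one row with as many entries as there are classes, which is also the shape of the probability vector the model outputs.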

#### Split the dataset into *training* and *testing*.

This dataset, like many others, is not balanced, i.e. there are more images of certain classes than others. If not taken into account, this could lead to a biased model. When splitting the dataset, we will use *Stratified Sampling* so that each class is represented in the same proportion in both the training and testing subsets.

In [None]:
X_train, X_test, y_train, y_test = \
    train_test_split(ImageArray, Labels, test_size=0.2, random_state=42, stratify=Labels)
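
To see what stratification does, here is a small plain-Python sketch of the principle on toy labels. It illustrates the idea only and is not how scikit-learn implements `stratify`; the function name `stratified_split` and the 60/30/10 class counts are made up for the example.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Split sample indices so each class keeps the same share in both subsets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # -- take the same fraction of every class for the test subset
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# -- toy, deliberately imbalanced labels: 60 / 30 / 10 samples per class
labels = ["a"] * 60 + ["b"] * 30 + ["c"] * 10
train_idx, test_idx = stratified_split(labels)

print("train:", Counter(labels[i] for i in train_idx))
print("test: ", Counter(labels[i] for i in test_idx))
```

Both subsets keep the 60/30/10 proportion of the full toy dataset, so the rare class is not accidentally left out of the test set.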

#### Create our simple CNN model and use the training dataset to teach the model to recognize fossils.

During training, it is common to hold out a small portion of the training dataset to *validate* the model as it trains. This way, the accuracy reported after each epoch is measured on images the model has not been trained on. The `batch_size` controls the number of images fed to the model at a time, and the number of `epochs` is the number of times the model goes through the whole training dataset. We will also plot a summary of the training to see how it went.
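
As a rough illustration of how `validation_split`, `batch_size` and `epochs` interact, the sketch below counts the images used for training and validation and the number of weight updates. The dataset size of 800 is a hypothetical value, not the size of the fossil dataset, and `training_schedule` is a helper written for this example only.

```python
import math


def training_schedule(n_samples, validation_split, batch_size, epochs):
    """Return sample counts and the number of weight updates for a Keras-style fit."""
    # -- Keras holds out the last fraction of the samples for validation
    n_train = math.floor(n_samples * (1.0 - validation_split))
    n_val = n_samples - n_train
    # -- the last batch of an epoch may be smaller, hence the ceiling
    steps_per_epoch = math.ceil(n_train / batch_size)
    return {
        "n_train": n_train,
        "n_val": n_val,
        "steps_per_epoch": steps_per_epoch,
        "total_updates": steps_per_epoch * epochs,
    }


# -- 800 is a hypothetical training-set size
schedule = training_schedule(
    n_samples=800, validation_split=0.2, batch_size=32, epochs=25)
print(schedule)
```

With these numbers, 640 images are used for training and 160 for validation, each epoch consists of 20 batches, and the weights are updated 500 times over 25 epochs.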

In [None]:
image_shape = ImageArray.shape[1:] # dimension of one image -> input
number_of_class = y_train.shape[1] # number of classes, i.e. size of the predicted probability distribution -> output

model, summary = nn_utils.CNNbase_model(
        input_shape=image_shape, n_class=number_of_class)

# -- training and evaluating the model with a subset of unseen data
history = model.fit(
    x=X_train, y=y_train, batch_size=32, validation_split=0.2, epochs=25)

nn_utils.summarize_diagnostics(history=history, filename="/content/base_summary.jpg")
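
`nn_utils.CNNbase_model` hides the model definition. To give a rough idea of what a small CNN looks like in Keras, here is a minimal sketch. The layer sizes and the name `build_simple_cnn` are illustrative assumptions, not the contents of `nn_utils.py`, and the function returns only the model, whereas the helper returns a `(model, summary)` pair.

```python
def build_simple_cnn(input_shape, n_class):
    """Small CNN: two convolution/pooling stages followed by a dense classifier."""
    # -- imported here so the function can be defined before TensorFlow is loaded
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=input_shape),
        # -- feature extraction: convolutions detect local patterns,
        # -- pooling halves the spatial resolution
        keras.layers.Conv2D(32, kernel_size=3, activation="relu"),
        keras.layers.MaxPooling2D(pool_size=2),
        keras.layers.Conv2D(64, kernel_size=3, activation="relu"),
        keras.layers.MaxPooling2D(pool_size=2),
        # -- classification: flatten the feature maps and predict class probabilities
        keras.layers.Flatten(),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(n_class, activation="softmax"),
    ])
    # -- categorical cross-entropy assumes one-hot encoded labels
    model.compile(
        optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```

It could be called as `build_simple_cnn(input_shape=image_shape, n_class=number_of_class)` and trained with the same `model.fit` call as above.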

In [None]:
eval_result = model.evaluate(x=X_test, y=y_test)
print(f"""
        Evaluation of test data [test loss, test accuracy] is {eval_result}.
       """)

# -- try on some examples
X_sample, y_sample = nn_utils.sample_dataset(X=X_test, y=y_test, size=9, random_state=42)
y_pred = model.predict(x=X_sample)
nn_utils.plot_prediction(x=X_sample, y_true=y_sample, y_pred=y_pred, filename="/content/base_prediction.jpg")
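
`model.predict` returns one probability vector per image; the predicted class is the position of the largest probability. The sketch below shows this decoding and the resulting accuracy on made-up values. The class names and probabilities are hypothetical and are not outputs of the model above.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


# -- hypothetical class names and made-up probabilities, for illustration only
class_names = ["class_a", "class_b", "class_c"]
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [
    [0.80, 0.15, 0.05],
    [0.30, 0.60, 0.10],
    [0.45, 0.35, 0.20],  # wrong: class_a is predicted, class_c is expected
    [0.10, 0.70, 0.20],
]

true_idx = [argmax(row) for row in y_true]
pred_idx = [argmax(row) for row in y_pred]
accuracy = sum(t == p for t, p in zip(true_idx, pred_idx)) / len(true_idx)

for t, p in zip(true_idx, pred_idx):
    print(f"true={class_names[t]:8s} predicted={class_names[p]:8s}")
print(f"accuracy = {accuracy:.2f}")
```

Three of the four toy predictions match the true class, which gives an accuracy of 0.75.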


Often the first model (CNN) does not perform well. This could be due to multiple factors, e.g. the training dataset is too small, the architecture of the model is too simple, etc.

### 2. Model optimization and hyperparameter tuning

We know that one of the limitations of the previous model is its architecture. Optimizing the architecture and tuning the hyperparameters (*hypertuning*) of a Machine Learning model is almost an art in itself. You can find more details on the Keras Tuner [here!](https://www.tensorflow.org/tutorials/keras/keras_tuner).

Here, the parameters to be optimized include, for example, the size of the window (kernel) used for the 2D convolution in the first half of the model and the number of hidden layers in the second half of the model.
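
To make the idea of a search space more concrete, here is a minimal sketch of a model-building function for the Keras Tuner. The hyperparameter names match those printed after the search (`num_layers`, `learning_rate`, `conv_filter`, `conv_kernel`), but the ranges, the layer layout and the default `input_shape` are assumptions and may differ from what `nn_utils.hyper_tuning` does.

```python
def build_model(hp, input_shape=(64, 64, 3), n_class=3):
    """Build a CNN whose architecture is controlled by the tuner's `hp` object."""
    # -- imported here so the function can be defined before TensorFlow is loaded
    from tensorflow import keras

    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))  # hypothetical default image size
    # -- first half: number of filters and window (kernel) size are tunable
    model.add(keras.layers.Conv2D(
        filters=hp.Choice("conv_filter", values=[16, 32, 64]),
        kernel_size=hp.Choice("conv_kernel", values=[3, 5]),
        activation="relu",
    ))
    model.add(keras.layers.MaxPooling2D(pool_size=2))
    model.add(keras.layers.Flatten())
    # -- second half: the number of hidden dense layers is tunable
    for _ in range(hp.Int("num_layers", min_value=1, max_value=3)):
        model.add(keras.layers.Dense(64, activation="relu"))
    model.add(keras.layers.Dense(n_class, activation="softmax"))

    model.compile(
        optimizer=keras.optimizers.Adam(
            learning_rate=hp.Choice("learning_rate", values=[1e-2, 1e-3, 1e-4])),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# -- possible usage with Keras Tuner (not run here):
# import keras_tuner as kt
# tuner = kt.RandomSearch(build_model, objective="val_accuracy", max_trials=5)
```

For every trial, the tuner calls the function with a different combination of values, trains the resulting model and records its validation accuracy.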

In [None]:
image_shape = ImageArray.shape[1:] # dimension of one image -> input
number_of_class = y_train.shape[1] # number of classes, i.e. size of the predicted probability distribution -> output

tuner = nn_utils.hyper_tuning(input_shape=image_shape, n_class=number_of_class)
tuner.search(X_train, y_train, validation_split=0.2)
tuner.results_summary()

best_hps = tuner.get_best_hyperparameters(1)[0]
print(f"""
        The hyperparameter search is complete. The optimal number of
        layers is {best_hps.get('num_layers')} and the optimal learning rate for the optimizer
        is {best_hps.get('learning_rate')}. Best Conv. Filter: {best_hps.get('conv_filter')}.
        Best Conv. Kernel: {best_hps.get('conv_kernel')}.
    """)

Once the tuner is done, we get a set of hyperparameters that correspond to the best results on the validation dataset, which is here 20% of the training dataset. We can then train a model with these hyperparameters and evaluate our new model! 

In [None]:
best_model = tuner.hypermodel.build(best_hps)
history = best_model.fit(
    x=X_train, y=y_train, batch_size=32, validation_split=0.2, epochs=10)

# -- testing the model on the held-out testing dataset
eval_result = best_model.evaluate(x=X_test, y=y_test)
print(f"""
            Evaluation of test data [test loss, test accuracy] is {eval_result}.
        """)

# -- plot summary 
nn_utils.summarize_diagnostics(history=history, filename="/content/tuned_summary.jpg")

Let's look at some examples.

In [None]:
X_sample, y_sample = nn_utils.sample_dataset(X=X_test, y=y_test, size=9, random_state=42)
y_pred = best_model.predict(x=X_sample)
nn_utils.plot_prediction(x=X_sample, y_true=y_sample, y_pred=y_pred, filename="/content/tuned_prediction.jpg")


Still not very convincing, right? Remember that it might not be **just** the architecture: most likely, our training dataset is too limited. In the next exercise, we will see how to use a pre-trained model and transfer learning to obtain a highly accurate model for classifying our images.

### 3. Transfer learning and how to use a pre-trained model

Another limitation of today's exercises is the small number of fossil images. Fortunately, many models have already been trained for image classification. While there might not be a model trained specifically to classify fossil images, we can still make use of a model trained on other types of images. We are going to use the **ResNet50** model, a 50-layer-deep CNN trained on more than a million labelled images, such as balloons or strawberries, from https://www.image-net.org/. To use it for classifying fossils, we rely on a principle called *Transfer Learning*: we take a model trained on one dataset and adapt it to another.

In practice, this consists of *replacing* the last part of the CNN, the fully connected layer that does the classification based on the features extracted by the convolutional layers, with a new fully connected layer that needs to be trained.
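
A minimal sketch of this replacement in Keras could look as follows. It is an assumption about the general approach, not the contents of `nn_utils.CNNtrained_model`: the name `build_transfer_model`, the pooling layer and the optimizer are choices made for the illustration.

```python
def build_transfer_model(input_shape, n_class, weights="imagenet"):
    """ResNet50 feature extractor with frozen weights and a new trainable classifier."""
    # -- imported here so the function can be defined before TensorFlow is loaded
    from tensorflow import keras

    # -- include_top=False drops the original ImageNet classification layer;
    # -- with ImageNet weights, images need 3 channels and at least 32x32 pixels
    base = keras.applications.ResNet50(
        include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # keep the pre-trained convolutional filters fixed

    inputs = keras.Input(shape=input_shape)
    # -- ResNet50 expects inputs prepared with
    # -- keras.applications.resnet50.preprocess_input (not applied in this sketch)
    x = base(inputs, training=False)
    x = keras.layers.GlobalAveragePooling2D()(x)
    # -- the new fully connected layer, the only part that gets trained
    outputs = keras.layers.Dense(n_class, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```

Because the base model is frozen, only the weights of the final `Dense` layer are updated during training, which is why a small dataset can be enough.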

In [None]:
# -- 3.1 init of the Model with new top layer
image_shape = ImageArray.shape[1:] # dimension of one image -> input
number_of_class = y_train.shape[1] # number of classes, i.e. size of the predicted probability distribution -> output
model = nn_utils.CNNtrained_model(input_shape=image_shape, n_class=number_of_class)

# -- 3.2 training and evaluating the model with a subset of unseen data (validation)
history = model.fit(x=X_train, y=y_train, batch_size=32, validation_split=0.2, epochs=15)

# -- 3.3 testing the model
eval_result = model.evaluate(x=X_test, y=y_test)
print(f"""
            Evaluation of test data [test loss, test accuracy] is {eval_result}.
        """)

# -- 3.4 plot summary 
nn_utils.summarize_diagnostics(history=history, filename="/content/pretrained_summary.jpg")

It's a win! By using Transfer Learning, we clearly see an improvement. Let's have a look at some examples.

In [None]:
# -- 3.5 Show some examples
X_sample, y_sample = nn_utils.sample_dataset(X=X_test, y=y_test, size=9, random_state=42)
y_pred = model.predict(x=X_sample)
nn_utils.plot_prediction(x=X_sample, y_true=y_sample, y_pred=y_pred, filename="/content/pretrained_prediction.jpg")

Not a single mistake! 
This new model for classifying images of fossils is much more accurate than our first models thanks to the **ResNet50** pre-trained model. In practice, we benefit from the already well-trained filters in the convolutional layers, which extract meaningful and accurate features. In fact, Transfer Learning is becoming common: [a good example](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2021JB021910?utm_sq=gqzx3u0l8q) is a model trained to pick seismic wave arrivals in Southern California, which was used to create a model specifically trained for picking events at Nabro Volcano, a stratovolcano in Eritrea.

### Conclusion

* Colab allows us to get started with Python and Machine Learning without having to install anything
* Setting up a machine learning model is like combining Lego bricks thanks to many available Python packages
* Transfer learning is a great option when the datasets are small
* Training Deep Learning models from scratch requires a lot of data
