# Introduction 

This project is inspired from the competition entitled **Microsoft Rice Disease Classification Challenge** on **Zindi** plateform :

https://zindi.africa/competitions/microsoft-rice-disease-classification-challenge


Dataset source : https://zindi.africa/competitions/microsoft-rice-disease-classification-challenge/data 


The architecture of the following developed model is inspired from the following article

                       **Image Classification Transfer Learning and Fine Tuning using TensorFlow**
                       
https://towardsdatascience.com/image-classification-transfer-learning-and-fine-tuning-using-tensorflow-a791baf9dbf3

Overall, in terms of technology stack for our deepl learning model building, we use :

      Tensorflow, 
      TF dataset, 
      CNN(Convolutional Neural Networks), 
      data augmentation, 
      learning transfer,
      pretrained model : EfficientNetB0,
      Fine tuning.           

**Transfer learning : définition**  (https://datascientest.com/transfer-learning )

Commençons d’abord par définir ce terme de plus en plus utilisé en Data Science.

Le Transfer Learning, ou apprentissage par transfert en français, désigne l’ensemble des méthodes qui permettent de transférer les connaissances acquises à partir de la résolution de problèmes donnés pour traiter un autre problème. 

Le Transfer Learning a connu un grand succès avec l’essor du Deep Learning.  En effet, bien souvent, les modèles utilisés dans ce domaine nécessitent des temps de calcul élevés et des ressources importantes. Or, en utilisant des modèles pré-entraînés comme point de départ, le Transfer Learning permet de développer rapidement des modèles performants et résoudre efficacement des problèmes complexes en Computer Vision ou Natural Language Processing, NLP.

**Transfer learning vs Fine tuning** : https://www.youtube.com/watch?v=3nbin3bT8ec & https://www.youtube.com/watch?v=h_rz7_sIV40 (very interesting course) 

Weights & Biases est un outil qui permet de suivre et facilement enregistrer les performances de modèles deep learning. 
(https://machine-learning.paperspace.com/wiki/weights-and-biases)
(https://www.youtube.com/watch?v=EeqhOSvNX-A)

https://siecledigital.fr/2019/12/20/weights-biases-loutil-pour-suivre-les-performances-de-vos-modeles-de-deep-learning/
https://siecledigital.fr/2019/01/30/differences-intelligence-artificielle-machine-learning-deep-learning/


## Import all the Dependencies

In [1]:
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import random
import shutil
from pathlib import Path
from glob import glob
import os
import warnings
import torch

from tensorflow.keras import models, layers
from tensorflow import expand_dims
from PIL import Image
from tensorflow.keras.applications import EfficientNetB0
from keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import SparseCategoricalCrossentropy

from keras.models import load_model
from PIL import Image


In [2]:
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        tf.config.experimental.set_virtual_device_configuration(
        gpus[0],[tf.config.experimental.VirtualDeviceConfiguration(memory_limit=5120)])
    except RuntimeError as e:
            print(e)

## Data collection & pre-processing step (using Tensorflow dataset)

In [3]:
train_data_csv = pd.read_csv('../input/rgb-csv/RGB_train_data.csv')
train_data_csv

### Set all the Constants

In [4]:
batch_size_inst = 5
image_size_inst = (224,224)

### Import data into tensorflow dataset object

We will use image_dataset_from_directory api to load all images in tensorflow dataset: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory

In [5]:
# Train images path
path = '../input/rgb-images/RGB_images'
dataset = tf.keras.preprocessing.image_dataset_from_directory(
    path,
    shuffle = True,
    batch_size = batch_size_inst,
    image_size = image_size_inst,
    follow_links=False
    )

In [6]:
train_classes = dataset.class_names
train_classes

In [7]:
len(dataset)

In [8]:
for image_batch, labels_batch in dataset.take(1):
    print(image_batch.shape)
    print(labels_batch.numpy())

As you can see above, each element in the dataset is a tuple. First element is a batch of 5 elements of images. Second element is a batch of 32 elements of class labels

### Visualize some of the images from our dataset

In [9]:
plt.figure(figsize=(15,15))
for image_batch , label_batch in dataset.take(2):
    for i in range(5):
        ax= plt.subplot(2,3,i+1)
        plt.imshow(image_batch[i].numpy().astype('uint8'))
        plt.title(train_classes[label_batch[i]])
        plt.axis('off')

Since neural networks work best with float32 or float16 datatypes, we need to convert the uint8 datatype of the images to float32 datatype.

In [10]:
def data_type_images(image, label):
    image = tf.cast(image, dtype = tf.float32)
    return image, label

In [11]:
dataset.element_spec

Great! Now all the images are of size (224, 224, 3) and the datatypes are of the float32 datatype. The None shape indicates that the images and labels have been split into batches which we specified earlier.

Oh and another thing we could do to increase our processing speed and thereby decrease the time taken to train a neural network in the process is by implementing the mixed_precision training from the keras module. Mixed precision is the use of both 16-bit and 32-bit floating-point types in a model during training to make it run faster and use less memory. The Keras mixed precision API allows us to use a mix of float16 with float32, to get the performance benefits from float16 and the numeric stability benefits from float32.

In [12]:
from tensorflow.keras import mixed_precision
mixed_precision.set_global_policy('mixed_float16')

### Function to Split Dataset

Dataset should be bifurcated into 3 subsets, namely:

    1) Training: Dataset to be used while training
    2) Validation: Dataset to be tested against while training
    3) Test: Dataset to be tested against after we trained a model

In [13]:
def get_dataset_partitions_tf(ds, train_split=0.8, val_split=0.1, test_split=0.1, shuffle=True, shuffle_size=1000):
    
    assert (train_split + test_split + val_split) == 1
    
    ds_size = len(ds)
    
    if shuffle:
        ds = ds.shuffle(shuffle_size, seed=12)
        
    train_size = int(train_split * ds_size)
    
    val_size = int(val_split * ds_size)
    
    train_ds = ds.take(train_size)
    
    val_ds = ds.skip(train_size).take(val_size)
    
    test_ds = ds.skip(train_size).skip(val_size)
    
    return train_ds, val_ds, test_ds

In [14]:
train_ds, val_ds, test_ds = get_dataset_partitions_tf(dataset)

In [15]:
train_ds = train_ds.map(data_type_images)
val_ds = val_ds.map(data_type_images)
test_ds = test_ds.map(data_type_images)

In [16]:
len(train_ds)

In [17]:
len(val_ds)

In [18]:
len(test_ds)

### Cache, Shuffle, and Prefetch the Dataset

In [19]:
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size = tf.data.AUTOTUNE)
val_ds = val_ds.cache().shuffle(1000).prefetch(buffer_size = tf.data.AUTOTUNE)
test_ds = test_ds.cache().shuffle(1000).prefetch(buffer_size = tf.data.AUTOTUNE)

# Building the Model

## Data Augmentation

Data Augmentation is needed when we have less data, this boosts the accuracy of our model by augmenting the data and avoid overfitting.

In [20]:
data_augmentation = tf.keras.Sequential([
    layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    layers.experimental.preprocessing.RandomRotation(0.2),
    layers.experimental.preprocessing.RandomContrast(0.2),  
    layers.experimental.preprocessing.RandomZoom(0.8),      
])

## Model Architecture

Before we start with the actual transfer learning and fine-tuning, we will create a ModelCheckpoint callback which will create a checkpoint of our model which we can later revert back to, if needed. Here we save only the weights of the model and not the whole model itself as it can take some time. The validation accuracy is the metric being monitored here and we also set the save_best_only parameter to True and so the callback will only the save the weights of the model which has led to the highest validation accuracy.

In [21]:
def model_checkpoint(directory, name):
    log_dir = directory + "/" + name
    m_c = tf.keras.callbacks.ModelCheckpoint(filepath=log_dir,
                                             monitor="val_accuracy",
                                             save_best_only=True,
                                             save_weights_only=True,
                                             mode= 'max',
                                             verbose=2)
    return m_c

One regularization technique as augmentation data is Early-Stopping callback techniques. It can be used to reduce over-fitting and maybe improve our model’s accuracy scores on unseen test data. 

In [22]:
earlystop = EarlyStopping(
monitor='val_accuracy',
min_delta = 0.001,
patience = 10,
verbose = 2,
mode = 'auto')

Okay now that we have preprocessed our data, visualized and created some handy callbacks, we move on the the acutal transfer learning part! We create the base model with **tf.keras.applications.efficientnet** and choose **EfficientNetB0** from the list of available EfficientNet prebuilt models. We also set the **include_top** parameter to **False** as we will be using our own output classifier layer suited to rice disease classification application. Since we are just **feature extracting** and **not fine-tuning** at this point, we will set **base_model.trainable = False.**

In [23]:
base_model = tf.keras.applications.efficientnet.EfficientNetB0(include_top=False)
base_model.trainable = False

Now that we have our feature extraction base model ready, we will use the Keras Functional API to create our model which include the base model as functional layer for our feature extraction training model. The input layer is first created with the shape set to (224, 224, 3). The base model layer is then instantiated by passing the input layer while also setting the training parameter to False as we are not into fine-tuning the model yet. 

Next, the GlobalAveragePooling2D layer is set up to perform the pooling operation for our convolutional neural network. After that, a small change is done where this model differs from the conventional CNN, where the output layer is split separately into the Dense and Activation layers. Usually for any classification, the softmax activation can be included along with the output Dense layer. But here, since we have implemented mixed_precision training, we pass the Activation layer separately at the end as we want the output from the softmax activation to be of the float32 datatype which will rid us of any stability errors by maintaining numeric stability. Finally, the model is created by using the Model() function.

In [24]:
from tensorflow.keras import layers

inputs = layers.Input(shape = (224,224,3), name='inputLayer')
x = data_augmentation(inputs)
x = base_model(x, training = False)
x = layers.GlobalAveragePooling2D(name='poolingLayer', keepdims= False)(x)
x = layers.Dense(3, name='outputLayer')(x)
outputs = layers.Activation(activation="softmax", dtype=tf.float32, name='activationLayer')(x)

model = tf.keras.Model(inputs, outputs, name = "FeatureExtractionModel")

Now that we have built the model with just feature extraction included, we use the summary() method to get a look at the architecture of our feature extraction model:

In [25]:
model.summary()

We see that the whole base model (EfficientNetB0) is implemented as a single functional layer in our feature extraction model. We also see the other layers created for building our model. The base model contains about 4 million parameters but they are all non-trainable since we have frozen the base model as we are not fine tuning but rather only extracting the features from the base model. The only trainable parameters come from the output Dense layer.

Next, we get a detailed look at the various layers in our feature extraction model, their names, see if they are trainable or not, their data types and their corresponding data type policies.

In [26]:
for lnum, layer in enumerate(model.layers):
    print(lnum, layer.name, layer.trainable, layer.dtype, layer.dtype_policy)

0 inputLayer True float32 <Policy "float32">

1 sequential True float32 <Policy "mixed_float16">

2 efficientnetb0 False float32 <Policy "mixed_float16">

3 poolingLayer True float32 <Policy "mixed_float16">

4 outputLayer True float32 <Policy "mixed_float16">

5 activationLayer True float32 <Policy "float32">

We can see that all the layers excepting the base model layer are trainable. The datatype policies indicate that apart from the input layer and the output activation layer which have float32 datatype policy, all the other layers are compatible with and employ the mixed_float16 datatype policy.

Now we can move on to compiling and fitting our model. Since our labels are not one-hot encoded, we use the **SparseCategoricalCrossentropy()** loss function. We use the **Adam()** optimizer with its default learning rate and set the model metrics to measure accuracy.

Then we begin to fit our feature extraction model. We save the model fitting history variable into hist_model and we train our model on the train dataset for 10 epochs. We use our test dataset as the validation data while training. We also pass the previously created callback function ModelCheckpoint which saves our model's best weights while monitoring the validation accuracy.

In [27]:
warnings.filterwarnings('ignore')

In [28]:
model.compile(loss = tf.keras.losses.SparseCategoricalCrossentropy(),
              optimizer = tf.keras.optimizers.Adam(),
              metrics = ["accuracy"])

hist_model = model.fit(train_ds,
                       epochs = 10,
                       validation_data=val_ds,
                       callbacks=[earlystop,model_checkpoint("Checkpoints","model.ckpt")])

Epoch 1/10
2022-09-06 04:36:23.625739: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8005
427/427 [==============================] - 34s 27ms/step - loss: 0.8259 - accuracy: 0.6417 - val_loss: 0.7457 - val_accuracy: 0.6566

Epoch 00001: val_accuracy improved from -inf to 0.65660, saving model to Checkpoints/model.ckpt
Epoch 2/10
427/427 [==============================] - 7s 17ms/step - loss: 0.7212 - accuracy: 0.7002 - val_loss: 0.6471 - val_accuracy: 0.7585

Epoch 00002: val_accuracy improved from 0.65660 to 0.75849, saving model to Checkpoints/model.ckpt
Epoch 3/10
427/427 [==============================] - 7s 17ms/step - loss: 0.6677 - accuracy: 0.7274 - val_loss: 0.6288 - val_accuracy: 0.7660

Epoch 00003: val_accuracy improved from 0.75849 to 0.76604, saving model to Checkpoints/model.ckpt
Epoch 4/10
427/427 [==============================] - 8s 19ms/step - loss: 0.6463 - accuracy: 0.7377 - val_loss: 0.7031 - val_accuracy: 0.7170

Epoch 00004: val_accuracy did not improve from 0.76604
Epoch 5/10
427/427 [==============================] - 7s 17ms/step - loss: 0.6356 - accuracy: 0.7311 - val_loss: 0.6104 - val_accuracy: 0.7811

Epoch 00005: val_accuracy improved from 0.76604 to 0.78113, saving model to Checkpoints/model.ckpt
Epoch 6/10
427/427 [==============================] - 7s 17ms/step - loss: 0.6168 - accuracy: 0.7513 - val_loss: 0.6141 - val_accuracy: 0.7585

Epoch 00006: val_accuracy did not improve from 0.78113
Epoch 7/10
427/427 [==============================] - 7s 16ms/step - loss: 0.6228 - accuracy: 0.7504 - val_loss: 0.5796 - val_accuracy: 0.7849

Epoch 00007: val_accuracy improved from 0.78113 to 0.78491, saving model to Checkpoints/model.ckpt
Epoch 8/10
427/427 [==============================] - 8s 18ms/step - loss: 0.6187 - accuracy: 0.7518 - val_loss: 0.5536 - val_accuracy: 0.7811

Epoch 00008: val_accuracy did not improve from 0.78491
Epoch 9/10
427/427 [==============================] - 8s 18ms/step - loss: 0.5834 - accuracy: 0.7766 - val_loss: 0.5685 - val_accuracy: 0.7887

Epoch 00009: val_accuracy improved from 0.78491 to 0.78868, saving model to Checkpoints/model.ckpt
Epoch 10/10
427/427 [==============================] - 7s 17ms/step - loss: 0.5716 - accuracy: 0.7738 - val_loss: 0.5622 - val_accuracy: 0.7774

Epoch 00010: val_accuracy did not improve from 0.78868

We have also created a model checkpoint with the model weights leading to the highest validation accuracy incase we want to revert back to it after fine-tuning in the next stage. Let us evaluate our model on the whole test data now and store it in a variable.

In [29]:
model_results = model.evaluate(test_ds)

54/54 [==============================] - 3s 12ms/step - loss: 0.5355 - accuracy: 0.7963

Not bad, not bad at all. 79.63% accuracy is pretty decent for just a feature extraction model.

We have got an accuracy of about 79.63% with our feature extraction model based on the EfficientNetB0 base model. How about we try to increase the accuracy score by fine-tuning our feature extraction model now. To do this, we set the layers of our model’s base_model (the EfficientNetB0 functional layer) to be trainable and unfreeze the previously frozen base model.

In [30]:
base_model.trainable = True

However, The BatchNormalization layers need to be kept frozen. If they are also turned to trainable, the first epoch after unfreezing will significantly reduce accuracy. Since all the layers in the base model have been set to trainable, we use the following code block to again freeze only the BatchNormalization layers of our base model:

In [31]:
for layer in model.layers[1].layers:
    if isinstance(layer, layers.BatchNormalization):
        layer.trainable = False

A quick check on the base model’s layers, their datatypes and policies and whether or not they’re trainable:

In [32]:
for lnum, layer in enumerate(model.layers[1].layers[-10:]):
    print(lnum, layer.name, layer.trainable, layer.dtype, layer.dtype_policy)

0 random_flip True float32 <Policy "mixed_float16">

1 random_rotation True float32 <Policy "mixed_float16">

2 random_contrast True float32 <Policy "mixed_float16">

3 random_zoom True float32 <Policy "mixed_float16">

Good! Looks like all the layers in the base model are trainable and ready for fine tuning except for the BatchNormalization layers. I think we are good to proceed and fine tune our feature extraction model 'model'. We begin by compiling the model again since we have change the layer attributes (unfreeze), but this time, as a general good practice, we reduce the default learning rate of the Adam() optimizer by ten times so as to reduce overfitting and stop the model from learning/memorizing the train data to a great extent.

Then we begin to fit our fine tuning model. We save the model fitting history variable into **hist_model_tuned** and we train our model on the train dataset for 20 epochs, however, we set the initial epoch to be the last epoch from our feature extraction model training which would be 10. Thus the fine tuning model training would start from epoch number 10 and would run upto 30 (20 epochs total). We again pass the previously created callback functions into the callbacks list : ModelCheckpoint which saves our model's best weights while monitoring the validation accuracy.

In [33]:
model.compile(loss = tf.keras.losses.SparseCategoricalCrossentropy(),
              optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001),
              metrics = ["accuracy"])
hist_model_tuned = model.fit(train_ds,
                             epochs=30,
                             validation_data=val_ds,
                             initial_epoch=hist_model.epoch[-1],
                             callbacks=[earlystop, model_checkpoint("Checkpoints", "model_tuned.ckpt")])

Epoch 10/30
427/427 [==============================] - 43s 69ms/step - loss: 0.6743 - accuracy: 0.7255 - val_loss: 0.6565 - val_accuracy: 0.7132

Epoch 00010: val_accuracy improved from -inf to 0.71321, saving model to Checkpoints/model_tuned.ckpt
Epoch 11/30
427/427 [==============================] - 28s 66ms/step - loss: 0.5399 - accuracy: 0.7855 - val_loss: 0.3512 - val_accuracy: 0.8679

Epoch 00011: val_accuracy improved from 0.71321 to 0.86792, saving model to Checkpoints/model_tuned.ckpt
Epoch 12/30
427/427 [==============================] - 28s 66ms/step - loss: 0.4539 - accuracy: 0.8262 - val_loss: 0.2976 - val_accuracy: 0.8717

Epoch 00012: val_accuracy improved from 0.86792 to 0.87170, saving model to Checkpoints/model_tuned.ckpt
Epoch 13/30
427/427 [==============================] - 28s 66ms/step - loss: 0.3888 - accuracy: 0.8473 - val_loss: 0.2136 - val_accuracy: 0.9321

Epoch 00013: val_accuracy improved from 0.87170 to 0.93208, saving model to Checkpoints/model_tuned.ckpt
Epoch 14/30
427/427 [==============================] - 28s 66ms/step - loss: 0.3683 - accuracy: 0.8618 - val_loss: 0.1768 - val_accuracy: 0.9509

Epoch 00014: val_accuracy improved from 0.93208 to 0.95094, saving model to Checkpoints/model_tuned.ckpt
Epoch 15/30
427/427 [==============================] - 28s 65ms/step - loss: 0.3067 - accuracy: 0.8890 - val_loss: 0.2299 - val_accuracy: 0.9170

Epoch 00015: val_accuracy did not improve from 0.95094
Epoch 16/30
427/427 [==============================] - 28s 65ms/step - loss: 0.2886 - accuracy: 0.8937 - val_loss: 0.1828 - val_accuracy: 0.9208

Epoch 00016: val_accuracy did not improve from 0.95094
Epoch 17/30
427/427 [==============================] - 29s 67ms/step - loss: 0.2596 - accuracy: 0.8993 - val_loss: 0.1641 - val_accuracy: 0.9396

Epoch 00017: val_accuracy did not improve from 0.95094
Epoch 18/30
427/427 [==============================] - 28s 65ms/step - loss: 0.2419 - accuracy: 0.9077 - val_loss: 0.1125 - val_accuracy: 0.9698

Epoch 00018: val_accuracy improved from 0.95094 to 0.96981, saving model to Checkpoints/model_tuned.ckpt
Epoch 19/30
427/427 [==============================] - 28s 67ms/step - loss: 0.2135 - accuracy: 0.9199 - val_loss: 0.1012 - val_accuracy: 0.9547

Epoch 00019: val_accuracy did not improve from 0.96981
Epoch 20/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1986 - accuracy: 0.9344 - val_loss: 0.1098 - val_accuracy: 0.9623

Epoch 00020: val_accuracy did not improve from 0.96981
Epoch 21/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1841 - accuracy: 0.9344 - val_loss: 0.0891 - val_accuracy: 0.9698

Epoch 00021: val_accuracy did not improve from 0.96981
Epoch 22/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1822 - accuracy: 0.9311 - val_loss: 0.1550 - val_accuracy: 0.9434

Epoch 00022: val_accuracy did not improve from 0.96981
Epoch 23/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1760 - accuracy: 0.9382 - val_loss: 0.1148 - val_accuracy: 0.9585

Epoch 00023: val_accuracy did not improve from 0.96981
Epoch 24/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1689 - accuracy: 0.9368 - val_loss: 0.0794 - val_accuracy: 0.9660

Epoch 00024: val_accuracy did not improve from 0.96981
Epoch 25/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1696 - accuracy: 0.9429 - val_loss: 0.0421 - val_accuracy: 0.9811

Epoch 00025: val_accuracy improved from 0.96981 to 0.98113, saving model to Checkpoints/model_tuned.ckpt
Epoch 26/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1276 - accuracy: 0.9532 - val_loss: 0.0523 - val_accuracy: 0.9774

Epoch 00026: val_accuracy did not improve from 0.98113
Epoch 27/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1479 - accuracy: 0.9443 - val_loss: 0.0794 - val_accuracy: 0.9774

Epoch 00027: val_accuracy did not improve from 0.98113
Epoch 28/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1464 - accuracy: 0.9433 - val_loss: 0.0928 - val_accuracy: 0.9736

Epoch 00028: val_accuracy did not improve from 0.98113
Epoch 29/30
427/427 [==============================] - 28s 66ms/step - loss: 0.1422 - accuracy: 0.9457 - val_loss: 0.1327 - val_accuracy: 0.9509

Epoch 00029: val_accuracy did not improve from 0.98113
Epoch 30/30
427/427 [==============================] - 28s 65ms/step - loss: 0.1404 - accuracy: 0.9475 - val_loss: 0.0960 - val_accuracy: 0.9623

Epoch 00030: val_accuracy did not improve from 0.98113

Okay great! Looks like the validation accuracy has improved a lot. Let’s evaluate our fine-tuned model on the whole test data now and store it in a variable.

In [34]:
model_tuned_results = model.evaluate(test_ds)

54/54 [==============================] - 1s 12ms/step - loss: 0.0694 - accuracy: 0.9778

In [37]:
model_tuned_results

model_tuned_results is just a list containing loss and accuracy value : [0.06944755464792252, 0.9777777791023254]

Would you look at that! The fine-tuned model’s accuracy has improved and is now at about 97.78%. That is a huge increase we have achieved with just transfer learning and fine-tuning. Thanks EfficientNet for your model!

Since we have created two history objects for both the feature extraction model and the fine-tuned model, we can create a function to plot both the histories together and compare the training and the validation metrics to get a general idea of the performance of both models:

In [35]:
def compare_histories(original_history, new_history, initial_epochs):
    """
    Compares two model history objects.
    """
    acc = original_history.history["accuracy"]
    loss = original_history.history["loss"]
    
    val_acc = original_history.history["val_accuracy"]
    val_loss = original_history.history["val_loss"]

    total_acc = acc + new_history.history["accuracy"]
    total_loss = loss + new_history.history["loss"]

    total_val_acc = val_acc + new_history.history["val_accuracy"]
    total_val_loss = val_loss + new_history.history["val_loss"]

    plt.figure(figsize=(9, 9))
    plt.subplot(2, 1, 1)
    plt.plot(total_acc, label='Training Accuracy')
    plt.plot(total_val_acc, label='Validation Accuracy')
    plt.plot([initial_epochs-1, initial_epochs-1],
              plt.ylim(), label='Start of Fine Tuning')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')

    plt.subplot(2, 1, 2)
    plt.plot(total_loss, label='Training Loss')
    plt.plot(total_val_loss, label='Validation Loss')
    plt.plot([initial_epochs-1, initial_epochs-1],
              plt.ylim(), label='Start of Fine Tuning')
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')
    plt.xlabel('epoch')
    plt.show()

Great! Now let us use both the hist_model and hist_model_tuned objects along the function with the initial_epochs set to 10 since we trained our first feature extraction model for 10 epochs.

In [36]:
compare_histories(hist_model, hist_model_tuned, initial_epochs=10)

Now that we have trained our fine-tuned model based on EfficientNetB0, we will use it to make predictions on the whole test dataset and store it in preds:

In [38]:
preds = model.predict(test_ds, verbose = 1)

Note: The predictions are stored in a collection of prediction probabilities for each class for an individual test image.

Now that we have the prediction probabilities for each class for a given test image, we can easily obtain the prediction labels using the tf.argmax() function which returns the index which contains the highest probability along a given axis. We store the prediction labels in pred_labels:

In [39]:
pred_labels = tf.argmax(preds, axis=1)
pred_labels[:10]

Now that we have made predictions and obtained the prediction labels, we can compare it with the test dataset labels to get detailed information on how well the fine-tuned model has made its predictions. We get the test labels from the test dataset using the following code block employing list comprehension:

In [40]:
test_labels = np.concatenate([y for x, y in test_ds], axis=0)
test_labels[:10]

Great! We now have both the predicted labels and the ground truth test labels. However to plot our prediction labels along with the associated image, we will need a dataset containing only the images extracted from test_data which contains both the images along with the labels. To do this, two steps are needed. First, we create an empty list and pass a for loop which extracts and appends each batch from the test dataset using the take() function to the newly created empty list test_image_batches. The -1 inside the take() function makes the for loop to pass through all the batches in the test data. The resulting output is a list of all the batches in the test data. Since the images are still in batches in the form of sublists inside the list test_image_batches, in step 2, we use list comprehension to extract each sublist, i.e each image from the main list and store them in the test_images list. 

The resulting list test_images contains all the 270 (=54*5) test images contained in the original test dataset which is verified by using the len() funtion on the list.

In [41]:
# Step 1
test_image_batches = []
for images, labels in test_ds.take(-1):
    test_image_batches.append(images.numpy())

# Step 2
test_images = [item for sublist in test_image_batches for item in sublist]
len(test_images)

Now that we have a list of test images(test_images), their ground truth test labels(test_labels) and their corresponding prediction labels(pred_labels) along with the prediction probabilities (preds), we can easily visualize the test images along with their predictions including the ground truth and prediction class names.

Here, we take 9 random images from test_images and plot them along with the ground truth class names and predicted class names in the plot title, side by side using indexing on class_names with the corresponding ground truth and predicted labels. We also display the prediction probabilities of the predictions on each image to get an idea of how confident the model is in its predictions. Moreover, to make the plots looke good and more informative, for wrong predictions, we set the title color to red, whereas for correct predictions by our fine-tuning model, we set the title color to green.

In [42]:
plt.figure(figsize = (20,20))
for i in range(9):
    random_int_index = random.choice(range(len(test_images)))
    plt.subplot(3,3,i+1)
    plt.imshow(test_images[random_int_index]/255.)
    if test_labels[random_int_index] == pred_labels[random_int_index]:
        color = "g"
    else:
        color = "r"
    plt.title("True Label: " + train_classes[test_labels[random_int_index]] + " || " + "Predicted Label: " +
              train_classes[pred_labels[random_int_index]] + "\n" + 
              str(np.asarray(tf.reduce_max(preds, axis = 1))[random_int_index]), c=color)
    plt.axis(False);

As we have both the test and prediction labels, we can also use the classification_report() function from SciKit-Learn library and store the precision, recall and f1-scores for each class in a dictionary report by specifying the output_dict parameter to be True.

In [43]:
from sklearn.metrics import classification_report
report = classification_report(test_labels, pred_labels, output_dict=True)

#check a small slice of the dictionary
import itertools
dict(itertools.islice(report.items(), 3))

{'0': {'precision': 0.5483870967741935,
  'recall': 0.551948051948052,
  'f1-score': 0.5501618122977345,
  'support': 154},
  
 '1': {'precision': 0.24675324675324675,
  'recall': 0.24050632911392406,
  'f1-score': 0.24358974358974356,
  'support': 79},
  
 '2': {'precision': 0.10526315789473684,
  'recall': 0.10810810810810811,
  'f1-score': 0.10666666666666667,
  'support': 37}}

Now that we have a dictionary containing the various evaluation metrics (precision, recall and f1-scores), how about we create a pandas dataframe which contains only the class names along with its corresponding F1-score. We choose the F1-score among the evaluation metrics as it achieves a balance between the precision and recall. You can read more about evaluation metrics here.

However, in order to create that required dataframe, we will need to extract the class labels(keys) and the corresponding F1-score(part of values) from the classification report dictionary. This can be achieved by creating an empty dictionary f1scores and passing a for loop which goes through the original dictionary report's items(keys and values) and appends the class names using key-indexing and the corresponding F1-score by value-indexing.

In [44]:
f1scores = {}
for k,v in report.items():
    if k == 'accuracy':
        break
    else:
        f1scores[train_classes[int(k)]] = v['f1-score']
        
#check a small slice of the dictionary
dict(itertools.islice(f1scores.items(), 5))

{'blast__rice': 0.5501618122977345,
 'brown__rice': 0.24358974358974356,
 'healthy__rice': 0.10666666666666667}

Now that we have the required dictionary ready, we can easily create a dataframe F1 form the following line of code and sort the dataframe by the descending order of the various classes' F1-scores:

In [45]:
F1 = pd.DataFrame({"Classes":list(f1scores.keys()),
                   "F1-Scores":list(f1scores.values())}).sort_values("F1-Scores", ascending=False)

#check a small slice of the dataframe
F1

In [46]:
fig, ax = plt.subplots(figsize = (6,4))
plt.barh(F1["Classes"], F1["F1-Scores"])
plt.ylim(-3,5)
plt.xlim(0,0.7)
plt.xlabel("F1-Scores")
plt.ylabel("Rice disease Classes")
plt.title("F1-Scores across various rice disease Classes")
plt.gca().invert_yaxis()
for i, v in enumerate(round(F1["F1-Scores"],3)):
    ax.text(v, i + .25, str(v), color='red');

As a final evaluation step for our transfer learning fine-tuned model based on the EfficientNetB0 let’s visualize the most wrong predictions of our model, i.e the false(wrong) predictions of our model where the prediction probability is the highest. This will help us get an idea about our model as to whether it is getting confused on similar food classes or whether a test image has been wrongly labelled which would be a data input error.

To do this, let’ss create a neat dataframe Predictions which contains the 'Image Index' of the various images in the test dataset, their corresponding 'Test Labels', 'Test Classes', 'Prediction Labels', 'Prediction Classes', and 'Prediction Probability'.

In [47]:
Predictions = pd.DataFrame({"Image Index" : list(range(len(test_labels))), 
                            "Test Labels" : list(test_labels), 
                            "Test Classes" : [train_classes[i] for i in test_labels],
                            "Prediction Labels" : list(np.asarray(pred_labels)),
                            "Prediction Classes" : [train_classes[i] for i in pred_labels],
                            "Prediction Probability" : [x for x in np.asarray(tf.reduce_max(preds, axis = 1))]})

As we have created the required dataframe Predictions, we can also create a new column 'Correct Prediction' which contains True for test images that have been correctly predicted by our model and False for wrongly predicted test images:

In [48]:
Predictions["Correct Prediction"] = Predictions["Test Labels"] == Predictions["Prediction Labels"]
Predictions

Great, our dataframe looks fantastic! However, this is not the final form of our dataframe. We want to subset our current dataframe with only those records where ‘Correct Prediction’ is equal to False as our aim is to plot and visualize the most wrong predictions of our fine-tuning model. We obtain our final dataframe using the following code block and sort the dataframe records in the descending order of the 'Prediction Probability':

In [50]:
Predictions = Predictions[Predictions["Correct Prediction"] == False].sort_values("Prediction Probability", ascending=False)
Predictions

In [51]:
indexes = np.asarray(Predictions["Image Index"][:9]) #choosing the top 9 records from the dataframe
plt.figure(figsize=(20,20))
for i, x in enumerate(indexes):
    plt.subplot(3,3,i+1)
    plt.imshow(test_images[x]/255)
    plt.title("True Class: " + train_classes[test_labels[x]] + " || " + "Predicted Class: " + train_classes[pred_labels[x]] + "\n" +
             "Prediction Probability: " + str(np.asarray(tf.reduce_max(preds, axis = 1))[x]))
    plt.axis(False);

Our final dataframe Predictions contains only the wrongly predicted image records (image indexes) along with their ground truth test labels, test classes, predicted labels, prediction classes sorted by the descending order of their prediction probabilities. We can manually take an image index out of this dataset and pass it into a plot function on the test dataset to obtain a visualization of the image with its ground truth class name and the predicted class name. However, here we will take the first 9 images or image indexes which are the most wrong images in our dataframe and plot it along with the true class names, the predicted class names and the corresponding prediction probabilities.

# Submission to Zindi plateforme


In [52]:
test = pd.read_csv('../input/data-test/Test.csv')
test = test[~test.Image_id.str.contains('_rgn')] # Just the RGB images

In [53]:
test_list = test['Image_id'].tolist()

In [54]:
submission = pd.DataFrame({'Image_id': test['Image_id']})

In [55]:
submission['blast'] = 0
submission['brown'] = 0
submission['healthy'] = 0

In [56]:
def convert_array(id_image): 
    path_image = '../input/test-images/Test_images/' + id_image
    img = Image.open(path_image)
    imgarray = np.asarray(img)
    tensor_from_img = tf.convert_to_tensor(imgarray)
    x = expand_dims(tensor_from_img, axis = 0)
    predict = model.predict(x)
    
    return predict[0].tolist()

In [57]:
for i, id_img in enumerate(test_list):
    submission.iloc[i, 1:] = convert_array(id_img)

In [58]:
submission

In [59]:
submission.to_csv('submissions.csv', index=False)

# Saving the Model

We append the model to the list of models as a new version

In [60]:
! mkdir '../working/models'

In [65]:
import os
model_version = max([int(i) for i in os.listdir('../working/models')+[0]]) + 1

In [67]:
model.save(f'../working/models/{model_version}')

In [69]:
model.save(f"../working/models/{model_version}/rice_disease.h5")

In [70]:
!tar -zcvf Saved_model.tar.gz  ./models