# Transfer Learning (EfficientNetB0)

Instead of only extracting high-level features, a pretrained model can be used directly for classification by adding a few output layers. Such a model can also be fine-tuned, although this is very resource-intensive.

I use the original images here, because data augmentation is handled by two added preprocessing layers at the model input: random horizontal flipping and random rotation.

## 1. Models used for high-level feature extraction

 **Model**         | **Size (MB)** | **Top-1 Accuracy** | **Top-5 Accuracy** | **Parameters** | **Depth** | **Time (ms) per inference step (CPU)** | **Time (ms) per inference step (GPU)** 
------------------:|--------------:|-------------------:|-------------------:|---------------:|----------:|---------------------------------------:|---------------------------------------:
 InceptionV3       | 92            | 0.779              | 0.937              | 23,851,784     | 159       | 42.25                                  | 6.86                                   
 *EfficientNetB0*    | 29            | -                  | -                  | 5,330,571      | -         | 46                                     | 4.91                                   
 ResNet50          | 98            | 0.749              | 0.921              | 25,636,712     | -         | 58.2                                   | 4.55                                   
 VGG16             | 528           | 0.713              | 0.901              | 138,357,544    | 23        | 69.5                                   | 4.16                                   
 DenseNet121       | 33            | 0.75               | 0.923              | 8,062,504      | 121       | 77.14                                  | 5.38                                   
 Xception          | 88            | 0.79               | 0.945              | 22,910,480     | 126       | 109.42                                 | 8.06                                   
 InceptionResNetV2 | 215           | 0.803              | 0.953              | 55,873,736     | 572       | 130.19                                 | 10.02                                  


> Data source: https://keras.io/api/applications/#available-models  
> Table converter: https://tableconvert.com/excel-to-markdown

## 2. Import packages

In [2]:
import numpy as np
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

## 3. Structure of `data/split` directory

```
data/split
└── 40X
    ├── test
    │   ├── adenosis
    │   ├── ductal_carcinoma
    │   ├── fibroadenoma
    │   ├── lobular_carcinoma
    │   ├── mucinous_carcinoma
    │   ├── papillary_carcinoma
    │   ├── phyllodes_tumor
    │   └── tubular_adenoma
    ├── train
    │   ├── adenosis
    │   ├── ductal_carcinoma
    │   ├── fibroadenoma
    │   ├── lobular_carcinoma
    │   ├── mucinous_carcinoma
    │   ├── papillary_carcinoma
    │   ├── phyllodes_tumor
    │   └── tubular_adenoma
    └── val
        ├── adenosis
        ├── ductal_carcinoma
        ├── fibroadenoma
        ├── lobular_carcinoma
        ├── mucinous_carcinoma
        ├── papillary_carcinoma
        ├── phyllodes_tumor
        └── tubular_adenoma
```
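A quick way to check that the split directory really matches this layout is to count the files in each class folder. The sketch below is a minimal helper under stated assumptions: `count_images` is a hypothetical function (not part of the original notebook), and it assumes the `<subset>/<class>/<image files>` nesting shown in the tree above.

```python
from pathlib import Path


def count_images(root):
    """Return {subset: {class_name: file_count}} for a split directory.

    Assumes the layout root/<subset>/<class>/<image files>.
    Returns an empty dict if `root` does not exist.
    """
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts
    for subset_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[subset_dir.name] = {
            class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
            for class_dir in sorted(p for p in subset_dir.iterdir() if p.is_dir())
        }
    return counts


# Path taken from the tree above
counts_40x = count_images(Path("data") / "split" / "40X")
for subset, per_class in counts_40x.items():
    print(subset, sum(per_class.values()), per_class)
```

The per-subset totals printed here should agree with the "Found ... images" messages that `flow_from_directory` reports later on.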

## 4. Define image generator

In [9]:
image_generator_train = ImageDataGenerator(
    #rescale=1/127.5,
    #rescale=127.5,
    #width_shift_range= 10.0,
    #height_shift_range= 10.0,
    #rotation_range = 20,
    #horizontal_flip = True,
    #vertical_flip = False,
    #zoom_range = 0.1,
    #channel_shift_range = 0.2,
    #brightness_range = (0,1),
    #shear_range = 0.2
)

In [10]:
image_generator_valtest = ImageDataGenerator(
    #rescale=1/127.5
)

In [11]:
image40Xtrain = image_generator_train.flow_from_directory(
    os.path.join('data','split','40X','train'),
    batch_size=4, 
    target_size=(460, 700),
    class_mode = 'sparse',
    shuffle=True
)

Found 1594 images belonging to 8 classes.


In [12]:
image40Xval = image_generator_valtest.flow_from_directory(
    os.path.join('data','split','40X','val'),
    batch_size=4, 
    target_size=(460, 700),
    class_mode = 'sparse',
    shuffle=True
)

Found 195 images belonging to 8 classes.


In [13]:
image40Xtest = image_generator_valtest.flow_from_directory(
    os.path.join('data','split','40X','test'),
    batch_size=4, 
    target_size=(460, 700),
    class_mode = 'sparse',
    shuffle=False  # keep file order so predictions line up with image40Xtest.labels
)

Found 206 images belonging to 8 classes.
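The three "Found ... images" messages above also show how the data was split. As a small worked example (using only the counts reported by the generators), the share of each subset can be computed like this:

```python
# Image counts reported by flow_from_directory above
split_sizes = {"train": 1594, "val": 195, "test": 206}
total = sum(split_sizes.values())

# Fraction of all 40X images that ends up in each subset
split_share = {name: n / total for name, n in split_sizes.items()}
for name, share in split_share.items():
    print(f"{name}: {split_sizes[name]} images ({share:.1%})")
print("total:", total)
```

This corresponds to roughly an 80/10/10 split into training, validation and test data.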


#### Print shape of images and labels

In [14]:
imgs, labels = next(image40Xtrain)  # draw one batch from the training generator
print('Images:', imgs.shape)
print('Labels:', labels.shape)

Images: (4, 460, 700, 3)
Labels: (4,)


#### Print range of pixel values

In [15]:
print('lowest pixel value:',np.min(imgs), '\nhighest pixel value:', np.max(imgs))

lowest pixel value: 24.0 
highest pixel value: 255.0


## 5. Number of images per class at 40X magnification

In [16]:
for imgset, imgset_title in zip([image40Xtrain, image40Xval, image40Xtest], ['train','val','test']):
    print('\n', imgset_title)
    for i in range(8):
        lb = list(imgset.class_indices)[i]
        sumclass = sum(imgset.labels==i)
        print('n:', sumclass, ', ratio:', round(sumclass/imgset.n*100), ': 40x',lb)
    print(imgset.n,': Total images with magnitude 40x',)


 train
n: 91 , ratio: 6 : 40x adenosis
n: 691 , ratio: 43 : 40x ductal_carcinoma
n: 202 , ratio: 13 : 40x fibroadenoma
n: 124 , ratio: 8 : 40x lobular_carcinoma
n: 164 , ratio: 10 : 40x mucinous_carcinoma
n: 116 , ratio: 7 : 40x papillary_carcinoma
n: 87 , ratio: 5 : 40x phyllodes_tumor
n: 119 , ratio: 7 : 40x tubular_adenoma
1594 : Total images with magnitude 40x

 val
n: 11 , ratio: 6 : 40x adenosis
n: 86 , ratio: 44 : 40x ductal_carcinoma
n: 25 , ratio: 13 : 40x fibroadenoma
n: 15 , ratio: 8 : 40x lobular_carcinoma
n: 20 , ratio: 10 : 40x mucinous_carcinoma
n: 14 , ratio: 7 : 40x papillary_carcinoma
n: 10 , ratio: 5 : 40x phyllodes_tumor
n: 14 , ratio: 7 : 40x tubular_adenoma
195 : Total images with magnitude 40x

 test
n: 12 , ratio: 6 : 40x adenosis
n: 87 , ratio: 42 : 40x ductal_carcinoma
n: 26 , ratio: 13 : 40x fibroadenoma
n: 17 , ratio: 8 : 40x lobular_carcinoma
n: 21 , ratio: 10 : 40x mucinous_carcinoma
n: 15 , ratio: 7 : 40x papillary_carcinoma
n: 12 , ratio: 6 : 40x phyllo

## 6. Transfer learning using EfficientNetB0

> **The typical transfer-learning workflow**

> 1. Instantiate a base model and load pre-trained weights into it.
> 1. Freeze all layers in the base model by setting trainable = False.
> 1. Create a new model on top of the output of one (or several) layers from the base model.
> 1. Train your new model on your new dataset.

> See [The typical transfer-learning workflow](https://keras.io/guides/transfer_learning/#the-typical-transferlearning-workflow)

#### Import packages

In [17]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.efficientnet import EfficientNetB0
from tensorflow.keras.applications.efficientnet import preprocess_input

The following workflow is adapted from [An end-to-end example: fine-tuning an image classification model on a cats vs. dogs dataset](https://keras.io/guides/transfer_learning/#an-endtoend-example-finetuning-an-image-classification-model-on-a-cats-vs-dogs-dataset).

#### 1. Instantiate a base model and load pre-trained weights into it

In [18]:
# Random data augmentation
data_augmentation = keras.Sequential(
    [layers.RandomFlip("horizontal"), layers.RandomRotation(0.1),]
)
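To make the two augmentation settings more concrete: `RandomRotation` interprets its factor as a fraction of a full turn (2π), and a horizontal flip simply reverses each pixel row. The sketch below works this out in plain Python on a tiny made-up 2x3 "image"; it illustrates the arithmetic only and does not call Keras.

```python
import math

factor = 0.1  # value passed to layers.RandomRotation above
max_angle_rad = factor * 2 * math.pi
max_angle_deg = math.degrees(max_angle_rad)
print(f"rotation range: -{max_angle_deg:.0f} to +{max_angle_deg:.0f} degrees")

# Horizontal flip on a tiny 2x3 "image" (rows of made-up pixel values)
image = [[1, 2, 3],
         [4, 5, 6]]
flipped = [row[::-1] for row in image]
print(flipped)
```

So a factor of 0.1 allows rotations of up to 36 degrees in either direction.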



In [19]:
# Base model is EfficientNetB0
base_model = keras.applications.EfficientNetB0(
    weights="imagenet",  # Load weights pre-trained on ImageNet.
    input_shape=(460, 700, 3),
    include_top=False,  # Do not include the ImageNet classifier at the top.
)

#### 2. Freeze all layers in the base model by setting trainable = False

In [20]:
# Freeze the base_model
base_model.trainable = False

#### 3. Create a new model on top of the output of one (or several) layers from the base model

In [21]:
# Create new model on top
inputs = keras.Input(shape=(460, 700, 3))
x = data_augmentation(inputs)  # Apply random data augmentation
#x = inputs

In [22]:
# The Keras EfficientNet models rescale their inputs internally and expect
# pixel values in the range [0, 255], so no extra scaling layer is needed.

# The following lines from the Xception example are therefore skipped:
# pre-trained Xception weights require inputs scaled from (0, 255)
# to a range of (-1., +1.); the rescaling layer
# outputs: `(inputs * scale) + offset`
#scale_layer = keras.layers.Rescaling(scale=1 / 127.5, offset=-1)
#x = scale_layer(x)
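The formula quoted in the comment above, `(inputs * scale) + offset`, is easy to check by hand. The following sketch applies it to a few pixel values, including the minimum (24) and maximum (255) observed in the batch printed earlier. The `rescale` function is a hypothetical stand-in for the Keras `Rescaling` layer, not the layer itself.

```python
def rescale(pixel, scale=1 / 127.5, offset=-1.0):
    """Mimic the formula of the Rescaling layer: (inputs * scale) + offset."""
    return pixel * scale + offset


for value in (0.0, 24.0, 127.5, 255.0):
    print(value, "->", round(rescale(value), 3))
```

With these settings 0 maps to -1 and 255 maps to +1, which is what Xception needs. EfficientNetB0 is fed the raw 0 to 255 values instead, so this step is left out here.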

In [23]:
# The base model contains batchnorm layers. We want to keep them in inference mode
# when we unfreeze the base model for fine-tuning, so we make sure that the
# base_model is running in inference mode here.
x = base_model(x, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dropout(0.2)(x)  # Regularize with dropout
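# Softmax output: the loss 'sparse_categorical_crossentropy' expects class probabilities by default
outputs = keras.layers.Dense(8, activation="softmax")(x)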
model = keras.Model(inputs, outputs)

model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 460, 700, 3)]     0         
_________________________________________________________________
sequential (Sequential)      (None, 460, 700, 3)       0         
_________________________________________________________________
efficientnetb0 (Functional)  (None, 15, 22, 1280)      4049571   
_________________________________________________________________
global_average_pooling2d (Gl (None, 1280)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 8)                 10248     
Total params: 4,059,819
Trainable params: 10,248
Non-trainable params: 4,049,571
______________________________________________
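The shapes and parameter counts in this summary can be reproduced with a little arithmetic. The sketch below assumes that the base model halves the spatial resolution five times and rounds up at each step (an assumption that happens to reproduce the `(15, 22)` feature map shown above), and that the dense layer has one weight per feature and class plus one bias per class.

```python
import math


def downsample(size, stages=5):
    """Halve `size` `stages` times, rounding up each time (assumed behaviour)."""
    for _ in range(stages):
        size = math.ceil(size / 2)
    return size


height, width, channels, n_classes = 460, 700, 1280, 8

feature_map = (downsample(height), downsample(width), channels)
print("feature map:", feature_map)

# Dense head on the pooled 1280-vector: weights per (feature, class) plus biases
dense_params = channels * n_classes + n_classes
print("dense params:", dense_params)

base_params = 4_049_571  # EfficientNetB0 row of the summary above
print("total params:", base_params + dense_params)
```

Global average pooling then collapses the 15 x 22 spatial grid into a single 1280-element vector per image, which is why the dense layer only needs 10,248 parameters.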

#### 4. Train your new model on your new dataset

In [40]:
model.compile(
    optimizer=keras.optimizers.Adam(),
    #loss=keras.losses.BinaryCrossentropy(from_logits=True),
    #metrics=[keras.metrics.BinaryAccuracy()],
    #optimizer='sgd',
    loss='sparse_categorical_crossentropy',
    metrics=['acc']
)
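Sparse categorical cross-entropy fits the generators defined earlier because `class_mode = 'sparse'` yields integer class labels rather than one-hot vectors. As a minimal sketch of what the loss computes for a single image (plain Python, not the Keras implementation), consider a model that rates all 8 classes equally:

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = [z - max(logits) for z in logits]  # subtract max for numerical stability
    exps = [math.exp(z) for z in shifted]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


logits = [0.0] * 8      # made-up scores: all 8 classes rated equally
probs = softmax(logits)
loss = sparse_categorical_crossentropy(probs, label=1)
print(round(loss, 4))   # equals ln(8) for a uniform prediction
```

A loss near ln(8) ≈ 2.08 therefore means the model is doing no better than guessing. Note that the string form of the loss assumes the model outputs probabilities (`from_logits=False`); for a model that outputs raw logits, `keras.losses.SparseCategoricalCrossentropy(from_logits=True)` would be the matching choice.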

In [41]:
# End training when the validation loss stops improving (optional)
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=6)
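The effect of `patience` can be illustrated without Keras. The sketch below is a simplified model of patience-based early stopping, not the actual callback: `stopped_epoch` is a hypothetical helper, and the validation losses are made up for illustration.

```python
def stopped_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified rule: stop once `patience` consecutive epochs fail to
    improve on the best validation loss seen so far.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses, for illustration only
losses = [1.9, 1.5, 1.4, 1.45, 1.41, 1.42, 1.43, 1.44, 1.46, 1.5]
print(stopped_epoch(losses, patience=6))      # stops at epoch 9
print(stopped_epoch(losses[:5], patience=6))  # None: too few epochs to stop
```

Under this rule a run of only a few epochs can never exhaust a patience of 6. If the counter starts fresh with every `fit` call (which I believe is how Keras handles it), the callback only becomes relevant for longer runs such as the commented-out 20 epochs.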

In [42]:
# Train the model on a subsample: 20 single batches of 4 images each
epochs = 5  # 20
image40Xtrain.reset()
for i in range(20):
    print('batch number:', i)
    # Draw one batch from the training and validation generators
    train_imgs, train_lbs = next(image40Xtrain)
    val_imgs, val_lbs = next(image40Xval)
    model.fit(
        x=train_imgs,
        y=train_lbs,
        epochs=epochs,
        validation_data=(val_imgs, val_lbs),
        callbacks=[early_stopping]
    )

batch number: 0
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5

In [33]:
# Train model with whole sample
epochs = 2 #20
history = model.fit(
    x=image40Xtrain, 
    validation_data=image40Xval, 
    epochs=epochs, 
    callbacks=[early_stopping]
)

Epoch 1/2

KeyboardInterrupt: 
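Training on the whole sample is much slower than the subsample loop because every epoch runs through all batches of the generator. A rough worked example, using the image counts and the batch size of 4 from the generators above:

```python
import math

batch_size = 4
split_sizes = {"train": 1594, "val": 195, "test": 206}

# Number of batches needed to see every image once (last batch may be smaller)
steps = {name: math.ceil(n / batch_size) for name, n in split_sizes.items()}
print(steps)
```

One epoch on the full training set therefore means 399 training batches plus 49 validation batches, compared with a single batch of 4 images per `fit` call in the subsample loop.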

#### 5. (Additional step) Do a round of fine-tuning of the entire model

In [43]:
# Unfreeze the base_model. Note that it keeps running in inference mode
# since we passed `training=False` when calling it. This means that
# the batchnorm layers will not update their batch statistics.
# This prevents the batchnorm layers from undoing all the training
# we've done so far.
base_model.trainable = True
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 460, 700, 3)]     0         
_________________________________________________________________
sequential (Sequential)      (None, 460, 700, 3)       0         
_________________________________________________________________
efficientnetb0 (Functional)  (None, 15, 22, 1280)      4049571   
_________________________________________________________________
global_average_pooling2d (Gl (None, 1280)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 8)                 10248     
Total params: 4,059,819
Trainable params: 4,017,796
Non-trainable params: 42,023
______________________________________________
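Comparing the two summaries shows what unfreezing changes. The sketch below only uses the numbers printed in the summaries; it checks that both add up to the same total and computes how many base-model weights become trainable.

```python
total_params = 4_059_819

# Frozen base model (first summary)
frozen = {"trainable": 10_248, "non_trainable": 4_049_571}
# After base_model.trainable = True (second summary)
unfrozen = {"trainable": 4_017_796, "non_trainable": 42_023}

for name, counts in (("frozen", frozen), ("unfrozen", unfrozen)):
    assert sum(counts.values()) == total_params
    share = counts["trainable"] / total_params
    print(f"{name}: {counts['trainable']:,} trainable ({share:.2%})")

newly_trainable = unfrozen["trainable"] - frozen["trainable"]
print("weights unlocked in the base model:", f"{newly_trainable:,}")
```

Fine-tuning thus updates about four million weights instead of roughly ten thousand, which is why a low learning rate is used in the next cell and why this step is so resource-intensive.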

In [44]:
model.compile(
    optimizer=keras.optimizers.Adam(1e-5),  # Low learning rate
    #loss=keras.losses.BinaryCrossentropy(from_logits=True),
    #metrics=[keras.metrics.BinaryAccuracy()],
    loss='sparse_categorical_crossentropy',
    metrics=['acc']
)

In [45]:
epochs = 2 #10
model.fit(
    x=train_imgs, 
    y=train_lbs, 
    epochs=epochs, 
    validation_data=(val_imgs, val_lbs),
    callbacks=[early_stopping]
)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x18a282790>

In [None]:
# Train model with whole sample
epochs = 2 #20
history = model.fit(
    x=image40Xtrain, 
    validation_data=image40Xval, 
    epochs=epochs, 
    callbacks=[early_stopping]
)