# Transfer Learning with EfficientNetV2S
 
##### Get the new sorted dataset [here](https://tumde-my.sharepoint.com/:f:/g/personal/gohdennis_tum_de/EmooVZ4vE95Iic-HIP9-P10BzX7oIOBmRhK8Q9tYzfJWRQ?e=maOqo5) [08_Aug_2022]

Annotations are stored under notebooks/preprocesing/restructured_w_original_labels.json and do not need to be moved (they are also included in the .zip file).

Extract the zip from the link (sort.zip) under data/.
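The extraction step can also be scripted. The following is a minimal sketch that assumes the archive was downloaded as `sort.zip` and unpacks into a `sort/` folder; `extract_dataset` is a hypothetical helper, not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, data_dir):
    """Extract the archive at zip_path into data_dir and return data_dir as a Path."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir)
    return data_dir


# Example call (both paths are assumptions, adjust them to your setup):
# extract_dataset("sort.zip", "data")
```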


<hr style="height:2px;border-width:0;color:black;background-color:black">

This notebook will show the EfficientNetWrapper in action.



### [EfficientNet](https://paperswithcode.com/method/efficientnet)

EfficientNet introduces a model scaling method (compound scaling) and applies it to ResNet and MobileNets. Additionally, the researchers use neural architecture search to design a new baseline network, which they then scale up to create the EfficientNet family. EfficientNetB7 achieves a state-of-the-art 84.3% top-1 accuracy on ImageNet while being 8.4x smaller and 6.1x faster at inference than the best existing ConvNet at the time of publication.
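To make the scaling method concrete: EfficientNet scales depth, width, and input resolution together through a single coefficient φ, with multipliers α^φ, β^φ, and γ^φ. The sketch below uses the constants reported in the EfficientNet paper (α = 1.2, β = 1.1, γ = 1.15); the function names are illustrative only.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases from the paper


def compound_scaling(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    """FLOPs grow roughly with depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scaling(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    d, w, r = compound_scaling(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}, "
          f"FLOPs x{flops_multiplier(phi):.2f}")
```

Because α·β²·γ² is close to 2, each increment of φ roughly doubles the computational cost.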

In the second iteration of the architecture (EfficientNetV2), the researchers improve training speed and parameter efficiency by using Fused-MBConv layers in the early stages instead of the classical MBConv layer introduced with MobileNetV2.

### [Keras Availability](https://keras.io/api/applications/efficientnet_v2/)

The entire network and the weights pretrained for the ILSVRC (ImageNet Large Scale Visual Recognition Challenge) are provided by Keras. The ImageNet dataset is a large-scale collection of labeled images with roughly:
- 14 million images
- 1 million images with bounding boxes
- 20,000 categories organized by the WordNet hierarchy (e.g. family, then species, then breed)

Let's begin by preparing and inspecting our data. Once we are confident that we can handle what is provided, we can begin to fit the SOTA network. Since training is computationally demanding, make sure TensorFlow is running on your GPU.

In [8]:
import json
import os
import pdb
import shutil
import zipfile
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
from keras.callbacks import EarlyStopping
from PIL import Image

from models.ilsvrc import EfficientNetV2S
from preprocessing.rand_augmenter import RandAugmenterWrapper


In [9]:
tf.get_logger().setLevel('INFO')
tf.test.gpu_device_name()

2022-08-25 19:37:43.603436: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

'/device:GPU:0'

## I. Load Data
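To begin our showcase, we load the images from the data directory set up as described above. The `DATA` environment variable must point to the directory that contains the extracted `sort` folder.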

In [10]:
image_path = Path(os.getenv("DATA"), "sort")

train_ds = tf.keras.utils.image_dataset_from_directory(directory=image_path,
                                                       validation_split=0.3,
                                                       subset='training',
                                                       seed=0,
                                                       image_size=(224, 224))
val_ds = tf.keras.utils.image_dataset_from_directory(directory=image_path,
                                                     validation_split=0.3,
                                                     subset='validation',
                                                     seed=0,
                                                     image_size=(224, 224))


Found 897 files belonging to 4 classes.
Using 628 files for training.
Found 897 files belonging to 4 classes.
Using 269 files for validation.
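The reported counts follow directly from the split fraction. The sketch below reproduces the arithmetic, assuming that Keras truncates `validation_split * n_files` to an integer for the validation subset and that the default batch size of 32 is in effect; `split_sizes` is a hypothetical helper.

```python
import math


def split_sizes(n_files, validation_split, batch_size=32):
    """Return (n_train, n_val, train_batches) for a directory-based split."""
    n_val = int(validation_split * n_files)  # truncated, not rounded
    n_train = n_files - n_val
    train_batches = math.ceil(n_train / batch_size)  # the last batch may be partial
    return n_train, n_val, train_batches


print(split_sizes(897, 0.3))  # (628, 269, 20)
```

The 20 training batches match the `3/20` progress counter that appears in the training output.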


## II. Model Configuration
Next we can configure our model. The configuration applies only to the top network, i.e. the classification head placed on top of the pretrained base. Most parameters are self-explanatory.
- layer_size: either an integer or a tuple.
- depth: number of layers in the sequential network when layer_size is an integer; ignored otherwise.
- pooling_type: either max or average

In [11]:
model_config = {"layer_size": (128, 32), "dropout": 0.1, "pooling_type": "max"}

model = EfficientNetV2S(**model_config)
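One way to read the interplay of `layer_size` and `depth` is sketched below. `resolve_layer_sizes` is a hypothetical helper written purely for illustration; the wrapper's actual implementation may differ.

```python
def resolve_layer_sizes(layer_size, depth=1):
    """Expand the head configuration into a list of dense-layer widths."""
    if isinstance(layer_size, int):
        # A single integer is repeated `depth` times.
        return [layer_size] * depth
    # A tuple already lists one width per layer, so `depth` is ignored.
    return list(layer_size)


print(resolve_layer_sizes((128, 32)))    # [128, 32]
print(resolve_layer_sizes(64, depth=3))  # [64, 64, 64]
```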

## III. Data Augmentation

Before running our model, we need to prepend an augmentation layer.

In [12]:
augmentation_ops = [
    'Invert',
    'Rotate',
    'Posterize',
    'Solarize',
    'SolarizeAdd',
    'Color',
    'Contrast',
    'Brightness',
    'TranslateX',
    'TranslateY',
]

augmentation_layer = RandAugmenterWrapper(num_layers=2,
                                          magnitude=7,
                                          op_list=augmentation_ops)
model.prepend_layer(augmentation_layer)
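In the RandAugment scheme, each image receives `num_layers` operations drawn uniformly at random from the operation list, all applied at the same `magnitude`. The sketch below shows only this sampling step in plain Python; it is not the implementation of `RandAugmenterWrapper`, and `sample_policy` is a hypothetical name.

```python
import random

OPS = ['Invert', 'Rotate', 'Posterize', 'Solarize', 'SolarizeAdd',
       'Color', 'Contrast', 'Brightness', 'TranslateX', 'TranslateY']


def sample_policy(op_list, num_layers, magnitude, rng=random):
    """Draw num_layers (operation, magnitude) pairs uniformly with replacement."""
    return [(rng.choice(op_list), magnitude) for _ in range(num_layers)]


rng = random.Random(0)  # seeded so repeated runs give the same draws
for _ in range(3):
    print(sample_policy(OPS, num_layers=2, magnitude=7, rng=rng))
```

Every image therefore sees a different pair of transformations, while only two hyperparameters (`num_layers` and `magnitude`) need to be tuned.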



## IV. Training the Model
We can now fit the model to the data. Remember, the model exposes every method of tf.keras.Model and can be used in the same way.

In [13]:
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=['accuracy'])
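`sparse_categorical_crossentropy` expects integer class labels, which is the default label format of `image_dataset_from_directory`, and averages the negative log-probability assigned to the true class. When the loss is given as a string, Keras treats the model outputs as probabilities rather than logits. A plain-Python sketch of the formula:

```python
import math


def sparse_categorical_crossentropy(labels, probabilities):
    """Mean negative log-probability of the true class.

    labels: list of integer class indices.
    probabilities: list of per-class probability lists, one per sample.
    """
    losses = [-math.log(probs[label]) for label, probs in zip(labels, probabilities)]
    return sum(losses) / len(losses)


# Four classes, as in this dataset. A uniform prediction costs log(4).
uniform = [[0.25, 0.25, 0.25, 0.25]]
print(sparse_categorical_crossentropy([2], uniform))
```

A uniform guess over four classes yields log(4), about 1.386, which is a handy reference point when reading the loss values during training.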

In [14]:
es_callback = EarlyStopping(
    patience=10,
    restore_best_weights=True,
)

model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=40,
    callbacks=[es_callback]
)


Epoch 1/40


2022-08-25 19:37:53.922688: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8500
2022-08-25 19:37:54.565580: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory


 3/20 [===>..........................] - ETA: 1s - loss: 5.9425 - accuracy: 0.3021

2022-08-25 19:37:55.244322: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.


Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40


<keras.callbacks.History at 0x7fea9419fb80>
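The run above stopped at epoch 21 of 40. With `patience=10`, this is consistent with the monitored value (`val_loss` by default) reaching its best at epoch 11, after which `restore_best_weights=True` rolls the weights back to that epoch. The following sketch imitates the patience logic on synthetic losses; the numbers are made up for illustration and are not taken from this run.

```python
def early_stopping(val_losses, patience):
    """Return (stopped_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to improve
    on the best validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Synthetic curve: improves for 11 epochs, then plateaus at a worse value.
losses = [1.0 - 0.05 * i for i in range(11)] + [0.6] * 29
print(early_stopping(losses, patience=10))  # (21, 11)
```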

We can proceed by unfreezing blocks. But first, let's inspect the total number of blocks the model provides.

In [15]:
model.num_blocks

29
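Setting `trainable_blocks = 10` in the next cell unfreezes part of the pretrained base. A minimal sketch of the idea follows, under the assumption that the wrapper unfreezes the last blocks (those closest to the classification head) and keeps the earlier ones frozen; `trainable_flags` is a hypothetical helper, not the wrapper's API.

```python
def trainable_flags(num_blocks, trainable_blocks):
    """Return one boolean per block; True marks a block whose weights are updated."""
    if not 0 <= trainable_blocks <= num_blocks:
        raise ValueError("trainable_blocks must lie between 0 and num_blocks")
    frozen = num_blocks - trainable_blocks
    return [index >= frozen for index in range(num_blocks)]


flags = trainable_flags(num_blocks=29, trainable_blocks=10)
print(flags.count(False), "frozen,", flags.count(True), "trainable")
```

Fine-tuning only the later blocks with a small learning rate keeps the generic early features intact while adapting the more task-specific ones.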

In [16]:
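# Set the number of trainable blocks before compiling: Keras only picks up
# changes to trainable weights when compile() is called afterwards.
model.trainable_blocks = 10
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=['accuracy'])
model.fit(train_ds, validation_data=val_ds, epochs=40, callbacks=[es_callback])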


Epoch 1/40
Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40
Epoch 22/40
Epoch 23/40
Epoch 24/40
Epoch 25/40
Epoch 26/40
Epoch 27/40
Epoch 28/40
Epoch 29/40
Epoch 30/40
Epoch 31/40
Epoch 32/40
Epoch 33/40
Epoch 34/40
Epoch 35/40
Epoch 36/40
Epoch 37/40
Epoch 38/40
Epoch 39/40
Epoch 40/40


<keras.callbacks.History at 0x7fe9a1fe49a0>

## V. Score the Model

The model can be evaluated as usual. `evaluate` returns the loss followed by the compiled metrics, here `[loss, accuracy]`.

In [17]:
model.evaluate(val_ds)



[0.6971398591995239, 0.832713782787323]
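With only 269 validation images, the reported accuracy is a fairly coarse estimate. The sketch below converts it into a count of correct predictions and adds a rough standard error using the normal approximation to the binomial; treat the interval as a back-of-the-envelope figure rather than a rigorous bound.

```python
import math

n_val = 269                   # validation files reported when loading the data
accuracy = 0.832713782787323  # second value returned by model.evaluate


def accuracy_summary(accuracy, n_samples):
    """Return (n_correct, standard_error) for an accuracy measured on n_samples."""
    n_correct = round(accuracy * n_samples)
    standard_error = math.sqrt(accuracy * (1 - accuracy) / n_samples)
    return n_correct, standard_error


n_correct, se = accuracy_summary(accuracy, n_val)
print(f"{n_correct}/{n_val} correct, "
      f"accuracy {accuracy:.3f} +/- {1.96 * se:.3f} (approx. 95%)")
```

This corresponds to 224 of 269 images classified correctly, with an approximate 95% interval of about 4.5 percentage points on either side.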

## VI. Nice to Haves


In [18]:
model.plot_base_model()

You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model/model_to_dot to work.
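The message above indicates that the plotting dependencies are missing. A small pre-flight check can report this before calling `plot_base_model`. The sketch assumes that the Python package `pydot` and the Graphviz `dot` executable are what is required, as the message suggests; `plotting_requirements` is a hypothetical helper.

```python
import importlib.util
import shutil


def plotting_requirements():
    """Report whether pydot is importable and the Graphviz `dot` binary is on PATH."""
    return {
        "pydot": importlib.util.find_spec("pydot") is not None,
        "graphviz_dot": shutil.which("dot") is not None,
    }


status = plotting_requirements()
for name, available in status.items():
    print(f"{name}: {'found' if available else 'missing'}")
```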
