## **Introduction**

This notebook was created for the final contest of the Artificial Vision course at the University of Salerno. The aim of the project is to design a DCNN (as a regressor or a classifier) for age estimation on the [VggFace2 dataset](https://github.com/ox-vgg/vgg_face2) labeled with ages by [MiviaLab](https://mivia.unisa.it/).

<br/>

We decided to build a classifier that recognizes 101 classes (ages from 0 to 100). In particular, we chose the [Resnet50 model](https://github.com/WeidiXie/Keras-VGGFace2-ResNet50).

In this notebook, we show our **training procedure**.

## **Initialization**

First of all, we mount Google Drive and move to the folder where all operations are performed, because it contains all the needed files.

In [None]:
from google.colab import drive
import os

drive.mount('/content/drive')
os.chdir('/content/drive/Shareddrives/ArtificialVision/FinalContest2020')

Mounted at /content/drive


Check if we are using a GPU

In [None]:
%tensorflow_version 2.x
import tensorflow as tf

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Clone the GitHub repository of the GenderRecognitionFramework provided by MiviaLab, which contains some code used later in this notebook.

In [None]:
!git clone https://github.com/MiviaLab/GenderRecognitionFramework

Cloning into 'GenderRecognitionFramework'...
remote: Enumerating objects: 219, done.
remote: Counting objects: 100% (219/219), done.
remote: Compressing objects: 100% (174/174), done.
remote: Total 219 (delta 38), reused 202 (delta 30), pack-reused 0
Receiving objects: 100% (219/219), 9.01 MiB | 16.96 MiB/s, done.
Resolving deltas: 100% (38/38), done.
Checking out files: 100% (165/165), done.


## **Create Resnet50 model**

Define some useful parameters: the input shape required by the CNN model and the number of classes given by the problem definition.

In [None]:
TARGET_SHAPE = (224, 224, 3)
NUM_CLASSES = 101

Clone the GitHub repository containing the code that builds the chosen CNN architecture.

In [None]:
!git clone https://github.com/WeidiXie/Keras-VGGFace2-ResNet50

Cloning into 'Keras-VGGFace2-ResNet50'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 158 (delta 6), reused 0 (delta 0), pack-reused 146
Receiving objects: 100% (158/158), 36.73 MiB | 15.96 MiB/s, done.
Resolving deltas: 100% (43/43), done.
Checking out files: 100% (79/79), done.


Build the model architecture using the functions provided by the GitHub repository, then add a final dense layer of 101 neurons to adapt the pre-trained network to our problem.

**N.B.**: the *Vggface2_ResNet50()* function has been modified by deleting the lines of code that compile the model; for convenience, compilation is done later in this notebook.

In [None]:
import sys
sys.path.append("./Keras-VGGFace2-ResNet50/src/")
from resnet import resnet50_backend
from model import Vggface2_ResNet50

from keras.models import Model
from keras.layers import Dense

# build the model, loading pre-trained weights on ImageNet or the ones passed as parameter
# and adding a final dense layer of 101 neurons with softmax activation function
def get_model(num_classes, weights_path=None):
  model_resnet = Vggface2_ResNet50()
  if weights_path is not None:
    model_resnet.load_weights(weights_path)
  else:
    model_resnet.load_weights('weights.h5')
  
  model = Dense(num_classes, activation='softmax')(model_resnet.layers[-2].output)
  model = Model(inputs=model_resnet.input, outputs=model)

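  # first 18 epochs of training: all layers frozen except the last 11
  # (block disabled here; re-enable it to reproduce that stage)
  '''
  for layer in model.layers[:-11]:
    layer.trainable = False
  '''
  
  # following 7 epochs of training: the whole network is fine-tuned
  for layer in model.layers:
    layer.trainable = True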

  return model
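
The two training stages mentioned in the comments above differ only in which layers are trainable. The following is a minimal, framework-free sketch of that selection. `FakeLayer`, `set_stage` and the 15-layer list are hypothetical stand-ins for `model.layers`, used only to show what the slice `[:-11]` selects.

```python
from dataclasses import dataclass

@dataclass
class FakeLayer:
    """Hypothetical stand-in for a Keras layer (only the fields used here)."""
    name: str
    trainable: bool = True

def set_stage(layers, stage, n_unfrozen=11):
    """Stage 1 freezes everything except the last n_unfrozen layers; stage 2 unfreezes all."""
    for layer in layers:
        layer.trainable = True
    if stage == 1:
        for layer in layers[:-n_unfrozen]:
            layer.trainable = False
    return layers

layers = [FakeLayer(f"layer_{i}") for i in range(15)]

set_stage(layers, stage=1)
stage1_trainable = [layer.name for layer in layers if layer.trainable]

set_stage(layers, stage=2)
stage2_trainable = [layer.name for layer in layers if layer.trainable]

print(len(stage1_trainable), len(stage2_trainable))
```

In Keras, a change to `trainable` takes effect only when the model is compiled afterwards, which is the case here because compilation happens later in the notebook.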

In [None]:
model = get_model(NUM_CLASSES)
model.summary()  # summary() prints by itself and returns None

Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
base_input (InputLayer)         [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1/7x7_s2 (Conv2D)           (None, 112, 112, 64) 9408        base_input[0][0]                 
__________________________________________________________________________________________________
conv1/7x7_s2/bn (BatchNormaliza (None, 112, 112, 64) 256         conv1/7x7_s2[0][0]               
__________________________________________________________________________________________________
activation (Activation)         (None, 112, 112, 64) 0           conv1/7x7_s2/bn[0][0]            
____________________________________________________________________________________________

## **Parameters, Callbacks and Model Compiling with them**

### Definition of useful parameters for training

In [None]:
#@title Parameters used in model compile

n_epochs = 25 #@param {type:"integer"}
batch_size = 128 #@param {type:"integer"}
initial_learning_rate = 0.005 #@param {type:"number"}
learning_rate_decay_factor = 0.2 #@param {type:"number"}
learning_rate_decay_epochs = 20 #@param {type:"number"}
momentum = 0.9 #@param {type:"number"}
patience =  5 #@param {type:"integer"}

In [None]:
#@title Parameters for logging directory

log_dir = "./logs/resnet50/" #@param {type:"string"}
mode = "training" #@param {type:"string"}

os.makedirs(log_dir, exist_ok=True)  # also creates missing parent folders such as ./logs/

Creation of the directory used for storing model checkpoints

In [None]:
from glob import glob
import re

dirnm = mode+"_logs/"
dirnm = os.path.join(log_dir, dirnm) #./logs/<net>/inference-training/
print("Log dir: {}".format(dirnm))
if not os.path.isdir(dirnm): os.mkdir(dirnm)

chk_dir = dirnm + "weights/" #./logs/<net>/inference-training/weights/
print("Checkpoint dir: {}".format(chk_dir))
if not os.path.isdir(chk_dir): os.mkdir(chk_dir)

filepath = os.path.join(chk_dir, "checkpoint.{epoch:02d}.h5")
ep_re = re.compile(r'checkpoint\.([0-9]+)\.h5')  # dots escaped so they match literally

Log dir: ./logs/resnet50/training_logs/
Checkpoint dir: ./logs/resnet50/training_logs/weights/


This code resumes a previously started training from the last saved checkpoint.

In [None]:
resume = True #@param {type:"boolean"}

In [None]:
if resume:
  pattern = filepath.replace('{epoch:02d}', '*')
  # extract the epoch number from each checkpoint file name
  epochs = [int(ep_re.search(x).group(1)) for x in glob(pattern)]
  if not epochs:
    raise FileNotFoundError('No checkpoint found in %s' % chk_dir)
  initial_epoch = max(epochs)
  print('Resuming from epoch %d...' % initial_epoch)
  model.load_weights(filepath.format(epoch=initial_epoch))
else:
  print("Training from scratch")
  initial_epoch = 0

Resuming from epoch 25...
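
The resume logic depends on a round trip between an epoch number and a checkpoint file name. The sketch below shows that round trip in isolation. `TEMPLATE`, `EPOCH_RE`, `last_epoch` and the sample paths are illustrative names and values, not part of the notebook.

```python
import re

TEMPLATE = "checkpoint.{epoch:02d}.h5"            # same template as `filepath` above
EPOCH_RE = re.compile(r"checkpoint\.(\d+)\.h5$")  # dots escaped, anchored at the end

def last_epoch(paths):
    """Return the highest epoch found among checkpoint paths, or 0 if none match."""
    epochs = [int(m.group(1)) for m in map(EPOCH_RE.search, paths) if m]
    return max(epochs, default=0)

# hypothetical file names for checkpoints saved at epochs 3, 18 and 25
paths = ["weights/" + TEMPLATE.format(epoch=e) for e in (3, 18, 25)]

print(paths[0])
print(last_epoch(paths))
print(last_epoch([]))
```

Returning 0 for an empty folder is one possible design choice: it lets the caller fall back to training from scratch instead of failing on `max()` of an empty list.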


### Callbacks definition

As a metric for evaluating the model we use the MAE. The Keras implementation does not fit our purpose directly: labels are one-hot encoded, while the CNN output is a vector of probabilities (a consequence of the softmax activation function), not a vector with 1 in the predicted class position and 0 elsewhere.

For this reason we implement a custom MAE, which first converts both vectors to ages and then relies on the Keras one.

In [None]:
import tensorflow as tf
from tensorflow.keras.metrics import mean_absolute_error

@tf.function
def mae(y_true, y_pred):
  # as input we have a batch of arrays of 101 elements,
  # so for each array the index of the max value is the age (true or predicted)
  ages_true = tf.argmax(y_true, axis=-1)
  ages_pred = tf.argmax(y_pred, axis=-1)

  # cast to float so that the mean is not computed with integer arithmetic
  ages_true = tf.cast(ages_true, tf.float32)
  ages_pred = tf.cast(ages_pred, tf.float32)

  return mean_absolute_error(ages_true, ages_pred)
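
To make the idea concrete, here is a small worked example in plain Python (no TensorFlow needed). It is only a sketch of the computation on a toy problem with 5 classes instead of 101; `argmax`, `one_hot` and `age_mae` are illustrative helpers, not the metric used in training.

```python
def argmax(values):
    """Index of the largest element (the first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)

def one_hot(index, num_classes):
    """One-hot vector with 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def age_mae(y_true, y_pred):
    """Mean absolute difference between true and predicted class indices (ages)."""
    errors = [abs(argmax(t) - argmax(p)) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)

y_true = [one_hot(2, 5), one_hot(4, 5)]
y_pred = [
    [0.1, 0.2, 0.6, 0.1, 0.0],  # predicted age 2, true age 2 -> error 0
    [0.0, 0.1, 0.5, 0.3, 0.1],  # predicted age 2, true age 4 -> error 2
]

print(age_mae(y_true, y_pred))  # 1.0
```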

This function reduces the initial learning rate by a given factor every fixed number of epochs.


**N.B.** It is taken from the /GenderRecognitionFramework/training/train.py file and has been copied here to work around technical problems.

In [None]:
import numpy as np

def step_decay_schedule(initial_lr, decay_factor, step_size):
    def schedule(epoch):
        # multiply the rate by decay_factor once every step_size epochs
        return initial_lr * (decay_factor ** np.floor(epoch / step_size))

    return tf.keras.callbacks.LearningRateScheduler(schedule, verbose=1)
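
As a quick check of the formula, the sketch below evaluates the same expression with the parameter values declared earlier in this notebook (0.005, 0.2 and 20). `step_decay` is an illustrative helper that mirrors `schedule`, using `math.floor` instead of NumPy.

```python
import math

def step_decay(epoch, initial_lr=0.005, decay_factor=0.2, step_size=20):
    """Learning rate for a 0-based epoch index, same formula as `schedule` above."""
    return initial_lr * decay_factor ** math.floor(epoch / step_size)

for epoch in (0, 19, 20, 24):
    print(epoch, step_decay(epoch))
```

With 25 epochs in total, the rate stays at 0.005 for epoch indices 0 to 19 and drops to about 0.001 for indices 20 to 24.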

Compile the previously created model with:

* *parameters*:
  * optimizer -> SGD with momentum
  * loss -> Categorical Cross Entropy
  * metrics -> Categorical Accuracy, MAE

* *callbacks*:
  * learning rate scheduling
  * model checkpoint saving
  * early stopping




In [None]:
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow import keras

# Compilation parameters    
optimizer = tf.keras.optimizers.SGD(momentum=momentum)
loss = tf.keras.losses.categorical_crossentropy
metrics = [keras.metrics.categorical_accuracy, mae]

# Callbacks
lr_sched = step_decay_schedule(initial_lr=initial_learning_rate, decay_factor=learning_rate_decay_factor, step_size=learning_rate_decay_epochs)
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath, save_best_only=True, monitor="val_categorical_accuracy", mode='max'
)
tbCallBack = tf.keras.callbacks.TensorBoard(log_dir=log_dir, write_graph=True, write_images=True)
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', mode="min", patience=patience)
callbacks_list = [lr_sched, checkpoint, tbCallBack, early_stopping]



In [None]:
model.compile(
  loss=loss, 
  optimizer=optimizer, 
  metrics = metrics
)
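
The role of `patience` in early stopping can be illustrated with a simplified simulation. This is a sketch of a patience rule under the assumption of a zero improvement threshold, not the Keras implementation. `early_stop_epoch` and the loss values are hypothetical.

```python
def early_stop_epoch(val_losses, patience=5):
    """Return the 0-based epoch at which training would stop, or None if it never does.

    Simplified patience rule: stop once `patience` consecutive epochs
    have failed to improve on the best validation loss seen so far.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# hypothetical validation losses: best value at epoch 2, then no improvement
losses = [3.0, 2.5, 2.4, 2.6, 2.5, 2.45, 2.7, 2.41, 2.9]
print(early_stop_epoch(losses))
```

Note that in the cell above the checkpoint callback monitors `val_categorical_accuracy`, while early stopping monitors `val_loss`, so the two callbacks may disagree on which epoch is the best one.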

## **Utilities definition**

This function resizes the input image to a desired size: if both the height and the width of the image are smaller than the target shape, it adds black padding around the image; otherwise it resizes the image with an antialiasing filter.

In [None]:
%tensorflow_version 2.x
import tensorflow as tf

def custom_resize(image, h, w, target_shape):
  t_h, t_w = target_shape[0], target_shape[1]
  if h < t_h and w < t_w:
    image = tf.image.resize_with_crop_or_pad(image, t_h, t_w)
  else:
    image = tf.image.resize(image, (t_h, t_w), antialias=True)
  return image
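
The branch condition deserves a closer look, because it uses `and`: padding is chosen only when both sides are smaller than the target. The sketch below isolates that decision; `resize_strategy` is an illustrative helper and the sizes are made up.

```python
def resize_strategy(h, w, target_shape=(224, 224, 3)):
    """Return 'pad' when both sides are smaller than the target, else 'resize'."""
    t_h, t_w = target_shape[0], target_shape[1]
    return "pad" if h < t_h and w < t_w else "resize"

for size in [(100, 150), (100, 300), (224, 224), (500, 400)]:
    print(size, resize_strategy(*size))
```

An image that is smaller than the target on one side only, such as 100x300, is therefore resized rather than padded, and its aspect ratio is not preserved.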

This function preprocesses the input image by applying a normalization that subtracts, from each of the 3 color channels, the channel mean computed over the whole dataset. It is taken from /GenderRecognitionFramework/training/dataset_tools.py and reported here for convenience.

In [None]:
import numpy as np
import sys
sys.path.append("./GenderRecognitionFramework/training")
from dataset_tools import mean_std_normalize

def vggface2_preprocessing(img):
  ds_means = np.array([131.0912, 103.8827, 91.4953]) # RGB
  ds_stds = None
  img = mean_std_normalize(img, ds_means, ds_stds)
  if (len(img.shape)<3 or img.shape[2]<3):
      img = np.repeat(np.squeeze(img)[:,:,None], 3, axis=2)
  return img
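
The sketch below shows the normalization on a tiny image, assuming that `mean_std_normalize` with `ds_stds=None` reduces to a per-channel mean subtraction (its source is not shown in this notebook). `subtract_channel_means` and the 1x2 image are hypothetical and use nested lists instead of NumPy arrays.

```python
DS_MEANS = (131.0912, 103.8827, 91.4953)  # RGB means from the cell above

def subtract_channel_means(image, means=DS_MEANS):
    """image: nested lists [row][col][channel]; returns a mean-centred copy."""
    return [[[value - mean for value, mean in zip(pixel, means)]
             for pixel in row]
            for row in image]

# hypothetical 1x2 RGB image: one pixel equal to the means and one white pixel
image = [[[131.0912, 103.8827, 91.4953], [255, 255, 255]]]
out = subtract_channel_means(image)

print(out[0][0])
print(out[0][1])
```

A pixel equal to the dataset means maps to zero in every channel, so the network sees inputs centred around zero.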

This is an alternative augmentation, tried for some epochs during our training experiments and then discarded in favour of the other implementation.

In [None]:
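import cv2
import numpy as np

def flip(img):
    # horizontal flip (mirror left-right)
    img=np.fliplr(img)
    return img

def monochrome(x):
    # convert to grayscale, then replicate the single channel 3 times
    x = cv2.cvtColor(x, cv2.COLOR_RGB2GRAY)
    if len(x.shape)==2:
        x = x[:,:,np.newaxis]
    x = np.repeat(x, 3, axis=2)
    return x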

## **Dataset management**

This code block creates a dataset from the TFRecord files uploaded to Drive, distinguishing between test and training/validation TFRecords because they have different record formats.

In [None]:
%tensorflow_version 2.x
import tensorflow as tf
from functools import partial
import numpy as np

def read_tfrecord_test(example):
    tfrecord_format = (
        {
          'path': tf.io.FixedLenFeature([], tf.string),
          'image_raw': tf.io.FixedLenFeature([], tf.string),
        }
    )    
    return tf.io.parse_single_example(example, tfrecord_format)
    
def read_tfrecord(example):
    tfrecord_format = (
        {
          'path': tf.io.FixedLenFeature([], tf.string),
          'height': tf.io.FixedLenFeature([], tf.int64),
          'width': tf.io.FixedLenFeature([], tf.int64),
          'label': tf.io.FixedLenFeature([], tf.int64),
          'image_raw': tf.io.FixedLenFeature([], tf.string),
        }
    )
    return tf.io.parse_single_example(example, tfrecord_format)

def load_dataset(filenames, test):
    ignore_order = tf.data.Options()
    ignore_order.experimental_deterministic = False
    dataset = tf.data.TFRecordDataset(filenames) # create dataset from path passed as input
    dataset = dataset.with_options(ignore_order) # uses data as soon as it streams in, rather than in its original order
    
    # read dataset records according to its type (test or not)
    if not test:
      dataset = dataset.map(partial(read_tfrecord))
    else:
      dataset = dataset.map(partial(read_tfrecord_test))
    return dataset

def get_dataset(filenames, dataset_dim, test=False):
    dataset = load_dataset(filenames, test) 
    if not test: #shuffle elements at each epoch
      dataset = dataset.shuffle(dataset_dim//256, reshuffle_each_iteration=True).repeat()
    #This allows later elements to be prepared while the current element is being processed.
    dataset = dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE) 
    if not test: # set batch size
      dataset = dataset.batch(batch_size)
    return dataset


Creation of the training and validation datasets from the TFRecord files

In [None]:
TRAINING_FILENAMES = "tfrecords/training_set_cropped.record"
VALID_FILENAMES = "tfrecords/validation_set_cropped.record" 

# parameters due to our subset of samples
tot_train_sample = 790487
tot_valid_sample = 344795

train_dataset = get_dataset(TRAINING_FILENAMES, tot_train_sample)
valid_dataset = get_dataset(VALID_FILENAMES, tot_valid_sample)
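
The sample counts above determine two derived quantities: the shuffle buffer size (`dataset_dim//256` in `get_dataset`) and the number of full batches per pass. The sketch below just carries out that arithmetic with the values from this notebook; `derived_sizes` is an illustrative helper.

```python
tot_train_sample = 790487
tot_valid_sample = 344795
batch_size = 128

def derived_sizes(n_samples, batch_size, shuffle_divisor=256):
    """Shuffle buffer size and number of full batches for a dataset of n_samples."""
    return {
        "shuffle_buffer": n_samples // shuffle_divisor,
        "full_batches": n_samples // batch_size,
        "leftover_samples": n_samples % batch_size,
    }

train = derived_sizes(tot_train_sample, batch_size)
valid = derived_sizes(tot_valid_sample, batch_size)

print(train)
print(valid)
```

Since the shuffle buffer is much smaller than the dataset, the shuffling is only approximate: each element is drawn from a window of a few thousand records rather than from the whole set.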

## **Data Iterator with Augmentation and Pre-Processing**

This code block creates an iterator over the previously created dataset, applying:
* *for each image*, augmentation, normalization and resize
* *for each label*, conversion to one-hot encoding

**N.B.** The function *random_monochrome()* taken from the Mivia framework has been modified at line 180 to adapt it to our dataset.

In [None]:
# create data generator
import numpy as np
from keras.utils import Sequence
import random

import sys
sys.path.append("./AgeRecognitionFramework/training")
from dataset_tools import random_flip, random_monochrome, random_brightness_contrast

class Generator(Sequence):
    def __init__(self, batch_size, elements, dataset, validation):
      self.batch_size = batch_size
      self.dataset_iterator = iter(dataset)
      self.elements = elements
      self.validation = validation

    def __len__(self):
      return self.elements // self.batch_size  # number of full batches (np.ceil had no effect after //)

    def data_augmentation(self, image): #potentially all augmentations
        img = random_brightness_contrast(image)
        img = random_monochrome(img)
        return random_flip(img)

    def custom_augmentation(self, image): #exactly one augmentation, randomly chosen and always applied
        aug = random.randint(1,3)
        if aug == 1:
          img = random_brightness_contrast(image)
        elif aug == 2:
          img = monochrome(image)
        else:
          img = flip(image)
        return img

    def decode_image(self, image): #decode bytes to image
        image = tf.image.decode_jpeg(image, channels=3)
        return image

    def __getitem__(self, idx):
        custom = False #True=augmentation with only one kind randomly chosen

        parsing_dict = self.dataset_iterator.get_next() #take a batch of TFRecord
        
        #read each image-label of the batch
        images = []
        labels = []
        for i in range(self.batch_size): 
          images.append((self.decode_image(parsing_dict["image_raw"][i])).numpy())
          labels.append(parsing_dict["label"][i])

        # process images&labels
        processed_images = []
        processed_labels = []        
        
        # process images
        for img in images:
          if not self.validation: #the validation generator does not apply augmentation
            to_aug = random.randint(0,1) #random decision between augmentation and no augmentation
            if to_aug:
              if custom:
                img = self.custom_augmentation(img)
              else:
                img = self.data_augmentation(img)
          img = vggface2_preprocessing(img)
          height, width = img.shape[0], img.shape[1]
          img = custom_resize(img, height, width, TARGET_SHAPE)
          processed_images.append(img)
        processed_images = np.asarray(processed_images)

        # transform labels to categorical
        for l in labels:
          l = np.array(keras.utils.to_categorical(int(l.numpy()), num_classes=NUM_CLASSES))
          processed_labels.append(l)
        processed_labels = np.asarray(processed_labels)

        return processed_images, processed_labels
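
The augmentation policy inside `__getitem__` can be summarized as a small decision procedure. The sketch below reproduces only that decision, returning the names of the functions that would be called. `planned_augmentations`, `PIPELINE` and `SINGLE` are illustrative, and it is an assumption that each `random_*` function decides internally whether to alter the image.

```python
import random

PIPELINE = ["random_brightness_contrast", "random_monochrome", "random_flip"]
SINGLE = ["random_brightness_contrast", "monochrome", "flip"]

def planned_augmentations(rng, validation, custom=False):
    """Names of the augmentation functions that would be called on one image."""
    if validation:
        return []                    # validation images are never augmented
    if rng.randint(0, 1) == 0:
        return []                    # coin flip: leave this training image untouched
    if custom:
        return [rng.choice(SINGLE)]  # exactly one augmentation, always applied
    return list(PIPELINE)            # all three are called (assumed: each may or may not act)

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
plans = [planned_augmentations(rng, validation=False) for _ in range(1000)]
augmented = sum(1 for plan in plans if plan)

print(augmented / len(plans))
```

Roughly half of the training images go through the augmentation pipeline, while validation images always skip it.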

## **Training**

Creation of the training and validation generators on the respective datasets

In [None]:
train_generator = Generator(batch_size, tot_train_sample, train_dataset, validation=False)
valid_generator = Generator(batch_size, tot_valid_sample, valid_dataset, validation=True)

Launch of the training

In [None]:
# Model.fit accepts a Sequence directly (fit_generator is deprecated in TF 2.x)
history = model.fit(
    train_generator,
    epochs=n_epochs,
    verbose=1,
    validation_data=valid_generator,
    initial_epoch=initial_epoch,
    callbacks=callbacks_list,
)
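
When resuming, keep in mind that Keras interprets `epochs` as the index of the final epoch, not as the number of additional epochs to run. The sketch below makes the resulting arithmetic explicit; `epochs_to_run` is an illustrative helper, evaluated on values that appear in this notebook.

```python
def epochs_to_run(n_epochs, initial_epoch):
    """Number of epochs that are actually trained when resuming from initial_epoch."""
    return max(0, n_epochs - initial_epoch)

print(epochs_to_run(25, 0))   # fresh run
print(epochs_to_run(25, 18))  # resumed after the first 18 epochs
print(epochs_to_run(25, 25))  # resumed from checkpoint 25: nothing left to train
```

With `n_epochs = 25` and a resume from epoch 25, as in the output shown earlier, the training call returns without running any epoch; `n_epochs` has to be raised to continue training.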

## **Model saving and analysis plots**

In [None]:
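import os

# save the final model under <log_dir>/model/
# (a leading '/' in the second argument would make os.path.join discard log_dir)
model_dir = os.path.join(log_dir, 'model')
os.makedirs(model_dir, exist_ok=True)
model.save(os.path.join(model_dir, 'resnet50.h5'))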

In [None]:
%load_ext tensorboard
%tensorboard --logdir {log_dir}

In [None]:
import matplotlib.pyplot as plt

# Plot training & validation accuracy values
plt.plot(history.history['categorical_accuracy'])
plt.plot(history.history['val_categorical_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

# Plot training & validation loss values
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
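
Besides the plots, it can be handy to read the best epoch directly from the history. The sketch below does this on a hypothetical dictionary shaped like `history.history`; `best_epoch` and all the numbers are illustrative and are not results of this training.

```python
def best_epoch(history_dict, key="val_categorical_accuracy", mode="max"):
    """Return (1-based epoch, value) of the best entry for `key`."""
    values = history_dict[key]
    pick = max if mode == "max" else min
    index = pick(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# hypothetical history, shaped like `history.history`
history_dict = {
    "val_categorical_accuracy": [0.05, 0.08, 0.11, 0.10],
    "val_loss": [3.9, 3.5, 3.4, 3.6],
}

print(best_epoch(history_dict))
print(best_epoch(history_dict, key="val_loss", mode="min"))
```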