# Age Estimation and Gender Classification

In this assignment, you will train CNN models to estimate a person's age and gender from a face image. Please read all the instructions carefully before you start writing your code.

**Your tasks**

You need to train two CNN models:
- one defined by you, subject to a few restrictions, and trained from scratch; save it as `age_gender_A.h5`
- the other fine-tuned from a pretrained model; save it as `age_gender_B.h5`

**Dataset**

Your models will be trained and validated on a folder `train_val/` containing 5,000 labeled face images (size: 128 x 128), originating from the UTKFace dataset. During marking, your code will be tested on unseen test data. 

**Performance metric**

The metrics for measuring the performance on the test set are:
- age estimation: MAE (Mean Absolute Error)
- gender classification: accuracy
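
As a rough illustration of these two metrics, here is a minimal NumPy sketch. The function names and the 0.5 threshold applied to the gender output are assumptions made for this example; the actual test code is not part of this notebook and may differ.

```python
import numpy as np

def mean_absolute_error(true_ages, predicted_ages):
  # Average absolute difference between true and predicted ages, in years.
  true_ages = np.asarray(true_ages, dtype=float)
  predicted_ages = np.asarray(predicted_ages, dtype=float)
  return float(np.mean(np.abs(true_ages - predicted_ages)))

def gender_accuracy(true_genders, predicted_probs, threshold=0.5):
  # Assumption: the model emits one sigmoid probability per image,
  # which is thresholded to obtain a 0/1 label.
  true_genders = np.asarray(true_genders).astype(int)
  predicted = (np.asarray(predicted_probs, dtype=float) >= threshold).astype(int)
  return float(np.mean(predicted == true_genders))

# Toy values, made up for illustration.
print(mean_absolute_error([20, 35, 60], [24, 30, 61]))
print(gender_accuracy([0, 1, 1, 0], [0.2, 0.9, 0.4, 0.1]))
```

A lower MAE is better (it is measured in years), while a higher accuracy is better.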

**Please use the GPU time wisely.**

Be aware that free GPU usage is limited. (Users can reportedly run for up to 12 hours in a row, but many people have hit the limit in far less than 12 hours.) Here are three suggestions to minimise the risk of reaching the limit.

1. Make sure you have a stable internet connection.
2. Double check all the hyperparameters are good before you start to train the model.
3. In my experience, each model should train in less than 2 hours. If it takes much longer than that, consider adjusting the architecture.

In [None]:
import tensorflow as tf
import numpy as np
import tensorflow.keras as keras
import cv2
import random
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from os import listdir
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers

image_width = image_height = 128
image_channels = 3 # RGB
data_frame_labels = ['Filenames', 'Age', 'Gender']

## Setting Up: Mount Google Drive
Mount your Google Drive to the notebook. 

Also don't forget to **enable GPU** before your training.


In [None]:
#
# Add your code here
#
from google.colab import drive
drive.mount('/content/drive', force_remount = True)
content_dir = "/content/drive/MyDrive/ML2CW1/"

In [None]:
# Load data locally
#### remove before submission
content_dir = '/mnt/g/My Drive/'

In [None]:
image_dir = content_dir + "train_val/"
file_names = listdir(image_dir)

image_paths = list(map(lambda x : image_dir + x, file_names))

labels = []
for name in file_names:
  label = name.split("_")[:2]
  labels.append(label)

## Visualize a few photos
It is always beneficial to know your data well before you start. Display a few (at least 20) images from the `train_val/` folder together with their corresponding age and gender to get a first impression of the dataset. You may also check the size of the images.
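
The labels are read from the file names: the code in this notebook takes the first two underscore-separated fields as age and gender. A minimal sketch of that parsing follows; the helper name and the example file name are hypothetical, chosen only for illustration.

```python
def parse_age_gender(file_name):
  # The first two underscore-separated fields are taken as age and gender,
  # mirroring the name.split("_")[:2] call used when collecting the labels.
  age_text, gender_text = file_name.split("_")[:2]
  return int(age_text), int(gender_text)

# Hypothetical file name, made up for illustration.
print(parse_age_gender("25_1_0_20170116001234567.jpg"))
```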

In [None]:
num_images = 20

random_image_paths = random.sample(image_paths, num_images)

for path in random_image_paths:
  image = mpimg.imread(path)
  # The first two underscore-separated fields of the file name are age and gender
  label = path.split("/")[-1].split("_")[:2]
  plt.figure()
  plt.xlabel("Age: " + label[0] + " Gender: " + label[1])
  plt.imshow(image)

plt.show()

## Rearrange the dataset
You may rearrange the dataset in any way that suits your later processing, such as splitting it into a training set and a validation set, storing the age and gender labels in a convenient form, and so on.
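
One possible arrangement is a reproducible random split into training and validation indices. The sketch below assumes an 80/20 split and a fixed seed; both values and the function name are illustrative choices, not requirements of the assignment.

```python
import numpy as np

def split_indices(num_samples, validation_fraction=0.2, seed=0):
  # Shuffle once with a fixed seed so the split is reproducible.
  rng = np.random.default_rng(seed)
  order = rng.permutation(num_samples)
  num_val = int(round(num_samples * validation_fraction))
  # The first num_val shuffled indices form the validation set.
  return order[num_val:], order[:num_val]

train_idx, val_idx = split_indices(5000)
print(len(train_idx), len(val_idx))
```

Splitting by index keeps each image and its labels together and guarantees that no image appears in both sets.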


In [None]:
#
# Add your code here
#
import pandas as pd


ages = [float(item[0]) for item in labels]
genders = [float(item[1]) for item in labels]
# Build the frame column by column so Age and Gender stay numeric
# (stacking them with the paths in one NumPy array would turn every value into a string)
df = pd.DataFrame(dict(zip(data_frame_labels, [image_paths, ages, genders])))


## STEP1: Data pre-processing
Now you need to do some pre-processing before feeding data into a CNN network. You may consider:

1. Rescale the pixel values (integers between 0 and 255) to [0,1]. **You must do this rescaling.** Otherwise the testing performance will be affected significantly, as the test images will be rescaled in this way. 
2.	Data augmentation.

**Don't rescale the age to [0,1].** Otherwise the testing performance will be affected significantly, as the original age is used in the testing stage. 
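
The two rules above can be sketched with plain NumPy. The random array is a synthetic stand-in for a face image and the helper name is hypothetical; in the notebook itself the same rescaling is done by `ImageDataGenerator(rescale=1./255)`.

```python
import numpy as np

def rescale_pixels(image):
  # Map integer pixel values in [0, 255] to floats in [0, 1].
  return np.asarray(image, dtype=np.float32) / 255.0

# Synthetic stand-in for one 128 x 128 RGB image.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(128, 128, 3)).astype(np.uint8)

scaled = rescale_pixels(fake_image)
print(scaled.shape, scaled.dtype)

# The age label stays in years; it is not rescaled.
age = 42.0
```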

In [None]:
#
# Add your code here
#

image_datagen = ImageDataGenerator(rescale=1./255, rotation_range=20, zoom_range=0.1, horizontal_flip=True, validation_split = 0.2)
train_generator = image_datagen.flow_from_dataframe(df, x_col=data_frame_labels[0], y_col=data_frame_labels[1:], target_size=(image_width,image_height), color_mode='rgb', class_mode='raw', subset='training', batch_size=50)
validation_generator = image_datagen.flow_from_dataframe(df, x_col=data_frame_labels[0], y_col=data_frame_labels[1:], target_size=(image_width,image_height), color_mode='rgb', class_mode='raw', subset='validation', batch_size=50)

print(train_generator.n, train_generator.batch_size, train_generator.n // train_generator.batch_size)
print(validation_generator.n, validation_generator.batch_size, validation_generator.n // validation_generator.batch_size)


In [None]:
def wrap_generator(generator):
  # Pull a fresh batch on every step; fetching once outside the loop would repeat a single batch forever
  while True:
    images, batch_labels = next(generator)
    batch_labels = np.asarray(batch_labels, dtype=np.float32)
    # Split the (batch, 2) label array so that each named output gets its own target
    yield images, {"age_output": batch_labels[:, :1], "gender_output": batch_labels[:, 1:]}

def train_gen_wrapped():
  return wrap_generator(train_generator)

def val_gen_wrapped():
  return wrap_generator(validation_generator)


sample = next(train_gen_wrapped())
sample[1]["age_output"][0:5], sample[1]["gender_output"][0:5]


## STEP2A: Build your own CNN network
Define your own CNN for classifying the gender and predicting the age. Though there are two tasks, you need **only one CNN model, but with two outputs** - you may search online for a solution.

Your network must satisfy the following restrictions.
1. The input size must be 128 x 128 x 3, which means you **should not resize** the original images. This is because my test code relies on this particular input size. Any other size will cause problems in the testing stage.
2. Please treat the gender classification as a binary problem, i.e., **the output layer for the gender branch has only 1 unit**, instead of 2 (it is also valid to treat gender classification as a multi-class problem with 2 classes, in which case the last layer would have 2 units). This is because my test code only works with a 1-unit last layer in the gender branch. 
3. The feature maps fed to the first fully connected layer must be smaller than 10 x 10; there is no limit on their depth.
4.	You may choose any techniques for preventing overfitting. 

At the end of the cell, use `modelA.summary()` to output the model architecture. You may also use `plot_model()` to visualize its architecture.
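
To check the feature-map size restriction before training, the spatial size after each convolution or pooling layer can be computed by hand. The sketch below uses the standard output-size formula; the four-block stack is a hypothetical example, not the required architecture.

```python
def output_size(input_size, window, stride=1, padding=0):
  # Spatial size after a convolution or pooling layer:
  # floor((input - window + 2 * padding) / stride) + 1
  return (input_size - window + 2 * padding) // stride + 1

# Hypothetical stack: four rounds of a 3x3 'valid' convolution
# followed by 2x2 max pooling with stride 2.
size = 128
for _ in range(4):
  size = output_size(size, window=3)            # 3x3 convolution, stride 1
  size = output_size(size, window=2, stride=2)  # 2x2 max pooling, stride 2
  print(size)
```

The final value is the width and height of the feature maps that would reach the first fully connected layer, which can be compared against the 10 x 10 limit.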

In [None]:
#
# Add your code here
#

# Can only really tune hyper-params, everything-else is managed by Keras.
# Hyper-params:
# - Kernel Size (note: the 10x10 limit applies to the feature maps fed to the first fully connected layer, not to the kernels)
# - Number of Layers (Though somewhat dependent on kernel sizes? Must also be reasonable, presumably nothing crazy like 100 layers...)
# - Activation Function (probably some variation on ReLU, e.g. LeakyReLU)
# - Pooling Layers (Where to use, e.g. between each layer, or spread out a bit)
# - Number of 'Features', e.g. how many kernels for a given layer
# - Using Single Layer & Kernel Size Vs Multiple Layers With Smaller Kernels (e.g. one 5x5 has the same receptive field as two stacked 3x3 layers (with no pooling); the latter reduces the number of params, but doubles the number of layers)
# - Where to branch (e.g. how many layers do we share with each branch: split right away, or at point of classification? Basically how many features are reused between branches)

# Not sure how best to tune the hyper-params to ensure avoid over-fitting to data, e.g. not just re-running with different config over-and-over.

# Train w/ & w/out for comparision w/in the report, see if any actual performance benefits?
#greyscale = Lambda(lambda c: tf.image.rgb_to_grayscale(c))(inputs) # Might want to branch here as colour could be helpful to distinguish grey/white hair from other colours, might be helpful for age branch

# Need one branch for age, other for gender
#   Worth encapsulating into methods to create each branch?

# Conv2d args (at least obviously important ones): https://keras.io/api/layers/convolution_layers/convolution2d/
# tf.keras.layers.Conv2D(
#     filters,     # Number of filters/features, depth of next layer
#     kernel_size, # Can either be int: n for n*n, or (int, int): (n, m) for n*m, can also do additional dimensions, e.g. to try and reduce the depth of prior layers
#     strides=(1, 1),
#     padding="valid",
#     ...
#     activation=None,
#     ...
# )

# MaxPooling2D & AveragePooling2D args: https://keras.io/api/layers/pooling_layers/max_pooling2d/  https://keras.io/api/layers/pooling_layers/average_pooling2d/
# tf.keras.layers.MaxPooling2D/AveragePooling2D(
#     pool_size=(2, 2), 
#     strides=None, 
#     padding="valid", 
#     ...
# )

# Layer Dimensions Calculations:
#   new_width  = floor((old_width - kernel_width + (2 * padding_x)) / stride_x) + 1
#   new_height = floor((old_height - kernel_height + (2 * padding_y)) / stride_y) + 1
#   new_depth  = number_of_filters # for Conv2D; pooling layers keep old_depth

def create_modelA(greyscale):
  inputs = keras.Input((image_width, image_height, image_channels))

  if (greyscale == 2): # only apply greyscale to gender branch 
    gender_inputs = layers.Lambda(lambda c: tf.image.rgb_to_grayscale(c))(inputs)
    gender_branch = create_modelA_common_layers(gender_inputs)
    gender_branch = create_gender_branch(gender_branch)

    age_branch = create_modelA_common_layers(inputs)
    age_branch = create_age_branch(age_branch)
  else: # greyscale for both or neither branch
    # greyscale == 1 converts the shared input to greyscale; any other value keeps RGB
    shared_inputs = layers.Lambda(lambda c: tf.image.rgb_to_grayscale(c))(inputs) if greyscale == 1 else inputs
    common_layers = create_modelA_common_layers(shared_inputs)

    gender_branch = create_gender_branch(common_layers)
    age_branch = create_age_branch(common_layers)

  # Creating 1 model w/ two branches per https://pyimagesearch.com/2018/06/04/keras-multiple-outputs-and-multiple-losses/
  return keras.Model(inputs=inputs, outputs=[age_branch, gender_branch], name="ModelA")

def create_modelA_common_layers(inputs):
  
  common_layers = layers.Conv2D(256, 5)(inputs) # 124, 124, 256
  common_layers = layers.LeakyReLU(0.2)(common_layers)
  common_layers = layers.Conv2D(128, 5)(common_layers) # 120, 120, 128
  # common_layers = layers.LeakyReLU(0.2)(common_layers)
  # common_layers = layers.Conv2D(128, 5)(common_layers) # 116, 116, 128
  # Default axis=-1 normalises per channel for channels-last data (axis=1 would normalise per image row)
  common_layers = layers.BatchNormalization()(common_layers)
  common_layers = layers.MaxPooling2D((9,9), strides=(2,2))(common_layers) # 56, 56, 128

  common_layers_a = layers.Conv2D(64, 5)(common_layers) # 52, 52, 64
  common_layers_a = layers.Conv2D(64, 5)(common_layers_a)  # 48, 48, 64
  common_layers_a = layers.LeakyReLU(0.2)(common_layers_a)

  common_layers_b = layers.Conv2D(64, 9)(common_layers) # 48, 48, 64
  common_layers_b = layers.LeakyReLU(0.2)(common_layers_b)

  common_layers = layers.Add()([common_layers_a, common_layers_b])
  common_layers = layers.BatchNormalization()(common_layers)

  common_layers = layers.MaxPooling2D((7,7), (4,4))(common_layers) # 11, 11, 64
  # common_layers = layers.Conv2D(128, 3)(common_layers) # 9, 9, 128
  
  common_layers_c = layers.Conv2D(128, 3, padding='same')(common_layers) # 11, 11, 128 # padding to keep output same size as input
  common_layers_c = layers.LeakyReLU(0.2)(common_layers_c)
  common_layers_c = layers.Conv2D(64, 3, padding='same')(common_layers_c) # 11, 11, 64
  common_layers_c = layers.LeakyReLU(0.2)(common_layers_c)
  common_layers_c = layers.Conv2D(64, 3, padding='same')(common_layers_c) # 11, 11, 64
  common_layers_c = layers.LeakyReLU(0.2)(common_layers_c)
  common_layers_c = layers.Conv2D(32, 3, padding='same')(common_layers_c) # 11, 11, 32
  common_layers_c = layers.BatchNormalization()(common_layers_c)

  common_layers_d = layers.Conv2D(128, 3, padding='same')(common_layers) # 11, 11, 128 # padding to keep output same size as input
  common_layers_d = layers.LeakyReLU(0.2)(common_layers_d)
  common_layers_d = layers.Conv2D(32, 3, padding='same')(common_layers_d) # 11, 11, 32
  common_layers_d = layers.BatchNormalization()(common_layers_d) 
  
  common_layers = layers.Add()([common_layers_c, common_layers_d])
  common_layers = layers.GlobalAveragePooling2D()(common_layers) # 32 (one value per channel)

  return common_layers

# Probably just need fully connected layers here, no more convolutions?
def create_gender_branch(inputs):
  gender_branch = layers.Dense(128)(inputs)
  gender_branch = layers.LeakyReLU(0.2)(gender_branch)

  gender_branch_a = layers.Dense(256)(gender_branch)
  gender_branch_a = layers.LeakyReLU(0.2)(gender_branch_a)
  gender_branch_a = layers.Dropout(0.05)(gender_branch_a)
  gender_branch_a = layers.Dense(64)(gender_branch_a)
  gender_branch_a = layers.LeakyReLU(0.2)(gender_branch_a)
  gender_branch_a = layers.Dropout(0.05)(gender_branch_a)
  gender_branch_a = layers.Dense(32)(gender_branch_a)
  gender_branch_a = layers.LeakyReLU(0.2)(gender_branch_a)
  gender_branch_a = layers.Dropout(0.05)(gender_branch_a)

  gender_branch_b = layers.Dense(32)(gender_branch)
  gender_branch_b = layers.LeakyReLU(0.2)(gender_branch_b)

  gender_branch = layers.Add()([gender_branch_a, gender_branch_b])
  gender_branch = layers.LeakyReLU(0.2)(gender_branch)
  gender_branch = layers.Dropout(0.05)(gender_branch)
  gender_branch = layers.Dense(2)(gender_branch)
  gender_branch = layers.ReLU()(gender_branch)
  gender_branch = layers.Dropout(0.05)(gender_branch)
  gender_branch = layers.Dense(1)(gender_branch)
  gender_branch = layers.Activation('sigmoid', name='gender_output')(gender_branch)
  return gender_branch

def create_age_branch(inputs):
  age_branch = layers.Dense(256)(inputs)
  age_branch = layers.LeakyReLU(0.2)(age_branch)
  age_branch = layers.Dropout(0.05)(age_branch)
  age_branch = layers.Dense(128)(age_branch)
  age_branch = layers.LeakyReLU(0.2)(age_branch)
  age_branch = layers.Dropout(0.05)(age_branch)
  age_branch = layers.Dense(16)(age_branch)
  age_branch = layers.LeakyReLU(0.2)(age_branch)
  age_branch = layers.Dense(1, name='age_output')(age_branch)

  return age_branch

modelA = create_modelA(0)

modelA.summary()
from tensorflow.keras.utils import plot_model
plot_model(modelA, show_shapes=True)

## STEP3A: Compile and train your model
Compile and train your model here. 
Save your model by `modelA.save(your_model_folder+"age_gender_A.h5")` after training. 

**DON'T use any other name for your model file.** This is because my test code relies on this particular model name. Any other file name will cause problems in the testing stage.

**Save the model with `save()` instead of `save_weights()`.** This is because I will load the model by 

`modelA = load_model(model_folder+"age_gender_A.h5")`. 


In [None]:
#
# Add your code here
#

losses = {
  "age_output": keras.losses.MeanAbsoluteError(),
  "gender_output": keras.losses.BinaryCrossentropy()
}
metrics = {
  "age_output": keras.metrics.MeanAbsoluteError(),
  "gender_output": keras.metrics.BinaryAccuracy()
}

# compile model (compile() configures the model in place and returns None)
print("Compiling Model A")
modelA.compile(loss=losses, optimizer=keras.optimizers.Adam(learning_rate=1e-3), metrics=metrics)

modelA_epoch_count = 50 # ?
modelA_train_steps_per_epoch = train_generator.n // train_generator.batch_size
modelA_val_steps_per_epoch = validation_generator.n // validation_generator.batch_size

print("training steps per epoch: {}\nvalidation steps per epoch: {}".format(modelA_train_steps_per_epoch, modelA_val_steps_per_epoch))

print("Fitting Model A")
modelA_history = modelA.fit(
    x=train_gen_wrapped(),
    validation_data=val_gen_wrapped(),
    epochs=modelA_epoch_count,
    steps_per_epoch=modelA_train_steps_per_epoch,
    validation_steps=modelA_val_steps_per_epoch
)

modelA.save(content_dir+"age_gender_A.h5")

## STEP4A: Draw the learning curves
Draw four figures as follows
1.	The loss of the gender classification over the training and validation set
2.	The accuracy of the gender classification over the training and validation set
3.	The loss of the age estimation over the training and validation set
4.	The MAE of the age estimation over the training and validation set
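
Keras stores the per-epoch values in `History.history`, a dictionary in which each validation series carries a `val_` prefix. The sketch below pairs each training series with its validation counterpart before plotting; the helper name is hypothetical and the numbers are made up for illustration, not real training results.

```python
def paired_curves(history):
  # Pair each training series with its "val_"-prefixed validation series.
  pairs = {}
  for name, values in history.items():
    if not name.startswith("val_"):
      pairs[name] = (values, history.get("val_" + name))
  return pairs

# Made-up numbers for illustration only; these are not real training results.
example_history = {
  "gender_output_loss": [0.69, 0.55, 0.48],
  "val_gender_output_loss": [0.70, 0.60, 0.57],
  "age_output_loss": [15.2, 11.8, 9.9],
  "val_age_output_loss": [15.9, 12.7, 11.4],
}

for name, (train, val) in paired_curves(example_history).items():
  epochs = list(range(1, len(train) + 1))  # one point per completed epoch
  print(name, epochs, train, val)
```

Plotting both series of a pair on the same axes makes it easy to see where the validation curve starts to diverge from the training curve.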


In [None]:
#
# Add your code here
#
# One integer x-value per completed epoch (1..N)
epochs = np.arange(1, modelA_epoch_count + 1)

fig, (gender_loss, gender_accuracy) = plt.subplots(2, sharex=True)
fig.suptitle("Gender Learning Curves")
fig.set_size_inches(12,8)

gender_loss.set_xlabel("Epoch")
gender_loss.set_ylabel("Gender Loss")
gender_loss.plot(epochs, modelA_history.history['gender_output_loss'], label='training')
gender_loss.plot(epochs, modelA_history.history['val_gender_output_loss'], label='validation')
gender_loss.legend()

gender_accuracy.set_xlabel("Epoch")
gender_accuracy.set_ylabel("Gender Accuracy")
gender_accuracy.plot(epochs, modelA_history.history['gender_output_binary_accuracy'], label='training')
gender_accuracy.plot(epochs, modelA_history.history['val_gender_output_binary_accuracy'], label='validation')
gender_accuracy.legend()

plt.show()

fig, (age_loss, age_mae) = plt.subplots(2, sharex=True)
fig.suptitle("Age Learning Curves")
fig.set_size_inches(12,8)

age_loss.set_xlabel("Epoch")
age_loss.set_ylabel("Age Loss")
age_loss.plot(epochs, modelA_history.history['age_output_loss'], label='training')
age_loss.plot(epochs, modelA_history.history['val_age_output_loss'], label='validation')
age_loss.legend()

age_mae.set_xlabel("Epoch")
age_mae.set_ylabel("Age Mean Absolute Error (MAE)")
age_mae.plot(epochs, modelA_history.history['age_output_mean_absolute_error'], label='training')
age_mae.plot(epochs, modelA_history.history['val_age_output_mean_absolute_error'], label='validation')
age_mae.legend()

plt.show()

## STEP2B: Build a CNN network based on a pre-trained model 
Choose one existing CNN architecture pre-trained on ImageNet, and fine-tune on this dataset.

As with Model A, **don’t resize the input images**, and **the output layer for the gender branch must have only 1 unit**. 

At the end of the cell, use `modelB.summary()` to output the model architecture. You may also use `plot_model()` to visualize its architecture.


In [None]:
#
# Add your code here
#

# list of available models: https://keras.io/api/applications/
from tensorflow.keras.applications import ResNet50 # quick, lightweight, and fairly accurate

inputs_b = keras.Input((image_width, image_height, image_channels))
base_resnet50 = ResNet50(input_tensor=inputs_b, weights='imagenet', include_top=False)

globalPool = layers.GlobalAveragePooling2D()(base_resnet50.output)

gender_branch = create_gender_branch(globalPool)
age_branch = create_age_branch(globalPool)

modelB = keras.Model(inputs=base_resnet50.inputs, outputs=[age_branch, gender_branch])

modelB.summary()
from tensorflow.keras.utils import plot_model
plot_model(modelB, show_shapes=True)

## STEP3B: Compile and train your model
Compile and train your model here. 
Save your model to `age_gender_B.h5` after training. 

**DON'T use any other name for your model file.** This is because my test code relies on this particular model name. Any other file name will cause problems in the testing stage.

**Also, save the model with `save()` instead of `save_weights()`.** 


In [None]:
#
# Add your code here
#
losses = {
  "age_output": keras.losses.MeanAbsoluteError(),
  "gender_output": keras.losses.BinaryCrossentropy()
}
metrics = {
  "age_output": keras.metrics.MeanAbsoluteError(),
  "gender_output": keras.metrics.BinaryAccuracy()
}

# compile() configures the model in place and returns None
print("Compiling Model B")
modelB.compile(loss=losses, optimizer='adam', metrics=metrics)

modelB_epoch_count = 50 # ?
modelB_train_steps_per_epoch = train_generator.n // train_generator.batch_size
modelB_val_steps_per_epoch = validation_generator.n // validation_generator.batch_size

modelB_history = modelB.fit(
    x=train_gen_wrapped(),
    validation_data=val_gen_wrapped(),
    epochs=modelB_epoch_count,
    steps_per_epoch=modelB_train_steps_per_epoch,
    validation_steps=modelB_val_steps_per_epoch
)

modelB.save(content_dir+"age_gender_B.h5")

## STEP4B: Draw the learning curve
Draw four figures as follows
1.	The loss of the gender classification over the training and validation set
2.	The accuracy of the gender classification over the training and validation set
3.	The loss of the age estimation over the training and validation set
4.	The MAE of the age estimation over the training and validation set

In [None]:
#
# Add your code here
#
# One integer x-value per completed epoch (1..N)
epochs = np.arange(1, modelB_epoch_count + 1)

fig, (gender_loss, gender_accuracy) = plt.subplots(2, sharex=True)
fig.suptitle("Gender Learning Curves")
fig.set_size_inches(12,8)

gender_loss.set_xlabel("Epoch")
gender_loss.set_ylabel("Gender Loss")
gender_loss.plot(epochs, modelB_history.history['gender_output_loss'], label='training')
gender_loss.plot(epochs, modelB_history.history['val_gender_output_loss'], label='validation')
gender_loss.legend()

gender_accuracy.set_xlabel("Epoch")
gender_accuracy.set_ylabel("Gender Accuracy")
gender_accuracy.plot(epochs, modelB_history.history['gender_output_binary_accuracy'], label='training')
gender_accuracy.plot(epochs, modelB_history.history['val_gender_output_binary_accuracy'], label='validation')
gender_accuracy.legend()

plt.show()

fig, (age_loss, age_mae) = plt.subplots(2, sharex=True)
fig.suptitle("Age Learning Curves")
fig.set_size_inches(12,8)

age_loss.set_xlabel("Epoch")
age_loss.set_ylabel("Age Loss")
age_loss.plot(epochs, modelB_history.history['age_output_loss'], label='training')
age_loss.plot(epochs, modelB_history.history['val_age_output_loss'], label='validation')
age_loss.legend()

age_mae.set_xlabel("Epoch")
age_mae.set_ylabel("Age Mean Absolute Error (MAE)")
age_mae.plot(epochs, modelB_history.history['age_output_mean_absolute_error'], label='training')
age_mae.plot(epochs, modelB_history.history['val_age_output_mean_absolute_error'], label='validation')
age_mae.legend()

plt.show()

## STEP5: Evaluate the model on the test set
I will add my test code here to test the two models you trained. The test set will not be available before your submission. 

The metrics for measuring the performance on the test set are:
- age estimation: MAE (Mean Absolute Error)
- gender classification: accuracy


In [None]:
#
# Don't add code in this cell when submitting this file
#