<a href="https://colab.research.google.com/github/nyp-sit/iti107/blob/main/session-3/1.baseline_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Baseline model

Welcome to this week's programming exercise. In this exercise, we will train a model to recognise whether an image depicts a positive emotion (e.g. happy, pleasant, beautiful) or a negative one (e.g. sad, angry, death). We will first train a baseline model without transfer learning. The dataset is a collection of around 1600 images from Flickr, each labelled Positive or Negative. We apply data augmentation only to our training set. In the next exercise, we will use transfer learning to train another model and compare the performance of the two.

At the end of this exercise, you will be able to: 
- apply data augmentation to your training data

In [None]:
import os
import numpy as np

import tensorflow as tf
import tensorflow.keras as keras

from sklearn.metrics import classification_report

import matplotlib
import matplotlib.pyplot as plt


## Download the dataset


In [None]:
dataset_URL = 'https://nyp-aicourse.s3.ap-southeast-1.amazonaws.com/iti107/datasets/intel_emotions_dataset.zip'
path_to_zip = keras.utils.get_file('intel_emotions_dataset.zip', origin=dataset_URL, extract=True, cache_dir='.')
print(path_to_zip)

The zip file will be expanded into two subfolders, 'Positive' and 'Negative', containing images that evoke positive and negative emotions respectively.

In [None]:
dataset_dir = os.path.dirname(path_to_zip)
print(dataset_dir)
pos_path = os.path.join(dataset_dir, 'Positive')
neg_path = os.path.join(dataset_dir, 'Negative')
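
Before going further, it can help to confirm that the extraction produced roughly what we expect. The following is only a minimal sketch: `count_images` is a hypothetical helper (not part of the original notebook), and the list of file extensions is an assumption about how the images are stored.

```python
import os

# Assumed set of image extensions; adjust if the dataset uses others.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".gif")

def count_images(folder):
    """Count regular files in `folder` whose extension looks like an image."""
    return sum(
        1
        for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
        and os.path.isfile(os.path.join(folder, name))
    )

# Usage in this notebook (pos_path and neg_path are defined in the cell above):
# print("Positive:", count_images(pos_path))
# print("Negative:", count_images(neg_path))
```

If the two counts differ a lot, the classes are imbalanced, which is worth keeping in mind when reading the accuracy later.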

### Visualizing sample images

We randomly select `n_examples` images from each class and display them.

**WARNING**: Some of the images may be too graphic and offensive. Please feel free to skip the following two cells. 

In [None]:
n_examples = 5
np.random.seed(42)
positive_expamples = np.random.choice(os.listdir(pos_path), size=n_examples, replace=False)
negative_expamples = np.random.choice(os.listdir(neg_path), size=n_examples, replace=False)

In [None]:
plt.figure(figsize=(5, n_examples * 2))
for i in range(n_examples):
    plt.subplot(n_examples, 2, i * 2 + 1)
    img = keras.utils.load_img(os.path.join(pos_path, positive_expamples[i]))
    plt.imshow(img)
    plt.axis("off")
    if i == 0:
        plt.title("Positive", fontsize=18)
    plt.subplot(n_examples, 2, i * 2 + 2)
    img = keras.utils.load_img(os.path.join(neg_path, negative_expamples[i]))
    plt.imshow(img)
    plt.axis("off")
    if i == 0:
        plt.title("Negative", fontsize=18)

## Create train and validation dataset

We will use `tf.keras.preprocessing.image_dataset_from_directory` to generate a `tf.data.Dataset` from the data folder. Feel free to increase `batch_size` to the largest value that does not cause an OOM (out-of-memory) error, since GPUs usually have limited memory. We also use a smaller image size (128, 128) to speed up our training. Although `label_mode` does not have to be specified (it defaults to `'int'`), we explicitly set `label_mode='binary'` in case our dataset folder contains more than 2 subfolders: Jupyter Notebook sometimes generates a hidden folder called `.ipynb_checkpoints`, and Keras may then think there are 3 different labels. Setting `label_mode` to binary lets us detect this kind of issue.

In [None]:
batch_size = 8
image_size = (128,128)

train_ds = keras.preprocessing.image_dataset_from_directory(
    dataset_dir,
    validation_split=0.2,
    subset="training",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
    label_mode='binary'
)
val_ds = keras.preprocessing.image_dataset_from_directory(
    dataset_dir,
    validation_split=0.2,
    subset="validation",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
    label_mode='binary'
)
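
To make the split concrete, here is a rough sketch of the arithmetic behind `validation_split` and `batch_size`. It assumes exactly 1600 images (the dataset is only described as "around 1600") and assumes the validation share is rounded down to a whole number of images; `split_sizes` is a hypothetical helper, not a Keras function.

```python
import math

def split_sizes(n_images, validation_split, batch_size):
    """Return (n_train, n_val, train_batches, val_batches) for a simple split."""
    n_val = int(validation_split * n_images)  # assumed: rounded down
    n_train = n_images - n_val
    return (
        n_train,
        n_val,
        math.ceil(n_train / batch_size),  # a final partial batch still counts
        math.ceil(n_val / batch_size),
    )

print(split_sizes(1600, 0.2, 8))  # (1280, 320, 160, 40)
```

Under these assumptions, each epoch runs about 160 training steps. A larger `batch_size` reduces the number of steps but needs more GPU memory per step.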


In [None]:
# Print the class names 
print(val_ds.class_names)
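
If the class names are not what you expect, a stray subfolder is a likely cause. The sketch below simply lists the subfolders of a directory and separates hidden ones (such as `.ipynb_checkpoints`) from the rest. `find_class_folders` is a hypothetical helper, and it makes no claim about how Keras itself treats hidden folders.

```python
import os

def find_class_folders(data_dir):
    """Split the subfolders of `data_dir` into (visible, hidden) name lists."""
    subdirs = sorted(
        name
        for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    )
    visible = [name for name in subdirs if not name.startswith(".")]
    hidden = [name for name in subdirs if name.startswith(".")]
    return visible, hidden

# Usage in this notebook (dataset_dir is defined earlier):
# visible, hidden = find_class_folders(dataset_dir)
# print("Class folders:", visible)
# print("Hidden folders to consider removing:", hidden)
```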

In [None]:
train_ds = train_ds.cache().prefetch(buffer_size=tf.data.AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=tf.data.AUTOTUNE)

## Data Augmentation 

Since TensorFlow 2.2, Keras has provided new types of layers for image data augmentation, such as random cropping and random flipping. Previously, we had to depend on `ImageDataGenerator()` (which is a lot slower) to do so. Before TensorFlow 2.6, these layers were experimental (available in the `tf.keras.layers.experimental.preprocessing` package); from TensorFlow 2.6 onwards they are officially supported as part of `tf.keras.layers`.

In the code below, we create a Sequential model that holds one image augmentation layer: `RandomRotation()`. The value `0.3` is the maximum rotation, expressed as a fraction of a full turn (2π), in both the clockwise and anti-clockwise directions. You can find out more in the [documentation](https://www.tensorflow.org/api_docs/python/tf/keras/layers/RandomRotation).

In [None]:
data_augmentation = keras.Sequential(
    [
        keras.layers.RandomRotation(0.3),
    ]
)
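
As a quick sanity check on the rotation factor, the sketch below converts a `RandomRotation` factor into degrees, treating the factor as a fraction of a full 360° turn. `rotation_range_degrees` is a hypothetical helper written for illustration only.

```python
def rotation_range_degrees(factor):
    """Convert a rotation factor (fraction of a full turn) to a degree range."""
    max_angle = factor * 360.0
    return (-max_angle, max_angle)

print(rotation_range_degrees(0.3))  # about (-108.0, 108.0)
```

A factor of `0.3` therefore allows fairly large rotations. A smaller value may be more suitable if heavily rotated images no longer look realistic.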

To see the effects of data augmentation, let us apply our data_augmentation layer to a sample image.

In [None]:
images, _ = next(train_ds.take(1).as_numpy_iterator())
sample_image = images[0]/255.
plt.imshow(sample_image)
sample_image = tf.expand_dims(sample_image, 0)
print(sample_image.shape)

In [None]:
plt.figure(figsize=(8, 4))
for i in range(8):
    augmented_image = data_augmentation(sample_image)
    ax = plt.subplot(2, 4, i + 1)
    plt.imshow(augmented_image[0])
    plt.axis("off")

**Exercise 1:**

Modify the code above to add Random Contrast and Random Zoom. Choose appropriate values for the contrast and zoom factors.

<details><summary>Click here for answer</summary>

```python
    
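# Compare version numbers as integers: as strings, '2.10.0' would sort below '2.6.0'.
tf_version = tuple(int(part) for part in tf.version.VERSION.split('.')[:2])

if tf_version >= (2, 6):
    data_augmentation = keras.Sequential(
        [
            keras.layers.RandomRotation(0.3),
            keras.layers.RandomContrast(0.8),
            keras.layers.RandomZoom(0.8)
        ]
    )
else: 
    data_augmentation = keras.Sequential(
        [
            keras.layers.experimental.preprocessing.RandomRotation(0.3),
            keras.layers.experimental.preprocessing.RandomContrast(0.8),
            keras.layers.experimental.preprocessing.RandomZoom(0.8),
        ]
    )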

    
```
    
</details>

## Build the model

Previously, we built the mini-Xception network, and it worked well on our small cats and dogs dataset. We will apply the same network to this more challenging emotions dataset and see if data augmentation helps.

The following code is the same as the Xception network that you coded previously.

**Exercise 2:**

Modify the code in `make_model()` to apply the data augmentation layers you created earlier. Where should you place your augmentation layer?

<details><summary>Click here for answer</summary>

```python
def make_model(input_shape, num_classes): 
    inputs = keras.Input(shape=input_shape)    
    
    ## Add your augmentation layers here !! 
    x = data_augmentation(inputs) 

    # Rescale the augmented tensor x (not inputs), otherwise the augmentation is bypassed
    x = keras.layers.Rescaling(1.0 / 255)(x)

    ## the rest of the code.... 
    
    return keras.Model(inputs, outputs)    
```
    
</details>

In [None]:
def xception_block(x, depth): 

    skip_connection = x
    
    x = keras.layers.SeparableConv2D(depth, 3, padding="same")(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.Activation("relu")(x)
    x = keras.layers.SeparableConv2D(depth, 3, padding="same")(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.MaxPooling2D(3, strides=2, padding="same")(x)
    residual = keras.layers.Conv2D(depth, 1, strides=2, padding="same")(
        skip_connection
    )
    x = keras.layers.add([x, residual])  # Add back residual
    x = keras.layers.Activation("relu")(x)
    
    return x  # Output becomes the input (and skip connection) of the next block

In [None]:
## TODO: Modify the code to add data augmentation

def make_model(input_shape, num_classes): 
    inputs = keras.Input(shape=input_shape)
    x = data_augmentation(inputs) 
    # Rescale the augmented tensor x; passing `inputs` here would bypass the augmentation
    x = keras.layers.Rescaling(1.0 / 255)(x)

    x = keras.layers.Conv2D(32, 3, strides=2, padding="same")(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.Activation("relu")(x)
    x = keras.layers.Conv2D(64, 3, padding="same")(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.Activation("relu")(x)
    
    # our xception blocks, each halving the spatial size and widening the channels
    for size in [128, 256, 512, 728]:
        x = xception_block(x, size)
    
    x = keras.layers.SeparableConv2D(1024, 3, padding="same")(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.Activation("relu")(x)
    x = keras.layers.GlobalAveragePooling2D()(x)
    
    if num_classes == 2:
        activation = "sigmoid"
        units = 1
    else:
        activation = "softmax"
        units = num_classes

    x = keras.layers.Dropout(0.5)(x)
    outputs = keras.layers.Dense(units, activation=activation)(x)
    
    return keras.Model(inputs, outputs)
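
To see why a small input size speeds up training, it helps to trace how the spatial size shrinks through the network. The sketch below assumes that every stride-2 layer with `padding="same"` produces an output of size `ceil(input / 2)`. The helper names are hypothetical and the code does not build any Keras layers.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a conv/pool layer with padding="same"."""
    return math.ceil(size / stride)

def trace_spatial_size(input_size=128, n_blocks=4):
    """List the feature-map side length after each downsampling step."""
    sizes = [input_size]
    size = same_padding_output(input_size, 2)  # first Conv2D uses strides=2
    sizes.append(size)
    for _ in range(n_blocks):
        # each xception_block downsamples once via MaxPooling2D(strides=2)
        size = same_padding_output(size, 2)
        sizes.append(size)
    return sizes

print(trace_spatial_size())  # [128, 64, 32, 16, 8, 4]
```

With a 128x128 input, the last block outputs 4x4 feature maps, which `GlobalAveragePooling2D` then reduces to a single vector.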

In [None]:
model = make_model(input_shape= image_size + (3,), num_classes=2)

## Train the model

Let's train our new model with the data augmentation layer. 

In [None]:
def create_tb_callback(): 

    import os
    
    root_logdir = os.path.join(os.curdir, "tb_logs")

    def get_run_logdir():    # use a new directory for each run
        
        import time
        
        run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
        return os.path.join(root_logdir, run_id)

    run_logdir = get_run_logdir()

    tb_callback = keras.callbacks.TensorBoard(run_logdir)

    return tb_callback

model_checkpoint_callback = keras.callbacks.ModelCheckpoint(
    filepath="bestcheckpoint",
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)


# compile our model with loss and optimizer 
model.compile(
    optimizer=keras.optimizers.Adam(1e-3),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

model.fit(
    train_ds, 
    epochs=15, 
    validation_data=val_ds,
    callbacks=[model_checkpoint_callback, create_tb_callback()]
)
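
The `ModelCheckpoint` callback above keeps only the weights from the epoch with the highest `val_accuracy`. If you store the return value of `model.fit` (for example `history = model.fit(...)`), then `history.history["val_accuracy"]` holds one value per epoch and you can locate that epoch yourself. The sketch below uses a hypothetical helper and made-up numbers, not results from this notebook.

```python
def best_epoch(val_accuracy):
    """Return (1-based epoch, value) of the highest validation accuracy."""
    best_index = max(range(len(val_accuracy)), key=lambda i: val_accuracy[i])
    return best_index + 1, val_accuracy[best_index]

# Made-up numbers for illustration only; they are NOT results from this notebook.
example_val_accuracy = [0.50, 0.53, 0.55, 0.57, 0.56, 0.54]
print(best_epoch(example_val_accuracy))  # (4, 0.57)
```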

In [None]:
%load_ext tensorboard
%tensorboard --logdir tb_logs


As you can see from the plot, our model starts to overfit from epoch 10 onwards, and the validation accuracy fluctuates around 56-57%. Even with data augmentation, the augmented images are still heavily correlated, since they come from a small number of original images: we cannot produce new information, we can only remix existing information. As the next step to improve our accuracy on this problem, we will leverage transfer learning with a pre-trained model, which will be the focus of the next exercises.