<img style="float: left;" src="./images/PalleAI-Banner1.png" width="800">

# Data Augmentation

### Build a model by training on the CIFAR-10 dataset, now with data augmentation

<img style="float: left;" src="./images/cifar3.png" width="1000">

### Import needed libraries 

In [1]:
#Basic Python packages for data wrangling
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import random

#Tensorflow & Keras related packages
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers  # use the Keras bundled with TensorFlow, matching the import above

from utils import plot_history

In [2]:
# same steps as before. 

### Load the CIFAR-10 Dataset Bundled with Keras

In [3]:
from tensorflow.keras.datasets import cifar10

In [4]:
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

### Understanding the Data 

In [5]:
type(train_images)

numpy.ndarray

In [6]:
train_images.shape 

(50000, 32, 32, 3)

In [7]:
train_labels.shape 

(50000, 1)

In [8]:
test_images.shape

(10000, 32, 32, 3)

In [9]:
test_labels.shape

(10000, 1)

### Preprocess the Input Data

In [10]:
train_images.shape

(50000, 32, 32, 3)

In [11]:
# Scale the data
#-------------------------------------------------------------
train_images = train_images.astype("float32") / 255 
test_images = test_images.astype("float32") / 255 

### Build the Neural Network Model Architecture

In [12]:
# Set random seeds for reproducibility
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)

In [13]:
# Build a similar neural network, this time with augmentation layers placed before the first convolution

def model_cifar_augumented(): 
    inputs = keras.Input(shape = (32,32,3)) # Define Input shape
    
    augumentation_layers = keras.Sequential([layers.RandomFlip(), layers.RandomRotation(0.1),
                                             layers.RandomZoom(0.1), layers.RandomTranslation(0.2,0.2)])
    
    x = augumentation_layers(inputs)

    x = layers.Conv2D(filters=32, kernel_size = 3, activation="relu")(x) 
    # Convolution Layer with no padding and stride=1 (default)
    
    x = layers.MaxPooling2D(pool_size=2, strides = (2,2))(x) 
    # MaxPool Layer with size = 2 x 2, strides = 2

    x = layers.Conv2D(filters=64, kernel_size = 3, activation="relu")(x) 
    # Convolution Layer with no padding and stride=1 (default)
    
    x = layers.MaxPooling2D(pool_size=2, strides = (2,2))(x) 
    # MaxPool Layer with size = 2 x 2, strides = 2

    x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x) 
    # Convolution Layer with no padding and stride=1 (default)

    x = layers.Flatten()(x) 
    # Flatten

    outputs = layers.Dense(10, activation="softmax")(x) 
    # Dense output Layer
    
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

* **Dataset size is constant**: Augmentation does not change the dataset size but increases its effective variability.

* **Independently random**: Each image in a batch undergoes its own independent random transformations.

* **Transformations driven by layer parameters**: The specifics of the transformations are determined by the settings of each augmentation layer.

For instance, if a batch contains 32 images, each image has an independent chance of being flipped, rotated, zoomed, or translated according to the specified probabilities and ranges in the augmentation layers.
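The per-image independence described above can be illustrated with a small NumPy sketch. This is not how the Keras layers are implemented; `random_flip_batch`, the flip probability `p`, and the random stand-in batch are assumptions made for illustration, and only a left-right flip is shown.

```python
import numpy as np

def random_flip_batch(images, rng, p=0.5):
    """Flip each image left-right with probability p, independently per image."""
    flip_mask = rng.random(len(images)) < p       # one random draw per image
    flipped = images[:, :, ::-1, :]               # reverse the width axis
    # Broadcast the per-image decision over height, width and channels
    out = np.where(flip_mask[:, None, None, None], flipped, images)
    return out, flip_mask

rng = np.random.default_rng(42)
batch = rng.random((32, 32, 32, 3)).astype("float32")  # stand-in for 32 CIFAR-sized images

augmented, flip_mask = random_flip_batch(batch, rng)

print(augmented.shape)  # same shape as the input batch: no images are added
print(int(flip_mask.sum()), "of", len(batch), "images were flipped")
```

The batch keeps its shape, so the dataset size is unchanged; only the pixel content varies from one pass to the next. In Keras, random augmentation layers such as these are active only in training mode (for example during `fit`) and pass inputs through unchanged during `evaluate` and `predict`. Note also that `layers.RandomFlip()` with no arguments flips both horizontally and vertically, whereas this sketch flips only horizontally.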

In [14]:
model = model_cifar_augumented()
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 32, 32, 3)]       0         
                                                                 
 sequential (Sequential)     (None, 32, 32, 3)         0         
                                                                 
 conv2d (Conv2D)             (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 15, 15, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 6, 6, 64)         0         
 2D)                                                         
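
The output shapes and parameter counts in the summary follow from simple arithmetic on the layer settings. The sketch below reproduces the rows shown above and, as a derivation from the model code rather than from the printed output, continues through the layers the summary does not show. The helper names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1):
    # Output size of a convolution with no padding ("valid")
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    # Output size of max pooling with no padding
    return (size - pool) // stride + 1

def conv_params(kernel, in_channels, filters):
    # Each filter has kernel*kernel*in_channels weights plus one bias
    return (kernel * kernel * in_channels + 1) * filters

size, channels = 32, 3  # CIFAR input: 32 x 32 pixels, 3 color channels
rows = []
for kind, filters in [("conv", 32), ("pool", None), ("conv", 64),
                      ("pool", None), ("conv", 128)]:
    if kind == "conv":
        params = conv_params(3, channels, filters)
        size, channels = conv_out(size), filters
    else:
        params = 0
        size = pool_out(size)
    rows.append((kind, (size, size, channels), params))

flat = size * size * channels      # length of the flattened vector
dense_params = (flat + 1) * 10     # weights plus bias for each of the 10 classes
rows.append(("dense", (10,), dense_params))

for row in rows:
    print(row)
```

The augmentation block contributes no rows with parameters because its layers have no trainable weights, which matches the `0` shown for `sequential` in the summary.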

### Compile & Train the Model

In [15]:
# Compile the model by configuring the loss function, the optimizer,
# and the metrics used to monitor model performance

# we will use sgd optimizer
sgd = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)  # `lr` is a deprecated alias for `learning_rate`

#compile the model
model.compile(optimizer=sgd, loss='sparse_categorical_crossentropy',  metrics = ["accuracy"]) 
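
The `sparse_categorical_crossentropy` loss takes integer class labels directly, which is why the `(50000, 1)` label array needs no one-hot encoding. A minimal NumPy sketch of the idea follows, using a toy problem with 3 classes instead of 10. It omits the numerical clipping a real implementation applies, and the function here is an illustration, not the Keras code.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    """Per-sample loss: negative log of the probability given to the true class."""
    labels = np.asarray(labels).reshape(-1)            # (N, 1) -> (N,)
    picked = probs[np.arange(len(labels)), labels]     # probability of the true class
    return -np.log(picked)

# Toy softmax outputs for two samples
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([[0], [2]])   # integer labels shaped (N, 1), like train_labels

losses = sparse_categorical_crossentropy(probs, labels)
print(losses)          # one loss value per sample
print(losses.mean())   # batch loss is the mean over samples
```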


In [16]:
# Callbacks
# Learning rate scheduler callback: halves the learning rate every 20 epochs
def lr_scheduler(epoch):
    return 0.01 * (0.5 ** (epoch // 20))
reduce_lr = keras.callbacks.LearningRateScheduler(lr_scheduler) 

#model checkpoint callback
model_checkpoint = keras.callbacks.ModelCheckpoint(filepath = "./models/model_cifar_augumented.keras",
                                                   save_best_only=True, monitor="val_loss") 

callbacks = [model_checkpoint, reduce_lr]
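
To see what the schedule does over a 30-epoch run, the function can be evaluated on its own. The sketch below repeats `lr_scheduler` from the cell above and assumes the epoch index is zero-based, as Keras passes it.

```python
def lr_scheduler(epoch):
    # Start at 0.01 and halve the rate once for every 20 completed epochs
    return 0.01 * (0.5 ** (epoch // 20))

schedule = [lr_scheduler(epoch) for epoch in range(30)]

# Epoch indices 0-19 use the initial rate; indices 20-29 use half of it
print(schedule[0], schedule[19], schedule[20], schedule[29])
```

With only 30 epochs, the rate is halved exactly once, at the start of the 21st epoch.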

In [17]:
# Train the model
history = model.fit(train_images, train_labels, epochs = 30, batch_size = 32, 
                    validation_split = 0.2, callbacks=callbacks)

Epoch 1/30
...
Epoch 30/30
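
The `validation_split = 0.2` argument means not all 50,000 training images are used for weight updates. Keras takes the validation samples from the end of the arrays, before any shuffling. A rough sketch of the resulting sizes follows; the variable names are illustrative.

```python
import math

num_samples = 50000       # train_images.shape[0]
validation_split = 0.2
batch_size = 32

# The last 20% of the samples are held out for validation
split_at = int(math.floor(num_samples * (1.0 - validation_split)))
num_train = split_at
num_val = num_samples - split_at

# Number of weight-update steps in each epoch
steps_per_epoch = math.ceil(num_train / batch_size)

print(num_train, num_val, steps_per_epoch)
```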


### Plotting the Loss & Accuracy Curves

In [None]:
plot_history(history)

In [None]:
# The model started overfitting after the 10th epoch

### Evaluate the trained model on previously unseen test data 

In [None]:
model.evaluate(test_images, test_labels) 

In [None]:
# Due to the randomness of neural network initialization, the numbers may differ slightly each time you train
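
With `metrics = ["accuracy"]`, `model.evaluate` returns the loss followed by the accuracy. The accuracy figure can be understood as the fraction of samples whose highest-probability class matches the label. The NumPy sketch below shows this on toy data with 3 classes; the values are made up for illustration and are not results from this model.

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of samples where the most probable class equals the true label."""
    predicted = probs.argmax(axis=1)              # most probable class per sample
    labels = np.asarray(labels).reshape(-1)       # (N, 1) -> (N,)
    return float((predicted == labels).mean())

# Toy softmax outputs for four samples
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.3, 0.4, 0.3],
                  [0.6, 0.3, 0.1]])
labels = np.array([[0], [2], [0], [0]])   # shaped (N, 1), like test_labels

print(accuracy(probs, labels))   # 3 of the 4 predictions match their labels
```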