<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 5px; height: 75px">

# The Caterpillar Effect: A Caterpillar Recognition Model
Author: Sharifah Nurulhuda, DSI-SG-41 

### 01_Data Augmentation

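In this notebook, we apply image transformations to augment our data, so that the models are trained on a larger and more varied dataset.

We start with 20 images of each caterpillar species. Passing them through this notebook with all 12 transformations enabled gives 13 images per original (the original plus 12 transformed copies), or 260 images per species.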

### Import Libraries

In [1]:
# libraries for importing and exporting images
from PIL import Image
import os
import cv2

# libraries for augmenting images
import imgaug.augmenters as iaa


### Defining the Function for Transforming the Images

The function `transforming_images` takes in an image and transforms the image in the following ways:

1. Gamma Contrast
2. Sigmoid Contrast
3. Linear Contrast
4. Cropping
5. Elastic
6. Polar
7. Jigsaw
8. Shear
9. Adding Noise
10. Rotation
11. Horizontal Flip
12. Vertical Flip

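The function returns a list of arrays `transformed_images_list` containing the original image and one image from each transformation. Note that the elastic, polar and jigsaw transformations are commented out in the code below, so as written the function returns 10 arrays per input image (200 images per species from 20 originals); re-enabling them gives 13 arrays per input image.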
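
The link between the number of transformations and the size of the augmented dataset can be checked with a few lines of arithmetic. This is a minimal sketch; `expected_image_count` is a hypothetical helper introduced here for illustration and is not part of the notebook.

```python
def expected_image_count(n_originals, n_transforms, keep_original=True):
    """Number of images produced when every original yields one image per transformation."""
    per_image = n_transforms + (1 if keep_original else 0)
    return n_originals * per_image

# all 12 transformations enabled: 20 * (12 + 1)
print(expected_image_count(20, 12))  # 260

# elastic, polar and jigsaw disabled, leaving 9 transformations: 20 * (9 + 1)
print(expected_image_count(20, 9))   # 200
```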

In [2]:
def transforming_images(input_img):

    # gamma contrast
    contrast=iaa.GammaContrast((0.5, 2.0))
    input_contrast = contrast.augment_image(input_img)
    
    # sigmoid contrast
    contrast_sig = iaa.SigmoidContrast(gain=(5, 10), cutoff=(0.4, 0.6))
    sigmoid_contrast = contrast_sig.augment_image(input_img)

    # linear contrast, with a random strength between 0.4 and 0.6
    contrast_lin = iaa.LinearContrast((0.4, 0.6))
    linear_contrast = contrast_lin.augment_image(input_img)

    # cropping each side of the image by a random amount between 0% and 30%
    crop1 = iaa.Crop(percent=(0, 0.3))
    input_crop1 = crop1.augment_image(input_img)
    
    # # applying an elastic transformation
    # elastic = iaa.ElasticTransformation(alpha=60.0, sigma=4.0)
    # input_elastic = elastic.augment_image(input_img)

    # # warping image
    # polar = iaa.WithPolarWarping(iaa.CropAndPad(percent=(-0.2, 0.7)))
    # input_polar = polar.augment_image(input_img)

    # # pixellising image to look like a jigsaw
    # jigsaw = iaa.Jigsaw(nb_rows=20, nb_cols=15, max_steps=(3, 7))
    # input_jigsaw = jigsaw.augment_image(input_img)

    # shearing image by random amounts ranging from -40 to 40 degrees
    shear = iaa.Affine(shear=(-40,40))
    input_shear = shear.augment_image(input_img)

    # adding Gaussian noise to the image (mean 10, standard deviation 40)
    noise = iaa.AdditiveGaussianNoise(loc=10, scale=40)
    input_noise = noise.augment_image(input_img)

    # rotating the image by a random angle between -50 and 20 degrees
    rot1 = iaa.Affine(rotate=(-50, 20))
    input_rot1 = rot1.augment_image(input_img)

    # horizontal Flip
    hflip = iaa.Fliplr(p=1.0)
    input_hf = hflip.augment_image(input_img)

    # vertical Flip
    vflip = iaa.Flipud(p=1.0) 
    input_vf = vflip.augment_image(input_img)

    # creating a list of the original image and all the transformed images
    # (add input_polar, input_elastic and input_jigsaw if those transformations are re-enabled)
    transformed_images_list = [input_img, input_contrast, sigmoid_contrast, linear_contrast,
                               input_crop1, input_hf, input_vf, input_rot1, input_noise, input_shear]

    return transformed_images_list
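
Several of the augmenters above are given a tuple such as `(-50, 20)`. imgaug treats such a tuple as a range and draws a value from it uniformly for each image, which is why repeated runs produce different outputs. The idea can be sketched in plain Python; this is an illustration only, not imgaug's implementation, and `sample_parameter` is a hypothetical name.

```python
import random

def sample_parameter(value_range, rng):
    """Draw one value uniformly from a (low, high) range, in either order."""
    low, high = sorted(value_range)
    return rng.uniform(low, high)

rng = random.Random(42)  # seeded so the draws are reproducible

# five rotation angles, as if five images were passed through the rotation step
rotations = [sample_parameter((-50, 20), rng) for _ in range(5)]
print([round(angle, 1) for angle in rotations])
```

Seeding the generator, as done here, is one way to make an augmentation run repeatable.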

### Loading the Images and Passing them through the Transformation

In this section, the images are loaded into the notebook and passed through the function `transforming_images`. 

We use the `os`, `cv2` and `PIL` libraries to access and save the images.

First, the images are read from their respective folders. Each image is then passed through the function `transforming_images`, and the resulting images are saved to the corresponding destination folder. These augmented images will then be used for modelling.

In [3]:
folder_list = ['chocolate_pansy', 'lime_caterpillar', 'painted_jezebel', 'plain_tiger']

for folder in folder_list:

    # defining directory where images are stored and where to call for images
    image_folder = f'../data/caterpillars/{folder}'
    image_files = os.listdir(image_folder)

    # Read images using cv2.imread() and store them in a list
    orig_images_list = []

    for filename in image_files:
        if not filename.startswith('.'):         # ignore hidden files whose names start with '.' (e.g. '.DS_Store')
            filepath = os.path.join(image_folder, filename)
            image = cv2.imread(filepath)
            if image is not None:                # cv2.imread returns None if the file cannot be read as an image
                image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)          # convert from BGR (OpenCV default) to RGB for saving with PIL
                orig_images_list.append(image)

    # Create the destination folder if it doesn't exist
    destination_folder = f'../data/augmented/{folder}'
    os.makedirs(destination_folder, exist_ok=True)

    # looping through the list of original images for transforming, naming and saving image

    i=0                 #setting i = 0 at the start, so that all filenames will be unique

    for image in orig_images_list:

        transformed_list = transforming_images(image)

        for transformed_image in transformed_list:
            
            # Convert each NumPy array in the list to a PIL Image object
            image_pil = Image.fromarray(transformed_image)

            # Save each image with a filename in the destination folder
            filename = os.path.join(destination_folder, f'aug_{folder}_{i}.jpg')
            image_pil.save(filename)

            # adding one to i at each loop to change the filename accordingly
            i+=1
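
The loop above names the output files with a running counter. If it is useful to trace an augmented file back to its source image and transformation, both can be encoded in the filename instead. The sketch below assumes that goal; `augmented_filename` and the labels in `transform_names` are hypothetical and do not appear in the notebook.

```python
import os

def augmented_filename(destination_folder, folder, source_index, transform_name):
    """Build a unique, traceable output path for one augmented image."""
    name = f'aug_{folder}_{source_index:03d}_{transform_name}.jpg'
    return os.path.join(destination_folder, name)

# hypothetical labels, in the same order as the list returned by transforming_images
transform_names = ['original', 'gamma', 'sigmoid', 'linear', 'crop',
                   'hflip', 'vflip', 'rotate', 'noise', 'shear']

# paths for the first source image (index 0) of one species
paths = [augmented_filename('../data/augmented/plain_tiger', 'plain_tiger', 0, name)
         for name in transform_names]
print(os.path.basename(paths[0]))
```

Inside the saving loop, the label would come from iterating over `zip(transform_names, transformed_list)`, with `enumerate` on the outer loop supplying `source_index`.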

### Summary of Augmented Data

We now have our set of images to be used for modelling.

They are stored in the following folders:

|Species|Destination Folder|Original Number of Images|Final Number of Augmented Images|
|-----|-----|-----|-----|
|Chocolate Pansy|`data/augmented/chocolate_pansy`|20|260|
|Lime|`data/augmented/lime_caterpillar`|20|260|
|Painted Jezebel|`data/augmented/painted_jezebel`|20|260|
|Plain Tiger|`data/augmented/plain_tiger`|20|260|
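
The counts in the table can be checked against what is actually on disk. This is a minimal sketch, assuming the folder layout used earlier in this notebook; `count_images` is a hypothetical helper, and a folder that does not exist is reported as 0.

```python
import os

def count_images(root, folders, extensions=('.jpg', '.jpeg', '.png')):
    """Return {folder: number of image files} for each folder under root."""
    counts = {}
    for folder in folders:
        path = os.path.join(root, folder)
        if not os.path.isdir(path):
            counts[folder] = 0          # missing folder: nothing was saved there
            continue
        counts[folder] = sum(
            1 for name in os.listdir(path)
            if not name.startswith('.') and name.lower().endswith(extensions)
        )
    return counts

folder_list = ['chocolate_pansy', 'lime_caterpillar', 'painted_jezebel', 'plain_tiger']
print(count_images('../data/augmented', folder_list))
```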