# Data Augmentation
Data augmentation is applied only to the training set, to increase the number of training images and their variability. The techniques used are rotation, horizontal and vertical shifting, flipping, and shearing, which add diversity to the training images and are intended to make the model more robust.

### Import the necessary packages

In [1]:
# handling file and directory
import os
# implement augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator

### Define the necessary variables

In [27]:
path_trainset = "./dataset/splited_dataset/train"
path_aug = "./dataset/augmented_dataset"
labels = os.listdir(path_trainset)
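
`os.listdir` returns every entry in the folder, so a stray file (for example a hidden system file) would be treated as a label. The sketch below is one possible safeguard, not part of the original notebook: `list_class_dirs` is a hypothetical helper that keeps only subdirectories and sorts them so the label order is reproducible.

```python
import os


def list_class_dirs(root):
    """Return the sorted names of the immediate subdirectories of `root`.

    Hypothetical helper: it skips plain files so that only class folders
    are used as labels.
    """
    return sorted(
        entry
        for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )
```

Under that assumption, `labels = list_class_dirs(path_trainset)` could replace the plain `os.listdir` call.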

### Create data generator
The data generator is configured with augmentation settings chosen to suit the dataset's characteristics. ImageDataGenerator is used for its ease of use and because its directory iterator can write the augmented images to disk through the `save_to_dir` option, so the augmented set can be stored and reused for training.

In [21]:
traingen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    horizontal_flip=True,
    vertical_flip=True,
    rescale=1./255,
)
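
To make the settings above concrete, the following is a rough sketch of the parameter ranges they imply for a 224x224 image. It is not the library's own sampling code: `sample_transform` is a hypothetical function, and it assumes the documented meanings of the arguments (rotation in degrees, shifts as a fraction of the image size when the value is below 1, shear as an angle in degrees in recent Keras versions, and each flip applied with probability 0.5).

```python
import random


def sample_transform(height, width, rng,
                     rotation_range=20, width_shift_range=0.2,
                     height_shift_range=0.2, shear_range=0.2):
    """Draw one illustrative set of augmentation parameters.

    Assumption: every parameter is sampled uniformly from a symmetric range.
    """
    return {
        # rotation angle in degrees, within [-20, 20]
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        # shifts are fractions of the image size, converted here to pixels
        "shift_x_px": rng.uniform(-width_shift_range, width_shift_range) * width,
        "shift_y_px": rng.uniform(-height_shift_range, height_shift_range) * height,
        # shear angle in degrees, within [-0.2, 0.2]
        "shear_deg": rng.uniform(-shear_range, shear_range),
        # each flip is an independent coin toss
        "flip_horizontal": rng.random() < 0.5,
        "flip_vertical": rng.random() < 0.5,
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
params = sample_transform(224, 224, rng)
for name, value in params.items():
    print(f"{name}: {value}")
```

Under these assumptions a shift of 0.2 corresponds to at most 44.8 pixels on a 224-pixel side, while a shear of 0.2 degrees is very small; a larger value may be needed if a visible shear is intended.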

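### Generate the augmented images
The augmented images are generated by iterating over the `traingen` generator once per label. Each batch drawn from the iterator is transformed and saved to that label's directory under `path_aug`.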

In [32]:
# loop through each label
for label in labels:
    # create the output directory for this label
    save_dir = os.path.join(path_aug, 'train', label)
    os.makedirs(save_dir, exist_ok=True)
    # stream this label's training images through the generator and save every augmented image
    trainset = traingen.flow_from_directory(
        path_trainset,
        target_size=(224, 224),
        batch_size=32,
        classes=[label],
        class_mode='categorical',
        save_to_dir=save_dir,
        save_prefix=f'augmented_{label}',
        save_format='jpeg',
    )
    # the iterator loops indefinitely, so stop after 6 passes over this label's images
    exit_trigger = len(trainset) * 6
    # drawing a batch is what generates and saves the augmented images
    for batch_count, _ in enumerate(trainset, start=1):
        # exit the loop once the exit trigger is reached
        if batch_count == exit_trigger:
            break
    print(f'{label} augmented images')

Found 322 images belonging to 1 classes.
cardboard augmented images
Found 401 images belonging to 1 classes.
glass augmented images
Found 328 images belonging to 1 classes.
metal augmented images
Found 475 images belonging to 1 classes.
paper augmented images
Found 385 images belonging to 1 classes.
plastic augmented images
Found 110 images belonging to 1 classes.
trash augmented images
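
As a sanity check on the output above, the arithmetic below estimates how many batches the loop draws and how many augmented images it should save per label. It is a sketch under stated assumptions: the iterator wraps around after each pass, the last batch of a pass is partial, every generated image is written once, and no two saved files collide on name. `expected_saved` is a hypothetical helper; the counts come from the "Found ... images" lines.

```python
import math

BATCH_SIZE = 32  # batch_size passed to flow_from_directory
PASSES = 6       # multiplier used for the exit trigger

# image counts reported by flow_from_directory for each label
found = {
    "cardboard": 322,
    "glass": 401,
    "metal": 328,
    "paper": 475,
    "plastic": 385,
    "trash": 110,
}


def expected_saved(n_images, batch_size=BATCH_SIZE, passes=PASSES):
    """Return (exit_trigger, estimated number of saved images) for one label."""
    batches_per_pass = math.ceil(n_images / batch_size)  # equals len(trainset)
    exit_trigger = batches_per_pass * passes
    return exit_trigger, n_images * passes


for label, n_images in found.items():
    trigger, saved = expected_saved(n_images)
    print(f"{label}: stop after {trigger} batches, about {saved} images")

total = sum(expected_saved(n)[1] for n in found.values())
print(f"total: about {total} augmented images")
```

Comparing these estimates with the number of files actually present in each label folder is a quick way to confirm the run completed. Because every label is multiplied by the same factor, the class imbalance of the original training set (for example 110 trash images against 475 paper images) carries over to the augmented set.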
