<a href="https://colab.research.google.com/github/engcarlo/TransferLearning-Datasets/blob/main/DataAug-App.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

"""
https://keras.io/api/data_loading/image/

## Data Augmentation

Data augmentation is a technique used to increase the size and diversity of a training dataset by applying random transformations to the original images. This helps to prevent overfitting and improve the generalization ability of the model.

In this section, we will define functions that perform common data augmentation techniques such as rotation, zooming, and horizontal flipping using Keras's `ImageDataGenerator`.
"""

## Libraries

In [1]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
import numpy as np
import os

### Getting a dataset

The first step is to load the data. As an example, we use the **Cats vs. Dogs** dataset, a classic image classification dataset containing images of cats and dogs that is well suited to demonstrating transfer learning techniques.

To obtain this dataset, download it from the GitHub repository at the following link:

[https://github.com/engcarlo/TransferLearning-Datasets/tree/main/Dataset](https://github.com/engcarlo/TransferLearning-Datasets/tree/main/Dataset)

Alternatively, it is available on my Kaggle profile at the following link:

[https://www.kaggle.com/datasets/quantyukio/cats-and-dogs-sample2transfer-learning](https://www.kaggle.com/datasets/quantyukio/cats-and-dogs-sample2transfer-learning)

If you have already downloaded the dataset, make sure it is arranged in the structure this notebook expects: images organized into subfolders, one per class (e.g., 'Cats' and 'Dogs'). To use your own dataset with a different structure, modify the data loading code so that it loads your images and their corresponding labels correctly. The helper function `get_image(path)` can be used to load and preprocess individual images to the 224x224 size required by the VGG16 model.

If your dataset is organized with subfolders for each class, the following cell should load all the data correctly by just replacing `root` with the path to your dataset folder.

In [5]:
user = "engcarlo"
repo = "TransferLearning-Datasets"

# remove local directory if it already exists
if os.path.isdir(repo):
    !rm -rf {repo}

!git clone https://github.com/{user}/{repo}.git

Cloning into 'TransferLearning-Datasets'...
remote: Enumerating objects: 240, done.
remote: Counting objects: 100% (240/240), done.
remote: Compressing objects: 100% (236/236), done.
remote: Total 240 (delta 16), reused 218 (delta 4), pack-reused 0 (from 0)
Receiving objects: 100% (240/240), 5.04 MiB | 21.94 MiB/s, done.
Resolving deltas: 100% (16/16), done.
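
The one-subfolder-per-class layout described above can be indexed with a few lines of standard-library code. This is a minimal sketch, not part of the original notebook: the function name `list_images_by_class` and the default extension list are assumptions made for illustration.

```python
import os

def list_images_by_class(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return a sorted list of (image_path, class_name) pairs found under root."""
    samples = []
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files sitting next to the class folders
        for fname in sorted(os.listdir(class_dir)):
            if fname.lower().endswith(extensions):
                samples.append((os.path.join(class_dir, fname), class_name))
    return samples
```

Calling it on the cloned `Dataset` folder would treat every direct subfolder as a class, so any folder that is not a class (for example a folder of test samples) would need to be filtered out first.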


## Functions

In [39]:
def augment_image(path, datagen, setupgen):
  """
  Applies data augmentation to a single image.

  Args:
    image_array: A numpy array representing the image.
    datagen: An instance of ImageDataGenerator for data augmentation with the parameters specified.
  Returns:
    A numpy array representing the augmented image.
  """

  img = load_img(path,
                 color_mode         = setupgen['Color Mode'],
                 target_size        = setupgen['Target Size'],
                 interpolation      = setupgen['Interpolation'],
                 keep_aspect_ratio  = setupgen['Keep Aspect Ratio']
                 )            # this is a PIL image
  x = img_to_array(img)
  x = x.reshape((1,) + x.shape)

  # Create the save directory if it doesn't exist
  os.makedirs(setupgen['Save Directory'], exist_ok=True)

  # the .flow() command below generates batches of randomly transformed images
  i = 0
  for batch in datagen.flow(x,
              batch_size  = setupgen['Batch Size'],
              save_to_dir = setupgen['Save Directory'],
              save_prefix = setupgen['File Prefix'],
              save_format = setupgen['File Format'],
              seed        = setupgen['Seed']):
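      i += 1
      # Stop once the requested number of batches has been generated;
      # using ">" here would save one extra batch.
      if i >= setupgen['Number of Generations']:
          break  # otherwise the generator would loop indefinitely
  return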

## Define data augmentation parameters

#### Data Gen Setup

The fill_mode parameter in ImageDataGenerator accepts the following values:

"nearest":  
- Fills the missing pixels with the nearest pixel value.

"reflect":  
- Fills the missing pixels with a reflection of the pixels near the edge.

"wrap":
- Fills the missing pixels by wrapping the image around.

"constant":
- Fills the missing pixels with a constant value (specified by the cval parameter).
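
The four modes above can be visualized on a one-dimensional row of pixels. The sketch below uses `np.pad` as an analogy, since its `edge`, `symmetric`, `wrap`, and `constant` modes produce the same border patterns; it is an illustration only, and `ImageDataGenerator` does not call `np.pad` internally.

```python
import numpy as np

row = np.array([1, 2, 3, 4])
pad = 2  # number of "missing" pixels to fill on each side

nearest = np.pad(row, pad, mode="edge")        # analogous to fill_mode="nearest"
reflect = np.pad(row, pad, mode="symmetric")   # analogous to fill_mode="reflect"
wrap = np.pad(row, pad, mode="wrap")           # analogous to fill_mode="wrap"
constant = np.pad(row, pad, mode="constant",
                  constant_values=0)           # analogous to fill_mode="constant", cval=0

print(nearest)   # [1 1 1 2 3 4 4 4]
print(reflect)   # [2 1 1 2 3 4 4 3]
print(wrap)      # [3 4 1 2 3 4 1 2]
print(constant)  # [0 0 1 2 3 4 0 0]
```
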

In [40]:
set_rotation_rng        = 45 # @param {type:"slider", min:0, max:180, step:5}
set_width_shift_rng     = 0.2 # @param {type:"slider", min:0, max:1, step:0.05}
set_height_shift_rng    = 0.2 # @param {type:"slider", min:0, max:1, step:0.05}
set_shear_rng           = 0.2 # @param {type:"slider", min:0, max:1, step:0.05}
set_zoom_rng            = 0.2 # @param {type:"slider", min:0, max:1, step:0.05}
set_horizontal_flip     = True # @param {type: "boolean"}
set_vertical_flip       = True # @param {type: "boolean"}
set_fill_mode           = "reflect" # @param {type: "string"} ["constant", "wrap", "nearest", "reflect"]
set_cval                = 0 # @param {type:"slider", min: 0.0, max:1.0, step:0.01}

datagen = ImageDataGenerator(
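    rotation_range      = set_rotation_rng,         # Rotate by a random angle in [-rotation_range, +rotation_range] degrees
    width_shift_range   = set_width_shift_rng,      # Random horizontal shift, as a fraction of the image width
    height_shift_range  = set_height_shift_rng,     # Random vertical shift, as a fraction of the image height
    shear_range         = set_shear_rng,            # Random shear angle, in degrees
    zoom_range          = set_zoom_rng,             # Zoom by a random factor in [1 - zoom_range, 1 + zoom_range]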
    horizontal_flip     = set_horizontal_flip,      # Randomly flip images horizontally
    vertical_flip       = set_vertical_flip,        # Randomly flip images vertically
    fill_mode           = set_fill_mode,            # Fill in missing pixels after transformations
    cval                = set_cval,                 # Value to use for constant fill mode
  )
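
To make the meaning of the range parameters concrete, the sketch below computes the interval each random value would be drawn from. It reflects my reading of the `ImageDataGenerator` documentation (a float `zoom_range` z means a factor in [1 - z, 1 + z], and a shift range below 1 is a fraction of the image size); the helper `sampling_intervals` is hypothetical and not part of Keras.

```python
def sampling_intervals(rotation_range, zoom_range, width_shift_range, image_width):
    """Return the (low, high) interval each random parameter is drawn from."""
    return {
        # rotation angle, in degrees
        "rotation_degrees": (-rotation_range, rotation_range),
        # multiplicative zoom factor
        "zoom_factor": (1 - zoom_range, 1 + zoom_range),
        # horizontal shift in pixels, assuming width_shift_range < 1 (a fraction)
        "width_shift_pixels": (-width_shift_range * image_width,
                               width_shift_range * image_width),
    }

# Values matching the settings used in this notebook (45 degrees, 0.2, 0.2, 299 px wide)
intervals = sampling_intervals(45, 0.2, 0.2, image_width=299)
for name, (low, high) in intervals.items():
    print(f"{name}: from {low:.1f} to {high:.1f}")
```
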


#### Data File Setup

color_mode:
- One of `"grayscale"`, `"rgb"`, `"rgba"`. Default: `"rgb"`.

target_size:
- Either `None` (default to original size) or a tuple of ints `(img_height, img_width)`.

interpolation:
- Interpolation method used to resample the image if the target size is different from that of the loaded image.
- Supported methods are `"nearest"`, `"bilinear"`, and `"bicubic"`.
- If PIL version 1.1.3 or newer is installed, `"lanczos"` is also supported.
- If PIL version 3.4.0 or newer is installed, `"box"` and `"hamming"` are also supported.
- By default, `"nearest"` is used.

keep_aspect_ratio:
- Boolean, whether to resize images to a target size without aspect ratio distortion.
- The image is cropped in the center with target aspect ratio before resizing.
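
The center crop described for `keep_aspect_ratio` can be sketched in plain Python. The function below is an approximation written for illustration, not the Keras source; `center_crop_box` is a hypothetical name, and it returns the box in PIL's `(left, upper, right, lower)` convention.

```python
def center_crop_box(width, height, target_size):
    """Return the largest centered box (left, upper, right, lower) that has
    the aspect ratio of target_size = (target_height, target_width)."""
    target_height, target_width = target_size
    # Largest crop that fits inside the image while matching the target ratio
    crop_height = min(height, (width * target_height) // target_width)
    crop_width = min(width, (height * target_width) // target_height)
    # Center the crop inside the original image
    left = (width - crop_width) // 2
    upper = (height - crop_height) // 2
    return (left, upper, left + crop_width, upper + crop_height)

# A 400x300 (width x height) image cropped for a square 299x299 target
print(center_crop_box(400, 300, (299, 299)))  # (50, 0, 350, 300)
```

The cropped region would then be resized to the target size, so no stretching is needed.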


In [56]:
setupgen = {}
path = "/content/TransferLearning-Datasets/Dataset/Test Sample/sample02.jpg"
filename = os.path.splitext(os.path.basename(path))[0]  # file name without directory or extension
setupgen['File Prefix']     = filename

image_dimX                        = 299 # @param {type:"integer"}
image_dimY                        = 299 # @param {type:"integer"}
setupgen['Target Size']           = (image_dimY, image_dimX)  # load_img expects (img_height, img_width)

setupgen['Seed']                  = 1 # @param {type:"integer"}
setupgen['File Format']           = "jpg" # @param ["png", "jpg", "jpeg", "bmp", "ppm", "tif", "tiff"]
setupgen['Save Directory']        = "DataAugmentation" # @param {type: "string"}
setupgen['Batch Size']            = 1  # @param {type: "integer"}
setupgen['Number of Generations'] = 10  # @param {type: "integer"}
setupgen['Color Mode']            = 'rgb' # @param ["rgb", "rgba", "grayscale"]
setupgen['Interpolation']         = 'nearest' # @param ["nearest", "bilinear", "bicubic", "lanczos", "box", "hamming"]
setupgen['Keep Aspect Ratio']     = True # @param {type: "boolean"}


## Examples

In [57]:
augment_image(path, datagen, setupgen)
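
After the call above, it can be useful to check what was written to disk. The sketch below lists the generated files, assuming Keras names them with the given prefix followed by an underscore and ends them with the chosen format; `list_generated` is a hypothetical helper, not part of the notebook or of Keras.

```python
import os

def list_generated(save_dir, prefix, file_format):
    """Return the sorted file names in save_dir that match the prefix and extension."""
    suffix = "." + file_format
    return sorted(
        name for name in os.listdir(save_dir)
        # Assumption: generated files start with "<prefix>_" and end with ".<format>"
        if name.startswith(prefix + "_") and name.endswith(suffix)
    )
```

In this notebook it could be called as `list_generated(setupgen['Save Directory'], setupgen['File Prefix'], setupgen['File Format'])`, and the length of the result compared with `setupgen['Number of Generations']`.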