### Why this notebook
- Most notebooks simply print the (random) outputs after augmentation, and I could not find any interactive data augmentation notebook for experimenting with how continuous changes in augmentation parameters affect the output images.
- An interactive slider or "simulation" helps us visualise each augmentation properly and decide which ones are relevant for our use case, instead of using them all by default just because they "work". It can also give us an idea of which parameter range to use for a given augmentation.
- This notebook also lets other users experiment with different augmentations, as it was written to be easy to extend and customise.

## Libraries
`albumentations.augmentations.functional` and `kornia` methods are used in this notebook. Ideally this code should also work with other kinds of functions, but this has not been tested.

## Observations if you use this notebook
- <span style="color:red">This notebook might take some time to load up in view mode as it has many images so give it a few secs ;)</span>
- Make sure that the range values are valid for each transformation, for example: 
    - For `kornia.enhance.posterize` the argument `bits` must be an integer between 1 and 8
- __This notebook runs on a CPU instance, so some transformations run slower than they would on a GPU. To get a speed-up from a GPU, the code must first be adapted, for example by copying the tensor images to the desired device__.
- When using this notebook, focus on the bounding boxes (or targets), as they are what matters when we augment the image. Zoom into them and move the slider to see how each augmentation affects them. This gives a hint of how the targets (COTS in this competition) change with each parameter, and helps judge whether the model will learn new useful information or not.
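
One way to check a range before running a cell is to list the slider values it would produce. The sketch below re-implements the evenly spaced sampling in plain Python (the notebook itself uses `np.linspace`). `slider_values` and `check_posterize_bits` are hypothetical helper names made up for illustration; they are not part of `kornia` or `albumentations`.

```python
from typing import Callable, List


def slider_values(min_val: float, max_val: float, num_steps: int,
                  val_type: Callable = float) -> List:
    """Evenly spaced values from min_val to max_val (both included), cast to val_type."""
    step = (max_val - min_val) / num_steps
    return [val_type(min_val + i * step) for i in range(num_steps + 1)]


def check_posterize_bits(values: List) -> None:
    """Raise if any value is not an integer between 1 and 8."""
    for v in values:
        if not isinstance(v, int) or not 1 <= v <= 8:
            raise ValueError(f"invalid `bits` value: {v!r}")


# Same settings as the Posterize cell: 1 to 8 in 7 steps, cast to int
bits = slider_values(1, 8, num_steps=7, val_type=int)
check_posterize_bits(bits)
print(bits)
```

Casting to `int` truncates, so a step count that does not divide the range evenly can produce repeated values. Printing the list first makes that easy to spot.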

In [None]:
from typing import Callable, Dict, Tuple, Sequence, Optional, Union
import collections

import cv2
import random
from pathlib import Path
import numpy as np
import xarray as xr
import pandas as pd

# https://www.kaggle.com/product-feedback/138599#787336
from plotly.offline import plot, iplot, init_notebook_mode
init_notebook_mode(connected=True)
import plotly.express as px

import torch
import kornia as kor
import albumentations.augmentations.functional as alb_aug

# IMPORTANT: Fix seed for reproducibility
seed = 37
random.seed(seed)
np.random.seed(seed)

In [None]:
def fetch_images_and_boxes(
        csv_file: str,
        num_imgs: int,
        min_num_boxes: int,
        rescale_size: Optional[Tuple[int, int]] = None
) -> Tuple[Sequence[np.ndarray], Sequence[Sequence[Dict[str, float]]]]:
    """Fetch images and their respective bounding boxes (one list of box dicts per image)"""

    def func_num_entries(_str: str):
        # Convert string to struct and get the number of bounding boxes in it
        return len(eval(_str))

    # Load dataframe
    df = pd.read_csv(csv_file)

    # Filter df by a minimum number of annotations/boxes
    anns = df['annotations'].apply(func_num_entries)
    _df = df[anns >= min_num_boxes]
    
    # Sample rows from df
    subdf = _df.iloc[random.sample(range(len(_df.index)), num_imgs)]
    
    img_list = []
    bbox_list = []
    
    for _, row in subdf.iterrows():
        video_num = row['video_id']
        frame_num = row['video_frame']
        base_path = Path("../input/tensorflow-great-barrier-reef/train_images")
        img_path = str(base_path / f"video_{video_num}" / f"{frame_num}.jpg")
        img = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)
        boxes = eval(row['annotations'])
        if rescale_size is not None:
            nw, nh = rescale_size
            oh, ow = img.shape[:2]
            img = cv2.resize(img, rescale_size)
            for box in boxes:
                box['x'] = int(box['x'] * nw / ow)
                box['width'] = int(box['width'] * nw / ow)
                box['y'] = int(box['y'] * nh / oh)
                box['height'] = int(box['height'] * nh / oh)
        img_list.append(img)
        bbox_list.append(boxes)
    return img_list, bbox_list


def show_bboxes(
        _img: np.ndarray,
        _bboxes: Sequence[Dict[str, float]],
        _color: Tuple[int, int, int] = (255, 0, 0)
) -> None:
    """In-place plotting of boxes on the given image"""
    for _bbox in _bboxes:
        x1, y1 = _bbox['x'], _bbox['y']
        x2, y2 = _bbox['x'] + _bbox['width'], _bbox['y'] + _bbox['height']
        cv2.rectangle(_img, (int(x1), int(y1)), (int(x2), int(y2)), _color, 2)


def show_images_with_slider(
        images: Sequence[np.ndarray],
        labels: Sequence[str],
        slider_name: str,
        title: str,
) -> None:
    """Load and show slider figure
    :param images: images to plot in order
    :param labels: labels to show for each image
    :param slider_name: slider value name
    :param title: figure name
    """
    stacked = np.stack(images, axis=0)
    xrData = xr.DataArray(
        data=stacked,
        dims=[slider_name, 'row', 'col', 'rgb'],
        coords={slider_name: labels}
    )
    # Hide the axes
    layout_dict = dict(yaxis_visible=False, yaxis_showticklabels=False, xaxis_visible=False, xaxis_showticklabels=False)
    fig = px.imshow(xrData, title=title, animation_frame=slider_name, width=800, height=600).update_layout(layout_dict)
    fig.show()


def load_simulation(
        base_img: Union[torch.Tensor, np.ndarray],
        aug_function: Callable,
        arg_name: str,
        min_val: Union[float, Sequence] = 0,
        max_val: Union[float, Sequence] = 1.,
        val_type: type = float,
        num_steps: int = 10,
        aug_img_arg_name: str = 'input',
        aug_kwargs: Optional[dict] = None,
        bboxes: Optional[Sequence[Dict[str, float]]] = None,
        expand: bool = False,
) -> None:
    """Load and show slider figure given the augmentation function and its parameters
    :param base_img: initial tensor image to be augmented
    :param aug_function: augmentation callable to use
    :param arg_name: augmentation argument name to vary
    :param min_val: minimum value/vector to vary
    :param max_val: maximum value/vector to vary
    :param val_type: type to cast variation values into (int or float)
    :param num_steps: number of step variation between minimum and maximum value
    :param aug_img_arg_name: image argument name for the selected `aug_function`
    :param aug_kwargs: other augmentation key word arguments
    :param bboxes: bounding boxes to plot on all augmented images
    :param expand: whether to expand the base image shape to simulate batches (required for some augmentations)
    """
    aug_kwargs = dict(aug_kwargs or {})  # copy so the caller's dict is not mutated

    # Set value range for the parameter to vary
    val_range = np.linspace(min_val, max_val, num_steps + 1, dtype=val_type)
    
    # Convert to tuples if necessary
    if hasattr(val_range[0], '__len__'):
        val_range = [tuple(val_type(_v) for _v in val) for val in val_range]
    else:
        val_range = [val_type(val) for val in val_range]
    
    # Add a dimension, simulating a batch size
    if expand:
        base_img = base_img[None, ...]
    
    aug_kwargs[aug_img_arg_name] = base_img
    
    # Compute variations according to the previous interval values
    x_out = [aug_function(**{arg_name: val}, **aug_kwargs) for val in val_range]
    
    # Convert images to numpy arrays
    images = [
        np.ascontiguousarray(
            (kor.utils.tensor_to_image(im) * 255 if isinstance(im, torch.Tensor) else im)
            , dtype=np.uint8
        ) 
        for im in x_out]
    
    # Plot boxes if given
    if bboxes is not None:
        for im in images:
            show_bboxes(im, bboxes)
    
    # Format values to show on the slider
    def _format(val):
        if isinstance(val, (float, np.inexact)):
            return f"{val:.2f}"
        elif isinstance(val, (int, np.integer)):
            return str(val)
        elif isinstance(val, (collections.abc.Sequence, np.ndarray)):
            return str(tuple(_format(subval) for subval in val))
        else:
            raise TypeError(f"Invalid type {type(val)}")
            
    # Show slider plot
    show_images_with_slider(images, [_format(val) for val in val_range], arg_name, aug_function.__name__)


# Original image
Load and visualize the original image

In [None]:
# Fetch some images and their bounding boxes
min_num_boxes = 3
num_imgs = 3
csv_file = "../input/tensorflow-great-barrier-reef/train.csv"

# Rescale images to accelerate computations
img_list, bbox_list = fetch_images_and_boxes(csv_file, num_imgs, min_num_boxes, rescale_size=(640, 360))

# Set numpy array image and boxes
np_img, bboxes = img_list[0], bbox_list[0]

# Set tensor image
tensor_img = kor.utils.image_to_tensor(np_img)
tensor_img = tensor_img.float() / 255.0

# Plot image with boxes
copy_img = np_img.copy()
show_bboxes(copy_img, bboxes)
fig = px.imshow(copy_img)
fig.show()

# Augmentations
Some of these augmentations come from `kornia` and others from `albumentations`. Both libraries integrate easily with `PyTorch` code. If you want to try different augmentations, just make sure you use a valid value range for sampling. `load_simulation` is documented and can be adapted if needed.
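
To plug in a function from another library, it helps to know how `load_simulation` calls it: the image is passed under the keyword named by `aug_img_arg_name`, the varied parameter under `arg_name`, and anything else through `aug_kwargs`. Below is a minimal sketch of that calling convention, using a toy augmentation on a flat list of pixel values. `scale_intensity` and `sweep` are hypothetical names made up for illustration.

```python
from typing import Callable, Dict, List, Optional, Sequence


def scale_intensity(img: List[float], factor: float, clip: float = 1.0) -> List[float]:
    """Toy augmentation: multiply every pixel by `factor` and clip to [0, clip]."""
    return [min(max(p * factor, 0.0), clip) for p in img]


def sweep(base_img, aug_function: Callable, arg_name: str, values: Sequence,
          aug_img_arg_name: str = "input", aug_kwargs: Optional[Dict] = None) -> list:
    """Call `aug_function` once per value, mirroring the keyword-based call in `load_simulation`."""
    call_kwargs = dict(aug_kwargs or {})      # copy so the caller's dict is untouched
    call_kwargs[aug_img_arg_name] = base_img  # the image is passed by keyword
    return [aug_function(**{arg_name: val}, **call_kwargs) for val in values]


outputs = sweep(
    base_img=[0.2, 0.5, 0.9],
    aug_function=scale_intensity,
    arg_name="factor",
    values=[0.5, 1.0, 2.0],
    aug_img_arg_name="img",       # must match the function's image parameter name
    aug_kwargs={"clip": 1.0},
)
for out in outputs:
    print(out)
```

A function fits this pattern as long as it accepts the image and the varied parameter as keyword arguments, which is why `aug_img_arg_name` is `'input'` for the `kornia` cells and `'img'` for the `albumentations` ones.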

# Brightness
In this example, one could argue that once `brightness_factor` exceeds $0.3\sim 0.4$ the COTS are barely visible, so such images add no new information for the model.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.adjust_brightness,
    'arg_name': 'brightness_factor',
    'bboxes': bboxes
}
load_simulation(**kwargs)

# Contrast
Here, one could argue that for values smaller than $0.5\sim 0.6$ the COTS are too dark to be detected, which would restrict our augmentation parameter to between $0.6$ and $1$.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.adjust_contrast,
    'arg_name': 'contrast_factor',
    'bboxes': bboxes
}
load_simulation(**kwargs)

# Gamma
Here the range could be set to $0.4$ to $1$.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.adjust_gamma,
    'arg_name': 'gamma',
    'bboxes': bboxes
}
load_simulation(**kwargs)

# Saturation
For saturation, one could argue that grey images are useful, and therefore use the whole shown range to augment the images.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.adjust_saturation,
    'arg_name': 'saturation_factor',
    'bboxes': bboxes,
    'min_val': 0,
    'max_val': 3,
    'num_steps': 20
}
load_simulation(**kwargs)

# Posterize
In this case, one could say that only `bits` values above $4$ are acceptable.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.posterize,
    'arg_name': 'bits',
    'bboxes': bboxes,
    'min_val': 1,
    'max_val': 8,
    'num_steps': 7,
    'val_type': int
}
load_simulation(**kwargs)

# Sharpness
In this one it might be a bit difficult to decide on a range for `factor` but at least one could say that this augmentation could help the model learn new characteristics of COTS.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.sharpness,
    'arg_name': 'factor',
    'bboxes': bboxes,
    'min_val': 0,
    'max_val': 10,
    'num_steps': 10,
}
load_simulation(**kwargs)

# Hue
One could say that values between $-0.31$ and $0.31$ (the slider positions next to $0$, since the step is $2\pi/20 \approx 0.31$) simulate different water colors, as they shift the color between green and blue.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.adjust_hue,
    'arg_name': 'hue_factor',
    'bboxes': bboxes,
    'min_val': -np.pi,
    'max_val': np.pi,
    'num_steps': 20
}
load_simulation(**kwargs)

# Solarize
As shown in the picture, the values shown do not bring any new useful information, so one could argue that this augmentation is not useful. Alternatively, one could go deeper and restrict the parameter to between $0.9$ and $1$ to experiment further.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.solarize,
    'arg_name': 'thresholds',
    'bboxes': bboxes,
    'min_val': 0.,
    'max_val': 1,
    'val_type': float,
    'num_steps': 10,
}
load_simulation(**kwargs)

# Equalize CLAHE
Similar to `Solarize`, one could further investigate whether `clip_limit` values between $1$ and $3.93$ (the first two slider positions) bring new information.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.enhance.equalize_clahe,
    'arg_name': 'clip_limit',
    'bboxes': bboxes,
    'min_val': 1,
    'max_val': 45,
    'num_steps': 15,
}
# Somehow this blur needs odd valued-kernels such as (3,3), (5,5), etc
load_simulation(**kwargs)

# Box blur
Here only $(2,2)$ kernels seem to be useful, as larger kernels only obscure the image. Even so, one should be careful with small targets.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.filters.box_blur,
    'arg_name': 'kernel_size',
    'bboxes': bboxes,
    'min_val': (1, 1),
    'max_val': (11, 11),
    'val_type': int,
    'num_steps': 10,
    'expand': True
}
load_simulation(**kwargs)

# Unsharp mask
Due to the "noisiness" of the coral reefs, this augmentation seems to bring no useful information for generalisation.

In [None]:
kwargs = {
    'aug_img_arg_name': 'input',
    'base_img': tensor_img,
    'aug_function': kor.filters.unsharp_mask,
    'arg_name': 'kernel_size',
    'bboxes': bboxes,
    'min_val': (1, 1),
    'max_val': (21, 21),
    'val_type': int,
    'num_steps': 10,
    'aug_kwargs': {'sigma': (3, 3)},
    'expand': True
}
load_simulation(**kwargs)

# Median blur
One could argue that a blur with kernel size 3 is useful, but it seems that small targets just disappear.
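
OpenCV's median blur expects an odd kernel size, so the slider values for `ksize` need to land on odd integers only. Below is a small sketch of how to pick `num_steps` for that. `odd_kernel_sweep` is a hypothetical helper, and the evenly spaced sampling is written in plain Python rather than with `np.linspace`.

```python
from typing import List, Tuple


def odd_kernel_sweep(min_k: int, max_k: int) -> Tuple[int, List[int]]:
    """Return (num_steps, kernel sizes) so that every size between two odd bounds is odd."""
    if min_k % 2 == 0 or max_k % 2 == 0 or max_k <= min_k:
        raise ValueError("both bounds must be odd and max_k must exceed min_k")
    num_steps = (max_k - min_k) // 2          # gives a step of exactly 2 between sizes
    step = (max_k - min_k) / num_steps
    sizes = [int(min_k + i * step) for i in range(num_steps + 1)]
    return num_steps, sizes


num_steps, sizes = odd_kernel_sweep(1, 15)
print(num_steps, sizes)
```

With bounds 1 and 15 this gives `num_steps = 7`, which is the value used in the cell below.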

In [None]:
kwargs = {
    'aug_img_arg_name': 'img',
    'base_img': np_img,
    'aug_function': alb_aug.median_blur,
    'arg_name': 'ksize',
    'bboxes': bboxes,
    'min_val': 1,
    'max_val': 15,
    'val_type': int,
    'num_steps': 7,
}
load_simulation(**kwargs)

# Fancy PCA
Visually, Fancy PCA looks like a combination of some of the previous augmentations, so one must decide whether to use it or ignore it, since its effect is already covered by the others.

In [None]:
kwargs = {
    'aug_img_arg_name': 'img',
    'base_img': np_img,
    'aug_function': alb_aug.fancy_pca,
    'arg_name': 'alpha',
    'bboxes': bboxes,
    'min_val': 0,
    'max_val': 1,
    'num_steps': 10,
}
load_simulation(**kwargs)

# Camera sensor noise (ISO)
ISO noise seems useful for simulating camera sensor noise. In this competition the cameras might not have that problem, so whether this augmentation helps needs further discussion.

In [None]:
kwargs = {
    'aug_img_arg_name': 'img',
    'base_img': np_img,
    'aug_function': alb_aug.iso_noise,
    'arg_name': 'intensity',
    'bboxes': bboxes,
    'min_val': 0,
    'max_val': 1,
    'num_steps': 10,
}
load_simulation(**kwargs)

# Downscale
This augmentation could be useful for `scale` between $0.9$ and $1$, but again one should be careful with small targets.

In [None]:
kwargs = {
    'aug_img_arg_name': 'img',
    'base_img': np_img,
    'aug_function': alb_aug.downscale,
    'arg_name': 'scale',
    'bboxes': bboxes,
    'min_val': 0.1,
    'max_val': 1,
    'num_steps': 19,
    'aug_kwargs': {'interpolation': cv2.INTER_NEAREST}
}
load_simulation(**kwargs)

# Image compression
This one is interesting as it is a special kind of `downscale`, so one should treat it similarly: small objects might get harder to learn.

In [None]:
kwargs = {
    'aug_img_arg_name': 'img',
    'base_img': np_img,
    'aug_function': alb_aug.image_compression,
    'arg_name': 'quality',
    'bboxes': bboxes,
    'min_val': 100,
    'max_val': 10,
    'val_type': int,
    'num_steps': 9,
    'aug_kwargs': {'image_type': '.jpg'}
}
load_simulation(**kwargs)

# Summary
As seen in the previous figures, some of these augmentations are suitable for this challenge and some are not. As mentioned before, it is important to see how these transformations affect the targets to detect, more specifically the COTS in this competition. Good generalisation and small object detection are important factors for improving our score, so these augmentations should be used carefully. Of course, we should not rely completely on visualizations to analyze augmentations, but they are always a good starting point for understanding them.
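
The ranges suggested in the sections above can be collected in one place and sampled from at training time. The sketch below is one possible way to do that in plain Python. The `RANGES` dictionary and the `sample_params` helper are hypothetical, and the bounds are only the rough visual estimates from this notebook. The lower bound of 0 for brightness and the reading of "above 4" as 5 to 8 bits are assumptions.

```python
import random
from typing import Dict, Tuple, Union

Number = Union[int, float]

# (low, high, type) per parameter; rough visual estimates, not tuned values
RANGES: Dict[str, Tuple[Number, Number, type]] = {
    "brightness_factor": (0.0, 0.3, float),
    "contrast_factor": (0.6, 1.0, float),
    "gamma": (0.4, 1.0, float),
    "saturation_factor": (0.0, 3.0, float),
    "hue_factor": (-0.31, 0.31, float),
    "bits": (5, 8, int),
}


def sample_params(ranges: Dict[str, Tuple[Number, Number, type]],
                  rng: random.Random) -> Dict[str, Number]:
    """Draw one value per parameter, uniformly within its range."""
    params: Dict[str, Number] = {}
    for name, (low, high, kind) in ranges.items():
        if kind is int:
            params[name] = rng.randint(low, high)   # both ends included
        else:
            params[name] = rng.uniform(low, high)
    return params


rng = random.Random(37)  # same seed value as the notebook, for repeatable draws
params = sample_params(RANGES, rng)
for name, value in params.items():
    print(name, value)
```

Each sampled value could then be passed to the matching augmentation function in the same keyword style used throughout this notebook.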

# References
- [Kornia: differentiable computer vision library for PyTorch](https://kornia.readthedocs.io/en/latest/)
- [Albumentations: fast and flexible image augmentations](https://albumentations.ai/)
- For plotting [plotly](https://plotly.com/python/getting-started/)