# Data Augmentation
This notebook demonstrates data augmentation techniques that mix two images into one training sample: [MixUp](https://openreview.net/pdf?id=r1Ddp1-Rb), [CutMix](http://openaccess.thecvf.com/content_ICCV_2019/papers/Yun_CutMix_Regularization_Strategy_to_Train_Strong_Classifiers_With_Localizable_Features_ICCV_2019_paper.pdf), and [VH-MixUp](https://arxiv.org/pdf/1805.11272.pdf).

| Image 1 | Image 2 | Mixup | CutMix | VH-Mixup |
| --- | --- | --- | --- | --- |
| &nbsp; <img src="https://blog.nnabla.org/wp-content/uploads/sites/2/2020/04/07130642/image1.png" alt="" width="128" height="128" class="size-full wp-image-1074" />&nbsp;  | &nbsp; <img src="https://blog.nnabla.org/wp-content/uploads/sites/2/2020/04/07130708/image2.png" alt="" width="128" height="128" class="size-full wp-image-1075" /> &nbsp; | &nbsp; <img src="https://blog.nnabla.org/wp-content/uploads/sites/2/2020/04/07131002/mixuped_img.png" alt="" width="128" height="128" class="size-full wp-image-1076" /> &nbsp; | &nbsp; <img src="https://blog.nnabla.org/wp-content/uploads/sites/2/2020/04/07131130/cutmixed_img.png" alt="" width="128" height="128" class="size-full wp-image-1077" /> &nbsp; | &nbsp; <img src="https://blog.nnabla.org/wp-content/uploads/sites/2/2020/04/07131216/VHmixuped.png" alt="" width="128" height="128" class="size-full wp-image-1078" /> &nbsp; |
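
As a rough illustration of the idea behind the Mixup column above, here is a minimal NumPy sketch. It assumes, as in the MixUp paper, that a single ratio is drawn from Beta(alpha, alpha) and applied to both images and one-hot labels. The helper name `mixup` and the toy arrays are hypothetical and are not part of nnabla-examples.

```python
import numpy as np

def mixup(x1, x2, y1, y2, alpha=0.5, rng=None):
    """Blend two samples and their one-hot labels with one Beta-sampled ratio.

    Hypothetical helper for illustration; alpha must be greater than 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)        # mixing ratio in [0, 1]
    x = lam * x1 + (1.0 - lam) * x2     # pixel-wise convex combination
    y = lam * y1 + (1.0 - lam) * y2     # labels are mixed with the same ratio
    return x, y, lam

# Toy data: an all-black and an all-white 4x4 RGB image in CHW layout.
rng = np.random.default_rng(0)
img_a = np.zeros((3, 4, 4))
img_b = np.ones((3, 4, 4))
mixed, label, lam = mixup(
    img_a, img_b, np.array([1.0, 0.0]), np.array([0.0, 1.0]), rng=rng
)
print(mixed.shape, label, lam)
```

Every pixel of `mixed` is the same weighted average of the two inputs, which is why the Mixup result looks like a double exposure.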

# Preparation
Let's start by installing nnabla and cloning the [nnabla-examples repository](https://github.com/sony/nnabla-examples). If you're running on Colab, make sure the runtime type is set to GPU from the top menu (Runtime → Change runtime type), and click **Connect** at the top right of the screen before you start.

In [None]:
!pip install nnabla-ext-cuda100
!git clone https://github.com/sony/nnabla-examples.git
%run nnabla-examples/interactive-demos/colab_utils.py
%cd nnabla-examples/data_augmentation

Next, let's import the required modules.

In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
import nnabla as nn
from nnabla.ext_utils import get_extension_context
from nnabla.utils.image_utils import imread, imresize
# Imported explicitly so that F.one_hot below does not depend on the star import.
import nnabla.functions as F

from MixedDataLearning import *
from google.colab import files
from IPython.display import Image,display

ctx = get_extension_context("cudnn")
nn.set_default_context(ctx)

# Upload first image

Now, upload an image you'd like to use for data augmentation. 

In [None]:
img1 = files.upload()

# Rename the uploaded file for convenience. You can ignore the lines below.
ext = os.path.splitext(list(img1.keys())[-1])[-1]
os.rename(list(img1.keys())[-1], "input_image1{}".format(ext)) 
input_img1 = "input_image1" + ext

In [None]:
display(Image(input_img1))

# Upload second image

Next, upload the second image you'd like to mix with the first one. 

In [None]:
img2 = files.upload()

# Rename the uploaded file for convenience. You can ignore the lines below.
ext = os.path.splitext(list(img2.keys())[-1])[-1]
os.rename(list(img2.keys())[-1], "input_image2{}".format(ext)) 
input_img2 = "input_image2" + ext

In [None]:
display(Image(input_img2))

In [None]:
#@title Here we resize the uploaded images. To see the details, double-click this cell.
image1 = imread(input_img1, channel_first=True)[:3]
image2 = imread(input_img2, channel_first=True)[:3]
scale = float(image1.shape[1]) / image2.shape[1]
image2 = imresize(image2, size=(int(image2.shape[2]*scale), int(image2.shape[1]*scale)), channel_first=True)

larger_shape = [max(image1.shape[i], image2.shape[i]) for i in range(3)]
pad_length_1 = [larger_shape[i] - image1.shape[i] for i in range(3)]
pad_length_2 = [larger_shape[i] - image2.shape[i] for i in range(3)]

image1 = np.pad(image1, (
                (0, 0),
                (pad_length_1[1] // 2, pad_length_1[1] // 2 + pad_length_1[1] % 2),
                (pad_length_1[2] // 2, pad_length_1[2] // 2 + pad_length_1[2] % 2)),
                mode="reflect")

image2 = np.pad(image2, (
                (0, 0),
                (pad_length_2[1] // 2, pad_length_2[1] // 2 + pad_length_2[1] % 2),
                (pad_length_2[2] // 2, pad_length_2[2] // 2 + pad_length_2[2] % 2)),
                mode="reflect")
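
The padding arithmetic in the cell above splits the missing rows and columns as evenly as possible between the two sides, with any odd leftover going to the bottom or right. A small sketch of the same idea follows. `center_pad` is a hypothetical helper, not part of nnabla, and it assumes the target size is at least as large as the image.

```python
import numpy as np

def center_pad(image, target_hw, mode="reflect"):
    """Pad a CHW image to target (height, width), keeping it centered.

    Hypothetical helper; assumes target_hw is not smaller than the image.
    """
    _, h, w = image.shape
    dh, dw = target_hw[0] - h, target_hw[1] - w
    # The first half goes before the image, the remainder (one more if odd) after.
    pad = ((0, 0), (dh // 2, dh - dh // 2), (dw // 2, dw - dw // 2))
    return np.pad(image, pad, mode=mode)

small = np.arange(3 * 4 * 6, dtype=np.float32).reshape(3, 4, 6)
padded = center_pad(small, (7, 6))
print(small.shape, "->", padded.shape)
```

The original pixels stay untouched in the middle of the padded array, so both images end up with identical shapes without being cropped.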

In [None]:
#@title Choose data augmentation config.

#@markdown Choose which data augmentation to use.
mixtype = "vhmixup"  #@param ['mixup', 'cutmix', 'vhmixup']
#@markdown Choose the alpha value. (default: 0.5)
alpha = 1.04  #@param {type: "slider", min: 0.0, max: 2.0, step: 0.01}
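
To get a feel for what the alpha slider changes, the sketch below assumes the mix ratio is drawn from Beta(alpha, alpha), as in the MixUp and CutMix papers. It estimates how often the ratio lands close to 0 or 1, which is the case where one image dominates the result. The function name `edge_share` and the 0.1 margin are illustrative choices only.

```python
import numpy as np

def edge_share(alpha, rng, n=10_000, margin=0.1):
    """Fraction of Beta(alpha, alpha) samples within `margin` of 0 or 1."""
    lam = rng.beta(alpha, alpha, size=n)
    return float(np.mean((lam < margin) | (lam > 1.0 - margin)))

rng = np.random.default_rng(0)
for a in (0.2, 1.0, 2.0):
    print(f"alpha={a}: share of ratios near 0 or 1 = {edge_share(a, rng):.2f}")
```

With alpha = 1 the ratio is uniform on [0, 1]. Smaller values push it toward the extremes, so most outputs look like one of the inputs, while larger values concentrate it around 0.5. Note that Beta(alpha, alpha) is only defined for alpha greater than 0.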


Now set up the mix augmentation and the input variables.

In [None]:
inshape = (2,) + image1.shape
if mixtype == "mixup":
    mdl = MixupLearning(2, alpha=alpha)
elif mixtype == "cutmix":
    mdl = CutmixLearning(inshape, alpha=alpha, cutmix_prob=1.0)
else:
    # "vhmixup" is used.
    mdl = VHMixupLearning(inshape, alpha=alpha)

image_train = nn.Variable(inshape)
label_train = nn.Variable((2, 1))
mix_image, mix_label = mdl.mix_data(image_train, F.one_hot(label_train, (2, )))
image_train.d[0] = image1 / 255.
image_train.d[1] = image2 / 255.
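
For comparison with the `CutmixLearning` option above, here is a plain NumPy sketch of the region swap described in the CutMix paper: a box whose area is roughly (1 - lam) of the image is cut from the second image and pasted into the first. This is an illustration under that assumption, not the code in `MixedDataLearning`, and the helper name `cutmix` is hypothetical.

```python
import numpy as np

def cutmix(x1, x2, alpha=1.0, rng=None):
    """Paste a random box from x2 into x1 (both in CHW layout).

    Returns the mixed image and the share of pixels still coming from x1.
    Hypothetical helper for illustration; alpha must be greater than 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, h, w = x1.shape
    lam = rng.beta(alpha, alpha)
    cut_ratio = np.sqrt(1.0 - lam)              # box side ratio so area is about 1 - lam
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = rng.integers(h), rng.integers(w)   # random box center
    top, bottom = np.clip([cy - cut_h // 2, cy + cut_h // 2], 0, h)
    left, right = np.clip([cx - cut_w // 2, cx + cut_w // 2], 0, w)
    mixed = x1.copy()
    mixed[:, top:bottom, left:right] = x2[:, top:bottom, left:right]
    # Clipping at the border can shrink the box, so recompute the true ratio.
    lam_adjusted = 1.0 - (bottom - top) * (right - left) / (h * w)
    return mixed, lam_adjusted

black = np.zeros((3, 8, 8))
white = np.ones((3, 8, 8))
out, lam_adj = cutmix(black, white, rng=np.random.default_rng(0))
print(out.shape, lam_adj)
```

Unlike MixUp, each output pixel comes from exactly one of the two images, so the label weights follow the area ratio rather than a blending factor.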

# Apply Mix Augmentation
Running the following cell executes the augmentation and displays the augmented image. Because the mix ratio is sampled randomly, the output differs every time you run the cell. Simple as they are, these augmentation techniques are useful in practice and can improve network performance.

In [None]:
mdl.set_mix_ratio()
mix_image.forward()
plt.imshow(mix_image.d[1].transpose(1,2,0))
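
VH-Mixup is the least familiar of the three options, so a NumPy sketch may help. Following the description in the linked paper, it assumes three steps: the two images are joined vertically, joined horizontally, and the two joined results are then blended as in MixUp. The helper name `vh_mixup` and the choice of Beta(alpha, alpha) for all three ratios are assumptions for illustration and may differ from `VHMixupLearning`.

```python
import numpy as np

def vh_mixup(x1, x2, alpha=1.0, rng=None):
    """Blend a vertical and a horizontal concatenation of two CHW images.

    Returns the mixed image and the share of the result that comes from x1.
    Hypothetical helper for illustration; alpha must be greater than 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, h, w = x1.shape
    lam_v, lam_h, lam_m = rng.beta(alpha, alpha, size=3)
    rows, cols = int(round(lam_v * h)), int(round(lam_h * w))
    # Top rows from x1, remaining rows from x2.
    vertical = np.concatenate([x1[:, :rows], x2[:, rows:]], axis=1)
    # Left columns from x1, remaining columns from x2.
    horizontal = np.concatenate([x1[:, :, :cols], x2[:, :, cols:]], axis=2)
    mixed = lam_m * vertical + (1.0 - lam_m) * horizontal
    share_x1 = lam_m * rows / h + (1.0 - lam_m) * cols / w
    return mixed, share_x1

white = np.ones((3, 8, 8))
black = np.zeros((3, 8, 8))
vh_out, share = vh_mixup(white, black, rng=np.random.default_rng(0))
print(vh_out.shape, share)
```

The result has up to four visually distinct quadrants, which matches the blocky look of the VH-Mixup thumbnail at the top of this notebook.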