# **Dataset augmentation**

---

<font size = 4>To train a CARE (3D) model, we need a dataset with enough images to avoid overfitting. If the number of original images is low, the dataset can be enlarged artificially. This notebook provides 3 functions that augment the source (low-resolution) and target (high-resolution) images of the dataset: the first adjusts brightness, the second adds Gaussian noise, and the third rotates and flips each brightness-adjusted or noisy image. Each original image yields one brightness-adjusted copy, one noisy copy and six rotated-and-flipped copies, so the dataset becomes 9 times larger.
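
<font size = 4>The factor of 9 can be checked with a small count. The helper below is only an illustrative sketch (the name `augmented_dataset_size` is not part of the notebook); it assumes one brightness copy and one noisy copy per original, and three rotated-and-flipped copies per brightness or noisy image, as in the cells of section 4.

```python
def augmented_dataset_size(n_original: int) -> int:
    """Count image pairs after brightness, noise and rotate/flip augmentation."""
    n_bright = n_original                   # one brightness-adjusted copy per original
    n_noisy = n_original                    # one noisy copy per original
    n_rot_flip = 3 * (n_bright + n_noisy)   # 90, 180 and 270 degree rotations, each flipped
    return n_original + n_bright + n_noisy + n_rot_flip


print(augmented_dataset_size(10))  # 90
```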

## **1. Dependencies**
---


### **1.1. Install dependencies**
---
<font size = 4>

In [1]:
#@markdown ##Install dependencies
! pip install -q readlif
! pip install -q SimpleITK
! pip install -q aicsimageio


### **1.2. Load key dependencies**
---
<font size = 4>

In [2]:
#@markdown ##Load dependencies

from readlif.reader import LifFile
import numpy as np
import tifffile as tf
import os
import SimpleITK as sitk
from aicsimageio import AICSImage
import shutil
from skimage import io, img_as_uint, exposure
from skimage.util import random_noise
import xml.etree.ElementTree as ET
import ipywidgets as widgets
from IPython.display import display

## **2. Initialise the Colab session**
---

### **2.1. Mount your Google Drive**
---
<font size = 4> To use this notebook on the data present in your Google Drive, you need to mount your Google Drive to this notebook.

<font size = 4> Play the cell below to mount your Google Drive and follow the link. In the new browser window, select your drive and select 'Allow', copy the code, paste into the cell and press enter. This will give Colab access to the data on the drive.

<font size = 4> Once this is done, your data are available in the **Files** tab at the top left of the notebook.

In [3]:
#@markdown ##Play the cell to connect your Google Drive to Colab

#@markdown * Follow the instructions.

#@markdown * Click on "Files" on the left and refresh the file list. Your Google Drive folder should now be available there as "gdrive".

# mount user's Google Drive to Google Colab.
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


**<font size = 4> If you cannot see your files, reactivate your session by connecting to your hosted runtime.**


<img width="40%" alt ="Connect to a hosted runtime." src="https://github.com/HenriquesLab/ZeroCostDL4Mic/raw/master/Wiki_files/connect_to_hosted.png"><figcaption> Connect to a hosted runtime. </figcaption>

## **3. Select your paths**

---


<font size = 4> **Path and folders of the input images**

<font size = 4>**`base_path`:** the path of the folder in your Google Drive that contains the folders with the high-resolution and low-resolution images in '.tiff' format.

<font size = 4>**`input_folder_down`:** the **name** of the folder with the source (low-resolution) images. The augmented source images are saved in this same folder.

<font size = 4>**`input_folder_up`:** the **name** of the folder with the target (high-resolution) images. The augmented target images are saved in this same folder.


<font size = 4>To find the paths, go to **Files** on the left of the notebook, navigate to the folder containing your files, right-click on it, select **Copy path** and paste the path into the corresponding box below.


In [4]:
#@markdown ##Base path and input folders names:

# base folder
base_path = "/content/gdrive/MyDrive/Colab Notebooks/res" #@param {type:"string"}

# low-resolution .tiff images folder name
input_folder_down = "train_down" #@param {type:"string"}

# high-resolution .tiff images folder name
input_folder_up = "train_up" #@param {type:"string"}
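
<font size = 4>The augmentation functions below pair source and target images by their position in the two folder listings, so it can be worth checking beforehand that both folders hold the same number of images. The following is a minimal sketch, not part of the original notebook: the helper name `list_image_pairs` and the extension filter are assumptions, and it presumes that matching source and target files sort into the same order (for example because they share file names).

```python
import os


def list_image_pairs(base_path, folder_up, folder_down, extensions=(".tif", ".tiff")):
    """Return sorted (target, source) file-name pairs; raise if the folders disagree."""

    def list_images(folder):
        # keep only TIFF files and sort them, since os.listdir order is arbitrary
        path = os.path.join(base_path, folder)
        return sorted(f for f in os.listdir(path) if f.lower().endswith(extensions))

    files_up = list_images(folder_up)
    files_down = list_images(folder_down)
    if len(files_up) != len(files_down):
        raise ValueError(
            f"{len(files_up)} target images but {len(files_down)} source images"
        )
    return list(zip(files_up, files_down))
```

<font size = 4>With the variables defined above, a call such as `list_image_pairs(base_path, input_folder_up, input_folder_down)` would return the pairs that the augmentation steps are expected to process together.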

## **4. Augmentation**
---

### 4.1. Add Brightness
<font size = 4>**`brightness_range`:** this parameter sets the range of gamma values used for the brightness adjustment; one value is randomly sampled within this range for each pair of images. Values below 1 make the images brighter and values above 1 make them darker. We recommend the following range: 0.7, 0.8. Please use this format when specifying the range. Disregard the red warning message shown below this field.

In [6]:
#@markdown ### Add Brightness
def apply_brightness_to_images(base_path, input_folder_up, input_folder_down, gamma_range):
    ''' function that reads imgs from source and target folders, applies a random brightness adjustment,
    within the specified range of gamma (gamma >1: img will be darker; gamma <1: img will be brighter),
    to each pair of imgs, saves processed imgs into separate output folders,
    moves processed imgs back to their original input folders and finally removes intermediate folders'''

    # paths
    input_path_up = os.path.join(base_path, input_folder_up)
    input_path_down = os.path.join(base_path, input_folder_down)

    folder_name_up = 'brightness_up'
    folder_name_down = 'brightness_down'

    output_folder_up = os.path.join(base_path, folder_name_up)
    os.makedirs(output_folder_up, exist_ok=True)

    output_folder_down = os.path.join(base_path, folder_name_down)
    os.makedirs(output_folder_down, exist_ok=True)

    # skip files that contain 'brightness' or 'noise' in their names (already augmented);
    # sort both lists because os.listdir returns files in arbitrary order and the imgs are paired by position
    file_list_up = sorted(file for file in os.listdir(input_path_up) if 'brightness' not in file.lower() and 'noise' not in file.lower() and not file.lower().endswith('.ds_store'))
    file_list_down = sorted(file for file in os.listdir(input_path_down) if 'brightness' not in file.lower() and 'noise' not in file.lower() and not file.lower().endswith('.ds_store'))

    # process source and target imgs in parallel
    for image_up, image_down in zip(file_list_up, file_list_down):
        # read imgs
        img_up = io.imread(os.path.join(input_path_up, image_up))
        img_down = io.imread(os.path.join(input_path_down, image_down))

        # generate random brightness value within the specified range
        brightness = np.random.uniform(gamma_range[0], gamma_range[1])

        # apply brightness to imgs
        bright_img_up = exposure.adjust_gamma(img_up, gamma=brightness)
        bright_img_down = exposure.adjust_gamma(img_down, gamma=brightness)

        # convert imgs to uint16 format (16 bits per pixel)
        bright_img_up_16bit = img_as_uint(bright_img_up)
        bright_img_down_16bit = img_as_uint(bright_img_down)

        # output img names
        output_image_name_up = f'{os.path.splitext(image_up)[0]}_brightness.tiff'
        output_image_name_down = f'{os.path.splitext(image_down)[0]}_brightness.tiff'

        # save processed imgs to output folders
        io.imsave(os.path.join(output_folder_up, output_image_name_up), bright_img_up_16bit, compression='lzw')
        io.imsave(os.path.join(output_folder_down, output_image_name_down), bright_img_down_16bit, compression='lzw')

    # move processed imgs to original input folders
    for file_up, file_down in zip(os.listdir(output_folder_up), os.listdir(output_folder_down)):
        shutil.move(os.path.join(output_folder_up, file_up), os.path.join(input_path_up, file_up))
        shutil.move(os.path.join(output_folder_down, file_down), os.path.join(input_path_down, file_down))

    # remove intermediate folders
    shutil.rmtree(output_folder_up)
    shutil.rmtree(output_folder_down)


brightness_range = 0.7, 0.8 #@param{type:"number"}

if brightness_range is None:
  print('Please enter a value')
else:
  # call function
  apply_brightness_to_images(base_path, input_folder_up, input_folder_down, brightness_range)
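
<font size = 4>To illustrate why a gamma below 1 brightens an image, here is a minimal NumPy-only sketch of a gamma adjustment on a toy 16-bit stack. It is an approximation written for illustration, not the scikit-image implementation used above; the helper name `gamma_adjust_uint16` and the random toy data are assumptions.

```python
import numpy as np


def gamma_adjust_uint16(img, gamma):
    """Apply out = in**gamma on intensities normalised to [0, 1], then rescale to uint16."""
    scale = float(np.iinfo(np.uint16).max)
    normalised = img.astype(np.float64) / scale
    return np.round(normalised ** gamma * scale).astype(np.uint16)


rng = np.random.default_rng(0)
# toy (Z, Y, X) stack standing in for a 16-bit microscopy image
stack = rng.integers(0, 65536, size=(4, 8, 8), dtype=np.uint16)

gamma = rng.uniform(0.7, 0.8)                 # sampled within the recommended range
brighter = gamma_adjust_uint16(stack, gamma)  # gamma < 1 raises intensities
darker = gamma_adjust_uint16(stack, 1.5)      # gamma > 1 lowers intensities

print(stack.mean(), brighter.mean(), darker.mean())
```

<font size = 4>Because the same gamma is applied to the source and the target image of a pair, their relative intensities stay consistent after the adjustment.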

### 4.2. Add Gaussian noise
<font size = 4>**`noise_var_range`:** this parameter sets the range of the Gaussian noise variance; one value is randomly sampled within this range for each pair of images. We recommend the following range: 0.001, 0.002. Please use this format when specifying the range. Disregard the red warning message shown below this field.

In [8]:
#@markdown ### Add Gaussian noise
def apply_gaussian_noise_to_images(base_path, input_folder_up, input_folder_down, var_range):
    ''' function that reads imgs from the source and target folders, applies Gaussian noise with
    a randomly selected variance within the specified range to each pair of images, saves processed
    imgs into separate output folders, moves processed imgs back to their original input folders and
    finally removes intermediate folders'''

    # paths
    input_path_up = os.path.join(base_path, input_folder_up)
    input_path_down = os.path.join(base_path, input_folder_down)

    folder_name_up = 'noise_up'
    folder_name_down = 'noise_down'

    output_folder_up = os.path.join(base_path, folder_name_up)
    os.makedirs(output_folder_up, exist_ok=True)

    output_folder_down = os.path.join(base_path, folder_name_down)
    os.makedirs(output_folder_down, exist_ok=True)

    # skip files that contain 'brightness' or 'noise' in their names (already augmented);
    # sort both lists because os.listdir returns files in arbitrary order and the imgs are paired by position
    file_list_up = sorted(file for file in os.listdir(input_path_up) if 'brightness' not in file.lower() and 'noise' not in file.lower() and not file.lower().endswith('.ds_store'))
    file_list_down = sorted(file for file in os.listdir(input_path_down) if 'brightness' not in file.lower() and 'noise' not in file.lower() and not file.lower().endswith('.ds_store'))

    # process source and target imgs in parallel
    for image_up, image_down in zip(file_list_up, file_list_down):
        # read images
        img_up = io.imread(os.path.join(input_path_up, image_up))
        img_down = io.imread(os.path.join(input_path_down, image_down))

        # apply Gaussian noise to imgs from both folders
        noise_var = np.random.uniform(var_range[0], var_range[1])

        noisy_img_up = random_noise(img_up, mode='gaussian', mean=0.0, var=noise_var, clip=True)
        noisy_img_down = random_noise(img_down, mode='gaussian', mean=0.0, var=noise_var, clip=True)

        # convert imgs to uint16 format (16 bits per pixel)
        noisy_img_up_16bit = img_as_uint(noisy_img_up)
        noisy_img_down_16bit = img_as_uint(noisy_img_down)

        # output img names
        output_image_name_up = f'{os.path.splitext(image_up)[0]}_noise.tiff'
        output_image_name_down = f'{os.path.splitext(image_down)[0]}_noise.tiff'

        # save noisy imgs in separate output folders
        io.imsave(os.path.join(output_folder_up, output_image_name_up), noisy_img_up_16bit, compression='lzw')
        io.imsave(os.path.join(output_folder_down, output_image_name_down), noisy_img_down_16bit, compression='lzw')

    # move noisy imgs to their respective original input folders
    for file_up, file_down in zip(os.listdir(output_folder_up), os.listdir(output_folder_down)):
        shutil.move(os.path.join(output_folder_up, file_up), os.path.join(input_path_up, file_up))
        shutil.move(os.path.join(output_folder_down, file_down), os.path.join(input_path_down, file_down))

    # remove intermediate folders
    shutil.rmtree(output_folder_up)
    shutil.rmtree(output_folder_down)

noise_var_range = 0.001, 0.002 #@param{type:"number"}

if noise_var_range is None:
  print('Please enter a value')
else:
  # call function
  apply_gaussian_noise_to_images(base_path, input_folder_up, input_folder_down, noise_var_range)
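
<font size = 4>The effect of the variance parameter can be sketched with NumPy alone. The example below is an illustrative approximation of additive Gaussian noise on intensities rescaled to the 0-1 range, not the scikit-image code used above; the helper name `add_gaussian_noise_uint16` and the constant toy stack are assumptions.

```python
import numpy as np


def add_gaussian_noise_uint16(img, var, rng):
    """Add zero-mean Gaussian noise of variance `var` on a [0, 1] scale, return uint16."""
    scale = float(np.iinfo(np.uint16).max)
    as_float = img.astype(np.float64) / scale
    noisy = as_float + rng.normal(0.0, np.sqrt(var), size=img.shape)
    noisy = np.clip(noisy, 0.0, 1.0)          # keep values inside the valid range
    return np.round(noisy * scale).astype(np.uint16)


rng = np.random.default_rng(0)
# toy (Z, Y, X) stack with a constant mid-range intensity
stack = np.full((4, 64, 64), 30000, dtype=np.uint16)

noise_var = rng.uniform(0.001, 0.002)         # sampled within the recommended range
noisy = add_gaussian_noise_uint16(stack, noise_var, rng)

# the measured spread on the [0, 1] scale should be close to sqrt(noise_var)
measured_std = (noisy / 65535.0).std()
print(np.sqrt(noise_var), measured_std)
```

<font size = 4>A variance of 0.001 to 0.002 corresponds to a standard deviation of roughly 3 to 4.5% of the full intensity range, which is why the recommended values look small.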

### 4.3. Rotate and flip noisy and bright images

In [9]:
#@markdown ### Rotate and flip noisy and bright images

def rotation_flip(base_path, input_folder_up, input_folder_down):
    ''' function that applies rotation and flip transformations to imgs in the specified input folders,
    saves the transformed imgs into separate output folders, moves the processed imgs back to their original input folders,
    and removes intermediate folders'''

    # paths
    input_path_up = os.path.join(base_path, input_folder_up)
    input_path_down = os.path.join(base_path, input_folder_down)

    folder_name_up = 'rot_flip_up'
    folder_name_down = 'rot_flip_down'

    output_folder_up = os.path.join(base_path, folder_name_up)
    os.makedirs(output_folder_up, exist_ok=True)

    output_folder_down = os.path.join(base_path, folder_name_down)
    os.makedirs(output_folder_down, exist_ok=True)

    # keep only files that contain 'brightness' or 'noise' in their names;
    # sort both lists because os.listdir returns files in arbitrary order and the imgs are paired by position
    file_list_up = sorted(file for file in os.listdir(input_path_up) if 'brightness' in file.lower() or 'noise' in file.lower())
    file_list_down = sorted(file for file in os.listdir(input_path_down) if 'brightness' in file.lower() or 'noise' in file.lower())

    # process source and target imgs in parallel
    for image_up, image_down in zip(file_list_up, file_list_down):

        # read imgs
        img_up = io.imread(os.path.join(input_path_up, image_up))
        img_down = io.imread(os.path.join(input_path_down, image_down))

        # img_up rotation and flip
        img_up_90 = np.rot90(img_up,axes=(1,2))
        img_up_180 = np.rot90(img_up_90,axes=(1,2))
        img_up_270 = np.rot90(img_up_180,axes=(1,2))

        img_up_90_lr = np.fliplr(img_up_90)
        img_up_180_lr = np.fliplr(img_up_180)
        img_up_270_lr = np.fliplr(img_up_270)

        # img_down rotation and flip
        img_down_90 = np.rot90(img_down,axes=(1,2))
        img_down_180 = np.rot90(img_down_90,axes=(1,2))
        img_down_270 = np.rot90(img_down_180,axes=(1,2))

        img_down_90_lr = np.fliplr(img_down_90)
        img_down_180_lr = np.fliplr(img_down_180)
        img_down_270_lr = np.fliplr(img_down_270)

        # save up and down imgs in separate output folders
        # up_images
        io.imsave(os.path.join(output_folder_up, os.path.splitext(image_up)[0] + '_90_lr.tiff'), img_up_90_lr)
        io.imsave(os.path.join(output_folder_up, os.path.splitext(image_up)[0] + '_180_lr.tiff'), img_up_180_lr)
        io.imsave(os.path.join(output_folder_up, os.path.splitext(image_up)[0] + '_270_lr.tiff'), img_up_270_lr)
        # down_images
        io.imsave(os.path.join(output_folder_down, os.path.splitext(image_down)[0] + '_90_lr.tiff'), img_down_90_lr)
        io.imsave(os.path.join(output_folder_down, os.path.splitext(image_down)[0] + '_180_lr.tiff'), img_down_180_lr)
        io.imsave(os.path.join(output_folder_down, os.path.splitext(image_down)[0] + '_270_lr.tiff'), img_down_270_lr)

    # move imgs to their respective original input folders
    for file_up, file_down in zip(os.listdir(output_folder_up), os.listdir(output_folder_down)):
        shutil.move(os.path.join(output_folder_up, file_up), os.path.join(input_path_up, file_up))
        shutil.move(os.path.join(output_folder_down, file_down), os.path.join(input_path_down, file_down))

    # remove intermediate output folders
    shutil.rmtree(output_folder_up)
    shutil.rmtree(output_folder_down)

#call function
rotation_flip(base_path, input_folder_up, input_folder_down)
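
<font size = 4>The behaviour of the rotation and flip calls on a 3D stack can be checked on a small array. This is an illustrative sketch that assumes the images are stored in (Z, Y, X) order; the toy array and variable names are not part of the notebook.

```python
import numpy as np

# toy stack in (Z, Y, X) order: 2 slices of 3 x 4 pixels
stack = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# axes=(1, 2) rotates every Z slice in its own plane and leaves the Z axis untouched
rot90 = np.rot90(stack, axes=(1, 2))
rot180 = np.rot90(stack, k=2, axes=(1, 2))
rot270 = np.rot90(stack, k=3, axes=(1, 2))

# np.fliplr reverses axis 1, which is the Y axis for a (Z, Y, X) stack
flipped = np.fliplr(rot90)

print(stack.shape, rot90.shape, rot180.shape, rot270.shape, flipped.shape)
```

<font size = 4>Note that 90 and 270 degree rotations swap the Y and X sizes of non-square images, while the number of Z slices never changes.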