# **Augmentor**

<font size = 4>Data augmentation can improve training by increasing the variety of examples in the dataset. This is useful when the available dataset is small because, without augmentation, a network can quickly memorise every example in the dataset (overfitting). Augmentation is not necessary for training, and if your training dataset is large you should disable it.
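
<font size = 4>As a minimal illustration of the idea (not the Augmentor implementation), the sketch below uses NumPy to derive rotated and flipped variants from one small array standing in for an image. The array values and the helper name `simple_variants` are assumptions made for this example.

```python
import numpy as np

def simple_variants(image):
    """Return the original image plus four geometric variants."""
    return {
        "original": image,
        "rotate_90": np.rot90(image, k=1),   # 90 degrees counter-clockwise
        "rotate_270": np.rot90(image, k=3),  # 270 degrees counter-clockwise
        "flip_left_right": np.fliplr(image),
        "flip_top_bottom": np.flipud(image),
    }

# A tiny 2x3 array stands in for a microscopy image.
image = np.arange(6).reshape(2, 3)
variants = simple_variants(image)
for name, variant in variants.items():
    print(name, variant.shape)
```

<font size = 4>Each variant contains the same pixel values in a different arrangement, which is why such transformations add variety without requiring new acquisitions.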


---

<font size = 4>*Disclaimer*:

<font size = 4>This notebook is part of the *Zero-Cost Deep-Learning to Enhance Microscopy* project (https://github.com/HenriquesLab/DeepLearning_Collab/wiki). It is jointly developed by the Jacquemet (https://cellmig.org/) and Henriques (https://henriqueslab.github.io/) laboratories.

<font size = 4>[Augmentor](https://github.com/mdbloice/Augmentor) was described in the following article:

<font size = 4>Marcus D Bloice, Peter M Roth, Andreas Holzinger, Biomedical image augmentation using Augmentor, Bioinformatics, https://doi.org/10.1093/bioinformatics/btz259

<font size = 4>**Please also cite this original paper when using or developing this notebook.**

# **How to use this notebook?**

---

<font size = 4>Videos describing how to use our notebooks are available on YouTube:
  - [**Video 1**](https://www.youtube.com/watch?v=GzD2gamVNHI&feature=youtu.be): Full run through of the workflow to obtain the notebooks and the provided test datasets as well as a common use of the notebook
  - [**Video 2**](https://www.youtube.com/watch?v=PUuQfP5SsqM&feature=youtu.be): Detailed description of the different sections of the notebook


---
### **Structure of a notebook**

<font size = 4>The notebook contains two types of cell:  

<font size = 4>**Text cells** provide information and can be modified by double-clicking the cell. You are currently reading a text cell. You can create a new text cell by clicking `+ Text`.

<font size = 4>**Code cells** contain code, which can be modified by selecting the cell. To execute the cell, move your cursor to the `[ ]` mark on the left side of the cell (a play button appears) and click it. When execution is finished, the play button animation stops. You can create a new code cell by clicking `+ Code`.

---
### **Table of contents, Code snippets** and **Files**

<font size = 4>On the top left side of the notebook you will find three tabs which contain, from top to bottom:

<font size = 4>*Table of contents* = contains the structure of the notebook. Click an entry to move quickly between sections.

<font size = 4>*Code snippets* = contains examples of how to code certain tasks. You can ignore this when using this notebook.

<font size = 4>*Files* = contains all available files. After mounting your Google Drive (see section 1) you will find your files and folders here.

<font size = 4>**Remember that all uploaded files are purged after changing the runtime.** All files saved in Google Drive will remain. You do not need to use the Mount Drive button; your Google Drive is connected in section 1.

<font size = 4>**Note:** The "sample data" in "Files" contains default files. Do not upload anything in here!

---
### **Making changes to the notebook**

<font size = 4>**You can make a copy** of the notebook and save it to your Google Drive. To do this, click File -> Save a copy in Drive.

<font size = 4>To **edit a cell**, double-click on the text. This will show you either the source code (in code cells) or the source text (in text cells).
You can use the `#`-mark in code cells to comment out parts of the code. This allows you to keep the original code piece in the cell as a comment.

# **1. Mount your Google Drive**
---







<font size = 4> To use this notebook on the data present in your Google Drive, you need to mount your Google Drive to this notebook.

<font size = 4> Play the cell below to mount your Google Drive and follow the link. In the new browser window, select your drive and select 'Allow', copy the code, paste it into the cell and press Enter. This will give Colab access to the data on the drive.

<font size = 4> Once this is done, your data are available in the **Files** tab on the top left of the notebook.
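
<font size = 4>After mounting, a quick check that a folder is visible can save a failed run later. The sketch below is only an illustration: the helper name `list_folder` is hypothetical, and `/content/gdrive/MyDrive` is assumed to be where a personal Drive appears when the mount point is `/content/gdrive`, which may differ for your account.

```python
import os

def list_folder(path, limit=10):
    """Return up to `limit` sorted entries of `path`, or [] if the folder is missing."""
    if not os.path.isdir(path):
        print("Folder not found:", path)
        return []
    return sorted(os.listdir(path))[:limit]

# Assumed location of a personal Drive after mounting at /content/gdrive.
entries = list_folder("/content/gdrive/MyDrive")
print(entries)
```

<font size = 4>An empty list together with the "Folder not found" message suggests that the Drive is not mounted yet or that the path is different.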

In [None]:
#@markdown ##Run this cell to connect your Google Drive to Colab

#@markdown * Click on the URL. 

#@markdown * Sign in to your Google Account. 

#@markdown * Copy the authorization code. 

#@markdown * Enter the authorization code. 

#@markdown * Click on the "Files" tab on the left and refresh it. Your Google Drive folder should now be available here as "gdrive". 

#mounts user's Google Drive to Google Colab.

from google.colab import drive
drive.mount('/content/gdrive')

# **2. Install Augmentor and Dependencies**
---


In [None]:
Notebook_version = ['1.11']

#@markdown ##Install Augmentor and dependencies



#Here, we install libraries which are not already included in Colab.


!pip install Augmentor
import Augmentor
import os




# ------- Common variable to all ZeroCostDL4Mic notebooks -------
import numpy as np
from matplotlib import pyplot as plt
import urllib
import random
import shutil 
import zipfile
from tifffile import imread, imsave
import time
import sys
from pathlib import Path
import pandas as pd
import csv
from glob import glob
from scipy import signal
from scipy import ndimage
from skimage import io
from sklearn.linear_model import LinearRegression
from skimage.util import img_as_uint
import matplotlib as mpl
from skimage.metrics import structural_similarity
from skimage.metrics import peak_signal_noise_ratio as psnr
from astropy.visualization import simple_norm
from skimage import img_as_float32
from skimage.util import img_as_ubyte
from tqdm import tqdm 


# Colors for the warning messages
class bcolors:
  WARNING = '\033[31m'

#Suppress Python warnings to keep the cell output readable
import warnings
warnings.filterwarnings("ignore")

print("Libraries installed")

# Check if this is the latest version of the notebook
Latest_notebook_version = pd.read_csv("https://raw.githubusercontent.com/HenriquesLab/ZeroCostDL4Mic/master/Colab_notebooks/Latest_ZeroCostDL4Mic_Release.csv")

if Notebook_version == list(Latest_notebook_version.columns):
  print("This notebook is up-to-date.")
else:
  print(bcolors.WARNING +"A new version of this notebook has been released. We recommend that you download it at https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki")



# **3. Data augmentation**
---
<font size = 4>
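
<font size = 4>The cell below builds an Augmentor pipeline in which every enabled operation is applied to each generated image with its own probability, and the number of generated images is the number of source files multiplied by `Multiply_dataset_by`. The sketch below is a simplified model of that behaviour using only the Python standard library. It does not call Augmentor, and the names `expected_output_count` and `draw_operations` are assumptions made for illustration.

```python
import random

def expected_output_count(nb_files, multiply_dataset_by):
    """Number of augmented images requested from the pipeline."""
    return nb_files * multiply_dataset_by

def draw_operations(probabilities, rng):
    """Pick which operations fire for one generated image.

    Each operation is drawn independently: 0 disables it, 1 always applies it.
    """
    return [name for name, p in probabilities.items() if rng.random() < p]

# Example settings (assumed values, mirroring the sliders in the cell below).
probabilities = {
    "rotate_90_degrees": 0.5,
    "flip_left_right": 0.5,
    "random_zoom": 0,      # disabled
    "flip_top_bottom": 1,  # always applied
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
for _ in range(expected_output_count(nb_files=2, multiply_dataset_by=2)):
    print(draw_operations(probabilities, rng))
```

<font size = 4>Because the draws are independent, a generated image may receive several operations or none at all. Setting every probability to 0 would therefore only copy the input images.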





In [None]:
#Data augmentation


Training_source = "" #@param {type:"string"}

Matching_Training_target = False #@param {type:"boolean"}

Training_target = "" #@param {type:"string"}

Random_Crop = False #@param {type:"boolean"}

Crop_size = 1024  #@param {type:"number"}


#@markdown ####Choose a factor by which you want to multiply your original dataset

Multiply_dataset_by = 4 #@param {type:"slider", min:1, max:30, step:1}

Saving_path = "" #@param {type:"string"}


#@markdown ###Choose the probability of the following image manipulations to be used to augment your dataset (1 = always used; 0 = disabled ):

#@markdown ####Mirror and rotate images
rotate_90_degrees = 0.5 #@param {type:"slider", min:0, max:1, step:0.1}

rotate_270_degrees = 0.5 #@param {type:"slider", min:0, max:1, step:0.1}

flip_left_right = 0.5 #@param {type:"slider", min:0, max:1, step:0.1}

flip_top_bottom = 0.5 #@param {type:"slider", min:0, max:1, step:0.1}

#@markdown ####Random image Zoom

random_zoom = 0 #@param {type:"slider", min:0, max:1, step:0.1}

random_zoom_magnification = 0 #@param {type:"slider", min:0, max:1, step:0.1}

#@markdown ####Random image distortion

random_distortion = 0 #@param {type:"slider", min:0, max:1, step:0.1}


#@markdown ####Image shearing and skewing  

image_shear = 0 #@param {type:"slider", min:0, max:1, step:0.1}
max_image_shear = 1 #@param {type:"slider", min:1, max:25, step:1}

skew_image = 0 #@param {type:"slider", min:0, max:1, step:0.1}

skew_image_magnitude = 0 #@param {type:"slider", min:0, max:1, step:0.1}


list_files = os.listdir(Training_source)
Nb_files = len(list_files)

Nb_augmented_files = (Nb_files * Multiply_dataset_by)


Augmented_folder =  Saving_path+"/Augmented_Folder"
if os.path.exists(Augmented_folder):
  shutil.rmtree(Augmented_folder)
os.makedirs(Augmented_folder)

  
Training_source_augmented = Saving_path+"/Training_source_augmented"

if os.path.exists(Training_source_augmented):
  shutil.rmtree(Training_source_augmented)
os.makedirs(Training_source_augmented)

if Matching_Training_target:
  #Training_target_augmented = "/content/Training_target_augmented"
  Training_target_augmented = Saving_path+"/Training_target_augmented"

  if os.path.exists(Training_target_augmented):
    shutil.rmtree(Training_target_augmented)
  os.makedirs(Training_target_augmented)


# Here we generate the augmented images
#Load the images
p = Augmentor.Pipeline(Training_source, Augmented_folder)

#Define the matching images
if Matching_Training_target:
  p.ground_truth(Training_target)
#Define the augmentation possibilities



if Random_Crop:
  p.crop_by_size(probability=1, width=Crop_size, height=Crop_size, centre=False)

if not rotate_90_degrees == 0:
  p.rotate90(probability=rotate_90_degrees)
  
if not rotate_270_degrees == 0:
  p.rotate270(probability=rotate_270_degrees)

if not flip_left_right == 0:
  p.flip_left_right(probability=flip_left_right)

if not flip_top_bottom == 0:
  p.flip_top_bottom(probability=flip_top_bottom)

if not random_zoom == 0:
  p.zoom_random(probability=random_zoom, percentage_area=random_zoom_magnification)
 
if not random_distortion == 0:
  p.random_distortion(probability=random_distortion, grid_width=4, grid_height=4, magnitude=8)

if not image_shear == 0:
  #Use the maximum shear angle chosen above instead of a hard-coded value
  p.shear(probability=image_shear,max_shear_left=max_image_shear,max_shear_right=max_image_shear)
  
if not skew_image == 0:
  p.skew(probability=skew_image,magnitude=skew_image_magnitude)

p.sample(int(Nb_augmented_files))

print(int(Nb_augmented_files),"images generated")

# Here we sort through the images and move them back to the augmented training source and target folders

groundtruth_prefix = "_groundtruth_(1)_"

for f in os.listdir(Augmented_folder):
  if f.startswith(groundtruth_prefix):
    # Target images: strip the ground-truth prefix so the names match the source images
    shortname_noprefix = f[len(groundtruth_prefix):]
    shutil.copyfile(os.path.join(Augmented_folder, f), os.path.join(Training_target_augmented, shortname_noprefix))
  else:
    shutil.copyfile(os.path.join(Augmented_folder, f), os.path.join(Training_source_augmented, f))

# Remove the '_original' tag from the source file names (full paths, so the working directory is unchanged)
for filename in os.listdir(Training_source_augmented):
  os.rename(os.path.join(Training_source_augmented, filename),
            os.path.join(Training_source_augmented, filename.replace('_original', '')))

# Here we clean up the extra files
shutil.rmtree(Augmented_folder)






# **4. Download your images**
---

<font size = 4>**Store your data** and ALL its results elsewhere by downloading them from Google Drive. Afterwards, clean the original folder tree (datasets, results, trained model etc.) if you plan to train or use new networks. Please note that the notebook will otherwise **OVERWRITE** all files which have the same name.
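
<font size = 4>One way to make the download easier is to pack a results folder into a single zip archive first. The sketch below uses `shutil.make_archive` from the Python standard library. The helper name `zip_folder` and the example paths are assumptions, so replace them with the folders created under your own `Saving_path`.

```python
import os
import shutil

def zip_folder(folder, archive_base):
    """Zip `folder` into `archive_base + '.zip'` and return the archive path."""
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"Folder not found: {folder}")
    return shutil.make_archive(archive_base, "zip", root_dir=folder)

# Hypothetical paths; adjust them to the folders created by this notebook.
source_folder = "/content/gdrive/MyDrive/Training_source_augmented"
if os.path.isdir(source_folder):
    print(zip_folder(source_folder, "/content/Training_source_augmented"))
else:
    print("Nothing to archive at", source_folder)
```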


# **Thank you for using Augmentor!**