# Example of using SciAugment to augment scientific images with YOLO annotations

It uses Albumentations (example of augmentation here: https://colab.research.google.com/drive/1JuZ23u0C0gx93kV0oJ8Mq0B6CBYhPLXy) and OpenCV. The goal is to create tools that make more sense for augmenting scientific images. How a sensor captures data matters, and the sensors and capture methods are usually not the same as those used for RGB data.

Thoughtful augmentation should improve the robustness of object detection and classification. Bad augmentation that does not respect the characteristics of the sensor and the information and statistics of the data may lead to increased errors or low usability of the final model.

Clone SciAugment repository

In [1]:
!git clone https://github.com/martinschatz-cz/SciAugment.git

Cloning into 'SciAugment'...
remote: Enumerating objects: 53, done.
remote: Counting objects: 100% (53/53), done.
remote: Compressing objects: 100% (49/49), done.
remote: Total 53 (delta 21), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (53/53), done.


Install required packages

In [None]:
!pip install -r /content/SciAugment/requirements.txt -v

Import functions

In [3]:
from SciAugment.utilities.SciAug_tools import *

Connect to Google Drive or upload folder with your images (if needed).

In [4]:
# from google.colab import drive
# drive.mount('/content/gdrive')

Or unzip the test archive (subsection320.zip) with the subsection images.

In [5]:
!unzip -q /content/subsection320.zip -d /content/

An ideal tool for annotating images is https://www.makesense.ai/

Specify the folder with images and YOLO annotations and run the default augmentation. The process will automatically create a train_data folder and randomly divide the images and labels into train and test folders with a 70/30 split. The train fraction can be specified.

The default input format is .png (can be changed), and the output format is JPEG (files are written with a .jpg extension). The function expects images with three channels.

The output images have a name tag appended to the file name for better control over the augmentation.
The name tag is an 11-character binary string in which each position stands for one augmentation (1 means applied):
 *     1:Shift
 *     2:Scale
 *     3:Rotate
 *     4:VerticalFlip
 *     5:HorizontalFlip
 *     6:RandomBrightnessContrast
 *     7:MultiplicativeNoise(multiplier=0.5, p=0.2)
 *     8:RandomSizedBBoxSafeCrop (250, 250, erosion_rate=0.0, interpolation=1, p=1.0)
 *     9:Blur(blur_limit=(50, 50), p=0)
 *     10:Transpose
 *     11:RandomRotate90
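
The tag can also be decoded programmatically. The following is a minimal sketch, assuming the position-to-augmentation mapping listed above and output file names shaped like `im_1_4_11100000000.jpg`; `decode_name_tag` and `tag_from_filename` are hypothetical helpers written for this illustration, not part of SciAugment.

```python
# Position -> augmentation, copied from the list above (positions are 1-based).
AUGMENTATION_BY_POSITION = [
    "Shift",
    "Scale",
    "Rotate",
    "VerticalFlip",
    "HorizontalFlip",
    "RandomBrightnessContrast",
    "MultiplicativeNoise",
    "RandomSizedBBoxSafeCrop",
    "Blur",
    "Transpose",
    "RandomRotate90",
]


def decode_name_tag(tag):
    """Return the augmentation names whose bit is set in an 11-character tag."""
    if len(tag) != len(AUGMENTATION_BY_POSITION) or set(tag) - {"0", "1"}:
        raise ValueError(f"expected an 11-character string of 0s and 1s, got {tag!r}")
    return [name for bit, name in zip(tag, AUGMENTATION_BY_POSITION) if bit == "1"]


def tag_from_filename(filename):
    """Extract the trailing tag from a name such as 'im_1_4_11100000000.jpg'."""
    stem = filename.rsplit(".", 1)[0]   # drop the extension
    return stem.rsplit("_", 1)[1]       # keep the part after the last underscore


print(decode_name_tag(tag_from_filename("im_1_4_11100000000.jpg")))
# ['Shift', 'Scale', 'Rotate']
```

Read this way, a tag such as `00001000000` marks an image produced by HorizontalFlip alone.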

In [6]:
# @markdown Specify the path to the folder with images and YOLO annotations
input_images_folder = "/content/subsection320/"  # @param{type: 'string'}
input_image_format = ".jpeg"  # @param{type: 'string'}
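
Before augmenting, it can help to confirm that every image has a label file. The sketch below assumes each image sits next to a same-named `.txt` file holding its YOLO annotations (as with `im_1.jpeg` and `im_1.txt` in the test folder); `find_images_without_labels` is a hypothetical helper, not a SciAugment function.

```python
from pathlib import Path


def find_images_without_labels(folder, image_format=".jpeg"):
    """Return sorted names of images in `folder` that lack a same-named .txt file."""
    folder = Path(folder)
    missing = []
    for image_path in sorted(folder.glob(f"*{image_format}")):
        # YOLO labels are assumed to live beside the image: im_1.jpeg -> im_1.txt
        if not image_path.with_suffix(".txt").exists():
            missing.append(image_path.name)
    return missing
```

Calling `find_images_without_labels(input_images_folder, input_image_format)` should return an empty list when the folder is complete.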

For a reproducible train/test split, set a specific seed for the random number generator.

In [7]:
random.seed(7)
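
To illustrate why the seed matters, here is a minimal sketch of a seeded 70/30 split. It assumes the split is driven by Python's global `random` state; how SciAugment divides the files internally is not shown in this notebook, and `split_train_test` is a hypothetical stand-in.

```python
import random


def split_train_test(names, train_fraction=0.7):
    """Shuffle a copy of `names` and cut it into train and test lists."""
    shuffled = sorted(names)   # fixed starting order, independent of listing order
    random.shuffle(shuffled)   # uses the global state set by random.seed
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


names = [f"im_{i}.jpeg" for i in range(1, 11)]

random.seed(7)
first_train, first_test = split_train_test(names)
random.seed(7)
second_train, second_test = split_train_test(names)

# The same seed reproduces the same split.
print(first_train == second_train and first_test == second_test)  # True
print(len(first_train), len(first_test))  # 7 3
```

Without the second `random.seed(7)` call, the second split would generally differ from the first.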

Create default augmentation object.

The default augmentation does not augment brightness, because the Albumentations package mainly offers RGB augmentations, which are not always usable for multi-channel scientific images.

It notifies the user about the selected augmentations. Each augmentation creates one new image and label.

In [8]:
aug1 = SciAugment()

New instance of SciAugment.
Selected augmentation type: Default


Version: 0.1.0


Selected augmentation:
HorizontalFlip(p=1)
RandomSizedBBoxSafeCrop(250, 250, erosion_rate=0.0, interpolation=1, p=1.0)
Transpose(1)
RandomRotate90(p=1)
ShiftScaleRotate(p=1)
VerticalFlip(p=1)
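
Every geometric augmentation has to transform the labels together with the image. As an illustration only, the sketch below shows what a flip does to a YOLO label line (`class x_center y_center width height`, coordinates normalized to the 0..1 range). The helper names are hypothetical; SciAugment delegates this work to Albumentations rather than using code like this.

```python
def parse_yolo_line(line):
    """Parse 'class x_center y_center width height' into a tuple."""
    class_id, x_center, y_center, width, height = line.split()
    return int(class_id), float(x_center), float(y_center), float(width), float(height)


def hflip_yolo_box(box):
    """Mirror a box left-right: only x_center changes, to 1 - x_center."""
    class_id, x_center, y_center, width, height = box
    return class_id, 1.0 - x_center, y_center, width, height


def vflip_yolo_box(box):
    """Mirror a box top-bottom: only y_center changes, to 1 - y_center."""
    class_id, x_center, y_center, width, height = box
    return class_id, x_center, 1.0 - y_center, width, height


box = parse_yolo_line("0 0.25 0.40 0.10 0.20")
print(hflip_yolo_box(box))  # (0, 0.75, 0.4, 0.1, 0.2)
```

Because the coordinates are normalized, the flips need no knowledge of the image size; width and height are unchanged.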


Apply the augmentation to the selected folder of images and YOLO labels (if a train_data folder already exists, the function will stop).

In [9]:
aug1.augment_data(images_path=input_images_folder, image_format=input_image_format)

Num of files: 63
Processing: im_1.jpeg
/content/subsection320/
im_1.jpeg
Processing: im_1.txt
/content/subsection320/im_1.txt
Writing im_1_0_00001000000.jpg
Writing im_1_1_00000001000.jpg
Writing im_1_2_00000000010.jpg
Writing im_1_3_00000000001.jpg
Writing im_1_4_11100000000.jpg
Writing im_1_5_00010000000.jpg
Processing: im_10.jpeg
/content/subsection320/
im_10.jpeg
Processing: im_10.txt
/content/subsection320/im_10.txt
Writing im_10_6_00001000000.jpg
Writing im_10_7_00000001000.jpg
Writing im_10_8_00000000010.jpg
Writing im_10_9_00000000001.jpg
Writing im_10_10_11100000000.jpg
Writing im_10_11_00010000000.jpg
Processing: im_11.jpeg
/content/subsection320/
im_11.jpeg
Processing: im_11.txt
/content/subsection320/im_11.txt
Writing im_11_12_00001000000.jpg
Writing im_11_13_00000001000.jpg
Writing im_11_14_00000000010.jpg
Writing im_11_15_00000000001.jpg
Writing im_11_16_11100000000.jpg
Writing im_11_17_00010000000.jpg
Processing: im_12.jpeg
/content/subsection320/
im_12.jpeg
Processing: 

There is another prepared version of the augmentation (it will be tuned in the future after testing).

In [10]:
aug2 = SciAugment(aug_type="fluorescece_microscopy")

New instance of SciAugment.
Selected augmentation type: fluorescece_microscopy


Version: 0.1.0


Selected augmentation:
HorizontalFlip(p=1)
RandomBrightnessContrast(p=1)
MultiplicativeNoise(multiplier=0.5, p=0.2)
RandomSizedBBoxSafeCrop(250, 250, erosion_rate=0.0, interpolation=1, p=1.0)
Blur(blur_limit=(50, 50), p=0)
Transpose(1)
RandomRotate90(p=1)
ShiftScaleRotate(p=1)


It can be applied in the same way (after renaming the already existing train_data folder).

In [11]:
# aug2.augment_data(images_path=input_images_folder, image_format=input_image_format)

Zip the prepared train_data folder with augmented images and YOLO annotations for backup.

In [12]:
import shutil

shutil.make_archive("train_data", "zip", "/content/", base_dir="train_data")

'/content/train_data.zip'
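
A quick way to check the backup is to count the archived files by extension. This is a minimal sketch using only the standard library; `summarize_archive` is a hypothetical helper, and the expectation that images and `.txt` labels appear in equal numbers is an assumption about the train_data layout.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_path):
    """Count the files in a zip archive by extension, skipping directory entries."""
    with zipfile.ZipFile(zip_path) as archive:
        return Counter(
            PurePosixPath(name).suffix
            for name in archive.namelist()
            if not name.endswith("/")
        )
```

For example, `summarize_archive("/content/train_data.zip")` returns a `Counter` keyed by extension, such as `.jpg` and `.txt`.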

Install and apply watermark

In [13]:
!pip install watermark

%load_ext watermark

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting watermark
  Downloading watermark-2.3.1-py2.py3-none-any.whl (7.2 kB)
Installing collected packages: watermark
Successfully installed watermark-2.3.1


In [14]:
%watermark -v -p albumentations,opencv-python-headless,imgaug,cv2

Python implementation: CPython
Python version       : 3.7.13
IPython version      : 5.5.0

albumentations        : 1.2.1
opencv-python-headless: not installed
imgaug                : 0.4.0
cv2                   : 4.1.2

