2.3. Augmenting datasets

This part of Allie's skills relates to data augmentation.

Data augmentation expands the training dataset to improve a machine learning model's performance and its ability to generalize. For example, you may want to shift, flip, zoom, or change the brightness of images so that models perform better in the noisy environments typical of real-world use. Data augmentation is especially useful when you have little data, as it can greatly expand the amount of training data available for machine learning.

A typical augmentation scheme is to augment 50% of the data and leave the rest unchanged. You can read more about data augmentation here.
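The 50% scheme can be illustrated with a minimal sketch. The function name `split_for_augmentation` and its parameters are hypothetical and are not part of Allie; whether Allie's own scripts apply such a split internally is not stated here.

```python
import random


def split_for_augmentation(files, fraction=0.5, seed=0):
    """Return (to_augment, untouched), with `fraction` of the files chosen at random.

    Hypothetical helper for illustration only; not part of Allie.
    """
    files = sorted(files)                  # stable order so the seed is reproducible
    rng = random.Random(seed)
    n_augment = int(len(files) * fraction)
    chosen = set(rng.sample(files, n_augment))
    to_augment = [f for f in files if f in chosen]
    untouched = [f for f in files if f not in chosen]
    return to_augment, untouched


# Example with made-up file names: half are selected, half are left as-is.
example_files = [f"sample_{i}.wav" for i in range(10)]
to_augment, untouched = split_for_augmentation(example_files)
print(len(to_augment), "files to augment;", len(untouched), "left unchanged")
```

Fixing the seed keeps the selection reproducible between runs, which makes it easier to compare models trained with and without augmentation.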

Getting started

To augment an entire folder of a certain file type (e.g. audio files of .WAV format), you can run:

cd /Users/jim/desktop/allie
cd augmentation/audio_augmentation
python3 augment.py /Users/jim/desktop/allie/train_dir/males/
python3 augment.py /Users/jim/desktop/allie/train_dir/females/

The code above will augment all the audio files in the folder path using the default audio augmenters (default_audio_augmenters) specified in the settings.json file (e.g. 'augment_tsaug'). In this case, it will augment both the males and females folders full of .WAV files.

You should now have 2x the data in each folder. Here is a sample audio file and its augmented counterpart (from the females folder), for reference:

Non-augmented file sample (female speaker)
Augmented file sample (female speaker)

Click the .GIF below to follow along with this example in video format:

Expanding to other file types

Note that you can extend this to any of the augmentation types. The table below shows how to call each augmenter. You must be in the proper folder (e.g. ./allie/augmentation/audio_augmentation for audio files, ./allie/augmentation/image_augmentation for image files, etc.) for the scripts to work properly.

Data type	Supported formats	Call to augment a folder	Current directory must be
audio files	.MP3 / .WAV	`python3 augment.py [folderpath]`	./allie/augmentation/audio_augmentation
text files	.TXT	`python3 augment.py [folderpath]`	./allie/augmentation/text_augmentation
image files	.PNG	`python3 augment.py [folderpath]`	./allie/augmentation/image_augmentation
video files	.MP4	`python3 augment.py [folderpath]`	./allie/augmentation/video_augmentation
csv files	.CSV	`python3 augment.py [folderpath]`	./allie/augmentation/csv_augmentation
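Because each call must be launched from the matching directory, the table can be wrapped in a small helper. This is a sketch only: `AUGMENTATION_DIRS`, `build_augment_call`, and `augment_folder` are hypothetical names, and the only thing assumed about Allie is the `python3 augment.py [folderpath]` call and the directory layout listed in the table.

```python
import subprocess
from pathlib import Path

# Sub-folder of ./allie/augmentation for each data type, taken from the table above.
AUGMENTATION_DIRS = {
    "audio": "audio_augmentation",
    "text": "text_augmentation",
    "image": "image_augmentation",
    "video": "video_augmentation",
    "csv": "csv_augmentation",
}


def build_augment_call(allie_root, data_type, folderpath):
    """Return (command, working_directory) for augmenting one folder."""
    if data_type not in AUGMENTATION_DIRS:
        raise ValueError(f"unsupported data type: {data_type!r}")
    cwd = Path(allie_root) / "augmentation" / AUGMENTATION_DIRS[data_type]
    command = ["python3", "augment.py", str(folderpath)]
    return command, cwd


def augment_folder(allie_root, data_type, folderpath):
    """Run augment.py from the directory it expects to be launched in."""
    command, cwd = build_augment_call(allie_root, data_type, folderpath)
    # check=True raises CalledProcessError if augment.py exits with an error.
    subprocess.run(command, cwd=cwd, check=True)
```

For instance, `augment_folder("/Users/jim/desktop/allie", "audio", "/Users/jim/desktop/allie/train_dir/males/")` would correspond to the first audio command in the Getting started example.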

Implemented

Audio

augment_tsaug - adds noise and various shifts to audio files, adding 2x more data; see tutorial here.
augment_addnoise - adds noise to an audio file.
augment_noise - removes noise from audio files randomly.
augment_pitch - shifts pitch up and down to correct for gender differences.
augment_randomsplice - randomly splices an audio file to generate more data.
augment_silence - adds silence to an audio file to augment a dataset.
augment_time - changes the duration of audio files by creating new, time-altered files.
augment_volume - changes volume randomly (helps to mitigate the effects of microphone distance on a model).
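To give a feel for what a few of these transformations do to a signal, here is a minimal pure-Python sketch operating on a list of samples in the range [-1.0, 1.0]. It is not Allie's implementation; the function names and parameters below are hypothetical and only illustrate the general ideas of adding noise, changing volume, and adding silence.

```python
import random


def add_noise(samples, noise_level=0.01, seed=None):
    """Add Gaussian noise to each sample, clipping the result to [-1.0, 1.0]."""
    rng = random.Random(seed)
    return [max(-1.0, min(1.0, s + rng.gauss(0.0, noise_level))) for s in samples]


def change_volume(samples, gain):
    """Scale each sample by `gain`, clipping the result to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def add_silence(samples, n_silent):
    """Append `n_silent` zero-valued samples to the end of the signal."""
    return list(samples) + [0.0] * n_silent


# Illustrative signal; real audio would be read from a .WAV file.
signal = [0.0, 0.5, -0.5, 1.0]
louder = change_volume(signal, 2.0)
padded = add_silence(signal, 2)
noisy = add_noise(signal, noise_level=0.01, seed=1)
```

Each function returns a new list, so the original signal is kept alongside its augmented variants, which mirrors the idea of growing the dataset rather than overwriting it.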

Text

augment_textacy - uses textacy to augment text files.

Image

augment_imgaug - uses imgaug to augment image files (random transformations).

Video

augment_vidaug - uses vidaug to augment video files (random transformations).

CSV

augment_ctgan_classification - generative adversarial examples; can be used on classification targets / problems.
augment_ctgan_regression - generative adversarial examples for regression targets / problems.

Settings

Here are some settings that can be customized for Allie's augmentation API. Settings can be modified in the settings.json file.

setting	description	default setting	all options
augment_data	whether or not to implement data augmentation policies during the model training process via default augmentation scripts.	True	True, False
default_audio_augmenters	the default augmentation strategies used during audio modeling if augment_data == True	["augment_tsaug"]	["augment_tsaug", "augment_addnoise", "augment_noise", "augment_pitch", "augment_randomsplice", "augment_silence", "augment_time", "augment_volume"]
default_csv_augmenters	the default augmentation strategies used to augment .CSV file types as part of model training if augment_data == True	["augment_ctgan_regression"]	["augment_ctgan_classification", "augment_ctgan_regression"]
default_image_augmenters	the default augmentation techniques used for images if augment_data == True as a part of model training.	["augment_imgaug"]	["augment_imgaug"]
default_text_augmenters	the default augmentation strategies used during model training for text data if augment_data == True	["augment_textacy"]	["augment_textacy", "augment_summary"]
default_video_augmenters	the default augmentation strategies used for videos during model training if augment_data == True	["augment_vidaug"]	["augment_vidaug"]
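These settings can also be changed programmatically. The sketch below assumes that settings.json is a flat JSON object with `augment_data` and `default_audio_augmenters` as top-level keys, as the table suggests; the helper name `set_audio_augmenters` is hypothetical and not part of Allie.

```python
import json


def set_audio_augmenters(settings_path, augmenters, enable=True):
    """Update the audio augmentation keys in settings.json and write the file back.

    Assumes a flat JSON object; all other keys are preserved as-is.
    """
    with open(settings_path) as f:
        settings = json.load(f)
    settings["augment_data"] = enable
    settings["default_audio_augmenters"] = list(augmenters)
    with open(settings_path, "w") as f:
        json.dump(settings, f, indent=4)
    return settings
```

For example, `set_audio_augmenters("settings.json", ["augment_noise", "augment_pitch"])` would switch the default audio augmenters to two of the options listed in the table. Consider backing up settings.json before editing it.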
