# Data augmentation

If training data is sparse, it can be augmented by synthesizing new samples from it. This is particularly easy for audio data because new samples can be obtained by distorting the original signal. The following code snippets show how this is done and let you listen to the result of each distortion applied to the audio signal loaded below.

In [None]:
import numpy as np
import librosa
import soundfile as sf
from IPython.display import HTML, Audio, display

audio_path = '/media/daniel/IP9/corpora/readylingua-de/readylingua-de-train-0100.wav'
transcript_path = '/media/daniel/IP9/corpora/readylingua-de/readylingua-de-train-0100.txt'
audio, rate = librosa.load(audio_path, sr=16000, mono=True)

## Original signal

In [None]:
display(HTML('<strong>original signal</strong>'))
display(Audio(data=audio, rate=rate))
display(HTML(filename=transcript_path))

## Audio shift

The audio signal can be shifted by zero-padding the original signal on the left. This makes the audio signal start later.

In [None]:
from util.audio_util import shift

shifted = shift(audio, rate, shift_s=1)

display(HTML('<strong>Original audio signal shifted 1s to the right:</strong>'))
display(Audio(data=shifted, rate=rate))
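
The implementation of `shift` in `util.audio_util` is not shown here. As a rough sketch of the zero-padding idea described above, a delay could be implemented with plain NumPy as follows. The name `shift_sketch` and the synthetic test tone are hypothetical and only serve as illustration.

```python
import numpy as np

def shift_sketch(signal, rate, shift_s=0.0):
    """Delay a mono signal by prepending shift_s seconds of silence.

    Hypothetical stand-in for util.audio_util.shift, not the original code.
    """
    n_pad = int(round(shift_s * rate))  # number of zero samples to prepend
    padding = np.zeros(n_pad, dtype=signal.dtype)
    return np.concatenate([padding, signal])

# Synthetic stand-in for the loaded audio: 0.5 s of a 440 Hz tone at 16 kHz
rate_demo = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(rate_demo // 2) / rate_demo).astype(np.float32)
shifted_demo = shift_sketch(tone, rate_demo, shift_s=1)
print(len(tone), len(shifted_demo))
```

Under this approach the shifted signal is longer than the original by exactly `shift_s * rate` samples, so the transcript stays valid while the speech starts later.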

## Faster/slower speakers

Faster or slower speakers can be simulated by changing the tempo (without changing the pitch):

In [None]:
from util.audio_util import change_tempo

slower = change_tempo(audio, 0.8)
faster = change_tempo(audio, 1.3)

display(HTML('<strong>Original audio with slower speed:</strong>'))
display(Audio(data=slower, rate=rate))

display(HTML('<strong>Original audio with faster speed:</strong>'))
display(Audio(data=faster, rate=rate))
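
To make the effect of the tempo factor concrete, the following sketch computes the expected duration of the stretched signal. It assumes that a factor above 1 speeds the audio up and a factor below 1 slows it down, which matches the `slower`/`faster` variable names above but is not confirmed by the source of `change_tempo`. The helper `expected_duration` is hypothetical.

```python
def expected_duration(duration_s, tempo_factor):
    """Duration after a tempo change, assuming factor > 1 means faster speech."""
    if tempo_factor <= 0:
        raise ValueError("tempo_factor must be positive")
    return duration_s / tempo_factor

# A 10 s recording stretched with the two factors used above
for factor in (0.8, 1.3):
    print(f"factor {factor}: 10.00 s -> {expected_duration(10.0, factor):.2f} s")
```

With these assumptions, a 10 s recording becomes 12.50 s at factor 0.8 and about 7.69 s at factor 1.3.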

## Higher/lower pitch

Higher and lower voices can be simulated by changing the pitch. This could be used to simulate male speakers from female ones and vice versa. However, because the speaker's gender is not part of the metadata provided by _ReadyLingua_, the pitch change was applied randomly regardless of gender.

In [None]:
from util.audio_util import change_pitch

higher = change_pitch(audio, rate, factor=+5)
lower = change_pitch(audio, rate, factor=-5)

display(HTML('<strong>Original audio with higher pitch:</strong>'))
display(Audio(data=higher, rate=rate))

display(HTML('<strong>Original audio with lower pitch:</strong>'))
display(Audio(data=lower, rate=rate))
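
The unit of the `factor` argument is not documented here. If it is interpreted as a number of semitones (an assumption, as is common for pitch-shifting functions), the corresponding frequency ratio follows from the equal-tempered scale, where 12 semitones double the frequency. The helper `semitones_to_ratio` is hypothetical.

```python
def semitones_to_ratio(n_steps):
    """Frequency ratio for a shift of n_steps semitones (12 steps = one octave)."""
    return 2.0 ** (n_steps / 12.0)

# Ratios for the two shifts used above, assuming factor means semitones
for steps in (+5, -5):
    print(f"{steps:+d} semitones -> ratio {semitones_to_ratio(steps):.3f}")
```

Under this assumption, +5 raises all frequencies by a ratio of about 1.335 and -5 lowers them to about 0.749 of their original value.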

## Louder/quieter speakers

Louder or quieter speakers can be simulated by increasing or decreasing the volume.

In [None]:
from util.audio_util import change_volume

louder = change_volume(audio, rate, db=+10)
more_quiet = change_volume(audio, rate, db=-10)

display(HTML('<strong>Original audio signal 10 dB louder:</strong>'))
display(Audio(data=louder, rate=rate))

display(HTML('<strong>Original audio signal 10 dB quieter:</strong>'))
display(Audio(data=more_quiet, rate=rate))
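
How `change_volume` works internally is not shown. A minimal sketch of a decibel-based volume change, assuming floating-point samples in the range [-1, 1], converts the dB value to an amplitude gain of `10 ** (db / 20)` and clips the result. The name `change_volume_sketch` is hypothetical.

```python
import numpy as np

def change_volume_sketch(signal, db):
    """Scale a float signal by a gain given in decibels.

    Hypothetical stand-in for util.audio_util.change_volume.
    Assumes samples lie in [-1, 1]; louder results are clipped to that range.
    """
    gain = 10.0 ** (db / 20.0)  # amplitude gain: +20 dB is a factor of 10
    return np.clip(signal * gain, -1.0, 1.0)

quiet_samples = np.full(4, 0.1)
print(change_volume_sketch(quiet_samples, +10))  # amplitude scaled by about 3.16
print(change_volume_sketch(quiet_samples, -10))  # amplitude scaled by about 0.316
```

Clipping is one possible design choice; it avoids values outside the valid range but distorts signals that are already loud.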

## Echo/Reverb

The signal can also be distorted by adding an echo.

In [None]:
from util.audio_util import add_echo

echo = add_echo(audio)

display(HTML('<strong>Original audio signal with echo:</strong>'))
display(Audio(data=echo, rate=rate))
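
The `add_echo` helper is not shown. As an illustration only, a single echo can be sketched by mixing the signal with a delayed, attenuated copy of itself. The function `add_echo_sketch` and its parameters `delay_s` and `decay` are hypothetical and not taken from `util.audio_util`.

```python
import numpy as np

def add_echo_sketch(signal, rate, delay_s=0.1, decay=0.5):
    """Mix a signal with one delayed, attenuated copy of itself.

    Hypothetical illustration; the real add_echo may work differently.
    """
    n_delay = int(round(delay_s * rate))        # echo delay in samples
    out = np.zeros(len(signal) + n_delay, dtype=np.float64)
    out[:len(signal)] += signal                 # dry (original) signal
    out[n_delay:] += decay * signal             # delayed, quieter copy
    return out

# A unit impulse makes the echo visible: it reappears after the delay
impulse = np.array([1.0, 0.0, 0.0, 0.0])
echo_demo = add_echo_sketch(impulse, rate=4, delay_s=0.5, decay=0.5)
print(echo_demo)
```

In this example the impulse at index 0 reappears at index 2 with half the amplitude, and the output is longer than the input by the delay.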

## Combining the effects

Multiple effects can be combined in a single call.

In [None]:
from util.audio_util import distort_audio

distorted = distort_audio(audio, rate, shift_s=0.3, pitch_factor=1.5, tempo_factor=1.5, volume=10, echo=100)

display(HTML('<strong>Distorted signal:</strong>'))
display(Audio(data=distorted, rate=rate))
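
The order in which `distort_audio` applies its effects is not shown. One way to combine several distortions is to apply them one after another as a chain of functions, as in the sketch below. All names here (`apply_chain`, `pad_300ms`, `amplify_10db`) are hypothetical, and only two simple NumPy-based effects are used for illustration.

```python
import numpy as np

RATE_DEMO = 16000  # sampling rate assumed for this illustration

def pad_300ms(signal):
    """Prepend 0.3 s of silence."""
    n_pad = int(round(0.3 * RATE_DEMO))
    return np.concatenate([np.zeros(n_pad, dtype=signal.dtype), signal])

def amplify_10db(signal):
    """Raise the volume by 10 dB and clip to [-1, 1]."""
    return np.clip(signal * 10.0 ** (10 / 20.0), -1.0, 1.0)

def apply_chain(signal, effects):
    """Apply each effect function to the signal in the given order."""
    for effect in effects:
        signal = effect(signal)
    return signal

# Synthetic stand-in for the loaded audio: 0.5 s of a quiet 440 Hz tone
tone = 0.1 * np.sin(2 * np.pi * 440 * np.arange(RATE_DEMO // 2) / RATE_DEMO)
chained = apply_chain(tone, [pad_300ms, amplify_10db])
print(len(tone), len(chained))
```

Keeping each effect as a separate function makes it easy to randomize which effects are applied and in which order when generating augmented training samples.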