<a href="https://colab.research.google.com/github/duchaba/Data-Augmentation-with-Python/blob/main/data_augmentation_with_python_chapter_8.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data Augmentation with Python, Chapter 8

## 🌻 Welcome to Chapter 8, "Audio Spectrogram Augmentation"


This chapter covers the standard audio spectrogram format, spectrogram variations, the Mel-spectrogram, the Chroma short-time Fourier transform (STFT), and augmentation techniques. In particular, we will cover the following topics:

- Initialize and download 

- Audio spectrogram 

- Various spectrogram formats

- Mel-spectrogram and Chroma STFT 

- Spectrogram augmentation  

- Spectrogram image augmentation 

# Load Notebook


- The original link to this Notebook is:
  - https://github.com/PacktPublishing/Data-Augmentation-with-Python/blob/main/data_augmentation_with_python_chapter_7.ipynb

# GitHub Clone

In [None]:
# git version should be 2.17.1 or higher
!git --version

In [None]:
url = 'https://github.com/PacktPublishing/Data-Augmentation-with-Python'
!git clone {url}

In [None]:
# (optional) list the notebook magic commands
# %lsmagic

## Fetch file from URL (Optional)

- Uncomment the two code cells below if you want to fetch the file from a URL instead of using Git Clone

In [None]:
# import requests
# #
# def fetch_file(url, dst):
#   downloaded_obj = requests.get(url)
#   with open(dst, "wb") as file:
#     file.write(downloaded_obj.content)
#   return

In [None]:
# url = ''
# dst = 'pluto_chapter_1.py'
# fetch_file(url,dst)

# Run Pluto

- Instantiate Pluto, a.k.a. "Pluto, wake up!"

In [None]:
# %% CARRY-OVER code install

!pip install opendatasets --upgrade
!pip install pyspellchecker 
!pip install audiomentations


In [None]:
#load and run the pluto chapter 7 Python code.
pluto_file = 'Data-Augmentation-with-Python/pluto/pluto_chapter_7.py'
%run {pluto_file}

## Verify Pluto

In [None]:
pluto.say_sys_info()

In [None]:
# # (optional) list attributes, function, and everything else
# dir(pluto)

## (Optional) Export to .py

In [None]:
pluto_chapter_8 = 'Data-Augmentation-with-Python/pluto/pluto_chapter_8.py'
!cp {pluto_file} {pluto_chapter_8}

# ✋ Set up Kaggle username and app Key

- Install the following libraries and import them into the Notebook.
- Then initialize the Kaggle username, key, and fetch methods.

- STOP: Update your Kaggle access username and key first.

In [None]:
# %%CARRY-OVER code 

# -------------------- : --------------------
# READ ME
# Chapter 2 begin:
# Install the following libraries and import them into the Notebook.
# Then initialize the Kaggle username, key, and fetch methods.
# STOP: Update your Kaggle access username or key first.
# -------------------- : --------------------

!pip install opendatasets --upgrade
import opendatasets
print("\nrequired version 0.1.22 or higher: ", opendatasets.__version__)

!pip install pyspellchecker 
import spellchecker
print("\nRequired version 0.7+", spellchecker.__version__)

# STOP: Update your Kaggle access username or key first.
pluto.remember_kaggle_access_keys("duchaba", "059d7f10e1838693868b30e9dbb7c8ce")
pluto._write_kaggle_credit()
import kaggle

@add_method(PacktDataAug)
def fetch_kaggle_comp_data(self,cname):
  #self._write_kaggle_credit()  # need to run only once.
  path = pathlib.Path(cname)
  kaggle.api.competition_download_cli(str(path))
  zipfile.ZipFile(f'{path}.zip').extractall(path)
  return

@add_method(PacktDataAug)
def fetch_kaggle_dataset(self,url,dest="kaggle"):
  #self._write_kaggle_credit()    # need to run only once.
  opendatasets.download(url,data_dir=dest)
  return
# -------------------- : --------------------


# Fetch Kaggle Data

## Musical Emotions Classification

In [None]:
%%time
url = 'https://www.kaggle.com/datasets/kingofarmy/musical-emotions-classification'
pluto.fetch_kaggle_dataset(url)

In [None]:
f = 'kaggle/musical-emotions-classification/Train.csv'
pluto.df_music_data = pluto.fetch_df(f)
pluto.df_music_data.head(3)

In [None]:
# # /content/kaggle/musical-emotions-classification/Audio_Files/Audio_Files/Train/Happy/Happy10200.wav
# # remove white space in directory and filename
# # run this until no error/output
# f = 'kaggle/sea-animals-image-dataste'
# #!find {f} -name "* *" -type d | rename 's/ /_/g'
# !find {f} -name "* *" -type f | rename 's/ /_/g'

In [None]:
# %%writefile -a {pluto_chapter_7}

pluto.version = 7.0
# augment full path
@add_method(PacktDataAug)
def _append_music_full_path(self,x):
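  # raw string avoids the invalid-escape warning for \d; captures the leading letters (the label folder)
  y = re.findall(r'([a-zA-Z ]*)\d*.*', x)[0]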
  return (f'kaggle/musical-emotions-classification/Audio_Files/Audio_Files/Train/{y}/{x}')
#
@add_method(PacktDataAug)
def fetch_music_full_path(self, df):
  df['fname'] = df.ImageID.apply(self._append_music_full_path)
  return df
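
To see what the regular expression in `_append_music_full_path` captures, the following sketch applies the same pattern to the sample file name from the commented path above (`Happy10200.wav`). The helper name `label_from_filename` is hypothetical and used only for illustration.

```python
import re

# Hypothetical helper: same pattern as _append_music_full_path, written as a raw string.
def label_from_filename(fname):
    # The leading letters/spaces form the label; the digits and extension are ignored.
    return re.findall(r'([a-zA-Z ]*)\d*.*', fname)[0]

label = label_from_filename('Happy10200.wav')
print(label)
```

The captured label is then used as the sub-folder name under `Train/` when the full path is assembled.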

In [None]:
pluto.fetch_music_full_path(pluto.df_music_data)
pluto.df_music_data.head(3)

## Human Speech

In [None]:
%%time
url = 'https://www.kaggle.com/datasets/ejlok1/cremad'
pluto.fetch_kaggle_dataset(url)

In [None]:
# /content/kaggle/cremad/AudioWAV/1001_DFA_ANG_XX.wav
# change method name to make_dir_dframe
f = 'kaggle/cremad/AudioWAV'
pluto.df_voice_data = pluto.make_dir_dataframe(f)
pluto.df_voice_data.head(3)

## Urban sound

In [None]:
%%time
url = 'https://www.kaggle.com/datasets/rupakroy/urban-sound-8k'
pluto.fetch_kaggle_dataset(url)

In [None]:
# /content/kaggle/urban-sound-8k/UrbanSound8K/UrbanSound8K/audio/fold1
# change method name to make_dir_dframe
f = 'kaggle/urban-sound-8k/UrbanSound8K/UrbanSound8K/audio'
pluto.df_sound_data = pluto.make_dir_dataframe(f)
pluto.df_sound_data.head(3)

# Audio control D Major

In [None]:
# %%writefile -a {pluto_chapter_7}

f = 'Data-Augmentation-with-Python/pluto_data/control-d-major.mp3'
data_amp, sam_rate, fname = pluto._fetch_audio_data(f)
pluto.audio_control_dmajor = [data_amp, sam_rate, fname, f]

# Double check on Waveform graph

In [None]:
# double check on waveform graph from previous chapter
pluto._draw_audio(data_amp, sam_rate, 'Original: ' + fname)
# display audio 
display(IPython.display.Audio(data_amp, rate=sam_rate))

In [None]:
# (Optional) view raw data
# pluto.audio_control_dmajor

# Spectrogram
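
A spectrogram is a sequence of short-time Fourier transforms: the signal is cut into overlapping frames, each frame is windowed, and the power of its FFT becomes one column of the image. The NumPy-only sketch below illustrates that idea on a synthetic 440 Hz tone. It is not the code that `specgram` runs internally, and the function name `simple_spectrogram`, the frame size, and the hop are assumptions chosen for illustration.

```python
import numpy as np

def simple_spectrogram(signal, sample_rate, n_fft=256, hop=128):
    """Return (power, freqs, times) from a Hann-windowed short-time FFT."""
    window = np.hanning(n_fft)
    starts = range(0, len(signal) - n_fft + 1, hop)
    frames = np.stack([signal[s:s + n_fft] * window for s in starts])
    # rfft keeps only the non-negative frequencies (a one-sided spectrum).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    times = (np.array(list(starts)) + n_fft / 2) / sample_rate
    return power.T, freqs, times  # rows = frequency bins, columns = frames

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate          # one second of audio
tone = np.sin(2 * np.pi * 440 * t)                # synthetic A4 tone
power, freqs, times = simple_spectrogram(tone, sample_rate)
peak_hz = freqs[power.mean(axis=1).argmax()]
print(power.shape, peak_hz)
```

The strongest row sits at the frequency bin closest to 440 Hz, which is the horizontal band a spectrogram plot would show for a steady tone.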

In [None]:
# %%writefile -a {pluto_chapter_8}

@add_method(PacktDataAug)
def _draw_spectrogram(self, data_amp, sam_rate, 
  fname='Spectrogram',
  window=matplotlib.mlab.window_hanning,
  cmap='viridis',
  sides='default',
  mode='default'):
  canvas, pic = matplotlib.pyplot.subplots(1, 1, figsize=(11, 5))
  spec, freq, ts, ax = pic.specgram(data_amp, 
    Fs=sam_rate,
    window=window,
    cmap=cmap,
    sides=sides,
    mode=mode)
  pic.set_xlabel('Time, Sampling Rate: '+str(sam_rate),fontsize=16.)
  pic.set_ylabel('Frequency (Hz)',fontsize=16.)
  pic.set_title(fname,fontsize=18.0)
  #
  # display and save image
  canvas.tight_layout()
  self._drop_image(canvas)
  canvas.show()
  return spec, freq, ts, ax

In [None]:
# %%writefile -a {pluto_chapter_8}

@add_method(PacktDataAug)
def draw_spectrogram(self, df,
  fname='Spectrogram',
  window=matplotlib.mlab.window_hanning,
  cmap='viridis',
  sides='default',
  mode='default'):
  if (type(df) is list):
    data_amp, sam_rate, fname, lname = df
  else:
    lname = self._fetch_1_sample(df)
    data_amp, sam_rate, fname = self._fetch_audio_data(lname)
  #
  spec, freq, ts, ax = self._draw_spectrogram(data_amp, sam_rate,
    fname='Spectrogram: '+fname,
    window=window,
    cmap=cmap,
    sides=sides,
    mode=mode)
  # display audio 
  display(IPython.display.Audio(data_amp,rate=sam_rate))
  return
#
@add_method(PacktDataAug)
def _draw_melspectrogram(self, mel_db, sam_rate, data_amp,
  fname='MelSpectrogram',
  cmap='viridis',
  y_axis='mel',
  y_label='Mel scale (Hz)'):
  canvas, pic = matplotlib.pyplot.subplots(1, 1, figsize=(11, 5))
  #
  img = librosa.display.specshow(mel_db, 
    sr=sam_rate,
    x_axis='time',
    y_axis=y_axis,
    fmax=8000, 
    ax=pic,
    cmap=cmap)
  canvas.colorbar(img, ax=pic, format='%+2.0f dB')
  #
  pic.set_xlabel('Time, Sampling Rate: '+str(sam_rate),fontsize=16.)
  pic.set_ylabel(y_label,fontsize=16.)
  pic.set_title(fname,fontsize=18.0)
  #
  # display and save image
  canvas.tight_layout()
  self._drop_image(canvas)
  canvas.show()
  # display audio 
  display(IPython.display.Audio(data_amp,rate=sam_rate))
  return
#
@add_method(PacktDataAug)
def draw_melspectrogram(self, df,
  fname='MelSpectrogram',
  cmap='viridis',
  is_chroma=False):
  if (type(df) is list):
    data_amp, sam_rate, fname, lname = df
  else:
    lname = self._fetch_1_sample(df)
    data_amp, sam_rate, fname = self._fetch_audio_data(lname)
  #
  if (is_chroma):
    # pass the signal by keyword; recent librosa releases require keyword arguments
    mel_db = librosa.feature.chroma_stft(y=data_amp, 
      sr=sam_rate)
    yax = 'chroma'
    ylab = 'Pitch class'
    tname = 'Chroma STFT: ' + fname
  else:
    mel = librosa.feature.melspectrogram(y=data_amp, 
    sr=sam_rate, 
    n_mels=128,
    fmax=8000)
    mel_db = librosa.power_to_db(mel, ref=numpy.max)
    yax = 'mel'
    ylab = 'Mel scale (Hz)'
    tname = 'Melspectrogram: ' + fname
  #
  self._draw_melspectrogram(mel_db, sam_rate, data_amp,
    cmap=cmap,
    fname=tname,
    y_axis=yax,
    y_label = ylab)
  return
#
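
The Mel scale spaces frequencies the way listeners perceive pitch: roughly linear below 1 kHz and logarithmic above. A common closed form is the HTK formula, `m = 2595 * log10(1 + f / 700)`. The sketch below implements that formula in NumPy. Note that librosa's default Mel scale is the Slaney variant, so its numbers differ slightly from these, and the function names here are hypothetical.

```python
import numpy as np

def hz_to_mel_htk(freq_hz):
    """HTK-style Hz to Mel conversion (an assumption; librosa defaults to Slaney)."""
    return 2595.0 * np.log10(1.0 + np.asarray(freq_hz, dtype=float) / 700.0)

def mel_to_hz_htk(mel):
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

# 128 points equally spaced on the Mel scale up to 8000 Hz,
# echoing n_mels=128 and fmax=8000 used in draw_melspectrogram.
mel_points = np.linspace(hz_to_mel_htk(0.0), hz_to_mel_htk(8000.0), 128)
hz_points = mel_to_hz_htk(mel_points)
print(hz_points[:3], hz_points[-3:])
```

Equal steps in Mel translate into small steps in Hz at low frequencies and large steps at high frequencies, which is why a Mel-spectrogram gives more vertical room to the low end.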

In [None]:
pluto.draw_spectrogram(pluto.audio_control_dmajor)

In [None]:
spec, freq, ts, ax = pluto._draw_spectrogram(pluto.audio_control_dmajor[0], pluto.audio_control_dmajor[1])

## Spectrogram parameters
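
The `window`, `sides`, and `mode` arguments used in the cells below change how each frame is transformed before it is drawn. As one illustration, the sketch below compares a Hann window with no window on a single frame of a synthetic tone. The tone frequency, the frame length, and the 2 kHz cut-off used to measure leakage are assumptions chosen to make the effect visible.

```python
import numpy as np

sample_rate = 8000
n_fft = 1024
t = np.arange(n_fft) / sample_rate
# A tone that falls between two FFT bins, the worst case for spectral leakage.
tone = np.sin(2 * np.pi * 1003.9 * t)

def leakage_db(frame):
    """Power above 2 kHz relative to total power, in dB (lower means less leakage)."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return 10 * np.log10(power[freqs > 2000].sum() / power.sum())

no_window_db = leakage_db(tone)                   # comparable to matplotlib.mlab.window_none
hann_db = leakage_db(tone * np.hanning(n_fft))    # comparable to matplotlib.mlab.window_hanning
print(round(no_window_db, 1), round(hann_db, 1))
```

The Hann-windowed frame leaks far less energy into distant frequencies, so spectrograms drawn with `window_none` tend to look more smeared along the frequency axis.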

In [None]:
pluto.draw_spectrogram(pluto.df_music_data, cmap='plasma')

In [None]:
pluto.draw_spectrogram(pluto.df_voice_data, cmap='cool')

In [None]:
pluto.draw_spectrogram(pluto.df_sound_data, cmap='brg')

In [None]:
pluto.draw_spectrogram(pluto.audio_control_dmajor,
  window=matplotlib.mlab.window_none)

In [None]:
pluto.draw_spectrogram(pluto.df_voice_data,
  cmap='cool',
  window=matplotlib.mlab.window_none)

In [None]:
pluto.draw_spectrogram(pluto.audio_control_dmajor,
  window=matplotlib.mlab.window_none,
  sides='twosided')

In [None]:
pluto.draw_spectrogram(pluto.df_music_data, 
  cmap='plasma',
  sides='twosided')

In [None]:
pluto.draw_spectrogram(pluto.audio_control_dmajor,
  window=matplotlib.mlab.window_none,
  sides='twosided',
  mode='angle')

In [None]:
pluto.draw_spectrogram(pluto.df_sound_data,
  cmap='brg',
  window=matplotlib.mlab.window_none,
  sides='twosided',
  mode='angle')

In [None]:
# chihuahua barking

In [None]:
pluto.draw_melspectrogram(pluto.audio_control_dmajor)
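
`draw_melspectrogram` converts Mel power to decibels with `librosa.power_to_db(mel, ref=numpy.max)` before plotting. The sketch below mirrors that conversion in NumPy, assuming librosa's documented defaults of `amin=1e-10` and `top_db=80.0`; the function name `power_to_db_sketch` is hypothetical. With the maximum as the reference, the loudest cell maps to 0 dB and every other value is at or below zero.

```python
import numpy as np

def power_to_db_sketch(power, amin=1e-10, top_db=80.0):
    """Convert power to dB relative to the maximum, with a floor at -top_db."""
    power = np.asarray(power, dtype=float)
    ref = power.max()                                   # same role as ref=numpy.max
    db = 10.0 * np.log10(np.maximum(amin, power))
    db -= 10.0 * np.log10(np.maximum(amin, ref))
    return np.maximum(db, db.max() - top_db)            # clip everything below the floor

power = np.array([1.0, 0.1, 0.001, 1e-12])
db = power_to_db_sketch(power)
print(db)
```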

In [None]:
pluto.draw_melspectrogram(pluto.df_voice_data, cmap='cool')

In [None]:
# woman, that is exactly what happens
pluto.fname_id

In [None]:
pluto.draw_melspectrogram(pluto.audio_control_dmajor, is_chroma=True)
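
A chroma representation folds every frequency into one of the 12 pitch classes, ignoring the octave. The sketch below shows that folding for single frequencies using the standard MIDI note formula with A4 = 440 Hz. `librosa.feature.chroma_stft` applies a filter bank to the whole spectrum rather than rounding individual frequencies, so this is only an illustration, and `pitch_class` is a hypothetical helper. The three input frequencies are the equal-tempered notes of a D major triad.

```python
import math

PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to its nearest equal-tempered pitch class."""
    midi_note = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return PITCH_CLASSES[midi_note % 12]

d_major = [pitch_class(f) for f in (293.66, 369.99, 440.0)]
octave_up = pitch_class(587.33)   # D one octave higher folds to the same class
print(d_major, octave_up)
```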

In [None]:
pluto.draw_melspectrogram(pluto.df_music_data, is_chroma=True, cmap='plasma')

In [None]:
# cinematic strong violin

In [None]:
pluto.draw_melspectrogram(pluto.df_sound_data, is_chroma=True, cmap='brg')

# Time shift
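
A time shift slides the samples along the time axis by a fraction of the clip length. The NumPy sketch below illustrates the two behaviours that the `rollover` flag selects between: wrapping the displaced samples around to the other end, or filling the gap with silence. It illustrates the idea and is not the audiomentations implementation; `shift_audio` is a hypothetical name.

```python
import numpy as np

def shift_audio(samples, fraction, rollover=True):
    """Shift by a fraction of the length; positive moves audio later in time."""
    n = int(round(fraction * len(samples)))
    shifted = np.roll(samples, n)      # np.roll returns a new array
    if not rollover and n != 0:
        if n > 0:
            shifted[:n] = 0.0          # silence at the start
        else:
            shifted[n:] = 0.0          # silence at the end
    return shifted

samples = np.arange(1.0, 11.0)         # stand-in for 10 audio samples
print(shift_audio(samples, 0.2))
print(shift_audio(samples, 0.2, rollover=False))
print(shift_audio(samples, -0.2, rollover=False))
```

On a spectrogram, a shift with rollover moves the whole picture sideways and wraps it, while a shift without rollover leaves a blank strip where the silence was inserted.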

In [None]:
# %CARRY-OVER

!pip install audiomentations

In [None]:
# %%writefile -a {pluto_chapter_7}

import audiomentations
#
@add_method(PacktDataAug)
def _fetch_1_sample(self, df, dsize=1):
  p = df.sample(dsize)
  p.reset_index(drop=True, inplace=True)
  return p.fname[0]
#
@add_method(PacktDataAug)
def _audio_transform(self, df, xtransform, title='',is_waveform=True):
  if (type(df) is list):
    # unpack the list that was passed in, as draw_spectrogram does
    data_amp, sam_rate, fname, lname = df
  else:
    lname = self._fetch_1_sample(df)
    data_amp, sam_rate, fname = self._fetch_audio_data(lname)
  #
  xaug = xtransform(data_amp, sample_rate=sam_rate)
  if (is_waveform):
    # augmented
    self._draw_audio(xaug, sam_rate, title + ' Augmented: ' + fname)
    display(IPython.display.Audio(xaug, rate=sam_rate))
    # original
    self._draw_audio(data_amp, sam_rate, 'Original: ' + fname)
  else:
    xdata = [xaug, sam_rate, fname, 'cat']
    self.draw_spectrogram(xdata)
    self.draw_melspectrogram(xdata)
    self.draw_melspectrogram(xdata,is_chroma=True)
  display(IPython.display.Audio(data_amp, rate=sam_rate))
  return 

In [None]:
# %%writefile -a {pluto_chapter_7}

@add_method(PacktDataAug)
def play_aug_time_shift(self, df, min_fraction=-0.2,
  max_fraction=0.8,rollover=True,title='Time Shift', is_waveform=True):
  xtransform = audiomentations.Shift(
    min_fraction = min_fraction,
    max_fraction = max_fraction,
    rollover = rollover,
    p=1.0
  )
  self._audio_transform(df, xtransform, title=title,is_waveform=is_waveform)
  return 

In [None]:
pluto.play_aug_time_shift(pluto.audio_control_dmajor, min_fraction=0.8)

In [None]:
pluto.play_aug_time_shift(pluto.audio_control_dmajor, min_fraction=0.8, is_waveform=False)

In [None]:
pluto.play_aug_time_shift(pluto.df_voice_data, min_fraction=0.6, is_waveform=False)

# Time stretch
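
Time stretching changes a clip's speed: a rate above 1 shortens it and a rate below 1 lengthens it. The sketch below illustrates only the length bookkeeping, including the pad-or-trim step implied by `leave_length_unchanged=True`, using naive linear-interpolation resampling. Unlike a true time stretch, naive resampling also changes the pitch, so this is a simplified stand-in and not what audiomentations does; `naive_speed_change` is a hypothetical name.

```python
import numpy as np

def naive_speed_change(samples, rate, leave_length_unchanged=True):
    """Resample by linear interpolation; rate > 1 speeds up (and raises pitch)."""
    new_len = max(1, int(round(len(samples) / rate)))
    old_pos = np.arange(len(samples))
    new_pos = np.linspace(0, len(samples) - 1, new_len)
    stretched = np.interp(new_pos, old_pos, samples)
    if leave_length_unchanged:
        if len(stretched) < len(samples):   # sped up: pad the tail with silence
            stretched = np.pad(stretched, (0, len(samples) - len(stretched)))
        else:                               # slowed down: trim the tail
            stretched = stretched[:len(samples)]
    return stretched

sample_rate = 8000
tone = np.sin(2 * np.pi * 440 * np.arange(sample_rate) / sample_rate)
fast = naive_speed_change(tone, rate=2.0, leave_length_unchanged=False)
slow = naive_speed_change(tone, rate=0.5, leave_length_unchanged=False)
same = naive_speed_change(tone, rate=2.0)
print(len(fast), len(slow), len(same))
```

With the length left unchanged, a sped-up clip ends in silence and a slowed-down clip loses its tail, which is visible as a blank or truncated region at the right edge of the spectrogram.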

In [None]:
# %%writefile -a {pluto_chapter_7}

@add_method(PacktDataAug)
def play_aug_time_stretch(self, df, min_rate=0.2,max_rate=6.8,
  leave_length_unchanged=True,title='Time Stretch',
  is_waveform=True):
  xtransform = audiomentations.TimeStretch(
    min_rate = min_rate,
    max_rate = max_rate,
    leave_length_unchanged = leave_length_unchanged,
    p=1.0
  )
  self._audio_transform(df, xtransform, title=title, is_waveform=is_waveform)
  # librosa.effects.time_stretch under the hood
  return

In [None]:
pluto.play_aug_time_stretch(pluto.audio_control_dmajor, max_rate=5.4, is_waveform=False)

In [None]:
# pluto.play_aug_time_stretch(pluto.df_music_data, max_rate=3.0)

In [None]:
pluto.play_aug_time_stretch(pluto.df_voice_data, max_rate=3.5, is_waveform=False)

# Noise injection, add Gaussian noise
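
Gaussian noise injection adds zero-mean random values, scaled by an amplitude drawn between `min_amplitude` and `max_amplitude`, to every sample. The NumPy sketch below illustrates that idea and reports the resulting signal-to-noise ratio. The function names, the fixed seed, and the uniform draw of the amplitude are assumptions for illustration rather than a copy of the audiomentations code.

```python
import numpy as np

def add_gaussian_noise(samples, min_amplitude=0.002, max_amplitude=0.2, rng=None):
    """Return (noisy_samples, amplitude) with amplitude drawn uniformly from the range."""
    rng = np.random.default_rng() if rng is None else rng
    amplitude = rng.uniform(min_amplitude, max_amplitude)
    noise = rng.standard_normal(len(samples))
    return samples + amplitude * noise, amplitude

def snr_db(clean, noisy):
    """Signal-to-noise ratio in dB of noisy relative to clean."""
    noise = noisy - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

sample_rate = 8000
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sample_rate) / sample_rate)
noisy, amplitude = add_gaussian_noise(tone, 0.02, 0.05, rng=np.random.default_rng(0))
print(round(amplitude, 4), round(snr_db(tone, noisy), 1))
```

Because white noise has energy at every frequency, it raises the whole background of the spectrogram rather than adding a distinct band.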

In [None]:
# %%writefile -a {pluto_chapter_7}

@add_method(PacktDataAug)
def play_aug_noise_injection(self, df, min_amplitude=0.002,
  max_amplitude=0.2,title='Gaussian noise injection',is_waveform=True):
  xtransform = audiomentations.AddGaussianNoise(
    min_amplitude = min_amplitude,
    max_amplitude = max_amplitude,
    p=1.0)
  self._audio_transform(df, xtransform, title=title, is_waveform=is_waveform)
  return

In [None]:
pluto.play_aug_noise_injection(pluto.audio_control_dmajor, 
  min_amplitude=0.02,max_amplitude=0.05,is_waveform=False)

In [None]:
pluto.play_aug_noise_injection(pluto.df_music_data, 
  min_amplitude=0.008,max_amplitude=0.05,is_waveform=False)

In [None]:
pluto.play_aug_noise_injection(pluto.df_voice_data, max_amplitude=0.05, is_waveform=False)

# Push up all changes (Optional)

- username: duchaba

- password: [use the token]

In [None]:
# import os
# f = 'Data-Augmentation-with-Python'
# os.chdir(f)
# !git add -A
# !git config --global user.email "duc.haba@gmail.com"
# !git config --global user.name "duchaba"
# !git commit -m "end of session"
# # do the git push in the xterm console
# #!git push

In [None]:
# %%script false --no-raise-error  #temporary stop execute for export file

In [None]:
# # compress/zip all the pluto generated images from this chapter for download
# f = 'pluto-img-'+str(pluto.fname_id)+'.zip'
# print(f)
# !zip {f} /content/Data-Augmentation-with-Python/pluto_img/*

# Summary 

Every chapter begins with the same base class, "PacktDataAug".

✋ FAIR WARNING:

- The code uses long, complete function path names.

- Pluto wrote the code to be easy to understand, not for compactness, fast execution, or cleverness.

- Use Xterm to debug the cloud server.



In [None]:
# !pip install colab-xterm
# %load_ext colabxterm
# %xterm