# Instrumentify data pre processing: logmel

This notebook process the data from MedleyDB and OpenMIC to create a pipeline and transform the audio streams into audio-visual representations such as Log-Mel. 

Workflow:

wget zip --> process dataset --> store in SageMaker for further processing




In [None]:
# Common commands
!zip -r '/content/folder/"NewFileName.zip"' '/content/folder/"OrginalFileName"'
!unzip <filepath>
!wget <url>
%cp <zip_to_copy> /content/drive/MyDrive/ESP3201/Datasets/<dataset name>
plt.imsave() #for saving only the image without axes/borders and all
!du -h -s <filepath> #check size


## How processing works
transform (short time fft/ fourier/ constant q) --> spectogram (mel/gamma) --> (maybe) log scale

Extracting mel spectogram 
https://www.youtube.com/watch?v=TdnVE5m3o_0

1. Should we convert to mel-spectogram while importing for training? not sure

  maybe doing it as a form of pre-processing is more efficient

2. What other pre-processing steps need to be taken?

  Maybe data augmentation/weighting to adjust for the class imbalance?

# MedleyDB

In [None]:
!curl --cookie-jar zenodo-cookies.txt "https://zenodo.org/record/1715175?token=eyJhbGciOiJIUzUxMiIsImV4cCI6MTY2ODAzMTE5OSwiaWF0IjoxNjY1MzkwNjU1fQ.eyJkYXRhIjp7InJlY2lkIjoxNzE1MTc1fSwiaWQiOjI2NzA5LCJybmQiOiJiNzBmYWE3NyJ9.Xnz6zNOqAvGngR2YdvtkFKDP-12QyUbhkdgFUCHqHiYOuhJF_e1gqhAse658ZpLozLZlZxSY7-65y1NQvig2gA"
!curl --cookie zenodo-cookies.txt "https://zenodo.org/record/1715175/files/MedleyDB_V2.tar.gz?download=1" --output medleydb.tar.gz
# WORKS YAY

# OpenMIC

In [59]:
!pwd

/home/studio-lab-user/sagemaker-studiolab-notebooks/ESP3201-Instrument-indentification


In [15]:
# Download original zip
!wget https://zenodo.org/record/1432913/files/openmic-2018-v1.0.0.tgz

--2022-10-13 17:06:54--  https://zenodo.org/record/1432913/files/openmic-2018-v1.0.0.tgz
Resolving zenodo.org (zenodo.org)... 188.184.117.155
Connecting to zenodo.org (zenodo.org)|188.184.117.155|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2623376754 (2.4G) [application/octet-stream]
Saving to: 'openmic-2018-v1.0.0.tgz'


2022-10-13 17:08:48 (22.3 MB/s) - 'openmic-2018-v1.0.0.tgz' saved [2623376754/2623376754]



In [117]:
# no need to copy zip to drive anymore since download is fast
# %cp "/content/openmic-2018-v1.0.0.tgz" "/content/drive/MyDrive/ESP3201/Datasets/openmic-2018-v1.0.0.tgz"

In [16]:
# Extract and delete the zip
!tar --extract --file openmic-2018-v1.0.0.tgz
!rm openmic-2018-v1.0.0.tgz

In [1]:
# HELPFUL for getting list of audio files for running mel/gammatone on all, also to verify
import os
filelist=[]
for root, dirs, files in os.walk("openmic-2018/audio"):
	for file in files:
        #append the file name to the list
		filelist.append(os.path.join(root,file))
print(len(filelist))

20000


In [14]:
# to remove the folder if needed
!rm -r openmic-2018

# Convert to log mel


In [3]:
# CREATE THE FOLDER STRUCTURE inside the audio folder (the 000,001 folders), renamed original audio to audio-ogg for clarity
import shutil
import os
 
# defining the function to ignore the files if present in any folder
def ignore_files(dir, files):
    return [f for f in files if os.path.isfile(os.path.join(dir, f))]

In [2]:
os.rename("openmic-2018/audio","openmic-2018/audio-ogg")

In [19]:
# HELPFUL for getting list of audio files for running mel/gammatone on all
# use this here to rename audio to audio-ogg and get new file list
import os
oggfilelist=[]
for root, dirs, files in os.walk("openmic-2018/audio-ogg"):
	for file in files:
        #append the file name to the list
		oggfilelist.append(os.path.join(root,file))
print (len(oggfilelist))

20000


In [5]:
# check size for no specific reason
!du -h -s openmic-2018
# !rm -r openmic-2018/audio-wav/

2.9G	openmic-2018


In [22]:
# check size for no specific reason
!du -h -s openmic-2018

5.2G	openmic-2018


## convert ogg to logmel


In [1]:
import librosa
import librosa.display
import IPython.display as ipd
import matplotlib
import numpy as np

In [2]:
# parameters of the librosa.feature.melspectrogram is added for flexibility
def render_logmel_from_file(path, fft_window, hop_size, mel_bands, save_path):
    # load the file using Librosa and obtain the scale and sampling rate.
    scale, sampling_rate = librosa.load(path)
    # generate the mel_spectrogram and convert to dB (logmel)
    mel_spectrogram = librosa.feature.melspectrogram(scale, sr=sampling_rate, n_fft=fft_window, hop_length=hop_size, n_mels=mel_bands)
    log_mel_spectrogram = librosa.power_to_db(mel_spectrogram)
    
    # convert spectrogram numpy to image
    img = librosa.display.specshow(log_mel_spectrogram, x_axis="time", y_axis="mel", sr=sampling_rate)
    
    # save the figure as a png
    matplotlib.pyplot.axis("off")
    matplotlib.pyplot.savefig(save_path, bbox_inches='tight',pad_inches = 0)
    # print(save_path)
    matplotlib.pyplot.close()
    
matplotlib.rcParams.update({'figure.max_open_warning': 0})

In [28]:
# test on one file if needed
# ogg_file = "/home/studio-lab-user/sagemaker-studiolab-notebooks/openmic-2018/audio-ogg/000/000046_3840.ogg"
# logmelfile = os.path.splitext(ogg_file)[0]+'.png'
# logmelfile = logmelfile.replace(logmelfile[logmelfile.index("audio-ogg"):],"audio-logmel" + logmelfile[logmelfile.index("audio-ogg")+9:])
# render_logmel_from_file(ogg_file, 2048, 512, 128, logmelfile)

## bulk convert


In [15]:
# remove the file used in testing
!rm -r openmic-2018/audio-logmel
# create data folder structure for logmel
shutil.copytree('openmic-2018/audio-ogg/',
                'openmic-2018/audio-logmel/',
                ignore=ignore_files)

'openmic-2018/audio-logmel/'

In [11]:
ogg_file = "openmic-2018/audio-ogg/000/000046_3840.ogg"

In [32]:
def bulk_convert(ogg_file):
    logmelfile = os.path.splitext(ogg_file)[0]+'.png'
    logmelfile = logmelfile.replace(logmelfile[logmelfile.index("audio-ogg"):],"audio-logmel" + logmelfile[logmelfile.index("audio-ogg")+9:])
    render_logmel_from_file(ogg_file, 2048, 512, 128, logmelfile)
    # print(logmelfile)
    
#bulk_convert(ogg_file)

In [None]:
#!pip install tqdm
from tqdm import tqdm
from multiprocessing import Pool

# convert with multiprocessing, 2 coz with 3 for some reason the third one stops working v early
p = Pool(processes=2)
a = list(tqdm(p.imap(bulk_convert, oggfilelist), total = 20000))
# p.map(bulk_convert, oggfilelist)
# the tqdm not rly working sadly :(
# IF IT GETS STUCK, RESTART KERNEL 
# for ogg in tqdm(oggfilelist):
#     bulk_convert(ogg)

In [26]:
import os
logmelfilelist=[]
for root, dirs, files in os.walk("openmic-2018/audio-logmel"):
	for file in files:
        
        #append the file name to the list
		logmelfilelist.append(os.path.join(root,file))
print(len(logmelfilelist))

19990


### Troubleshooting

In [12]:
%cd ..

/home/studio-lab-user/sagemaker-studiolab-notebooks


In [39]:
import os
oggfilelist=[]
for root, dirs, files in os.walk("openmic-2018/audio-ogg"):
    for file in files:
        #append the file name to the list
        oggfilelist.append(os.path.splitext(file)[0])
        #oggfilelist.append(file)
print(len(oggfilelist))
print(oggfilelist[1])

20000
000135_483840


In [35]:
import os
oggfilelist1=[]
for root, dirs, files in os.walk("openmic-2018/audio-ogg"):
    for dir in dirs:
        #append the file name to the list
        # oggfilelist.append(os.path.splitext(file)[0])
        oggfilelist1.append(dir)
print(len(oggfilelist1))
print(oggfilelist1[1])

156
001


In [8]:
import os
logmelfilelist=[]
for root, dirs, files in os.walk("openmic-2018/audio-logmel"):
	for file in files:
        
        #append the file name to the list
		logmelfilelist.append(os.path.splitext(file)[0])
print(len(logmelfilelist))
print(logmelfilelist[1])

0


IndexError: list index out of range

In [17]:
diff = list(set(logmelfilelist) ^ set(oggfilelist[:19988])) # Symmetric diff
print(len(diff))
print(diff)

26
['087246_145920', '155311_453120', '007116_107520-checkpoint', '029496_26880', '019964_88320', '155204_7680', '155294_184320', '052862_364800', '138182_311040', '155293_26880', '042332_11520', '155245_629760', '155278_211200', '149448_88320', '155225_126720', '007122_218880-checkpoint', '155307_211200', '115592_180480', '155197_34560', '155295_76800', '155310_372480', '104759_314880', '126419_207360', '155233_364800', '073544_533760', '062742_15360']


In [18]:
diff = list(set(logmelfilelist) ^ set(oggfilelist)) # check diff
print(len(diff))
print(diff)

14
['087246_145920', '007116_107520-checkpoint', '029496_26880', '019964_88320', '052862_364800', '138182_311040', '042332_11520', '149448_88320', '007122_218880-checkpoint', '115592_180480', '104759_314880', '126419_207360', '073544_533760', '062742_15360']


In [47]:
diff = []
for file in oggfilelist:
    if file not in logmelfilelist:
        diff.append(file)
print(len(diff))
print(diff)

0
[]


In [33]:
path = 'openmic-2018/audio-ogg/019/019964_88320.ogg'
bulk_convert(path)

 -0.24888012] as keyword args. From version 0.10 passing these as positional arguments will result in an error
  mel_spectrogram = librosa.feature.melspectrogram(scale, sr=sampling_rate, n_fft=fft_window, hop_length=hop_size, n_mels=mel_bands)


In [44]:
file_paths = []
for file in diff:
    path = "openmic-2018/audio-ogg/"
    path += file[:3] + "/" + file + ".ogg"
    file_paths.append(path)

print(len(file_paths))
print(file_paths)

11
['openmic-2018/audio-ogg/029/029496_26880.ogg', 'openmic-2018/audio-ogg/042/042332_11520.ogg', 'openmic-2018/audio-ogg/052/052862_364800.ogg', 'openmic-2018/audio-ogg/062/062742_15360.ogg', 'openmic-2018/audio-ogg/073/073544_533760.ogg', 'openmic-2018/audio-ogg/087/087246_145920.ogg', 'openmic-2018/audio-ogg/104/104759_314880.ogg', 'openmic-2018/audio-ogg/115/115592_180480.ogg', 'openmic-2018/audio-ogg/126/126419_207360.ogg', 'openmic-2018/audio-ogg/138/138182_311040.ogg', 'openmic-2018/audio-ogg/149/149448_88320.ogg']


In [45]:
for path in file_paths:
    bulk_convert(path)
print("done")

  2.7783865e-03  8.6236047e-03] as keyword args. From version 0.10 passing these as positional arguments will result in an error
  mel_spectrogram = librosa.feature.melspectrogram(scale, sr=sampling_rate, n_fft=fft_window, hop_length=hop_size, n_mels=mel_bands)
 -6.6021331e-02 -6.8275534e-02] as keyword args. From version 0.10 passing these as positional arguments will result in an error
  mel_spectrogram = librosa.feature.melspectrogram(scale, sr=sampling_rate, n_fft=fft_window, hop_length=hop_size, n_mels=mel_bands)
  0.03891359] as keyword args. From version 0.10 passing these as positional arguments will result in an error
  mel_spectrogram = librosa.feature.melspectrogram(scale, sr=sampling_rate, n_fft=fft_window, hop_length=hop_size, n_mels=mel_bands)
 8.4981807e-03] as keyword args. From version 0.10 passing these as positional arguments will result in an error
  mel_spectrogram = librosa.feature.melspectrogram(scale, sr=sampling_rate, n_fft=fft_window, hop_length=hop_size, n_me

NameError: name 'done' is not defined

## Zip LogMel File

In [1]:
ls

LICENSE
data_preprocessing.ipynb
data_preprocessing_gammatone.ipynb
data_preprocessing_gammatone_studio.ipynb
data_preprocessing_logmel.ipynb
data_preprocessing_logmel_studio.ipynb
[0m[01;34mgammatone[0m/


In [2]:
cd ..

/home/studio-lab-user/sagemaker-studiolab-notebooks


In [3]:
ls

 [0m[01;34mESP3201-Instrument-indentification[0m/   [01;34mimages[0m/            log_mel_bulk.ipynb
'Getting Started.ipynb'                [01;34mlog_mel_OpenMIC[0m/   [01;34mopenmic-2018[0m/


In [4]:
cd openmic-2018/

/home/studio-lab-user/sagemaker-studiolab-notebooks/openmic-2018


In [21]:
conda install zip

Collecting package metadata (current_repodata.json): done
Solving environment: done


  current version: 4.10.3
  latest version: 22.9.0

Please update conda by running

    $ conda update -n base conda



## Package Plan ##

  environment location: /home/studio-lab-user/.conda/envs/default

  added / updated specs:
    - zip


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    openssl-1.1.1q             |       h166bdaf_1         2.1 MB  conda-forge
    zip-3.0                    |       h7f98852_1         110 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.2 MB

The following NEW packages will be INSTALLED:

  zip                conda-forge/linux-64::zip-3.0-h7f98852_1

The following packages will be UPDATED:

  openssl                                 1.1.1q-h166bdaf_0 --> 1.1.1q-h166bdaf_1



Dow

In [None]:
!zip -r audio-logmel.zip audio-logmel

  adding: audio-logmel/ (stored 0%)
  adding: audio-logmel/000/ (stored 0%)
  adding: audio-logmel/000/000046_3840.png (deflated 2%)
  adding: audio-logmel/000/000135_483840.png (deflated 1%)
  adding: audio-logmel/000/000141_153600.png (deflated 1%)
  adding: audio-logmel/000/000139_119040.png (deflated 1%)
  adding: audio-logmel/000/000144_30720.png (deflated 1%)
  adding: audio-logmel/000/000145_172800.png (deflated 1%)
  adding: audio-logmel/000/000154_288000.png (deflated 2%)
  adding: audio-logmel/000/000178_3840.png (deflated 1%)
  adding: audio-logmel/000/000182_145920.png (deflated 1%)
  adding: audio-logmel/000/000189_207360.png (deflated 1%)
  adding: audio-logmel/000/000190_126720.png (deflated 1%)
  adding: audio-logmel/000/000195_280320.png (deflated 1%)
  adding: audio-logmel/000/000201_168960.png (deflated 1%)
  adding: audio-logmel/000/000202_142080.png (deflated 1%)
  adding: audio-logmel/000/000203_7680.png (deflated 1%)
  adding: audio-logmel/000/000205_61440.png (d

In [None]:
!zip -r audio-logmel.zip audio-logmel

In [7]:
# check size for no specific reason
!du -h -s audio-logmel.zip

2.3G	audio-logmel.zip


In [None]:
!zip -r -s 500m audio-logmel-split.zip audio-logmel

In [14]:
# check size for no specific reason
!du -h -s audio-logmel-split.zip

283M	audio-logmel-split.zip
