Skip to content

ebilsanta/MLA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 

Repository files navigation

IS460: Machine Learning and Applications Project G1T4

Datasets

This repository contains scripts we used for data processing. The raw datasets have to be downloaded from their original sources as they are too large to host here.

After the datasets are download, follow these steps in order:

  1. Instructions to set up and run the scripts are found in the respective folders eg. datasets/[dataset].
  2. run preprocess_csd_and_esmuc.py, check that now there exists the folder datasets/combined_processed_dataset. This will be used specifically for Open-Unmix.
  3. run batch_convert_mono_to_stereo_for_bsrnn.py, check that now there exists the folder datasets/combined_processed_dataset_converted_stereo. This will be used specifically for BSRNN.
  4. [for jukebox] bla bla

Training the Models

Since the 3 models are very different from each other, we have put them individually inside their respective directories in models/model. Inside each directory, please follow the respective README.md

Results

  1. Our models achieved the following Source-to-Distortion Ratio (SDR) results:
Soprano Alto Tenor Bass Average
SOTA* 1.67 10.70 -7.13 7.42 2.88
Open-Unmix 4.68 3.12 2.13 1.74 2.92
BSRNN 2.16 3.12 2.86 3.21 2.84

* P. Chandna, H. Cuesta, D. Petermann, and E. Gómez, “A Deep-Learning Based Framework for Source Separation, Analysis, and Synthesis of Choral Ensembles,” Front. Signal Process., vol. 2, p. 808594, Apr. 2022, doi: 10.3389/frsip.2022.808594.

  1. Check out some samples of our models' inferences on 4 choir songs: https://smu-my.sharepoint.com/:f:/g/personal/kevano_2020_business_smu_edu_sg/Ek2286izPx5Lkbew-vrgjUEBGDwSNMoahMWTMByRZ9-tzA?e=qj7A4L