# Improving Single-Network Single-Channel Separation of Musical Audio with Convolutional Layers

This repository contains the code to reproduce the musical audio source separation method presented in:

Roma, G., Green, O. & Tremblay, P. A. (2018). Improving single-network single-channel separation of musical audio with convolutional layers. *14th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA)*.

The original evaluation used the DSD100 dataset via `dsdtools`. For the 2018 SiSEC campaign, the code was updated to use the musdb18 dataset via the `musdb` package.
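For reference, loading tracks with the `musdb` package looks roughly like the following sketch; the dataset path is a placeholder and the exact API depends on the installed version (this assumes `musdb` 0.3 or later):

```python
# Minimal sketch of loading musdb18 with the musdb package; the root
# path is a placeholder and the API assumes musdb >= 0.3.
import musdb

mus = musdb.DB(root="/path/to/musdb18", subsets="train")

for track in mus.tracks:
    mixture = track.audio                    # (n_samples, 2) float array
    vocals = track.targets["vocals"].audio   # isolated vocal stem
    rate = track.rate                        # 44100 Hz for musdb18
```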

## Requirements

- `numpy`
- `scipy`
- `pytorch`
- `resampy`
- `musdb`
- `mir_eval`

## How to use

The two proposed variants are in `RGT1` and `RGT2`, corresponding to the two SiSEC submissions. The first uses a soft mask trained with an MSE loss; the second uses a negative log likelihood loss with a 2D softmax output to predict a binary mask.
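As a rough illustration of the two objectives (not the repository's actual code; tensor shapes and the binary-mask threshold are assumptions):

```python
# Illustrative sketch of the two training objectives; shapes and the
# ideal-mask threshold are assumptions, not the repository's code.
import torch
import torch.nn.functional as F

mixture = torch.rand(8, 513, 100)   # (batch, freq bins, frames) magnitudes
target = torch.rand(8, 513, 100)    # magnitude of the target source

# RGT1: the network outputs a soft mask applied to the mixture, MSE loss.
mask_logits = torch.randn(8, 513, 100)          # stand-in network output
soft_mask = torch.sigmoid(mask_logits)
loss_rgt1 = F.mse_loss(soft_mask * mixture, target)

# RGT2: two scores per time-frequency bin, softmax over the class axis,
# NLL loss against an ideal binary mask.
scores = torch.randn(8, 2, 513, 100)            # (batch, 2 classes, freq, frames)
log_probs = F.log_softmax(scores, dim=1)
binary_mask = (target > 0.5 * mixture).long()   # illustrative ideal mask
loss_rgt2 = F.nll_loss(log_probs, binary_mask)
```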

In each directory, configure the dataset path and the analysis path (for temporary analysis files); a hypothetical configuration sketch follows below. Then run the scripts in order as needed:

```sh
python analyze.py
python train.py
python predict.py
```

Also create the folders for the results as needed. Using a GPU is advised for `train.py`.
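The exact variable names differ per script; a hypothetical configuration looks like:

```python
# Hypothetical configuration values; the actual variable names inside
# analyze.py, train.py and predict.py may differ.
import torch

DATASET_PATH = "/data/musdb18"       # root of the musdb18 dataset
ANALYSIS_PATH = "/tmp/rgt_analysis"  # temporary analysis files
RESULTS_PATH = "results"             # create this folder before predicting

# train.py benefits greatly from a GPU when one is available:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```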

For evaluation, the script `eval_track.py` evaluates a single track, so it can be run in parallel, e.g. with GNU parallel:

```sh
parallel -j4 python eval_track.py {} ::: {0..49}
```
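Single-track evaluation with `mir_eval` (listed above as a requirement) boils down to something like this sketch; the arrays are random stand-ins for true and estimated sources:

```python
# Sketch of BSS evaluation with mir_eval; the arrays are random
# stand-ins for true and estimated sources.
import numpy as np
import mir_eval

n_samples = 44100 * 5                       # five seconds at 44.1 kHz
reference = np.random.randn(2, n_samples)   # (n_sources, n_samples)
estimate = reference + 0.1 * np.random.randn(2, n_samples)

sdr, sir, sar, perm = mir_eval.separation.bss_eval_sources(reference, estimate)
print(sdr)                                  # one SDR value per source
```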

The scripts `predict_sisec.py` and `eval_musdb.py` run the evaluation on the test partition of the dataset according to the SiSEC campaign procedure. Since the original approach is single-channel, it is run on each channel separately.
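Conceptually, the per-channel processing amounts to the following sketch, where `separate` is a placeholder for the model's prediction step:

```python
# Sketch of applying a single-channel model to stereo input;
# `separate` is a placeholder for the repository's prediction step.
import numpy as np

def separate(mono):
    """Hypothetical mono separator returning one estimated source."""
    return mono  # placeholder: identity for illustration

stereo = np.random.randn(44100, 2)            # (n_samples, 2) stereo mixture
left = separate(stereo[:, 0])
right = separate(stereo[:, 1])
estimate = np.stack([left, right], axis=1)    # recombined stereo estimate
```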
