This is an implementation of the U-Net for vocal separation proposed at ISMIR 2017, built with the Chainer framework.
cupy 2.0 (required only if you want to train the U-Net yourself; training also requires a CUDA environment)
Please refer to DoExperiment.py for code examples (or simply modify it!).
How to prepare a dataset for U-Net training
*If you want to train U-Net with your own dataset, prepare the mixed, instrumental-only, and vocal-only versions of each track, and pickle their spectrograms using the util.SaveSpectrogram() function. Set PATH_FFT (in const.py) to the directory where you want to save the pickled data.
*If you have the iKala, MedleyDB, or DSD100 dataset, you can use the corresponding ProcessXX.py script. Remember to set PATH_XX in each script to the correct path.
*If you want to generate a dataset from "original" and "instrumental version" audio pairs (as the original work did), refer to
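For readers preparing their own data, the general idea of the first option (pickling the mixed, instrumental-only, and vocal-only spectrograms of each track) can be sketched roughly as follows. This is a standalone illustration using only NumPy and the standard library, not the repository's util.SaveSpectrogram(); the function names, the dictionary layout, the FFT size and hop length, and the output directory (a stand-in for PATH_FFT) are all assumptions made for the example, and synthetic sine waves replace real audio.

```python
import os
import pickle
import tempfile

import numpy as np


def magnitude_spectrogram(signal, n_fft=1024, hop=768):
    """Magnitude STFT of a 1-D signal, shape (n_fft // 2 + 1, n_frames).

    n_fft and hop are illustrative values, not read from const.py.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T.astype(np.float32)


def save_track(out_dir, name, mix, inst, vocal):
    """Pickle the three spectrograms of one track into a single file.

    Hypothetical helper: the file layout used by the repository may differ.
    """
    spec = {
        "mix": magnitude_spectrogram(mix),
        "inst": magnitude_spectrogram(inst),
        "vocal": magnitude_spectrogram(vocal),
    }
    path = os.path.join(out_dir, name + ".pkl")
    with open(path, "wb") as f:
        pickle.dump(spec, f)
    return path


# Synthetic stand-ins for real audio: two seconds at an assumed 8192 Hz rate.
sr = 8192
t = np.arange(sr * 2) / sr
inst = 0.5 * np.sin(2 * np.pi * 220 * t)   # "instrumental-only" signal
vocal = 0.3 * np.sin(2 * np.pi * 440 * t)  # "vocal-only" signal
mix = inst + vocal                         # "mixed" signal

out_dir = tempfile.mkdtemp()  # stand-in for the directory set in PATH_FFT
saved = save_track(out_dir, "demo_track", mix, inst, vocal)

# Read the file back to confirm the round trip works.
with open(saved, "rb") as f:
    loaded = pickle.load(f)
print(sorted(loaded), loaded["mix"].shape)
```

With real data, the three signals would be loaded from the audio files of each track, and it is worth checking that all three versions share the same sample rate and length before computing spectrograms.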
The neural network is implemented according to the following publication:
Andreas Jansson, Eric J. Humphrey, Nicola Montecchio, Rachel Bittner, Aparna Kumar, Tillman Weyde, Singing Voice Separation with Deep U-Net Convolutional Networks, Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR), 2017.