This repository was archived by the owner on Nov 20, 2022. It is now read-only.

ccoreilly/streaming-source-separation

 
 


streaming-source-separation

Overview:
This project uses the Open-Unmix source separation architecture
to play separated audio through the loudspeakers while the input is still being processed.



The original Open-Unmix repository is listed under References.

Open-Unmix uses three bidirectional LSTM layers to estimate a spectral mask for its target source.
The final separation is produced by Wiener filtering the original mixed signal with the estimated spectral mask.
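
To illustrate the masking step described above, here is a minimal sketch of a Wiener-style soft mask applied to a mixture spectrogram. It is a simplified single-channel version, not the Wiener filter implementation that Open-Unmix itself uses. The function names (soft_mask, apply_mask), the toy data, and the stand-in "model output" are all assumptions made for illustration, and NumPy is assumed to be installed.

```python
import numpy as np


def soft_mask(target_mag, residual_mag, eps=1e-10):
    """Wiener-style ratio mask built from two non-negative magnitude spectrograms."""
    target_pow = target_mag ** 2
    residual_pow = residual_mag ** 2
    return target_pow / (target_pow + residual_pow + eps)


def apply_mask(mixture_stft, mask):
    """Scale each complex mixture bin by the mask; the mixture phase is reused."""
    return mask * mixture_stft


# Toy data: 4 frequency bins x 3 frames of a complex mixture spectrogram.
rng = np.random.default_rng(0)
mixture_stft = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
mixture_mag = np.abs(mixture_stft)

# Hypothetical model output: this stand-in simply claims 70% of the magnitude.
target_mag = 0.7 * mixture_mag
residual_mag = mixture_mag - target_mag

mask = soft_mask(target_mag, residual_mag)
target_stft = apply_mask(mixture_stft, mask)
```

Because the mask is real and non-negative, the separated estimate keeps the phase of the mixture and only rescales each time-frequency bin.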

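A bidirectional LSTM needs the whole input sequence before it can produce output, so it cannot run online.
The streaming version was therefore built by training unidirectional LSTM models
and implementing a producer-consumer multithreading system in Python.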
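
The producer-consumer pattern mentioned above can be sketched with the standard library alone. This is an illustrative outline, not the code in unmix_stream.py: the chunk size, the sentinel, and the separate function (a placeholder for the model call) are all hypothetical.

```python
import queue
import threading

CHUNK_SIZE = 4   # samples per chunk (hypothetical; real audio chunks are far larger)
SENTINEL = None  # marks the end of the stream


def producer(samples, chunks):
    """Slice the input into fixed-size chunks and hand them to the consumer."""
    for start in range(0, len(samples), CHUNK_SIZE):
        chunks.put(samples[start:start + CHUNK_SIZE])  # blocks while the queue is full
    chunks.put(SENTINEL)


def separate(chunk):
    """Placeholder for the model call; here it simply halves every sample."""
    return [0.5 * s for s in chunk]


def consumer(chunks, output):
    """Process chunks in arrival order until the sentinel shows up."""
    while True:
        chunk = chunks.get()
        if chunk is SENTINEL:
            break
        output.extend(separate(chunk))  # a real system would write to the audio device


samples = [float(i) for i in range(10)]
chunks = queue.Queue(maxsize=2)  # bounded, so the producer cannot run far ahead
output = []

threads = [
    threading.Thread(target=producer, args=(samples, chunks)),
    threading.Thread(target=consumer, args=(chunks, output)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The bounded queue is the main design choice here: it keeps reading and processing decoupled while capping how much unprocessed audio can pile up between the two threads.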

The 'models' folder contains trained models for sung vocals and spoken speech targets.
These are stored with Git LFS, so Git LFS may be required to fetch them locally.

The sung vocals model was trained on the MUSDB dataset.
The spoken speech model was trained on a subset of 7000 examples
from Mozilla's Common Voice dataset and 7000 samples from the UrbanSound8K dataset of urban noise.

Examples:
With the provided models, the program can separate sung vocals from a musical mix
or speech from environmental noise.
For music files, either the sung vocals or the backing instruments can be extracted.

python3 unmix_stream.py path_to_music_file.wav acapella
python3 unmix_stream.py path_to_music_file.wav instrumental
python3 unmix_stream.py path_to_noisy_speech_file.wav speech

References:
Stöter, F.R., Uhlich, S., Liutkus, A., Mitsufuji, Y. (2019). Open-Unmix - A Reference Implementation for Music Source Separation. Journal of Open Source Software, Open Journals, 4(41), 1667.
Open-Unmix Repository

Mozilla (2017). Mozilla Common Voice.
Common Voice Dataset

Salamon, J., Jacoby, C., & Bello, J. P. (2014, November). A dataset and taxonomy for urban sound research. In Proceedings of the 22nd ACM international conference on Multimedia (pp. 1041-1044). ACM.
UrbanSound Dataset Paper

About

Streaming source separation for music and speech files, using the Open-Unmix LSTM architecture.
