Intelligent drum mixing with the Wave-U-Net

Implementation of the Mix-Wave-U-Net for automatic mixing of drums.

Listening examples

Listen to drum mixing results here.

What is the Mix-Wave-U-Net?

The Wave-U-Net is a convolutional neural network applicable to audio source separation tasks, which works directly on the raw audio waveform, presented in this paper.

We adapted the architecture slightly and applied it to the task of mixing a set of stem recordings into a full music mixture.

See the diagram below for a summary of the network architecture.

Installation

Requirements

GPU strongly recommended to avoid very long training times (CUDA 10.0 required)

The project is based on Python 3.6 and requires libsndfile to be installed.

Then, the following Python packages need to be installed:

numpy==1.18.3
sacred==0.8.1
tensorflow-gpu==1.15.2
librosa==0.7.2
soundfile==0.10.3.post1
lxml==4.5.0
google
protobuf
soundfile
tqdm

Instead of tensorflow-gpu, the CPU version of TF, tensorflow can be used, if there is no GPU available.

All the above packages are also saved in the file requirements.txt located in this repository, so you can clone the repository and then execute the following in the downloaded repository's path to install all the required packages at once:

pip install -r requirements.txt

Download datasets

To directly use the pre-trained models we provide for download to separate your own songs, now skip directly to the last section, since the dataset is not needed in that case.

To reproduce the experiments in the paper (train all the models), you need to download the ENST dataset and extract it into a folder of your choice. It should then have "drummer_1", "drummer_2" and "drummer_3" subfolders in it.

Set-up filepaths

Now you need to set up the correct file paths for the dataset and the location where outputs should be saved.

Open the Config.py file, and set the enst_path entry of the model_config dictionary to the location of the main folder of the ENST dataset. Also set the estimates_path entry of the same model_config dictionary to the path pointing to an empty folder where you want the final model outputs to be saved into.

Training the models / model overview

Since the paper investigates many model variants of the Wave-U-Net and also trains the U-Net proposed for vocal separation, which achieved state-of-the-art performance, as a comparison, we give a list of model variants to train and the command needed to start training them:

Model name (from paper)	Description	Command for training
Dry	Wave-U-Net for dry mixing task	`python Training.py with cfg.context_dry`
Wet	Wave-U-Net for wet mixing task	`python Training.py with cfg.context_wet`

Test trained models on songs!

Downloading our pretrained models

We provide pre-trained models so you can mix your own collection of drum recordings right away, see the Github release section for downloads. Unzip the obtained archive into the checkpoints subfolder in this repository, so that you have one subfolder for each model (e.g. REPO/checkpoints/dry)

Run pretrained models

For a quick demo on a set of drum stems using the wet mixing model, simply execute

python Predict.py with cfg.context_wet

which will mix the drum stems contained in this repository's audio_examples/inputs subfolder. The output will be saved into audio_examples/outputs.

To use the dry model on the example inputs, use

python Predict.py with cfg.context_dry output_path=audio_examples/outputs/dry_mix.wav model_path=checkpoints/dry/dry-460000

which loads the model checkpoint file of the dry model fitting to the model configuration that is used. output_path specifies a custom output file path for the mixture audio.

To use the model on your own inputs, add the input_path parameter to the command-line arguments. For example, to set the tom_2 input to PATH, use input_path={'tom_2':PATH}. Set PATH to None if you don't want to use this particular stem for mixing. See Config.pyfor details.

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
Input		Input
Models		Models
audio_examples		audio_examples
checkpoints		checkpoints
data		data
.gitignore		.gitignore
Config.py		Config.py
Datasets.py		Datasets.py
Evaluate.py		Evaluate.py
LICENSE		LICENSE
Predict.py		Predict.py
PredictDataset.py		PredictDataset.py
README.md		README.md
Test.py		Test.py
Training.py		Training.py
Utils.py		Utils.py
mixwaveunet.png		mixwaveunet.png
requirements.txt		requirements.txt

License

f90/Mix-Wave-U-Net

Folders and files

Latest commit

History

Repository files navigation

Intelligent drum mixing with the Wave-U-Net

Listening examples

What is the Mix-Wave-U-Net?

Installation

Requirements

Download datasets

Set-up filepaths

Training the models / model overview

Test trained models on songs!

Downloading our pretrained models

Run pretrained models

About

Resources

License

Stars

Watchers

Forks

Languages