Intelligent drum mixing with the Wave-U-Net

Implementation of the Mix-Wave-U-Net for automatic mixing of drums.

Listening examples

Listen to drum mixing results here.

What is the Mix-Wave-U-Net?

The Wave-U-Net, presented in this paper, is a convolutional neural network for audio source separation that operates directly on the raw audio waveform.

We adapted the architecture slightly and applied it to the task of mixing a set of stem recordings into a full music mixture.

See the diagram below for a summary of the network architecture.



A GPU is strongly recommended to avoid very long training times (CUDA 10.0 required).

The project is based on Python 3.6 and requires libsndfile to be installed.

Then, the following Python packages need to be installed:


If no GPU is available, the CPU version of TensorFlow (tensorflow) can be used instead of tensorflow-gpu.

All of the above packages are also listed in the file requirements.txt in this repository, so you can clone the repository and then run the following from the repository's folder to install all required packages at once:

pip install -r requirements.txt
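
As a rough sanity check of the prerequisites named above (Python 3.6 and libsndfile), a small script along these lines can be run before installing the packages. This is only a sketch: the function name check_environment is made up for illustration and is not part of this repository, and it only uses the Python standard library.

```python
import ctypes.util
import sys


def check_environment(expected_python=(3, 6)):
    """Return a list of warning strings; an empty list means nothing was flagged.

    Hypothetical helper, not part of the repository.
    """
    warnings = []
    # The README states that the project is based on Python 3.6.
    if tuple(sys.version_info[:2]) != tuple(expected_python):
        warnings.append(
            "Running Python %d.%d, but the project targets %d.%d"
            % (sys.version_info[0], sys.version_info[1],
               expected_python[0], expected_python[1])
        )
    # find_library returns None when the shared library cannot be located.
    if ctypes.util.find_library("sndfile") is None:
        warnings.append("libsndfile was not found on the library search path")
    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("WARNING:", message)
```

A missing libsndfile warning does not necessarily mean the audio packages will fail, since library lookup differs between platforms; treat the output as a hint only.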

Download datasets

If you only want to use the pre-trained models we provide for download to mix your own drum recordings, skip directly to the last section, since the dataset is not needed in that case.

To reproduce the experiments in the paper (train all the models), you need to download the ENST dataset and extract it into a folder of your choice. It should then have "drummer_1", "drummer_2" and "drummer_3" subfolders in it.
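
To confirm that the extracted dataset has the layout described above, a check like the following can help. It is a minimal sketch: the helper name missing_drummer_folders is hypothetical and not part of this repository, and it only tests for the three subfolder names mentioned here, not for the files inside them.

```python
import os

# Subfolder names taken from the description above.
EXPECTED_SUBFOLDERS = ("drummer_1", "drummer_2", "drummer_3")


def missing_drummer_folders(enst_root):
    """Return the expected subfolders that are absent from enst_root.

    Hypothetical helper, not part of the repository.
    """
    return [
        name
        for name in EXPECTED_SUBFOLDERS
        if not os.path.isdir(os.path.join(enst_root, name))
    ]
```

An empty list means all three subfolders were found in the folder you passed in.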

Set-up filepaths

Now you need to set up the correct file paths for the dataset and the location where outputs should be saved.

Open the file, and set the enst_path entry of the model_config dictionary to the location of the main folder of the ENST dataset. Also set the estimates_path entry of the same dictionary to the path of an empty folder where the final model outputs should be saved.
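
The two entries could look roughly like the sketch below. The key names enst_path and estimates_path come from the text above; the example paths and the helper find_path_problems are hypothetical, and the real model_config dictionary contains further entries that are not shown here.

```python
import os

# Only the two entries discussed above; the example paths are placeholders.
model_config = {
    "enst_path": "/data/ENST",             # main folder of the ENST dataset
    "estimates_path": "/data/mix_outputs", # empty folder for model outputs
}


def find_path_problems(config):
    """Return a list of problems with the two path entries (empty if none).

    Hypothetical helper, not part of the repository.
    """
    problems = []
    if not os.path.isdir(config["enst_path"]):
        problems.append("enst_path is not an existing folder")
    estimates = config["estimates_path"]
    if not os.path.isdir(estimates):
        problems.append("estimates_path is not an existing folder")
    elif os.listdir(estimates):
        problems.append("estimates_path is not empty")
    return problems
```

Calling find_path_problems(model_config) before training gives an early hint if one of the folders is missing or if the output folder already contains files.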

Training the models / model overview

Since the paper investigates many variants of the Wave-U-Net and, as a comparison, also trains the U-Net proposed for vocal separation (which achieved state-of-the-art performance), we give a list of the model variants to train and the command needed to start training each of them:

| Model name (from paper) | Description                     | Command for training        |
| Dry                     | Wave-U-Net for dry mixing task  | python with cfg.context_dry |
| Wet                     | Wave-U-Net for wet mixing task  | python with cfg.context_wet |

Test trained models on songs!

Downloading our pretrained models

We provide pre-trained models so you can mix your own collection of drum recordings right away. See the GitHub release section for downloads. Unzip the downloaded archive into the checkpoints subfolder of this repository, so that you have one subfolder for each model (e.g. REPO/checkpoints/dry).
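
If you prefer to unpack the archive from Python, a sketch like the one below extracts it into the checkpoints subfolder and reports which model subfolders are present afterwards. The function name extract_checkpoints is hypothetical, and the sketch assumes the archive contains one top-level folder per model; check the actual archive contents if the result looks different.

```python
import os
import zipfile


def extract_checkpoints(archive_path, repo_root):
    """Extract archive_path into repo_root/checkpoints and list model subfolders.

    Hypothetical helper, not part of the repository. Assumes the archive
    holds one top-level folder per model (for example "dry").
    """
    target = os.path.join(repo_root, "checkpoints")
    os.makedirs(target, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # Report only directories, since each model should get its own subfolder.
    return sorted(
        name
        for name in os.listdir(target)
        if os.path.isdir(os.path.join(target, name))
    )
```

The returned list should contain one entry per model, matching the REPO/checkpoints/dry layout described above.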

Run pretrained models

For a quick demo on a set of drum stems using the wet mixing model, simply execute

python with cfg.context_wet

which will mix the drum stems contained in this repository's audio_examples/inputs subfolder. The output will be saved into audio_examples/outputs.

To use the dry model on the example inputs, use

python with cfg.context_dry output_path=audio_examples/outputs/dry_mix.wav model_path=checkpoints/dry/dry-460000

which loads the checkpoint file of the dry model, matching the model configuration that is used. The output_path parameter specifies a custom output file path for the mixture audio.

To use the model on your own inputs, add the input_path parameter to the command-line arguments. For example, to set the tom_2 input to PATH, use input_path={'tom_2':PATH}. Set PATH to None if you don't want to use this particular stem for mixing. See Config.py for details.
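
Because the input_path value contains braces and quotes, it usually needs quoting in a shell. The sketch below builds a shell-safe argument string from a Python dictionary of stems. The helper name format_input_path_arg is hypothetical, only the stem name tom_2 is taken from the text above, and writing file paths as quoted strings inside the braces is an assumption; see Config.py for the stem names and format the code actually expects.

```python
import shlex


def format_input_path_arg(stems):
    """Build a shell-quoted input_path={...} argument from a dict of stems.

    Hypothetical helper, not part of the repository. Values are file paths
    (written as quoted strings, which is an assumption) or None to leave a
    stem out of the mix.
    """
    entries = ",".join(
        "%r:%r" % (name, path) for name, path in sorted(stems.items())
    )
    # shlex.quote protects braces and quotes from shell interpretation.
    return shlex.quote("input_path={%s}" % entries)


if __name__ == "__main__":
    # Print an argument that can be appended to the commands shown above.
    print(format_input_path_arg({"tom_2": None}))
```

The printed string can be pasted after the other command-line arguments; the shell removes the outer quoting before the program sees the value.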

