<a href="https://colab.research.google.com/github/Eddycrack864/Music-Source-Separation-Universal-Colab/blob/main/Music_Source_Separation_Universal_Colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Music Source Separation Universal
A repository for training music source separation models. It is based on the [kuielab code](https://github.com/kuielab/sdx23/tree/mdx_AB/my_submission/src) for the [SDX23 challenge](https://github.com/kuielab/sdx23/tree/mdx_AB/my_submission/src). The main goal of this repository is to provide training code that is easy to modify for experiments. Brought to you by [MVSep.com](https://mvsep.com).

Repository by: [ZFTurbo](https://github.com/ZFTurbo)
Colab by: [Not Eddy (Spanish Mod)](http://discord.com/users/274566299349155851) on [AI HUB](https://discord.gg/aihub)

In [None]:
#@title Setup
import requests
import zipfile
from io import BytesIO
from google.colab import drive
from tqdm import tqdm
from IPython.display import clear_output
!git clone https://github.com/ZFTurbo/Music-Source-Separation-Training.git
!pip install -r Music-Source-Separation-Training/requirements.txt
clear_output()
print("Installation complete! Starting model download...")
zip_url = "https://huggingface.co/Eddycrack864/Music-Source-Separation-Universal/resolve/main/Models.zip"
response = requests.get(zip_url, stream=True)
response.raise_for_status()  # stop early if the download request failed
total_size = int(response.headers.get('content-length', 0))
block_size = 1024
with BytesIO() as zip_data:
    with tqdm(total=total_size, unit='iB', unit_scale=True) as pbar:
        for data in response.iter_content(block_size):
            pbar.update(len(data))
            zip_data.write(data)
    zip_data.seek(0)
    print("Successful download, extracting...")
    with zipfile.ZipFile(zip_data, 'r') as zip_ref:
        zip_ref.extractall("Models")
drive.mount('/content/drive')
clear_output()
print("Ready!")

In [None]:
#@markdown #Separation
%cd /content/Music-Source-Separation-Training
from pathlib import Path
import glob

input_folder = '/content/drive/MyDrive/Separar' #@param {type:"string"}
output_folder = '/content/drive/MyDrive/Vocales' #@param {type:"string"}
model = 'MDX23C Inst' #@param ["MDX23C Inst", "MDX23C Vocals", "Demucs4HT", "VitLarge23", "Mel-Band RoFormer", "Swin Upernet", "BandIt Plus"]
#@markdown **Notes:**

#@markdown **Only audio in .wav format is supported.**

#@markdown **Demucs4HT** only works with audio that is 2 minutes long or shorter.

#@markdown **Mel-Band RoFormer** is currently broken. If you know how to fix it, ping me.

#@markdown **Swin Upernet** is currently broken. If you know how to fix it, ping me.
if model == 'MDX23C Inst':
    model_type = 'mdx23c'
    config_path = '/content/Models/config_musdb18_mdx23c.yaml'
    start_check_point = '/content/Models/model_mdx23c_ep_168_sdr_7.0207.ckpt'

elif model == 'MDX23C Vocals':
    model_type = 'mdx23c'
    config_path = '/content/Models/config_vocals_mdx23c.yaml'
    start_check_point = '/content/Models/model_vocals_mdx23c_sdr_10.17.ckpt'

elif model == 'Demucs4HT':
    model_type = 'htdemucs'
    config_path = '/content/Models/config_vocals_htdemucs.yaml'
    start_check_point = '/content/Models/model_vocals_htdemucs_sdr_8.78.ckpt'

elif model == 'VitLarge23':
    model_type = 'segm_models'
    config_path = '/content/Models/config_vocals_segm_models.yaml'
    start_check_point = '/content/Models/model_vocals_segm_models_sdr_9.77.ckpt'

elif model == 'Mel-Band RoFormer':
    model_type = 'mel_band_roformer'
    config_path = '/content/Models/config_vocals_mel_band_roformer.yaml'
    start_check_point = '/content/Models/model_vocals_mel_band_roformer_sdr_8.42.ckpt'

elif model == 'Swin Upernet':
    model_type = 'swin_upernet'
    config_path = '/content/Models/config_vocals_swin_upernet.yaml'
    start_check_point = '/content/Models/model_swin_upernet_ep_56_sdr_10.6703.ckpt'

elif model == 'BandIt Plus':
    model_type = 'bandit'
    config_path = '/content/Models/config_dnr_bandit_bsrnn_multi_mus64.yaml'
    start_check_point = '/content/Models/model_bandit_plus_dnr_sdr_11.47.chpt'

Path(output_folder).mkdir(parents=True, exist_ok=True)
# Paths are quoted so that folders containing spaces are passed as single arguments.
!python inference.py \
        --model_type {model_type} \
        --input_folder "{input_folder}" \
        --store_dir "{output_folder}" \
        --device_ids 0 \
        --config_path "{config_path}" \
        --start_check_point "{start_check_point}"
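
The notes in the Separation cell say that only .wav input works and that Demucs4HT needs clips of 2 minutes or less. A pre-flight check along these lines could catch problem files before inference starts. This is a minimal sketch, not part of the repository: `wav_duration_seconds` and `check_inputs` are hypothetical helpers, and the standard-library `wave` module only reads PCM-encoded WAV files, so other WAV encodings are reported as unreadable.

```python
import wave
from pathlib import Path

MAX_DEMUCS_SECONDS = 120  # the 2-minute Demucs4HT limit stated in the notes


def wav_duration_seconds(path):
    """Duration of a PCM WAV file in seconds (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_inputs(folder, max_seconds=None):
    """Return a list of (file name, reason) pairs for files likely to fail."""
    problems = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        if path.suffix.lower() != ".wav":
            problems.append((path.name, "not a .wav file"))
            continue
        try:
            duration = wav_duration_seconds(path)
        except (wave.Error, EOFError) as exc:
            problems.append((path.name, f"unreadable by the wave module: {exc}"))
            continue
        if max_seconds is not None and duration > max_seconds:
            problems.append((path.name, f"{duration:.1f} s exceeds {max_seconds} s"))
    return problems
```

In the notebook this might be called as `check_inputs(input_folder, MAX_DEMUCS_SECONDS if model == 'Demucs4HT' else None)` before running inference.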

# Models

The model can be chosen with the `--model_type` argument.

Available models:
* MDX23C based on [KUIELab TFC TDF v3 architecture](https://github.com/kuielab/sdx23/). Key: `mdx23c`.
* Demucs4HT [[Paper](https://arxiv.org/abs/2211.08553)]. Key: `htdemucs`.
* VitLarge23 based on [Segmentation Models Pytorch](https://github.com/qubvel/segmentation_models.pytorch). Key: `segm_models`.
* Mel-Band RoFormer [[Paper](https://arxiv.org/abs/2310.01809), [Repository](https://github.com/lucidrains/BS-RoFormer)]. Key: `mel_band_roformer`.
* Swin Upernet [[Paper](https://arxiv.org/abs/2103.14030)]. Key: `swin_upernet`.
* BandIt Plus [[Paper](https://arxiv.org/abs/2309.02539), [Repository](https://github.com/karnwatcharasupat/bandit)]. Key: `bandit`.
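
The mapping from a model name to its `--model_type` key, config, and checkpoint can also be written as a lookup table instead of an `if`/`elif` chain. The sketch below is only an illustration: `MODELS` and `build_command` are hypothetical names, the file names are copied from the Separation cell of this notebook, and the flags are the ones that cell passes to `inference.py`.

```python
import shlex

# Hypothetical lookup table: display name -> (model_type key, config, checkpoint).
MODELS = {
    "MDX23C Inst": ("mdx23c", "config_musdb18_mdx23c.yaml",
                    "model_mdx23c_ep_168_sdr_7.0207.ckpt"),
    "MDX23C Vocals": ("mdx23c", "config_vocals_mdx23c.yaml",
                      "model_vocals_mdx23c_sdr_10.17.ckpt"),
    "Demucs4HT": ("htdemucs", "config_vocals_htdemucs.yaml",
                  "model_vocals_htdemucs_sdr_8.78.ckpt"),
    "VitLarge23": ("segm_models", "config_vocals_segm_models.yaml",
                   "model_vocals_segm_models_sdr_9.77.ckpt"),
    "Mel-Band RoFormer": ("mel_band_roformer", "config_vocals_mel_band_roformer.yaml",
                          "model_vocals_mel_band_roformer_sdr_8.42.ckpt"),
    "Swin Upernet": ("swin_upernet", "config_vocals_swin_upernet.yaml",
                     "model_swin_upernet_ep_56_sdr_10.6703.ckpt"),
    "BandIt Plus": ("bandit", "config_dnr_bandit_bsrnn_multi_mus64.yaml",
                    "model_bandit_plus_dnr_sdr_11.47.chpt"),
}


def build_command(model, input_folder, output_folder, models_dir="/content/Models"):
    """Build the inference.py command line for one of the models above."""
    model_type, config, checkpoint = MODELS[model]
    args = [
        "python", "inference.py",
        "--model_type", model_type,
        "--input_folder", input_folder,
        "--store_dir", output_folder,
        "--device_ids", "0",
        "--config_path", f"{models_dir}/{config}",
        "--start_check_point", f"{models_dir}/{checkpoint}",
    ]
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return shlex.join(args)


print(build_command("Demucs4HT", "/content/drive/MyDrive/Separar",
                    "/content/drive/MyDrive/Vocales"))
```

Keeping the table in one place means adding a model is a one-line change, and an unknown name fails with a `KeyError` rather than leaving the variables undefined.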


# Vocal models
| Model Type | Instruments | Metrics (SDR) |
|:-------------:|:-------------:|:-----:|
| MDX23C | vocals / other | SDR vocals: 10.17 |
| HT Demucs | vocals / other | SDR vocals: 8.78 |
| Segm Models (VitLarge23) | vocals / other | SDR vocals: 9.77 |
| Mel Band RoFormer | vocals (*) / other | SDR vocals: 8.42 |
| Swin Upernet | vocals / other | SDR vocals: 7.57 |

**Note**: Metrics measured on [Multisong Dataset](https://mvsep.com/en/quality_checker).
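
For readers unfamiliar with the metric, SDR (signal-to-distortion ratio) compares the energy of the reference stem with the energy of the estimation error, in decibels; higher is better. The sketch below uses the basic definition, 10·log10(‖s‖² / ‖s − ŝ‖²). It is an assumption that this matches the exact variant used by the Multisong quality checker, which is not specified here; the `sdr` function and the `eps` guard are illustrative.

```python
import math


def sdr(reference, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB between two equal-length sample sequences."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps avoids division by zero for a perfect estimate or a silent reference.
    return 10.0 * math.log10((signal_power + eps) / (error_power + eps))


# Toy example: the estimate is the reference scaled by 0.9, so the error has
# 10% of the signal amplitude, which corresponds to roughly 20 dB.
reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
estimate = [0.9 * r for r in reference]
print(f"SDR: {sdr(reference, estimate):.2f} dB")
```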

# Multi-stem models

| Model Type | Instruments | Metrics (SDR) |
|:-------------:|:-------------:|:-----:|
| MDX23C | bass / drums / vocals / other | MUSDB test avg: 7.15 (bass: 5.77, drums: 7.93, vocals: 9.23, other: 5.68); Multisong avg: 7.02 (bass: 8.40, drums: 7.73, vocals: 7.36, other: 4.57) |
| BandIt Plus | speech / music / effects | DnR test avg: 11.50 (speech: 15.64, music: 9.18, effects: 9.69) |

**Note**: Models were trained only on the MUSDB18HQ dataset (100 training songs).
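
The "avg" figures in the multi-stem table appear to be unweighted means of the per-stem SDR values. That reading is an assumption, not something the table states, but it can be checked against the reported numbers with a few lines of arithmetic:

```python
from statistics import mean

# (reported average, per-stem SDR values) copied from the table above.
reported = {
    "MDX23C, MUSDB test": (7.15, [5.77, 7.93, 9.23, 5.68]),
    "MDX23C, Multisong": (7.02, [8.40, 7.73, 7.36, 4.57]),
    "BandIt Plus, DnR test": (11.50, [15.64, 9.18, 9.69]),
}

for name, (average, stems) in reported.items():
    # The two values should agree up to rounding of the published figures.
    print(f"{name}: mean of stems = {mean(stems):.3f}, reported = {average:.2f}")
```

Each computed mean agrees with the reported average to within 0.01 dB, which is consistent with rounding.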