[egs] Add a new pre-trained model (X-UMXL) (#665)
* [Fix] Update "egs/musdb18/X-UMX/requirements.txt"

* [Fix] Bug of X-UMX and README.md

* [egs] Add X-UMX Large (X-UMXL) to the list of pretrained models

* Apply black
r-sawata committed May 21, 2023
1 parent bd3caa3 commit 78e419e
Showing 3 changed files with 34 additions and 2 deletions.
1 change: 1 addition & 0 deletions asteroid/utils/hub_utils.py
@@ -25,6 +25,7 @@
"tmirzaev-dotcom/ConvTasNet_Libri3Mix_sepnoisy": "https://zenodo.org/record/4020529/files/model.pth?download=1",
"mhu-coder/ConvTasNet_Libri1Mix_enhsingle": "https://zenodo.org/record/4301955/files/model.pth?download=1",
"r-sawata/XUMX_MUSDB18_music_separation": "https://zenodo.org/record/4704231/files/pretrained_xumx.pth?download=1",
"r-sawata/XUMXL_MUSDB18_music_separation": "https://zenodo.org/record/7128659/files/pretrained_xumxl.pth?download=1",
}

SR_HASHTABLE = {k: 8000.0 if not "DeMask" in k else 16000.0 for k in MODELS_URLS_HASHTABLE}
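The new entry registers X-UMXL under a model identifier that maps to a Zenodo download URL. The following is a minimal sketch of how such a lookup could work, using a trimmed copy of the mapping from the hunk above. `resolve_url` is a hypothetical helper written for illustration and is not taken from `hub_utils.py`.

```python
# Trimmed copy of the mapping shown in the hunk above (only the X-UMX entries).
MODELS_URLS_HASHTABLE = {
    "r-sawata/XUMX_MUSDB18_music_separation": "https://zenodo.org/record/4704231/files/pretrained_xumx.pth?download=1",
    "r-sawata/XUMXL_MUSDB18_music_separation": "https://zenodo.org/record/7128659/files/pretrained_xumxl.pth?download=1",
}

# Same rule as the SR_HASHTABLE context line, written with `not in`.
SR_HASHTABLE = {k: 8000.0 if "DeMask" not in k else 16000.0 for k in MODELS_URLS_HASHTABLE}


def resolve_url(model_id: str) -> str:
    """Hypothetical helper: return the download URL registered for model_id."""
    try:
        return MODELS_URLS_HASHTABLE[model_id]
    except KeyError:
        raise ValueError(f"Unknown pretrained model id: {model_id!r}") from None


print(resolve_url("r-sawata/XUMXL_MUSDB18_music_separation"))
```

Under the `SR_HASHTABLE` rule as written, both X-UMX keys map to `8000.0`, since neither contains `"DeMask"`.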
23 changes: 22 additions & 1 deletion egs/musdb18/X-UMX/README.md
@@ -1,6 +1,6 @@
# CrossNet-Open-Unmix (X-UMX)

This recipe contains __CrossNet-Open-Unmix (X-UMX)__, an improved version of [Open-Unmix (UMX)](https://github.com/sigsep/open-unmix-nnabla) for music source separation. X-UMX achieves an improved performance without additional learnable parameters compared to the original UMX model. Details of X-UMX can be found in [this paper](https://arxiv.org/abs/2010.04228). X-UMX is one of the two official baseline models for the [Music Demixing (MDX) Challenge 2021](https://www.aicrowd.com/challenges/music-demixing-challenge-ismir-2021).
This recipe contains __CrossNet-Open-Unmix (X-UMX)__, an improved version of [Open-Unmix (UMX)](https://github.com/sigsep/open-unmix-nnabla) for music source separation. X-UMX achieves an improved performance without additional learnable parameters compared to the original UMX model. Details of X-UMX can be found in [this paper](https://arxiv.org/abs/2010.04228). X-UMX is one of the two official baseline models for the [Music Demixing (MDX) Challenge 2021](https://www.aicrowd.com/challenges/music-demixing-challenge-ismir-2021). Furthermore, the extended version trained on large-scale data, named X-UMX Large (X-UMXL), can be found in [this paper](https://arxiv.org/abs/2305.07855).

__Related Projects:__ [umx-pytorch](https://github.com/sigsep/open-unmix-pytorch) | [umx-nnabla](https://github.com/sigsep/open-unmix-nnabla) | x-umx-pytorch | [x-umx-nnabla](https://github.com/sony/ai-research-code/tree/master/x-umx) | [musdb](https://github.com/sigsep/sigsep-mus-db) | [museval](https://github.com/sigsep/sigsep-mus-eval)

@@ -9,13 +9,19 @@ Pretrained models on MUSDB18 for X-UMX, which reproduce the results from our pap
```
python eval.py --no-cuda --root [Path to MUSDB18]
```

You can also use X-UMXL by adding the option `--large` as follows:
```
python eval.py --no-cuda --root [Path to MUSDB18] --large
```
The separations along with the evaluation scores will be saved in `./results_using_pre-trained`.

Please note that X-UMX requires quite some memory due to its crossing architecture. Hence, switching on `--no-cuda` to prevent out-of-memory error is recommended.


### Results on MUSDB18

#### X-UMX
| Median of Median | SDR | SIR | ISR | SAR |
|:----------------:|:-------:|:------:|:------:|:------|
| vocals | 6.612 | 14.167 | 11.774 | 6.750 |
@@ -29,3 +35,18 @@ Please note that X-UMX requires quite some memory due to its crossing architectu
| drums | 5.793 | 11.167 | 10.164 | 5.687 |
| bass | 4.558 | 8.797 | 9.786 | 5.828 |
| other | 4.348 | 6.952 | 9.402 | 4.849 |

#### X-UMXL
| Median of Median | SDR | SIR | ISR | SAR |
|:----------------:|:-------:|:------:|:------:|:------|
| vocals | 7.565 | 16.737 | 13.900 | 7.469 |
| drums | 7.394 | 13.726 | 13.533 | 7.337 |
| bass | 6.283 | 12.650 | 10.421 | 6.022 |
| other | 4.833 | 8.315 | 11.352 | 5.292 |

| Mean of Median | SDR | SIR | ISR | SAR |
|:----------------:|:-------:|:------:|:------:|:------|
| vocals | 5.135 | 9.816 | 12.728 | 6.546 |
| drums | 7.257 | 12.478 | 12.274 | 7.005 |
| bass | 5.826 | 11.157 | 9.860 | 5.670 |
| other | 4.959 | 7.897 | 11.009 | 5.150 |
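The README does not define the two aggregation labels used in these tables. A common reading, assumed here, is that "Median of Median" takes the median over frames within each track and then the median over tracks, while "Mean of Median" takes the mean over tracks of those per-track medians. The sketch below illustrates that assumption with made-up frame scores. It is not the evaluation code from this recipe.

```python
from statistics import mean, median

# Hypothetical per-frame SDR values (dB) for one source on three tracks.
frame_sdr_per_track = {
    "track_a": [5.0, 6.0, 7.0],
    "track_b": [4.0, 8.0, 6.0],
    "track_c": [9.0, 10.0, 11.0],
}

# Step 1: median over frames within each track.
track_medians = [median(frames) for frames in frame_sdr_per_track.values()]

# Step 2: aggregate the per-track medians across tracks in two ways.
median_of_median = median(track_medians)
mean_of_median = mean(track_medians)

print(median_of_median, mean_of_median)
```

The two aggregates can differ noticeably when a few tracks score far from the rest, which is one possible reason the two tables report different values for the same source.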
12 changes: 11 additions & 1 deletion egs/musdb18/X-UMX/eval.py
@@ -167,6 +167,7 @@ def eval_main(
softmask=False,
residual_model=False,
model_name="xumx",
large=False,
outdir=None,
start=0.0,
duration=-1.0,
@@ -176,7 +177,11 @@ def eval_main(
model_name = os.path.abspath(model_name)
if not (os.path.exists(model_name)):
outdir = os.path.abspath("./results_using_pre-trained")
model_name = "r-sawata/XUMX_MUSDB18_music_separation"
model_name = (
"r-sawata/XUMXL_MUSDB18_music_separation"
if large
else "r-sawata/XUMX_MUSDB18_music_separation"
)
else:
outdir = os.path.join(
os.path.abspath(outdir),
@@ -276,6 +281,10 @@ def eval_main(
help="Audio chunk duration in seconds, negative values load full track",
)

parser.add_argument(
"--large", action="store_true", default=False, help="Download and use X-UMX Large (X-UMXL)"
)

parser.add_argument(
"--no-cuda", action="store_true", default=False, help="disables CUDA inference"
)
Expand All @@ -292,6 +301,7 @@ def eval_main(
niter=args.niter,
residual_model=args.residual_model,
model_name=model,
large=args.large,
outdir=args.outdir,
start=args.start,
duration=args.duration,
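Taken together, the `eval.py` hunks add a `--large` flag and use it to choose between the two pretrained identifiers when the given model path does not exist locally. The following is a minimal standalone sketch of that flow. `select_model` is a hypothetical helper that mirrors the fallback shown in the diff. It is not a function in `eval.py`, and the model path passed to it is a placeholder.

```python
import argparse
import os


def select_model(model_name: str, large: bool) -> str:
    """Hypothetical helper mirroring the fallback logic in the diff."""
    local_path = os.path.abspath(model_name)
    if os.path.exists(local_path):
        # A local model takes precedence; `large` is ignored in this case.
        return local_path
    return (
        "r-sawata/XUMXL_MUSDB18_music_separation"
        if large
        else "r-sawata/XUMX_MUSDB18_music_separation"
    )


parser = argparse.ArgumentParser()
parser.add_argument(
    "--large", action="store_true", default=False, help="Download and use X-UMX Large (X-UMXL)"
)

# Parse an explicit argument list so the sketch runs without a command line.
args = parser.parse_args(["--large"])
print(select_model("no_such_local_model_dir", args.large))
```

Reading the diff this way, `--large` only has an effect when no local model is found at the given path.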