More info on pretrained models #1

Closed
faroit opened this issue Oct 28, 2019 · 6 comments
Labels: model (Discussion around TF model design), question (Further information is requested)

Comments

@faroit
Contributor

faroit commented Oct 28, 2019

First, congrats on your release. Glad to have more music separation models that are ready to use :-)

Do you have some more information on the following?

  • Were the pretrained models trained only on MUSDB18?
  • If not, would the models still perform as well as the state of the art when trained only on MUSDB18?
  • What SDR values can we expect from these models on the MUSDB18 test set?

Faylixe added the model (Discussion around TF model design) and question (Further information is requested) labels Oct 28, 2019

@romi1502
Member

romi1502 commented Oct 30, 2019

Hey Fabian! hope you're well :)!
Thank you for your question.

  • We did not use any MUSDB data for training or validation, only datasets that we have at Deezer (which may be part of the added value of Spleeter over other released models).
  • The model is based on convolutional U-nets (one per instrument). I think some models in SiSEC 2018 (JY from what I remember, but I'm not sure; you could probably confirm that) were quite similar and performed quite well without any data other than MUSDB.
    We also trained other kinds of models (such as LSTM-based ones), but we finally kept this one because it allows very fast computation on GPUs (both for training and prediction) while giving good separation results.
  • On the MUSDB18 test set, we get the following SDR values with the 4 stems model using multichannel Wiener filtering (which improves the scores a bit, but which we believe is perceptually worse than basic ratio masks):

                      vocals SDR   bass SDR   drums SDR   other SDR
  Spleeter 4 stems    6.86 dB      5.51 dB    6.71 dB     4.55 dB

Note that we did not try to optimize these scores and did not use any MUSDB training data in the training process, so these scores are an actual measure of the generalization power of the model (on Western pop/rock songs, though).

There are some more details in the extended abstract of the demo we'll present at ISMIR next week.
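
For readers unfamiliar with the "basic ratio masks" mentioned above, here is a minimal sketch of the general idea. It is not taken from the Spleeter code base: the function names, the choice of a power-ratio mask, and the random toy data are all assumptions made for illustration. Each source's estimated magnitude spectrogram is turned into a soft mask, and the mask is applied to the complex mixture spectrogram so that the mixture phase is reused.

```python
import numpy as np

def ratio_masks(source_mags, eps=1e-10):
    """Soft masks from per-source magnitude estimates.

    source_mags has shape (n_sources, n_freq, n_frames). The power-ratio
    form used here is an illustrative assumption, not Spleeter's code.
    """
    power = np.asarray(source_mags, dtype=float) ** 2
    return power / (power.sum(axis=0, keepdims=True) + eps)

def apply_masks(mixture_stft, masks):
    """Scale the complex mixture STFT by each mask (mixture phase is kept)."""
    return masks * mixture_stft[np.newaxis, ...]

# Toy data standing in for network outputs (hypothetical, random values).
rng = np.random.default_rng(0)
mixture = rng.normal(size=(5, 4)) + 1j * rng.normal(size=(5, 4))
estimates = rng.uniform(0.1, 1.0, size=(4, 5, 4))  # e.g. vocals, bass, drums, other

masks = ratio_masks(estimates)
separated = apply_masks(mixture, masks)
```

Because the masks sum to one in every time-frequency bin, the separated spectrograms add back up to the mixture. Multichannel Wiener filtering goes further by also using spatial statistics across channels, which this sketch does not attempt.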

@faroit
Contributor Author

faroit commented Oct 30, 2019

@romi1502 thanks for the info.

I think, especially for the research community, it would be cool to also present reproducible scores when just trained on MUSDB18. By doing it yourself, you might prevent people from using non-ideal parameters, hence, reporting scores that are too low. Oh and also we can save a bit of energy for the environment ;-)

@mmoussallam
Collaborator

Hi @faroit, thanks for your feedback.

Training on musdb is definitely something we can do, but I'm not sure how much value it would bring to end users.
Our intent with Spleeter is not so much to compare ourselves with the latest separation models but rather to provide a fast and ready-to-use separation tool for researchers working on other tasks (e.g. transcription). I'm afraid releasing multiple models trained on different datasets would complicate things for users.

@decivilizator

Could you please share how large your dataset was for the pretrained models?

@faroit
Contributor Author

faroit commented Nov 12, 2019

> Training on musdb is definitely something we can do but I'm not sure how much value it would bring to end users.

> I'm afraid releasing multiple models trained on different datasets would complicate things for users.

@mmoussallam I understand that, but at some point people will use this to train on their own data and might publish results based on this repo.

I already trained and evaluated on MUSDB18 and it seems that there are some issues. See #81.
It would be great if you could help update the training configs for MUSDB18. Another option would be to maintain a fork of spleeter on sigsep to host models pretrained on MUSDB18 for the source separation community. What do you think?

@faroit
Contributor Author

faroit commented Nov 12, 2019

@mmoussallam I am closing this issue since it is no longer about the pretrained models. Feel free to reply here or in #81.
