
Option to use a smaller model for faster computation #20

Closed
jongwook opened this issue May 30, 2018 · 3 comments

@jongwook (Member)

Currently the model capacity multiplier is fixed at 32, but it can be adjusted as a trade-off between computation time and accuracy. Roughly speaking, the number of parameters is quadratic in this multiplier.

The one that is deployed on https://marl.github.io/crepe uses model multiplier 4, and still achieves quite comparable performance:

multiplier   #params    RPA
        32    22.24M    93.75%
        16    5.879M    93.22%
         8    1.629M    92.47%
         4      486k    91.52%

(Note that these numbers are on MedleyDB v1 and are not comparable to what's reported in the paper.)
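The quadratic scaling can be sanity-checked with a quick sketch: if every convolutional layer's channel counts grow linearly with the multiplier, then each layer's weight count (kernel × in-channels × out-channels) grows quadratically. The layer shapes below are illustrative assumptions, not the exact CREPE architecture.

```python
# Rough parameter-count estimate. The per-layer widths and kernel
# lengths here are assumptions for illustration, not CREPE's exact
# architecture; only the scaling behavior is the point.
BASE_CHANNELS = [32, 4, 4, 4, 8, 16]      # per-layer width units (assumed)
KERNEL_SIZES = [512, 64, 64, 64, 64, 64]  # conv kernel lengths (assumed)

def conv_params(multiplier):
    """Approximate number of conv weights for a given capacity multiplier."""
    channels = [1] + [c * multiplier for c in BASE_CHANNELS]
    return sum(k * cin * cout
               for k, cin, cout in zip(KERNEL_SIZES, channels, channels[1:]))

# Halving the multiplier cuts the weight count roughly 4x; the ratio is
# just under 4 because the first layer's single input channel doesn't scale.
ratio = conv_params(32) / conv_params(16)
print(f"params(32) / params(16) ≈ {ratio:.2f}")
```

Under these assumed shapes the totals land in the same ballpark as the table above, which is consistent with the "quadratic in the multiplier" rule of thumb.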

We'd like to have an option to select a smaller model, trading slightly lower accuracy for faster computation.

I'd suggest the following options for specifying the model capacity:

| CLI option | Multiplier | # of params | Model file size |
|---|---:|---:|---:|
| `--model-capacity full` | 32 | 22M | 88 MB |
| `--model-capacity large` | 24 | 12M | 48 MB |
| `--model-capacity medium` | 16 | 5.9M | 24 MB |
| `--model-capacity small` | 8 | 1.6M | 6.4 MB |
| `--model-capacity tiny` | 4 | 486k | 1.9 MB |
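One way to wire this up on the CLI side is an argparse choice that maps each capacity name to its multiplier. The option name follows the table above; everything else (the parser setup, the default) is just a sketch:

```python
import argparse

# Proposed capacity names and their multipliers (from the table above).
CAPACITIES = {'full': 32, 'large': 24, 'medium': 16, 'small': 8, 'tiny': 4}

parser = argparse.ArgumentParser(prog='crepe')
parser.add_argument('--model-capacity', choices=CAPACITIES, default='full',
                    help='model capacity; smaller is faster but less accurate')

args = parser.parse_args(['--model-capacity', 'tiny'])
multiplier = CAPACITIES[args.model_capacity]
print(args.model_capacity, multiplier)  # tiny 4
```

Using `choices` means an unknown name fails fast with a usage message instead of reaching the model-loading code.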

Currently the PyPI archive is 57.7 MB, very close to the 60 MB limit, so only the tiny model can be added for immediate upload to PyPI. Requesting a quota increase on PyPI is possible, but the process seems quite difficult and it's not certain they would grant it.

We can alternatively put the models on a separate branch of this repo and have the code download them during installation or on first use.

Let me do the former first (adding tiny), and figure out how to add the other three later.
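The download-on-first-use alternative could look roughly like the following sketch. The branch name, file naming, and cache location are all assumptions for illustration, not decisions made in this thread:

```python
import os
import urllib.request

# Hypothetical raw-file URL on a 'models' branch of this repo (assumption).
MODEL_URL = 'https://raw.githubusercontent.com/marl/crepe/models/model-{capacity}.h5'
# Hypothetical per-user cache directory (assumption).
CACHE_DIR = os.path.join(os.path.expanduser('~'), '.crepe')

def model_path(capacity):
    """Return the local cache path for a given capacity name."""
    return os.path.join(CACHE_DIR, 'model-{}.h5'.format(capacity))

def ensure_model(capacity):
    """Download the model on first use; later calls hit the local cache."""
    path = model_path(capacity)
    if not os.path.exists(path):
        os.makedirs(CACHE_DIR, exist_ok=True)
        urllib.request.urlretrieve(MODEL_URL.format(capacity=capacity), path)
    return path
```

Caching under the user's home directory keeps the models out of the installed package, so each PyPI release stays small regardless of how many capacities exist.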

@justinsalamon (Collaborator)

This sounds like a good plan to me. A few things to note:

  1. Since we need a solution for all models, I'm not sure it makes sense to just add tiny to PyPI (this is not an urgent matter) - it feels messy. We might as well work directly on the final solution, which downloads all the models from the GH repo (perhaps still including the full model on PyPI and downloading the others from GH), and only then push to PyPI.
  2. We should ensure that the option to choose a smaller model is also available via the Python API, not just the CLI.
  3. Once all of the above is implemented, we should bump the version, create a new release on GH, and push to PyPI (using twine to ensure correct Markdown display on PyPI).

@jongwook (Member, Author)

  1. Once I add the implementation for handling multiple models, the updated Python code under /crepe/ can stay the same when we change the storage to GH - so I'd say it's a more gradual path toward the solution.

     I'm inclined to download all models from GH if possible, because including the same model in every release is a waste/abuse of PyPI storage.

  2. Sure; the CLI will call the appropriate Python API.

  3. Of course.

@justinsalamon (Collaborator)

OK, sounds good. Let me know if you need my help with any of this, cheers.
