
[FIX] ADR model size #14

Closed
ruohoruotsi opened this issue Apr 27, 2019 · 3 comments
Labels: bug (Something isn't working), enhancement (New feature or request)


@ruohoruotsi (Member)

The ADR model is too big.

  • The training script emits a ~200MB file, but most users cannot reasonably download 200MB, come on!
  • What accounts for the large size? My suspicion is that PyTorch is saving other data along with the weights/biases. Investigate and optimize the model size so that we can either store it locally (within GitHub's limits) or at least make it an easier download.
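One quick way to test the suspicion is to compare the serialized size of the full checkpoint against the model weights alone. This is a minimal sketch using a toy `nn.Linear` stand-in for the ADR model (the real model and checkpoint layout will differ):

```python
import io

import torch
import torch.nn as nn

# Toy stand-in for the ADR model (hypothetical; the real model is larger).
model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters())

# One training step so Adam accumulates its per-parameter state
# (exp_avg and exp_avg_sq), roughly tripling the tensor count.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()

def serialized_size(obj):
    """Return the number of bytes torch.save would write for obj."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    return buf.tell()

full_bytes = serialized_size({
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
})
weights_bytes = serialized_size(model.state_dict())
print(f"full checkpoint: {full_bytes} bytes, weights only: {weights_bytes} bytes")
```

If the gap between the two numbers is large, the optimizer state is the likely culprit.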

Help wanted:

@ruohoruotsi added the bug (Something isn't working) and enhancement (New feature or request) labels on Apr 27, 2019
@ruohoruotsi ruohoruotsi self-assigned this Apr 27, 2019
@ruohoruotsi (Member, Author)

Learning more about the PyTorch model structure, we see that the optimizer's state_dict is also saved, since it contains state and parameters that are updated as the model trains. This matters for checkpoint models used to resume training, but NOT for the final model, so we can drop that part before release.
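Dropping the optimizer state amounts to loading the checkpoint and re-saving only the model weights. A minimal sketch, assuming the training script saved a dict with `"model_state_dict"` and `"optimizer_state_dict"` keys (the key names and file paths here are hypothetical; adjust them to match the actual checkpoint):

```python
import os
import tempfile

import torch
import torch.nn as nn

# Toy model and optimizer standing in for the real ADR training setup.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

tmpdir = tempfile.mkdtemp()
ckpt_path = os.path.join(tmpdir, "checkpoint.pt")   # full training checkpoint
slim_path = os.path.join(tmpdir, "model_slim.pt")   # weights-only release file

# Full checkpoint as the training script might emit it.
torch.save({"model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict()}, ckpt_path)

# Strip the optimizer state: re-save only the model weights.
ckpt = torch.load(ckpt_path, map_location="cpu")
torch.save(ckpt["model_state_dict"], slim_path)

# The slim file is all that inference needs.
model.load_state_dict(torch.load(slim_path, map_location="cpu"))
model.eval()
```

The slim file loads directly with `load_state_dict`, so downstream inference code does not change.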

If you trained your model using Adam, you need to save the optimizer state 
dict as well and reload that. Also, if you used any learning rate decay, you need 
to reload the state of the scheduler because it gets reset if you don’t, and you 
may end up with a higher learning rate that will make the solution state oscillate. 
Finally, if you have any dropout or batch norm in your model architecture, and 
you saved your model after a test loop (in which case model.eval() was called), 
make sure to call model.train() before the training loop.
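The resume-training pattern described above can be sketched as follows; the model, optimizer, and scheduler here are placeholders, not the actual ADR training setup:

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")

# Save everything needed to resume: model, optimizer, and scheduler state.
torch.save({
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
    "scheduler_state_dict": scheduler.state_dict(),
}, ckpt_path)

# Resume: restore all three states, then switch back to train mode
# in case model.eval() was called during an earlier test loop.
ckpt = torch.load(ckpt_path, map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
optimizer.load_state_dict(ckpt["optimizer_state_dict"])
scheduler.load_state_dict(ckpt["scheduler_state_dict"])
model.train()
```

Restoring the scheduler state keeps the learning-rate decay where it left off instead of resetting it, which is exactly the oscillation hazard mentioned above.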

@ruohoruotsi (Member, Author)

Fixed in 5781370. The new model is 67MB, which is still bigger than the recommended 50MB, but at least it still pushes. I should keep track of the history and other sources of bloat in this repo.

@ruohoruotsi (Member, Author)

An alternative if you need large files on the web: host them outside the GitHub repo itself, somewhere still easily curl-able. It's a bit fiddly, though:

https://medium.freecodecamp.org/how-to-transfer-large-files-to-google-colab-and-remote-jupyter-notebooks-26ca252892fa
