200MB is what the training script emits, but normal people cannot be downloading 200MB haba!!
What is making it so large?? My suspicion is that PyTorch is also saving other data along with the weights/biases. Investigate and optimize the size of the model so that we can store it either locally (within GitHub's limits) or at least make it an easier download.
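A quick way to check that suspicion is to peek inside the checkpoint and see what each top-level entry weighs. This is only a sketch: the path `model_checkpoint.pt` and the assumption that the training script saved a dict-style checkpoint are mine, not taken from the actual script.

```python
import io
import torch

# Placeholder path; point this at whatever file the training script emits.
ckpt = torch.load("model_checkpoint.pt", map_location="cpu")

# Serialize each top-level entry separately to estimate its share of the file.
for key, value in ckpt.items():
    buf = io.BytesIO()
    torch.save(value, buf)
    print(f"{key}: ~{buf.getbuffer().nbytes / 1e6:.1f} MB")
```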
Learning more about the PyTorch checkpoint structure, we see that the optimizer's state_dict is also saved, since it holds state and parameters that are updated as the model trains. That matters for checkpoints used to resume training, but it is NOT needed for the final model, so we can drop that part before publishing (sketched below, after the quoted advice).
If you trained your model using Adam, you need to save the optimizer state_dict as well and reload that. Also, if you used any learning rate decay, you need to reload the state of the scheduler because it gets reset if you don't, and you may end up with a higher learning rate that will make the solution state oscillate. Finally, if you have any dropout or batch norm in your model architecture, and you saved your model after a test loop (in which case model.eval() was called), make sure to call model.train() before the training loop.
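For reference, a minimal sketch of stripping the optimizer state out of an existing checkpoint. The paths and the `"model_state_dict"` key are assumptions about how the training script structured the checkpoint; adjust them to match the real file.

```python
import torch

# Load the full training checkpoint (the ~200MB file).
ckpt = torch.load("model_checkpoint.pt", map_location="cpu")

# Keep only the model weights; the optimizer state is only needed to
# resume training, not for inference.
torch.save(ckpt["model_state_dict"], "model_final.pt")

# At inference time, load it back into a freshly constructed model, e.g.:
# model = MyModel()
# model.load_state_dict(torch.load("model_final.pt", map_location="cpu"))
# model.eval()
```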
Fixed in 5781370; the new model is 67MB, which is still bigger than the recommended 50MB, but at least it still pushes. I should keep track of the history and other sources of bloat in this repo.
An alternative if you want to use big files on the web: host them somewhere else that is easily curl-able, rather than within the GitHub repo itself. Appreciate the fiddliness:
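If the weights do end up hosted outside the repo, a minimal sketch of fetching them lazily at runtime. The URL and destination path here are placeholders, not an actual release asset:

```python
import urllib.request
from pathlib import Path

# Hypothetical location of the hosted model; replace with the real URL.
MODEL_URL = "https://example.com/releases/model_final.pt"
MODEL_PATH = Path("model_final.pt")

def ensure_model(url: str = MODEL_URL, dest: Path = MODEL_PATH) -> Path:
    """Download the model weights once and reuse the local copy afterwards."""
    if not dest.exists():
        print(f"Downloading model from {url} ...")
        urllib.request.urlretrieve(url, dest)
    return dest
```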
The ADR model is too big.
Help: