Reloading model and params from Checkpoint #51
34825ea
Thank you for the quick response. Traceback (most recent call last):
Ah yes, I also had this because I tried to reload, on a single GPU, a model trained on multiple GPUs. The problem in that case is that with multi-GPU training the model is encapsulated in a module (this is why you have all the extra `module.` prefixes in the state dict keys). See 34825ea#diff-e750911d9404a6f817e2015251a4a654R458
Thanks!!
This solved my issue: I had trained a TLM on multiple GPUs and was translating using just one GPU.
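The fix discussed above can be sketched as follows. This is a minimal sketch, not XLM's actual code: the helper name `strip_module_prefix` is an assumption for illustration. `torch.nn.DataParallel` wraps the model and prepends `module.` to every key in its `state_dict`, so those prefixes must be stripped before loading into an unwrapped single-GPU model.

```python
def strip_module_prefix(state_dict):
    """Remove the 'module.' prefix that torch.nn.DataParallel adds
    to every state-dict key when a model is trained on multiple GPUs.
    Keys without the prefix are left unchanged."""
    prefix = "module."
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Hypothetical usage when reloading on a single GPU:
#   checkpoint = torch.load(path, map_location="cpu")
#   model.load_state_dict(strip_module_prefix(checkpoint["model"]))
```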
Hi,
How can I reload the checkpoint and model files in order to continue from the last epoch reached in a previous (aborted) run? I want to do this in both the pretraining and training stages.
Thanks,
Odel
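A general PyTorch pattern for resuming an aborted run looks like the sketch below. This is a hedged illustration, not XLM's own reload mechanism: the function names `save_checkpoint`/`load_checkpoint` and the checkpoint dictionary layout (`epoch`, `model`, `optimizer` keys) are assumptions. The key idea is that both the model weights and the optimizer state must be saved and restored to continue training from where the run stopped.

```python
import torch


def save_checkpoint(path, model, optimizer, epoch):
    # Save everything needed to resume: weights, optimizer state, epoch index.
    torch.save({
        "epoch": epoch,
        "model": model.state_dict(),
        "optimizer": optimizer.state_dict(),
    }, path)


def load_checkpoint(path, model, optimizer):
    # Load onto CPU first so a multi-GPU checkpoint can be restored anywhere.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    # Resume training at the epoch after the last completed one.
    return ckpt["epoch"] + 1
```

The training loop would then start from the returned epoch instead of 0, so neither the pretraining nor the training stage repeats work already done.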