how to save and load model, optimizer and scheduler's state dictionary? #154
Comments
You should use
You create a brand new model, so you should pass it to the prepare method. Note that adding a checkpointing utility to Accelerate is on the roadmap, to make all of this easier.
Thanks, I was able to load the model with
Is this feature currently available?
It's under development in #255; we're hoping to have it merged next week.
Closed with #255! 🎉
How do I save and load the state dictionaries of a model, optimizer and scheduler that have gone through `accelerator.prepare()`?
For the model:
I used the unwrap function as described in the documentation.
However, I get the following error when loading the model:
model = MT5ForConditionalGeneration.from_pretrained(args.model_path, config=config)
For the optimizer and scheduler:
I am currently saving with `torch.save(optimizer.state_dict(), '/exp1/file.opt')`, but loading with `optimizer.load_state_dict(torch.load('exp1/file.opt'))` gives the error `RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable`.
Does
accelerator.unwrap(
work the same way as for a model?
Using `torch.save(scheduler.state_dict(), '/exp1/sch')` for saving and `scheduler.load_state_dict(torch.load('path'))` for loading is working.
EDITS: I updated the original issue with more details and exact error messages.
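The scheduler round trip that works can be mirrored for the optimizer. The sketch below is one plain-PyTorch way to do it. Passing `map_location="cpu"` to `torch.load` is an assumption about a possible cause of the CUDA error (tensors being restored straight onto a busy device), not a fix confirmed in this thread. The model, hyperparameters and temporary directory are hypothetical.

```python
import os
import tempfile

import torch


def build():
    """Create a small stand-in model with its optimizer and scheduler."""
    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)
    return model, optimizer, scheduler


model, optimizer, scheduler = build()

# One training step so the optimizer and scheduler hold non-trivial state.
model(torch.randn(3, 4)).sum().backward()
optimizer.step()
scheduler.step()

out_dir = tempfile.mkdtemp()  # hypothetical location
opt_path = os.path.join(out_dir, "file.opt")
sch_path = os.path.join(out_dir, "sch")
torch.save(optimizer.state_dict(), opt_path)
torch.save(scheduler.state_dict(), sch_path)

# Fresh objects; map_location="cpu" deserializes tensors onto the CPU
# instead of the device they were saved from.
_, new_optimizer, new_scheduler = build()
new_optimizer.load_state_dict(torch.load(opt_path, map_location="cpu"))
new_scheduler.load_state_dict(torch.load(sch_path, map_location="cpu"))
```

Loading onto the CPU first keeps `torch.load` from touching a GPU at all, which can help narrow down whether the error comes from deserialization or from something else in the setup.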