A new design of our Python save / load interfaces #8521
Comments
Related issue #7931
Just to add some requirements for inference:
For training, there may be other requirements, such as initializing some of the parameters from a pre-trained model and randomly initializing the rest.
After a discussion with @QiJune and @reyoung, we all agree that this design can be approximated via a few simple modifications of the current code:
It will hardly change the code structure.
Related issue: #7163
Issues
Currently, there are a few obvious issues in our program saving and loading interfaces:
save_params() and load_params() are useless and misleading. Some variables required for making checkpoints are not of type Parameter, so a model cannot continue its training or run inference after save_params() and load_params(). The correct way to make a checkpoint is to use save_persistables() and load_persistables().
save_vars() and load_vars() take an executor, build a temporary program, and then execute that program immediately. This means variable saving and loading are triggered by Python code, so we cannot make checkpoints or save parameters in an environment without Python.
Proposed Solution
Base functions
To fix the existing issues, we redesign our Python io module. The proposed new io module mainly consists of the following base functions:
save() and load() can be regarded as the layers of save_op and load_op. They don't execute immediately like the current load_vars() and save_vars(). They just append the save_op or load_op to the given program and leave the execution to the runtime.
By using these base functions, we can save our program at any stage of model configuration, or save and load any specific variable values at any phase of program execution.
Checkpoints
To make it more user-friendly, we can add some high-level wrappers for checkpoint-related functions:
A checkpoint consists of two parts: variables and a loader. A loader is a program. It acts like a startup program; the only difference between a loader and a regular startup program is that in a loader some variables may be initialized from an existing file instead of by initializer ops.
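As a rough sketch of the two-part checkpoint described above, the toy code below writes variable files and builds a matching loader, then runs the loader to initialize a fresh scope. The helper names `make_checkpoint` and `run_loader`, the dictionary op format, the JSON files, and the variable names (`fc_w`, `momentum_acc`, `step`) are hypothetical and chosen only for illustration. They are not the wrappers proposed in this issue.

```python
import json
import os
import tempfile


def run_loader(loader_ops, checkpoint_dir):
    """Toy runtime for a loader program: returns the initialized scope."""
    scope = {}
    for op in loader_ops:
        if op["type"] == "fill_constant":   # ordinary initializer op
            scope[op["var"]] = op["value"]
        elif op["type"] == "load":          # value comes from an existing file
            with open(os.path.join(checkpoint_dir, op["file"])) as f:
                scope[op["var"]] = json.load(f)
        else:
            raise ValueError("unknown op type: " + op["type"])
    return scope


def make_checkpoint(scope, var_names, checkpoint_dir):
    """Write the chosen variables to files and return the matching load ops."""
    loader_ops = []
    for name in var_names:
        file_name = name + ".json"
        with open(os.path.join(checkpoint_dir, file_name), "w") as f:
            json.dump(scope[name], f)
        loader_ops.append({"type": "load", "var": name, "file": file_name})
    return loader_ops


checkpoint_dir = tempfile.mkdtemp()
trained_scope = {"fc_w": [0.5, -1.0], "momentum_acc": [0.01, 0.02], "step": 100}

# Part 1 of the checkpoint: the variable files. Part 2: the loader program.
loader = make_checkpoint(trained_scope, ["fc_w", "momentum_acc"], checkpoint_dir)
# A variable that is not restored from a file still gets an initializer op,
# exactly as it would in a regular startup program.
loader.append({"type": "fill_constant", "var": "step", "value": 0})

resumed_scope = run_loader(loader, checkpoint_dir)
```

Mixing load ops and initializer ops in one loader also covers the training case raised in the comments, where some parameters come from a pre-trained model and the rest are freshly initialized.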
We can use the checkpoint as follows:
Inference Model Saving and Loading
Currently, we use Program.prune to cut the main program down to an inference model. However, the prune algorithm is complex and bug-prone. In recent discussions, we tend to leave the building of the inference model to users:
The key to getting an inference model is saving the main program and making checkpoints precisely before the optimizers.
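The idea of capturing the model before the optimizers, rather than pruning afterwards, can be sketched with a toy program object. The `Program` class below, its `append_op` and `clone` methods, the op names, and the variable names are illustrative assumptions and not the real framework classes. The sketch also assumes that optimizers introduce extra persistable variables of their own.

```python
import copy


class Program:
    """Toy program: an ordered op list plus the persistable variables it uses."""

    def __init__(self):
        self.ops = []
        self.persistables = []

    def append_op(self, op_type, creates=()):
        self.ops.append(op_type)
        self.persistables.extend(creates)

    def clone(self):
        return copy.deepcopy(self)


main = Program()

# Forward part of the model.
main.append_op("mul", creates=["fc_w"])
main.append_op("elementwise_add", creates=["fc_b"])
main.append_op("softmax")

# The user snapshots the program here, before any training-only op exists,
# so nothing has to be pruned away later.
inference_program = main.clone()

# Training-only part: loss, backward and optimizer ops (names illustrative).
main.append_op("cross_entropy")
main.append_op("mean")
main.append_op("backward")
main.append_op("sgd", creates=["learning_rate"])
```

In this toy model a checkpoint taken at the snapshot point would cover only `fc_w` and `fc_b`, while the full training program also carries `learning_rate`.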