Task checkpointing #3
Task checkpointing is mostly done, though the implementation is admittedly hacky. Implementing it as a context manager required some unconventional techniques. For one, to distinguish between two checkpoints within the same scope, the checkpoints have to recognize that they are distinct. This was done by looking at the line number and the filename where the checkpoint was used. Finding out where the checkpoint was used requires that the checkpoint look down the stack. For more assurance, the checkpoint looks all the way down to the main module. For the db encoding, a hash function is used to generate a unique filename.
Additionally, the checkpoint needs to save the namespace wherever it was called. As in the previous case, it has to look down the stack to the parent frame, this time at the locals in that frame. Saving the namespace only requires being able to read the locals, while loading the namespace requires modifying the locals of the parent frame.
Lastly, when loading a checkpoint, the with-block has to be skipped (since that's the point of a checkpoint). PEP 377 specifically addresses this request but was rejected. There is a hack for this, but my tests have shown that it only works in Python 2. Currently, the implementation requires that the user write their code in the form:
with checkpoint() as cp:
    cp.skip_with()  # skip this block if we can load the checkpoint
    # actual code here
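For readers wondering how an explicit skip_with() call can leave the block without frame tricks, here is a minimal sketch of one possible approach. It is not the implementation described in this issue. The names _SkipBlock and already_saved are hypothetical, and the flag stands in for "a checkpoint exists on disk". The idea is that skip_with() raises a private exception and __exit__ suppresses only that exception.

```python
class _SkipBlock(Exception):
    """Private signal used to leave a with-block early (hypothetical name)."""


class Checkpoint:
    def __init__(self, already_saved):
        # `already_saved` is an assumption standing in for a real on-disk check.
        self.already_saved = already_saved

    def __enter__(self):
        return self

    def skip_with(self):
        # Abort the rest of the block when the saved state can be loaded instead.
        if self.already_saved:
            raise _SkipBlock()

    def __exit__(self, exc_type, exc, tb):
        # Returning True suppresses the exception; only our own signal is swallowed,
        # so genuine errors raised inside the block still propagate.
        return exc_type is not None and issubclass(exc_type, _SkipBlock)


ran = []
with Checkpoint(already_saved=True) as cp:
    cp.skip_with()
    ran.append("first")   # never reached
with Checkpoint(already_saved=False) as cp:
    cp.skip_with()
    ran.append("second")  # runs normally
print(ran)  # ['second']
```

This relies only on the documented rule that a true return value from __exit__ suppresses the exception, which is why the user has to call skip_with() as the first statement of the block.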
There is a minor issue with variables that are created outside of with-blocks. Call these runtime variables. Loading a checkpoint will overwrite the runtime variables with their historic state. This can be mitigated by keeping track of whether a variable is a runtime variable or a static variable (a variable defined within a with-block). This bookkeeping can be done within the checkpoint class and must persist across all the checkpoints, wherever they are called. We'd have to look at the function of the parent frame then. Fortunately, this is possible... with more hacks.
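The bookkeeping can be illustrated on plain dictionaries. This is only a sketch under assumptions: the helper names static_names_of and restore are hypothetical, and real code would take the namespaces from frame locals rather than from hand-written dicts. Names that first appear inside the with-block are treated as static, and only those are restored.

```python
def static_names_of(before, after):
    """Names that first appeared inside the with-block (the 'static' variables)."""
    return set(after) - set(before)


def restore(current, saved, static_names):
    """Copy saved values into a new namespace, touching only static names."""
    merged = dict(current)
    for name in static_names:
        if name in saved:
            merged[name] = saved[name]
    return merged


# First run: `retries` exists before the block, `result` is created inside it.
before_block = {"retries": 3}
after_block = {"retries": 3, "result": 42}
static = static_names_of(before_block, after_block)

# Second run: the user edited `retries` to 5 before the checkpoint is loaded.
merged = restore({"retries": 5}, after_block, static)
print(merged)  # {'retries': 5, 'result': 42}
```

The runtime variable keeps its new value, while the variable computed inside the block comes back from the saved state.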
For anyone wondering, these hacks mostly use the |
The checkpoint module allows a long-running task to resume execution after exiting, whether by interruption or by exception. This module must allow for simple integration with existing scripts.
The implementation will be as follows:
The __enter__ method checks whether it should create a checkpoint, load a checkpoint, or skip.
__enter__ method
The checkpoints saved to disk must be encoded such that each checkpoint is easily distinguishable. At the moment, the following metadata will be looked at:
Note that if the script is still under development, the above metadata may change.
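As an illustration of how such metadata could become a filename, the sketch below hashes the filename and line number of every frame from the caller outward. It assumes those two fields are the metadata of interest. The function name checkpoint_key and the choice of SHA-256 are assumptions, not details taken from the implementation.

```python
import hashlib
import inspect


def checkpoint_key():
    """Hash the (filename, line number) pairs from the caller out to the outermost frame."""
    frame = inspect.currentframe().f_back  # start at whoever called us
    parts = []
    while frame is not None:
        parts.append(f"{frame.f_code.co_filename}:{frame.f_lineno}")
        frame = frame.f_back
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def two_checkpoints():
    first = checkpoint_key()
    second = checkpoint_key()  # different line, so a different key
    return first, second


first, second = two_checkpoints()
print(first != second)  # True
```

Because line numbers are part of the key, editing the script above a checkpoint changes its key, which is the caveat raised about scripts still under development.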
Each checkpoint can be cleaned up in the __exit__ method, removing the need for a dedicated cleanup function, but there may be a case where you have code not encapsulated by checkpoints that raises an exception. The above template allows you to restore the last saved state.
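Restoring the last saved state presupposes that the namespace was written to disk at some point. The following is a rough sketch of the saving half only, assuming pickle as the storage format and assuming that values which cannot be pickled are simply skipped. SavingCheckpoint, picklable_items and load_state are hypothetical names. Writing values back into the parent frame's locals is deliberately left out, since that is the part that needs the hacks discussed in the comments.

```python
import inspect
import os
import pickle
import tempfile


def picklable_items(namespace):
    """Return the subset of a namespace that pickle can serialize."""
    kept = {}
    for name, value in namespace.items():
        if name.startswith("__"):
            continue  # skip module dunders such as __builtins__
        try:
            pickle.dumps(value)
        except Exception:
            continue  # not picklable, so leave it out of the checkpoint
        kept[name] = value
    return kept


class SavingCheckpoint:
    """Hypothetical: saves the caller's locals when the block finishes cleanly."""

    def __init__(self, path):
        self.path = path
        self._frame = None

    def __enter__(self):
        # The frame that contains the with-statement.
        self._frame = inspect.currentframe().f_back
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is None:
                state = picklable_items(dict(self._frame.f_locals))
                with open(self.path, "wb") as fh:
                    pickle.dump(state, fh)
        finally:
            self._frame = None  # drop the frame reference
        return False  # never swallow exceptions raised in the block


def load_state(path):
    """Read a saved namespace back as a plain dict."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def long_task(path):
    with SavingCheckpoint(path):
        total = sum(range(5))
    return total


demo_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pkl")
long_task(demo_path)
print(load_state(demo_path)["total"])  # 10
```

Only reading the locals is needed here, which matches the observation that saving is the easy direction.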