Implement a checkpointing scheme #17

Open
LivInTheLookingGlass opened this issue Jul 4, 2021 · 0 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@LivInTheLookingGlass
Owner

Potential idea for a general case, assuming no external I/O.

Provide a context object and wrap the requested job in a function that starts a helper thread. This helper thread should attempt to serialize this context object.

  • the job reports the object is busy by locking it
  • the job opens files via this object, whose states are copied/reflinked at the same frequency as program state
  • the serializer thread does this every 30s if no I/O is active, or every 120s while I/O is active
  • serializer thread stores the last {configurable} snapshots
  • serializer thread stores this with name job_{number}.snapshot_{number}.bin
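
A minimal sketch of the scheme above, assuming the project is in Python and the job state is picklable. The names `CheckpointContext`, `run_with_checkpoints`, `io_active`, and `keep` are hypothetical, chosen only for illustration; the file copy/reflink step is left out.

```python
import pickle
import threading
from pathlib import Path


class CheckpointContext:
    """Hypothetical context object: job state plus a lock that marks it busy."""

    def __init__(self, job_number, directory, keep=3):
        self.job_number = job_number
        self.directory = Path(directory)
        self.keep = keep              # number of snapshots to retain (configurable)
        self.state = {}               # picklable job state lives here
        self.lock = threading.Lock()  # the job holds this while mutating state
        self.io_active = False        # the job sets this around file I/O
        self._counter = 0

    def _path(self, index):
        return self.directory / f"job_{self.job_number}.snapshot_{index}.bin"

    def snapshot(self):
        """Serialize the state, then delete the snapshot that fell out of the window."""
        with self.lock:  # blocks while the job reports the object as busy
            payload = pickle.dumps(self.state)
        index = self._counter
        self._counter += 1
        self._path(index).write_bytes(payload)
        expired = index - self.keep
        if expired >= 0:
            self._path(expired).unlink(missing_ok=True)  # Python 3.8+


def run_with_checkpoints(job, ctx, idle_interval=30.0, io_interval=120.0):
    """Run job(ctx) while a helper thread snapshots ctx periodically."""
    done = threading.Event()

    def serializer():
        while True:
            interval = io_interval if ctx.io_active else idle_interval
            if done.wait(interval):  # True once the job has finished
                return
            ctx.snapshot()

    helper = threading.Thread(target=serializer, daemon=True)
    helper.start()
    try:
        return job(ctx)
    finally:
        done.set()
        helper.join()
```

The job would wrap each state update in `with ctx.lock:` so the serializer never pickles a half-updated object. Taking the lock only around `pickle.dumps` keeps the job from being blocked by disk writes, at the cost of holding one extra copy of the state in memory.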

Open questions:

  • is the random object snapshottable?
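
If "the random object" means an instance of Python's standard `random.Random` (an assumption; the issue does not say), its state can be captured with `getstate()` and restored with `setstate()`, and the state tuple is picklable. A small sketch:

```python
import pickle
import random

rng = random.Random(1234)
rng.random()  # advance the generator so the snapshot is mid-stream

# Capture the generator state as bytes, as a checkpoint would.
saved = pickle.dumps(rng.getstate())
expected = [rng.random() for _ in range(3)]

# Restore into a fresh generator and replay from the snapshot.
restored = random.Random()
restored.setstate(pickle.loads(saved))
replayed = [restored.random() for _ in range(3)]

print(replayed == expected)
```

This covers only generators the job holds explicitly. Other sources of randomness, such as `os.urandom` or a third-party generator, would need separate handling.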

A better world would let me do this using this package, but I'm not sure that you can use it this way given their API. I'd like this to be as non-invasive as possible.

@LivInTheLookingGlass added the enhancement and help wanted labels on Jul 4, 2021