Automatic checkpointing #106

Closed
annawoodard opened this Issue Mar 6, 2018 · 2 comments

Collaborator

annawoodard commented Mar 6, 2018

Checkpointing is currently manually triggered by the user. It would be convenient if users could specify automatic checkpointing at a given frequency and/or after each task completes.

As requested by @djf604.

yadudoc self-assigned this Mar 7, 2018

yadudoc added this to the Parsl-0.5.0 milestone Mar 7, 2018


Contributor

yadudoc commented Mar 7, 2018

From the discussion with @djf604, this should have priority, given that user-triggered checkpointing has limited usability. I can start adding functionality to support this with the following modes (copied over from the previous issue, #43):

  1. (eager) checkpoint at the completion of every task,
  2. (lazy) checkpoint at configurable intervals (say ~1hr), or
  3. (atExit) write out a checkpoint when the DFK exits, capturing early termination from app failures,
    uncaught exceptions, keyboard interrupts, etc. This won't work when the DFK is killed abruptly (SIGKILL), the machine is powered off, etc.
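
To illustrate the atExit mode and its limits, here is a minimal sketch using only the Python standard library. It is not Parsl's implementation: `completed`, `write_checkpoint`, `load_checkpoint`, and `enable_exit_checkpoint` are hypothetical names chosen for this example, and the checkpoint is assumed to be a plain picklable dict of finished task results.

```python
import atexit
import os
import pickle
import tempfile

# Hypothetical in-memory table of finished task results.
# This is an assumption for illustration, not Parsl's real data structure.
completed = {}


def write_checkpoint(path, table):
    """Pickle `table` to `path` by writing a temp file and renaming it.

    The rename means a crash in the middle of a write leaves the previous
    checkpoint file intact instead of a half-written one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(table, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Read back a table written by write_checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)


def enable_exit_checkpoint(path, table):
    """Register a checkpoint write for interpreter shutdown.

    atexit handlers run on normal exit, including exit caused by an uncaught
    exception or KeyboardInterrupt. They do not run when the process is
    killed by a signal Python does not handle (SIGKILL, or SIGTERM with no
    handler installed), on os._exit(), or when the machine loses power.
    """
    atexit.register(write_checkpoint, path, table)
```

Because the exit hook cannot cover abrupt termination, it would most likely be combined with one of the other two modes rather than used alone.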

Collaborator

annawoodard commented Mar 7, 2018

@yadudoc Sounds good to me. I vote for a minor tweak to the naming scheme. I would guess without documentation that "lazy checkpointing is only performed when it is needed", not "lazy checkpointing is performed at regular intervals". Here's a possibility:

checkpoint=None         # default, no checkpointing
checkpoint='periodic'   # at configurable intervals
checkpoint='task_exit'  # after each task completes
checkpoint='dfk_exit'   # when the DFK exits