[air] pyarrow.fs persistence: Don't automatically delete the local checkpoint #38507
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
I love this change.
Previously, users had to create a temporary directory, copy their existing checkpoints into it, and report that directory to Train. For a large checkpoint, that extra copy is unnecessary and takes a long time.
Ah yeah I didn't even think about that! Saves 1 copy step if they already have a dir created by some integration library.
Thank you @justinvyu!
Why are these changes needed?
This PR updates the persist_current_checkpoint logic so that it no longer deletes the user's input checkpoint after uploading it. Cleaning up the local checkpoint is now the user's responsibility, if they want it removed. This is in line with making things more explicit.
The new pattern for reporting a checkpoint (that gets cleaned up automatically) is:
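A minimal, standard-library-only sketch of that pattern, under stated assumptions: `report_checkpoint` is a hypothetical stand-in for the upload step (it is not Ray's API), and `storage_root` is a local directory playing the role of persistent storage. The checkpoint is written into a temporary directory and reported from inside the `with` block, so the cleanup comes from the context manager rather than from the persistence logic.

```python
import shutil
import tempfile
from pathlib import Path


def report_checkpoint(checkpoint_dir: str, storage_dir: str) -> Path:
    """Hypothetical upload step: copies the checkpoint and never deletes the source."""
    destination = Path(storage_dir) / "checkpoint_000000"
    shutil.copytree(checkpoint_dir, destination)
    return destination


# Local directory standing in for persistent storage (an assumption for this sketch).
storage_root = tempfile.mkdtemp()

with tempfile.TemporaryDirectory() as tempdir:
    # Write the checkpoint contents into the temporary directory.
    (Path(tempdir) / "model.txt").write_text("weights")

    # Report while the directory still exists; the source is left untouched.
    persisted = report_checkpoint(tempdir, storage_root)
    source_survives_report = Path(tempdir).exists()

# Leaving the with block removes the local checkpoint directory.
source_cleaned_up = not Path(tempdir).exists()
persisted_files = sorted(p.name for p in persisted.iterdir())

# Remove the stand-in storage so the sketch leaves nothing behind.
shutil.rmtree(storage_root)
```

A user who already has a checkpoint directory written by an integration library could pass that directory directly and skip the temporary directory, at the cost of cleaning it up themselves.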
Related issue number
Checks
I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
If I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.