checkpoints: completely remove checkpoint outs on exp run --reset
#5586
@dberenbaum I tested the new behavior against some of the toy checkpoints repos I have on hand (in addition to the CI tests), but would appreciate it if you could double-check with a couple of your own repos as well.
dberenbaum
left a comment
If there are checkpoints prior to HEAD, --reset will not drop the existing checkpoints, but it will create a new experiment from scratch and duplicate those checkpoints. Is it important that --reset drops those checkpoints to avoid duplication? Will it seem inconsistent that checkpoints are dropped when based off HEAD but otherwise are not?
Experiments (checkpoints or not) are always considered unique per parent HEAD commit. Experiments derived from any other commit are considered separate, so we don't check whether a matching experiment is tied to any other commit in the entire Git history. (But if an identical run does exist, we will still re-use the run-cache for that experiment so execution would not be duplicated, and we would also re-use identical cache objects like with anything else in DVC.) As it stands right now, checkpoints behave the same way as standalone experiments, making
dberenbaum
left a comment
I'm fine with the behavior as is, but we might have to revisit this workflow if users get confused. Both this and #5567 are related to running experiments with different Git histories but matching dvc-tracked info.
#5567 is a bug that is not related to checkpoints.
e4c37ff to 2c57a77
I have followed the Contributing to DVC checklist.
If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible.
Will close #5553.
Docs PR: treeverse/dvc.org#2286
Adjusts `dvc exp run --reset` behavior for `checkpoint` outs:
- When `--reset` is used, any `checkpoint` outs will be removed entirely before running the experiment (regardless of whether or not a hash for the given out exists in `dvc.lock`).
- When `--reset` is used, all other (non-checkpoint) outs in `dvc.lock` will be unchanged.
  - Previously, `--reset` would reset the entire `dvc.lock` file to HEAD (i.e. `git checkout HEAD -- dvc.lock`).
- `--queue` now explicitly implies `--reset`, and `--reset` is now mutually exclusive with `--rev`.
  - Previously, `--reset` could be used in conjunction with `--rev`.
  - `--rev` would do the equivalent of `git reset --hard <rev>` (so the old behavior would do the same thing with or without `--reset`).
- Docs for `--queue`/`--run-all` with regard to checkpoints have also been clarified.
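The rules above can be sketched as a small standalone model. This is an illustration only, not DVC's implementation: `Out`, `effective_reset`, and `reset_checkpoint_outs` are hypothetical names, the lock file is modeled as a plain dict of path to hash, and the workspace as a set of paths. How `--queue` combined with `--rev` is handled is not stated in this description, so the sketch only rejects an explicit `--reset` with `--rev`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Out:
    """Hypothetical stand-in for a stage output (not DVC's real class)."""
    path: str
    checkpoint: bool = False


def effective_reset(reset: bool, queue: bool, rev: Optional[str]) -> bool:
    """Return whether a reset should happen, given the CLI flags."""
    # Assumption: only an explicit --reset is checked against --rev here.
    if reset and rev is not None:
        raise ValueError("--reset and --rev are mutually exclusive")
    # --queue implies --reset.
    return reset or queue


def reset_checkpoint_outs(
    lock: Dict[str, str], outs: List[Out], workspace: Set[str]
) -> Dict[str, str]:
    """Remove checkpoint outs entirely; leave all other lock entries as-is."""
    new_lock = dict(lock)
    for out in outs:
        if out.checkpoint:
            # Removed whether or not a hash for this out existed in the lock.
            new_lock.pop(out.path, None)
            workspace.discard(out.path)
    return new_lock


outs = [Out("model.pt", checkpoint=True), Out("metrics.json")]
lock = {"metrics.json": "abc123"}  # no hash recorded for model.pt yet
workspace = {"model.pt", "metrics.json"}

if effective_reset(reset=True, queue=False, rev=None):
    new_lock = reset_checkpoint_outs(lock, outs, workspace)
```

In this model, `model.pt` is dropped from the workspace even though it had no entry in the lock, while the `metrics.json` entry is left untouched.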