Skip to content

Conversation

@pmrowla
Copy link
Contributor

@pmrowla pmrowla commented Mar 12, 2021

Thank you for the contribution - we'll try to review it as soon as possible. πŸ™

Will fix #5567

@pmrowla pmrowla added the bugfix fixes bug label Mar 12, 2021
@pmrowla pmrowla self-assigned this Mar 12, 2021
@pmrowla pmrowla requested a review from skshetry March 12, 2021 06:46
if checkpoint:
raise CheckpointExistsError(ref_info.name)
raise ExperimentExistsError(ref_info.name)
new_rev = orig_rev
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the new run generates an identical commit to an existing one, we should be reusing the existing commit (this logic was already happening for tempdir runs during git fetch, but not for workspace runs where we commit directly into the main git/dvc workspace)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmrowla, is it possible to find out before even running this, that the state is the same as before (might be problematic with non-deterministic stages probably).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's difficult because we also include git repo/workspace modifications with experiments (and not just DVC dependencies). So while two pipeline runs might be identical (so they get hashed into matching stages with matching DVC-tracked deps/outs), there could be other changes in the repo that show up in git but not to DVC. So we have to also generate the final git commit and then see if that actually conflicts/diffs with the previous run.

If there are git differences, we can't really tell which experiment should be preferred (so we would error out and then require running with -f/--force to overwrite the existing one)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yeah, also as you noted, I'm not sure we can rely on checkpoint stages to be deterministic, since they are persist outputs and it's not necessarily guaranteed that the user's code will always generate the identical sequence of checkpoints

Copy link
Collaborator

@skshetry skshetry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I have made some minor comments inline.

@pmrowla pmrowla merged commit d365e6d into treeverse:master Mar 12, 2021
@pmrowla pmrowla deleted the 5567-workspace-duplicate branch March 12, 2021 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix fixes bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

exp run: multiple error messages

2 participants