dvc exp run --temp: dvc-tracked dependencies are not checked out #10056
Comments
This is expected behavior. Checking out the dependency when it does not exist at all (but is tracked in the .dvc file) is handled as a special case, rather than assuming you intended to commit the deletion of that dependency.
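The rule described in this comment can be sketched as a toy model. This is only an illustration of the behavior as stated in this thread, not DVC's actual implementation; the function name `resolve_dependency` and its parameters are hypothetical.

```python
from typing import Optional


def resolve_dependency(workspace: Optional[bytes], cached: bytes) -> bytes:
    """Toy model: the content a --temp run sees for one DVC-tracked dependency.

    `workspace` is the file content in the user's workspace, or None if the
    file does not exist there. `cached` is the content recorded via the .dvc
    file. Both names are assumptions made for this sketch.
    """
    if workspace is None:
        # Special case: a missing file is restored from the cache instead of
        # being treated as an intentional deletion.
        return cached
    # Any file that exists is taken as-is, even if it differs from the cache
    # (for example, an empty file created with `touch`).
    return workspace
```

Under this model, a missing `file.txt` yields the cached "foo", while an empty `file.txt` is passed through unchanged, which matches the three outcomes listed in the bug report.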
Well, if it's expected, OK. I do find it confusing though, especially the difference in behaviour when you compare what happens if
You also state here in your docs that "Git-ignored files/dirs are excluded from queued/temp runs", and this is a git-ignored file, so I did expect it to be in fact excluded.
https://dvc.org/doc/user-guide/experiment-management/running-experiments#how-are-experiments-isolated
We can clarify that in the docs, but that really means "git-ignored files which are also not tracked by DVC" (since all DVC-tracked files are git-ignored).
If you had
I agree with @ralfbanisch on this one. So all in all, imagine you have a
To me |
The distinction is that DVC runs the
This is consistent with what happens if you have unstaged changes in any other git-tracked file. The experiment will contain any unstaged and uncommitted changes to git-tracked files and will also contain any unstaged and uncommitted changes to DVC-tracked files (with the caveat that the DVC tracked file itself is used as the source of truth and not the
Let's say you have a git tracked
We used to have a
cc @dberenbaum
Ok, so I agree that if I make changes to the git-tracked files
I think the confusion here is in case of the dvc-tracked
It is good to understand now why it happens like this (because of dvc commit under the hood) and I can work around it; however, from the user perspective I still find it confusing.
@ralfbanisch Could you explain your actual workflow? The minimal example is great, but this is now more of a product discussion where your real use case is more important. How did the files get to be in a modified/missing state in the workspace, and why do you want to ignore whatever changes you made there? |
@pmrowla The surprising part to me is that it works differently without |
For me the behaviour is actually the same without The use case for me was to generate data on model performance as a function of dataset size in an ablation experiment.
Since I never modified |
My mistake there. I misread the issue.
In general, the expectation is that you manage the actual data, and dvc manages the
@pmrowla What happens here when |
Bug Report
Issue name
dvc exp run --temp: dvc-tracked dependencies are not checked out
Description
dvc exp run --temp will not dvc checkout the dependency file.txt from file.txt.dvc if file.txt has been modified.
Reproduce
code/test/py just prints the contents of file.txt, and file.txt contains the single line "foo". file.txt is gitignored, all other files are git tracked. Content of dvc.yaml is:

dvc init && dvc add data/file.txt && git add data/file.txt.dvc data/.gitignore
dvc exp run --dry
dvc exp run --temp -> prints "foo", as expected.
rm data/file.txt && dvc exp run --temp -> checks out file.txt and prints "foo" as expected.
touch data/file.txt && dvc exp run --temp -> fails to dvc checkout file.txt and runs with empty file.txt instead.
Expected
dvc exp run --temp should not copy file.txt to the temporary folder, since it is not git tracked, and should dvc checkout file.txt from the local cache, just as it does when file.txt is not present at all.
Environment information
dvc==3.27.0
Output of dvc doctor:
Additional Information (if any):
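The report describes the stage script only as something that "just prints the contents" of file.txt. A minimal sketch of such a script might look like the following; the function name `print_contents` and the default path `data/file.txt` are assumptions taken from the commands in the reproduction steps, not from the reporter's actual code.

```python
from pathlib import Path


def print_contents(path: str = "data/file.txt") -> str:
    """Print the dependency's content and return it (hypothetical stage script body)."""
    text = Path(path).read_text()
    # An empty file prints nothing, which is how the third reproduction step
    # would show up: the run succeeds but the output is blank instead of "foo".
    print(text, end="")
    return text
```

Calling `print_contents()` from the stage command would print "foo" when the cached version of the file is present and nothing when the empty file created by `touch` is used instead.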