-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
diff: use RepoTree to compare directory contents #4518
Conversation
dvc/repo/diff.py
Outdated
# if dir hash is missing from cache, and no remote to pull it from, | ||
# there is nothing we can do here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we print a warning or at least a debug message?
|
||
|
||
def _output_paths(repo): | ||
on_working_tree = isinstance(repo.tree, LocalTree) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does on_working_tree
matter anymore though? RepoTree
will just use repo.tree
and will act accordingly if it is a wtree or a git rev tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the issue here is that DvcTree (and RepoTree) does not re-compute dir output hashes when the workspace is dirty. I'm not sure that we would want to change that behavior just for dvc diff
, it seems like in a majority of scenarios we would want to re-use the already computed hashes for dir outputs when we are using DvcTree and RepoTree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's actually an inconsistency that should be solved in RepoTree. Ideally, we should be able to work with both dirty state and clean repos. But could start with the current approach too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #4523 for this issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general, but I am not sure if this is a good thing, as output could get quite lengthy.
5d40d76
to
7660513
Compare
β I have followed the Contributing to DVC checklist.
π If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. π
Partially addresses #2982.
dir/
added/modified/deleted status)RepoTree(stream=True)
This does not implement pulling all uncached file contents by default, or the file size related tasks.