get: refactor, clean up and fix dvc get implementation #2925

Merged: 1 commit, Dec 10, 2019


Contributor

@Suor Suor commented Dec 9, 2019

Things changed:

  • both no output and no git file result in PathMissingError now
  • fail on missing cache or cache download error loudly
  • removed sqlite bypass, since we use nolock now anyway

Refactoring:

  • streamlined the logic and made it similar to that of Repo.open()
  • moved exclusive exception classes to dvc.repo.get
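
A minimal sketch of the first two behaviour changes listed above, not the actual dvc.repo.get code. The function `get` and its parameters `outs`, `git_files`, and `fetch_cache` are simplified, hypothetical stand-ins for DVC internals; only the name PathMissingError and its message text come from this PR.

```python
class PathMissingError(Exception):
    """Raised when a path is neither a DVC output nor a git-handled file."""

    def __init__(self, path, repo_url):
        super().__init__(
            f"The path '{path}' does not exist in the target repository "
            f"'{repo_url}' neither as an output nor a git-handled file."
        )
        self.path = path
        self.repo_url = repo_url


def get(repo_url, path, outs, git_files, fetch_cache):
    """Toy lookup order: DVC output first, then git file.

    `outs` maps output paths to cache keys, `git_files` is a set of paths,
    and `fetch_cache` is a callable that may raise. All are illustrative.
    """
    if path in outs:
        # Let any cache or download error propagate instead of hiding it.
        return fetch_cache(outs[path])
    if path in git_files:
        return f"git:{path}"
    # "No output" and "no git file" both end up in the same error.
    raise PathMissingError(path, repo_url)
```

In this sketch a failing `fetch_cache` surfaces its own exception, so a broken download is never reported as a missing path.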

Things are more complicated here than expected. Still have some issues, things to decide:

  • FileMissingError has bad message
  • should we unify FileMissingError and PathMissingError?
  • should we distinguish between?
    • neither out nor path is found
    • found external out with abs path doesn't use cache
    • abs path for git file
    • cache for out is missing
    • cache download failed for an arbitrary reason
  • how should we treat dvc get scheme://repo /an/absolute/git/path when there is no out:
    • strip leading /
    • show PathMissingError
    • show PathMissingError with a hint "did you mean "

P.S. Silent errors in DataCloud.pull() and RemoteBase.download() are still an issue.

@Suor Suor requested a review from efiop December 9, 2019 18:07
@efiop
Member

efiop commented Dec 9, 2019

> FileMissingError has bad message
> should we unify FileMissingError and PathMissingError?

The original agreement was to not unify them for now. "File missing" is acceptable for the API, but can be improved.

> how should we treat dvc get scheme://repo /an/absolute/git/path when there is no out:

show PathMissingError because we can't strip automatically, as we support external outs. Hint might be good, though not necessary for this implementation.
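
A rough sketch of the optional hint, under the assumption that it is only appended to the message and the path itself is never rewritten. The function name `missing_path_message` and the hint wording are hypothetical; the PR leaves the exact text open.

```python
import posixpath


def missing_path_message(path):
    """Build an error message, adding a hint for absolute paths.

    The path is not stripped automatically, because an absolute path
    may legitimately refer to an external output.
    """
    msg = f"The path '{path}' does not exist in the target repository."
    if posixpath.isabs(path):
        relative = path.lstrip("/")
        # Hypothetical hint wording, for illustration only.
        msg += f" Did you mean '{relative}'?"
    return msg
```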

> should we distinguish between?
>
>   • neither out nor path is found
>   • found external out with abs path doesn't use cache
>   • abs path for git file

This one should not be supported.

>   • cache for out is missing
>   • cache download failed for an arbitrary reason

if it failed, then there is no cache so cache for out is missing. At least for now.

We do distinguish by providing an appropriate cause exception, at least in Repo.open().
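
For illustration, the cause-exception approach could look roughly like the following, using Python's `raise ... from ...`. This is a sketch under stated assumptions, not the Repo.open() implementation: `CacheMissing`, `DownloadFailed`, `open_out`, and the message text are hypothetical, and `FileMissingError` here is a stand-in class rather than DVC's real one.

```python
class FileMissingError(Exception):
    """Stand-in for the error surfaced to API users; not DVC's real class."""


class CacheMissing(Exception):
    """Hypothetical: the cache entry is absent locally and on the remote."""


class DownloadFailed(Exception):
    """Hypothetical: the transfer from the remote failed."""


def open_out(path, load):
    """Call `load(path)` and wrap known failures, keeping the root cause."""
    try:
        return load(path)
    except (CacheMissing, DownloadFailed) as exc:
        # `from exc` stores the original error in __cause__, so callers
        # can still tell the two situations apart.
        raise FileMissingError(f"Can't find {path}") from exc
```

A caller that cares about the difference can inspect `err.__cause__`; one that does not can simply catch `FileMissingError`.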

@Suor
Contributor Author

Suor commented Dec 9, 2019

> show PathMissingError because we can't strip automatically, as we support external outs

We already know there is no output with this path, so this is either a typo in an external out path or an erroneous use of an absolute path for a git file.

> if it failed, then there is no cache so cache for out is missing. At least for now.

I can't find where that happens; as far as I can see, both cases are silent. If we raise there, it will be OK.

EDIT. Ah, yeah, there is a warning on missing cache, but will we then get a checkout error?
EDIT2. A download failure raises an actual exception, so we won't get PathMissingError, which is good. On missing cache we will, though, so the output would look like:

Some of the cache files do not exist neither locally 
nor on remote. Missing cache files: ...

The path ... does not exist in the target repository ...
 neither as an output nor a git-handled file.

Neither message is great:

  • the first doesn't say that the cache is missing in the external repo's upstream, and the "locally" part is not relevant at all
  • the path ... does exist; the issue is the missing cache

@Suor
Contributor Author

Suor commented Dec 10, 2019

The meta takeaway: we have persistent disconnects between what is actually happening and the error message shown. I think the reason is that error message texts live separately from the error handling code.
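
One possible way to keep message text next to the error it describes, sketched with entirely hypothetical class names (`RepoGetError`, `PathMissing`, `UpstreamCacheMissing`) and message wording. Each exception class owns its template, so the code that detects a condition picks the message simply by picking the class.

```python
class RepoGetError(Exception):
    """Hypothetical base class: each subclass owns its message template."""

    template = "'{path}': unknown error in '{repo_url}'"

    def __init__(self, path, repo_url):
        self.path = path
        self.repo_url = repo_url
        super().__init__(self.template.format(path=path, repo_url=repo_url))


class PathMissing(RepoGetError):
    template = (
        "'{path}' is neither an output nor a git-handled file in '{repo_url}'"
    )


class UpstreamCacheMissing(RepoGetError):
    template = (
        "'{path}' is an output of '{repo_url}', but its cache is missing upstream"
    )
```

With this layout, a missing-cache condition cannot accidentally be reported with the "path does not exist" text, because the two messages belong to different classes.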

@Suor
Contributor Author

Suor commented Dec 10, 2019

So, should we merge it or do we need some clarifications?

@efiop efiop merged commit a9e829a into iterative:master Dec 10, 2019
@efiop
Member

efiop commented Dec 10, 2019

@Suor Merged in a fast-lane mode to unblock import changes. Thank you! 🙂

@Suor Suor added this to In progress in DVC Sprint 17 Dec 2019 - 31 Dec 2019 via automation Dec 15, 2019
@Suor Suor self-assigned this Dec 15, 2019
@Suor Suor removed this from In progress in DVC Sprint 17 Dec 2019 - 31 Dec 2019 Dec 15, 2019
@Suor Suor added this to In progress in DVC Sprint 3 Dec 2019 - 17 Dec 2019 via automation Dec 15, 2019
@Suor Suor moved this from In progress to Review in progress in DVC Sprint 3 Dec 2019 - 17 Dec 2019 Dec 15, 2019
@Suor Suor moved this from Review in progress to Reviewer approved in DVC Sprint 3 Dec 2019 - 17 Dec 2019 Dec 15, 2019
@Suor Suor moved this from Reviewer approved to Done in DVC Sprint 3 Dec 2019 - 17 Dec 2019 Dec 15, 2019
2 participants