fetch giving unexpected error for all tags and all commits #3857

Closed
imhardikj opened this issue May 22, 2020 · 3 comments · Fixed by #3871
Labels
bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP.

Comments

@imhardikj

$ dvc --version
DVC version: 0.94.0
Python version: 3.7.6
Platform: Linux-5.3.0-51-generic-x86_64-with-debian-buster-sid
Binary: False
Package: pip
Supported remotes: http, https

After cloning the example-get-started repo, dvc fetch -aT fails with an unexpected error.
I tried this on Windows too and it shows the same error. It worked previously when I tried it on version 0.92.x.

$ git clone https://github.com/iterative/example-get-started
$ cd example-get-started
$ dvc fetch -aT
ERROR: unexpected error
$ dvc fetch -aT -v
2020-05-22 23:02:45,092 DEBUG: PRAGMA user_version;                     
2020-05-22 23:02:45,092 DEBUG: fetched: [(3,)]
2020-05-22 23:02:45,092 DEBUG: CREATE TABLE IF NOT EXISTS state (inode INTEGER PRIMARY KEY, mtime TEXT NOT NULL, size TEXT NOT NULL, md5 TEXT NOT NULL, timestamp TEXT NOT NULL)
2020-05-22 23:02:45,093 DEBUG: CREATE TABLE IF NOT EXISTS state_info (count INTEGER)
2020-05-22 23:02:45,093 DEBUG: CREATE TABLE IF NOT EXISTS link_state (path TEXT PRIMARY KEY, inode INTEGER NOT NULL, mtime TEXT NOT NULL)
2020-05-22 23:02:45,093 DEBUG: INSERT OR IGNORE INTO state_info (count) SELECT 0 WHERE NOT EXISTS (SELECT * FROM state_info)
2020-05-22 23:02:45,093 DEBUG: PRAGMA user_version = 3;
2020-05-22 23:02:45,117 DEBUG: Assuming '/home/hardik/src/example-get-started/.dvc/cache/42/c7025fc0edeb174069280d17add2d4.dir' is unchanged since it is read-only
2020-05-22 23:02:45,117 DEBUG: Assuming '/home/hardik/src/example-get-started/.dvc/cache/42/c7025fc0edeb174069280d17add2d4.dir' is unchanged since it is read-only
2020-05-22 23:02:45,117 DEBUG: Assuming '/home/hardik/src/example-get-started/.dvc/cache/68/36f797f3924fb46fcfd6b9f6aa6416.dir' is unchanged since it is read-only
2020-05-22 23:02:45,118 DEBUG: Assuming '/home/hardik/src/example-get-started/.dvc/cache/68/36f797f3924fb46fcfd6b9f6aa6416.dir' is unchanged since it is read-only
2020-05-22 23:02:45,184 DEBUG: SELECT count from state_info WHERE rowid=?
2020-05-22 23:02:45,184 DEBUG: fetched: [(7,)]
2020-05-22 23:02:45,184 DEBUG: UPDATE state_info SET count = ? WHERE rowid = ?
2020-05-22 23:02:45,197 ERROR: unexpected error
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/main.py", line 49, in main
    ret = cmd.run()
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/command/data_sync.py", line 75, in run
    recursive=self.args.recursive,
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/repo/__init__.py", line 30, in wrapper
    ret = f(repo, *args, **kwargs)
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/repo/__init__.py", line 536, in fetch
    return self._fetch(*args, **kwargs)
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/repo/fetch.py", line 45, in _fetch
    recursive=recursive,
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/repo/__init__.py", line 295, in used_cache
    filter_info=filter_info,
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/stage/__init__.py", line 761, in get_used_cache
    cache.update(out.get_used_cache(*args, **kwargs))
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/output/base.py", line 449, in get_used_cache
    self.checksum, self._collect_used_dir_cache(**kwargs),
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/output/base.py", line 363, in _collect_used_dir_cache
    if self.cache.changed_cache_file(self.checksum):
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/remote/base.py", line 805, in changed_cache_file
    if self.is_protected(cache_info):
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/remote/local.py", line 719, in is_protected
    if not self.exists(path_info):
  File "/home/hardik/miniconda3/lib/python3.7/site-packages/dvc/remote/local.py", line 97, in exists
    assert is_working_tree(self.repo.tree)
AssertionError
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label May 22, 2020
@efiop efiop added bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP. labels May 22, 2020
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label May 22, 2020
@efiop efiop added this to To do in DVC Sprint 19 May - 2 June 2019 via automation May 22, 2020
@jorgeorpinel
Contributor

This seems very similar to #3812 at first glance (noted by the author of this issue in iterative/dvc.org#528 (comment)).

@pmrowla
Contributor

pmrowla commented May 25, 2020

After the changes for #3811, we still fail, but with a different error:

❯ dvc fetch -aT
ERROR: unexpected error - [Errno 2] No such file

...
2020-05-25 15:00:59,632 DEBUG: fetched: [(0,)]
2020-05-25 15:00:59,635 ERROR: unexpected error - [Errno 2] No such file
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/pmrowla/git/dvc/dvc/main.py", line 53, in main
    ret = cmd.run()
  File "/Users/pmrowla/git/dvc/dvc/command/data_sync.py", line 79, in run
    run_cache=self.args.run_cache,
  File "/Users/pmrowla/git/dvc/dvc/repo/__init__.py", line 25, in wrapper
    ret = f(repo, *args, **kwargs)
  File "/Users/pmrowla/git/dvc/dvc/repo/__init__.py", line 527, in fetch
    return self._fetch(*args, **kwargs)
  File "/Users/pmrowla/git/dvc/dvc/repo/fetch.py", line 51, in _fetch
    used_run_cache=used_run_cache,
  File "/Users/pmrowla/git/dvc/dvc/repo/__init__.py", line 302, in used_cache
    filter_info=filter_info,
  File "/Users/pmrowla/git/dvc/dvc/stage/__init__.py", line 516, in get_used_cache
    cache.update(out.get_used_cache(*args, **kwargs))
  File "/Users/pmrowla/git/dvc/dvc/output/base.py", line 470, in get_used_cache
    self.checksum, self.collect_used_dir_cache(**kwargs),
  File "/Users/pmrowla/git/dvc/dvc/output/base.py", line 391, in collect_used_dir_cache
    self.get_dir_cache(jobs=jobs, remote=remote)
  File "/Users/pmrowla/git/dvc/dvc/output/base.py", line 358, in get_dir_cache
    if self.cache.changed_cache_file(self.checksum):
  File "/Users/pmrowla/git/dvc/dvc/remote/base.py", line 881, in changed_cache_file
    actual = self.get_checksum(cache_info)
  File "/Users/pmrowla/git/dvc/dvc/remote/base.py", line 350, in get_checksum
    checksum = self.state.get(path_info)
  File "/Users/pmrowla/git/dvc/dvc/state.py", line 400, in get
    actual_mtime, actual_size = get_mtime_and_size(path, self.repo.tree)
  File "/Users/pmrowla/git/dvc/dvc/utils/fs.py", line 53, in get_mtime_and_size
    base_stat = tree.stat(path)
  File "/Users/pmrowla/git/dvc/dvc/ignore.py", line 151, in stat
    return self.tree.stat(path)
  File "/Users/pmrowla/git/dvc/dvc/scm/git/tree.py", line 161, in stat
    raise OSError(errno.ENOENT, "No such file")
FileNotFoundError: [Errno 2] No such file
------------------------------------------------------------

@pmrowla pmrowla self-assigned this May 25, 2020
@pmrowla pmrowla moved this from To do to In progress in DVC Sprint 19 May - 2 June 2019 May 25, 2020
@pmrowla
Contributor

pmrowla commented May 25, 2020

The issue is that state currently belongs to Repo and always uses repo.tree. If repo.tree is a GitTree (because brancher is being used), state lookups for local cache paths fail because the local cache doesn't exist in the GitTree.

State should really belong to local cache, or to the WorkingTree used by local cache.
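
To illustrate the ownership problem described above, here is a minimal sketch in Python. It is not DVC's actual code: WorkingTree, GitTree, Repo, RepoBoundState, CacheBoundState, and demo are simplified, hypothetical stand-ins that only model a stat() lookup. The assumption is that a Git-backed tree can only see paths tracked in a revision, while the local cache lives on the real filesystem.

```python
import errno
import os
import tempfile


class WorkingTree:
    """Hypothetical stand-in: a tree backed by the real filesystem."""

    def stat(self, path):
        return os.stat(path)


class GitTree:
    """Hypothetical stand-in: a tree that only sees paths tracked in a revision."""

    def __init__(self, tracked):
        self.tracked = set(tracked)

    def stat(self, path):
        if path not in self.tracked:
            # Mirrors the error shown in the second traceback.
            raise OSError(errno.ENOENT, "No such file")
        return os.stat(path)


class Repo:
    """Hypothetical stand-in: holds whichever tree is currently active."""

    def __init__(self, tree):
        self.tree = tree


class RepoBoundState:
    """State that always looks files up through repo.tree."""

    def __init__(self, repo):
        self.repo = repo

    def get_size(self, path):
        return self.repo.tree.stat(path).st_size


class CacheBoundState:
    """State that looks files up through the tree owned by the local cache."""

    def __init__(self, tree):
        self.tree = tree

    def get_size(self, path):
        return self.tree.stat(path).st_size


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # A cache entry exists on disk but is not tracked in Git.
        cache_file = os.path.join(tmp, "cache_entry.dir")
        with open(cache_file, "w") as fobj:
            fobj.write("[]")

        # While iterating over revisions, the repo's tree is Git-backed.
        repo = Repo(GitTree(tracked=[]))
        try:
            RepoBoundState(repo).get_size(cache_file)
            repo_bound = "ok"
        except FileNotFoundError:
            repo_bound = "FileNotFoundError"

        # State tied to a working tree still finds the cache entry.
        cache_bound = CacheBoundState(WorkingTree()).get_size(cache_file)
        return repo_bound, cache_bound
```

In this sketch, the lookup through the repo-owned tree raises FileNotFoundError as soon as the active tree is Git-backed, while the lookup through a working tree succeeds regardless of which revision is being inspected. That is the motivation for moving state ownership to the local cache; how an actual fix would be structured in DVC is not shown here.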
