import/get: handle chained import data #4423

@jorgeorpinel

UPDATE: Jump to #4423 (comment)

Bug Report

λ dvc import https://github.com/iterative/example-get-started data/data.xml
Importing 'data/data.xml (https://github.com/iterative/example-get-started)' -> 'data.xml'
ERROR: unexpected error - [Errno 2] No such file or directory: 'C:\\Users\\poj12\\DVC-repos\\tests\\.dvc\\cache\\a3\\04afb96060aad90176268345e10355'

Full --verbose output:

λ dvc import https://github.com/iterative/example-get-started data/data.xml -v
2020-08-18 23:22:41,569 DEBUG: Check for update is enabled.
2020-08-18 23:22:41,752 ERROR: interrupted by the user
------------------------------------------------------------
Traceback (most recent call last):
  File "c:\users\poj12\dvc\dvc\main.py", line 53, in main
    cmd = args.func(args)
  File "c:\users\poj12\dvc\dvc\command\base.py", line 40, in __init__
    updater.check()
  File "c:\users\poj12\dvc\dvc\updater.py", line 58, in check
    self._with_lock(self._check, "checking")
  File "c:\users\poj12\dvc\dvc\updater.py", line 44, in _with_lock
    func()
  File "c:\users\poj12\dvc\dvc\updater.py", line 62, in _check
    self.fetch()
  File "c:\users\poj12\dvc\dvc\updater.py", line 84, in fetch
    daemon(["updater"])
  File "c:\users\poj12\dvc\dvc\daemon.py", line 105, in daemon
    file_path = os.path.abspath(inspect.stack()[0][1])
  File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 1514, in stack
    return getouterframes(sys._getframe(1), context)
  File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 1491, in getouterframes
    frameinfo = (frame,) + getframeinfo(frame, context)
  File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 1465, in getframeinfo
    lines, lnum = findsource(frame)
  File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 792, in findsource
    module = getmodule(object, file)
  File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 754, in getmodule
    os.path.realpath(f)] = module.__name__
  File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\ntpath.py", line 647, in realpath
    path = _getfinalpathname(path)
KeyboardInterrupt
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-08-18 23:22:41,927 DEBUG: Analytics is disabled.
(.venv) poj12@AP-QDVJ7BLR ~/DVC-repos/tests (master)
λ
(.venv) poj12@AP-QDVJ7BLR ~/DVC-repos/tests (master)
λ
(.venv) poj12@AP-QDVJ7BLR ~/DVC-repos/tests (master)
λ dvc import https://github.com/iterative/example-get-started data/data.xml -v
2020-08-18 23:22:51,005 DEBUG: Check for update is enabled.
2020-08-18 23:22:51,516 DEBUG: Trying to spawn '['daemon', '-q', 'updater']'
2020-08-18 23:22:52,092 DEBUG: Spawned '['daemon', '-q', 'updater']'
2020-08-18 23:22:52,117 DEBUG: fetched: [(3,)]
2020-08-18 23:22:53,407 DEBUG: Removing output 'data.xml' of stage: 'data.xml.dvc'.
Importing 'data/data.xml (https://github.com/iterative/example-get-started)' -> 'data.xml'
2020-08-18 23:22:53,416 DEBUG: Computed stage: 'data.xml.dvc' md5: 'e7514d625f896d082cc0ca259453b732'
2020-08-18 23:22:53,421 DEBUG: 'md5' of stage: 'data.xml.dvc' changed.
2020-08-18 23:22:53,425 DEBUG: Creating external repo https://github.com/iterative/example-get-started@None
2020-08-18 23:22:53,430 DEBUG: erepo: git clone 'https://github.com/iterative/example-get-started' to a temporary dir
2020-08-18 23:22:56,420 DEBUG: Saving '..\..\AppData\Local\Temp\tmppx2petbedvc-clone\data\data.xml' to '.dvc\cache\a3\04afb96060aad90176268345e10355'.
2020-08-18 23:22:56,428 DEBUG: cache 'C:\Users\poj12\DVC-repos\tests\.dvc\cache\a3\04afb96060aad90176268345e10355' expected 'a304afb96060aad90176268345e10355' actual 'None'
2020-08-18 23:22:56,439 DEBUG: cache 'C:\Users\poj12\DVC-repos\tests\.dvc\cache\a3\04afb96060aad90176268345e10355' expected 'a304afb96060aad90176268345e10355' actual 'None'
2020-08-18 23:22:56,508 DEBUG: Preparing to download data from 'https://remote.dvc.org/get-started'
2020-08-18 23:22:56,512 DEBUG: Preparing to collect status from https://remote.dvc.org/get-started
2020-08-18 23:22:56,517 DEBUG: Collecting information from local cache...
2020-08-18 23:22:56,638 DEBUG: fetched: [(45,)]
2020-08-18 23:22:56,697 ERROR: unexpected error - [Errno 2] No such file or directory: 'C:\\Users\\poj12\\DVC-repos\\tests\\.dvc\\cache\\a3\\04afb96060aad90176268345e10355'
------------------------------------------------------------
Traceback (most recent call last):
  File "c:\users\poj12\dvc\dvc\main.py", line 54, in main
    ret = cmd.run()
  File "c:\users\poj12\dvc\dvc\command\imp.py", line 14, in run
    self.repo.imp(
  File "c:\users\poj12\dvc\dvc\repo\imp.py", line 6, in imp
    return self.imp_url(path, out=out, erepo=erepo, frozen=True)
  File "c:\users\poj12\dvc\dvc\repo\__init__.py", line 34, in wrapper
    ret = f(repo, *args, **kwargs)
  File "c:\users\poj12\dvc\dvc\repo\scm_context.py", line 4, in run
    result = method(repo, *args, **kw)
  File "c:\users\poj12\dvc\dvc\repo\imp_url.py", line 54, in imp_url
    stage.run()
  File "c:\users\poj12\dvc\.venv\lib\site-packages\funcy\decorators.py", line 39, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "c:\users\poj12\dvc\dvc\stage\decorators.py", line 36, in rwlocked
    return call()
  File "c:\users\poj12\dvc\.venv\lib\site-packages\funcy\decorators.py", line 60, in __call__
    return self._func(*self._args, **self._kwargs)
  File "c:\users\poj12\dvc\dvc\stage\__init__.py", line 429, in run
    sync_import(self, dry, force)
  File "c:\users\poj12\dvc\dvc\stage\imports.py", line 30, in sync_import
    stage.deps[0].download(stage.outs[0])
  File "c:\users\poj12\dvc\dvc\dependency\repo.py", line 97, in download
    _, _, cache_infos = repo.fetch_external([self.def_path])
  File "c:\users\poj12\dvc\dvc\external_repo.py", line 147, in fetch_external
    self.local_cache.save(
  File "c:\users\poj12\dvc\dvc\cache\base.py", line 282, in save
    return self._save(path_info, tree, hash_, save_link, **kwargs)
  File "c:\users\poj12\dvc\dvc\cache\base.py", line 290, in _save
    return self._save_file(path_info, tree, hash_, save_link, **kwargs)
  File "c:\users\poj12\dvc\dvc\cache\base.py", line 218, in _save_file
    with tree.open(path_info, mode="rb") as fobj:
  File "c:\users\poj12\dvc\dvc\repo\tree.py", line 372, in open
    return dvc_tree.open(path, mode=mode, encoding=encoding, **kwargs)
  File "c:\users\poj12\dvc\dvc\repo\tree.py", line 113, in open
    return open(cache_path, mode=mode, encoding=encoding)
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\poj12\\DVC-repos\\tests\\.dvc\\cache\\a3\\04afb96060aad90176268345e10355'
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-08-18 23:22:56,877 DEBUG: Analytics is disabled.
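
For anyone reading the log: the missing path appears to be the expected hash `a304afb96060aad90176268345e10355` split into a two-character directory plus the remainder. Below is a minimal sketch of that mapping, inferred only from the DEBUG lines above and not taken from DVC's source. `cache_relpath` and `missing_from_cache` are hypothetical helper names, not DVC APIs.

```python
from pathlib import Path, PureWindowsPath

# Hash reported as "expected" in the DEBUG output of this issue.
EXPECTED_MD5 = "a304afb96060aad90176268345e10355"


def cache_relpath(md5):
    """Hypothetical helper: split a hash into (2-char dir, remainder).

    Assumption: this mirrors the layout visible in the log, where
    'a304af...' is looked up under 'a3\\04af...'.
    """
    return (md5[:2], md5[2:])


def missing_from_cache(cache_dir, hashes):
    """Hypothetical helper: list the hashes with no file in cache_dir."""
    return [
        h for h in hashes
        if not Path(cache_dir, *cache_relpath(h)).is_file()
    ]


# Rebuild the path from the error message (Windows separators, any OS).
reported = PureWindowsPath(
    r"C:\Users\poj12\DVC-repos\tests\.dvc\cache", *cache_relpath(EXPECTED_MD5)
)
print(reported)
```

Running `missing_from_cache(".dvc/cache", [EXPECTED_MD5])` inside the affected repo would be one way to confirm that the entry was never written to the local cache before the `open()` call in the traceback.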

Similar problem for get:

λ dvc get https://github.com/iterative/example-get-started data/data.xml
ERROR: unexpected error - [Errno 2] No such file or directory: 'C:\\Users\\poj12\\DVC-repos\\.kWasCRdYJMf8x9qBkgneqt\\a3\\04afb96060aad90176268345e10355'

I tried from different locations, inside DVC repos or not.

Please provide information about your setup

Output of dvc version:

DVC version: 1.5.1
---------------------------------
Platform: Python 3.8.2 on Windows-10-10.0.18362-SP0
Supports: All remotes
Workspace directory: NTFS on C:\
Repo: dvc, git


Labels: bug (Did we break something?), p2-medium (Medium priority, should be done, but less important)
