Closed
Labels
A: experiments (Related to dvc exp), p2-medium (Medium priority, should be done, but less important), research
Bug Report
Description
When a pipeline depends on data imported with dvc import, the source repository appears to be cloned and the data re-hashed every time dvc exp run is called.
Reproduce
- dvc import git@github.com:iterative/dataset-registry.git use-cases/cats-dogs
- dvc stage add -n foo -d cats-dogs echo foo
- dvc exp run
Expected
When using dvc repro, the imported data doesn't get re-hashed. I would expect dvc exp run to behave the same way.
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.6.3 (pip)
---------------------------------
Platform: Python 3.9.6 on macOS-10.16-x86_64-i386-64bit
Supports:
gdrive (pydrive2 = 1.9.1),
http (requests = 2.26.0),
https (requests = 2.26.0)
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1s1
Caches: local
Remotes: None
Workspace directory: apfs on /dev/disk1s1s1
Repo: dvc, git
Additional Information (if any):
$ dvc repro -v
2021-08-25 11:11:55,186 DEBUG: Computed stage: 'cats-dogs.dvc' md5: '5a135b297ee3c96465ce4b320f44fb8b'
'cats-dogs.dvc' didn't change, skipping
Stage 'foo' didn't change, skipping
Data and pipelines are up to date.
$ dvc exp run -v
2021-08-25 11:12:15,672 DEBUG: Detaching HEAD at 'HEAD'
2021-08-25 11:12:15,690 DEBUG: Stashing workspace
2021-08-25 11:12:15,700 DEBUG: No changes to stash
2021-08-25 11:12:15,749 DEBUG: Creating external repo git@github.com:iterative/dataset-registry.git@ca140591a21c6d75a7057d1e2eb3f51d3115c5f5
2021-08-25 11:12:15,749 DEBUG: erepo: git clone 'git@github.com:iterative/dataset-registry.git' to a temporary dir
Computing file/dir hashes (only done once)
. . .
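To make the difference between the two logs above easier to check, here is a minimal Python sketch that counts the lines indicating an external-repo clone and a hashing pass in verbose output. The marker substrings are copied from the logs quoted in this report. The function name `summarize_log` and the two sample constants are hypothetical, introduced only for illustration; this is not part of DVC.

```python
# Marker substrings copied from the verbose output quoted in this report.
CLONE_MARKER = "erepo: git clone"
HASH_MARKER = "Computing file/dir hashes"


def summarize_log(log_text: str) -> dict:
    """Count clone and hashing lines in `dvc ... -v` output (hypothetical helper)."""
    lines = log_text.splitlines()
    return {
        "clones": sum(CLONE_MARKER in line for line in lines),
        "hash_passes": sum(HASH_MARKER in line for line in lines),
    }


# Sample logs taken verbatim from the report (the exp run log is truncated there too).
REPRO_LOG = """\
2021-08-25 11:11:55,186 DEBUG: Computed stage: 'cats-dogs.dvc' md5: '5a135b297ee3c96465ce4b320f44fb8b'
'cats-dogs.dvc' didn't change, skipping
Stage 'foo' didn't change, skipping
Data and pipelines are up to date.
"""

EXP_RUN_LOG = """\
2021-08-25 11:12:15,672 DEBUG: Detaching HEAD at 'HEAD'
2021-08-25 11:12:15,690 DEBUG: Stashing workspace
2021-08-25 11:12:15,700 DEBUG: No changes to stash
2021-08-25 11:12:15,749 DEBUG: Creating external repo git@github.com:iterative/dataset-registry.git@ca140591a21c6d75a7057d1e2eb3f51d3115c5f5
2021-08-25 11:12:15,749 DEBUG: erepo: git clone 'git@github.com:iterative/dataset-registry.git' to a temporary dir
Computing file/dir hashes (only done once)
"""

print("repro:  ", summarize_log(REPRO_LOG))
print("exp run:", summarize_log(EXP_RUN_LOG))
```

Feeding it the captured output of each command should show zero clones for dvc repro and at least one for dvc exp run if the behaviour described here is present.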