dvc run : possible bug with deeply nested dependencies #2483

@tdeboissiere

Please provide information about your setup

Ubuntu 18.04, dvc==0.59.2, installed with pip in a Miniconda Python 3.7 environment

When a stage's working directory is deeply nested, dvc does not appear to record its dependency paths correctly in the .dvc file.

The following script reproduces the issue:

#!/bin/bash

set -x
set -e

rm -rf dvc_test
mkdir dvc_test && cd dvc_test
mkdir scripts
mkdir -p data/recommended/dataset1/dataset1_proc
echo bar > data/recommended/dataset1/v1.txt
git init
dvc init
echo -e "import sys\
\nwith open(sys.argv[1], 'w') as f: f.write(sys.argv[2])" > scripts/script.py

# Run works
dvc run -y \
    -w "$PWD/data/recommended/dataset1/dataset1_proc" \
    -f ./data/recommended/dataset1/dataset1_proc/v1.dvc \
    -d ../../../../scripts/script.py \
    -o v1 \
    "mkdir v1 && python ../../../../scripts/script.py v1/v1.txt proc_data"

# Inspecting v1.dvc shows that the script.py dependency path is missing one ../
cat data/recommended/dataset1/dataset1_proc/v1.dvc

# Because of this, repro does not work
dvc repro data/recommended/dataset1/dataset1_proc/v1.dvc

Inspecting the v1.dvc file shows:

md5: 4ae72e168a0a6a2f1aaadfb5628640f7
cmd: mkdir v1 && python ../../../../scripts/script.py v1/v1.txt proc_data
deps:
- md5: 791b9c74b1d9308a3226b93a36689dad
  path: ../../../scripts/script.py
outs:
- md5: 188ed6cb603658d01ef7ba8fb7c434fe.dir
  path: v1
  cache: true
  metric: false
  persist: false
+ dvc repro data/recommended/dataset1/dataset1_proc/v1.dvc

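This shows that the recorded script.py dependency is indeed missing one ../: the command uses ../../../../scripts/script.py, but the deps entry records ../../../scripts/script.py.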
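To make the mismatch concrete, here is a minimal sketch that resolves both relative paths against the stage's working directory. It assumes the path recorded in v1.dvc is interpreted relative to the directory containing v1.dvc, which in this reproduction is the same directory passed to -w. The `resolve` helper is hypothetical and only for illustration; it is not part of dvc.

```python
import posixpath

# Values copied from the report; paths are relative to the repository root.
wdir = "data/recommended/dataset1/dataset1_proc"
dep_passed = "../../../../scripts/script.py"   # given to `dvc run -d`
dep_recorded = "../../../scripts/script.py"    # found under `deps:` in v1.dvc


def resolve(base, rel):
    """Hypothetical helper: join rel onto base and collapse the `..` segments."""
    return posixpath.normpath(posixpath.join(base, rel))


resolved_passed = resolve(wdir, dep_passed)
resolved_recorded = resolve(wdir, dep_recorded)

print(resolved_passed)    # scripts/script.py
print(resolved_recorded)  # data/scripts/script.py
```

Under that assumption, the recorded path points at data/scripts/script.py, which the reproduction script never creates. That would be consistent with dvc repro failing to find the dependency.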
