dvc pipeline show --ascii - visual bug since version 0.82.9 #3410

Closed
Noizecube opened this issue Feb 26, 2020 · 3 comments · Fixed by #3421
Labels
bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP.

Comments

@Noizecube

DVC version: 0.82.9 and following
reproduced on several platforms, like:
4.15.0-55-generic #60-Ubuntu
macOS High Sierra 10.13.6
macOS Catalina 10.15.2

Hi,

the command "dvc pipeline show --ascii" seems to have had a bug since version 0.82.9: it does not display the dependencies properly when there is a branch in a pipeline.

I created a simple pipeline, which looks like this:

[Screenshot: Pipeline with Version 0.82.8]

If I use the command "dvc pipeline show --ascii validation.dvc" with version 0.82.8, the whole pipeline is displayed properly.
If I upgrade to e.g. version 0.83.0, the connection between "val.dvc" and the previous stage is missing. If I run the command on val.dvc itself, the stages "train.dvc" and "validation.dvc" are missing (see screenshots).

[Screenshot: Pipeline with Version 0.83.0]

[Screenshot: Pipeline with Version 0.83.0, when using val.dvc in the command]

@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Feb 26, 2020
@pared
Contributor

pared commented Feb 26, 2020

Can confirm that the issue still exists.
Reproduction script:

#!/bin/bash

rm -rf repo
mkdir repo

pushd repo
git init --quiet
dvc init -q

echo data>>data
dvc add data

dvc run -d data -f dvc_run.dvc -o processed_data "cat data>>processed_data"
dvc run -d processed_data -o val "cat processed_data>>val"
dvc run -d processed_data -o train "cat processed_data>>train"
dvc run -d val -d train -o validation "echo validated >> validation"
dvc pipeline show --ascii validation.dvc

@pared pared added the bug Did we break something? label Feb 26, 2020
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Feb 26, 2020
@pared pared added the p1-important Important, aka current backlog of things to do label Feb 26, 2020
@efiop
Contributor

efiop commented Feb 26, 2020

Might be caused by #3217

@efiop efiop added p0-critical Critical issue. Needs to be fixed ASAP. and removed p1-important Important, aka current backlog of things to do labels Feb 26, 2020
@pared pared self-assigned this Feb 28, 2020
@pared
Contributor

pared commented Mar 2, 2020

Ok, so:

  1. @efiop is right that the cause lies in "pipeline: show: outs: eliminate extra edges in DAG" (#3217).
  2. Before that change, when building the graph for outs, we iterated over all graph edges and added each of them for display.
  3. After the change, we used the dfs_edges algorithm, which, as stated in its source:
    Perform a depth-first-search over the nodes of G and yield the edges in order.
    This means that dfs_edges iterates over nodes using DFS and returns only the edges it follows during that traversal. An edge that leads to an already-visited node is never followed, so it is not returned.
  4. To obtain all edges, we need to use edge_dfs, which focuses on visiting edges rather than nodes.
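
The difference between the two traversals can be sketched in plain Python on the pipeline from the reproduction script. This is only an illustration: `node_dfs_edges` and `edge_based_dfs` are hypothetical, simplified stand-ins for the behaviour of `dfs_edges` and `edge_dfs`, not the library code or DVC's implementation. The graph layout (edges pointing from a stage to the stages it depends on) is also an assumption.

```python
# Assumed layout: each stage maps to the stages it depends on,
# mirroring the reproduction script (names are the generated .dvc files).
GRAPH = {
    "validation.dvc": ["val.dvc", "train.dvc"],
    "val.dvc": ["dvc_run.dvc"],
    "train.dvc": ["dvc_run.dvc"],
    "dvc_run.dvc": ["data.dvc"],
    "data.dvc": [],
}


def node_dfs_edges(graph, source):
    """Node-based DFS: yield only edges that lead to an unvisited node."""
    visited = {source}
    stack = [(source, iter(graph[source]))]
    while stack:
        parent, children = stack[-1]
        for child in children:
            if child not in visited:
                visited.add(child)
                yield (parent, child)
                stack.append((child, iter(graph[child])))
                break
        else:
            # No unvisited children left for this node.
            stack.pop()


def edge_based_dfs(graph, source):
    """Edge-based DFS: yield every edge reachable from source exactly once."""
    seen = set()

    def visit(node):
        for child in graph[node]:
            edge = (node, child)
            if edge not in seen:
                seen.add(edge)
                yield edge
                yield from visit(child)

    yield from visit(source)


node_edges = list(node_dfs_edges(GRAPH, "validation.dvc"))
all_edges = list(edge_based_dfs(GRAPH, "validation.dvc"))

# The edge dropped by the node-based traversal is the one that
# re-enters the already-visited dvc_run.dvc stage.
missing = set(all_edges) - set(node_edges)
print(missing)
```

With this child ordering, the node-based traversal drops the edge from train.dvc to dvc_run.dvc; which of the two branch edges goes missing depends on the order in which neighbours are visited.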
