dag: do not include DVC-tracked dependencies as stages #4058

@shcheklein

DVC version:

DVC version: 1.0.0a10+4042d5
Python version: 3.7.7
Platform: Darwin-19.4.0-x86_64-i386-64bit
Binary: False
Package: None
Supported remotes: azure, gdrive, gs, hdfs, http, https, s3, ssh, oss
Cache: reflink - supported, hardlink - supported, symlink - supported
Repo: dvc, git

Reproduce:

Clone https://github.com/shcheklein/example-get-started and run dvc dag.

Issue:

We print something like this:

+-------------------+
| data/data.xml.dvc |
+-------------------+
          *
          *
          *
     +---------+
     | prepare |
     +---------+
          *
          *
          *
    +-----------+
    | featurize |
    +-----------+
     **        **
   **            *
  *               **
+-------+           *
| train |         **
+-------+        *
     **        **
       **    **
         *  *
    +----------+
    | evaluate |
    +----------+

It's not clear why we treat a data dependency as a stage. And why do we treat only DVC-tracked or imported deps as stages, but not any other deps?

The suggestion is to avoid showing any dependencies (at least in the default view) and to focus on stages only.
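
To make the suggestion concrete, here is a minimal sketch of the stages-only view. It is not DVC's implementation: the adjacency map is hand-written to mirror the graph printed above, and both stages_only and is_data_node are hypothetical names. Telling data entries apart by their .dvc suffix is an assumption made only for this illustration.

```python
from typing import Dict, Set

# Hypothetical adjacency map: node -> set of nodes it depends on.
# Hand-written to mirror the graph above; not read from DVC itself.
GRAPH: Dict[str, Set[str]] = {
    "data/data.xml.dvc": set(),
    "prepare": {"data/data.xml.dvc"},
    "featurize": {"prepare"},
    "train": {"featurize"},
    "evaluate": {"featurize", "train"},
}


def is_data_node(name: str) -> bool:
    # Assumption for illustration only: a node named after a .dvc file is a
    # data-only entry (added or imported data), not a pipeline stage.
    return name.endswith(".dvc")


def stages_only(graph: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return a copy of the graph without data-only nodes or edges to them.

    Data-only nodes are sources here (they have no dependencies of their
    own), so dropping them never disconnects two stages.
    """
    return {
        node: {dep for dep in deps if not is_data_node(dep)}
        for node, deps in graph.items()
        if not is_data_node(node)
    }


if __name__ == "__main__":
    for node, deps in stages_only(GRAPH).items():
        print(node, "<-", sorted(deps))
```

With this filter the default view would start at prepare, and the edges between the four stages stay as they are.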

Labels:

p2-medium (medium priority, should be done, but less important)
ui (user interface / interaction)