exp show: Include `deps` and `outs`. #7089

daavoo · 2021-12-03T20:51:12Z

❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.

Use repo.index.deps and repo.index.outs to collect the dependencies and outputs associated with each experiment.

What's considered a dep

Currently, anything in repo.index.deps that is not a param dependency or an imported .dvc file (if the imported .dvc is used in the pipeline, it would show up as a duplicated column).
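To make the rule above concrete, here is a minimal sketch of the filter. The `Dep` class and its `is_param` / `is_import` flags are hypothetical stand-ins invented for illustration; they are not DVC's real classes or attributes, and the paths are made up.

```python
from dataclasses import dataclass


@dataclass
class Dep:
    """Hypothetical stand-in for a dependency entry (not DVC's real class)."""

    path: str
    is_param: bool = False   # assumed flag: a params-file dependency
    is_import: bool = False  # assumed flag: comes from an imported .dvc file


def data_deps(deps):
    """Keep only the deps that would become columns in the table."""
    return [d for d in deps if not d.is_param and not d.is_import]


# Made-up example entries.
deps = [
    Dep("params.yaml", is_param=True),
    Dep("data/raw.csv", is_import=True),
    Dep("src/train.py"),
    Dep("data/prepared"),
]
kept = [d.path for d in data_deps(deps)]
print(kept)
```

Note that git-tracked source files such as `src/train.py` are kept on purpose, which matches the intent described in this PR.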

Studio filters git tracked files but I think that showing those files (i.e. src deps) is also valuable.

I think that more complicated internal filtering (e.g. removing intermediate deps) is not worth it and would be problematic when considering all use cases.

The table can get noisy but we provide the --only-changed flag (we could consider making it the default) and new improved filtering #7141 that should make it easy to customize the table.

JSON output

For --json output, this PR adds new deps and outs fields:

{
    "baseline": {
        "data": {
            "deps": {
                "copy.py": {
                    "hash": "561f068574ab2a132d304dca3dd6510d",
                    "size": 310,
                    "nfiles": null
                }
            },
            "metrics": {"metrics.yaml": {"data": {"foo": 1}}},
            "outs": {
                "model.pkl": {
                    "hash": "fb7792b6596fd12502dd132c0aba0568",
                    "size": 2000,
                    "nfiles": null
                }
            },
            "params": {"params.yaml": {"data": {"foo": 1}}},
            "queued": false,
            "running": false,
            "executor": null,
            "timestamp": null
        }
    }
}
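As a rough illustration of how a consumer might read the new fields, the sketch below walks one entry shaped like the `baseline` value above. The helper name `file_hashes` is made up for this example, and the entry is written as a Python dict rather than parsed from a real `dvc exp show --json` run.

```python
# One entry shaped like the "baseline" value in the snippet above.
entry = {
    "data": {
        "deps": {
            "copy.py": {
                "hash": "561f068574ab2a132d304dca3dd6510d",
                "size": 310,
                "nfiles": None,
            }
        },
        "outs": {
            "model.pkl": {
                "hash": "fb7792b6596fd12502dd132c0aba0568",
                "size": 2000,
                "nfiles": None,
            }
        },
    }
}


def file_hashes(entry, field):
    """Map each path under entry["data"][field] to its hash.

    `field` is "deps" or "outs"; a missing field yields an empty dict.
    """
    items = entry.get("data", {}).get(field) or {}
    return {path: info.get("hash") for path, info in items.items()}


dep_hashes = file_hashes(entry, "deps")
out_hashes = file_hashes(entry, "outs")
print(dep_hashes, out_hashes)
```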

Table

For the table, it creates a new type of colored columns and shows the hash (let the debate begin).

After some testing, it looks like the best value to show in a data column varies widely between use cases.

Given the limitations of the CLI, I opted for showing the hash because, IMO, it is the value that makes it easiest to identify differences between rows.
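The reasoning for showing hashes can be sketched as follows: if each row maps a dep to its hash, the columns that differ between rows are exactly the ones whose hashes are not all equal. This is only an illustration of the idea, not the code behind `--only-changed`; the revision names and hash strings are made up, and the 7-character truncation mirrors the `[:7]` used in the tests of this PR.

```python
def changed_columns(rows):
    """Return the column names whose values are not identical in every row."""
    columns = sorted({name for row in rows.values() for name in row})
    return [c for c in columns if len({row.get(c) for row in rows.values()}) > 1]


# Made-up rows: revision name -> {dep path: hash}.
rows = {
    "main": {"data/prepared": "1111111aaaa", "src/train.py": "2222222bbbb"},
    "exp-a": {"data/prepared": "1111111aaaa", "src/train.py": "3333333cccc"},
}

changed = changed_columns(rows)
# Short hashes, as they might be displayed in a table cell.
short = {rev: {c: row[c][:7] for c in changed} for rev, row in rows.items()}
print(changed, short)
```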

From example-get-started:

dvc exp show --only-changed

dvc exp show --all-branches --only-changed

Some deps might not be relevant; they can be filtered out with #7141:

dvc exp show --all-branches --only-changed --drop '.+prepared|model'
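For readers unfamiliar with regex-based column filtering, the sketch below shows what a pattern like the one above could do to a list of column names. It assumes the pattern is applied with `re.search` against each name; the actual matching semantics are defined in #7141 and are not shown in this PR. The column names are invented.

```python
import re


def drop_columns(columns, pattern):
    """Return the columns whose names do NOT match the regex pattern."""
    regex = re.compile(pattern)
    return [c for c in columns if not regex.search(c)]


# Made-up column names for illustration.
columns = ["data/prepared", "model.pkl", "src/train.py", "metrics.json"]
kept = drop_columns(columns, r".+prepared|model")
print(kept)
```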

daavoo · 2021-12-22T11:53:36Z

tests/func/experiments/test_show.py

+    data_dep = first(x for x in dvc.index.deps if "copy.py" in x.fspath)
+    data_hash = data_dep.hash_info.value[:7]


I didn't know how to do something like the ANY usage above for the type of assertions below.
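The assertions in question are not shown here, so the following is only one possible approach, and it applies when the expected value is compared by equality. A small matcher object with a custom `__eq__` can play the same role as `unittest.mock.ANY` while still constraining the value. `AnyShortHash` is a hypothetical helper, not part of DVC or its test suite, and the row contents are made up.

```python
import re
from unittest.mock import ANY


class AnyShortHash:
    """Compares equal to any 7-character lowercase hex string."""

    def __eq__(self, other):
        return isinstance(other, str) and re.fullmatch(r"[0-9a-f]{7}", other) is not None

    def __repr__(self):
        return "<AnyShortHash>"


# Made-up row, shaped like one line of a rendered table.
row = {"Experiment": "workspace", "Created": "Dec 22, 2021", "copy.py": "561f068"}
expected = {"Experiment": "workspace", "Created": ANY, "copy.py": AnyShortHash()}
matches = row == expected
print(matches)
```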

dberenbaum · 2021-12-22T17:57:21Z

Looks good so far!

Should deps come at the end (after params)? The noise would be less of an issue if they are on the right.

daavoo · 2021-12-22T19:33:00Z

> Looks good so far!
>
> Should deps come at the end (after params)? The noise would be less of an issue if they are on the right.

Done

dvc/repo/experiments/show.py

Use `repo.index.deps` to collect dependencies associated with each experiment.

dberenbaum · 2022-01-24T19:18:50Z

> Should deps come at the end (after params)? The noise would be less of an issue if they are on the right.

By the way, Studio has data files in between metrics and params, right? Do you know why this order was preferred, and what do you think? I like them at the end, but consistency with Studio makes sense.

daavoo · 2022-01-26T17:45:41Z

> By the way, Studio has data files in between metrics and params, right? Do you know why this order was preferred, and what do you think? I like them at the end, but consistency with Studio makes sense.

I don't know Studio's preferences. I think that in our case it makes sense to introduce them at the end, as they didn't exist before and the table can get noisy without the --only-changed flag.

dberenbaum · 2022-01-26T19:47:29Z

@daavoo It looks like the data columns are in a random order and it sometimes changes on repeated calls to exp show. Can we sort in alphabetical order or some other predictable way that will make sense to users?

jorgeorpinel · 2022-01-28T05:00:14Z

> or some other predictable way

Curious: what was the sorting in the end? I think the most natural would be as defined in dvc.yaml

daavoo · 2022-01-28T11:42:30Z

> or some other predictable way
>
> Curious: what was the sorting in the end? I think the most natural would be as defined in dvc.yaml

Alphabetical (it is what's currently used in Studio, afaik)
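A deterministic ordering along the lines discussed in this thread can be sketched in a few lines: metrics first, then params, then deps at the end, each group sorted alphabetically. The function name and the sample column names are assumptions for illustration; this is not the code merged in this PR or in #7312.

```python
def column_order(metrics, params, deps):
    """Order columns as metrics, params, deps; alphabetical within each group."""
    return sorted(metrics) + sorted(params) + sorted(deps)


# Sets have no guaranteed iteration order, so sorting makes the result stable.
header = column_order(
    metrics={"loss", "acc"},
    params={"lr", "epochs"},
    deps={"src/train.py", "data/prepared"},
)
print(header)
```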

daavoo changed the title ~~[WIP]: exp show: Include deps columns.~~ [WIP] exp show: Include deps columns. Dec 3, 2021

daavoo force-pushed the exp-show-deps branch from 9452c99 to 0d0b1be Compare December 21, 2021 22:03

daavoo changed the title ~~[WIP] exp show: Include deps columns.~~ exp show: Include deps columns. Dec 21, 2021

daavoo marked this pull request as ready for review December 21, 2021 22:26

daavoo requested a review from a team as a code owner December 21, 2021 22:26

daavoo requested a review from dtrifiro December 21, 2021 22:26

daavoo changed the title ~~exp show: Include deps columns.~~ [WIP] exp show: Include deps columns. Dec 21, 2021

daavoo requested review from dberenbaum, shcheklein and skshetry and removed request for skshetry December 21, 2021 22:37

daavoo mentioned this pull request Dec 21, 2021

Include deps columns in experiment table (dataset columns) iterative/vscode-dvc#1183

Closed

daavoo force-pushed the exp-show-deps branch 2 times, most recently from b96420c to 901f9fb Compare December 22, 2021 11:50

daavoo commented Dec 22, 2021

daavoo force-pushed the exp-show-deps branch 2 times, most recently from 21782db to ce64f63 Compare December 22, 2021 19:31

skshetry reviewed Dec 23, 2021

dvc/repo/experiments/show.py (outdated, resolved)

daavoo changed the title ~~[WIP] exp show: Include deps columns.~~ [WIP] exp show: Include deps and outs. Dec 23, 2021

daavoo force-pushed the exp-show-deps branch 3 times, most recently from 63c1eb6 to b11e30b Compare January 3, 2022 15:49

daavoo changed the title ~~[WIP] exp show: Include deps and outs.~~ exp show: Include deps and outs. Jan 3, 2022

dtrifiro previously approved these changes Jan 10, 2022

daavoo dismissed dtrifiro’s stale review via b26f8d6 January 19, 2022 14:31

daavoo force-pushed the exp-show-deps branch 2 times, most recently from b26f8d6 to ea3821b Compare January 19, 2022 17:33

daavoo requested a review from skshetry January 19, 2022 18:07

dberenbaum previously approved these changes Jan 19, 2022

daavoo requested a review from dtrifiro January 20, 2022 19:06

daavoo mentioned this pull request Jan 20, 2022

dvctable: Support new color for data columns iterative/dvc.org#3202

Closed

daavoo self-assigned this Jan 20, 2022

daavoo dismissed dberenbaum’s stale review via 46afd1a January 21, 2022 16:52

daavoo force-pushed the exp-show-deps branch 2 times, most recently from 1d37e13 to ae16209 Compare January 21, 2022 16:56

daavoo added 2 commits January 21, 2022 22:14

exp show: Include deps columns.

5b5a603

Use `repo.index.deps` to collect dependencies associated with each experiment.

exp show: Include outs in --json output.

6027ffb

daavoo force-pushed the exp-show-deps branch from ae16209 to 6027ffb Compare January 21, 2022 21:15

dtrifiro approved these changes Jan 24, 2022

dberenbaum mentioned this pull request Jan 24, 2022

exp show: differentiate params, metrics, data #7304

Closed

daavoo mentioned this pull request Jan 26, 2022

Dvc exp show deps iterative/dvc.org#3220

Merged

daavoo merged commit 5e80fcc into main Jan 26, 2022

daavoo deleted the exp-show-deps branch January 26, 2022 17:45

daavoo added the labels A: experiments (Related to dvc exp), A: diff/show, enhancement (Enhances DVC) Jan 26, 2022

This was referenced Jan 27, 2022

Automate updating of exp show tables iterative/dvc.org#3219

Closed

exp show: Sort deps columns #7312

Closed
