plots: return error messages for failed plots #7692

shcheklein · 2022-05-04T05:15:09Z

Description and Motivation

If plots can't be processed, we log only a basic message and skip those plots in the JSON output:

$ dvc plots diff main workspace -o .dvc/tmp/plots --split --show-json -v --targets missclassified.jpg

DVC failed to load some plots for following revisions: 'workspace, main'.
{
  "missclassified.jpg": []
}

We need better results: granular messages about failed plots, so that we can show them properly in VS Code instead of silently ignoring them, see

It's related to these issues in the VS Code repo: iterative/vscode-dvc#2277 and iterative/vscode-dvc#1649. At a very high level, we need to distinguish absent plots from errors and show some signal to users, instead of silently ignoring things and/or showing misleading messages (e.g. a refresh button when there is nothing to refresh in an experiment).

Current Output

All examples are done for multiple revisions, --json + --split flags.

Single image

"eval/importance.png": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/example-repos-dev/example-get-started/build/example-get-started/dvc_plots/workspace_eval_importance.png"
    },
    {
      "type": "image",
      "revisions": [
        "c475deb7448319fab434d5650264dd2dd91bad43"
      ],
      "url": "/Users/ivan/Projects/example-repos-dev/example-get-started/build/example-get-started/dvc_plots/c475deb7448319fab434d5650264dd2dd91bad43_eval_importance.png"
    },
    {
      "type": "image",
      "revisions": [
        "7e4e86ca117f1bbef288f2abebfc7c97d0a9925d"
      ],
      "url": "/Users/ivan/Projects/example-repos-dev/example-get-started/build/example-get-started/dvc_plots/7e4e86ca117f1bbef288f2abebfc7c97d0a9925d_eval_importance.png"
    }
  ]
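For reference, a consumer of this output can index the image entries by revision. The following is a minimal sketch, assuming the JSON shape shown above; `image_urls_by_revision` is a hypothetical helper name, not part of DVC or the VS Code extension, and the sample paths and short revision are shortened placeholders.

```python
import json


def image_urls_by_revision(plots: dict) -> dict:
    """Map plot id -> {revision: url} for entries of type "image"."""
    result = {}
    for plot_id, entries in plots.items():
        for entry in entries:
            if entry.get("type") != "image":
                continue
            for rev in entry.get("revisions", []):
                result.setdefault(plot_id, {})[rev] = entry["url"]
    return result


# Shortened sample in the same shape as the output above (paths are placeholders).
sample = json.loads("""
{
  "eval/importance.png": [
    {"type": "image", "revisions": ["workspace"],
     "url": "dvc_plots/workspace_eval_importance.png"},
    {"type": "image", "revisions": ["c475deb"],
     "url": "dvc_plots/c475deb_eval_importance.png"}
  ]
}
""")
urls = image_urls_by_revision(sample)
```

Note that a plot returned with an empty list, like "missclassified.jpg" in the failing example, produces no entry at all here, which is the silent skip this issue is about.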

Flexible (top-level) plot

"dvc.yaml::Precision-Recall": [
    {
      "type": "vega",
      "revisions": [
        "7e4e86ca117f1bbef288f2abebfc7c97d0a9925d",
        "c475deb7448319fab434d5650264dd2dd91bad43",
        "workspace"
      ],
      "content": {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {
          "values": "<DVC_METRIC_DATA>"
        },
        "title": "dvc.yaml::Precision-Recall",
        "width": 300,
        "height": 300,
        "mark": {
          "type": "line",
          "point": true,
          "tooltip": {
            "content": "data"
          }
        },
        "encoding": {
          "x": {
            "field": "recall",
            "type": "quantitative",
            "title": "recall"
          },
          "y": {
            "field": "precision",
            "type": "quantitative",
            "title": "precision",
            "scale": {
              "zero": false
            }
          },
          "color": {
            "field": "rev",
            "type": "nominal"
          }
        }
      },
      "datapoints": {
        "workspace": [
          {
            "precision": 0.30321774445485783,
            "recall": 1.0,
            "threshold": 0.0,
            "dvc_data_version_info": {
              "revision": "workspace",
              "filename": "eval/prc/train.json",
              "field": "precision"
            }
          },
         {"...."},
         {
            "precision": 0.6694635900509439,
            "recall": 0.9359028068705488,
            "threshold": 0.20869278966952978,
            "dvc_data_version_info": {
              "revision": "workspace",
              "filename": "eval/prc/test.json",
              "field": "precision"
            }
          }

Multiple images

"mispredicted/croissant/muffin-16115-13825-26827-1d8e67e0bffdfebcdb3b337787823ab6.jpeg": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/hackathon/dvc_plots/workspace_mispredicted_croissant_muffin-16115-13825-26827-1d8e67e0bffdfebcdb3b337787823ab6.jpeg"
    }
  ],
  "mispredicted/muffin/croissant-0295ed7610487b3118febb5563bc58fd.jpg": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/hackathon/dvc_plots/workspace_mispredicted_muffin_croissant-0295ed7610487b3118febb5563bc58fd.jpg"
    }
  ],
  "mispredicted/muffin/croissant-3f488b602f2a668e-3fd6af132b0dceafe014dfcf7809d2ff.jpg": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/hackathon/dvc_plots/workspace_mispredicted_muffin_croissant-3f488b602f2a668e-3fd6af132b0dceafe014dfcf7809d2ff.jpg"
    }
  ],
  "mispredicted/muffin/dog-bec8602c36317744-1827c47f8ae15e6a3c4ee660035781a4.jpg": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/hackathon/dvc_plots/workspace_mispredicted_muffin_dog-bec8602c36317744-1827c47f8ae15e6a3c4ee660035781a4.jpg"
    }
  ]

Stage linear plot

"dvclive/scalars/eval/loss.tsv": [
    {
      "type": "vega",
      "revisions": [
        "d82452a"
      ],
      "content": {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {
          "values": "<DVC_METRIC_DATA>"
        },
        "title": "dvclive/scalars/eval/loss.tsv",
        "width": 300,
        "height": 300,
        "mark": {
          "type": "line",
          "point": true,
          "tooltip": {
            "content": "data"
          }
        },
        "encoding": {
          "x": {
            "field": "step",
            "type": "quantitative",
            "title": "step"
          },
          "y": {
            "field": "eval/loss",
            "type": "quantitative",
            "title": "eval/loss",
            "scale": {
              "zero": false
            }
          },
          "color": {
            "field": "rev",
            "type": "nominal"
          }
        }
      },
      "datapoints": {
        "d82452a": [
          {
            "timestamp": "1660180711394",
            "step": "0",
            "eval/loss": "2.4602549076080322",
            "dvc_data_version_info": {
              "revision": "d82452a",
              "filename": "dvclive/scalars/eval/loss.tsv",
              "field": "eval/loss"
            }
          },
          {
            "timestamp": "1660180723400",
            "step": "1",
            "eval/loss": "1.3761318922042847",
            "dvc_data_version_info": {
              "revision": "d82452a",
              "filename": "dvclive/scalars/eval/loss.tsv",
              "field": "eval/loss"
            }
          }
        ]
      }
    }
  ],

Unblocks, Related

iterative/vscode-dvc#2277
iterative/vscode-dvc#1649

Next Steps

A bit of research. The JSON structure looks extremely suboptimal (tons of duplication). Since we are changing it, I'd like to have a better understanding of how it's being used. Entry point into VS Code is here.
⌛ Try to add an error for a single image
Classify and suggest how to add errors in all other cases - including directories, regular plots (e.g. linear).


pared · 2022-05-04T10:36:51Z

What will be the error data used for? Do we only want to pass the error message to the user? Or will there be some logic involved with processing the errors on vs-code side?

mattseddon · 2022-05-04T11:21:50Z

We'll be processing the errors before displaying anything to the user. Would be good to have a way to identify where certain revisions are missing data due to errors (as in the example provided in #7691).
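With the current format, a consumer can only infer gaps indirectly, by comparing the revisions it asked for with the ones that came back. A minimal sketch of that comparison, assuming the output shape from the issue description (`revisions_without_data` is a hypothetical helper, not existing code):

```python
def revisions_without_data(plots: dict, requested: list) -> dict:
    """Map plot id -> requested revisions that have no entry in the output."""
    missing = {}
    for plot_id, entries in plots.items():
        present = {rev for entry in entries for rev in entry.get("revisions", [])}
        absent = [rev for rev in requested if rev not in present]
        if absent:
            missing[plot_id] = absent
    return missing


# The failing example from the description: the plot comes back with no entries.
output = {"missclassified.jpg": []}
missing = revisions_without_data(output, ["workspace", "main"])
```

The result says that data is missing, but not whether the plot is absent in that revision or failed to load, which is the distinction being asked for.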

efiop · 2022-07-19T20:52:08Z

@pared Any progress on this one?

pared · 2022-07-20T09:46:37Z

No, but I believe we could include it as a part of implementing iterative/vscode-dvc#1757

dberenbaum · 2022-08-02T20:57:57Z

@pared Any updates on looking into this one?

pared · 2022-08-08T10:39:43Z

I consider it part of the aforementioned vscode issue, but the estimate for vscode depends on research on the Studio side, which is not yet finished.

pared · 2022-09-13T12:16:41Z

Note to self: since returning errors will probably require a data structure change, we need to remember to get rid of filling rev in datapoints, as vscode sometimes needs to assign its own revision (e.g. 'main' vs. the short SHA of main).
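To illustrate the concern: if the consumer labels revisions itself, a `rev` value baked into each datapoint has to be overwritten anyway. A sketch, assuming datapoints are keyed by revision as in the examples in the description; the `rev` field on each point and the `label_datapoints` helper are assumptions for illustration only:

```python
def label_datapoints(datapoints: dict, labels: dict) -> list:
    """Flatten {revision: [points]} and set each point's "rev" to the consumer's label."""
    flat = []
    for revision, points in datapoints.items():
        label = labels.get(revision, revision)  # fall back to the key that was returned
        for point in points:
            flat.append({**point, "rev": label})
    return flat


datapoints = {"d82452a": [{"step": "0", "eval/loss": "2.46"}]}
labelled = label_datapoints(datapoints, {"d82452a": "main"})
```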

pared · 2022-09-30T14:47:56Z

I didn't leave any comment during research, so:
We were able to implement top-level plots based on the old data format. In order to support errors, we will need to change the data structure returned by dvc plots ... --json.

shcheklein · 2023-01-16T01:50:16Z

@mattseddon could you please share the location of the code that parses the --json result for plots on our end?

@dberenbaum do we know if anyone else besides vs code depends on --json?

⌛ A bit of research. The JSON structure looks extremely suboptimal (tons of duplication). Since we are changing it, I'd like to have a better understanding of how it's being used.
Another question to answer and agree on: how we process directories (if the whole plot dir can't be expanded and we don't know what files it has, we can't send an error per file; we'll have to send it per directory)
Change the output

mattseddon · 2023-01-16T01:54:19Z

Use https://github.com/iterative/vscode-dvc/blob/main/extension/src/plots/model/index.ts#L108 as an entry point.

shcheklein · 2023-01-16T02:28:41Z

I see that data collection depends on the datapoints field and is not using data in the content:

https://github.com/iterative/vscode-dvc/blob/main/extension/src/plots/model/collect.ts#L372
https://github.com/iterative/vscode-dvc/blob/main/extension/src/plots/model/collect.ts#L423

@mattseddon do you remember off the top of your head if we need the data in the template? It looks identical (at least in the sample I have). Do we need it in VS Code? And why did we decide to keep both (e.g. why don't we parse plot.content.data instead of datapoints)?

Are there some proposals, tickets, PRs for the plots JSON format?

mattseddon · 2023-01-16T04:11:01Z

I do not think that we need it.

mattseddon · 2023-01-16T09:05:51Z

> Are there some proposals, tickets, PRs for the plots JSON format?

The original PR is here: #7367. From reading the description it looks like the data being duplicated is a bug for the --split flag.

dberenbaum · 2023-01-16T15:42:30Z

> @dberenbaum do we know if anyone else besides vs code depends on --json?

No, I don't think so.

For the duplicated data, I'm missing something because I have different output from what @shcheklein shows above. I don't see all the data in content.data.values. My output for dvc plots diff 504206e f586d67 workspace -o .dvc/tmp/plots --split --json looks like this:

{
  "dvc.yaml::Accuracy": [
    {
      "type": "vega",
      "revisions": [
        "504206e",
        "f586d67",
        "workspace"
      ],
      "content": {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {
          "values": "<DVC_METRIC_DATA>" # Nothing else shows up in this field.
        },
...
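For context, with --split the template keeps the `<DVC_METRIC_DATA>` placeholder and the data travels separately in datapoints, so a consumer has to join the two before rendering. A rough sketch of that join, assuming the entry shape shown in the issue description (`fill_template` is a hypothetical name; this is not the VS Code extension's actual code):

```python
import copy

PLACEHOLDER = "<DVC_METRIC_DATA>"


def fill_template(entry: dict) -> dict:
    """Return a copy of entry["content"] with the placeholder replaced by all datapoints."""
    spec = copy.deepcopy(entry["content"])
    if spec.get("data", {}).get("values") == PLACEHOLDER:
        spec["data"]["values"] = [
            point
            for points in entry.get("datapoints", {}).values()
            for point in points
        ]
    return spec


# Trimmed-down entry; field values are made up for illustration.
entry = {
    "type": "vega",
    "revisions": ["workspace"],
    "content": {"data": {"values": PLACEHOLDER}, "mark": {"type": "line"}},
    "datapoints": {"workspace": [{"step": "0", "acc": "0.9"}]},
}
spec = fill_template(entry)
```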

shcheklein · 2023-01-17T02:23:08Z

@dberenbaum my bad, I didn't use split I think. I wonder what's the purpose of datapoints in the non-split mode then? (not critical I think at all, since JSON is not used anywhere now).

shcheklein · 2023-01-17T02:50:20Z

Updated the description with some examples of the current output. Next: try to add an error for an image plot (example-get-started's importance.png), not the directory-with-images case for now.

skshetry · 2023-01-20T04:26:17Z

In general, I find returning errors to be a mistake. It adds a lot of maintenance burden, for which we are not ready internally.

shcheklein · 2023-01-20T04:50:26Z

@skshetry it's a bad user experience to not show anything at all when something fails. I think if it's done right it won't be a big burden at all, and the code doesn't have to be complicated. We already have this data we just need to propagate it (I think so at least, I can be wrong). And to clarify, we aren't talking here about processing specific types of errors; we just need a signal that a plot exists in a revision and that it can't be loaded for some reason.

On the maintenance side, I think the whole plots logic and the related index part should be the first thing to improve. E.g. after the last refactoring we still have two plot accessors (_plots and plots), still some custom collect logic, a lot of logic with path manipulations, and old code (like output plots, etc.). Those are the points that should be removed, refactored, etc. to make it lighter and simpler.

skshetry · 2023-01-20T04:58:36Z

> We already have this data we just need to propagate it

That's where the complexity is, right? It's easy to log or suppress but extremely hard to propagate up. We need small sets of APIs at the high level where we do this. At the moment we are spreading this logic to deep layers which increases the burden.

I think there should be a symmetry between the product and the engineering side, and here I think the expectation on the product side is too high (or, was too high). :)

shcheklein · 2023-01-20T05:07:08Z

> That's where the complexity is, right?

Doesn't have to be. Sometimes dropping code that removes and/or transforms things, and exposing them directly instead (which might be just fine in this case), can simplify it. We'll see how it goes. I definitely want to avoid adding tons of custom code for this.

> I think there should be a symmetry between the product and the engineering side, and here I think the expectation on the product side is too high (or, was too high). :)

I think it's a false dichotomy in this case. I'm not sure if it's possible to do it now w/o complicating things. It definitely doesn't add much complexity to do this from scratch. If we had had the standard in mind (it's not high at all), we would have spent only a small additional percentage of time.

Product expectation: we are talking about VS Code, right (that's what I have in mind in the first place), not DVC? Just in case. I'm fine (more or less) with DVC returning a generic error (and writing something in the logs). In VS Code it leads to a bad experience. It's not top priority (that's why I'm doing this in the background), but it can and should be fixed. And we should have a higher standard for our products.

shcheklein · 2023-02-07T20:43:26Z

For visibility: got distracted by some other plots issues (broken smooth templates, new DVCLive release) and didn't have capacity for this hands on work (which is not a lot of time by default). I'll try to get back to this asap.

Some design decisions are tricky here. If we have a plot directory, we expand each file in that directory as its own plot when we return the result. That's fine. The problem is that we don't know the layout if we can't download the .dir in the first place. So for these granular plots we can't communicate errors at all: we don't know for sure whether they exist in the directory in the failed revision. We'll have to assume that they don't, I guess, and communicate that we were not able to process the whole directory.
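One way to picture the per-directory fallback: keep errors keyed by the path that failed, and treat any plot under a failed directory as affected. A sketch under that assumption; the error mapping, the error message, and `error_for_plot` are hypothetical and not an existing DVC structure:

```python
from pathlib import PurePosixPath
from typing import Optional


def error_for_plot(plot_path: str, failed: dict) -> Optional[str]:
    """Return the error for plot_path, or for its nearest failed parent directory."""
    path = PurePosixPath(plot_path)
    for candidate in (path, *path.parents):
        message = failed.get(str(candidate))
        if message is not None:
            return message
    return None


# Hypothetical errors collected for one revision: the whole directory failed.
failed = {"mispredicted": "failed to fetch directory listing"}
hit = error_for_plot("mispredicted/muffin/croissant-0295.jpg", failed)
miss = error_for_plot("eval/importance.png", failed)
```

With this lookup, a file-level plot known from another revision can still be marked as affected in the failed revision, even though its existence there is unknown.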

dberenbaum · 2023-02-08T13:18:27Z

@shcheklein Can you clarify the full scope of the issue? Is it only about plot directories, or is that merely one case you are trying to solve for?

shcheklein · 2023-02-08T16:59:10Z

Yes, @dberenbaum. It's related to these issues in the VS Code repo: iterative/vscode-dvc#2277 and iterative/vscode-dvc#1649. At a very high level, we need to distinguish absent plots from errors and show some signal to users, instead of silently ignoring things and/or showing misleading messages (e.g. a refresh button when there is nothing to refresh in an experiment).

> Can you clarify the full scope of the issue? Is it only about plot directories, or is that merely one case you are trying to solve for?

Thus, the full scope: show error messages for all plot definitions, not only directories / images.

dberenbaum · 2023-02-08T21:57:40Z

@skshetry Can you follow up with questions you have, and @shcheklein and I can respond to define the scope better? By next week when you are finished with support duty, let's try to have a solid plan and estimate 🙏 .

skshetry · 2023-02-14T15:16:16Z

I could not look into this during support duty, as some p0s/bugs came up.

dberenbaum · 2023-02-14T16:32:07Z

More related issues:

skshetry · 2023-02-21T12:33:57Z

We do seem to preserve errors during plots.collect(). We then transform the internal representation to the JSON format, where we lose most of the information. We could start with exposing that, what would be a good json format for incorporating errors for vscode?
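As a starting point for discussion, here is one possible shape, sketched in Python. Everything in it is an assumption: the per-revision `collected` structure, the `to_json` helper, and the `errors` list with `name`/`rev`/`type`/`msg` fields are hypothetical, not what plots.collect() returns or what DVC emits.

```python
def to_json(collected: dict) -> dict:
    """Split hypothetical collect results into renderable data and a flat error list."""
    data, errors = {}, []
    for rev, plots in collected.items():
        for name, result in plots.items():
            if isinstance(result, Exception):
                # Keep the failure instead of dropping the plot silently.
                errors.append({
                    "name": name,
                    "rev": rev,
                    "type": type(result).__name__,
                    "msg": str(result),
                })
            else:
                data.setdefault(name, {})[rev] = result
    return {"data": data, "errors": errors}


# Hypothetical internal results: one revision loaded, one failed.
collected = {
    "workspace": {"eval/importance.png": "workspace_eval_importance.png"},
    "main": {"eval/importance.png": FileNotFoundError("eval/importance.png")},
}
out = to_json(collected)
```

In a shape like this, a plot that is simply absent from a revision appears in neither place, so a consumer could tell "absent" from "failed to load".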

shcheklein · 2023-02-21T19:07:17Z

> We could start with exposing that, what would be a good json format for incorporating errors for vscode?

@skshetry 🤔 tbh I don't think VS Code requires anything specific here. We should come up with a decent general format for this data. We can adjust VS Code if needed.

dberenbaum · 2023-02-21T19:24:08Z

I think what we've learned is that it's helpful to share drafts early and often to get feedback as you go so we know mostly what works in both products by the time we are ready to merge.

shcheklein added the feature request, A: api, and A: plots labels May 4, 2022

shcheklein mentioned this issue May 4, 2022

Handle DVC errors gracefully in plots iterative/vscode-dvc#1649

Closed

2 tasks

mattseddon mentioned this issue May 4, 2022

Check all revision data to decide whether or not it is cached iterative/vscode-dvc#1648

Closed

pared assigned pared and unassigned pared May 4, 2022

shcheklein added the product: VSCode label May 4, 2022

efiop assigned pared Jul 19, 2022

mattseddon mentioned this issue Aug 16, 2022

Handle associated git repo having no commits iterative/vscode-dvc#2194

Closed

mattseddon mentioned this issue Aug 29, 2022

Image plots show proper message and disable button if image is not available iterative/vscode-dvc#2277

Closed

pared mentioned this issue Aug 29, 2022

Data trees: handle better broken DVC files iterative/vscode-dvc#2254

Closed

5 tasks

pared mentioned this issue Oct 20, 2022

Research JavaScript wrapper iterative/dvc-render#72

Open

dberenbaum unassigned pared Nov 18, 2022

shcheklein mentioned this issue Jan 7, 2023

perf: remove fs exists check in plots, parallel data collect #8777

Merged

2 tasks

dberenbaum mentioned this issue Feb 14, 2023

Show notification on a global plots failure iterative/vscode-dvc#3222

Closed

omesser assigned skshetry Mar 10, 2023

skshetry mentioned this issue Mar 15, 2023

plots: return errors in json format #9146

Merged

dberenbaum added the p3-nice-to-have and p1-important labels and removed the p3-nice-to-have label Mar 16, 2023

skshetry closed this as completed in #9146 Mar 27, 2023
