Research JavaScript wrapper #72

Open
daavoo opened this issue Jul 11, 2022 · 2 comments

Comments

@daavoo
Contributor

daavoo commented Jul 11, 2022

I now wonder whether this use case is another argument for dvc-render to have some JavaScript support. If there is some Python wrapper for JS, maybe we could leverage vega to produce images in non-linear template use cases. But I guess that would require some research to determine whether the idea is feasible.

Originally posted by @pared in #69 (review)

@shcheklein
Member

maybe we could leverage vega to produce images in non-linear template use cases

@pared could you please clarify (expand) a bit on this? :)

@pared
Contributor

pared commented Oct 20, 2022

@shcheklein sorry for not responding at the time; I thought I had answered, but clearly I hadn't. I guess the answer will be fuller now that top-level plots are implemented for vscode and the studio implementation is on the way.

Why do we have dvc-render? Initially, dvc plots produced an HTML file. Later, when we started building derivative projects, plots acquired additional requirements, for example splitting data and returning partially filled templates. As the requirements stacked up, we moved the dvc-render functionality out of DVC, so that it would not be tied to DVC alone and could be used by, for example, studio. To some extent it worked, though not as well as I originally presumed. Later, we also created the vscode extension, where dvc-render could not be used at all.

What is the main problem with my thinking? I assumed that "DVC" is the source of our plots, and that DVC should be responsible for parsing configs and data and passing the results to derivative projects. This assumption is wrong on a few levels.

  1. This approach results in delays in development:

    a. vscode bases its plots on what it gets from DVC, so we cannot implement top-level plots there without first implementing them in DVC and then adjusting vscode.

    b. studio already parses plots by itself, reusing parts of the DVC parsing code, but it still processes the ingredients for plots differently (for example, storing plot configs in a database and the data separately in S3). In this case top-level plots could be implemented there in parallel, but more people would have to be involved.

  2. We cannot create a single data processing pipeline that would fit all projects, simply because the UIs of those projects differ. Take error handling: DVC logs errors to stderr, studio needs to collect them and display them in the web UI, and vscode does nothing, since it's not implemented yet: plots: return error messages for failed plots dvc#7692

So why do I think we should have a JS dvc-render? Because both studio and vscode do some processing of data and plot configuration on the frontend, for example assigning different revision names and overriding colors for particular data series.

What should we be doing:

  1. Source the data from the plots.collect() method. It returns the actual source of truth: data, plot configs, and errors if we cannot get the data/configs. Of course, for vscode we need to serialize the errors.
  2. Let each project handle the data however it sees fit, especially the errors returned in the previous step. A project can, for example, do caching (studio already does, and it has already been said that data collection for vscode is too slow) and infer additional requirements based on the config and data, for example the coloring and dashing of particular data series.
  3. Use a common library (dvc-render-js) to display the plots. It would take care of ingesting the loaded data, parsing the config, extracting data series, and applying coloring if needed. Basically what dvc.render.convert.vega.VegaConverter and dvc-render try to do, but cannot do for all projects, as they cannot be used in the frontend.
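
To make steps 1 and 2 a bit more concrete, here is a rough sketch of what "collect, serialize errors, let each consumer decide" could look like. Everything in it is an assumption for illustration: `CollectedPlot`, `serialize_error`, `to_json`, and `failed_plots` are hypothetical names, and the nested dict is not the actual return shape of plots.collect().

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class CollectedPlot:
    """Hypothetical shape of one collected plot: data, config, and any error."""

    data: List[Dict[str, Any]] = field(default_factory=list)
    config: Dict[str, Any] = field(default_factory=dict)
    error: Optional[Exception] = None


def serialize_error(error: Optional[Exception]) -> Optional[Dict[str, str]]:
    """Turn an exception into a JSON-friendly dict, or None if there is none."""
    if error is None:
        return None
    return {"type": type(error).__name__, "msg": str(error)}


def to_json(collected: Dict[str, Dict[str, CollectedPlot]]) -> str:
    """Serialize {revision: {plot_id: CollectedPlot}} for a non-Python consumer."""
    payload = {
        rev: {
            plot_id: {
                "data": plot.data,
                "config": plot.config,
                "error": serialize_error(plot.error),
            }
            for plot_id, plot in plots.items()
        }
        for rev, plots in collected.items()
    }
    return json.dumps(payload, sort_keys=True)


def failed_plots(
    collected: Dict[str, Dict[str, CollectedPlot]]
) -> List[Tuple[str, str, str]]:
    """Consumer-side helper: list (revision, plot_id, message) for failures."""
    return [
        (rev, plot_id, str(plot.error))
        for rev, plots in collected.items()
        for plot_id, plot in plots.items()
        if plot.error is not None
    ]


# Made-up example input: one plot that loaded and one that failed.
collected = {
    "workspace": {
        "loss.csv": CollectedPlot(
            data=[{"step": 0, "loss": 0.9}, {"step": 1, "loss": 0.5}],
            config={"x": "step", "y": "loss"},
        ),
        "missing.csv": CollectedPlot(error=FileNotFoundError("missing.csv")),
    }
}
```

The point of the split is that `to_json` only transports what was collected, while each project decides what to do with the failures: a CLI could log the output of `failed_plots` to stderr, and a web UI could display it next to the affected plot.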

I should have seen this earlier, but I realized it only after getting involved in our other projects.
