
Allow inserting named dataset so it could be referenced by name #106

Open
lucywang000 opened this issue Apr 3, 2020 · 2 comments

lucywang000 commented Apr 3, 2020

problem

When I plot a fairly large dataset with oz, the delay from sending the data to the browser with oz/view! until it is rendered can sometimes be 3~5 seconds, which is not good.

analysis

The time includes:

  • json/transit serialization on the server
  • network latency
  • json/transit deserialization on the client
  • canvas rendering

If we could cache the data on the client side, the first three costs could be avoided entirely.

possible solution

According to the vega docs, vega supports three types of data:

  1. inline JSON objects
  2. URLs pointing at data in a supported format, e.g. csv/json/etc.
  3. named data
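For reference, the three forms look roughly like this as spec fragments (a minimal sketch; the field names follow the Vega/Vega-Lite data docs, but the concrete file name and values are made up):

```javascript
// 1. Inline JSON values
const inlineSpec = {
  data: { values: [{ x: 1, y: 2 }, { x: 2, y: 3 }] },
};

// 2. A URL pointing at data in a supported format (csv/json/...)
const urlSpec = {
  data: { url: "data/points.csv", format: { type: "csv" } },
};

// 3. A named data source, to be filled in later through the view API
const namedSpec = {
  data: { name: "myData" },
};
```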

The first two work out of the box; the last one needs to be added through the vega view instance. For instance, the doc recommends doing this:

vegaEmbed('#vis', spec).then(res =>
  res.view
    .insert('myData', [
      /* some data array */
    ])
    .run()
);

I think it'd be great if oz could support named data sources. That would require:

  1. The vega view instance must be exposed somehow (this is also required by Ability to stream or update data in a viz #95)
  2. Websocket commands must be added to the server/client so the server can send data+name to the client.
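The second step could be sketched as a client-side message handler that feeds a named dataset into the current view. This is purely hypothetical: the command name ("insert-data") and message shape are not an existing Oz protocol, though `view.insert(name, tuples)` and `view.run()` are part of the real Vega View API:

```javascript
// Hypothetical websocket command handler (sketch, not Oz's actual protocol).
// Expects messages like { command: "insert-data", name: "myData", values: [...] }.
function handleMessage(view, message) {
  if (message.command === "insert-data") {
    // Vega View API: insert tuples into the named data source, then re-run
    // the dataflow so the change becomes visible.
    view.insert(message.name, message.values).run();
  }
  return view;
}
```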

@metasoarous Does this sound good to you? If so I can start work on a pr for it.

@lucywang000 (Author)

I played with it a bit, but it looks like the view instance is recreated each time the spec is changed, so the solution I proposed above may not work.

@metasoarous (Owner)

Hi @lucywang000. Thanks so much for submitting this issue, and sorry for taking a while to get back to you.

I'm very happy that you've been thinking about this problem! I have as well, though not in any focused way (occasional musing is perhaps a better description).

I think the really slick thing to do here would be to traverse the specs, find where the data elements are (keeping in mind that layers might have their own data, and that hiccup docs might have multiple specs), check whether they've changed since the last time the spec was updated, and do something smart that avoids resending the data (since what we're really talking about here is how to tweak the visualization without having to resend all the data). This may be super challenging to orchestrate, but perhaps not impossible, and it would make usage with larger datasets much nicer, as you suggest.
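The traversal idea above could be sketched roughly like this: walk a Vega-Lite-style spec, collect every data entry (top level and per-layer), and report which entries differ from the previous spec. The `data`/`layer` property names follow Vega-Lite, but the structural-diff approach itself is just one possible take, not Oz's implementation:

```javascript
// Collect all data entries in a spec, including those nested in layers.
function collectData(spec, found = []) {
  if (!spec || typeof spec !== "object") return found;
  if (spec.data) found.push(spec.data);
  for (const sub of spec.layer || []) collectData(sub, found);
  return found;
}

// Return the data entries of nextSpec that are not structurally identical
// to any data entry of prevSpec; only these would need to be resent.
function changedData(prevSpec, nextSpec) {
  const prev = collectData(prevSpec).map((d) => JSON.stringify(d));
  return collectData(nextSpec).filter((d) => !prev.includes(JSON.stringify(d)));
}
```

A real version would also have to handle hiccup docs containing multiple specs and the other nesting forms (facet, concat, etc.), but the shape of the problem is the same.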

Something a little less automated, and maybe closer to what you had in mind, would be to more explicitly allow datasets to be specified separately from the rest of the specification, in such a way that the data doesn't have to be resent when the visualizations update. This is closely related to what I had in mind for #9, but could maybe look different as well (as you mention, using a separate command for updating the data from the one for updating the view(s) of it).

A few things to suss out, though, before we spend too much time on this:

  • In these cases, is sending the data the bulk of the time or is it rerendering?
  • If the problem is render-time, would something like the web-gl renderer help us?

I think you're also right that using the view object won't work as that gets recreated. But if sending the data is the problem, not re-rendering, then I think it might be possible to store datasets in a separate r/atom, and feed them into the visualizations from there.
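The "separate store" idea might look something like the following in plain JS (the real Oz client would presumably use a reagent atom instead of a mutable object; the function names here are made up): datasets live in a cache keyed by name, and are spliced into named data slots just before (re)embedding, so a spec update never has to carry the data itself.

```javascript
// Hypothetical client-side dataset cache (sketch; Oz would use an r/atom).
const datasetCache = {};

// Called when the server pushes a dataset over the websocket.
function registerDataset(name, values) {
  datasetCache[name] = values;
}

// Before embedding, fill any named data slot from the cache, leaving the
// incoming spec untouched.
function resolveSpec(spec) {
  if (spec.data && spec.data.name && datasetCache[spec.data.name]) {
    return {
      ...spec,
      data: { name: spec.data.name, values: datasetCache[spec.data.name] },
    };
  }
  return spec;
}
```

Because the cache outlives any individual view instance, it would sidestep the problem that the view gets recreated on every spec change.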

For posterity's sake, this also seems to relate to #26 somewhat.

Please let me know if you have additional thoughts on this. Happy to bounce ideas around further. This isn't super high priority for me at the moment, but if you're still keen to crack this nut, I'd love the help!

Either way, thanks again for thinking about this!
