Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved layout for task graph in dashboard #3467

Open
MichaelSchreier opened this issue Feb 11, 2020 · 3 comments
Open

Improved layout for task graph in dashboard #3467

MichaelSchreier opened this issue Feb 11, 2020 · 3 comments

Comments

@MichaelSchreier
Copy link

Over the weekend I played around with improving the visuals of the task graph in the online dashboard and was wondering if you were interested in me refining this for a PR.

Some examples:

  • The custom graph from the documentation becomes
    dask_graph_2

  • The code

    from distributed import Client, LocalCluster
    import dask.array as da
    
    cluster = LocalCluster()
    client = Client(cluster)
    x = da.ones((1500, 1500), chunks=(300, 300))
    
    y = x + x.T
    
    client.compute(y)

    becomes
    dask_graph_1

  • A complex custom graph from my project at work transforms from this
    dask_custom_graph_old
    to this
    dask_custom_graph_new


To get these results I did the following

  • slightly tuned the colors to more eye friendly versions (I just noted that Visual suggestion for task graph #2711 exists) - loosely following this wikipedia visualization
  • replaced the straight edges with bezier curves
  • added arrow heads to the edges
  • removed the axes as they don't really add anything to the visualization
  • replaced the update_graph() method in graph_layout.py with a version leveraging on the Grandalf package to compute a nicer layout
    • my initial impression is that the overhead caused by proper layouting is not an issue for graphs not larger than few hundred nodes
    • for a final implementation one could use the old method as an automatic fallback in case of larger graphs or allow selecting the method via a config entry
    • dask already has graphviz as an optional dependency and it might make sense to use this instead to compute the node coordinates (though it's a heavier dependency, whereas Grandalf is self contained, pure Python)
    • it might also make sense to switch to a pure client side solution (e.g. based on d3.js or better yet something canvas or webgl Dashboard should use webgl to accomodate large task graphs #2976 based)

Any thoughts?

@mrocklin
Copy link
Member

Thanks @MichaelSchreier ! This looks really good.

for a final implementation one could use the old method as an automatic fallback in case of larger graphs or allow selecting the method via a config entry

Performance here is my main concern. Some workloads have thousands of nodes and they update their structure incrementally several times per second. This is hard both on the server (with things like layout) and on the client (browsers end up being a bottleneck when you have thousands of elements on the screen). Fun fact: we actually chose to use squares rather than circles because we found that it was easier for browsers to draw.

So we could do something like this, but we would need to make sure that we transitioned smoothly to the less-pretty and more efficient form when necessary.

I haven't yet taken a look at the implementation, but wanted to give you a quick response with high level feedback.

@MichaelSchreier
Copy link
Author

@mrocklin I haven't attached any code yet - so far it's a hacky sunday afternoon implementation...

If performance is the key factor here then a mostly browser based solution (layout & drawing) sounds like the best approach in the long run - moving all the heavy lifting to outside the critical parts of dask.

  • D3-dag for instance does everything needed, though svg based solutions typically perform poorly with many individual objects
  • cytoscape.js-dagre is canvas based and should scale much better

(both solutions provide the "Sugiyama-style" layouts best suited to dask graphs)

Still, layouting thousands of nodes does not come for free and I highly doubt that any solution could deliver the performance needed in that case.

I'm also not sure if mix-matching properly layouted graphs with the existing solution is really better than just letting the user decide to use one or the other method as desired. In my use case above for instance the graph does not change for two hours or so. That means time to layout the nodes is almost irrelevant.

If you consider custom graphs and small scale problems (where I see this bring the most benefit) to be too much of a niche application of dask then I'll skip the PR and just describe the changes required for folks to implement it themselves if needed - it's just a couple dozen lines of code after all.

I probably don't have the time to work on a full on browser based replacement as discussed above but could refine my approach as-is as an optional setting and submit a PR. Tough that could leave you with a code fragment that you do not want to support in the long run.

@mrocklin
Copy link
Member

Still, layouting thousands of nodes does not come for free and I highly doubt that any solution could deliver the performance needed in that case.

Yeah, the layout system we have in Dask, while not pretty, is actually decently performant. I would be a little surprised to see other systems beat it significantly.

I'm also not sure if mix-matching properly layouted graphs with the existing solution is really better than just letting the user decide to use one or the other method as desired

Unfortunately I think that most of our users will never be aware of this option. I agree that offering the option of control is good, but we also need to have sensible defaults that keep them out of trouble.

In my use case above for instance the graph does not change for two hours or so. That means time to layout the nodes is almost irrelevant

Understood. A lot of the design decisions we make are suboptimal, but robust. The only way I see this having an impact is if Dask chooses to use it when we have relatively few tasks and things change relatively infrequently. Otherwise I think that it's likely to break a non-trivial amount of user workloads.

However, there is another graph that we keep track of in the TaskGroups that is generally much smaller, and could probably use a layout engine closer to this. (cc @jrbourbeau ). This might be useful there.

If you consider custom graphs and small scale problems (where I see this bring the most benefit) to be too much of a niche application of dask then I'll skip the PR and just describe the changes required for folks to implement it themselves if needed - it's just a couple dozen lines of code after all.

It's not a niche use case. It's a very common case. But we probably can't optimize for it if it means breaking some other common cases.

Personally, I think that the way forward here is to find some way to have things automatically transition smoothly, or else to drop this for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants