Group together related commands in the graph visualization #229

krlmlr · 2018-02-04T06:44:46Z

Can visNetwork visually group related commands (manually specified by the user) in a subgraph-like setting?

wlandau · 2018-02-14T22:20:08Z

How should we choose the groupings? My intuition tells me that graph theory has a straightforward answer somewhere.

wlandau · 2018-02-19T20:21:01Z

#233 might preserve the patterns that expanded commands came from, which would help with grouping related commands here.

wlandau · 2018-02-23T15:22:23Z

Could features like this one bud into their own drake-focused visualization package? I believe drake should natively support basic network visualizations, but the possibilities are endless, and the code base will likely be long, complicated and difficult to test.

krlmlr · 2018-02-23T20:39:08Z

I was looking only for manual grouping, perhaps with a new column in the plan data frame?

wlandau · 2018-02-23T20:49:50Z

That sounds much easier.

wlandau · 2018-02-23T21:01:16Z

On the other hand, I have tried and failed to micromanage the vertical ordering of the nodes. Maybe it's because of the directed/leveled positioning and default Sugiyama igraph layout in render_drake_graph(), but from what I recall from early development, I actually doubt this feature will turn out well as long as we are using visNetwork. Here is where I think ggraph could help us. I'm not exactly sure about the implementation, but it should be straightforward with the output of dataframes_graph(). With ggraph, we lose interactivity, but there is a lot to gain in return.

krlmlr · 2018-02-23T21:12:20Z

I saw that vis.js can do clustering, but I'm not sure if it helps. What does the ggraph output for the basic example look like?

wlandau · 2018-02-23T21:16:45Z

Not sure yet, but eager to finally try out ggraph!

wlandau · 2018-02-26T20:57:24Z

It looks like ggraph may not have clustering, but I will search harder. visNetwork has visClusteringByGroup, though I am having trouble making more than one cluster at a time.

library(drake)
library(visNetwork)
con <- load_basic_example()
df <- dataframes_graph(con)
df$nodes
df$nodes$group <- paste0(df$nodes$status, "_", df$nodes$type)
g <- render_drake_graph(df)
visGroups(g, groupname = "imported_function") %>%
  visGroups(groupname = "outdated_object") %>%
  visClusteringByGroup(
    groups = c("imported_function", "outdated_object"))

krlmlr · 2018-02-28T13:25:36Z

Works for me with a variant of the ?visGroups example from visNetwork:

library(visNetwork)
nodes <- data.frame(id = 1:10, label = paste("Label", 1:10),
  group = sample(c("A", "B"), 10, replace = TRUE))
edges <- data.frame(from = c(2,5,10), to = c(1,2,10))

visNetwork(nodes, edges) %>%
  visLegend() %>%
  visGroups(groupname = "A", color = "red", shape = "database") %>%
  visGroups(groupname = "B", color = "yellow", shape = "triangle") %>%
  visClusteringByGroup(c("A", "B"))

wlandau · 2018-02-28T15:37:20Z

Thanks, Kirill! Is this the kind of clustering you were imagining? Do you think it would be enough to list all the target names in the cluster, maybe with the label argument of visClusteringByGroup()?

krlmlr · 2018-02-28T15:43:19Z

For a collapsed cluster I'd rather only see its label and not the detailed target names in the cluster. I haven't thought about clustering in interactive visualizations, but this does look useful. For graphviz-based renderers we can use a similar logic for specifying the groups, even if the display will look different (see first post).

AlexAxthelm · 2018-03-11T23:09:12Z

As a side note (after seeing the unconf thread): it might be worth searching the global environment for anything that looks like a drake plan (a tibble with the correct column names would be a good start) and using that as a rough basis for clustering. I know that I usually have something along the lines of

data_plan <- drake_plan({importing data})
cleaning_plan <- drake_plan({cleaning functions})
analysis_plan <- drake_plan({analysis_functions})
reporting_plan <- drake_plan({reporting functions})
master_plan <- bind_rows(data_plan, cleaning_plan, analysis_plan, reporting_plan)
make(master_plan)

If I could see a simplified graph with 4 target-ish objects, so that I can easily tell how long importing takes or where in the plan the make failed, I would be happy. Maybe expanding/collapsing sub-plans that didn't get touched yet, or that were made successfully? This could be a non-default behavior, but if I have 1000+ targets that are all at least similar, it would be nice (and improve render time) if I didn't have to see them all.

wlandau · 2018-03-12T03:05:59Z

I like the general idea. Subplans define the natural clustering that I see most people using. People typically combine their plans with bind_rows() or similar.

bind_rows(data_plan, cleaning_plan, analysis_plan, reporting_plan)

What about bind_plans()?

master_plan <- bind_plans(
  data = data_plan,
  cleaning = cleaning_plan,
  analysis = analysis_plan,
  reporting = reporting_plan
)

bind_plans() would add an extra cluster or subplan column, where the names of the clusters would respect the argument names you provide. Eventually, we could even designate different future resource types to different subplans (re: #169).
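A minimal sketch of that idea in base R, under stated assumptions: `bind_named_plans()` and the `subplan` column name are hypothetical stand-ins for illustration, not drake's actual bind_plans() interface, and the plans are plain data frames with `target` and `command` columns.

```r
# Hypothetical helper (not part of drake): bind named sub-plans row-wise and
# record each row's origin in a `subplan` column taken from the argument names.
bind_named_plans <- function(...) {
  plans <- list(...)
  stopifnot(
    length(plans) > 0,
    !is.null(names(plans)),
    all(names(plans) != "")
  )
  labeled <- Map(
    function(plan, name) {
      plan$subplan <- rep(name, nrow(plan))
      plan
    },
    plans,
    names(plans)
  )
  out <- do.call(rbind, labeled)
  rownames(out) <- NULL
  out
}

# Toy sub-plans for illustration only.
data_plan <- data.frame(
  target = c("x", "y"),
  command = c("seq(1, 10)", "seq(10, 1)"),
  stringsAsFactors = FALSE
)
cleaning_plan <- data.frame(
  target = "x_clean",
  command = "as.numeric(x)",
  stringsAsFactors = FALSE
)

master_plan <- bind_named_plans(data = data_plan, cleaning = cleaning_plan)
master_plan$subplan
```

Because the cluster label lives in an ordinary column, a renderer could read it without knowing how the plan was assembled.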

I hesitate to search the user's environment for (sub)plans because it seems a bit mysterious.

AlexAxthelm · 2018-03-12T03:15:32Z

I like the idea to explicitly label subplans. Having that be a separate column would also open the door to multiple levels of grouping.

rkrug · 2018-03-12T07:33:49Z

Also, when using bind_plans() one could add the name of the subplan as a prefix into the target name which would make it much easier to deal with duplicate target names in different sub-plans. This could even be disabled via an argument if not wished.
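A rough sketch of the prefixing idea, assuming plans are plain data frames with `target` and `command` columns. `prefix_targets()` and its `prefix` switch are hypothetical names. Note that the sketch leaves commands untouched, so a real implementation would also need to rewrite references to the renamed targets inside commands.

```r
# Hypothetical helper: prepend the sub-plan name to every target name.
# Set prefix = FALSE to leave the plan unchanged.
prefix_targets <- function(plan, subplan, sep = "_", prefix = TRUE) {
  if (prefix) {
    plan$target <- paste(subplan, plan$target, sep = sep)
  }
  plan
}

# Two toy sub-plans that both define a target called "summary".
data_plan <- data.frame(
  target = c("raw", "summary"),
  command = c("1:10", "mean(1:10)"),
  stringsAsFactors = FALSE
)
analysis_plan <- data.frame(
  target = c("fit", "summary"),
  command = c("sqrt(16)", "max(1:10)"),
  stringsAsFactors = FALSE
)

combined <- rbind(
  prefix_targets(data_plan, "data"),
  prefix_targets(analysis_plan, "analysis")
)
combined$target
anyDuplicated(combined$target)  # 0 means no duplicate target names remain
```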

AlexAxthelm · 2018-03-12T13:24:58Z

To expand on my idea from above, now that I'm in front of a real keyboard, the idea would be to have multiple levels of grouping, along the lines of:

library(tibble)   # tribble()
library(magrittr) # %>%
plan <- tribble(
  ~target,   ~command,        ~group,
  "x",       "seq(1, 10)",    "import",
  "y",       "seq(10, 1)",    "import",
  "x_clean", "as.numeric(x)", "cleaning",
  "y_clean", "as.numeric(y)", "cleaning",
  "z",       "y + 10",        "analysis",
  "y_lm",    "lm(x ~ y)",     c("analysis", "linear"),
  "z_lm",    "lm(x ~ z)",     c("analysis", "linear"),
  "y_glm",   "glm(x ~ y)",    c("analysis", "general"),
  "z_glm",   "glm(x ~ z)",    c("analysis", "general")
) %>% print()
# A tibble: 9 x 3
#  target  command       group    
#  <chr>   <chr>         <list>   
#1 x       seq(1, 10)    <chr [1]>
#2 y       seq(10, 1)    <chr [1]>
#3 x_clean as.numeric(x) <chr [1]>
#4 y_clean as.numeric(y) <chr [1]>
#5 z       y + 10        <chr [1]>
#6 y_lm    lm(x ~ y)     <chr [2]>
#7 z_lm    lm(x ~ z)     <chr [2]>
#8 y_glm   glm(x ~ y)    <chr [2]>
#9 z_glm   glm(x ~ z)    <chr [2]>

so that if, for example, z_glm failed to build, the build graph would show the "import", "cleaning", and "linear" groups as groups, but expand the "general" group, so that I could see the failed object.
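One way to sketch that rule in base R: collapse every group except those that contain a failed target. The `nodes` data frame, its column names, and the status strings below are assumptions made for illustration, not drake's actual graph data.

```r
# Toy node table; one group label per target for simplicity.
nodes <- data.frame(
  id = c("x", "y", "x_clean", "y_lm", "y_glm", "z_glm"),
  group = c("import", "import", "cleaning", "linear", "general", "general"),
  status = c("up to date", "up to date", "up to date",
             "up to date", "up to date", "failed"),
  stringsAsFactors = FALSE
)

# Groups that contain at least one failed target stay expanded.
expand <- unique(nodes$group[nodes$status == "failed"])

# Every other group can be shown as a single collapsed node.
collapse <- setdiff(unique(nodes$group), expand)

expand
collapse
```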

The underlying assumption here is that plans contain targets that act similarly, so if I have many similar objects, I don't need to see the details about them unless something is wrong.

A loose sketch of what I'm thinking:

wlandau · 2018-03-13T04:17:05Z

Great ideas, Alex! It seems like we could implement them in drake itself even before #282 is implemented. If we do it cleanly, not much in dataframes_graph() or vis_drake_graph() would need to change. We could just take the clusters from the group column.

Permitting multiple groups (for example, c("analysis", "general") for z_glm) is the most complicated thing. Off the top of my head, I don't know if it makes sense for a pre-#282 implementation. I wonder if visNetwork supports clusters within clusters...

AlexAxthelm · 2018-03-13T13:01:10Z

It appears that clustering in visNetwork is still experimental. http://datastorm-open.github.io/visNetwork/more.html

I think trying the one-level clustering would be a good first step. My machine won’t boot right now, or I would play around with it myself.

wlandau · 2018-03-13T13:03:56Z

Sure, that sounds like a good plan for base drake. We can allow multiple groups in bind_plans() and then use the first group listed for each target. Separate tools can extend this to account for multiple groups.
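Taking the first group listed for each target could look roughly like this, assuming `group` is a list column of character vectors as in the tribble above. The `cluster` column name is a hypothetical choice.

```r
# Toy plan with a list column of groups (one or more labels per target).
plan <- data.frame(
  target = c("x", "y_lm", "z_glm"),
  stringsAsFactors = FALSE
)
plan$group <- list(
  "import",
  c("analysis", "linear"),
  c("analysis", "general")
)

# One-level clustering: keep only the first group listed for each target.
plan$cluster <- vapply(plan$group, function(g) g[[1]], character(1))
plan$cluster
```

The full `group` column is left in place, so separate tools could still use the remaining levels.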

wlandau · 2018-03-13T16:59:06Z

Re: ropensci/unconf18#12 (comment), clusters are related to expansions and subplans in the DSL. cc @dapperjapper.

wlandau · 2018-06-30T02:44:13Z

I plan to start work on this in a new drakevis package once I have time to work on it in earnest.

wlandau · 2018-07-06T02:24:50Z

The cleanest solution I know falls right out of #376 (comment). Keeping wildcard information after expansion/evaluation seems massively useful for #229 (comment).

wlandau · 2018-07-06T15:25:30Z

6edf816 exposes all columns from the plan in drake_graph_info()$nodes, which gives us flexibility: clusters can be subplans, wildcards, etc. visNetwork clustering may not work out (datastorm-open/visNetwork#254) but manual clustering should be straightforward.
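As a rough illustration of what manual clustering could involve, the sketch below collapses nodes that share a cluster label into a single node and rewires the edges accordingly. It works on toy `nodes` and `edges` data frames with character ids. `collapse_clusters()` and the `cluster` column are hypothetical names, not drake functions, and the sketch assumes cluster labels never collide with node ids.

```r
# Hypothetical helper: merge all nodes in the chosen clusters into one node
# per cluster, then drop within-cluster edges and duplicate edges.
collapse_clusters <- function(nodes, edges, clusters) {
  collapsed <- nodes$cluster %in% clusters
  # A collapsed node takes its cluster label as its new id.
  new_id <- ifelse(collapsed, nodes$cluster, nodes$id)
  lookup <- setNames(new_id, nodes$id)
  new_nodes <- data.frame(id = unique(new_id), stringsAsFactors = FALSE)
  new_edges <- data.frame(
    from = unname(lookup[edges$from]),
    to = unname(lookup[edges$to]),
    stringsAsFactors = FALSE
  )
  keep <- new_edges$from != new_edges$to
  new_edges <- unique(new_edges[keep, , drop = FALSE])
  rownames(new_edges) <- NULL
  list(nodes = new_nodes, edges = new_edges)
}

# Toy graph for illustration only.
nodes <- data.frame(
  id = c("x", "y", "x_clean", "y_clean", "fit"),
  cluster = c("import", "import", "cleaning", "cleaning", "analysis"),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  from = c("x", "y", "x_clean", "y_clean"),
  to = c("x_clean", "y_clean", "fit", "fit"),
  stringsAsFactors = FALSE
)

g <- collapse_clusters(nodes, edges, clusters = c("import", "cleaning"))
g$nodes
g$edges
```

The result keeps the `id`, `from`, and `to` columns that visNetwork() expects, so it could in principle be rendered directly.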

krlmlr changed the title ~~Related commands~~ Related commands in the graph visualization Feb 4, 2018

wlandau added the type: new feature label Feb 5, 2018

wlandau mentioned this issue Feb 11, 2018

Grouping nodes with visual cues datastorm-open/visNetwork#233

Closed

wlandau changed the title ~~Related commands in the graph visualization~~ Group together related commands in the graph visualization Feb 21, 2018

krlmlr mentioned this issue Feb 23, 2018

Static ggraph visualization #279

Closed

wlandau added the topic: visualization label Feb 24, 2018

wlandau mentioned this issue Feb 24, 2018

Offload all visualization to separate tools #282

Closed

wlandau mentioned this issue Mar 10, 2018

Improved visualization for drake ropensci/unconf18#12

Closed

wlandau added the difficulty: advanced label Mar 13, 2018

wlandau-lilly removed the difficulty: advanced label Mar 13, 2018

wlandau removed type: new feature status: priority labels Mar 20, 2018

wlandau added the depends: #282 🕒 label Jun 30, 2018

wlandau mentioned this issue Jul 6, 2018

Wildcard alternative to gather/reduce_plan #376

Closed

wlandau added depends: #376 🕓 and removed depends: #282 🕒 labels Jul 6, 2018

wlandau mentioned this issue Jul 6, 2018

Missing clusters in a hierarchical layout datastorm-open/visNetwork#254

Closed

This was referenced Jul 7, 2018

Group nodes into clusters in graph visualizations #463

Merged

Node clustering in graph visuals ropensci-books/drake#17

Closed

wlandau closed this as completed in a935960 Jul 7, 2018

wlandau mentioned this issue Jul 15, 2018

Multiple output files per command: complete implementation #469

Merged

7 tasks
