Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

is it possible to establish node order explicitly in grid? #93

Closed
hamilton opened this issue Jul 18, 2022 · 8 comments
Closed

is it possible to establish node order explicitly in grid? #93

hamilton opened this issue Jul 18, 2022 · 8 comments

Comments

@hamilton
Copy link

hamilton commented Jul 18, 2022

Thank you for making such a useful library!

Had a question about ordering. Do you know of a way to force-set the order of nodes in a dag so that they appear in a user-defined way?

My use-case is in parsing large, complex SQL queries. These queries often have source tables & a number of CTEs that ultimately feed into a SELECT statement. So naturally, these are DAGs. Given that these types of queries are hundreds and hundreds of lines of SQL, I figured I'd make a minimap for our users that utilized d3-dag. To make this real, however, the order of the CTEs would have to be preserved since their position is usually part of the user's mental map of the query. I made a figma mock to demonstrate something approximately what I'm trying to do below.

Screen Shot 2022-07-17 at 6 21 22 PM

I could also see this being very useful for commit / branch representations in git.

After trying an example of this myself & reading hte docs, I'm skeptical that d3-dag could probably do that on its own, but figured I'd ask first. My example laid out the nodes optimally, but optimal isn't what is needed!

Screen Shot 2022-07-17 at 6 26 52 PM

@erikbrinkman
Copy link
Owner

If I understand correctly, it actually should be pretty easy to do, and part of a change I've been meaning to make for quite a while.

Currently, the order of the nodes in the "grid" layout is just defined by a topological sort, that breaks ties randomly:

d3-dag/src/grid/index.ts

Lines 158 to 159 in 37cc714

// topological sort
const ordered = [...dag.idescendants("before")];

However, there are many ways we could augment the topological sort. The most straightforward might be to use something like the rank operator for simplex layering. Essentially this allows you to specify a rank for any node in the graph, and the layout implicitly adds invisible edges from lower ranks to higher ranks.

If I understood you correctly, then you could assign each CTE with its rank in the initial query, and that should produce the layout you want. Am I understanding you correctly?

On a side note, grid returns pretty generic control points for the edges. If you want the edges to render like your figma I suggest you use the grid tweak found in my observable notebook, repeat here for preservation sake:

gridTweak = (layout) => (dag) => {
  // Tweak allows a basis interpolation to curve the lines
  // We essentially take the three point lines and make them five, with two points on either side of the bend
  const { width, height } = layout(dag);
  for (const { points } of dag.ilinks()) {
    const [first, middle, last] = points;
    if (last !== undefined) {
      points.splice(
        0,
        3,
        first,
        {
          x: middle.x + Math.sign(first.x - middle.x) * nodeRadius,
          y: middle.y
        },
        middle,
        { x: middle.x, y: middle.y + nodeRadius },
        last
      );
    }
  }
  return { width, height };
}

@hamilton
Copy link
Author

Yes, that's exactly what I'm hoping for – a way to pass in some kind of rank to enforce an ordering.

@erikbrinkman
Copy link
Owner

@hamilton I started throwing together an implementation and ran into a stumbling block that hopefully you can help clear up.

In the simplex layering, nodes can be assigned the same height, so leaving some nodes unconstrained (i.e. without a rank) is a natural way to structure it.

However for the grid layout, we're ultimately selecting between any valid topological sorting of the nodes. The current implementation just chooses one arbitrarily based on a depth first search. You're proposing one with rank constraints. You could also imagine one that attempted to minimize the length of all edges drawn (e.g. compactness).

For now, lets only focus on your case, assigning rank constraints that must be fulfilled. There are two possible ways to implement this.

  1. My initial idea, the easiest and most performant to implement would be to use the same visiting algorithm as specified, but take nodes with the lowest rank first. In order to guarantee that this preserves rank constraints, any unranked node must implicitly have rank -Infinity, e.g. we'll always pick unranked nodes before ranked nodes
  2. Treat rank constraints as faux edges. Like the way it's currently done, any ordering which is topologically sound, and conforms to the rank constraints will be arbitrarily chosen based on node ordering. There are two downsides to this approach.
    1. the only implementation I can think of has n^2 performance for adjacent (in sorted order ranks), e.g. if n nodes have rank 1, and k nodes have rank 2, and no nodes have a rank between 1 and 2, then the algorithm will add n * k complexity to the standard order.
    2. Because we're implicitly adding edges, we run the risk of creating cycles. The first method will essentially ignore those constraints, but this will have to throw an error.

Based on what you want to do, is there an obvious preference? In general 2 seems more robust overall, but I'm curious of 1 does the trick.

@hamilton
Copy link
Author

hamilton commented Jul 21, 2022

I see ~ thank you for taking the time to look into this! For my use-case, there will never be a situation where I'm not ranking the nodes (and there will never be ties), so the first option would suffice if I'm understanding correctly. This is because these subqueries & expressions within a big SQL query always have a line + column number where they appear, which should work for the ranking.

@erikbrinkman
Copy link
Owner

this is now available in v0.11.4, let me know how that works!

@hamilton
Copy link
Author

Awesome, thank you @erikbrinkman! I will give this a try.

@hamilton
Copy link
Author

hamilton commented Jul 27, 2022

Did a quick timeboxed prototype last night with the new rank operator. This is a fantastic feature and will solve my use case. I appreciate the thoughtfulness you put into development ~ thanks for implementing!

rill-developer-dag-prototype

@erikbrinkman
Copy link
Owner

That looks great! I always love seeing things come together! Glad this could be useful

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants