
Optimise indirect path computations #1653

Merged: 3 commits into apollographql:main on Apr 1, 2022

Conversation

@pcmanus (Contributor) commented Mar 28, 2022:

This commit implements two optimizations for the computation of "indirect
paths" (think "following all possible key edges"):

  1. it adds some pre-computation of the edges-following-an-edge that filters
    out some trivially unnecessary options. Essentially, the precomputation is
    relatively cheap and avoids repeatedly considering (and ultimately
    eliminating) those unnecessary options.
  2. query planning already has a mechanism by which "the computation of
    indirect paths following a given path" is memoized to save recomputation.
    This wasn't done for composition validation, however, and this commit
    corrects that.

Those optimizations are particularly potent when composing a large number
of subgraphs with many entities spanning lots of subgraphs. On a fairly
large example (100+ subgraphs), this reduces composition time from over
4 minutes to less than 8 seconds (on my machine).



```diff
@@ -941,11 +1004,6 @@ function advancePathWithNonCollectingAndTypePreservingTransitions<TTrigger, V ex
   }
   excludedEdges = addEdgeExclusion(excludedEdges, edge);
-  if (isConditionExcluded(edge.conditions, excludedConditions)) {
```
@pcmanus (Contributor, Author) commented:

Note to reviewer: this condition hasn't been removed, just moved down. The reason I did that is that this condition is a tad more expensive to check than the following ones, and they can be swapped easily, so having that one later is probably a tad more efficient. In reality, this probably doesn't make much of a difference, especially after the other optimisations of this commit, but I had made that change early and figured "why change back now?".
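For illustration only (the names below are hypothetical, not the actual checks), the point is simply that independent, short-circuiting checks can be ordered cheapest first, so the pricier one only runs on edges that survive the others:

```ts
interface Edge {
  id: number;
  conditions?: string[];
}

// Hypothetical checks: only their relative cost matters for the argument.
const isExcludedEdge = (e: Edge, excluded: Set<number>) => excluded.has(e.id); // cheap
const isConditionExcluded = (e: Edge, excludedConditions: string[]) =>         // pricier
  (e.conditions ?? []).some((c) => excludedConditions.includes(c));

function usableEdges(
  edges: Edge[],
  excluded: Set<number>,
  excludedConditions: string[],
): Edge[] {
  // The cheap set lookup runs first; the condition scan only sees survivors.
  return edges.filter(
    (e) => !isExcludedEdge(e, excluded) && !isConditionExcluded(e, excludedConditions),
  );
}
```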

composition-js/src/validate.ts (outdated review comment, resolved)
```diff
@@ -387,7 +387,10 @@ export class GraphPath<TTrigger, RV extends Vertex = Vertex, TNullEdge extends n
    * The set of edges that may legally continue this path.
    */
   nextEdges(): readonly Edge[] {
-    return this.graph.outEdges(this.tail);
+    const tailEdge = this.edgeToTail;
+    return tailEdge
```
A reviewer (Contributor) commented:
If this is an instance of TNullEdge, is it guaranteed to be false here? I don't know how to interpret a type that extends null.

@pcmanus (Contributor, Author) replied:

My understanding/intent (and hopefully I didn't miss something) is that the only type extending null is null itself, so this is really a way of saying that tailEdge can only be one of an Edge, null, or undefined, with some specific instantiations making the null part guaranteed to never happen.

Here, we'll default to outEdges() if tailEdge is null, which is fine. That is, it's always safe to return outEdges() from this method, with nonTrivialFollowupEdges() being simply a potentially more efficient alternative. And null edges in a GraphPath are super rare (they can only happen during query planning, not composition validation, and only when the input query has some essentially useless type condition that we keep in the path only because there are some execution directives associated with it that we don't want to lose), so I really don't think it's worth doing extra work to try to optimise them in this case.

Anyway, I did add a comment that may help a bit with what is going on there.
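To illustrate the point about types extending null, here is a small standalone sketch (a hypothetical `Path` class, not the actual `GraphPath`): the only types assignable to `null` are `null` itself and `never`, so instantiating the parameter with `never` makes the null-edge case impossible at the type level:

```ts
interface Edge {
  label: string;
}

// `TNullEdge extends null` means the parameter can only be `null` or `never`.
class Path<TNullEdge extends null = never> {
  constructor(private readonly edges: readonly (Edge | TNullEdge)[]) {}

  // With TNullEdge = never, `Edge | TNullEdge` collapses to `Edge`, so callers
  // of that instantiation can never observe a null edge.
  lastEdge(): Edge | TNullEdge | undefined {
    return this.edges[this.edges.length - 1];
  }
}

// Query-planning-style paths may carry null edges...
const planningPath = new Path<null>([{ label: 'key' }, null]);

// ...while validation-style paths (the default instantiation) cannot:
const validationPath = new Path([{ label: 'key' }]);
// new Path([{ label: 'key' }, null]);  // type error: null is not assignable to Edge
```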

query-graphs-js/src/graphPath.ts (outdated review comment, resolved)
```ts
  // are created, if we have a key (with some conditions X) from B to C, then we are guaranteed to
  // also have a key (with the same conditions X) from A to C, and so it's that later key we
  // should be using in the first place. In other words, it's never better to do 2 hops rather than 1.
  return allFollowups.filter((followup) => followup.transition.kind !== 'KeyResolution' || !sameConditions(edge, followup));
```
A reviewer (Contributor) commented:
So if I'm reading this correctly (and going back to the diagram you showed us), if A->B uses key 'x', but A->C uses key 'y', a transition from B->C with key 'y' would not be filtered? Not saying we need to address that case for this optimization, but maybe we should have a ticket for it?

@pcmanus (Contributor, Author) replied on Apr 1, 2022:

> So if I'm reading this correctly (and going back to the diagram you showed us), if A->B uses key 'x', but A->C uses key 'y', a transition from B->C with key 'y' would not be filtered?

It's correct that it wouldn't be filtered by this method. But that case is also fairly different, because at the moment we add edges to the query graph without checking whether their condition is "satisfiable" (it's the algorithm that checks it later). That is, we may have a key 'y' between A and C but 'y' not be "reachable"/"collectable" from 'A'. Let's take an example:

```graphql
# subgraph A
type Query {
  t: T
}

type T {
  x: Int
}

# subgraph B
type T @key(fields: "x") {
  x: Int
  y: Int
}

# subgraph C
type T @key(fields: "y") {
  y: Int
  z: Int
}
```

In that example, the query graph will technically have the key edge "y" from A to C, but you cannot use it directly because A does not have "y". So to get "z", you genuinely have to go through B and then C. If anything, it's the A -> C edge that should be filtered.

And so the next question would be: why are we adding key edges (like that A -> C edge for key "y") to the query graph if the source subgraph doesn't have "y"? We "might" (I haven't thought about it deeply) be able to do something like that, but it's far from trivial, because checking whether a key "condition" can be satisfied can be pretty complex: it's trivial in the example above, but keys can have many fields, including nested ones, and some of those fields may well be external or have requires, etc., so checking the satisfiability of a key condition essentially means running the validation algorithm on that condition. Again, it might be possible to:

  1. create a query graph as we do today (so, with all key edges added regardless of the satisfiability of their condition).
  2. run some "optimisation" step on that query graph that looks at each key edge, checks if it is ever satisfiable (essentially running a version of the validation algorithm on its condition), and removes it if it isn't.

We could then run the validation algorithm/query planning on that now "optimised" query graph. But I'll note that:

  1. this would require quite a refactor of the current code. Definitely high effort.
  2. I doubt it would speed up composition validation, because I don't think we would save work in that case; we'd just split the work into more steps, but it would sum up to the same amount of work. That said ...
  3. ... it would likely be beneficial to query planning, where the "optimised" query graph might be used for computing many query plans. How much it would help in practice is unclear, however.

> maybe we should have a ticket for it?

Good question. The tl;dr of what I'm saying above is that, yes, we can probably optimise the handling of keys further, but I don't think doing so is a direct extension of the optimisations done in this PR. And the idea I do have is 1) not that fleshed out and 2) probably medium-to-high effort with unclear reward at the moment, so I don't see us prioritising such a thing any time soon. I also don't know what our policy is on whether we want to log tickets in such cases.
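Purely to make the two-step idea above concrete, here is a rough sketch with entirely hypothetical names, where a toy field-presence check stands in for the real (much more involved) satisfiability check described in the reply:

```ts
interface KeyEdge {
  from: string;         // subgraph the edge leaves from
  to: string;           // subgraph the key jumps to
  keyFields: string[];  // fields of the @key condition, e.g. ['y']
}

interface QueryGraph {
  keyEdges: KeyEdge[];
  // Toy stand-in for "which fields of T are collectable in each subgraph";
  // real conditions can be nested, @external, behind @requires, etc.
  fieldsAvailableIn: Record<string, Set<string>>;
}

// Step 2 of the idea: an optimisation pass over the already-built query graph
// that drops key edges whose condition can never be satisfied from their tail.
function pruneUnsatisfiableKeyEdges(graph: QueryGraph): QueryGraph {
  return {
    ...graph,
    keyEdges: graph.keyEdges.filter((edge) => {
      const available = graph.fieldsAvailableIn[edge.from] ?? new Set<string>();
      return edge.keyFields.every((f) => available.has(f));
    }),
  };
}

// With the subgraphs from the example above: A only has 'x', so the A->C key
// edge on 'y' is pruned, while A->B on 'x' and B->C on 'y' are kept.
const pruned = pruneUnsatisfiableKeyEdges({
  keyEdges: [
    { from: 'A', to: 'B', keyFields: ['x'] },
    { from: 'A', to: 'C', keyFields: ['y'] },
    { from: 'B', to: 'C', keyFields: ['y'] },
  ],
  fieldsAvailableIn: {
    A: new Set(['x']),
    B: new Set(['x', 'y']),
    C: new Set(['y', 'z']),
  },
});
```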

```diff
@@ -929,7 +995,7 @@ function advancePathWithNonCollectingAndTypePreservingTransitions<TTrigger, V ex
   const toAdvance = popMin(toTry);
   const nextEdges = toAdvance.nextEdges().filter(e => !e.transition.collectOperationElements);
   if (nextEdges.length === 0) {
-    debug.log(`Nothing to try for ${toAdvance}: it has no non-collecting outbound edges`);
+    debug.log(() => `Nothing to try for ${toAdvance}: it has no non-collecting outbound edges`);
```
A reviewer (Contributor) commented:
Any particular reason for this change? It seems like the debug.log just calls the function immediately unless I'm missing something.

@pcmanus (Contributor, Author) replied:

Oh yeah, it's an important change :) (not kidding). debug.log does not call the function immediately; it first checks whether debug logging is enabled. Concretely, this change ensures that the string interpolation (the underlying call to toAdvance.toString()) is only done when debug is enabled. Because this is a hot path, that matters, and that toString() was showing prominently when profiling. That single change shaved a few additional seconds on top of the other optimisations of this patch, on the (admittedly pretty large) composition validation I was testing this with.
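A minimal sketch of the pattern (a hypothetical `DebugLogger`, not the actual debug utility): accepting a closure means the message, and therefore the `toString()`/interpolation work, is only built when the level is actually enabled:

```ts
class DebugLogger {
  constructor(private readonly enabled: boolean) {}

  // Accepts either a ready-made string or a closure producing one.
  log(message: string | (() => string)): void {
    if (!this.enabled) {
      return; // closure never invoked: no interpolation or toString() cost
    }
    console.log(typeof message === 'function' ? message() : message);
  }
}

const debug = new DebugLogger(false);
const toAdvance = { toString: () => 'a long, expensive-to-build description' };

// Eager: the template string (and toAdvance.toString()) is evaluated even
// though debug logging is disabled.
debug.log(`Nothing to try for ${toAdvance}`);

// Lazy: the closure is only invoked when logging is enabled.
debug.log(() => `Nothing to try for ${toAdvance}`);
```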

Sylvain Lebresne added 3 commits April 1, 2022 10:13
@pcmanus merged commit b4e4fd4 into apollographql:main on Apr 1, 2022
pcmanus pushed a commit to pcmanus/federation that referenced this pull request Apr 4, 2022
The "edge following other edges"-precomputation optimization introduced
in apollographql#1653 introduced a regression in the composition composition code.

The problem is due to the fact that the optimization was assuming that
the validation/query planning algorithm would always find optimal paths,
but the algorithm was a bit too aggressive at avoid some edges given
that new optimization is in place.

Let's use the example this patch adds as a test to illustrate:
```graphql
type Query {
  t: T
}

type T @key(fields: "k1") {
  k1: Int
}
```
```graphql
type T @key(fields: "k2") @key(fields: "k1") {
  k1: Int
  k2: Int
}
```
```graphql
type T @key(fields: "k2") {
  k2: Int
  v: Int
}
```
When the algorithm is on `T` in `A` and looks for `v`, it tries to find
where keys can lead it. In doing so, it:
1. looked first at key `k2` to B and checked if it was usable. And it is,
   provided we first use key `k1` to get `k2` from B. It's inefficient
   but possible (it's arguably dumb, but it's an algorithm, it has no
   shame).
2. then looked at key `k1` to B. Now, it actually ignored that edge
   because it already had a path to B (the one from the previous point).
   That's because the validation algorithm doesn't care one bit about
   efficiency, and that is not the issue. The issue is that the
   algorithm was also marking that edge "excluded" for the remainder of
   the "current loop".
3. it looked at `k2` to C, which is actually the edge we want to use. But
   unfortunately, to use it, we first need to get `k2` from B using `k1`,
   and as mentioned in the previous point, we had mistakenly excluded
   that edge, so it was considered a dead end when it shouldn't have been.
4. at that point the algorithm was actually looking back at the path
   it found in point 1, which got it to B, and checked if it could get
   to C from there. That's where the algorithm "used to" make progress
   by, at this point, using the `k2` edge from B to C. But the
   optimization from apollographql#1653 made us ignore that edge, on the
   grounds that we should never have to use it.

Note that the optimization of apollographql#1653 is not really to blame: the true
problem is that in point 3 above, we're not able to use the `k2` edge
from A to C due to an incorrect exclusion. It just happens that
before apollographql#1653, the algorithm was able to get back on its feet by using
an inefficient option it shouldn't have had to use.

Long story short, the patch fixes the exclusion issue.
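To make the exclusion problem more concrete, here is a deliberately generic sketch (hypothetical names, not the actual algorithm or the actual fix): in the buggy shape, an exclusion added while skipping a "redundant" edge leaks into the condition checks of later edges, while the fixed shape keeps that exclusion out of those checks:

```ts
// Toy model: an edge may need another edge to satisfy its condition.
interface Edge {
  name: string;
  requires?: string; // name of an edge needed to satisfy this edge's condition
}

// Stand-in for the recursive condition check: the required edge must not have
// been excluded from consideration.
function conditionSatisfiable(edge: Edge, excluded: ReadonlySet<string>): boolean {
  return edge.requires === undefined || !excluded.has(edge.requires);
}

// Buggy shape: skipping a "redundant" edge adds it to a shared exclusion set
// that persists for the rest of the loop, so a later edge whose condition
// needs it (like the A->C 'k2' edge needing the 'k1' hop) looks like a dead end.
function advanceBuggy(edges: Edge[], isRedundant: (e: Edge) => boolean): Edge[] {
  const excluded = new Set<string>();
  const advanced: Edge[] = [];
  for (const edge of edges) {
    if (isRedundant(edge)) {
      excluded.add(edge.name); // leaks into later iterations
      continue;
    }
    if (conditionSatisfiable(edge, excluded)) {
      advanced.push(edge);
    }
  }
  return advanced;
}

// Fixed shape: skipping a redundant edge no longer poisons the exclusions used
// when checking the conditions of the remaining edges (exclusions that exist
// for other reasons, e.g. cycle avoidance, are omitted from this sketch).
function advanceFixed(edges: Edge[], isRedundant: (e: Edge) => boolean): Edge[] {
  return edges
    .filter((edge) => !isRedundant(edge))
    .filter((edge) => conditionSatisfiable(edge, new Set<string>()));
}
```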
pcmanus pushed a commit that referenced this pull request Apr 5, 2022
The "edge following other edges"-precomputation optimization introduced
in #1653 introduced a regression in the composition composition code.

The problem is due to the fact that the optimization was assuming that
the validation/query planning algorithm would always find optimal paths,
but the algorithm was a bit too aggressive at avoid some edges given
that new optimization is in place.

Let's use the example this patch uses as test to illustrate:
```graphql
type Query {
  t: T
}

type T @key(fields: "k1") {
  k1: Int
}
```
```graphql
type T @key(fields: "k2") @key(fields: "k1") {
  k1: Int
  k2: Int
}
```
```graphql
type T @key(fields: "k2") {
  k2: Int
  v: Int
}
```
When the algorithm is on `T` in `A` and looks for `v`, it tries to find
where keys can lead it. In doing so, it:
1. looked first at key `k2` to B and checked if it was usable. And it is,
   provided we first use key `k1` to get `k1` from B. It's inefficient
   but possibly (it's arguably dumb, but it's an algorithm, it has no
   shame).
2. then looked at key `k1` to B. Now, it actually ignored that edge
   because it already had a path to B (the one from the previous point).
   That's because the validation algorithm doesn't care one bit about
   efficiency, and that is not the issue. The issue is that the
   algorithm was also marking that edge "excluded" for remainder of
   the "current loop".
3. it looked at `k2` to C, which is actually edge we want to use. But
   unfortunately, to use it, we first need to get `k2` from B using `k1`
   and as mentioned in the previous point, we had mistakenly excluded
   that edge, so that edge was considered a dead end when it shouldn't.
4. at that point the algorithm was actually looking back at the path
   it found in point 1, which got it to B, and checked if it could get
   to C from there. That's where the algorithm "used to" make progress
   by, at this point, using the `k2` edge from B to C. But the
   optimization from #1653 made use ignore that edge, on account that
   we never should have to use it.

Note that the optimization of #1653 is not really to blame: the true
problem is that in point 3 above, we're not able to use the `k2` edge
from A to C due to an incorrect exclusion. It just happens that
before #1653, the algorithm was able to get back on its feet by using
an inefficient option it shouldn't have had to use.

Long story short, the patch fixes the exclusion issue.