
Optimise indirect path computations #1653

Merged: 3 commits into apollographql:main on Apr 1, 2022

Conversation

@pcmanus (Contributor) commented Mar 28, 2022:

This commit implements two optimizations for the computation of "indirect
paths" (think "following all possible key edges"):

  1. it adds some pre-computation of the edges-following-an-edge that filters
    out some trivially unnecessary options. Essentially, the precomputation is
    relatively cheap and avoids repeatedly considering (and ultimately
    eliminating) those unnecessary options.
  2. query planning already has a mechanism by which "the computation of
    indirect paths following a given path" is memoized to save recomputation.
    This wasn't done for composition validation, however, and this commit
    corrects that.

Those optimizations are particularly potent when composing a large number
of subgraphs with many entities spanning lots of subgraphs. On a fairly
large example (100+ subgraphs), this reduces composition time from over
4 minutes to less than 8 seconds (on my machine).



```diff
@@ -941,11 +1004,6 @@ function advancePathWithNonCollectingAndTypePreservingTransitions<TTrigger, V ex
   }
   excludedEdges = addEdgeExclusion(excludedEdges, edge);
-  if (isConditionExcluded(edge.conditions, excludedConditions)) {
```
@pcmanus (Contributor, Author) commented:

Note to reviewer: this condition hasn't been removed, just moved down. The reason I did that is that this condition is a tad more expensive to check than the following ones, and they can be swapped easily, so having that one later is probably a tad more efficient. In reality, this probably doesn't make much of a difference, especially after the other optimisations of this commit, but I had made that change early and figured "why change back now?".
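For illustration only (the names below are hypothetical, not the actual checks), the point is simply that independent, short-circuiting checks can be ordered cheapest first, so the pricier one only runs on edges that survive the others:

```ts
interface Edge {
  id: number;
  conditions?: string[];
}

// Hypothetical checks: only their relative cost matters for the argument.
const isExcludedEdge = (e: Edge, excluded: Set<number>) => excluded.has(e.id); // cheap
const isConditionExcluded = (e: Edge, excludedConditions: string[]) =>         // pricier
  (e.conditions ?? []).some((c) => excludedConditions.includes(c));

function usableEdges(
  edges: Edge[],
  excluded: Set<number>,
  excludedConditions: string[],
): Edge[] {
  // The cheap set lookup runs first; the condition scan only sees survivors.
  return edges.filter(
    (e) => !isExcludedEdge(e, excluded) && !isConditionExcluded(e, excludedConditions),
  );
}
```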

composition-js/src/validate.ts (outdated review comment, resolved)
```diff
@@ -387,7 +387,10 @@ export class GraphPath<TTrigger, RV extends Vertex = Vertex, TNullEdge extends n
    * The set of edges that may legally continue this path.
    */
   nextEdges(): readonly Edge[] {
-    return this.graph.outEdges(this.tail);
+    const tailEdge = this.edgeToTail;
+    return tailEdge
```
A reviewer (Contributor) commented:
If this is an instance of TNullEdge, is it guaranteed to be false here? I don't know how to interpret a type that extends null.

@pcmanus (Contributor, Author) replied:

My understanding/intent (and hopefully I didn't miss something) is that the only type extending null is null itself, so this is really a way of saying that tailEdge can only be one of an Edge, null, or undefined, with some specific instantiations making the null part guaranteed to never happen.

Here, we'll default to outEdges() if tailEdge is null, which is fine. That is, it's always safe to return outEdges() from this method, with nonTrivialFollowupEdges() being simply a potentially more efficient alternative. And null edges in a GraphPath are super rare (they can only happen during query planning, not composition validation, and only when the input query has some essentially useless type condition that we keep in the path only because there are some execution directives associated with it that we don't want to lose), so I really don't think it's worth doing extra work to try to optimise them in this case.

Anyway, I did add a comment that may help a bit with what is going on there.
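To illustrate the point about types extending null, here is a small standalone sketch (a hypothetical `Path` class, not the actual `GraphPath`): the only types assignable to `null` are `null` itself and `never`, so instantiating the parameter with `never` makes the null-edge case impossible at the type level:

```ts
interface Edge {
  label: string;
}

// `TNullEdge extends null` means the parameter can only be `null` or `never`.
class Path<TNullEdge extends null = never> {
  constructor(private readonly edges: readonly (Edge | TNullEdge)[]) {}

  // With TNullEdge = never, `Edge | TNullEdge` collapses to `Edge`, so callers
  // of that instantiation can never observe a null edge.
  lastEdge(): Edge | TNullEdge | undefined {
    return this.edges[this.edges.length - 1];
  }
}

// Query-planning-style paths may carry null edges...
const planningPath = new Path<null>([{ label: 'key' }, null]);

// ...while validation-style paths (the default instantiation) cannot:
const validationPath = new Path([{ label: 'key' }]);
// new Path([{ label: 'key' }, null]);  // type error: null is not assignable to Edge
```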

query-graphs-js/src/graphPath.ts (outdated review comment, resolved)
```ts
  // are created, if we have a key (with some conditions X) from B to C, then we are guaranteed to
  // also have a key (with the same conditions X) from A to C, and so it's that later key we
  // should be using in the first place. In other words, it's never better to do 2 hops rather than 1.
  return allFollowups.filter((followup) => followup.transition.kind !== 'KeyResolution' || !sameConditions(edge, followup));
```
A reviewer (Contributor) commented:
So if I'm reading this correctly (and going back to the diagram you showed us), if A->B uses key 'x', but A->C uses key 'y', a transition from B->C with key 'y' would not be filtered? Not saying we need to address that case for this optimization, but maybe we should have a ticket for it?

@pcmanus (Contributor, Author) replied on Apr 1, 2022:

> So if I'm reading this correctly (and going back to the diagram you showed us), if A->B uses key 'x', but A->C uses key 'y', a transition from B->C with key 'y' would not be filtered?

It's correct that it wouldn't be filtered by this method. But that case is also fairly different, because at the moment we add edges to the query graph without checking whether their condition is "satisfiable" (it's the algorithm that checks it later). That is, we may have a key 'y' between A and C but 'y' not be "reachable"/"collectable" from 'A'. Let's take an example:

```graphql
# subgraph A
type Query {
  t: T
}

type T {
  x: Int
}

# subgraph B
type T @key(fields: "x") {
  x: Int
  y: Int
}

# subgraph C
type T @key(fields: "y") {
  y: Int
  z: Int
}
```

In that example, the query graph will technically have the key edge "y" from A to C, but you cannot use it directly because A does not have "y". So to get "z", you genuinely have to go through B and then C. If anything, it's the A -> C edge that should be filtered.

And so the next question would be: why are we adding key edges (like that A -> C edge for key "y") to the query graph if the source subgraph doesn't have "y"? We "might" (I haven't thought about it deeply) be able to do something like that, but it's far from trivial, because checking whether a key "condition" can be satisfied can be pretty complex: it's trivial in the example above, but keys can have many fields, including nested ones, and some of those fields may well be external or have requires, etc., so checking the satisfiability of a key condition essentially means running the validation algorithm on that condition. Again, it might be possible to:

  1. create a query graph as we do today (so, with all key edges added regardless of the satisfiability of their condition).
  2. run some "optimisation" step on that query graph that looks at each key edge, checks if it is ever satisfiable (essentially running a version of the validation algorithm on its condition), and removes it if it isn't.

We could then run the validation algorithm/query planning on that now "optimised" query graph. But I'll note that:

  1. this would require quite a refactor of the current code. Definitely high effort.
  2. I doubt it would speed up composition validation, because I don't think we would save work in that case; we'd just split the work into more steps, but it would sum up to the same amount of work. That said ...
  3. ... it would likely be beneficial to query planning, where the "optimised" query graph might be used for computing many query plans. How much it would help in practice is unclear, however.

> maybe we should have a ticket for it?

Good question. The tl;dr of what I'm saying above is that, yes, we can probably optimise the handling of keys further, but I don't think doing so is a direct extension of the optimisations done in this PR. And the idea I do have is 1) not that fleshed out and 2) probably medium-to-high effort with unclear reward at the moment, so I don't see us prioritising such a thing any time soon. I also don't know what our policy is on whether we want to log tickets in such cases.
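Purely to make the two-step idea above concrete, here is a rough sketch with entirely hypothetical names, where a toy field-presence check stands in for the real (much more involved) satisfiability check described in the reply:

```ts
interface KeyEdge {
  from: string;         // subgraph the edge leaves from
  to: string;           // subgraph the key jumps to
  keyFields: string[];  // fields of the @key condition, e.g. ['y']
}

interface QueryGraph {
  keyEdges: KeyEdge[];
  // Toy stand-in for "which fields of T are collectable in each subgraph";
  // real conditions can be nested, @external, behind @requires, etc.
  fieldsAvailableIn: Record<string, Set<string>>;
}

// Step 2 of the idea: an optimisation pass over the already-built query graph
// that drops key edges whose condition can never be satisfied from their tail.
function pruneUnsatisfiableKeyEdges(graph: QueryGraph): QueryGraph {
  return {
    ...graph,
    keyEdges: graph.keyEdges.filter((edge) => {
      const available = graph.fieldsAvailableIn[edge.from] ?? new Set<string>();
      return edge.keyFields.every((f) => available.has(f));
    }),
  };
}

// With the subgraphs from the example above: A only has 'x', so the A->C key
// edge on 'y' is pruned, while A->B on 'x' and B->C on 'y' are kept.
const pruned = pruneUnsatisfiableKeyEdges({
  keyEdges: [
    { from: 'A', to: 'B', keyFields: ['x'] },
    { from: 'A', to: 'C', keyFields: ['y'] },
    { from: 'B', to: 'C', keyFields: ['y'] },
  ],
  fieldsAvailableIn: {
    A: new Set(['x']),
    B: new Set(['x', 'y']),
    C: new Set(['y', 'z']),
  },
});
```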

```diff
@@ -929,7 +995,7 @@ function advancePathWithNonCollectingAndTypePreservingTransitions<TTrigger, V ex
   const toAdvance = popMin(toTry);
   const nextEdges = toAdvance.nextEdges().filter(e => !e.transition.collectOperationElements);
   if (nextEdges.length === 0) {
-    debug.log(`Nothing to try for ${toAdvance}: it has no non-collecting outbound edges`);
+    debug.log(() => `Nothing to try for ${toAdvance}: it has no non-collecting outbound edges`);
```
A reviewer (Contributor) commented:
Any particular reason for this change? It seems like the debug.log just calls the function immediately unless I'm missing something.

@pcmanus (Contributor, Author) replied:

Oh yeah, it's an important change :) (not kidding). debug.log does not call the function immediately; it first checks whether debug logging is enabled. Concretely, this change ensures that the string interpolation (the underlying call to toAdvance.toString()) is only done when debug is enabled. Because this is a hot path, that matters, and that toString() was showing prominently when profiling. That single change shaved a few additional seconds on top of the other optimisations of this patch, on the (admittedly pretty large) composition validation I was testing this with.
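A minimal sketch of the pattern (a hypothetical `DebugLogger`, not the actual debug utility): accepting a closure means the message, and therefore the `toString()`/interpolation work, is only built when the level is actually enabled:

```ts
class DebugLogger {
  constructor(private readonly enabled: boolean) {}

  // Accepts either a ready-made string or a closure producing one.
  log(message: string | (() => string)): void {
    if (!this.enabled) {
      return; // closure never invoked: no interpolation or toString() cost
    }
    console.log(typeof message === 'function' ? message() : message);
  }
}

const debug = new DebugLogger(false);
const toAdvance = { toString: () => 'a long, expensive-to-build description' };

// Eager: the template string (and toAdvance.toString()) is evaluated even
// though debug logging is disabled.
debug.log(`Nothing to try for ${toAdvance}`);

// Lazy: the closure is only invoked when logging is enabled.
debug.log(() => `Nothing to try for ${toAdvance}`);
```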

Sylvain Lebresne added 3 commits April 1, 2022 10:13
@pcmanus merged commit b4e4fd4 into apollographql:main on Apr 1, 2022
pcmanus pushed a commit to pcmanus/federation that referenced this pull request Apr 4, 2022
The "edge following other edges"-precomputation optimization introduced
in apollographql#1653 introduced a regression in the composition composition code.

The problem is due to the fact that the optimization was assuming that
the validation/query planning algorithm would always find optimal paths,
but the algorithm was a bit too aggressive at avoid some edges given
that new optimization is in place.

Let's use the example this patch adds as a test to illustrate:
```graphql
type Query {
  t: T
}

type T @key(fields: "k1") {
  k1: Int
}
```
```graphql
type T @key(fields: "k2") @key(fields: "k1") {
  k1: Int
  k2: Int
}
```
```graphql
type T @key(fields: "k2") {
  k2: Int
  v: Int
}
```
When the algorithm is on `T` in `A` and looks for `v`, it tries to find
where keys can lead it. In doing so, it:
1. looked first at key `k2` to B and checked if it was usable. And it is,
   provided we first use key `k1` to get `k2` from B. It's inefficient
   but possible (it's arguably dumb, but it's an algorithm, it has no
   shame).
2. then looked at key `k1` to B. Now, it actually ignored that edge
   because it already had a path to B (the one from the previous point).
   That's because the validation algorithm doesn't care one bit about
   efficiency, and that is not the issue. The issue is that the
   algorithm was also marking that edge "excluded" for the remainder of
   the "current loop".
3. it looked at `k2` to C, which is actually the edge we want to use. But
   unfortunately, to use it, we first need to get `k2` from B using `k1`,
   and as mentioned in the previous point, we had mistakenly excluded
   that edge, so it was considered a dead end when it shouldn't have been.
4. at that point the algorithm was actually looking back at the path
   it found in point 1, which got it to B, and checked if it could get
   to C from there. That's where the algorithm "used to" make progress
   by, at this point, using the `k2` edge from B to C. But the
   optimization from apollographql#1653 made us ignore that edge, on the
   grounds that we should never have to use it.

Note that the optimization of apollographql#1653 is not really to blame: the true
problem is that in point 3 above, we're not able to use the `k2` edge
from A to C due to an incorrect exclusion. It just happens that
before apollographql#1653, the algorithm was able to get back on its feet by using
an inefficient option it shouldn't have had to use.

Long story short, the patch fixes the exclusion issue.
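To make the exclusion problem more concrete, here is a deliberately generic sketch (hypothetical names, not the actual algorithm or the actual fix): in the buggy shape, an exclusion added while skipping a "redundant" edge leaks into the condition checks of later edges, while the fixed shape keeps that exclusion out of those checks:

```ts
// Toy model: an edge may need another edge to satisfy its condition.
interface Edge {
  name: string;
  requires?: string; // name of an edge needed to satisfy this edge's condition
}

// Stand-in for the recursive condition check: the required edge must not have
// been excluded from consideration.
function conditionSatisfiable(edge: Edge, excluded: ReadonlySet<string>): boolean {
  return edge.requires === undefined || !excluded.has(edge.requires);
}

// Buggy shape: skipping a "redundant" edge adds it to a shared exclusion set
// that persists for the rest of the loop, so a later edge whose condition
// needs it (like the A->C 'k2' edge needing the 'k1' hop) looks like a dead end.
function advanceBuggy(edges: Edge[], isRedundant: (e: Edge) => boolean): Edge[] {
  const excluded = new Set<string>();
  const advanced: Edge[] = [];
  for (const edge of edges) {
    if (isRedundant(edge)) {
      excluded.add(edge.name); // leaks into later iterations
      continue;
    }
    if (conditionSatisfiable(edge, excluded)) {
      advanced.push(edge);
    }
  }
  return advanced;
}

// Fixed shape: skipping a redundant edge no longer poisons the exclusions used
// when checking the conditions of the remaining edges (exclusions that exist
// for other reasons, e.g. cycle avoidance, are omitted from this sketch).
function advanceFixed(edges: Edge[], isRedundant: (e: Edge) => boolean): Edge[] {
  return edges
    .filter((edge) => !isRedundant(edge))
    .filter((edge) => conditionSatisfiable(edge, new Set<string>()));
}
```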
pcmanus pushed a commit that referenced this pull request Apr 5, 2022
The "edge following other edges"-precomputation optimization introduced
in #1653 introduced a regression in the composition composition code.

The problem is due to the fact that the optimization was assuming that
the validation/query planning algorithm would always find optimal paths,
but the algorithm was a bit too aggressive at avoid some edges given
that new optimization is in place.

Let's use the example this patch uses as test to illustrate:
```graphql
type Query {
  t: T
}

type T @key(fields: "k1") {
  k1: Int
}
```
```graphql
type T @key(fields: "k2") @key(fields: "k1") {
  k1: Int
  k2: Int
}
```
```graphql
type T @key(fields: "k2") {
  k2: Int
  v: Int
}
```
When the algorithm is on `T` in `A` and looks for `v`, it tries to find
where keys can lead it. In doing so, it:
1. looked first at key `k2` to B and checked if it was usable. And it is,
   provided we first use key `k1` to get `k1` from B. It's inefficient
   but possibly (it's arguably dumb, but it's an algorithm, it has no
   shame).
2. then looked at key `k1` to B. Now, it actually ignored that edge
   because it already had a path to B (the one from the previous point).
   That's because the validation algorithm doesn't care one bit about
   efficiency, and that is not the issue. The issue is that the
   algorithm was also marking that edge "excluded" for remainder of
   the "current loop".
3. it looked at `k2` to C, which is actually edge we want to use. But
   unfortunately, to use it, we first need to get `k2` from B using `k1`
   and as mentioned in the previous point, we had mistakenly excluded
   that edge, so that edge was considered a dead end when it shouldn't.
4. at that point the algorithm was actually looking back at the path
   it found in point 1, which got it to B, and checked if it could get
   to C from there. That's where the algorithm "used to" make progress
   by, at this point, using the `k2` edge from B to C. But the
   optimization from #1653 made use ignore that edge, on account that
   we never should have to use it.

Note that the optimization of #1653 is not really to blame: the true
problem is that in point 3 above, we're not able to use the `k2` edge
from A to C due to an incorrect exclusion. It just happens that
before #1653, the algorithm was able to get back on its feet by using
an inefficient option it shouldn't have had to use.

Long story short, the patch fixes the exclusion issue.