Investigate GraphQL dedupe optimizations #592

stephenh · 2023-03-29T19:44:59Z

The difference between top-level queries:

query {
  bookReviews {
    book { author { name } }
  }
}

And subgraph queries:

query {
  authors {
    name
    books { bookReviews { ... } }
  }
}

Can be surprisingly large, because the top-level query repeats the same author { name } under every book review.

In particular, when returning the same data set, we've seen:

60% bigger payload (690 KB vs. 1.1 MB), and
70% more JSON fields (22k vs. 37k)

Which translates to more CPU time spent:

Re-invoking duplicate resolvers (the same author.name every time)
Re-invoking duplicate auth checks (the same author.name auth check multiple times)
JSON.stringify-ing the payload
gzipping the payload

Solutions like graphql-crunch help dedupe the data, but they only run after we've already called the resolvers, and so have to spend their own CPU time on the dedupe pass.
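
To make "dedupe after the fact" concrete, here is a minimal sketch of a post-hoc pass that walks an already-resolved payload and stores each `__typename` + `id` entity once. This is not graphql-crunch's actual algorithm or output format; the `normalize` function, the `$ref` marker, and the `entities` table are all hypothetical names for illustration.

```typescript
// Hypothetical sketch of a post-hoc dedupe pass. It is NOT graphql-crunch's
// algorithm or wire format; `normalize`, `$ref`, and `entities` are made up.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

interface Normalized {
  entities: Record<string, JsonObject>; // one entry per "__typename:id"
  result: Json; // the original shape, with entities replaced by { $ref }
}

// An "entity" here is any object carrying both __typename and id.
function isEntity(value: JsonObject): boolean {
  return (
    typeof value.__typename === "string" &&
    (typeof value.id === "string" || typeof value.id === "number")
  );
}

function normalize(payload: Json): Normalized {
  const entities: Record<string, JsonObject> = {};

  function walk(value: Json): Json {
    if (Array.isArray(value)) return value.map(walk);
    if (typeof value !== "object" || value === null) return value;

    // Normalize children first so nested entities are also deduped.
    const out: JsonObject = {};
    for (const [k, v] of Object.entries(value)) out[k] = walk(v);

    if (!isEntity(value)) return out;

    // Merge fields in case the same entity was selected with different fields.
    const key = `${value.__typename}:${value.id}`;
    entities[key] = { ...(entities[key] ?? {}), ...out };
    return { $ref: key };
  }

  return { entities, result: walk(payload) };
}
```

Note that this pass still visits every field of the full payload, which is the extra CPU cost described above: the duplicated data has already been resolved by the time it runs.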

Ideally we could lean into the GraphQL __typename + id convention for identities and a) invoke each resolver only once per entity, and b) produce a payload that is essentially the graphql-crunch output, built directly as we go instead of crunched after the fact.
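
As a rough sketch of point a), a field resolver could be wrapped so that it runs at most once per `__typename` + `id` within a request. Everything here is an assumption for illustration: `oncePerIdentity` and `requestCache` are hypothetical names, and the resolver signature is simplified to take only the parent (real GraphQL resolvers also receive args, context, and info, and any field arguments would need to be part of the cache key).

```typescript
// Hypothetical sketch: memoize a field resolver per entity identity.
// `oncePerIdentity` and `requestCache` are illustrative names, not an existing API.
type Entity = { __typename: string; id: string };
type Resolver<T extends Entity, R> = (parent: T) => R;

function oncePerIdentity<T extends Entity, R>(
  cache: Map<string, unknown>, // assumed to live for a single request
  fieldName: string,
  resolve: Resolver<T, R>,
): Resolver<T, R> {
  return (parent) => {
    const key = `${parent.__typename}:${parent.id}.${fieldName}`;
    // First call for this identity runs the real resolver; later calls reuse
    // the stored value (which may itself be a Promise for async resolvers).
    if (!cache.has(key)) cache.set(key, resolve(parent));
    return cache.get(key) as R;
  };
}

// Usage: three reviews pointing at the same author trigger one resolution.
type Author = Entity & { name: string };
const requestCache = new Map<string, unknown>();
let calls = 0;
const resolveName = oncePerIdentity(requestCache, "name", (author: Author) => {
  calls += 1; // stands in for the auth check plus the field resolution
  return author.name;
});

const author: Author = { __typename: "Author", id: "a:1", name: "Ann" };
const names = [author, author, author].map(resolveName);
```

This only covers the resolver and auth-check cost; producing the deduped payload directly (point b) would additionally need the executor to emit a reference instead of re-serializing the entity.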

stephenh · 2023-11-19T03:13:38Z

Closing as this isn't really related to Joist.

stephenh closed this as completed Nov 19, 2023
