graphql-crunch

Optimizes JSON responses by minimizing duplication and improving compressibility.

On Banter.fm, we see a 76% reduction in raw JSON size and a 30% reduction in gzip'd size. This leads to reduced transfer time and faster JSON parsing on mobile.

Client support

graphql-crunch is client agnostic and can be used anywhere JSON is sent or received. We provide examples for integrating with apollo-client, since we use this library in a GraphQL environment.

Installation

This library is distributed on npm. In order to add it as a dependency, run the following command:

$ npm install graphql-crunch --save

or with Yarn:

$ yarn add graphql-crunch

How does it work?

We flatten the object hierarchy into an array using a post-order traversal of the object graph. As we traverse, we efficiently check whether we've seen a value before, including arrays and objects, and replace any repeat with a reference to its earlier occurrence. Values are only ever present in the array once.

Note: Crunching and uncrunching is an entirely lossless process. The final payload exactly matches the original.
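
As a quick illustration, here's a minimal sketch of the round trip (assuming Node.js, and that crunch is exported alongside uncrunch as used in the examples below):

import { crunch, uncrunch } from 'graphql-crunch';

const original = {
  luke: { name: 'Luke Skywalker', gender: 'male' },
  hero: { name: 'Luke Skywalker', gender: 'male' },
};

// Duplicate values, including whole objects, are stored once and referenced thereafter.
const crunched = crunch(original);

// Uncrunching restores a structurally identical copy of the original.
const restored = uncrunch(crunched);
console.log(JSON.stringify(restored) === JSON.stringify(original)); // should print true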

Motivation

Large JSON blobs can be slow to parse on some mobile platforms, especially older Android phones, so we set out to improve that. GraphQL and REST-ful API responses tend to have a lot of duplication, leading to huge payload sizes. In removing that duplication, we also wound up making the payloads more amenable to gzip compression.

Example

In these examples, we use the SWAPI GraphQL demo.

Small Example

Using this query, we'll fetch the first 2 people, their first 2 films, and the first 2 characters in each of those films. We limit each connection to the first two items to keep the payload small:

  {
    allPeople(first: 2) {
      people {
        name
        gender
        filmConnection(first: 2) {
          films {
            title
            characterConnection(first: 2) {
              characters {
                name
                gender
              }
            }
          }
        }
      }
    }
  }

We get this response:

{
  "data": {
    "allPeople": {
      "people": [
        {
          "name": "Luke Skywalker",
          "gender": "male",
          "filmConnection": {
            "films": [
              {
                "title": "A New Hope",
                "characterConnection": {
                  "characters": [
                    {
                      "name": "Luke Skywalker",
                      "gender": "male"
                    },
                    {
                      "name": "C-3PO",
                      "gender": "n/a"
                    }
                  ]
                }
              },
              {
                "title": "The Empire Strikes Back",
                "characterConnection": {
                  "characters": [
                    {
                      "name": "Luke Skywalker",
                      "gender": "male"
                    },
                    {
                      "name": "C-3PO",
                      "gender": "n/a"
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "name": "C-3PO",
          "gender": "n/a",
          "filmConnection": {
            "films": [
              {
                "title": "A New Hope",
                "characterConnection": {
                  "characters": [
                    {
                      "name": "Luke Skywalker",
                      "gender": "male"
                    },
                    {
                      "name": "C-3PO",
                      "gender": "n/a"
                    }
                  ]
                }
              },
              {
                "title": "The Empire Strikes Back",
                "characterConnection": {
                  "characters": [
                    {
                      "name": "Luke Skywalker",
                      "gender": "male"
                    },
                    {
                      "name": "C-3PO",
                      "gender": "n/a"
                    }
                  ]
                }
              }
            ]
          }
        }
      ]
    }
  }
}

After we crunch it, we get:

{
  "data": [
    "male",
    "Luke Skywalker",
    { "gender": 0, "name": 1 },
    "n/a",
    "C-3PO",
    { "gender": 3, "name": 4 },
    [ 2, 5 ],
    { "characters": 6 },
    "A New Hope",
    { "characterConnection": 7, "title": 8 },
    "The Empire Strikes Back",
    { "characterConnection": 7, "title": 10 },
    [ 9, 11 ],
    { "films": 12 },
    { "filmConnection": 13, "gender": 0, "name": 1 },
    { "filmConnection": 13, "gender": 3, "name": 4 },
    [ 14, 15 ],
    { "people": 16 },
    { "allPeople": 17 }
  ]
}

The transformed payload is substantially smaller. After converting both payloads to JSON (with formatting removed), the crunched payload uses 49% fewer bytes.

When the client receives this, we simply uncrunch it and get back the exact original version for the client to handle.

Large Example

In real-world scenarios, we'll have modularized our schema with fragments and have connections that contain more than two items. Here's a query similar to the one above, except we don't limit the size of the connections and we request a standard set of selections on Person objects.

{
  allPeople {
    people {
      ...PersonFragment
      filmConnection {
        films {
          ...FilmFragment
        }
      }
    }
  }
}

fragment PersonFragment on Person {
  name
  birthYear
  eyeColor
  gender
  hairColor
  height
  mass
  skinColor
  homeworld {
    name
    population
  }
}

fragment FilmFragment on Film {
  title
  characterConnection {
    characters {
      ...PersonFragment
    }
  }
}

The resulting response from this query is roughly 1MB of JSON (989,946 bytes), but with tons of duplication. Here is how crunching impacts the payload size:

|             | Raw      | Crunched | Improvement |
| ----------- | -------- | -------- | ----------- |
| Size        | 989,946B | 28,220B  | 97.1%       |
| GZip'd Size | 22,240B  | 5,069B   | 77.2%       |

This is an admittedly extreme result, but highlights the potential for crunching payloads with large amounts of duplication.
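
If you want to reproduce these measurements against your own responses, a small sketch like the one below compares raw and gzip'd sizes before and after crunching (using Node's built-in zlib; the report helper is hypothetical):

import { gzipSync } from 'zlib';
import { crunch } from 'graphql-crunch';

// `data` is the parsed `data` field of a GraphQL response.
function report(data) {
  const raw = JSON.stringify(data);
  const crunched = JSON.stringify(crunch(data));

  console.log('Raw size:        ', Buffer.byteLength(raw), 'bytes');
  console.log('Crunched size:   ', Buffer.byteLength(crunched), 'bytes');
  console.log("Raw gzip'd:      ", gzipSync(raw).length, 'bytes');
  console.log("Crunched gzip'd: ", gzipSync(crunched).length, 'bytes');
}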

Usage

Server-side

With apollo-server you can supply a custom formatResponse function. We use this to crunch the data field of the response before sending it over the wire.

import { ApolloServer } from 'apollo-server';
import { crunch } from 'graphql-crunch';

const server = new ApolloServer({
  // schema, context, etc...
  formatResponse: (response) => {
    if(response.data) {
      response.data = crunch(response.data);
    }
    return response;
  },
});

server.listen({port: 80});

To maintain compatibility with clients that aren't expecting crunched payloads, we recommend conditioning the crunch on a query param, like so:

import url from 'url';
import querystring from 'querystring';
import { ApolloServer } from 'apollo-server';
import { crunch } from 'graphql-crunch';

const server = new ApolloServer({
  // schema, context, etc...
  formatResponse: (response, options) => {
    const parsed = url.parse(options.context.request.url);
    const query = querystring.parse(parsed.query);

    if(query.crunch && response.data) {
      const version = parseInt(query.crunch) || 1;
      response.data = crunch(response.data, version);
    }

    return response;
  },
});

server.listen({port: 80});

Now only clients that opt-in to crunched payloads via the ?crunch=2 query parameter will receive them.

Your client can specify the version of the crunch format to use in the query parameter. If the version isn't specified, or an unknown version is supplied, we default to v1.0.

Client-side

On the client, we uncrunch the server response before the GraphQL client processes it.

With apollo-client, use a link configuration to set up an afterware, e.g.

import { ApolloClient } from 'apollo-client';
import { ApolloLink, concat } from 'apollo-link';
import { HttpLink } from 'apollo-link-http';
import { uncrunch } from 'graphql-crunch';

const http = new HttpLink({
  credentials: 'include',
  uri: '/api'
});

const uncruncher = new ApolloLink((operation, forward) =>
  forward(operation).map((response) => {
    response.data = uncrunch(response.data);
    return response;
  })
);

const client = new ApolloClient({link: concat(uncruncher, http)});
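
If your server conditions crunching on the ?crunch=2 query parameter as shown above, the client also needs to opt in by sending that parameter. A minimal sketch, assuming the parameter is simply appended to the endpoint URL:

const http = new HttpLink({
  credentials: 'include',
  uri: '/api?crunch=2'
});

The rest of the link setup stays the same.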