
Scope

The purpose of this experiment is to explore some ideas for reducing the size of query result payloads in Relay and GraphQL, which are currently serialized as JSON for delivery over the wire.

Possibilities

  • Use of a less redundant format for encoding potentially repetitive tree-shaped data; for example JSON Graph or something like it.
  • Given such a format as an object, encoding it in a binary format such as MessagePack or FlatBuffers in order to get a denser format for transmission over the wire.
  • Consider teaching the server the semantics of the RelayRecordStore, so that instead of encoding a tree of data, the server could send a result set across the wire as a series of operations necessary to update a flattened store (NOTE: RelayRecordStore already is a normalized, flattened store).
  • Instead of vending materialized objects to views, make client-side proxying readers that could produce data on-demand from one of these dense representations (such as FlatBuffers) which could be stored as-is inside the store. Potential wins here revolve around reduced memory footprint, especially in the absence of garbage collection, but also independently of it given that even in Relay it is possible to have some overfetching of data that is requested and not actually used in the views.
  • Make a transform that would enable component authors to continue to read data using simple property access (e.g. this.props.viewer.notifications.count, which would transpile to this.props.viewer().notifications().count() or any of a number of other possible variants); a minimal sketch follows this list. Note that we've previously experimented with lazily-constructed reader methods and noted some speed-ups, at least with test data. These might not play out in the real world, but being able to defer materialization of view data from a compact binary representation until the last possible moment could increase the chances of this working.
  • Indicate the presence of subfields using bit arrays rather than textual field names (also sketched after this list).
  • Try to do better than standard gzip by using compression techniques optimized for small data payloads, such as preshared dictionaries constructed from static analysis of the schema and the subset of it actually used by an application.
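
To make the transform idea more concrete, here is a minimal, hypothetical sketch (not actual Relay code) of a lazy reader: a Proxy keeps the plain property-access syntax while deferring materialization of each field until it is first read. The makeLazyReader and readField names are invented for illustration.

```js
// Hypothetical sketch: a lazy reader that materializes fields on demand.
// A transform could rewrite `props.viewer.notifications.count` into
// `props.viewer().notifications().count()`; alternatively, a Proxy keeps the
// plain property-access syntax while still deferring materialization.
function makeLazyReader(readField) {
  const cache = new Map();
  return new Proxy({}, {
    get(_target, field) {
      if (!cache.has(field)) {
        // `readField` decodes a single field from a compact representation
        // (e.g. a FlatBuffers-style buffer) only when it is first accessed.
        cache.set(field, readField(field));
      }
      return cache.get(field);
    },
  });
}

// Usage with a stand-in decoder; a real one would read from a binary buffer.
const viewer = makeLazyReader((field) => ({notifications: {count: 3}}[field]));
console.log(viewer.notifications.count); // 3, decoded lazily
```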
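
Similarly, the bit-array idea could look roughly like the following sketch, assuming a fixed field ordering derived from the schema; the field names and helpers here are made up for illustration.

```js
// Hypothetical sketch: with a schema-derived field order, the presence of
// subfields is recorded as bits rather than as repeated textual field names.
const FIELD_ORDER = ['id', 'text', 'complete']; // assumed fixed by the schema

function encodePresence(record) {
  let bits = 0;
  FIELD_ORDER.forEach((field, i) => {
    if (field in record) bits |= 1 << i;
  });
  // Values are then written in field order; absent fields take no space.
  const values = FIELD_ORDER.filter((f) => f in record).map((f) => record[f]);
  return {bits, values};
}

function decodePresence({bits, values}) {
  const record = {};
  let next = 0;
  FIELD_ORDER.forEach((field, i) => {
    if (bits & (1 << i)) record[field] = values[next++];
  });
  return record;
}

console.log(decodePresence(encodePresence({id: '5', complete: true})));
// => { id: '5', complete: true }
```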

On the use of gzip

I've got tests in here that measure the effect of gzipping the payload. This is to give us a sense of relative size and some notion of time cost, but bear in mind that all data coming from the server gets gzipped transparently anyway, so it's not something we'd have to do explicitly.
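
For reference, the gzip(1)/gzip(6)/gzip(9) labels in the tables below are compression levels. A minimal sketch of the kind of measurement involved, assuming Node's built-in zlib (the actual test harness may differ):

```js
// Sketch: report payload size in bytes, raw and at gzip levels 1, 6 and 9.
const zlib = require('zlib');

function sizes(payload) {
  const json = Buffer.from(JSON.stringify(payload), 'utf8');
  const out = {passthrough: json.length};
  for (const level of [1, 6, 9]) {
    out[`gzip(${level})`] = zlib.gzipSync(json, {level}).length;
  }
  return out;
}

console.log(sizes({todos: [{id: '1', text: 'Taste JavaScript', complete: true}]}));
```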

Results

I compared payload sizes and wall-clock times for a number of different processing pipelines. The columns in the following table mean:

  • op: The operation(s) applied in the pipeline.
    • "passthrough": Data is passed through as a JSON-able object, and gets stringified.
    • "messagepackize": The data object is processed into the MessagePack binary format.
    • "leonize": The data is converted to LEON ("Little Endian Object") format, a binary representation choosen for its similarity to FlatBuffers.
    • "flatten": I looked at JSON Graph, asked what pieces of it were redundant and verbose, and made a Relay (node-centric) flattened representation of the data that ended up looking very like the way we store data in RelayRecordData (albeit without the features of GraphQLRange; instead, connections are just stored using arrays).
    • "thin": Like "flatten", but without the redundant embedding of node IDs inside their normalized record storage (because the ID is present in the key that points to the record).
    • "arrayrefize": Like "thin", but instead of using a key/value map to store normalized records (where the keys can potentially be quite lengthy, Base-64 encoded things), store them in an array with small integer indices.
    • "gzip": Passes the data from a previous step in the pipeline through gzip compression.
  • size: The byte count of the data after processing.
  • %: Percent size relative to the baseline (the stringified "passthrough").
  • time: Time, in milliseconds, to run the data through the pipeline 100 times.
  • reverse: Time, in milliseconds, to run the data through the inverse equivalent to the pipeline 100 times. In other words, this is the "decoding" counterpart to the "encoding" step. I did not implement this inverse operation for all of the pipelines.
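
To make the flattened shapes concrete, here is an illustration using tiny, made-up records (not the actual test data):

```js
// "flatten": normalized records keyed by ID, with the ID also embedded in
// each record, and connections stored as plain arrays of IDs.
const flattened = {
  'node:123': {id: 'node:123', text: 'Taste JavaScript', complete: true},
  'viewer:0': {id: 'viewer:0', todos: ['node:123']},
};

// "thin": the same, minus the redundant embedded `id` field, since the key
// pointing at the record already carries it.
const thin = {
  'node:123': {text: 'Taste JavaScript', complete: true},
  'viewer:0': {todos: ['node:123']},
};

// "arrayrefize": records live in an array and references become small integer
// indices, so lengthy Base-64 IDs appear only once (if at all).
const arrayrefized = {
  ids: ['node:123', 'viewer:0'],
  records: [
    {text: 'Taste JavaScript', complete: true},
    {todos: [0]}, // index into `records` instead of a string ID
  ],
};

console.log(flattened, thin, arrayrefized);
```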

Sample data results

This was "fake" data with a fair bit of redundancy encoded in it that I generated by hacking the TodoMVC example:

| op | size | % | time | reverse |
| --- | ---: | ---: | ---: | ---: |
| passthrough | 25789 | 100.00% | 17.74ms | 27.89ms |
| passthrough + gzip(1) | 2241 | 8.69% | 391.99ms | n/a |
| passthrough + gzip(6) | 1891 | 7.33% | 527.76ms | n/a |
| passthrough + gzip(9) | 1797 | 6.97% | 683.41ms | n/a |
| messagepackize | 21349 | 82.78% | 85.11ms | 103.54ms |
| messagepackize + gzip(1) | 2189 | 8.49% | 193.17ms | n/a |
| messagepackize + gzip(6) | 1866 | 7.24% | 305.09ms | n/a |
| messagepackize + gzip(9) | 1782 | 6.91% | 411.25ms | n/a |
| leonize | 24506 | 95.03% | 85.34ms | 136.53ms |
| leonize + gzip(1) | 2268 | 8.79% | 205.18ms | n/a |
| leonize + gzip(6) | 1921 | 7.45% | 283.36ms | n/a |
| leonize + gzip(9) | 1832 | 7.10% | 356.84ms | n/a |
| flatten | 23313 | 90.40% | 91.49ms | n/a |
| flatten + gzip(1) | 3496 | 13.56% | 453.11ms | n/a |
| flatten + gzip(6) | 2889 | 11.20% | 559.32ms | n/a |
| flatten + gzip(9) | 2785 | 10.80% | 702.81ms | n/a |
| flatten + messagepackize | 18594 | 72.10% | 168.77ms | n/a |
| flatten + messagepackize + gzip(1) | 3315 | 12.85% | 283.93ms | n/a |
| flatten + messagepackize + gzip(6) | 2757 | 10.69% | 344.60ms | n/a |
| flatten + messagepackize + gzip(9) | 2704 | 10.49% | 473.94ms | n/a |
| thin | 20889 | 81.00% | 97.95ms | n/a |
| thin + gzip(1) | 3081 | 11.95% | 415.11ms | n/a |
| thin + gzip(6) | 2480 | 9.62% | 496.12ms | n/a |
| thin + gzip(9) | 2395 | 9.29% | 629.82ms | n/a |
| thin + messagepackize | 16662 | 64.61% | 156.70ms | n/a |
| thin + messagepackize + gzip(1) | 2892 | 11.21% | 282.12ms | n/a |
| thin + messagepackize + gzip(6) | 2386 | 9.25% | 344.54ms | n/a |
| thin + messagepackize + gzip(9) | 2314 | 8.97% | 414.01ms | n/a |
| arrayrefize | 18387 | 71.30% | 112.70ms | n/a |
| arrayrefize + gzip(1) | 2803 | 10.87% | 421.14ms | n/a |
| arrayrefize + gzip(6) | 2425 | 9.40% | 472.12ms | n/a |
| arrayrefize + gzip(9) | 2376 | 9.21% | 552.89ms | n/a |
| arrayrefize + messagepackize | 14019 | 54.36% | 171.58ms | n/a |
| arrayrefize + messagepackize + gzip(1) | 2576 | 9.99% | 272.92ms | n/a |
| arrayrefize + messagepackize + gzip(6) | 2256 | 8.75% | 340.27ms | n/a |
| arrayrefize + messagepackize + gzip(9) | 2236 | 8.67% | 378.97ms | n/a |

Observations

It is really hard to beat gzip. MessagePack is disappointing when it comes to compactness, but gives a marginal benefit as an intermediate step between initial processing and the final gzipping step.

Even the smallest first-phase ("arrayrefize") format, which is 71.30% of the size of the basic object+stringified format, ends up compressing worse than the naive "redundant" JSON representation: 8.67% for arrayrefize + messagepackize + gzip(9), compared to 6.97% for passthrough + gzip(9) (or 6.91% with an intermediate MessagePack step).

In short, if you try to eliminate redundancy by applying some kind of clever transformation or restructuring your data object before piping it to gzip, you will probably just do a crappy job of reimplementing gzip at the wrong layer of abstraction, and get worse results overall.

We should probably just represent the data "naturally", and let gzip do the rest.

However, all of this was based on testing a single result payload, and an artificial one at that, so I wanted to do a test with another data point.

Newsfeed query results

This query result represents the initial data necessary to render the first few stories on the Facebook newsfeed. It is larger, may or may not have a different degree of internal redundancy (I didn't look over the actual data with a magnifying glass), and uses lengthy, Base-64 encoded IDs.

| op | size | % | time | reverse |
| --- | ---: | ---: | ---: | ---: |
| passthrough | 61637 | 100.00% | 37.05ms | 63.45ms |
| passthrough + gzip(1) | 12278 | 19.92% | 1063.56ms | n/a |
| passthrough + gzip(6) | 10266 | 16.66% | 1228.81ms | n/a |
| passthrough + gzip(9) | 10202 | 16.55% | 1280.92ms | n/a |
| messagepackize | 55137 | 89.45% | 118.07ms | 169.56ms |
| messagepackize + gzip(1) | 12104 | 19.64% | 526.86ms | n/a |
| messagepackize + gzip(6) | 10671 | 17.31% | 613.71ms | n/a |
| messagepackize + gzip(9) | 10649 | 17.28% | 600.60ms | n/a |
| leonize | 59854 | 97.11% | 150.39ms | 301.10ms |
| leonize + gzip(1) | 12789 | 20.75% | 528.62ms | n/a |
| leonize + gzip(6) | 11012 | 17.87% | 740.05ms | n/a |
| leonize + gzip(9) | 10989 | 17.83% | 704.96ms | n/a |
| flatten | 81788 | 132.69% | 174.48ms | n/a |
| flatten + gzip(1) | 13645 | 22.14% | 1571.88ms | n/a |
| flatten + gzip(6) | 11439 | 18.56% | 1743.39ms | n/a |
| flatten + gzip(9) | 11342 | 18.40% | 1763.25ms | n/a |
| flatten + messagepackize | 75336 | 122.23% | 240.31ms | n/a |
| flatten + messagepackize + gzip(1) | 13398 | 21.74% | 689.82ms | n/a |
| flatten + messagepackize + gzip(6) | 11920 | 19.34% | 869.47ms | n/a |
| flatten + messagepackize + gzip(9) | 11866 | 19.25% | 913.43ms | n/a |
| thin | 68702 | 111.46% | 157.63ms | n/a |
| thin + gzip(1) | 12887 | 20.91% | 1275.43ms | n/a |
| thin + gzip(6) | 10956 | 17.78% | 1544.59ms | n/a |
| thin + gzip(9) | 10867 | 17.63% | 1513.19ms | n/a |
| thin + messagepackize | 62512 | 101.42% | 261.12ms | n/a |
| thin + messagepackize + gzip(1) | 12617 | 20.47% | 616.97ms | n/a |
| thin + messagepackize + gzip(6) | 11379 | 18.46% | 797.79ms | n/a |
| thin + messagepackize + gzip(9) | 11328 | 18.38% | 818.44ms | n/a |
| arrayrefize | 54301 | 88.10% | 161.09ms | n/a |
| arrayrefize + gzip(1) | 12156 | 19.72% | 1039.70ms | n/a |
| arrayrefize + gzip(6) | 10477 | 17.00% | 1240.24ms | n/a |
| arrayrefize + gzip(9) | 10434 | 16.93% | 1219.25ms | n/a |
| arrayrefize + messagepackize | 47884 | 77.69% | 280.34ms | n/a |
| arrayrefize + messagepackize + gzip(1) | 11859 | 19.24% | 637.90ms | n/a |
| arrayrefize + messagepackize + gzip(6) | 10902 | 17.69% | 719.39ms | n/a |
| arrayrefize + messagepackize + gzip(9) | 10889 | 17.67% | 707.08ms | n/a |

Fascinatingly, here the "flatten" and even the "thin" strategies actually caused the result to get bigger. Although gzip does end up undoing a lot of the damage, the simple "passthrough" + gzip still comes out in front. Notably, and unlike the TodoMVC example, adding an intermediate MessagePack step did not improve the overall results.

These results again support the idea that a naive+gzip approach is likely to be a viable default. So in this instance, perhaps, "Dumb is the new Clever".

Conclusions and future work

I mostly focused on size here and didn't have time to really consider the implications of processing time, or the client-side aspects of the problem. Open questions:

  • Even though flattened formats don't compress as well as the naive response format, can we get other wins (for example, cheaper writes, particularly as we align the network format with what the Relay store expects to use internally)?
  • How much cheaper would writes have to be to justify delaying the arrival of the query data?
  • What about a streamier response format that we could write more granularly to the store as it arrives?
  • Would a streamier format be substantially larger across the wire, if expressed in terms of write operations (e.g. write this field here, append there, etc.), and would gzip again prove capable of mitigating the downside? (A rough sketch follows this list.)
  • [Crazy Town] If anything were possible, and it were viable to maintain server-side state about content in the client-side store, how small could we make "delta" updates? Does Relay (which tries very hard to avoid overfetching) leave little to win on the table here? How big would the upside be if we could avoid shipping data about nodes referenced deep in the query but which are already on the client (i.e. not just redundant encoding vs. normalized encoding, but redundant encoding vs. not encoding the data at all)? Is having the server track previously-queried data insurmountably more difficult than having it track queries to which a client is subscribed?
  • Would keeping a binary format like FlatBuffers in the store and materializing them on demand be a net memory win? And would that have flow-on performance benefits? Would the read-time overhead be negligible?
  • How do we trade write-time cost (for example, merging binary storage to make reads cheaper) against read-time cost?
  • What would we do if the sky were the limit and we could change everything in the stack, including the HTTP protocol? How would we make a much more concurrent, batchy, streamy data retrieval system across client and server?
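
As a purely hypothetical illustration of the "streamier" format question above, a response expressed as write operations might look something like the following; the operation names and store shape are invented for the sketch.

```js
// Hypothetical "streamy" response: a sequence of write operations that the
// client could apply to its store as each one is parsed off the wire.
const writeOps = [
  {op: 'writeRecord', id: 'viewer:0', fields: {name: 'Viewer'}},
  {op: 'writeField', id: 'node:123', field: 'text', value: 'Taste JavaScript'},
  {op: 'appendEdge', id: 'viewer:0', connection: 'todos', edge: 'node:123'},
];

function applyOps(store, ops) {
  for (const {op, id, ...rest} of ops) {
    const record = (store[id] = store[id] || {});
    if (op === 'writeRecord') Object.assign(record, rest.fields);
    if (op === 'writeField') record[rest.field] = rest.value;
    if (op === 'appendEdge') {
      (record[rest.connection] = record[rest.connection] || []).push(rest.edge);
    }
  }
  return store;
}

console.log(applyOps({}, writeOps));
```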