
Note about HTTP/1.1 roundtrip reduction via array-based auto-batching #5

sungam3r opened this issue Oct 10, 2019 · 18 comments

@sungam3r
Contributor

Is it worth mentioning batch requests? In this case, an array of objects is transmitted via HTTP.

@benjie
Member

benjie commented Nov 7, 2019

I wondered this too; but I think batch requests will be superseded by HTTP/2, so I'm not sure how much value there is in it. That said, it's very straightforward and I think quite widely used, so it's probably worth mentioning - perhaps in an addendum?

@sungam3r
Contributor Author

sungam3r commented Nov 7, 2019

batch requests will be superseded by HTTP2

Maybe yes, maybe no. In any case, years may pass before that happens. But batching already works. And I must say that its support does not carry any burden.

@michaelstaib
Member

@benjie

be superseded by HTTP2 so I'm not sure how

Not really, we have implemented batching with the export directive, where you can export results from a preceding query as variables in the next.

And I think the PayPal guys were looking at or discussing using that for transactions. You can build nice mutation flows with this.

POST /graphql?batchOperations=[NewsFeed, StoryComments]

query NewsFeed {
  stories {
    id @export(as: "ids")
    actor
    message
  }
}

query StoryComments {
  stories(ids: $ids) {
    comments(first: 2) {
      actor
      message
    }
  }
}

We support two ways to batch ... operation batching like the above, where you send in a document and specify the operations as a list in the URL to indicate which operations you want to batch...

And we also support the Apollo way, which is basically request batching. Batching for the sake of saving requests might not be necessary with HTTP/2, but combined with @export I see value in it, especially for mutations.

@benjie
Member

benjie commented Nov 10, 2019

@michaelstaib Sure; I was referring to straight-forward vanilla batching rather than batching with dependencies and chaining. Is there a full spec for the approach you raised? It seems you’d need a runtime on top of GraphQL to validate the types of the variables/exports before running any of the batch? That’s quite a bit more complex than the simple array-based parallel batching that Apollo Server, PostGraphile and the like support.

@michaelstaib
Member

michaelstaib commented Nov 10, 2019

@benjie

Hot Chocolate and Sangria have implemented it this way. You actually do not need a lot extra. Directives in Hot Chocolate can be associated with field middleware, so the execution is quite simple. Validation-wise, you can also do static analysis to check whether outputs are compatible with the inputs. We basically infer the variables.

Oleg and I both inferred the specification from a talk that Lee gave :)

There are also some discussions about this with the GraphQL-Java guys on the spec repo, and with Ivan, the PayPal guys and so on.

I really should put all of that together into a spec document. I think I will write a little blog post summing it all up, and then we will have all the information in one document.

https://hotchocolate.io/docs/batching

Our docs include the talk from @leebyron. They introduced this back then basically to solve something that you'd actually solve better with @defer. But, as I said, the superpower of @export and batching lies more in mutations, in my opinion.

@sungam3r
Contributor Author

Just a note - I was talking about the array-based batching approach.

@benjie
Member

benjie commented Nov 13, 2019

These seem to be radically different topics that share some similarities.

With auto-batching the requests are captured and batched together as a purely transport-level optimisation, to reduce roundtrips. They introduce no new validation, GraphQL re-writing, or any other complexities. The operations typically run independently in parallel on the server, and the results are returned as an array in the same order to the client. They have trade-offs, and HTTP/2 (or using a websocket transport) solves the problem better.
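A minimal sketch of this array-based auto-batching might look as follows. All names here are illustrative, not any server's real API: a stand-in `execute` plays the role of a GraphQL executor, and `asyncio.gather` provides both the parallelism and the order guarantee described above (results come back in request order regardless of completion order).

```python
import asyncio

async def execute(operation: dict) -> dict:
    # Stand-in for a real GraphQL executor.
    await asyncio.sleep(0)
    return {"data": {"echo": operation["query"]}}

async def handle_batch(operations: list[dict]) -> list[dict]:
    # gather() preserves input order regardless of completion order,
    # which is exactly what array-based batching clients rely on.
    return await asyncio.gather(*(execute(op) for op in operations))

batch = [{"query": "{ a }"}, {"query": "{ b }"}]
results = asyncio.run(handle_batch(batch))
```

The key property is that the batch introduces no new semantics: each element is executed exactly as it would have been as a standalone request.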

With chained batches, the queries in the batch have to be prepared in advance so that they can be issued together; otherwise later operations may not validate, as they rely on variables that are dynamically declared by previous operations' @export directives. They require a runtime to re-write the queries based on the @export directives, and must be run serially (though there may be some parallelism depending on the variable dependencies) so the variables can be updated. The operations themselves are not valid GraphQL operations until the transforms have been applied, so many GraphQL validation/linting tools would reject them.

(By the way, I think the validation issues/rewrite requirements of these chained batches could be solved by changing the syntax from query StoryComments { to query StoryComments($ids: [Int] @import(from: "ids")) {.)

@sungam3r I suggest you rename this issue to "HTTP/1.1 roundtrip reduction via array-based auto-batching".

@michaelstaib I suggest you raise a new issue about the approach you mention, which I think might be better described as something like "chained operation batches".

@michaelstaib
Member

@benjie you are right and we should have multiple issues describing the different approaches.

@sungam3r
Contributor Author

@benjie I completely agree with every word, as if I had written it myself.

@sungam3r sungam3r changed the title Note about batching Note about HTTP/1.1 roundtrip reduction via array-based auto-batching Nov 13, 2019
@sungam3r
Contributor Author

The operations typically run independently in parallel on the server

Only one small remark - queries in the batch array may contain mutations, so it is unsafe to execute them in parallel. GraphQL-dotnet executes batched requests sequentially.
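The policy described here can be sketched as a dispatch decision on the batch (a hypothetical sketch, not GraphQL-dotnet's actual code): if any operation is a mutation, fall back to strictly sequential execution; otherwise run in parallel.

```python
import asyncio

async def execute(operation: dict) -> dict:
    # Stand-in for a real GraphQL executor.
    await asyncio.sleep(0)
    return {"data": operation["query"]}

def is_mutation(operation: dict) -> bool:
    # Naive textual check; a real server would inspect the parsed document.
    return operation["query"].lstrip().startswith("mutation")

async def handle_batch(operations: list[dict]) -> list[dict]:
    if any(is_mutation(op) for op in operations):
        # Sequential: each operation may observe the previous one's effects.
        return [await execute(op) for op in operations]
    # Query-only batches are safe to run in parallel.
    return await asyncio.gather(*(execute(op) for op in operations))

results = asyncio.run(
    handle_batch([{"query": "mutation { a }"}, {"query": "{ b }"}])
)
```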

@benjie
Member

benjie commented Nov 13, 2019

Interesting. I only use query auto-batching as a transport-layer optimisation; so if, without batching, the application would have issued 4 operations at the same time over separate parallel HTTP requests (which would have run independently and in parallel), then I treat it as acceptable to run those same operations independently and in parallel when received in a batch by the server. Making it such that mutations run in series does sound like a good idea though, even though it would make query batching even slower than it is already (taking it from as slow as the slowest operation, to as slow as the slowest chain of operations).

@michaelstaib
Member

michaelstaib commented Nov 14, 2019

@benjie

Just one question here: do you not write the results to the stream as they arrive?

We have optimized batching to write to the response stream as soon as possible. Our .NET client Strawberry Shake will then start reading the results from the stream as they arrive and make them immediately available.

So, we do not wait for the full array to be finished before writing to the stream. This means that the time to first response is quite quick.
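The streaming approach described here could look roughly like this (a hypothetical sketch loosely inspired by the behavior described, not Hot Chocolate's actual code): each result is written as soon as it completes, tagged with its index so the client can reassemble the original order.

```python
import asyncio
import json

async def execute(index: int, operation: str) -> dict:
    # Stand-in executor with varying simulated latency.
    await asyncio.sleep(0.01 * (len(operation) % 3))
    return {"data": operation}

async def stream_batch(operations, write):
    # Write each result chunk as it completes, instead of waiting
    # for the whole array; the index lets the client restore order.
    async def run(i, op):
        result = await execute(i, op)
        write(json.dumps({"index": i, "payload": result}))
    await asyncio.gather(*(run(i, op) for i, op in enumerate(operations)))

chunks: list[str] = []
asyncio.run(stream_batch(["{ a }", "{ bb }"], chunks.append))
```

The trade-off is that the transport must now frame individual chunks rather than a single JSON array, which is why the client needs explicit support for reading them incrementally.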

@michaelstaib
Member

@sungam3r

I think ordered execution for mutations is exactly what operation batching gives you. With that, it is more explicit that you need an order.

The Apollo batching that is proposed here was about merging multiple requests from a client into one in order to optimize transport. Apollo, for instance, does this under the hood.

@michaelstaib
Member

Just one more general thing about performance here: when we do batching, we preserve the same DataLoader for all merged requests, meaning resolved entities are consistent and resolved more quickly across all responses of the batch.

@sungam3r
Contributor Author

I do not put an equals sign between the client's desire to make a batch request and executing all the queries in it in parallel. You can consider a batch request as a way to optimize data transmission over slow transport. Parallelizing the execution of the individual elements in the transmitted array of requests is a further optimization, which in the general case cannot be performed without an explicit indication from the client.

Similarly with the question of returning results as quickly as possible - the order of responses may be important to the client. Imagine that it performs a series of 10 mutations in a batch request against a client's bank account. You can't just start executing the individual queries in the batch in parallel. We need to specify some way for the client to indicate how the server should perform its batch request. I think that for safety reasons, by default you need to execute batched queries strictly sequentially and return the results in order.

I think ordered execution for mutations is exactly what operation batching does. With that it is more explicit that you need an order.

These two methods are orthogonal to each other. Sequential execution considerations apply to both.

@benjie
Member

benjie commented Nov 14, 2019

@michaelstaib That’s an optimisation I’ve not implemented yet; AFAIK Apollo Client can’t make use of that optimisation? apollographql/apollo-server#396

@michaelstaib
Member

michaelstaib commented Nov 14, 2019

@sungam3r

... may be important to the client. Imagine that he performs a series of 10 mutations in a batch request over a client’s bank account. You can’t just take it and start executing individual queries in batch in parallel. It is necessary to somehow specify the way the client indicates how the server should perform its batch request. I think that for security reasons, by default you need to execute batch queries strictly sequentially and return the results in order.

I get that; that is why we support @export and operation batching. But with vanilla batching, why do you not put all of that into one query request? The execution algorithm for mutations guarantees exactly that the mutations are executed serially. There is no need to use batching for that in this case. The requests that are batched together in this case are independent, since they cannot share anything.

Just as an example:

mutation {
  a 
  b
  c
}

The above mutation would execute a, b, c in exactly that order. So why batch the request in this case?

Vanilla batching for HTTP/1.1 is about reducing requests.

The batching that I referred to earlier that Hot Chocolate and Sangria support is about those sequences.

mutation a {
  a @export(as: "transactionId")
}

mutation b {
  b(transactionId: $transactionId)
}

In that case sequences make sense since in the case of paypal you might have a transactionId that is created by the first mutation and then we use that in the next mutation and so on.
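The sequencing described here can be sketched as a serial loop that threads exported values into later operations (an illustrative sketch with made-up names, not Hot Chocolate's or Sangria's actual implementation):

```python
def run_chain(operations, execute):
    # Execute operations serially; each may publish values that
    # later operations consume as variables.
    variables: dict = {}
    results = []
    for op in operations:
        result, exported = execute(op, variables)
        variables.update(exported)  # later operations see earlier exports
        results.append(result)
    return results

# Toy executor: mutation "a" exports a transactionId, mutation "b" uses it.
def execute(op, variables):
    if op == "a":
        return {"data": {"a": True}}, {"transactionId": "tx-1"}
    if op == "b":
        return {"data": {"b": variables["transactionId"]}}, {}

results = run_chain(["a", "b"], execute)
```

The serial loop is the essential constraint: "b" cannot even be validated, let alone executed, until "a" has produced its export.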

There is also a way to tune the HTTP/1.1 batching so that it guarantees order and parallelization. If you, for instance, merge the incoming batches into one query request, you could still guarantee execution order in the case of mutations, and you could parallelize in the case of queries.

basically two requests:

mutation a {
  a 
}

mutation b {
  b 
}

would become:

mutation ab {
  request_1: a 
  request_2: b 
}

I think I read a blog post about Apollo Server saying they do something like that. I could be mistaken.
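The merge idea above can be sketched as simple document rewriting (an assumed sketch, not any server's real code): single-root-field operations of the same type are combined into one document under request_N aliases, so the normal GraphQL execution rules - serial for mutations, parallel for queries - then apply to the merged document.

```python
def merge_operations(op_type: str, bodies: list[str]) -> str:
    # Alias each incoming root selection as request_N so results
    # can be split back out per original request.
    aliased = [
        f"request_{i + 1}: {body.strip()}" for i, body in enumerate(bodies)
    ]
    return f"{op_type} {{ {' '.join(aliased)} }}"

merged = merge_operations("mutation", ["a", "b"])
```

A real implementation would also have to rename colliding variables and fragments, which is where most of the complexity of this approach lives.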

I just see this as an exchange of ideas. I do not mean to be condescending, or to make the case that your solutions, or the reasons behind them, are bad in any way. Just want to make that clear. I really love discussing these topics with other implementors like @IvanGoncharov or @benjie or now @sungam3r.

@michaelstaib
Member

@benjie we support that in our .NET client, and we also built some fetcher logic for Relay; we are not that invested in Apollo.
