[WIP] General batch endpoint #1645

Closed · wants to merge 3 commits

Conversation

Toflar
Contributor

@Toflar Toflar commented Jan 12, 2018

Q A
Bug fix? no
New feature? yes
BC breaks? no
Deprecations? no
Tests pass? TODO
Fixed tickets
License MIT
Doc PR TODO

One of the major disadvantages of REST is that under certain circumstances you need a lot of requests to reach a desired state; whether that is for data retrieval or data creation does not really matter. For data retrieval you should consider using GraphQL, which we already have support for (awesome!).
This PR is about improving batch creation of new data via REST.
Imagine you have a quiz and you need to store 50 answers to that quiz. In my case that would be 50 requests, because an answer is a resource.

So I started thinking about how my batch endpoint should look. Then, as I was thinking about the concept, I figured I could maybe build a general solution, so I quickly put together this WIP to outline it and allow for discussion 😄

It's far from being complete but here's the idea:
There's a general endpoint /batch that accepts POST requests. The structure is an array that represents single requests as if they had been executed one at a time. Example:

POST /batch HTTP/1.1
Accept: application/json
Content-Type: application/json
Content-Length: 201

[
  {
    "path": "/quizanswer",
    "method": "POST",
    "body": "..."
  },
  {
    "path": "/quizanswer",
    "method": "POST",
    "body": "..."
  },
  {
    "path": "/quizanswer",
    "method": "POST",
    "body": "..."
  }
]

If you want, you can also add headers; by default they are the same as the ones from the master request (otherwise you would have to provide Content-Type etc. for every single entry).
The special batch endpoint then takes every single entry and executes a sub-request for it. So internally they are handled exactly as if you had executed 50 isolated requests one after the other, but from the outside it's just one request. Neat, no? 😄
As a response you get an array with the most important data from each of the sub-requests.
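
For illustration, the response could be an array mirroring the order of the entries; the exact shape is not specified yet in this WIP, so the status/body field names below are assumptions:

HTTP/1.1 202 Accepted
Content-Type: application/json

[
  {"status": 201, "body": {"id": 1}},
  {"status": 201, "body": {"id": 2}},
  {"status": 400, "body": {"error": "..."}}
]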

This allows you to batch-create any data. You can even combine different endpoints in the same request! It's a general-purpose API endpoint bringing maximum flexibility to API Platform. I would not recommend mixing multiple resources in one batch request, but I think we should leave that decision to the developers.

Here are a few things to consider, todos or just remarks:

  • Agree on the concept. Do you like it? Does it make sense to you?
  • Unit tests, once agreed on concept.
  • Make /batch path configurable, do you even like the name?
  • Is an array with path, method etc. too cumbersome? Better ideas? I thought about just providing the requests in HTTP format (so POST /quizanswer ...), but that would be a lot of repetition for e.g. Host: .., which is useless in that case anyway.
  • Should we use 207 as the status code for this even though 207 is WebDAV? But Multi-Status is somehow correct. Stick to http://www.odata.org/documentation/odata-version-3-0/batch-processing, so we use 202 Accepted.
  • Swagger integration
  • more?

@dunglas
Member

dunglas commented Jan 13, 2018

Hi,

This is a very good idea and something we definitely need. Actually, this has been on my todo list for a while, and it's awesome that you started working on this topic! Thank you.

I had a slightly different implementation in mind. There is a well-designed specification for batch endpoints in the OData protocol: http://www.odata.org/documentation/odata-version-3-0/batch-processing/

The main idea is to send a multipart HTTP request to the batch endpoint that contains inner HTTP requests. It fits very well with the architecture of API Platform and Symfony, and allows using 100% of the existing features of the framework (including content negotiation).

It can be implemented this way:

  • A /$batch endpoint will be called by the client and will include all inner requests as multipart parts; the boundaries will be used as identifiers (see the sketch after this list)
  • The related Symfony controller will parse the outer multipart request to get the raw content of every embedded HTTP request
  • It will convert the raw content of each request to a Symfony Request instance
  • A Symfony sub-request will be executed for every inner HTTP request (as in your current implementation)
  • The batch controller will generate a multipart response containing all sub-responses (associated with the original id sent in the boundary by the client, so the client will be able to match each response with the original request)
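
A simplified sketch of what such a multipart batch request could look like (the /$batch path and the boundary value are illustrative; the actual OData spec additionally wraps write operations in change sets):

POST /$batch HTTP/1.1
Content-Type: multipart/mixed; boundary=batch_1

--batch_1
Content-Type: application/http
Content-Transfer-Encoding: binary

POST /quizanswer HTTP/1.1
Content-Type: application/json

{"text": "..."}
--batch_1
Content-Type: application/http
Content-Transfer-Encoding: binary

POST /quizanswer HTTP/1.1
Content-Type: application/json

{"text": "..."}
--batch_1--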

OData also defines a way to reference newly created entities in other requests of the same batch; it would be nice to support this at a later stage.

There is almost everything we need in Symfony and API Platform to implement this; the only big missing part is... an HTTP parser. AFAIK, there is no PHP function nor popular library to parse a raw HTTP request. Symfony's HttpFoundation uses the PHP superglobals directly (Request::fromGlobals).
We will need to create a new class to convert a raw HTTP request into superglobal-like arrays, then pass these arrays to the Request constructor.
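
A minimal sketch of that missing piece, assuming a simple function-based shape (the name rawHttpToRequest is made up for illustration, and real parsing would also need to handle header folding, chunked bodies, etc.):

<?php

use Symfony\Component\HttpFoundation\Request;

// Hypothetical helper: turn the raw content of one inner HTTP request into
// a Symfony Request by building a $_SERVER-like array.
function rawHttpToRequest(string $raw): Request
{
    [$head, $body] = array_pad(explode("\r\n\r\n", $raw, 2), 2, '');
    $lines = explode("\r\n", $head);

    // Request line, e.g. "POST /quizanswer HTTP/1.1"
    [$method, $uri] = explode(' ', array_shift($lines));

    // Headers become HTTP_* entries of the superglobal-like array
    $server = ['REQUEST_METHOD' => $method, 'REQUEST_URI' => $uri];
    foreach ($lines as $line) {
        [$name, $value] = explode(':', $line, 2);
        $server['HTTP_'.strtoupper(str_replace('-', '_', trim($name)))] = trim($value);
    }

    return Request::create($uri, $method, [], [], [], $server, $body);
}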

WDYT @Toflar?

@dunglas
Member

dunglas commented Jan 13, 2018

Regarding your specific use case, you can already use the "embedded relations" feature of API Platform like this:

  • Add an explicit quiz -> answer relation
  • Use serialization groups to allow embedding answer objects when updating a quiz resource
  • Send all the answers this way

API Platform will already take care of updating existing answers, and creating the new ones.
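
A rough sketch of that setup (class names, group names and mapping details are illustrative, not taken from this thread):

<?php

use ApiPlatform\Core\Annotation\ApiResource;
use Symfony\Component\Serializer\Annotation\Groups;

/**
 * @ApiResource(attributes={"denormalization_context"={"groups"={"quiz:write"}}})
 */
class Quiz
{
    /**
     * Explicit quiz -> answer relation; assuming cascade persist on the
     * Doctrine side, new answers sent inline are created along with the
     * quiz update.
     *
     * @Groups({"quiz:write"})
     */
    public $answers = [];
}

class Answer
{
    /**
     * @Groups({"quiz:write"})
     */
    public $text;
}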

A batch endpoint will only be useful when reading or updating resources with no relations at all between them.

@Toflar
Contributor Author

Toflar commented Jan 14, 2018

Hey, thanks for the feedback!

Basically, what OData does is what I meant by:

Is an array with path, method etc. too cumbersome? Better ideas? I thought about just providing the requests in HTTP format (so POST /quizanswer....) but that would be a lot of repetition for e.g. Host: .. which is useless in that case anyway.

I thought about it over the weekend and I have to admit that I don't fancy their solution. It is maybe the cleaner solution from a protocol perspective, yes. However, to me it's waaay too complicated to use. It would also require the client side to be able to build and parse HTTP multipart requests/responses, not just API Platform on the server side. If you're, for example, using the API from within some JS with JSON, everything you need is natively supported by JS and thus in any browser. For batch requests you would all of a sudden need an HTTP parser, and you would need to understand HTTP multipart requests with boundaries etc. I think acceptance of such an endpoint would be very low compared to a simple array of requests provided in the same data format you already work with for the rest of the API.

So in short: what OData does is exactly the same as my proposal, it just makes use of HTTP spec possibilities. Normally I'd go for the spec, but in this case I think I would go for the more straightforward solution because I'm pretty much convinced that acceptance would be way higher. And after all, that's imho what an API is all about: making it easy for other developers to use and to get the job done 😄

Wdyt?

BTW: Thank you for the hint regarding the relation, I might use that too, awesome! 👍

@dunglas
Member

dunglas commented Jan 15, 2018

There is an existing JS implementation for OData: https://stackoverflow.com/a/23805287/1352334
It's just a few lines of code, and we can easily provide our own JS library if we want to.

It looks cleaner to me to stick with HTTP (and reuse Microsoft's R&D work) than to re-create a custom protocol.

@Toflar
Contributor Author

Toflar commented Jan 15, 2018

Okay, fine with me! What about https://github.com/Riverline/multipart-parser?

@Koc

Koc commented Jan 18, 2018

IMHO it is better to solve this problem using custom DTOs

@Toflar
Contributor Author

Toflar commented Jun 14, 2018

I've started to work on a library to parse HTTP requests. It's WIP but should already get us pretty far. See https://github.com/Toflar/http-request-parser 😄

@asimonf

asimonf commented Sep 12, 2018

Is this already decided? By the way, how would one go about making the batch operation atomic the OData way?

@dunglas
Member

dunglas commented Sep 12, 2018

@asimonf can you elaborate a bit?

@bendavies
Contributor

any interest in progressing this?

@Toflar
Contributor Author

Toflar commented Jan 28, 2019

Not from my side atm. Happy to help finish and release a stable version of the parser in case it's needed, though.

@asimonf

asimonf commented Jan 28, 2019

@dunglas sorry it took so long to answer, I actually missed the notification. What I meant was: suppose a batch endpoint along the lines of OData were implemented, would there be any way to make such an operation atomic? As in, if one of the requests fails, all of them must fail?

@Toflar
Contributor Author

Toflar commented Apr 6, 2019

As Symfony now has a MIME component, I wanted to see how we can take advantage of core development here, or contribute to make that possible. I discussed this briefly with @fabpot at the EUFOSSA Hackathon. Just writing it down here for future reference, I hope I got everything right 😄 :

So currently the MIME component supports generating MIME messages but not the other way around (no parser, yet). Moreover, multipart handling is the same for both HTTP and MIME, so the logic can be shared there. Parsing HTTP is in fact pretty straightforward, but for e-mails it's a lot harder due to encoding, maximum line lengths and a lot more.
Having a parser in the MIME component itself might be very interesting, though, because it would also allow serializing and unserializing MIME messages (e.g. for the Messenger component).
So any effort in creating a multipart-capable HTTP parser should be directed to the MIME component.

@alebedev80
Contributor

Any news?

@Toflar
Contributor Author

Toflar commented May 11, 2020

I'm closing this because I don't have any intention to continue working on it and I keep getting asked about it. I just haven't had any project that required a batch endpoint, so I couldn't afford to work on it.

@Toflar Toflar closed this May 11, 2020
@bendavies
Contributor

I was just reading this about 3 minutes before you closed it! Did I inadvertently do something to alert you?

@Toflar
Contributor Author

Toflar commented May 11, 2020

Haha, nah, you didn't. Must have been one of those strange coincidences :) I was asked about it on Slack :)

@vipulw2011

vipulw2011 commented Jun 21, 2020

@Toflar @dunglas I see that this feature is merged into master, so I just need to know how to configure it. I need to make a bulk PATCH for one of my entities with different ids, as currently making multiple PATCH itemOperation requests is not good. Can you guys help me with it, if possible?

I will just put up what I understood: I need to enable batch_endpoint in the config and provide a path for the endpoint. Then, in the request body, I give the actual endpoint inside the path key and the method as PATCH. Am I correct up to here?

@bendavies
Contributor

It was not merged

@vipulw2011

@bendavies Yeah! My bad! I just missed the "wants to merge" part.
How do I do it? Any ideas?

@kryst3q

kryst3q commented Jul 10, 2020

@bendavies I would also like to hear the answer to @vipulw2011's question, if you don't mind.

@Toflar
Contributor Author

Toflar commented Jul 10, 2020

APIP doesn't provide any solution for this, which means you build your own endpoint with your own controller in whatever way you like :)
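
For anyone landing here, a minimal sketch of such a custom controller, following the sub-request idea from this WIP (the class name, route and response shape are assumptions, not an API Platform feature):

<?php

use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpKernel\HttpKernelInterface;
use Symfony\Component\Routing\Annotation\Route;

class BatchController
{
    private $kernel;

    public function __construct(HttpKernelInterface $kernel)
    {
        $this->kernel = $kernel;
    }

    /**
     * @Route("/batch", methods={"POST"})
     */
    public function __invoke(Request $request): JsonResponse
    {
        $results = [];

        foreach (json_decode($request->getContent(), true) as $entry) {
            // Replay each entry as an isolated sub-request, reusing the
            // master request's headers by default.
            $subRequest = Request::create(
                $entry['path'],
                $entry['method'],
                [], [], [], [],
                json_encode($entry['body'] ?? null)
            );
            $subRequest->headers->replace($request->headers->all());

            $response = $this->kernel->handle($subRequest, HttpKernelInterface::SUB_REQUEST);

            $results[] = [
                'status' => $response->getStatusCode(),
                'body' => json_decode($response->getContent(), true),
            ];
        }

        return new JsonResponse($results);
    }
}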

@kryst3q

kryst3q commented Jul 10, 2020

Thanks for the fast response, @Toflar
