Add JSON support to parse_body_arguments #997

Closed
wants to merge 2 commits into
from

Projects

None yet

3 participants

@gmr

This patch adds support for a JSON request body for POST/PATCH/PUT requests. JSON bodies are constructed the same as the other request body arguments to ensure API compatibility with RequestHandler.get_body_argument and RequestHandler.get_body_arguments.

Additionally minimal added test coverage to parse_body_arguments

gmr added some commits Feb 28, 2014
@gmr gmr Add JSON support to parse_body_arguments
This patch adds support for a JSON request body for POST/PATCH/PUT requests. JSON bodies are constructed the same as the other request body arguments to ensure API compatibility with RequestHandler.get_body_argument and RequestHandler.get_body_arguments.

Additionally minimal add test coverage to parse_body_arguments
c402d51
@gmr gmr Fix the test response expectations 8a97072
@bdarnell
tornadoweb member

It feels very weird to me to contort json objects to fit the get_argument/get_arguments data model - one of the advantages of using json over form-encoding is that there is a type-safe distinction between a single value and a one-element list. It seems to me that you'd be better off just storing the decoded object and accessing it with ordinary dict methods instead of get_argument[s], unless you have a particular desire to gloss over the distinction between json and form-encoding.

@bdarnell
tornadoweb member

I'm closing this PR because I don't think it makes sense to mix the data model for HTML forms and other bodies. Additionally, I think any enhancements to body parsing should really be done at the RequestHandler level instead of parse_body_arguments, because the latter is difficult for the application to customize and has no good error-handling path.

@bdarnell bdarnell closed this May 25, 2014
@wsantos

Would you consider something like that https://gist.github.com/wsantos/303c541cb18557702862 ?

@gmr

Last conversation is we were going to add a RequestHandler.json_body attribute and not apply any of the RequestHandler.get_argument type behavior on it.

@wsantos

With this approach we can accomplish any body processing, not only JSON, maybe a change from processed_body to parsed_body, and the developer have the power to do any implementation and error handling, what do you think @gmr ?

@gmr

That's more of a @bdarnell question than for me. I owe a PR doing the json_body bit. Not opposed to it, but what I was looking for was to remove the work in dealing with Content-Type: application/json since it's a fairly standard convention at this point in writing APIs. Your approach would require that I implement the json.decode, and what I was trying to accomplish was not having to think about it ;-)

@wsantos

I would prefer your implementation, it is more seamless, but we don't have only JSON, maybe a good change would be a "default" tornado processing if possibility to developer to make their own.

@bdarnell
tornadoweb member

Regarding @wsantos's gist, the current_user/get_current_user pattern has proven confusing and I'd rather not repeat it.

As a baseline, if you don't care about checking content-type, you can do this in one line today:
def post(self):
self.json_body = json_decode(self.request.body)

So opting in has to be very simple if it is to be worthwhile. One way in which a centralized implementation can be valuable is by handling content-type correctly. This suggests some sort of registry in which handlers for different content-types can be added. But then we're back to the original question, which is whether it makes sense for divergent parsing functions to write their results into a single object, and if so what that object should look like.

Another consideration, which is perhaps a separate and larger project, is how to handle asynchronous parsing of large bodies. For form/multipart uploads in particular, it would be nice to handle these in some incremental way from data_received.

@bdarnell bdarnell reopened this Aug 23, 2014
@wsantos

So, the idea is to centralize the body processor ? i was thinking in something like RequestHandlers, here is an code sample, of course we must have a default list for some handlers like, application/x-www-form-urlencoded, multipart/form-data and etc. i still thinking in how to expose the processed data.

import tornado.ioloop
import tornado.web

class JsonContentTypeHandler(tornado.???.ContentTypeHandler)
    pass

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

application = tornado.web.Application([
    (r"/", MainHandler),
], content_type_handlers=[
    "application/json", JsonContentTypeHandler
])

if __name__ == "__main__":
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()

I like this approach but some parts need to be polished.

@bdarnell
tornadoweb member

Yes, something like that. I had been thinking of configuring it in the RequestHandler subclass, but the Application works too. (I would avoid the word "handler" for the new type; it's best to save that word for RequestHandler and its subclasses.)

The question is, as you say, how to expose the data. form-urlencoded and json both produce a single object, which is usually a dictionary with string keys, which makes it appealing (but IMHO awkward) to massage the json output to match the legacy form-urlencoded schema. Other encodings produce more divergent results: multipart/form-data produces a dict as in the first case but also a separate files object; a protobuf/thrift parser would produce an object that is not dict-like at all.

I think that applications generally do want to coerce all input formats they support into one consistent object model, but they disagree about which format should dominate. For example, apps that use protobufs may want to accept the json serialization of protobufs and access the results as the decoded protobuf instead of the raw json dict. Apps that use a json schema (example) may want to use that schema to upgrade form-encoded data to the json format instead of downgrading json to the ambiguous form-encoded format.

So after thinking about it that way, maybe the answer is not a collection of parsers keyed by content-type, but a single parser (per handler or application) which may support multiple content-types and coerce them all into a consistent model.

@gmr

Closing this as to not clutter up the PR/issues list.

@gmr gmr closed this Jun 16, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment