New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support form param parsing #180
Comments
I could really use this so I started looking at how bottle.py handles the parsing of the body. Is cgi.FieldStorage not allowed in Falcon? It would be great if this was rolled into get_param so you could just data = req.get_param('form_name') Going to take a harder look at this as right now because it directly affects my current project where I'm attempting to use Falcon instead of bottle.py. Suggestions are welcome. |
cgi.FieldStorage could work. I wonder how fast it is? |
If you're interested, I've got a library for streaming multipart parsing in Python: https://github.com/andrew-d/python-multipart Highlights: test coverage is 100%, it supports parsing multipart and form data, and it's very configurable. |
@andrew-d that looks interesting. I'm now leaning more towards a general req.form property and only parsing the stream on the first access of the request's form property. The get_param already has the performance tweak to not parse it if the query string doesn't exist so we could do a 'delayed' parsing on the stream till it is needed. |
👍 I like the req.form property idea. |
I like req.form as well. I can help if you need testing. |
I'm circling back to this, are we thinking something that would allow you to access the form data like so?
Essentially create a dictionary styled way of interacting with the form data but doing some smart on demand parsing. The names of the would correspond to keys in the dictionary and their content would be the value? @kgriffs what do you think? |
Good stuff. I've been thinking about this, and I would love to be able to meet the following goals with whatever design we come up with:
With the above in mind, I mocked up some permutations based on your example for the sake of discussion: """
Using a form object as proposed, shown side-by-side with query param interface
"""
site_name = req.get_param('site', required=True)
limit = req.get_param_as_int('limit') or 10
coordinates = req.get_param_as_list('coordinates',
transform=float, required=True)
first_name = req.form.get('firstname')
last_name = req.form.get('lastname')
email_text = req.form.get('emailtext')
should_subscribe = req.form.get('subscribe')
"""
Side-by-side example: With new interface for params (additional work item)
"""
# New interface prototype
#
# This would deprecate the req.get_param interface, but not remove
# it until the 0.2 release.
#
# If we decide to go down this path, we will want to also update the
# header interface to suite (i.e., get_header ===> header).
def param(name, type=str, transform=None, required=False, store=None):
pass
site_name = req.param('site', required=True)
limit = req.param('limit', int) or 10
coordinates = req.param('coordinates', list,
transform=float, required=True)
first_name = req.form('firstname')
last_name = req.form('lastname')
email_text = req.form('emailtext')
should_subscribe = req.form('subscribe') # bool
"""
Without new interface for params
"""
site_name = req.get_param('site', required=True)
limit = req.get_param_as_int('limit') or 10
coordinates = req.get_param_as_list('coordinates',
transform=float, required=True)
first_name = req.form('firstname')
last_name = req.form('lastname')
email_text = req.form('emailtext')
should_subscribe = req.form('subscribe') # bool
"""
Another option for forms, but doesn't allow for future additions, such as transform support
"""
first_name = req.form['firstname']
last_name = req.form['lastname']
email_text = req.form['emailtext']
send = req.form['send'] What do you think? |
Discussed on IRC with sebasmagri, and we think it would make sense to just reuse the req.get_param interface to access form params. The two types of params are very similar, and unifying the interface would afford hooks that support other content types besides forms. |
I would like to note that the req.param interface looks much better for me, so it would be nice just to use it for form parameters also. |
Let me take a stab at implementing this under the existing |
Kurt, I think going the get param route makes a lot of sense. If you want to get Ross
|
See issue falconry#180
I tried this out to see what it would look like (WIP: no tests, needs try blocks around cgi.FieldStorage calls): https://github.com/kgriffs/falcon/blob/issues/180/falcon/request.py#L543 TBH, I'm still on the fence about this. On the one hand, overloading the get_param interface means you can get away with supporting form submissions and direct API requests with the same line of code in the responder; on the other hand, if something goes wrong, it may not be immediately obvious where the errant param came from. What does everybody think about this WIP? If we like it, @hendricksonrw has volunteered to help with test coverage (I think we'll need a large, dedicated test module to do this justice). |
Hmm, I'm starting to have second thoughts on this. What if we were to just land PR #172 and then contribute a before hook to Talons that uses https://github.com/andrew-d/python-multipart or something to handle file uploads? Also, I was thinking, supporting multipart/form-data for file uploads directly in the Falcon "kernel" is going to be really hard to do efficiently without either taking a 3rd-party dependency or writing some Falcon-specific code. cgi.FieldStorage is OK, but I was looking at the code and it could be a lot better. |
After discussing this proposal on IRC, I think it is the right way to go. Unless anyone objects in the next day or two, I will get PR #172 merged and then look at contributing something to talons. |
This patch adds support for merging the body of an 'application/x-www-form-urlencoded' request into the params collection so they can be queried via get_param & Co. Credit goes to sorenh for the originaly idea and POC. Thanks Soren! Partially implements issue #180
OK, only thing left now is to write a Talons hook for processing file uploads from a form post. |
Question: If we can do this without external dependencies (using the cgi module), is there any reason not to just implement this directly in |
Does the above (82ae94f in particular) not mean that it becomes impossible to POST form data to a URI with a query string, without one overwriting the other? This would be an edge case obviously, but I don't see a reason for sacrificing such functionality. As an example, consider this (HTML) form:
If both querystring and form variables are accessed using the same method on To answer the question of form data and/or file uploads directly in Request, IMO that seems fine as long as it doesn't have external dependencies and means no (or negligible) performance hit. |
As currently implemented, the form data overwrites query string values in the case of a collision. If we wanted to separate them, we would need to introduce a parallel interface, perhaps something like this: req.form.get_param
req.form.get_param_as_int
req.form.get_param_as_list
# etc. This makes Request a bit more complicated, but may be worth it if we can find a good example of when an app/service would need to distinguish between fields submitted through a form vs. a query string. |
Could also create separate accessors for GET, POST, and then a default prioritized combination like what Falcon does now. There's definitely a use-case for retrieving lists of parameters with the same name (regardless of GET vs POST), e.g. |
It's useful to have the form fields parsed, but it should be into a separate data structure to req.params and very importantly, it should not consume the POST data such that POST data can no longer be read. Perhaps such parsing should be optional or explicitly executed or something. My application needs the POST data. |
Seems like a reasonable approach to only consume the request stream "just in time" when the form data attribute/helper (however we end up implementing it) is first called. Alternatively, we could buffer |
See also #418 |
Closing since remaining work is captured in #418. |
I'm not sure where this ended up, but my question is, am I both able to get Falcon to parse the form data, AND also get access to the POST data for my own purposes? If I can only do one of these things then that's not an ideal outcome. |
There's an issue (#418) to modify how this works for the 0.3 release. The current plan is to create a separate interface for form params vs. query string params. The request body will no longer be automatically consumed, only being parsed the first time a form param method is called. In your case, it sounds like you would like to have Falcon parse the form params but also have access to the raw string? In the case of |
Yes I need both to be able to parse the form data and get access to the raw POST data. |
OK, thanks for confirming. I added a note to #418 so we don't forget. |
I saw #418 still open, so the issue still remain? |
Maybe do this as a /contrib hook? Alternatively, role into get_param?
application/x-www-form-urlencoded
multipart/form-data
See also: http://www.w3.org/TR/html401/interact/forms.html#submit-format
The text was updated successfully, but these errors were encountered: