net/http: Request.Form field data may include HTML entities #45479
I'm no HTTP expert, so this could be working as intended, but I notice that when an HTML form is posted, the document's encoding is also used to encode the form data. So, if the document is Latin-1, and a text field contains "
Is Chrome's behavior, of using an HTML entity reference
If (a), are servers expected to handle HTML entity references in form data? I could see no mention of this expectation in the net/http package code or docs. I also can't see how a server could distinguish a Chrome-introduced HTML entity reference from a form that literally contained that sequence, which makes me think (a) is not the answer.
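To make the ambiguity concrete, here is a minimal sketch. It assumes the browser replaced a character it could not encode (Ω, U+03A9, picked purely for illustration) with the numeric reference `&#937;` before percent-encoding it; `parseNote` and the `note` field name are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseNote is a hypothetical helper: it returns the "note" field
// from an application/x-www-form-urlencoded body.
func parseNote(body string) string {
	vals, err := url.ParseQuery(body)
	if err != nil {
		return ""
	}
	return vals.Get("note")
}

func main() {
	// Assumption: the user typed Ω into a form on a Latin-1 page and the
	// browser substituted the reference "&#937;" before percent-encoding.
	substituted := "note=%26%23937%3B"

	// The user literally typed the seven characters "&#937;".
	literal := "note=%26%23937%3B"

	// Both bodies are byte-for-byte identical on the wire, so the server
	// sees the same value and cannot tell the two cases apart.
	fmt.Println(parseNote(substituted) == parseNote(literal))
	fmt.Printf("%q\n", parseNote(substituted))
}
```

Under that assumption the server has no signal to distinguish the two inputs, which is the concern raised above.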
A word of documentation in net/http might help. StackOverflow and the usual sources were surprisingly unhelpful.
Thanks for reporting this!
HTTP's transport of form fields is completely HTML-agnostic; as far as I know, the HTTP spec never mentions such a feature.
Here Go is just parsing the HTTP request and relaying the data to user code, which, if it wants to support entity references, has to decode them itself (something I would advise against for security reasons).
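For illustration only, this is roughly what decoding in user code could look like with the standard `html` package. The `decodeField` helper is hypothetical, and the second input shows one plausible reason for caution (my assumption, not something spelled out above): unescaping also revives markup that the client sent in escaped form.

```go
package main

import (
	"fmt"
	"html"
)

// decodeField is a hypothetical helper that unescapes HTML entity
// references found in a form value. It is shown to illustrate the
// risk, not as a recommendation.
func decodeField(v string) string {
	return html.UnescapeString(v)
}

func main() {
	// A numeric reference becomes the character it names.
	fmt.Println(decodeField("&#937;"))

	// Escaped markup turns back into live markup, so any later output
	// step must escape the value again.
	fmt.Println(decodeField("&lt;script&gt;alert(1)&lt;/script&gt;"))
}
```

Any value decoded this way has to be treated as untrusted again before it is written into a page.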
My suggestion would be to not let it happen: always make sure you're using UTF-8. Note that the stdlib always tries to do this when it has the power to choose. Moreover, I'd suggest not trying to decode anything from the client that might have been weirdly encoded. Either have some JS on the client (or the browser itself) apply a well-defined encoding, or just reject anything you don't recognize.
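A minimal sketch of the "reject what you don't recognize" approach, assuming the page is served as UTF-8 so that well-behaved browsers submit UTF-8. The names `formIsUTF8`, `handler`, and `post` are hypothetical, and the status code is just one reasonable choice.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"unicode/utf8"
)

// formIsUTF8 reports whether every key and value in the parsed form
// is valid UTF-8.
func formIsUTF8(r *http.Request) bool {
	if err := r.ParseForm(); err != nil {
		return false
	}
	for key, vals := range r.Form {
		if !utf8.ValidString(key) {
			return false
		}
		for _, v := range vals {
			if !utf8.ValidString(v) {
				return false
			}
		}
	}
	return true
}

// handler rejects submissions that are not valid UTF-8 instead of
// trying to guess their encoding.
func handler(w http.ResponseWriter, r *http.Request) {
	if !formIsUTF8(r) {
		http.Error(w, "form data must be UTF-8", http.StatusBadRequest)
		return
	}
	fmt.Fprintln(w, "ok")
}

// post sends a form body to handler in-process and returns the status code.
func post(body string) int {
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(body))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(post("note=%CE%A9")) // Ω encoded as UTF-8
	fmt.Println(post("note=%E9"))    // é as a single Latin-1 byte, invalid UTF-8
}
```

This check only catches raw bytes that are not UTF-8. An entity reference such as `&#937;` is plain ASCII and passes it, so the real prevention is serving the form page as UTF-8 in the first place.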
On the documentation side of things: Go implements the HTTP spec, which doesn't meddle with HTML entity encoding in any way, so I'm not sure about adding a doc line about this. I fear it might create more confusion than it resolves.