Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More standard application/x-www-form-urlencoded parsing #5110

Merged
merged 1 commit into from Feb 13, 2016

Conversation

pauldraper
Copy link
Contributor

  • Allow semicolon separators in addition to ampersands
  • Parse empty/non-existant names and values

W3C recommends accepting semicolons separators.

We recommend that HTTP server implementors, and in particular, CGI implementors support the use of ";" in place of "&"

Old stuff, but most parsers still follow the recommendation.

Also, most parsers accept empty names and values, and non-existent values. For example, the example request in RFC 5849 3.4.1.3.1 (OAuth 1.0) has the body

c2&a3=2+q

and requires parsing the c2 parameter in order for OAuth to work.


Most parsers allow these two usages, for example com.apache.http.client.utils.URLEncodedUtils or io.netty.handler.codec.http.QueryStringDecoder (HTML forms put fields in GET query strings, hence the name).

I would have used one of those rather than implementing it myself, but the play project doesn't currently depend on either.

Nil
}
}.toSeq
data.split("&|;").map { param =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike splitting on a single character, splitting on a regular expression will parse/compile the regular expression on every invocation. It would be more performant to hold the regular expression in a val on an object and use it to split.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And OpenJDK doesn't cache regexes.....

I fixed this.

FYI, the library examples I mentioned go to the trouble of creating non-regex implementations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can look at increasing performance if and when that's an issue. Another thing we should address is hashdos.

Allow semicolon separators in addition to ampersands
Parse empty/non-existant names and values
@benmccann
Copy link
Contributor

@jroper looks like you were reviewing this one. what do you think? should we merge it?

@benmccann
Copy link
Contributor

@jroper mind taking another look at this one?

@gmethvin
Copy link
Member

gmethvin commented Jan 2, 2016

This lgtm

@mkurz
Copy link
Member

mkurz commented Jan 19, 2016

Ping @jroper

gmethvin added a commit that referenced this pull request Feb 13, 2016
More standard application/x-www-form-urlencoded parsing
@gmethvin gmethvin merged commit 20f2b38 into playframework:master Feb 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants