"=" in query parameter values #1120

Closed
jacktol opened this Issue May 12, 2017 · 5 comments

jacktol commented May 12, 2017

When a query parameter value contains a "=", akka-http throws the following exception:

akka.http.scaladsl.model.IllegalUriException: Illegal query: Invalid input '=', expected part, '&' or 'EOI' ...

It doesn't matter whether you're in strict or relaxed mode, the equals sign is not allowed.

Looking at RFC 3986 (page 23), however, query is defined as:

The query component contains non-hierarchical data that, along with
data in the path component (Section 3.3), serves to identify a
resource within the scope of the URI's scheme and naming authority
(if any). The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.

query = *( pchar / "/" / "?" )

The characters slash ("/") and question mark ("?") may represent data
within the query component. Beware that some older, erroneous
implementations may not handle such data correctly when it is used as
the base URI for relative references (Section 5.1), apparently
because they fail to distinguish query data from path data when
looking for hierarchical separators. However, as query components
are often used to carry identifying information in the form of
"key=value" pairs and one frequently used value is a reference to
another URI, it is sometimes better for usability to avoid percent-
encoding those characters.

Where pchar is defined as:

pchar = unreserved / pct-encoded / sub-delims / ":" / "@"

And its types as:

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

According to this definition the equals sign should be allowed. Am I missing something, or is this a bug in akka-http?

Granted, I wouldn't use "=" in values myself, but Angular2, for example, doesn't escape it either, so handling such requests currently requires custom work.
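
The grammar quoted above can be checked mechanically. The following is a minimal sketch in Python (not akka-http code) that transcribes the `query`, `pchar`, `unreserved`, `pct-encoded` and `sub-delims` rules into a regular expression. The names `QUERY_RE` and `is_rfc3986_query` are made up for this illustration. It only tests whether a string matches the RFC 3986 grammar and says nothing about how a parser should split names from values.

```python
import re

# Character sets transcribed from RFC 3986 (written for use inside [...]).
UNRESERVED = r"A-Za-z0-9\-._~"   # ALPHA / DIGIT / "-" / "." / "_" / "~"
SUB_DELIMS = r"!$&'()*+,;="      # note that "=" is a sub-delim

# query = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
QUERY_RE = re.compile(
    r"(?:[" + UNRESERVED + SUB_DELIMS + r":@/?]|%[0-9A-Fa-f]{2})*"
)


def is_rfc3986_query(query: str) -> bool:
    """Return True if `query` (without the leading '?') matches the grammar."""
    return QUERY_RE.fullmatch(query) is not None


if __name__ == "__main__":
    print(is_rfc3986_query("key=a=b"))  # second "=" is a plain sub-delim
    print(is_rfc3986_query("key=a b"))  # a raw space is not a pchar
```

Under this reading, a second "=" inside a value is syntactically fine. Whether it belongs to the name or the value is a separate question that the grammar does not answer.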

Member

jrudolph commented May 12, 2017

Thanks, @jacktol. I think this has been discussed a few times, but I cannot find a good reference right now.

You are right that, according to RFC 3986, '=' would be valid everywhere in the query. However, RFC 3986 does not define the semantics of the query string, so other specs (like the W3C HTML specs) could still be stricter than what RFC 3986 allows.

The basic question is how to tell whether the second '=' should be part of the name or of the value of the parameter. If there's no unambiguous spec, we usually punt and take the stricter approach.

The best document I know of that gives an unambiguous meaning to multiple '=' in a parameter is the one at https://url.spec.whatwg.org/#urlencoded-parsing:

If bytes contains a 0x3D (=), then let name be the bytes from the start of bytes up to but excluding its first 0x3D (=), and let value be the bytes, if any, after the first 0x3D (=) up to the end of bytes. If 0x3D (=) is the first byte, then name will be the empty byte sequence. If it is the last, then value will be the empty byte sequence.

So, I guess at least in relaxed mode we could go with that definition.
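
The rule quoted above amounts to splitting each "&"-separated chunk at its first "=". Here is a minimal sketch in Python, not the akka-http implementation; the function name `split_name_value` is invented for this illustration. The no-"=" case (the whole chunk becomes the name, with an empty value) follows the sentence that comes after the quoted passage in the WHATWG spec. Percent-decoding and "+" handling are deliberately left out.

```python
from typing import Tuple


def split_name_value(chunk: bytes) -> Tuple[bytes, bytes]:
    """Split one query chunk into (name, value) at the FIRST 0x3D (=).

    bytes.partition returns (before, separator, after). If no "=" is
    present it returns (chunk, b"", b""), which gives the whole chunk
    as the name and an empty value.
    """
    name, _sep, value = chunk.partition(b"=")
    return name, value


if __name__ == "__main__":
    for chunk in (b"a=b=c", b"=x", b"a=", b"flag"):
        print(chunk, "->", split_name_value(chunk))
```

With this rule, any "=" after the first one simply stays in the value, so `a=b=c` is read as name `a` with value `b=c`.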

jacktol commented May 15, 2017

Thanks for the quick response @jrudolph. What you're suggesting sounds reasonable. It seems like a logical way to interpret such values.

Contributor

gmethvin commented Jun 11, 2017

The other problem here is error handling. Uri#query throws an exception for invalid URIs. To me this is unexpected, especially for a Scala API. Assuming Query is meant to represent any kind of valid query in the URI, it should be able to support strings that don't follow the usual convention.

Contributor

gmethvin commented Jun 12, 2017

@jrudolph I agree with your proposal to use the WHATWG standard for application/x-www-form-urlencoded. This would be useful for Play, as it brings the behavior in line with the URI parsing we had with the Netty server prior to switching to Akka HTTP as the default.

Member

jrudolph commented Jun 12, 2017

@gmethvin Uri#query works like this for compatibility reasons. There's also rawQuery to access the query string without parsing. I agree the modelling is not perfect, as URI represents a more general structure and then puts additional constraints on the query part that are only (semi-)defined through the HTML / HTTP specs.
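
As an analogy only, the split between a raw query string and a parsed view that may fail can be sketched with Python's standard library. This is not akka-http's API. The helper name `raw_and_parsed` is hypothetical, and returning `None` instead of raising is just one way to address the error-handling concern raised earlier in the thread.

```python
from typing import List, Optional, Tuple
from urllib.parse import parse_qsl, urlsplit


def raw_and_parsed(url: str) -> Tuple[str, Optional[List[Tuple[str, str]]]]:
    """Return the raw query string plus a parsed view, or None if strict parsing fails."""
    raw = urlsplit(url).query  # always available, no interpretation applied
    try:
        # strict_parsing=True makes parse_qsl raise ValueError on fields
        # it cannot interpret as name=value (e.g. a field with no "=").
        pairs = parse_qsl(raw, keep_blank_values=True, strict_parsing=True)
    except ValueError:
        pairs = None
    return raw, pairs


if __name__ == "__main__":
    print(raw_and_parsed("http://example.com/?a=b=c"))
    print(raw_and_parsed("http://example.com/?flag"))
```

The raw string stays accessible even when the stricter key/value interpretation rejects the input, which is the same trade-off described above.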

gmethvin added a commit to gmethvin/akka-http that referenced this issue Jun 12, 2017

Allow '=' character in query parameter values in relaxed mode (Fix #1120)

* Create new character class for query value chars and use this in relaxed mode
* Update relevant tests and examples to reflect this change

gmethvin added a commit to gmethvin/akka-http that referenced this issue Jun 12, 2017

Allow '=' in query param values in relaxed mode (Fix #1120)
* Create new character class for query value chars and use this in relaxed mode
* Update relevant tests and examples to reflect this change

gmethvin added a commit to gmethvin/akka-http that referenced this issue Jun 13, 2017

Allow '=' in query param values in relaxed mode (Fix #1120)
* Create new character class for query value chars and use this in relaxed mode
* Update relevant tests and examples to reflect this change

gmethvin added a commit to gmethvin/akka-http that referenced this issue Jun 14, 2017

Allow '=' in query param values in relaxed mode (Fix #1120)
* Create new character class for query value chars and use this in relaxed mode
* Update relevant tests and examples to reflect this change

@jrudolph jrudolph closed this in #1190 Jun 14, 2017

jrudolph added a commit that referenced this issue Jun 14, 2017

@2m 2m added this to the 10.0.8 milestone Jun 20, 2017

tomrf1 added a commit to tomrf1/akka-http that referenced this issue Aug 13, 2017
