"=" in query parameter values #1120

Closed
jacktol opened this Issue May 12, 2017 · 5 comments

jacktol commented May 12, 2017

When a query parameter value contains a "=", akka-http throws the following exception:

akka.http.scaladsl.model.IllegalUriException: Illegal query: Invalid input '=', expected part, '&' or 'EOI' ...

This happens in both strict and relaxed mode: the equals sign is rejected either way.

RFC 3986 (page 23), however, defines query as:

The query component contains non-hierarchical data that, along with
data in the path component (Section 3.3), serves to identify a
resource within the scope of the URI's scheme and naming authority
(if any). The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.

query = *( pchar / "/" / "?" )

The characters slash ("/") and question mark ("?") may represent data
within the query component. Beware that some older, erroneous
implementations may not handle such data correctly when it is used as
the base URI for relative references (Section 5.1), apparently
because they fail to distinguish query data from path data when
looking for hierarchical separators. However, as query components
are often used to carry identifying information in the form of
"key=value" pairs and one frequently used value is a reference to
another URI, it is sometimes better for usability to avoid percent-
encoding those characters.

Where pchar is defined as:

pchar = unreserved / pct-encoded / sub-delims / ":" / "@"

And its types as:

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

According to this definition the equals sign should be allowed. Am I missing something, or is this a bug in akka-http?
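
The grammar quoted above can be checked mechanically. Below is a minimal Python sketch, not akka-http code, that transcribes the ABNF rules into a character set and confirms that "=" is admitted in a query. The names `QUERY_CHARS` and `is_rfc3986_query` are made up for illustration.

```python
import re
import string

# Character classes transcribed from the RFC 3986 ABNF quoted above.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
PCHAR_LITERALS = UNRESERVED | SUB_DELIMS | set(":@")
# query = *( pchar / "/" / "?" )
QUERY_CHARS = PCHAR_LITERALS | set("/?")

# pct-encoded = "%" HEXDIG HEXDIG
PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def is_rfc3986_query(query: str) -> bool:
    """Return True if `query` (without the leading '?') matches the ABNF."""
    i = 0
    while i < len(query):
        if query[i] == "%":
            # A '%' is only legal as the start of a percent-encoded triplet.
            if not PCT_ENCODED.fullmatch(query[i:i + 3]):
                return False
            i += 3
        elif query[i] in QUERY_CHARS:
            i += 1
        else:
            return False
    return True


print("=" in QUERY_CHARS)         # True
print(is_rfc3986_query("a=b=c"))  # True
print(is_rfc3986_query("a=b c"))  # False (a raw space is not allowed)
```

This only checks syntax. It says nothing about how the characters are interpreted as names and values, which RFC 3986 leaves open.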

True, I wouldn't use "=" in values myself, but Angular 2, for example, doesn't escape it either, so supporting such clients currently requires custom work.

Member

jrudolph commented May 12, 2017 edited

Thanks, @jacktol. I think this has been discussed a few times, but I cannot find a good reference right now.

You are right that according to RFC 3986, '=' would be valid everywhere in the query. However, RFC 3986 does not define the semantics of the query string, so other specs (like the W3C HTML specs) could still be stricter than what RFC 3986 allows.

The basic question is how to tell whether a second '=' belongs to the name or to the value of the parameter. If there's no unambiguous spec we usually punt and take the stricter approach.

The best document that would give an unambiguous meaning of multiple '=' in a parameter is the one at https://url.spec.whatwg.org/#urlencoded-parsing:

If bytes contains a 0x3D (=), then let name be the bytes from the start of bytes up to but excluding its first 0x3D (=), and let value be the bytes, if any, after the first 0x3D (=) up to the end of bytes. If 0x3D (=) is the first byte, then name will be the empty byte sequence. If it is the last, then value will be the empty byte sequence.

So, I guess at least in relaxed mode we could go with that definition.
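
To make the quoted rule concrete, here is a rough Python sketch of that splitting step. It is illustrative only: `split_pair` and `parse_urlencoded` are hypothetical names, this is not akka-http's parser, and percent-decoding and '+' handling are deliberately left out.

```python
from typing import List, Tuple


def split_pair(segment: bytes) -> Tuple[bytes, bytes]:
    """Split one '&'-separated segment at its first 0x3D ('=')."""
    # bytes.partition splits at the first occurrence only. If there is no
    # '=', the whole segment becomes the name and the value is empty.
    name, _sep, value = segment.partition(b"=")
    return name, value


def parse_urlencoded(query: bytes) -> List[Tuple[bytes, bytes]]:
    """Parse a raw query into (name, value) pairs, skipping empty segments."""
    return [split_pair(seg) for seg in query.split(b"&") if seg]


print(parse_urlencoded(b"token=abc==&next=/a?b=c"))
# [(b'token', b'abc=='), (b'next', b'/a?b=c')]
```

With this rule every '=' after the first one is simply part of the value, so there is no ambiguity left to resolve.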

jacktol commented May 15, 2017

Thanks for the quick response @jrudolph. What you're suggesting sounds reasonable. It seems like a logical way to interpret such values.

Contributor

gmethvin commented Jun 11, 2017

The other problem here is error handling. Uri#query throws an exception for invalid URIs. To me this is unexpected, especially for a Scala API. Assuming Query is meant to represent any kind of valid query in the URI, it should be able to support strings that don't follow the usual convention.

Contributor

gmethvin commented Jun 12, 2017

@jrudolph I agree with your proposal to use the WHATWG standard for application/x-www-form-urlencoded. This would be useful for Play, as it brings the behavior in line with the URI parsing we had with the Netty server before switching to Akka HTTP as the default.

Member

jrudolph commented Jun 12, 2017

@gmethvin Uri#query works like this for compatibility reasons. There's also rawQuery to access the query string without parsing. I agree the modelling is not perfect, as URI represents a more general structure and then puts additional constraints on the query part which are (semi-) defined only through the HTML / HTTP specs.
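
The split between a raw accessor and a parsing accessor can be illustrated by analogy with Python's standard library. This is not akka-http's API; `raw_query` and `parsed_query` are hypothetical helper names wrapping `urllib.parse`, used only to show the raw string staying available while the interpretation step may fail.

```python
from urllib.parse import parse_qsl, urlsplit


def raw_query(url: str) -> str:
    """Return the query component as-is, without interpreting it."""
    return urlsplit(url).query


def parsed_query(url: str, strict: bool = False):
    """Interpret the raw query as name/value pairs.

    With strict=True, a segment without '=' raises ValueError.
    """
    return parse_qsl(raw_query(url), keep_blank_values=True, strict_parsing=strict)


url = "http://example.com/search?q=a=b&flag"
print(raw_query(url))     # q=a=b&flag
print(parsed_query(url))  # [('q', 'a=b'), ('flag', '')]
try:
    parsed_query(url, strict=True)
except ValueError as err:
    print("strict parsing rejected the query:", err)
```

A caller that only needs to forward or log the query can stay on the raw accessor and never hit the parsing failure.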

gmethvin added six commits to gmethvin/akka-http that referenced this issue between Jun 12 and Jun 14, 2017. Apart from a longer title on the first two, each carries the same message:

Allow '=' in query param values in relaxed mode (Fix #1120)
* Create new character class for query value chars and use this in relaxed mode
* Update relevant tests and examples to reflect this change

Commits: e807a17 and 87ac123 (Jun 12, 2017, titled "Allow '=' character in query parameter values in relaxed mode (Fix #1120)"), 044b331 and b629090 (Jun 12, 2017), 1834db4 (Jun 13, 2017), 2b6931b (Jun 14, 2017)

jrudolph closed this in #1190 Jun 14, 2017

2m added this to the 10.0.8 milestone Jun 20, 2017
