New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suprising (additional) percent escaping on Uri#authority rendering #1558

Closed
ktoso opened this Issue Nov 28, 2017 · 6 comments

Comments

Projects
None yet
2 participants
@ktoso
Member

ktoso commented Nov 28, 2017

It was reported that:

class UriTest extends WordSpecLike with MustMatchers {
val userInfo = "user:p%40ssword"
val hostAndPath = "host/direc%20tory/fi%20le.tgz?query=test%25#frag%20ment"
val uriString = s"https://$userInfo@$hostAndPath"

"A Java URI" must {
"convert into an Akka HTTP Uri" in {
val javaUri = JavaURI.create(uriString)
val akkaUri = Uri(javaUri.toString)
akkaUri.toString mustEqual javaUri.toString
}
}

The issue seems to be not in parsing but rendering, I'll submit a proposal but not 100% sure if it's the best way to address this.

@ktoso ktoso self-assigned this Nov 28, 2017

ktoso added a commit to ktoso/akka-http that referenced this issue Nov 28, 2017

@jrudolph

This comment has been minimized.

Show comment
Hide comment
@jrudolph

jrudolph Nov 28, 2017

Member

Regarding the general issue: we don't promise round trips for string representations. The minimum invariant we support is equivalence as described in RFC 3986 Section 6.1. I.e.

val uriString = 
val javaUri: java.net.URI = java.net.URI.create(uriString)
val akkaUri = Uri(uriString)

akkaUri == Uri(javaUri.toString)

A stricter behavior could be to render fully normalized URIs. But even then, strings would only be the same if java would also promise to render normalized URIs and that both implementations actually implement the same logic. I haven't checked to which degree we render completely normalized URIs.

For that reason, I think we should leave the behavior of java.net.URI out of the picture for now.

Member

jrudolph commented Nov 28, 2017

Regarding the general issue: we don't promise round trips for string representations. The minimum invariant we support is equivalence as described in RFC 3986 Section 6.1. I.e.

val uriString = 
val javaUri: java.net.URI = java.net.URI.create(uriString)
val akkaUri = Uri(uriString)

akkaUri == Uri(javaUri.toString)

A stricter behavior could be to render fully normalized URIs. But even then, strings would only be the same if java would also promise to render normalized URIs and that both implementations actually implement the same logic. I haven't checked to which degree we render completely normalized URIs.

For that reason, I think we should leave the behavior of java.net.URI out of the picture for now.

@jrudolph

This comment has been minimized.

Show comment
Hide comment
@jrudolph

jrudolph Nov 28, 2017

Member

For me it looks as if our parser wouldn't properly percent decode the user info before passing it into the model as it is doing everywhere else.

Member

jrudolph commented Nov 28, 2017

For me it looks as if our parser wouldn't properly percent decode the user info before passing it into the model as it is doing everywhere else.

@jrudolph

This comment has been minimized.

Show comment
Hide comment
@jrudolph

jrudolph Nov 28, 2017

Member

I.e. for me the bug is this:

scala> Uri("http://test:p%40ssword@example.com").authority.userinfo
res3: String = test:p%40ssword

while the result should be "test:p@ssword".

Member

jrudolph commented Nov 28, 2017

I.e. for me the bug is this:

scala> Uri("http://test:p%40ssword@example.com").authority.userinfo
res3: String = test:p%40ssword

while the result should be "test:p@ssword".

@jrudolph

This comment has been minimized.

Show comment
Hide comment
@jrudolph

jrudolph Nov 28, 2017

Member

If you compare the parser logic e.g. to its processing of a host name it seems the percent decoding was just forgotten. Compare

run(_host = NamedHost(getDecodedStringAndLowerIfEncoded(UTF8)))

with

clearSB() ~ zeroOrMore(`userinfo-char` ~ appendSB()| `pct-encoded`) ~ '@' ~ run(_userinfo = sb.toString)

Member

jrudolph commented Nov 28, 2017

If you compare the parser logic e.g. to its processing of a host name it seems the percent decoding was just forgotten. Compare

run(_host = NamedHost(getDecodedStringAndLowerIfEncoded(UTF8)))

with

clearSB() ~ zeroOrMore(`userinfo-char` ~ appendSB()| `pct-encoded`) ~ '@' ~ run(_userinfo = sb.toString)

@jrudolph

This comment has been minimized.

Show comment
Hide comment
@jrudolph

jrudolph Nov 28, 2017

Member

Should be simple enough to fix by using clearSBForDecoding() and getDecodedString() in the userinfo rule.

Member

jrudolph commented Nov 28, 2017

Should be simple enough to fix by using clearSBForDecoding() and getDecodedString() in the userinfo rule.

@jrudolph

This comment has been minimized.

Show comment
Hide comment
@jrudolph

jrudolph Nov 28, 2017

Member

Apart from that RFC 3986 Section 3.2.1 states this:


3.2.1.  User Information

   The userinfo subcomponent may consist of a user name and, optionally,
   scheme-specific information about how to gain authorization to access
   the resource.  The user information, if present, is followed by a
   commercial at-sign ("@") that delimits it from the host.

      userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )

   Use of the format "user:password" in the userinfo field is
   deprecated.  Applications should not render as clear text any data
   after the first colon (":") character found within a userinfo
   subcomponent unless the data after the colon is the empty string
   (indicating no password).  Applications may choose to ignore or
   reject such data when it is received as part of a reference and
   should reject the storage of such data in unencrypted form.  The
   passing of authentication information in clear text has proven to be
   a security risk in almost every case where it has been used.

   Applications that render a URI for the sake of user feedback, such as
   in graphical hypertext browsing, should render userinfo in a way that
   is distinguished from the rest of a URI, when feasible.  Such
   rendering will assist the user in cases where the userinfo has been
   misleadingly crafted to look like a trusted domain name
   (Section 7.6).

So, we could even go as far as dropping or ignoring userinfo completely.

Member

jrudolph commented Nov 28, 2017

Apart from that RFC 3986 Section 3.2.1 states this:


3.2.1.  User Information

   The userinfo subcomponent may consist of a user name and, optionally,
   scheme-specific information about how to gain authorization to access
   the resource.  The user information, if present, is followed by a
   commercial at-sign ("@") that delimits it from the host.

      userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )

   Use of the format "user:password" in the userinfo field is
   deprecated.  Applications should not render as clear text any data
   after the first colon (":") character found within a userinfo
   subcomponent unless the data after the colon is the empty string
   (indicating no password).  Applications may choose to ignore or
   reject such data when it is received as part of a reference and
   should reject the storage of such data in unencrypted form.  The
   passing of authentication information in clear text has proven to be
   a security risk in almost every case where it has been used.

   Applications that render a URI for the sake of user feedback, such as
   in graphical hypertext browsing, should render userinfo in a way that
   is distinguished from the rest of a URI, when feasible.  Such
   rendering will assist the user in cases where the userinfo has been
   misleadingly crafted to look like a trusted domain name
   (Section 7.6).

So, we could even go as far as dropping or ignoring userinfo completely.

jrudolph added a commit to jrudolph/akka-http that referenced this issue Nov 29, 2017

jrudolph added a commit to jrudolph/akka-http that referenced this issue Nov 29, 2017

@ktoso ktoso added this to the 10.0.11 milestone Nov 29, 2017

jrudolph added a commit that referenced this issue Nov 29, 2017

Merge pull request #1576 from jrudolph/jr/w/follow-up-1559
Percent decode userinfo while parsing URIs (#1558)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment