Incoming URL is decoded twice (at least with Nancy.Hosting.Self) #1280

Closed
ChrisBrandhorst opened this Issue Oct 20, 2013 · 36 comments

Comments

Projects
None yet
@ChrisBrandhorst
Contributor

ChrisBrandhorst commented Oct 20, 2013

When using Nancy.Hosting.Self (at least in this case), an incoming URL is decoded using HttpUtility.UrlDecode twice.

First, in NancyHost.cs#ConvertRequestToNancyRequest. The resulting Request is then passed to the routeresolver, where in DefaultRouteResolver#ResolveResult, it is decoded again.

This is a problem when you have URLs like

/containers/123/artist/AC%252FDC

combined with the following route:

/containers/{id}/{filter*}

In this case, the filter part is a string I would like to split on the slash again (using the slash like this looks nicer in the UI). However, because of the double decoding, the resulting value for filter is AC/DC, where I would like to have AC%2FDC

@phillip-haydon

This comment has been minimized.

Show comment
Hide comment
@phillip-haydon

phillip-haydon Nov 14, 2013

Member

This will be cause by a change I made a while ago.

@thecodejunkie if you remember we had that issue where routes were not being decoded before route checking was occuring. This happened when doing it on an asp.net host.

So we decoded it, but it looks like self hosting is also decoding as well.

The code needs to remove the decoding from the NancyHost and only do it at the RouteResolver which will happen for all hosts rather than specific hosts.

Member

phillip-haydon commented Nov 14, 2013

This will be cause by a change I made a while ago.

@thecodejunkie if you remember we had that issue where routes were not being decoded before route checking was occuring. This happened when doing it on an asp.net host.

So we decoded it, but it looks like self hosting is also decoding as well.

The code needs to remove the decoding from the NancyHost and only do it at the RouteResolver which will happen for all hosts rather than specific hosts.

@jchannon

This comment has been minimized.

Show comment
Hide comment
@jchannon

jchannon Nov 14, 2013

Member

Nope the opposite way around. AspNet hosting already decodes when the request comes in and then the execution path goes to DefaultRouteResolver and decodes again. I think removing that decode from DefaultRouteResolver is better and let all hosts be responsible for decoding.

AspNet example - https://github.com/NancyFx/Nancy/blob/master/src/Nancy.Hosting.Aspnet/NancyHttpRequestHandler.cs#L79

Member

jchannon commented Nov 14, 2013

Nope the opposite way around. AspNet hosting already decodes when the request comes in and then the execution path goes to DefaultRouteResolver and decodes again. I think removing that decode from DefaultRouteResolver is better and let all hosts be responsible for decoding.

AspNet example - https://github.com/NancyFx/Nancy/blob/master/src/Nancy.Hosting.Aspnet/NancyHttpRequestHandler.cs#L79

@thecodejunkie

This comment has been minimized.

Show comment
Hide comment
@thecodejunkie

thecodejunkie Nov 15, 2013

Member

Hosts should decode, route resolve should not.

Member

thecodejunkie commented Nov 15, 2013

Hosts should decode, route resolve should not.

@jchannon

This comment has been minimized.

Show comment
Hide comment
@jchannon

jchannon Nov 15, 2013

Member

High 5 me :)

Member

jchannon commented Nov 15, 2013

High 5 me :)

@ChrisBrandhorst

This comment has been minimized.

Show comment
Hide comment
@ChrisBrandhorst

ChrisBrandhorst Nov 15, 2013

Contributor

Allright, I can make a PR for the DefaultRouteResolver, but isn't this change possible breaking?
Mayby all hosts should be checked if they actually do the decoding?

Contributor

ChrisBrandhorst commented Nov 15, 2013

Allright, I can make a PR for the DefaultRouteResolver, but isn't this change possible breaking?
Mayby all hosts should be checked if they actually do the decoding?

@thecodejunkie

This comment has been minimized.

Show comment
Hide comment
@thecodejunkie

thecodejunkie Nov 15, 2013

Member

Yes. A pull-request for this issue needs to take on the big picture

  • Update the DefaultRouteResolver
  • Verify the correct behaviour of the hosts
Member

thecodejunkie commented Nov 15, 2013

Yes. A pull-request for this issue needs to take on the big picture

  • Update the DefaultRouteResolver
  • Verify the correct behaviour of the hosts
@provegard

This comment has been minimized.

Show comment
Hide comment
@provegard

provegard Nov 19, 2013

Contributor

I experienced this bug as well, but with Nancy.Hosting.Aspnet. Pluses in URL paths are erroneously converted into spaces.

Also important: In .NET 4.5.1, Microsoft introduced a new app setting called "aspnet:UseLegacyRequestUrlGeneration". The default value is false, meaning that they change behavior so that the request URL passed to a HTTP handler now has escaped path segments. Prior to 4.5.1, or if the setting is set to true in Web.config, path segments passed to a HTTP handler are not escaped.

Contributor

provegard commented Nov 19, 2013

I experienced this bug as well, but with Nancy.Hosting.Aspnet. Pluses in URL paths are erroneously converted into spaces.

Also important: In .NET 4.5.1, Microsoft introduced a new app setting called "aspnet:UseLegacyRequestUrlGeneration". The default value is false, meaning that they change behavior so that the request URL passed to a HTTP handler now has escaped path segments. Prior to 4.5.1, or if the setting is set to true in Web.config, path segments passed to a HTTP handler are not escaped.

@provegard

This comment has been minimized.

Show comment
Hide comment
@provegard

provegard Dec 20, 2013

Contributor

I just tried to upgrade from Nancy 0.20.0 to latest, and had tests failing because the query string is now URL encoded - it wasn't before. I don't want to add "manual" decoding because I think it's related to this issue. Any update?

Contributor

provegard commented Dec 20, 2013

I just tried to upgrade from Nancy 0.20.0 to latest, and had tests failing because the query string is now URL encoded - it wasn't before. I don't want to add "manual" decoding because I think it's related to this issue. Any update?

@thecodejunkie

This comment has been minimized.

Show comment
Hide comment
@thecodejunkie

thecodejunkie Dec 20, 2013

Member

umm, that doesn't sound right

Member

thecodejunkie commented Dec 20, 2013

umm, that doesn't sound right

@jchannon

This comment has been minimized.

Show comment
Hide comment
@jchannon

jchannon Dec 20, 2013

Member

I believe the encoding was added but you shouldn't have to decode anything
manually

Member

jchannon commented Dec 20, 2013

I believe the encoding was added but you shouldn't have to decode anything
manually

@thecodejunkie

This comment has been minimized.

Show comment
Hide comment
@thecodejunkie

thecodejunkie Dec 20, 2013

Member

@provegard could you post your test so we can reproduce?

Member

thecodejunkie commented Dec 20, 2013

@provegard could you post your test so we can reproduce?

@provegard

This comment has been minimized.

Show comment
Hide comment
@provegard

provegard Dec 20, 2013

Contributor

Oops, looking at my test I see that I actually encode the URL (using HttpUtility.UrlEncode) before passing it as query string...

But that means that Nancy.Testing 0.20.0 would decode the query string for me whereas 0.21 doesn't?

Contributor

provegard commented Dec 20, 2013

Oops, looking at my test I see that I actually encode the URL (using HttpUtility.UrlEncode) before passing it as query string...

But that means that Nancy.Testing 0.20.0 would decode the query string for me whereas 0.21 doesn't?

@jchannon

This comment has been minimized.

Show comment
Hide comment
@jchannon

jchannon Dec 20, 2013

Member

You should be able to do

with.Query("name", "Fred") without the need for using http utility.

Either package should decode for you

On Friday, 20 December 2013, Per Rovegård wrote:

Oops, looking at my test I see that I actually encode the URL (using
HttpUtility.UrlEncode) before passing it as query string...

But that means that Nancy.Testing 0.20.0 would decode the query string for
me whereas 0.21 doesn't?


Reply to this email directly or view it on GitHubhttps://github.com/NancyFx/Nancy/issues/1280#issuecomment-31001076
.

Member

jchannon commented Dec 20, 2013

You should be able to do

with.Query("name", "Fred") without the need for using http utility.

Either package should decode for you

On Friday, 20 December 2013, Per Rovegård wrote:

Oops, looking at my test I see that I actually encode the URL (using
HttpUtility.UrlEncode) before passing it as query string...

But that means that Nancy.Testing 0.20.0 would decode the query string for
me whereas 0.21 doesn't?


Reply to this email directly or view it on GitHubhttps://github.com/NancyFx/Nancy/issues/1280#issuecomment-31001076
.

@provegard

This comment has been minimized.

Show comment
Hide comment
@provegard

provegard Dec 20, 2013

Contributor

@jchannon If I omit my manual encoding in my test case and pass a URL with a + in it, Nancy/Nancy.Testing 0.20.0 will decode the + to a space. I guess that is what this issue is about.

Nancy 0.21 doesn't do the decoding. But this issue hasn't been closed, so was the decoding removed in relation to fixing some other issue?

Contributor

provegard commented Dec 20, 2013

@jchannon If I omit my manual encoding in my test case and pass a URL with a + in it, Nancy/Nancy.Testing 0.20.0 will decode the + to a space. I guess that is what this issue is about.

Nancy 0.21 doesn't do the decoding. But this issue hasn't been closed, so was the decoding removed in relation to fixing some other issue?

@phillip-haydon

This comment has been minimized.

Show comment
Hide comment
@phillip-haydon

phillip-haydon Dec 22, 2013

Member

I'm pretty sure the test code was changed so when you added Query parameters it would encode them to simulate an actual query value without the need for encoding them yourself. (I think this is an issue @jchannon fixed?)

Member

phillip-haydon commented Dec 22, 2013

I'm pretty sure the test code was changed so when you added Query parameters it would encode them to simulate an actual query value without the need for encoding them yourself. (I think this is an issue @jchannon fixed?)

@jchannon

This comment has been minimized.

Show comment
Hide comment
@jchannon

jchannon Dec 22, 2013

Member

Yup thats what I did

Member

jchannon commented Dec 22, 2013

Yup thats what I did

@provegard

This comment has been minimized.

Show comment
Hide comment
@provegard

provegard Jan 2, 2014

Contributor

I think the test code change makes sense, but I also feel that it should be listed in Breaking changes. When upgrading Nancy, I always check Breaking changes. If there are none I really don't expect my test suite to start failing. Don't you agree?

Contributor

provegard commented Jan 2, 2014

I think the test code change makes sense, but I also feel that it should be listed in Breaking changes. When upgrading Nancy, I always check Breaking changes. If there are none I really don't expect my test suite to start failing. Don't you agree?

@grumpydev

This comment has been minimized.

Show comment
Hide comment
@grumpydev

grumpydev Jan 2, 2014

Member

Marked it as a breaking change - we will need to update the wiki for 0.22 as there's several in there now that havne't been added.

Member

grumpydev commented Jan 2, 2014

Marked it as a breaking change - we will need to update the wiki for 0.22 as there's several in there now that havne't been added.

@ChrisBrandhorst

This comment has been minimized.

Show comment
Hide comment
@ChrisBrandhorst

ChrisBrandhorst Jan 2, 2014

Contributor

What would be a convenient process for checking this? The change in DefaultRouteResolver is easy, but I personally can't run the tests (running on VS Express), and don't have the ability to try the other hosts...

Contributor

ChrisBrandhorst commented Jan 2, 2014

What would be a convenient process for checking this? The change in DefaultRouteResolver is easy, but I personally can't run the tests (running on VS Express), and don't have the ability to try the other hosts...

@jchannon

This comment has been minimized.

Show comment
Hide comment
@jchannon

jchannon Jan 2, 2014

Member

I think the plan is to remove the decoding from DefaultRouteResolver and add the decoding to the hosts where needed (ASP.NET does it automatically) and have some tests to prove the hosts are doing it.

Member

jchannon commented Jan 2, 2014

I think the plan is to remove the decoding from DefaultRouteResolver and add the decoding to the hosts where needed (ASP.NET does it automatically) and have some tests to prove the hosts are doing it.

@provegard

This comment has been minimized.

Show comment
Hide comment
@provegard

provegard Jan 2, 2014

Contributor

Please also see my note about aspnet:UseLegacyRequestUrlGeneration above - from .NET 4.5.1, ASP.NET will encode URL paths, but in earlier versions it didn't.

Contributor

provegard commented Jan 2, 2014

Please also see my note about aspnet:UseLegacyRequestUrlGeneration above - from .NET 4.5.1, ASP.NET will encode URL paths, but in earlier versions it didn't.

@jchannon

This comment has been minimized.

Show comment
Hide comment
@jchannon

jchannon Jan 13, 2014

Member

Anyone started on this yet? Just come across some code in my projet that I don't like 😄

var email = (string)parameters.email;
email = email.Replace(" ", "+");
Member

jchannon commented Jan 13, 2014

Anyone started on this yet? Just come across some code in my projet that I don't like 😄

var email = (string)parameters.email;
email = email.Replace(" ", "+");
@thecodejunkie

This comment has been minimized.

Show comment
Hide comment
@thecodejunkie

thecodejunkie Jan 13, 2014

Member

Nope =( x-mas leave, project deadline and being sick had built up a healthy backlog of stuff =(

Member

thecodejunkie commented Jan 13, 2014

Nope =( x-mas leave, project deadline and being sick had built up a healthy backlog of stuff =(

@albertjan

This comment has been minimized.

Show comment
Hide comment
@albertjan

albertjan Jan 14, 2014

Contributor

Should it even be decoded once?

Contributor

albertjan commented Jan 14, 2014

Should it even be decoded once?

@phillip-haydon

This comment has been minimized.

Show comment
Hide comment
@phillip-haydon

phillip-haydon Jan 27, 2014

Member

It should be decoded once, because the browser encodes the request.

The problem is deciding where it should be decoded, if host A decodes, and host B does not, then it means that it should be done at the host.

But if an OWIN web server does not decode, and IIS does decode, then we get differences when hosting OWIN in a console app vs IIS...

If we do it at host level then we get duplication everywhere...

What to doooooooo... I think this needs to be deferred from 0.22.0 until we can investigate further and figure out a proper solution.

Member

phillip-haydon commented Jan 27, 2014

It should be decoded once, because the browser encodes the request.

The problem is deciding where it should be decoded, if host A decodes, and host B does not, then it means that it should be done at the host.

But if an OWIN web server does not decode, and IIS does decode, then we get differences when hosting OWIN in a console app vs IIS...

If we do it at host level then we get duplication everywhere...

What to doooooooo... I think this needs to be deferred from 0.22.0 until we can investigate further and figure out a proper solution.

@grumpydev

This comment has been minimized.

Show comment
Hide comment
@grumpydev

grumpydev Jan 27, 2014

Member

Yeah, think we need to put this back.. there's lots of things that can potentially be encoded, and as there's no way we can detect if they're already encoded, we need to make sure there's a set rule that covers everything - either the engine dictates it will decode once so the hosts have to encode if necessary, or if it dictates it won't decode anything, we need to come to an agreement.

Member

grumpydev commented Jan 27, 2014

Yeah, think we need to put this back.. there's lots of things that can potentially be encoded, and as there's no way we can detect if they're already encoded, we need to make sure there's a set rule that covers everything - either the engine dictates it will decode once so the hosts have to encode if necessary, or if it dictates it won't decode anything, we need to come to an agreement.

@ChrisBrandhorst

This comment has been minimized.

Show comment
Hide comment
@ChrisBrandhorst

ChrisBrandhorst Jan 27, 2014

Contributor

Perhaps, since there are so many variables, a quick fix would be to be able to configure the encoding behavior of the DefaultRouteResolver in the bootstrapper?

Contributor

ChrisBrandhorst commented Jan 27, 2014

Perhaps, since there are so many variables, a quick fix would be to be able to configure the encoding behavior of the DefaultRouteResolver in the bootstrapper?

@RPM1984

This comment has been minimized.

Show comment
Hide comment
@RPM1984

RPM1984 Feb 13, 2014

Yep, +1 for this issue. Trying with the URL:
/users/a%2Bb%40c.com (users/a+b@c.com)

and when it gets into my module, it's "a b@c.com".

RPM1984 commented Feb 13, 2014

Yep, +1 for this issue. Trying with the URL:
/users/a%2Bb%40c.com (users/a+b@c.com)

and when it gets into my module, it's "a b@c.com".

@liddellj

This comment has been minimized.

Show comment
Hide comment
@liddellj

liddellj May 13, 2014

Contributor

I've encountering this as well, hosting under ASP.NET, see #1499 for details.

Contributor

liddellj commented May 13, 2014

I've encountering this as well, hosting under ASP.NET, see #1499 for details.

@thecodejunkie thecodejunkie modified the milestones: 0.24, 0.23 Jun 4, 2014

@grumpydev grumpydev modified the milestones: Future, 0.24 Jun 10, 2014

@ollnixon

This comment has been minimized.

Show comment
Hide comment
@ollnixon

ollnixon Jul 28, 2014

Is there any news on this issue?

Is there any news on this issue?

@rasmus-beck

This comment has been minimized.

Show comment
Hide comment
@rasmus-beck

rasmus-beck Sep 7, 2014

Contributor

Hi guys,

Just looking at what it would take to solve this issue. I have just run into a similar problem #1676.

As far as I can tell, this isn't just about where the decoding should be done. It is pretty obvious that if some hosts do this be default, then hosts should handle this problem.

But how the result is carried forth in the code, is also a problem. Nancy.Url isn't capable of handling a decoded path as I see it. If the encoded path contains %2f which is a /, then a decoded path in Nancy.Url will not be able to tell the difference between a separator and a segment that happens to contain a forward slash.

So perhaps the solution is to always force the Path property on Nancy.Url to be encoded, and then let Nancy.Url handle decoding by exposing a new property PathSegments, which is a string array of decoded path segments.

This also has the added benefit of being backwards compatible as I see it.

The route resolver should still be changed to not do decoding.

/rasmus

Contributor

rasmus-beck commented Sep 7, 2014

Hi guys,

Just looking at what it would take to solve this issue. I have just run into a similar problem #1676.

As far as I can tell, this isn't just about where the decoding should be done. It is pretty obvious that if some hosts do this be default, then hosts should handle this problem.

But how the result is carried forth in the code, is also a problem. Nancy.Url isn't capable of handling a decoded path as I see it. If the encoded path contains %2f which is a /, then a decoded path in Nancy.Url will not be able to tell the difference between a separator and a segment that happens to contain a forward slash.

So perhaps the solution is to always force the Path property on Nancy.Url to be encoded, and then let Nancy.Url handle decoding by exposing a new property PathSegments, which is a string array of decoded path segments.

This also has the added benefit of being backwards compatible as I see it.

The route resolver should still be changed to not do decoding.

/rasmus

@rasmus-beck

This comment has been minimized.

Show comment
Hide comment
@rasmus-beck

rasmus-beck Sep 7, 2014

Contributor

I am thinking something along the lines of this commit: rasmus-beck@6bdc2ca

I still need to create a number of tests to cover all hosts, and make sure I have actually looked at all hosts.

That does introduce two breaking changes on IRouteResolverTrie, but we can't really get around this as I see it.

/rasmus

Contributor

rasmus-beck commented Sep 7, 2014

I am thinking something along the lines of this commit: rasmus-beck@6bdc2ca

I still need to create a number of tests to cover all hosts, and make sure I have actually looked at all hosts.

That does introduce two breaking changes on IRouteResolverTrie, but we can't really get around this as I see it.

/rasmus

@thecodejunkie thecodejunkie modified the milestones: 1.0 Alpha, Future Sep 8, 2014

@khellang khellang modified the milestones: 1.0 Beta, 1.0 Alpha Jan 27, 2015

@grumpydev grumpydev removed this from the 1.0 Beta milestone Feb 4, 2015

@ChrisBrandhorst ChrisBrandhorst removed this from the 1.0 Beta milestone Feb 4, 2015

@kenotron

This comment has been minimized.

Show comment
Hide comment
@kenotron

kenotron Apr 6, 2015

This is a pretty serious flaw. Why don't we create a UrlPathDecode that is mostly what UrlDecode does in HttpUtility and use THAT inside the parts where you are decoding a path? So, the only thing you need is to modify the DefaultRouteResolver.cs to use that. For everyone else out there, you can get by with this blog's recommendations:

http://programmaticallyspeaking.com/the-mysteriously-escaped-request-path-in-aspnet.html

kenotron commented Apr 6, 2015

This is a pretty serious flaw. Why don't we create a UrlPathDecode that is mostly what UrlDecode does in HttpUtility and use THAT inside the parts where you are decoding a path? So, the only thing you need is to modify the DefaultRouteResolver.cs to use that. For everyone else out there, you can get by with this blog's recommendations:

http://programmaticallyspeaking.com/the-mysteriously-escaped-request-path-in-aspnet.html

@simon-brooke

This comment has been minimized.

Show comment
Hide comment
@simon-brooke

simon-brooke Jul 15, 2015

I've hit this bug this morning, and, assuming that double-decoding was going on, tried both:

http://localhost:3579/Json/PointsMatching/%2B

and

http://localhost:3579/Json/PointsMatching/%252B

Both of these were received serverside as spaces, so it's not quite as simple as double-decoding, since double-decoding should decode

'%252B' to '%2B' and subsequently to '+'

I had planned heroically to contribute a patch, but having read through the comments above I see that it's substantially more subtle than I had supposed!

I've hit this bug this morning, and, assuming that double-decoding was going on, tried both:

http://localhost:3579/Json/PointsMatching/%2B

and

http://localhost:3579/Json/PointsMatching/%252B

Both of these were received serverside as spaces, so it's not quite as simple as double-decoding, since double-decoding should decode

'%252B' to '%2B' and subsequently to '+'

I had planned heroically to contribute a patch, but having read through the comments above I see that it's substantially more subtle than I had supposed!

@taharrison

This comment has been minimized.

Show comment
Hide comment
@taharrison

taharrison Jul 27, 2015

Hey guys.

I just tried to patch the behaviour of url-decoding in DefaultRouteResolver to get around a bug with the treatment of encoded /s. But I've closed my pull request, because it relied on the DefaultRouteResolver being responsible for URL Decoding, which from the discussion here it seems it shouldn't be.

I carefully documented the issues I saw though, so am reproducing the summary and test cases here so that they can help inform future work on this issue:

Currently the DefaultRouteResolver url-decodes capture groups before identifying where the capture groups begin and end (i.e. splitting on "/")

This leads to odd behavior around encoded "/"s.

Examples:

With routes

"foo/{bar}/baz"
"foo/{fuzz}"

a request to "foo/bar%2fbaz" will resolve to route "foo/{bar}/baz", despite the "/" having been encoded (and should therefore not be treated as a path separator).

a request to "foo/3-slashes-looks-like-"%2f%2f%2f" resolves to "foo/{fuzz*}" (but not "foo/{fuzz}") with a Fuzz parameter initialised to "3-slashes-looks-like-"/"", rather than the expected "3-slashes-looks-like-"///""

The fix is to delay url decoding until /after/ the path has been split on the "/" characters.

The following tests demonstrated the above behaviour as part of DefaultRouteResolverFixture

The first 4 demonstrate current incorrect behaviour around encoded "/"; the last 5 demonstrate current supported behaviour around url decoding /within DefaultRouteResolver/.

    [Theory]
    [InlineData("/bleh/test%2fbar", "GreedyOnEnd", "test/bar")]
    [InlineData("/foo/test%2ffuzz/plop", "Captured", "test/fuzz")]
    [InlineData("/bleh/test%2f%2ffuzz/bar", "GreedyInMiddle", "test//fuzz")]
    [InlineData("/foo/%21%23%24%26%27%28%29%2A%2B%2C%2F%3A%3B%3D%3F%40%5B%5D/plop", "Captured", "!#$&'()*+,/:;=?@[]")]
    public void Should_resolve_and_decode_capture_groups_with_encoded_forward_slashes(string path, string captureType, string expected)
    {
        //Given, When
        var browser = InitBrowser(true);
        var result = browser.Get(path);

        //Then
        result.Body.AsString().ShouldEqual(captureType + " " + expected);
    }

    [Theory]
     [InlineData("/bleh/this%2fis%2fsome%2fstuff/bar", "GreedyInMiddle", "this/is/some/stuff")]
     [InlineData("/bleh/this%2Fis%2Fsome%2Fstuff/bar", "GreedyInMiddle", "this/is/some/stuff")]
     [InlineData("/foo/double%252fencoded/plop", "Captured", "double%2fencoded")]
     [InlineData("/foo/%61%62%63%31%32%33/plop", "Captured", "abc123")]
     [InlineData("/foo/%21%23%24%26%27%28%29%2A%2B%2C%3A%3B%3D%3F%40%5B%5D/plop", "Captured", "!#$&'()*+,:;=?@[]")]
     public void Should_url_decode_captures(string path, string captureType, string expected)
     {
         //Given, When
         var browser = InitBrowser(true);
         var result = browser.Get(path);

         //Then
         result.Body.AsString().ShouldEqual(captureType + " " + expected);
     }

Hey guys.

I just tried to patch the behaviour of url-decoding in DefaultRouteResolver to get around a bug with the treatment of encoded /s. But I've closed my pull request, because it relied on the DefaultRouteResolver being responsible for URL Decoding, which from the discussion here it seems it shouldn't be.

I carefully documented the issues I saw though, so am reproducing the summary and test cases here so that they can help inform future work on this issue:

Currently the DefaultRouteResolver url-decodes capture groups before identifying where the capture groups begin and end (i.e. splitting on "/")

This leads to odd behavior around encoded "/"s.

Examples:

With routes

"foo/{bar}/baz"
"foo/{fuzz}"

a request to "foo/bar%2fbaz" will resolve to route "foo/{bar}/baz", despite the "/" having been encoded (and should therefore not be treated as a path separator).

a request to "foo/3-slashes-looks-like-"%2f%2f%2f" resolves to "foo/{fuzz*}" (but not "foo/{fuzz}") with a Fuzz parameter initialised to "3-slashes-looks-like-"/"", rather than the expected "3-slashes-looks-like-"///""

The fix is to delay url decoding until /after/ the path has been split on the "/" characters.

The following tests demonstrated the above behaviour as part of DefaultRouteResolverFixture

The first 4 demonstrate current incorrect behaviour around encoded "/"; the last 5 demonstrate current supported behaviour around url decoding /within DefaultRouteResolver/.

    [Theory]
    [InlineData("/bleh/test%2fbar", "GreedyOnEnd", "test/bar")]
    [InlineData("/foo/test%2ffuzz/plop", "Captured", "test/fuzz")]
    [InlineData("/bleh/test%2f%2ffuzz/bar", "GreedyInMiddle", "test//fuzz")]
    [InlineData("/foo/%21%23%24%26%27%28%29%2A%2B%2C%2F%3A%3B%3D%3F%40%5B%5D/plop", "Captured", "!#$&'()*+,/:;=?@[]")]
    public void Should_resolve_and_decode_capture_groups_with_encoded_forward_slashes(string path, string captureType, string expected)
    {
        //Given, When
        var browser = InitBrowser(true);
        var result = browser.Get(path);

        //Then
        result.Body.AsString().ShouldEqual(captureType + " " + expected);
    }

    [Theory]
     [InlineData("/bleh/this%2fis%2fsome%2fstuff/bar", "GreedyInMiddle", "this/is/some/stuff")]
     [InlineData("/bleh/this%2Fis%2Fsome%2Fstuff/bar", "GreedyInMiddle", "this/is/some/stuff")]
     [InlineData("/foo/double%252fencoded/plop", "Captured", "double%2fencoded")]
     [InlineData("/foo/%61%62%63%31%32%33/plop", "Captured", "abc123")]
     [InlineData("/foo/%21%23%24%26%27%28%29%2A%2B%2C%3A%3B%3D%3F%40%5B%5D/plop", "Captured", "!#$&'()*+,:;=?@[]")]
     public void Should_url_decode_captures(string path, string captureType, string expected)
     {
         //Given, When
         var browser = InitBrowser(true);
         var result = browser.Get(path);

         //Then
         result.Body.AsString().ShouldEqual(captureType + " " + expected);
     }

@damianh damianh closed this Sep 24, 2015

@thecodejunkie

This comment has been minimized.

Show comment
Hide comment
@thecodejunkie

thecodejunkie Sep 24, 2015

Member

This is unsolveable .. all framework suffer from this on all platforms

Member

thecodejunkie commented Sep 24, 2015

This is unsolveable .. all framework suffer from this on all platforms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment