Skip to content
This repository has been archived by the owner. It is now read-only.

Incoming URL is decoded twice (at least with Nancy.Hosting.Self) #1280

Closed
ChrisBrandhorst opened this issue Oct 20, 2013 · 36 comments
Closed

Incoming URL is decoded twice (at least with Nancy.Hosting.Self) #1280

ChrisBrandhorst opened this issue Oct 20, 2013 · 36 comments

Comments

@ChrisBrandhorst
Copy link
Contributor

@ChrisBrandhorst ChrisBrandhorst commented Oct 20, 2013

When using Nancy.Hosting.Self (at least in this case), an incoming URL is decoded using HttpUtility.UrlDecode twice.

First, in NancyHost.cs#ConvertRequestToNancyRequest. The resulting Request is then passed to the routeresolver, where in DefaultRouteResolver#ResolveResult, it is decoded again.

This is a problem when you have URLs like

/containers/123/artist/AC%252FDC

combined with the following route:

/containers/{id}/{filter*}

In this case, the filter part is a string I would like to split on the slash again (using the slash like this looks nicer in the UI). However, because of the double decoding, the resulting value for filter is AC/DC, where I would like to have AC%2FDC

@phillip-haydon
Copy link
Member

@phillip-haydon phillip-haydon commented Nov 14, 2013

This will be cause by a change I made a while ago.

@thecodejunkie if you remember we had that issue where routes were not being decoded before route checking was occuring. This happened when doing it on an asp.net host.

So we decoded it, but it looks like self hosting is also decoding as well.

The code needs to remove the decoding from the NancyHost and only do it at the RouteResolver which will happen for all hosts rather than specific hosts.

@jchannon
Copy link
Member

@jchannon jchannon commented Nov 14, 2013

Nope the opposite way around. AspNet hosting already decodes when the request comes in and then the execution path goes to DefaultRouteResolver and decodes again. I think removing that decode from DefaultRouteResolver is better and let all hosts be responsible for decoding.

AspNet example - https://github.com/NancyFx/Nancy/blob/master/src/Nancy.Hosting.Aspnet/NancyHttpRequestHandler.cs#L79

@thecodejunkie
Copy link
Member

@thecodejunkie thecodejunkie commented Nov 15, 2013

Hosts should decode, route resolve should not.

@jchannon
Copy link
Member

@jchannon jchannon commented Nov 15, 2013

High 5 me :)

@ChrisBrandhorst
Copy link
Contributor Author

@ChrisBrandhorst ChrisBrandhorst commented Nov 15, 2013

Allright, I can make a PR for the DefaultRouteResolver, but isn't this change possible breaking?
Mayby all hosts should be checked if they actually do the decoding?

@thecodejunkie
Copy link
Member

@thecodejunkie thecodejunkie commented Nov 15, 2013

Yes. A pull-request for this issue needs to take on the big picture

  • Update the DefaultRouteResolver
  • Verify the correct behaviour of the hosts
@provegard
Copy link
Contributor

@provegard provegard commented Nov 19, 2013

I experienced this bug as well, but with Nancy.Hosting.Aspnet. Pluses in URL paths are erroneously converted into spaces.

Also important: In .NET 4.5.1, Microsoft introduced a new app setting called "aspnet:UseLegacyRequestUrlGeneration". The default value is false, meaning that they change behavior so that the request URL passed to a HTTP handler now has escaped path segments. Prior to 4.5.1, or if the setting is set to true in Web.config, path segments passed to a HTTP handler are not escaped.

@provegard
Copy link
Contributor

@provegard provegard commented Dec 20, 2013

I just tried to upgrade from Nancy 0.20.0 to latest, and had tests failing because the query string is now URL encoded - it wasn't before. I don't want to add "manual" decoding because I think it's related to this issue. Any update?

@thecodejunkie
Copy link
Member

@thecodejunkie thecodejunkie commented Dec 20, 2013

umm, that doesn't sound right

@jchannon
Copy link
Member

@jchannon jchannon commented Dec 20, 2013

I believe the encoding was added but you shouldn't have to decode anything
manually

@thecodejunkie
Copy link
Member

@thecodejunkie thecodejunkie commented Dec 20, 2013

@provegard could you post your test so we can reproduce?

@provegard
Copy link
Contributor

@provegard provegard commented Dec 20, 2013

Oops, looking at my test I see that I actually encode the URL (using HttpUtility.UrlEncode) before passing it as query string...

But that means that Nancy.Testing 0.20.0 would decode the query string for me whereas 0.21 doesn't?

@jchannon
Copy link
Member

@jchannon jchannon commented Dec 20, 2013

You should be able to do

with.Query("name", "Fred") without the need for using http utility.

Either package should decode for you

On Friday, 20 December 2013, Per Rovegård wrote:

Oops, looking at my test I see that I actually encode the URL (using
HttpUtility.UrlEncode) before passing it as query string...

But that means that Nancy.Testing 0.20.0 would decode the query string for
me whereas 0.21 doesn't?


Reply to this email directly or view it on GitHubhttps://github.com//issues/1280#issuecomment-31001076
.

@provegard
Copy link
Contributor

@provegard provegard commented Dec 20, 2013

@jchannon If I omit my manual encoding in my test case and pass a URL with a + in it, Nancy/Nancy.Testing 0.20.0 will decode the + to a space. I guess that is what this issue is about.

Nancy 0.21 doesn't do the decoding. But this issue hasn't been closed, so was the decoding removed in relation to fixing some other issue?

@phillip-haydon
Copy link
Member

@phillip-haydon phillip-haydon commented Dec 22, 2013

I'm pretty sure the test code was changed so when you added Query parameters it would encode them to simulate an actual query value without the need for encoding them yourself. (I think this is an issue @jchannon fixed?)

@jchannon
Copy link
Member

@jchannon jchannon commented Dec 22, 2013

Yup thats what I did

@provegard
Copy link
Contributor

@provegard provegard commented Jan 2, 2014

I think the test code change makes sense, but I also feel that it should be listed in Breaking changes. When upgrading Nancy, I always check Breaking changes. If there are none I really don't expect my test suite to start failing. Don't you agree?

@grumpydev
Copy link
Member

@grumpydev grumpydev commented Jan 2, 2014

Marked it as a breaking change - we will need to update the wiki for 0.22 as there's several in there now that havne't been added.

@ChrisBrandhorst
Copy link
Contributor Author

@ChrisBrandhorst ChrisBrandhorst commented Jan 2, 2014

What would be a convenient process for checking this? The change in DefaultRouteResolver is easy, but I personally can't run the tests (running on VS Express), and don't have the ability to try the other hosts...

@jchannon
Copy link
Member

@jchannon jchannon commented Jan 2, 2014

I think the plan is to remove the decoding from DefaultRouteResolver and add the decoding to the hosts where needed (ASP.NET does it automatically) and have some tests to prove the hosts are doing it.

@provegard
Copy link
Contributor

@provegard provegard commented Jan 2, 2014

Please also see my note about aspnet:UseLegacyRequestUrlGeneration above - from .NET 4.5.1, ASP.NET will encode URL paths, but in earlier versions it didn't.

@jchannon
Copy link
Member

@jchannon jchannon commented Jan 13, 2014

Anyone started on this yet? Just come across some code in my projet that I don't like 😄

var email = (string)parameters.email;
email = email.Replace(" ", "+");
@thecodejunkie
Copy link
Member

@thecodejunkie thecodejunkie commented Jan 13, 2014

Nope =( x-mas leave, project deadline and being sick had built up a healthy backlog of stuff =(

@albertjan
Copy link
Contributor

@albertjan albertjan commented Jan 14, 2014

Should it even be decoded once?

@phillip-haydon
Copy link
Member

@phillip-haydon phillip-haydon commented Jan 27, 2014

It should be decoded once, because the browser encodes the request.

The problem is deciding where it should be decoded, if host A decodes, and host B does not, then it means that it should be done at the host.

But if an OWIN web server does not decode, and IIS does decode, then we get differences when hosting OWIN in a console app vs IIS...

If we do it at host level then we get duplication everywhere...

What to doooooooo... I think this needs to be deferred from 0.22.0 until we can investigate further and figure out a proper solution.

@grumpydev
Copy link
Member

@grumpydev grumpydev commented Jan 27, 2014

Yeah, think we need to put this back.. there's lots of things that can potentially be encoded, and as there's no way we can detect if they're already encoded, we need to make sure there's a set rule that covers everything - either the engine dictates it will decode once so the hosts have to encode if necessary, or if it dictates it won't decode anything, we need to come to an agreement.

@ChrisBrandhorst
Copy link
Contributor Author

@ChrisBrandhorst ChrisBrandhorst commented Jan 27, 2014

Perhaps, since there are so many variables, a quick fix would be to be able to configure the encoding behavior of the DefaultRouteResolver in the bootstrapper?

@RPM1984
Copy link

@RPM1984 RPM1984 commented Feb 13, 2014

Yep, +1 for this issue. Trying with the URL:
/users/a%2Bb%40c.com (users/a+b@c.com)

and when it gets into my module, it's "a b@c.com".

@liddellj
Copy link
Contributor

@liddellj liddellj commented May 13, 2014

I've encountering this as well, hosting under ASP.NET, see #1499 for details.

@thecodejunkie thecodejunkie modified the milestones: 0.24, 0.23 Jun 4, 2014
@grumpydev grumpydev modified the milestones: Future, 0.24 Jun 10, 2014
@ollnixon
Copy link

@ollnixon ollnixon commented Jul 28, 2014

Is there any news on this issue?

@rasmus-beck
Copy link
Contributor

@rasmus-beck rasmus-beck commented Sep 7, 2014

Hi guys,

Just looking at what it would take to solve this issue. I have just run into a similar problem #1676.

As far as I can tell, this isn't just about where the decoding should be done. It is pretty obvious that if some hosts do this be default, then hosts should handle this problem.

But how the result is carried forth in the code, is also a problem. Nancy.Url isn't capable of handling a decoded path as I see it. If the encoded path contains %2f which is a /, then a decoded path in Nancy.Url will not be able to tell the difference between a separator and a segment that happens to contain a forward slash.

So perhaps the solution is to always force the Path property on Nancy.Url to be encoded, and then let Nancy.Url handle decoding by exposing a new property PathSegments, which is a string array of decoded path segments.

This also has the added benefit of being backwards compatible as I see it.

The route resolver should still be changed to not do decoding.

/rasmus

@rasmus-beck
Copy link
Contributor

@rasmus-beck rasmus-beck commented Sep 7, 2014

I am thinking something along the lines of this commit: rasmus-beck@6bdc2ca

I still need to create a number of tests to cover all hosts, and make sure I have actually looked at all hosts.

That does introduce two breaking changes on IRouteResolverTrie, but we can't really get around this as I see it.

/rasmus

@thecodejunkie thecodejunkie modified the milestones: 1.0 Alpha, Future Sep 8, 2014
@khellang khellang modified the milestones: 1.0 Beta, 1.0 Alpha Jan 27, 2015
@grumpydev grumpydev removed this from the 1.0 Beta milestone Feb 4, 2015
@ChrisBrandhorst ChrisBrandhorst removed this from the 1.0 Beta milestone Feb 4, 2015
@kenotron
Copy link

@kenotron kenotron commented Apr 6, 2015

This is a pretty serious flaw. Why don't we create a UrlPathDecode that is mostly what UrlDecode does in HttpUtility and use THAT inside the parts where you are decoding a path? So, the only thing you need is to modify the DefaultRouteResolver.cs to use that. For everyone else out there, you can get by with this blog's recommendations:

http://programmaticallyspeaking.com/the-mysteriously-escaped-request-path-in-aspnet.html

@simon-brooke
Copy link

@simon-brooke simon-brooke commented Jul 15, 2015

I've hit this bug this morning, and, assuming that double-decoding was going on, tried both:

http://localhost:3579/Json/PointsMatching/%2B

and

http://localhost:3579/Json/PointsMatching/%252B

Both of these were received serverside as spaces, so it's not quite as simple as double-decoding, since double-decoding should decode

'%252B' to '%2B' and subsequently to '+'

I had planned heroically to contribute a patch, but having read through the comments above I see that it's substantially more subtle than I had supposed!

@taharrison
Copy link

@taharrison taharrison commented Jul 27, 2015

Hey guys.

I just tried to patch the behaviour of url-decoding in DefaultRouteResolver to get around a bug with the treatment of encoded /s. But I've closed my pull request, because it relied on the DefaultRouteResolver being responsible for URL Decoding, which from the discussion here it seems it shouldn't be.

I carefully documented the issues I saw though, so am reproducing the summary and test cases here so that they can help inform future work on this issue:

Currently the DefaultRouteResolver url-decodes capture groups before identifying where the capture groups begin and end (i.e. splitting on "/")

This leads to odd behavior around encoded "/"s.

Examples:

With routes

"foo/{bar}/baz"
"foo/{fuzz}"

a request to "foo/bar%2fbaz" will resolve to route "foo/{bar}/baz", despite the "/" having been encoded (and should therefore not be treated as a path separator).

a request to "foo/3-slashes-looks-like-"%2f%2f%2f" resolves to "foo/{fuzz*}" (but not "foo/{fuzz}") with a Fuzz parameter initialised to "3-slashes-looks-like-"/"", rather than the expected "3-slashes-looks-like-"///""

The fix is to delay url decoding until /after/ the path has been split on the "/" characters.

The following tests demonstrated the above behaviour as part of DefaultRouteResolverFixture

The first 4 demonstrate current incorrect behaviour around encoded "/"; the last 5 demonstrate current supported behaviour around url decoding /within DefaultRouteResolver/.

    [Theory]
    [InlineData("/bleh/test%2fbar", "GreedyOnEnd", "test/bar")]
    [InlineData("/foo/test%2ffuzz/plop", "Captured", "test/fuzz")]
    [InlineData("/bleh/test%2f%2ffuzz/bar", "GreedyInMiddle", "test//fuzz")]
    [InlineData("/foo/%21%23%24%26%27%28%29%2A%2B%2C%2F%3A%3B%3D%3F%40%5B%5D/plop", "Captured", "!#$&'()*+,/:;=?@[]")]
    public void Should_resolve_and_decode_capture_groups_with_encoded_forward_slashes(string path, string captureType, string expected)
    {
        //Given, When
        var browser = InitBrowser(true);
        var result = browser.Get(path);

        //Then
        result.Body.AsString().ShouldEqual(captureType + " " + expected);
    }

    [Theory]
     [InlineData("/bleh/this%2fis%2fsome%2fstuff/bar", "GreedyInMiddle", "this/is/some/stuff")]
     [InlineData("/bleh/this%2Fis%2Fsome%2Fstuff/bar", "GreedyInMiddle", "this/is/some/stuff")]
     [InlineData("/foo/double%252fencoded/plop", "Captured", "double%2fencoded")]
     [InlineData("/foo/%61%62%63%31%32%33/plop", "Captured", "abc123")]
     [InlineData("/foo/%21%23%24%26%27%28%29%2A%2B%2C%3A%3B%3D%3F%40%5B%5D/plop", "Captured", "!#$&'()*+,:;=?@[]")]
     public void Should_url_decode_captures(string path, string captureType, string expected)
     {
         //Given, When
         var browser = InitBrowser(true);
         var result = browser.Get(path);

         //Then
         result.Body.AsString().ShouldEqual(captureType + " " + expected);
     }
@damianh damianh closed this Sep 24, 2015
@thecodejunkie
Copy link
Member

@thecodejunkie thecodejunkie commented Sep 24, 2015

This is unsolveable .. all framework suffer from this on all platforms

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.