FromRoute parameters aren't URL Decoded. #4599
Comments
@Zoltu there was a bug in this in the past, but it was fixed in the RC1 release. I just tried it in the latest RC2 nightly builds and I don't see the bug for query strings or route data. Can you clarify which build of ASP.NET Core you are using? |
RC1 final. I'll try with RC2 this evening and post here whether it resolves the issue or not. |
I am not able to reproduce this behavior on RC1 final. In both cases I see the model bound value as decoded. |
I just tested this again in a brand new ASPNET Core application created in Visual Studio 2015 using the Web template (https://docs.asp.net/en/latest/tutorials/your-first-aspnet-application.html). I added Swashbuckle (via NuGet) for easy testing and made both of the methods simply return The working method, where the URL encoded string is in a query parameter: The not working method, where the URL encoded string is in a route parameter: You can see that Swashbuckle encoded the URL the same in both requests, just one in the query string and one in the route. However, the results from the two calls were different, despite the actions being the same except for the |
@Zoltu thanks for the extra info, I'll have someone try this out and see what's going on here. Very odd. |
@ryanbrandenburg, a gift for you. |
Hello @Zoltu! I did some investigation and found some interesting behaviors here. The first thing I found was that in your example the route response was partially URL decoded. The only character which wasn't decoded was %2f, which is
you would hit aspnet/KestrelHttpServer#124 appears to mention this behavior and it sounds like it's intentional (or at least known). @Tratcher and @troydai can hopefully confirm my reading of that. If my reading was correct then I think the guidance would be not to use [FromRoute] on any values which are likely to contain |
Good catch on it only not decoding the I did discover the difference when IIS hosting (http://stackoverflow.com/questions/37178949/how-do-i-allow-url-encoded-path-segments-in-azure). I am of the opinion that what IIS is doing is _incorrect behavior_ because it is negating the entire point of URL encoding. The point of URL encoding is to allow an HTTP request to pass data via the URL (route, query string, etc.) through to the application without it being interpreted by the routing engine. In this case, IIS is decoding my encoded value and then using the decoded value as part of its routing logic. I would encourage Kestrel to _not_ follow IIS's lead on this one and instead have the web server behave to spec.
References:
Uniform Resource Identifier (URI): Generic Syntax § 2.4. When to Encode or Decode
Uniform Resource Locators § 2.2. URL Character Encoding Issues (emphasis mine)
It sounds like this is an issue with Kestrel and not with MVC. Should I move this issue over there? |
Yeah, I think IIS/http.sys do this "extra" (i.e. wrong) decoding for "security purposes". Unfortunately, I think that once it does its decoding there's no way to get that raw data back, because you don't know if what you see as a |
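To make the point about lost information concrete, here is a minimal sketch (not code from Kestrel, IIS, or MVC) comparing two orderings: decoding the whole path before splitting it into segments, and splitting on the raw `/` first and decoding each segment afterwards. The class and method names are hypothetical, and only `Uri.UnescapeDataString` from the base class library is used.

```csharp
using System;

public static class SlashAmbiguityDemo
{
    // Decode first, then split: an encoded slash (%2F) turns into a real separator.
    public static string[] DecodeThenSplit(string rawPath)
    {
        string decoded = Uri.UnescapeDataString(rawPath);
        return decoded.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Split on the raw separators first, then decode each segment on its own.
    public static string[] SplitThenDecode(string rawPath)
    {
        string[] segments = rawPath.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        return Array.ConvertAll(segments, s => Uri.UnescapeDataString(s));
    }

    public static void Main()
    {
        // Two different requests on the wire look identical after decode-then-split.
        Console.WriteLine(string.Join(" | ", DecodeThenSplit("/api/values/a%2Fb")));
        Console.WriteLine(string.Join(" | ", DecodeThenSplit("/api/values/a/b")));

        // Splitting first keeps "a/b" together as a single route value.
        Console.WriteLine(string.Join(" | ", SplitThenDecode("/api/values/a%2Fb")));
    }
}
```

With decode-then-split, both requests produce four segments, so nothing downstream can tell which one the client sent. With split-then-decode, the first request yields three segments and the last one is `a/b`.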
@Tratcher @blowdart need your view on this. I think this is technically by design, given how IIS/http.sys are designed to work. I think that raw Kestrel has the correct RFC behavior. And I'll perhaps regret asking this question, but should we modify Kestrel to have the same behavior as IIS/http.sys and super-decode |
Talked to @Tratcher and this is presently by design. The IIS/http.sys feature was originally for security reasons when URLs typically mapped to physical files, and at this point for compat reasons that can't change. Kestrel, on the other hand, is a brand new server, and so does not have this legacy behavior. |
I like how you closed this before I could give you an answer you'll regret from me :p |
@Eilon Now that we have established that Kestrel should not be URL decoding per the RFC, we are back to the original issue of MVC not decoding the Unless I am misunderstanding which part is by design, I think this issue should be reopened since the original problem is still outstanding. I suspect we just got side tracked on the IIS thing and which part of the pipeline should decode. |
@Zoltu you are right. And when you're right, you're right. And you, you're always right! 😄 Yes, it seems that "something" (not sure what "something" is) should be URL decoding this before the app sees it... |
%2F is the only one that changes the definition of the path (excluding |
If we're talking about the path, we're talking about routing. Nothing else in MVC looks at the path. What do you think should happen for a route + path like the following?
Should What should link generation output if you put What about a case like this?
Should What should link generation output if you put |
I got a chance to sit down at a computer and throw a quick test together for the behavior that I was afraid of. My contention here is that per my interpretation of the RFC, Kestrel should not be doing any* (see last paragraph) URL decoding and MVC should not be doing URL decoding until model binding time (after routing). Based on what I have seen when looking briefly at the Kestrel code, it appears that it is doing URL decoding which leads to the incorrect MVC action being called, or in some cases the correct MVC action being called but with an incompletely decoded model. My original bug report only touched on the surface of this problem because it happened to be what I ran into but I believe the problem is much deeper and a fundamental flaw in the way URL Decoding is handled in Kestrel + MVC. Given the following code and ASP.NET RC2 (including Kestrel RC2):

    using System;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Mvc;

    namespace TestWeb.Controllers
    {
        [Route("api/[controller]")]
        public class ValuesController : Controller
        {
            [Route("foo&bar")]
            public async Task<String> Foo()
            {
                return "bar";
            }

            [Route("foo%26bar")]
            public async Task<String> Apple()
            {
                return "apple";
            }

            [Route("{zip}")]
            public async Task<String> Zip([FromRoute] String zip)
            {
                return $"zip: {zip}";
            }
        }
    }

Now this is a bit of a can of worms but the RFC special cases "uppercase and lowercase letters, decimal digits, hyphen, period, underscore, and tilde." and specifically says that they should be normalized to and handled in their unencoded form. This means that given the routes |
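The normalization rule described here (percent-encoded unreserved characters are equivalent to their literal form, everything else is left alone) can be sketched as follows. This is an illustration of the RFC 3986 rule only, not how Kestrel or MVC routing is implemented; `UnreservedNormalizer` and its members are hypothetical names.

```csharp
using System;
using System.Text;

public static class UnreservedNormalizer
{
    // RFC 3986 unreserved set: letters, digits, hyphen, period, underscore, tilde.
    private static bool IsUnreserved(char c) =>
        (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
        c == '-' || c == '.' || c == '_' || c == '~';

    // Decodes %XX only when it encodes an unreserved character; other escapes stay as sent.
    public static string Normalize(string raw)
    {
        var result = new StringBuilder(raw.Length);
        for (int i = 0; i < raw.Length; i++)
        {
            bool isEscape = raw[i] == '%' && i + 2 < raw.Length
                && Uri.IsHexDigit(raw[i + 1]) && Uri.IsHexDigit(raw[i + 2]);
            if (isEscape)
            {
                char decoded = (char)Convert.ToInt32(raw.Substring(i + 1, 2), 16);
                if (IsUnreserved(decoded))
                {
                    result.Append(decoded);
                    i += 2; // skip the two hex digits that were just consumed
                    continue;
                }
            }
            result.Append(raw[i]);
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("foo%2Dbar")); // the hyphen is unreserved, so it is decoded
        Console.WriteLine(Normalize("foo%26bar")); // the ampersand is reserved, so it stays encoded
    }
}
```

Under this rule `foo%2Dbar` and `foo-bar` compare equal after normalization, while `foo%26bar` and `foo&bar` remain distinct, which is the distinction the two `Route` attributes in the sample controller rely on.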
How many of these scenarios have you tested against Asp.Net 4 / MVC 5? |
Note when running behind IIS we'll be subject to most of the same behaviors as Asp.Net 4 because they're implemented in Http.Sys. |
Yes,
Author should be I'm not familiar with MVC link generation so I probably shouldn't comment on it but I will anyway. :D In an ideal world (without humans), the link generation would generate links that were as guaranteed as possible to get back to the original route. This means URL Encoding everything except for the specially-handled characters listed in my previous comment (letters, digits, etc.). I suspect though that humans won't like that much, despite it giving the most desirable behavior in terms of guaranteeing proper routing in the face of all current and future possible routes. |
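As a rough illustration of the "encode everything except the specially-handled characters" idea, the base class library's `Uri.EscapeDataString` already behaves this way on .NET Core: it leaves letters, digits, `-`, `.`, `_` and `~` alone and percent-encodes the rest. The sketch below is not MVC's link generation; `LinkSegmentDemo` and `EncodeSegment` are hypothetical names used only for this example.

```csharp
using System;

public static class LinkSegmentDemo
{
    // Hypothetical helper: encode one route value so it survives as a single path segment.
    public static string EncodeSegment(string value) => Uri.EscapeDataString(value);

    public static void Main()
    {
        Console.WriteLine(EncodeSegment("foo&bar"));  // foo%26bar
        Console.WriteLine(EncodeSegment("a/b"));      // a%2Fb
        Console.WriteLine(EncodeSegment("zip-1_2.3~")); // unreserved characters pass through unchanged

        // Decoding the generated segment gives back the original value.
        Console.WriteLine(Uri.UnescapeDataString(EncodeSegment("a/b")));
    }
}
```

Whether humans would accept links that look like this is a separate question, as the comment says.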
None. I see Kestrel and ASP.NET Core as an opportunity to make breaking changes with IIS and MVC 5 that bring .NET in line with various RFCs and web standards. If maintaining consistent behavior (despite it being counter to the RFC) is desired then most of my arguments here fall apart. :) Microsoft has an unfortunate negative image in the developer community with regards to not following web standards in the past (IE 6, IIS, etc.). I would love to see Kestrel / ASP.NET come into line with existing RFCs where possible to help shed this image and encourage adoption, which is why I am pushing a bit on this (my specific use case can easily be worked around in a number of ways). I do understand that IIS will continue to behave the way it always has, which includes its URL Rewrite stuff. My long term plan is to stop using IIS if possible (I'm hoping that Kestrel can achieve similar performance to IIS now or in the future). |
A quick observation - if we decide to take on this change, we'd need to make this change to about any middleware that deals with

    app.Map("/a&b", app =>
    {
        app.UseMvc();
    });
    app.Map("/a%26b", app =>
    {
        app.UseStaticFiles(new StaticFileOptions
        {
            RequestPath = "/c%2fd"
        });
    });

Each of the middlewares in this sample, Map, StaticFiles, and Routing needs to now work with unencoded urls and know how to consume and produce url segments that are unencoded. |
@pranavkm the three fields in your samples are explicitly PathStrings, which by definition uses the decoded representation. That's too fundamental to ever change. |
@Tratcher, where does that leave us? If we specifically have to make this change in Routing, we'd have to know how much of the unencoded path has already been consumed given the current RequestPath and route the rest. That sounds pretty complicated. |
Going back to @rynowak's comment from #4599 (comment), it doesn't look like we could round trip the value correctly - there isn't a mechanism in place to tell what the token's value was over the wire. Given the other issue with PathStrings expressly disallowing encoded values, there's not a whole lot left to do in this work item. I'll go ahead and close this for now. |
Just read the entire thread and I am pretty disappointed that this hasn't been fixed at all. Currently an ASP.NET Core application will behave entirely differently depending on where it is being run: Kestrel: Decodes partially (WTF!???) Since we all agree that we don't want Kestrel to become as shit as IIS and therefore not match its shitty behaviour I wonder why did Kestrel go along with this weird partial decoding strategy? What is this about? Why can Kestrel not just be cool and NOT decode route arguments, so that MVC/Nancy/{your-web-framework-of-choice} can handle it consistently? Currently it's an absolute bloody mess. A web framework should be able to decide whether the route argument should be left entirely decoded or encoded, but please not some half baked crap which makes it impossible to EVER know what the user has really sent across the wire. Please can this issue be looked at again with a fresh pair of eyes? You had 3 choices: why did you go with c.)?? No seriously, why? /rantover |
One more thing which I would like to add, the current implementation makes web applications do funny logic, which is really not obvious to anyone who didn't study this thread. For example, a developer might implement some logic which will find and replace all It is such an obscure oddity that code has to be written specifically for Kestrel. I would suggest at least to have a method like |
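The kind of application-side fix-up being described might look like the sketch below, assuming (as reported earlier in this thread) that `%2F` is the only escape still present when the route value reaches the application. `SlashFixup` and `ReplaceEncodedSlashes` are hypothetical names, not an existing ASP.NET Core API.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlashFixup
{
    // Replaces any leftover %2F / %2f with a literal slash.
    // Assumption: every other escape has already been decoded by the server.
    public static string ReplaceEncodedSlashes(string routeValue) =>
        Regex.Replace(routeValue, "%2F", "/", RegexOptions.IgnoreCase);

    public static void Main()
    {
        Console.WriteLine(ReplaceEncodedSlashes("a%2Fb")); // a/b
        Console.WriteLine(ReplaceEncodedSlashes("a%2fb")); // a/b
    }
}
```

The weakness of this approach is the one the comment complains about: if a client sent a literal `%2F` as data (encoded on the wire as `%252F`) and the server has already decoded `%25` to `%`, the application sees `%2F` and cannot tell it apart from an encoded slash, so this helper would decode it one time too many.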
Comments on closed issues are not tracked, please open a new issue with the details for your scenario. E.g. which specific encoded characters are you having trouble with? |
In the first example, a URL encoded string passed as the value query string parameter will be decoded before being passed into the GetValue function.
In the second example, a URL encoded string passed as a route segment will NOT be URL decoded before being passed to the GetValue function.
This behavior is unexpected, I would expect both Route and Query parameters to be URL decoded before being passed to the function.
If this is expected behavior, is there a workaround or a way to tell the route to decode before passing the parameter through? The problem is made more complex in my real world example because the route parameter is a URL and I want the parameter to be a Uri rather than a String. My current solution is to accept a String parameter and then construct a new Uri(value) out of it. However, this means I don't benefit from ModelState validation, since only a string is necessary for the model state to be valid.
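A minimal sketch of the workaround described above, written so that a malformed value can be rejected instead of throwing from the `Uri` constructor. It assumes the route value arrives still fully percent-encoded. If the framework has already decoded it, the `Uri.UnescapeDataString` call would decode a second time and should be dropped. `RouteUriParser` and `TryParse` are hypothetical names.

```csharp
using System;

public static class RouteUriParser
{
    // Decodes the raw route value and tries to parse it as an absolute URI.
    // Returns false rather than throwing when the value is not a valid absolute URI.
    public static bool TryParse(string routeValue, out Uri uri)
    {
        string decoded = Uri.UnescapeDataString(routeValue);
        return Uri.TryCreate(decoded, UriKind.Absolute, out uri);
    }

    public static void Main()
    {
        Uri parsed;
        if (TryParse("https%3A%2F%2Fexample.com%2Fsome%2Fpath", out parsed))
        {
            Console.WriteLine(parsed.Host);         // example.com
            Console.WriteLine(parsed.AbsolutePath); // /some/path
        }

        Console.WriteLine(TryParse("not a url", out parsed)); // False
    }
}
```

Inside a controller action, a `false` result could be reported with `ModelState.AddModelError` so that the failure shows up alongside the other validation errors, which recovers part of what binding directly to a Uri parameter would have given.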