Params part of the routable URL are not escaped, specifically a slash #14636

Closed
maletor opened this Issue Apr 7, 2014 · 15 comments

Projects

None yet

5 participants

@maletor
Contributor
maletor commented Apr 7, 2014

In reference to #14070, a slash found in the params of a route such as

get 'foo/:token where :token contains a "/" would return a 404 because :token does not get URL escaped.

:token should probably be escaped just as it would a regular param (after the ?).

@maletor maletor changed the title from params part of url are not escaped, specifically a slash to Params part of the routable URL are not escaped, specifically a slash Apr 7, 2014
@heironimus
Contributor

@maletor, your comment below from the linked issue seems to describe the issue a little better.

When the param is expected as part of the URL, like this

get 'foo/:param' => 'foo#bar', as: :foo

link_to foo_url(param: 'slash/slash', other_param: 'slash/slash')

The param param is not escaped while the other_param param is escaped.

@pixeltrix pixeltrix added a commit that referenced this issue Apr 20, 2014
@nanaya @pixeltrix nanaya + pixeltrix Make URL escaping more consistent
1. Escape '%' characters in URLs - only unescaped data
   should be passed to URL helpers

2. Add an `escape_segment` helper to `Router::Utils`
   that escapes '/' characters

3. Use `escape_segment` rather than `escape_fragment`
   in optimized URL generation

4. Use `escape_segment` rather than `escape_path`
   in URL generation

For point 4 there are two exceptions. Firstly, when a route uses wildcard
segments (e.g. *foo) then we use `escape_path` as the value may contain '/'
characters. This means that wildcard routes can't be optimized. Secondly,
if a `:controller` segment is used in the path then this uses `escape_path`
as the controller may be namespaced.

Fixes #14629, #14636 and #14070.

Cherry picked from:
  e2ef83f
  a617925
  5460591

Conflicts:
	actionpack/CHANGELOG.md
8a06764
@pixeltrix pixeltrix added a commit that referenced this issue Apr 20, 2014
@pixeltrix pixeltrix Make URL escaping more consistent
1. Escape '%' characters in URLs - only unescaped data
   should be passed to URL helpers

2. Add an `escape_segment` helper to `Router::Utils`
   that escapes '/' characters

3. Use `escape_segment` rather than `escape_fragment`
   in optimized URL generation

4. Use `escape_segment` rather than `escape_path`
   in URL generation

For point 4 there are two exceptions. Firstly, when a route uses wildcard
segments (e.g. *foo) then we use `escape_path` as the value may contain '/'
characters. This means that wildcard routes can't be optimized. Secondly,
if a `:controller` segment is used in the path then this uses `escape_path`
as the controller may be namespaced.

Fixes #14629, #14636 and #14070.
5460591
@pixeltrix
Member

Fixed by 5460591

@pixeltrix pixeltrix closed this Apr 20, 2014
@GSI
GSI commented Aug 19, 2014

This concept causes "unexpected"/"inconsistent" behaviour when to_param is overwritten.

def to_param
  "foo/#{bar}"
end

test "to_param: forward slashes are unencoded" do
   m = FactoryGirl.create(:the_model)    
   assert_equal m.to_param, "foo/#{m.bar}"
end

This test obviously succeeds. On the other hand the _path and _url helpers will generate "foo%2F#{m.bar}".

Unless there's an even better way to handle this use case, I'd consider helper results should equal the strings as generated by (an overwritten) to_param.

@pixeltrix
Member

@GSI the job of to_param is to convert an object to a representation suitable for use in a path or query string, not to escape it - that's the job of the url generation code because there are different escaping semantics depending on where in the url that param goes.

If there's a need to use / in your path then the solution is to use a glob path segment (i.e. /*path) which tells the router not to escape them - is there a reason this doesn't/can't work for you?

@GSI
GSI commented Sep 10, 2014

@pixeltrix I've re-checked the docs, but I'm still unsure if I understand your suggestion correctly.

For example, I use get :location/:year/.../:id with contraints on :year - How would someone constrain that if year was missing in the actual route statement? get /*path/:id does neither "know" about year's position in the path nor it's regex constraints.

Also, HTML anchors would point to /123 (the non-globbed part plus the id). Correct? If yes, we'd generate additional, unnecessary requests (GET /123 => some sort of "moved" response => GET /the/actual/path/123), which I avoided by overwriting to_param.

Finally, this breaks assert_redirected_to. That could maybe be fixed in that method itself though, allowing for consistent results across dev and prod.

... or am I missing something?

@pixeltrix
Member

@GSI I'm not sure what the problem is but you can place constraints on a glob param, e.g:

Rails.application.routes.draw do
  get '/:location/*date/*id', to: 'posts#show', date: %r[\d{4}/\d{2}/\d{2}], as: 'post'
end
>> app.get('/london/2014/09/10/123/foo')
=> 200
>> app.request.params[:date]
=> "2014/09/10"
>> app.request.params[:id]
=> "123/foo"
>> app.post_path('london', '2014/09/10', '123/foo')
=> "/london/2014/09/10/123/foo"
@GSI
GSI commented Sep 11, 2014

(I was unaware that contraints were possible in the globbing context. Thanks for the hint.)

Regarding URL generation, I'll provide an example:

<%= link_to 'foo', @bar_object %> generates sth like <a href="bars/1">foo</a>.

Now the idea of overwriting to_param is to change the resulting path (href):

def to_param
    "#{location}/#{year}/#{id}"
end

With this, <%= link_to 'foo', @bar_object %> generates sth like <a href="somewhere/2014/1">foo</a>.

Even without that, and I think that is what you mean, bars_path() will generate the correct URL as long as the exact params are given.

So, bars_path({location: :somewhere, year: 2014, id: 1}) would still render as desired, while bars_path(@bar_object) seemingly falls back to the bars/1 variant.

Is there another way of achieving the desired outcome when slashes are technically "denied"?

(Changing all occurances of bars_path(@bar_object) to sth like bars_path(@bar_object.filtered_attributes_for_url) would appear "dirty" to me.
<%= link_to 'foo', @bar_object %> would also fail and thus have to be converted to the longer variant.)

@GSI
GSI commented Feb 6, 2015

@pixeltrix, I see where our examples differ and what causes the problem. You are using actual database field names in the route (e.g. date).

I am using several things like shortened_id which has no corresponding database field, but is derived dynamically from the id field.

Now that the usage of to_param is no option anymore, is there a solution that somehow allows bar_path() to "know" how to find shortened_id given a certain object (e.g. bar_path(bar_object)).

(Apparently this is not based on the model's attributes.)

@pixeltrix
Member

@GSI the url helpers don't care about model attributes. The fact that I used date is irrelevant - it could be whatever you wanted it to be. The only thing that matters is :foo vs. *foo - the latter doesn't trigger escaping of slashes. So you can use to_param - just make sure it's a glob param that's being replaced.

@GSI
GSI commented Feb 7, 2015

Thanks your explanation and patience, @pixeltrix.

I'll try to clarify: Given a bar_object (instance of the Bar model), once I call bar_path(bar_object), model attributes are relevant. Why? Because bar_object is missing an attribute named shortened_id. Thus there will be no param[:shortened_id] be passed to the controller.

Now, your suggestion of course does work if I define a route like this:
get '/*any_and_all_of_the_params_i_need', to: 'bars#show', as: 'bar'

But this leaves me with params[:any_and_all_of_the_params_i_need] containing the full string (e.g. location/year/shortened_id). As you say, it's unescaped, but is it really the intention to have to split the string up manually (location, year, shortened_id = params[:any_and_all_of_the_params_i_need].split('/')) or is there a better solution?

@pixeltrix
Member

@GSI the problem you're describing would be the same whether '/' was escaped or not - to_param returns a single value and you want link_to, etc. to be able to substitute more than one dynamic segment so that you have the corresponding params inside the controller. Unfortunate there isn't anyway to do this - you'll either have to split the one big glob param or override the url helper to extract more than one parameter for generating the route.

@GSI
GSI commented Feb 7, 2015

@pixeltrix

If '/' is unescaped, I get this link to click: http://0.0.0.0:3000/here/2014/1
If '/' is escaped, I get this link to click: http://0.0.0.0:3000/here%2F2014%2F1

Upon clicking it, the first is matched by a route get ':location/:year/:shortened_id', to: 'bar#show', while the second is unmatched (RoutingError).


A work around would be to define one route for url generation and one for matching:

get ':location/:year/:shortened_id', to: 'bars#show'
get '/*any_and_all_of_the_params_i_need', to: 'bars#show', as: 'just_for_url_generation'

... and then change all occurrences of link_to 'foo', @bar to link_to 'foo', just_for_url_generation_path(@bar).


Instead of a workaround I'd rather prefer some way to tell the escaping logic that I take responsibility for the contents of the to_param return value and that it's left the way I define it.

@pixeltrix
Member

@GSI when you call link_to 'foo', @bar it calls polymorphic_url(@bar) to construct the url. This uses the model name to get the actual url helper name. Assuming that bar_url exists it then calls to_param on @bar when constructing the url - which returns a single value. Therefore if your only route is the one with three required path parameters it will fail to generate because it's only got the one from @bar - whether it's escaped or not is irrelevant.

@GSI
GSI commented Feb 13, 2015

@pixeltrix Here is a reproduction of the error which was introduced by the slash escaping: https://github.com/GSI/rails-issue-14636

Please see the (very short) README for instructions.

@GSI
GSI commented Feb 18, 2015

Further investigation:

  • resources :posts generates this show route: get '/posts/:id', to: 'posts#show', as: :post
  • resources :posts, :except => :show expectedly skips generation of the show route, but also changes the PATCH route: patch '/posts/:id', to: 'posts#update' becomes patch '/posts/:id', to: 'posts#update', as: :post

This is important, as it explains why URL generation still works, even though, as @pixeltrix states, to_param is and always was returning a single value:

Without the default show-route, post_path refers to the PATCH-route's pattern and thus the URL is now generated using the PATCH-route's pattern (/posts/:id).

Explicitly changing the PATCH route to /posts/*id restores the behavior from before the escaping was introduced, causing the same effect as what I (erroneously) described as a "work around" in a prior comment (#14636 (comment)).

Considering this new perspective, the two-route-approach appears much less hacky and poses an actual option on dealing with the situation.

Thanks again for your comments, Andrew.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment