Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug when receiving a relative redirect location #1827

Merged
merged 2 commits into from
Oct 10, 2014
Merged

Fix a bug when receiving a relative redirect location #1827

merged 2 commits into from
Oct 10, 2014

Conversation

svanharmelen
Copy link
Contributor

Currently when you use a provider that somewhere along the line uses
Chef::HTTP.request or Chef::HTTP.streaming_request, you will receive an
error when the response is a redirect with a relative location.

@chef-supermarket
Copy link

Hi. Your friendly Curry bot here. Just letting you know that there are commit authors in this Pull Request who appear to not have signed a Chef CLA.

There are 1 commit author(s) whose commits are authored by a non GitHub-verified email address in this Pull Request. Chef will have to verify by hand that they have signed a Chef CLA.

Please sign the CLA here.

@svanharmelen
Copy link
Contributor Author

I'm already a contributor of a signed a CCLA (Schuberg Philis)

@lamont-granquist
Copy link
Contributor

this makes sense to me in isolation but i've yet to tear apart how redirects in Chef::HTTP actually work... @danielsdeleo can you take a look?

👍 pending dan's.

@danielsdeleo
Copy link
Contributor

@danielsdeleo
Copy link
Contributor

This looks fine to me, 👍

Currently when you use a provider that somewhere along the line uses
Chef::HTTP.request or Chef::HTTP.streaming_request, you will receive an
error when the response is a redirect with a relative location.
@chef-supermarket
Copy link

Hi. Your friendly Curry bot here. Just letting you know that all commit authors have become authorized to contribute. I have added the "Signed CLA" label to this issue so it can easily be found in the future.

@svanharmelen
Copy link
Contributor Author

The new commit is NO code change, but a git commit --amend to update the commit in order to get the CLA stuff fixed...

@lamont-granquist
Copy link
Contributor

looks like we have duelling PRs with #1864 -- i think that this one should probably be closed in favor of the other one?

@svanharmelen
Copy link
Contributor Author

Hi @lamont-granquist please check my comment on issue #1863. I believe this was broken even before 11.14.x, but then this was less obvious... Please double check as I don't believe PR #1864 will fix the issue. It will only fix the ruby error, but not the actual problem.

CC @danielsdeleo @coderanger

@lamont-granquist
Copy link
Contributor

Cool, thanks for the clarification.

@lamont-granquist
Copy link
Contributor

So, additional question, your patch seems to work if we get "/index.html", but what if we're redirected to "../../index.html"? Since you just replace the path, I don't think that will work?

@svanharmelen
Copy link
Contributor Author

@lamont-granquist is that even a valid relative redirect location according to the RFC's? As I fail to find any example (not online and not in the RFC's) using such a redirect... Could you point me to such a use case so I can have a look?

@lamont-granquist
Copy link
Contributor

Yeah, I know that's a valid link reference, but no idea if that is a valid redirect.

@svanharmelen
Copy link
Contributor Author

Yeah, sure it's a valid link reference... But as far as I can tell from the RFC's a relative redirect location is always the complete path from an URI object... I at least cannot find any other examples in the RFC's suggesting otherwise.

@lamont-granquist
Copy link
Contributor

I can set one up and nginx serves it, and chrome and wget follow it:

% wget http://www.scriptkiddie.org/thing/index.html
--2014-08-20 15:06:31--  http://www.scriptkiddie.org/thing/index.html
Resolving www.scriptkiddie.org... 173.255.253.42, 2600:3c01::f03c:91ff:fe93:78db
Connecting to www.scriptkiddie.org|173.255.253.42|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: ../index.html [following]
--2014-08-20 15:06:32--  http://www.scriptkiddie.org/index.html
Reusing existing connection to www.scriptkiddie.org:80.
HTTP request sent, awaiting response... 200 OK
Length: 497 [text/html]
Saving to: 'index.html'

100%[======================================>] 497         --.-K/s   in 0s

2014-08-20 15:06:32 (23.7 MB/s) - 'index.html' saved [497/497]

(chef will probably obliterate that config very shortly so don't expect that wget to work)

@lamont-granquist
Copy link
Contributor

And just using telnet:

% telnet 173.255.253.42 80
Trying 173.255.253.42...
Connected to crash.scriptkiddie.org.
Escape character is '^]'.
GET /thing/index.html HTTP/1.1
Host: www.scriptkiddie.org
Connection: close

HTTP/1.1 302 Moved Temporarily
Server: nginx/1.4.6 (Ubuntu)
Date: Wed, 20 Aug 2014 22:12:12 GMT
Content-Type: text/html
Content-Length: 169
Connection: close
Location: ../index.html

<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.4.6 (Ubuntu)</center>
</body>
</html>
Connection closed by foreign host.

@svanharmelen
Copy link
Contributor Author

Check... So it does seem to working as expected and when reading further through the RFC's I can only find that you may use a relative URI. And when looking closer at the RFC describing a relative URI it says that ../ is a valid thing to use... So that would mean that you are sharp as a razor again 😉 and that we should make sure that also works correctly.

Fortunately that is a very easy thing to do using the URI object 😄 I'll update my PR in a second... Hold on...

@svanharmelen
Copy link
Contributor Author

@lamont-granquist please see the updated PR. This now works for all kind of relative redirects (so also the ones that use stuff like ../../index.html) and actually seems to be cleaner code at the same time... 👍

@lamont-granquist
Copy link
Contributor

So is there any reason why we don't simply convert the string url into a URI object in the initializer and force the internal representation so that it must be a URI object, and then we can just use the '+' operator which seems to bake all kinds of previously-solved-problems into it, and then we can drop the regexp mangling in the else completely? Plus the .is_a?(URI) check here is just horrible code smell...

@danielsdeleo ?

I suppose a user might expect to be able to pass in a string to the initializer and then get the string back out via the attr_reader somewhere, but definitely for Chef-12/master i feel it would be good to just break that.

@danielsdeleo
Copy link
Contributor

Yeah, this type ambiguity is a PITA. The only thing I can think of as a wrinkle is that Chef::REST (and I begrudgingly kept this behavior in Chef::HTTP) lets you do:

Chef::HTTP.new("http://bing.com").get("http://google.com")

so the argument to the HTTP method method can be a relative path or a full URI. Would need to dive into the code deeper to see how that impacts what you're suggesting.

@svanharmelen
Copy link
Contributor Author

I'm not behind a desk anymore (on CET time over here), but will check tomorrow if that solves all known corner cases we tried to solve earlier.

Must admit it sounds pretty reasonable and would make a clean solution indeed...

@lamont-granquist
Copy link
Contributor

this seems insane tho:

Chef::HTTP.new("/index.html").get("http://google.com")

my brain raises a WTFException on that.

while this I could see working:

Chef::HTTP.new("http://google.com/foo/bar.html").get("../index.html")

If we coerce the argument to Chef::HTTP.new() to a URI then I think we only break the thing that I don't think makes any sense. Your use case and my second use case there would still work.

@danielsdeleo
Copy link
Contributor

@svanharmelen Yes, using the + operator with URI looks like the best way to go and lets us rely on a more complete upstream implementation.

@lamont-granquist
Copy link
Contributor

Okay, I'm not sure how I got confused about what the URI '+' operator does. It looks like that operates correctly. So I think your fix for the is_a?(URI) case is fine as-is.

What I'd kind of like to see it that operator used in the regexp case as well so that you'd URI.parse the string @uri and then append the path. Based on your unit test example it looks like we're passing in some kind of incorrect path that has a leading-/ in front of '/cookbooks' that needs to get stripped to turn it into a relative redirect? If we need to preserve that behavior for strings I'd like to see that made a bit more explicit like:

      elsif @url.is_a?(URI)
        @url+path
      else
        fixed_path = path.gsub(%r{^/+}, '')   # XXX: strip leading /'s off of path (buggy callers)
        URI.parse(@url) + fixed_path
      end

I've still got a bit of a question as to if the unit test is testing something that we actually do or if its incorrect, since it seems like that's an invalid redirect and I don't understand what code would be generating that use case.

@btm
Copy link
Contributor

btm commented Sep 8, 2014

URI.+ seems to do the right thing with a relative path:

1.9.3-p194 :002 > u = URI.parse("http://www.example.org/one/two/three/four/index.html")
 => #<URI::HTTP:0x007fe8f2045c80 URL:http://www.example.org/one/two/three/four/index.html> 
1.9.3-p194 :003 > u + "../index.html"
 => #<URI::HTTP:0x007fe8f203ea48 URL:http://www.example.org/one/two/three/index.html> 

The somewhat surprising behavior is when that relative path has a leading slash, it replaces the path, which should be fine given #1396 checks got rid of errant double slashes, right?:

1.9.3-p194 :009 > u = URI.parse("http://www.example.org/one/two/three/four/index.html")
 => #<URI::HTTP:0x007fe8f3074200 URL:http://www.example.org/one/two/three/four/index.html> 
1.9.3-p194 :010 > u + "/bob/index.html"
 => #<URI::HTTP:0x007fe8f3068720 URL:http://www.example.org/bob/index.html> 

@svanharmelen are you proposing something like your current changes, like @lamont-granquist just commented about, or a new that just changed send_http_request to use URI.+ as in your last comment?

@svanharmelen
Copy link
Contributor Author

@btm I was indeed more in favor of re-doing this one and fix the 'send_http_request' function so it will not call the 'create_url' function again when receiving a redirect, but instead just use '@url+...' instead which will fix the current issue.

This means leaving the 'create_url' function untouched for this fix, as the call to this function is just not nessecairy for handling the redirect.

/cc @lamont-granquist

@danielsdeleo
Copy link
Contributor

That sounds like a reasonable plan to me.

@svanharmelen
Copy link
Contributor Author

I will update this PR tomorrow, so it reflex this approach...

@lamont-granquist
Copy link
Contributor

Yeah if the redirect case is the only case where we don't have a URI object then that seems like a better approach.

@svanharmelen
Copy link
Contributor Author

Please review the updated approach which now looks really clean to me...

Additionally there is room to improve the create_url function as well (like @lamont-granquist also suggested), but I would say that is out of scope for this PR and the thing we need/try to fix here.

@btm
Copy link
Contributor

btm commented Sep 9, 2014

@svanharmelen 👍

we still need to deal with spec/unit/http_spec.rb being sparse.

@lamont-granquist
Copy link
Contributor

👍

@danielsdeleo
Copy link
Contributor

This is awesome, kinda crazy that we got to such a simple solution in the end (but that's good). 👍

@lamont-granquist
Copy link
Contributor

indeed, lotta yapping for a one-liner

@btm
Copy link
Contributor

btm commented Sep 12, 2014

👍

sersut pushed a commit that referenced this pull request Oct 10, 2014
Fix a bug when receiving a relative redirect location
@sersut sersut merged commit 69b3c83 into chef:master Oct 10, 2014
sersut pushed a commit that referenced this pull request Oct 10, 2014
@chef chef locked and limited conversation to collaborators Nov 16, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants