curl doesn't follow Refresh: header redirects #3657
I did this
I used the attached script with curl (curlbugphp.txt is the PHP code).
URL to reproduce is:
I expected the following
I expected to get back the webpage response, but instead got the response shown in the attached file (curlbugresponse.txt), both in PHP and with the "curl" command line.
curl 7.54.0 (x86_64-apple-darwin18.0) libcurl/7.54.0
OSX 10.13 and OS X 10.11
You're reporting a problem with a really old curl version but more importantly, the target site your PHP script uses doesn't respond:
It would be more helpful if you instead would show us the exact response headers in the case of the problem.
I included that in the "curlbugresponse.txt" file. I'll check the URL, but I know it's correct, as you can see from the curlbugresponse.txt file.
If you use the URL in the script (attached), you will see it. If you used the one above, that is if you want to use the test script I sent to reproduce it.
FYI, I'm using the delivered version of "php" with Mojave. But I have tried it on other releases at the 7.x level.
Happy to try it on newer versions, but I see no reports that match and no results to indicate it might work in later releases.
Here is the URL inside the PHP script - if you want to try it outside the example script.
Ah, thanks I missed the response file you attached. It does indeed highlight exactly what the problem is. These are the response headers:
This "redirect" is done using the Refresh: header.
Historically, it seems it was considered for HTTP 1.1 but never made it!
I also found more useful info on the header and support for it in this blog post. Given that we've managed without support for this header for so long, I'm not 100% convinced adding support now is necessary.
Or is it?
(I just posted this email to the HTTPbis mailing list, pasted here and slightly reformatted for looks)
The other day someone filed a bug on curl that we don't support redirects with the Refresh: header.
As you all know, redirects in HTTP are specified to use 3xx response codes and a Location: header.
The little detail that it never made it into the 1.0 spec (nor any later one) doesn't seem to have affected the browsers. Still today, browsers keep supporting the Refresh header as a sort of Location: replacement even though it seems to never have been present in an HTTP spec.
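To make concrete what following such a header would involve, here is a minimal Python sketch. It assumes the commonly seen value form `<seconds>; url=<target>`; since the header is not specified anywhere, the exact syntax accepted varies between browsers. The helper name `parse_refresh` is hypothetical and is not part of curl, libcurl, or PHP.

```python
import re
from urllib.parse import urljoin

# Assumed syntax: "<seconds>[; [url=]<target>]". This is a guess at the
# common form, not a specification; real browsers accept more variations.
_REFRESH_RE = re.compile(
    r"^\s*(\d+)(?:\.\d*)?\s*(?:[;,]\s*(?:url\s*=\s*)?(.*))?$",
    re.IGNORECASE,
)


def parse_refresh(value, base_url):
    """Return (delay_seconds, absolute_target_url), or None if unparsable.

    Hypothetical helper for illustration only.
    """
    match = _REFRESH_RE.match(value)
    if match is None:
        return None
    delay = int(match.group(1))
    target = (match.group(2) or "").strip()
    # Drop one pair of matching quotes around the target, if present.
    if len(target) >= 2 and target[0] == target[-1] and target[0] in "'\"":
        target = target[1:-1]
    if not target:
        # No target given: the header asks for a reload of the same URL.
        return delay, base_url
    # Resolve relative targets against the URL that was requested.
    return delay, urljoin(base_url, target)


print(parse_refresh("0; url=/next", "http://example.com/start"))
```

A client that wanted to mimic the browsers would check for this header on a response without a Location: header, wait the given delay, and then request the returned URL.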
How frequent is the use of the Refresh header? I decided to make an attempt to figure it out, and for this venture I used the Rapid7 data trove. The method that data is collected with may not be the best, but it is still 52+ million HTTP responses from different current HTTP servers. (52254873 exactly in my data dump)
My counts show
Other random notes
Redirects can also be done by meta tags and sending the refresh that way, but I have not investigated how common that is, as it isn't strictly speaking HTTP, so it is outside of my research (and interest) here.
Nah, sorry, I don't have any. Yet another undocumented quirky corner of the web I suppose.
There's no documentation for this header. It is not present in the HTTP standard. It is not implemented by other non-browser HTTP toolkits. I think the sane thing to do here is to push the browsers into dropping support for this abomination. Right now, I don't think it is in our interest to implement support for this header and just extend the pain in the world.
Do contact the site you interact with and urge that they switch to a standard HTTP header!
(My post here and to the list was also subsequently turned into a blogpost of mine that has some additional feedback that I've received when discussing this topic outside of the curl project.)
Just want to add that, while I totally agree that this is NOT the standard and should not be done - the browsers all support this because they must. It's in the field and they get it all the time.
Interesting. What is the sample size? 0.02% seems quite small to have warranted support in the browser code bases.
Perhaps this is the end of this particular issue: as it nears 0.00%, it may be that it's been phased out and is awaiting complete annihilation.
I read the info that was posted earlier the first time. 52M is a pittance of total traffic. It's why I asked about the sample size. 52M tells me nothing without telling me the rate and the period. Is that 52M per minute? Per second? Over the course of 1 hour? Only US traffic? Lots of other questions. Worldwidewebsize reports that on 2/17/2019 the number of "Google" indexed webpages was ~64.5B pages.
As a measure of raw throughput, Internet Live Stats shows that, at this moment, total internet traffic today is 3.885 billion GB SO FAR.
That is not to say that badger's response wasn't fantastic - it was. It was quick, used a common source of data, and provided a good analysis of that traffic as it related to this issue and curl's thoroughness.
Fascinating to look into.
Then read the blog post or check the source where I got the data. It is explained in both places, more detailed in the latter. Those are 52 million different origins.
By all means do your own measurements. I doubt you'll find a significantly different usage level. I focused on responses from different origins; you can focus on something else.