New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support protocol relative URL in HTTP Client. #11163

Merged
merged 2 commits into from Sep 14, 2017

Conversation

Projects
None yet
5 participants
@robertpustulka
Member

robertpustulka commented Sep 7, 2017

Some servers set //example.com as a Location header. buildUrl() needs to support that in order to build absolute urls.

Support protocol relative URL in HTTP Client.
Some servers set //example.com as a Location header. `buildUrl()` needs
to support that in order to build absolute urls.

@robertpustulka robertpustulka added the http label Sep 7, 2017

@robertpustulka robertpustulka added this to the 3.5.2 milestone Sep 7, 2017

@codecov-io

This comment has been minimized.

Show comment
Hide comment
@codecov-io

codecov-io Sep 7, 2017

Codecov Report

Merging #11163 into master will decrease coverage by 1.7%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #11163      +/-   ##
============================================
- Coverage     94.86%   93.15%   -1.71%     
- Complexity    12859    12980     +121     
============================================
  Files           437      437              
  Lines         32774    33621     +847     
============================================
+ Hits          31090    31319     +229     
- Misses         1684     2302     +618
Impacted Files Coverage Δ Complexity Δ
src/Http/Client.php 100% <100%> (ø) 55 <0> (+4) ⬆️
src/Shell/ServerShell.php 22% <0%> (-78%) 10% <0%> (+9%)
src/Cache/Engine/NullEngine.php 25% <0%> (-75%) 12% <0%> (+9%)
src/basics.php 43.33% <0%> (-49.53%) 0% <0%> (ø)
src/Shell/I18nShell.php 54.83% <0%> (-45.17%) 13% <0%> (+5%)
src/Event/Event.php 70% <0%> (-30%) 24% <0%> (+8%)
src/Shell/CacheShell.php 71.87% <0%> (-28.13%) 9% <0%> (+2%)
src/Mailer/Transport/MailTransport.php 68.18% <0%> (-25.57%) 7% <0%> (+3%)
src/Log/Engine/SyslogLog.php 76.19% <0%> (-23.81%) 7% <0%> (+2%)
src/Shell/Task/ExtractTask.php 76.53% <0%> (-23.47%) 117% <0%> (+16%)
... and 141 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6d25953...fef1ebf. Read the comment docs.

codecov-io commented Sep 7, 2017

Codecov Report

Merging #11163 into master will decrease coverage by 1.7%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #11163      +/-   ##
============================================
- Coverage     94.86%   93.15%   -1.71%     
- Complexity    12859    12980     +121     
============================================
  Files           437      437              
  Lines         32774    33621     +847     
============================================
+ Hits          31090    31319     +229     
- Misses         1684     2302     +618
Impacted Files Coverage Δ Complexity Δ
src/Http/Client.php 100% <100%> (ø) 55 <0> (+4) ⬆️
src/Shell/ServerShell.php 22% <0%> (-78%) 10% <0%> (+9%)
src/Cache/Engine/NullEngine.php 25% <0%> (-75%) 12% <0%> (+9%)
src/basics.php 43.33% <0%> (-49.53%) 0% <0%> (ø)
src/Shell/I18nShell.php 54.83% <0%> (-45.17%) 13% <0%> (+5%)
src/Event/Event.php 70% <0%> (-30%) 24% <0%> (+8%)
src/Shell/CacheShell.php 71.87% <0%> (-28.13%) 9% <0%> (+2%)
src/Mailer/Transport/MailTransport.php 68.18% <0%> (-25.57%) 7% <0%> (+3%)
src/Log/Engine/SyslogLog.php 76.19% <0%> (-23.81%) 7% <0%> (+2%)
src/Shell/Task/ExtractTask.php 76.53% <0%> (-23.47%) 117% <0%> (+16%)
... and 141 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6d25953...fef1ebf. Read the comment docs.

@AD7six

This comment has been minimized.

Show comment
Hide comment
@AD7six

AD7six Sep 7, 2017

Member

Some servers set //example.com as a Location header.

Can you explain the actual use case this is for please.

Do you mean the reason for this change is to follow redirects like so:

$ curl -I http://example.com/not/here.html
HTTP/1.1 301 Moved Permanently
Location: //example.com/but/here.html

?

Feels like there is significant scope for "upgrading broke my app" with this as buildUrl('//some/path') will change behavior.

Member

AD7six commented Sep 7, 2017

Some servers set //example.com as a Location header.

Can you explain the actual use case this is for please.

Do you mean the reason for this change is to follow redirects like so:

$ curl -I http://example.com/not/here.html
HTTP/1.1 301 Moved Permanently
Location: //example.com/but/here.html

?

Feels like there is significant scope for "upgrading broke my app" with this as buildUrl('//some/path') will change behavior.

@robertpustulka

This comment has been minimized.

Show comment
Hide comment
@robertpustulka

robertpustulka Sep 7, 2017

Member

Do you mean the reason for this change is to follow redirects like so:

Yes.

To me it looks like a bug that those URL weren't supported.
Locations like //example.com were redirected to http://example.com/example.com.

Member

robertpustulka commented Sep 7, 2017

Do you mean the reason for this change is to follow redirects like so:

Yes.

To me it looks like a bug that those URL weren't supported.
Locations like //example.com were redirected to http://example.com/example.com.

@AD7six

This comment has been minimized.

Show comment
Hide comment
@AD7six

AD7six Sep 7, 2017

Member

In the scope of a location header the change makes sense, but it applies much broader than that

I think this is a problem:

-> git diff
diff --git tests/TestCase/Http/ClientTest.php tests/TestCase/Http/ClientTest.php
index d8c379e..b9b4d52 100644
--- tests/TestCase/Http/ClientTest.php
+++ tests/TestCase/Http/ClientTest.php
@@ -146,6 +146,15 @@ class ClientTest extends TestCase
                 ],
                 'url without a scheme',
             ],
+            [
+                'http://example.com//some/file',
+                '//some/file',
+                [],
+                [
*                    'host' => 'example.com',
+                    'scheme' => 'http'
+                ],
+                'Borked but valid path should not be understood as a protocol-relative url',
+            ],
         ];
     }
 
-> phpunit tests/TestCase/Http/ClientTest.php
PHPUnit 6.3.0 by Sebastian Bergmann and contributors.

............F..........................                           39 / 39 (100%)

Time: 354 ms, Memory: 12.00MB

There was 1 failure:

1) Cake\Test\TestCase\Http\ClientTest::testBuildUrl with data set #11 ('http://example.com//some/file', '//some/file', array(), array('http'), 'Borked but valid path')
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'http://example.com//some/file'
+'http://some/file'

/tmp/cakephp/tests/TestCase/Http/ClientTest.php:169

I could be wrong, but I feel the right place for fixing that problem is https://github.com/cakephp/cakephp/blob/master/src/Http/Client.php#L401-L402

For comparison, curl doesn't understand protocol-relative urls:

$ curl -IL //example.com
curl: (3) <url> malformed

unless they are in a location header:

$ curl -IL http://cake.dev/foo
HTTP/1.1 302 Moved Temporarily
Location: //example.com

HTTP/1.1 200 OK
...

Note also that there is no need to use preg_match to test if a string matches one permutation. (substr($url, 0, 2) === '//') would suffice.

Member

AD7six commented Sep 7, 2017

In the scope of a location header the change makes sense, but it applies much broader than that

I think this is a problem:

-> git diff
diff --git tests/TestCase/Http/ClientTest.php tests/TestCase/Http/ClientTest.php
index d8c379e..b9b4d52 100644
--- tests/TestCase/Http/ClientTest.php
+++ tests/TestCase/Http/ClientTest.php
@@ -146,6 +146,15 @@ class ClientTest extends TestCase
                 ],
                 'url without a scheme',
             ],
+            [
+                'http://example.com//some/file',
+                '//some/file',
+                [],
+                [
*                    'host' => 'example.com',
+                    'scheme' => 'http'
+                ],
+                'Borked but valid path should not be understood as a protocol-relative url',
+            ],
         ];
     }
 
-> phpunit tests/TestCase/Http/ClientTest.php
PHPUnit 6.3.0 by Sebastian Bergmann and contributors.

............F..........................                           39 / 39 (100%)

Time: 354 ms, Memory: 12.00MB

There was 1 failure:

1) Cake\Test\TestCase\Http\ClientTest::testBuildUrl with data set #11 ('http://example.com//some/file', '//some/file', array(), array('http'), 'Borked but valid path')
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'http://example.com//some/file'
+'http://some/file'

/tmp/cakephp/tests/TestCase/Http/ClientTest.php:169

I could be wrong, but I feel the right place for fixing that problem is https://github.com/cakephp/cakephp/blob/master/src/Http/Client.php#L401-L402

For comparison, curl doesn't understand protocol-relative urls:

$ curl -IL //example.com
curl: (3) <url> malformed

unless they are in a location header:

$ curl -IL http://cake.dev/foo
HTTP/1.1 302 Moved Temporarily
Location: //example.com

HTTP/1.1 200 OK
...

Note also that there is no need to use preg_match to test if a string matches one permutation. (substr($url, 0, 2) === '//') would suffice.

@robertpustulka

This comment has been minimized.

Show comment
Hide comment
@robertpustulka

robertpustulka Sep 7, 2017

Member

//example.com syntax is used in a scope where a protocol is known, this is a "Protocol relative URL".

Maybe the buildUrl method should append the scheme only if a protocol has been explicitly passed as an option?

if (isset($options['scheme'] && (substr($url, 0, 2) === '//')) {
    $url = $options['scheme'] . ':' . $url;
}
...
$options += $defaults;
Member

robertpustulka commented Sep 7, 2017

//example.com syntax is used in a scope where a protocol is known, this is a "Protocol relative URL".

Maybe the buildUrl method should append the scheme only if a protocol has been explicitly passed as an option?

if (isset($options['scheme'] && (substr($url, 0, 2) === '//')) {
    $url = $options['scheme'] . ':' . $url;
}
...
$options += $defaults;
@markstory

This comment has been minimized.

Show comment
Hide comment
@markstory

markstory Sep 7, 2017

Member

@AD7six //some/file doesn't seem like valid path to me. It is ambiguous as //host/path style URLs are used quite often to load CSS/JS assets, and // should likely be interpreted the same in our Client as it would in a browser.

rfc 7231 states that Location should be a Relative Reference which indicates that the first segment after // is the authority or domain.

Member

markstory commented Sep 7, 2017

@AD7six //some/file doesn't seem like valid path to me. It is ambiguous as //host/path style URLs are used quite often to load CSS/JS assets, and // should likely be interpreted the same in our Client as it would in a browser.

rfc 7231 states that Location should be a Relative Reference which indicates that the first segment after // is the authority or domain.

@AD7six

This comment has been minimized.

Show comment
Hide comment
@AD7six

AD7six Sep 8, 2017

Member

@markstory my concern/point is that a request like this https://cakephp.org//css/cake.css is a valid response on most webservers, and is affected by the modified code. It's easy to write code which would make a request like that:

$someVariable = '/css/cake.css';
...
$path = '/';
...
$path .= $someVariable;
$result = (new Client())->get($path, [], ['host' => 'cakephp.org', 'scheme' => 'https']);

The behavior of this request is going to significantly change when this PR is merged. Hoping that's understood (if it already is - well 🤷‍♂️ :)) - I have no problem with the intended scope of the PR (handling protocol relative redirects) as I previously mentioned.

Member

AD7six commented Sep 8, 2017

@markstory my concern/point is that a request like this https://cakephp.org//css/cake.css is a valid response on most webservers, and is affected by the modified code. It's easy to write code which would make a request like that:

$someVariable = '/css/cake.css';
...
$path = '/';
...
$path .= $someVariable;
$result = (new Client())->get($path, [], ['host' => 'cakephp.org', 'scheme' => 'https']);

The behavior of this request is going to significantly change when this PR is merged. Hoping that's understood (if it already is - well 🤷‍♂️ :)) - I have no problem with the intended scope of the PR (handling protocol relative redirects) as I previously mentioned.

@markstory

This comment has been minimized.

Show comment
Hide comment
@markstory

markstory Sep 8, 2017

Member

That's a good point @AD7six I hadn't considered that. @robertpustulka Is there a way your change could be scoped closer to where Location headers are followed? Perhaps the scheme should only be applied if the scheme option was set as you mentioned earlier.

Member

markstory commented Sep 8, 2017

That's a good point @AD7six I hadn't considered that. @robertpustulka Is there a way your change could be scoped closer to where Location headers are followed? Perhaps the scheme should only be applied if the scheme option was set as you mentioned earlier.

@markstory markstory modified the milestones: 3.5.2, 3.5.3 Sep 12, 2017

@robertpustulka

This comment has been minimized.

Show comment
Hide comment
@robertpustulka

robertpustulka Sep 13, 2017

Member

Protocol relative fix is now opt-in. I added a protocolRelative option which is set to true in redirects.

Member

robertpustulka commented Sep 13, 2017

Protocol relative fix is now opt-in. I added a protocolRelative option which is set to true in redirects.

@markstory markstory merged commit 2163569 into cakephp:master Sep 14, 2017

5 checks passed

codecov/patch 100% of diff hit (target 94.86%)
Details
codecov/project Absolute coverage decreased by -1.7% but relative coverage increased by +5.13% compared to 6d25953
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
stickler-ci No lint errors found.
@markstory

This comment has been minimized.

Show comment
Hide comment
@markstory

markstory Sep 14, 2017

Member

Nice work!

Member

markstory commented Sep 14, 2017

Nice work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment