[BrowserKit] Add support for HttpClient #30602

Merged
merged 2 commits into symfony:master from fabpot:http-with-browserkit Mar 23, 2019

@fabpot
Member

commented Mar 19, 2019

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | part of #30502
| License       | MIT
| Doc PR        | not yet

When combining the power of the new HttpClient component with the BrowserKit and Mime components, we can make something really powerful... a full/better/awesome replacement for https://github.com/FriendsOfPHP/Goutte.

So, this PR is about integrating the HttpClient component with BrowserKit to give users a high-level interface that covers the most common use cases.

Scraping websites can be done like this:

use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\HttpClient;

$client = HttpClient::create();
$browser = new HttpBrowser($client);

$browser->request('GET', 'https://example.com/');
$browser->clickLink('Log In');
$browser->submitForm('Sign In', ['username' => 'me', 'password' => 'pass']);
$browser->clickLink('Subscriptions')->filter('table tr:nth-child(2) td:nth-child(2)')->each(function ($node) {
    echo trim($node->text())."\n";
});

And voilà! Nice, isn't it?

Want to add HTTP cache? Sure:

use Symfony\Component\HttpKernel\HttpCache\Store;

$client = HttpClient::create();
$store = new Store(sys_get_temp_dir().'/http-cache-store');

$browser = new HttpBrowser($client, $store);

// ...

Want logging and debugging of HTTP Cache? Yep:

use Psr\Log\AbstractLogger;

class EchoLogger extends AbstractLogger
{
    public function log($level, $message, array $context = [])
    {
        echo $message."\n";
    }
}

$browser = new HttpBrowser($client, $store, new EchoLogger());

The first time you run your code, you will get an output similar to:

Request: GET https://twig.symfony.com/
Response: 200 https://twig.symfony.com/
Cache: GET /: miss, store
Request: GET https://twig.symfony.com/doc/2.x/
Response: 200 https://twig.symfony.com/doc/2.x/
Cache: GET /doc/2.x/: miss, store

But then:

Cache: GET /: fresh
Cache: GET /doc/2.x/: fresh

Limit is the sky here as you get the full power of all the Symfony ecosystem.

Under the hood, these examples leverage HttpFoundation, HttpKernel (with HttpCache),
DomCrawler, BrowserKit, CssSelector, HttpClient, Mime, ...

Excited?

P.S.: Tests need to wait for the HttpClient Mock class to land in master.
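
For illustration only, here is a rough sketch of what an offline test of `HttpBrowser` might look like once the mock client is available. It assumes the `MockHttpClient` and `MockResponse` classes from the HttpClient component plus Composer autoloading; the HTML fixture and the URL are made up for the example, and this is not the test suite that ended up in the PR.

```php
// Assumption: dependencies are installed with Composer (symfony/browser-kit,
// symfony/http-client, symfony/dom-crawler, symfony/css-selector).
require_once 'vendor/autoload.php';

use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\MockHttpClient;
use Symfony\Component\HttpClient\Response\MockResponse;

// Hypothetical fixture: the page the fake server returns.
$html = '<html><body><h1>Hello</h1></body></html>';

// The mock client never touches the network; it replays the canned response.
$client = new MockHttpClient(new MockResponse($html, [
    'response_headers' => ['Content-Type: text/html; charset=UTF-8'],
]));

$browser = new HttpBrowser($client);
$crawler = $browser->request('GET', 'https://example.com/');

// Print the heading extracted from the mocked page.
echo $crawler->filter('h1')->text()."\n";
```

Because the response is canned, the same approach should work for exercising link clicks and form submissions without a live site.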

@fabpot fabpot force-pushed the fabpot:http-with-browserkit branch 2 times, most recently from 8bde4ee to a15b707 Mar 19, 2019

@ro0NL
Contributor

left a comment

Limit is the sky

You mean "Sky's the limit" i suppose :D

@nicolas-grekas nicolas-grekas force-pushed the fabpot:http-with-browserkit branch from a15b707 to 8443c88 Mar 20, 2019

@nicolas-grekas nicolas-grekas changed the title [HttpClient] Add support for BrowserKit [BrowserKit] Add support for HttpClient Mar 21, 2019

fabpot added a commit that referenced this pull request Mar 21, 2019

feature #30625 [HttpKernel] add RealHttpKernel: handle requests with HttpClientInterface (fabpot)

This PR was merged into the 4.3-dev branch.

Discussion
----------

[HttpKernel] add RealHttpKernel: handle requests with HttpClientInterface

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

This commit is directly extracted from #30602 by @fabpot

Commits
-------

b579b02 [HttpKernel] add RealHttpKernel: handle requests with HttpClientInterface

@fabpot fabpot force-pushed the fabpot:http-with-browserkit branch from 8443c88 to feebfee Mar 21, 2019

@nicolas-grekas nicolas-grekas force-pushed the fabpot:http-with-browserkit branch 3 times, most recently from 380a517 to a25fe3f Mar 22, 2019

@nicolas-grekas
Member

left a comment

Green in a minute, with tests provided by @ktherage. The caching part is now separated into #30629, and logging has been removed. It should be reintroduced in HttpClient directly.
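
With caching split out, the `$store` example from the PR description would presumably change shape: rather than handing the store to `HttpBrowser`, the client itself gets wrapped. A minimal sketch, assuming the `CachingHttpClient` proposed in #30629 takes the decorated client and a `Store`; the `createCachingBrowser()` helper and the cache directory are hypothetical names used only for illustration:

```php
// Assumption: dependencies are installed with Composer (symfony/browser-kit,
// symfony/http-client, symfony/http-kernel).
require_once 'vendor/autoload.php';

use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\CachingHttpClient;
use Symfony\Component\HttpClient\HttpClient;
use Symfony\Component\HttpKernel\HttpCache\Store;

// Hypothetical helper: builds a browser whose HTTP client is wrapped in a cache.
function createCachingBrowser(string $cacheDir): HttpBrowser
{
    // The store persists cached responses on disk under $cacheDir.
    $store = new Store($cacheDir);

    // Decorate the real client; the browser only ever sees the caching client.
    $client = new CachingHttpClient(HttpClient::create(), $store);

    return new HttpBrowser($client);
}
```

Calling `createCachingBrowser(sys_get_temp_dir().'/http-cache-store')` would then give a browser whose requests go through the HTTP cache before reaching the network.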

class HttpBrowser extends AbstractBrowser
{
    private $client;
    private $httpKernelBrowser;

@ro0NL

ro0NL Mar 22, 2019

Contributor

unused

@nicolas-grekas

nicolas-grekas Mar 22, 2019

Member

removed thanks

@nicolas-grekas nicolas-grekas force-pushed the fabpot:http-with-browserkit branch from a25fe3f to b5b2a25 Mar 22, 2019

@fabpot

Member Author

commented Mar 23, 2019

Thank you THERAGE Kévin.

@fabpot fabpot merged commit b5b2a25 into symfony:master Mar 23, 2019

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
fabbot.io Your code looks good.

fabpot added a commit that referenced this pull request Mar 23, 2019

feature #30602 [BrowserKit] Add support for HttpClient (fabpot, THERAGE Kévin)

This PR was merged into the 4.3-dev branch.

Commits
-------

b5b2a25 Add tests for HttpBrowser
dd55845 [BrowserKit] added support for HttpClient

fabpot added a commit that referenced this pull request Mar 23, 2019

feature #30629 [HttpClient] added CachingHttpClient (fabpot)
This PR was merged into the 4.3-dev branch.

Discussion
----------

[HttpClient] added CachingHttpClient

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

The proposed `CachingHttpClient` uses `HttpCache` from the HttpKernel component to provide an HTTP-compliant cache.

If this is accepted, it could replace the corresponding part in #30602.

Commits
-------

dae5686 [HttpClient] added CachingHttpClient

@nicolas-grekas nicolas-grekas modified the milestones: next, 4.3 Apr 30, 2019

@fabpot fabpot referenced this pull request May 9, 2019

Merged

Release v4.3.0-BETA1 #31435
