Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HttpClient] Next steps #30502

Open
nicolas-grekas opened this Issue Mar 9, 2019 · 7 comments

Comments

Projects
None yet
5 participants
@nicolas-grekas
Copy link
Member

nicolas-grekas commented Mar 9, 2019

Our shiny new HTTP client (see #30413) raised some comments. Some of them are about missing features. I'm opening this RFC to keep track of them. If anyone would like to work on any item, it'd be very welcome. I'm adding the "help wanted" label to hopefully make this call more visible.

  • #30419 FrameworkBundle integration: autowireable alias + semantic configuration for default options
  • #30494 add TraceableHttpClient and integrate with the profiler
  • the on_progress callback could benefit from an additional ?string $chunk argument - it might be helpful e.g. to catch response's contents in TraceableHttpClient or any other decorators that want to monitor the content
  • #30604 add a mock client - a client that would return fake responses to help writing tests. Note that unlike HTTPlug's Mock Client, collecting requests could be delegated to the TraceableHttpClient and the profiler service instead. To be discussed in a separate issue of course. Achieving parity with CsaGuzzleBundle's mock middleware could be a nice goal.
  • #30537 logger integration - to me, this should be done by injecting a logger into existing implementations (a logging middleware wouldn't allow accessing some internal things that might be useful to debugging)
  • creating clients with custom behavior requires writing decorators. The component provides no base class to help doing so. It might be useful to provide an abstract one for this purpose. See #30413 (comment) and following comments.
  • at least for debugging purposes, it could be useful to give access to the full headers that were sent to the remote server. Two ideas: that could be done by the logger - it would then be implementation-dependent; or we could add a new request_headers entry to responses' getInfo().
  • the "buffer" option is required to allow accessing the content several times, but it should really be turned to false when downloading large files. We discussed with @joelwurtz about an "auto" mode, which could be the default one, and which would look at the content-type of the response to auto-enable buffering for some content types only. If anyone wants to take that task, please open a separate issue/PR to discuss more precisely how this could work.
  • NativeHttpClient currently relies on the native http stream wrappers. This means sending a request is blocking until the response headers come back. We discussed about that with @joelwurtz: porting his socket client to the component could remove this limitation. It would also allow sending requests through unix sockets, which only CurlHttpClient can do right now.
  • handling of cookies is "do it yourself" right now. Defining a cookie_jar option could be useful. Or maybe an option is not the correct way. More ideas welcome in a separate issue/PR.
  • we could provide a decorator that would add HTTP and/or HSTS caching. Using HttpCache internally would be awesome :)
  • CurlHttpClient works on steroids with HTTP/2 thanks to multiplexing and server pushes, but this is untested right now. We could use the symfony CLI binary to build such a test suite. That would allow testing HTTPS features also.
  • have exceptions carry responses
  • support for multipart/form-data
  • expose API with URI manipulation (not sure we should personnaly - future will tell)
  • provide a way to inject custom curlopts (not sure we should personnaly - existing options already cover most of them)
  • provide a way to inject custom JSON flags (not sure we should personnaly - ppl can always encode/decode on their own if they really want precise control on these)
@fabpot

This comment has been minimized.

Copy link
Member

fabpot commented Mar 9, 2019

I'm going to take the following items:

"support for multipart/form-data"
"we could provide a decorator that would add HTTP and/or HSTS caching. Using HttpCache internally would be awesome :)"
"handling of cookies is "do it yourself" right now. Defining a cookie_jar option could be useful. Or maybe an option is not the correct way. More ideas welcome in a separate issue/PR."

@antonch1989

This comment has been minimized.

Copy link
Contributor

antonch1989 commented Mar 11, 2019

I could try with the logging integration

@antonch1989

This comment has been minimized.

Copy link
Contributor

antonch1989 commented Mar 12, 2019

I'll also take have exceptions carry responses if there are no objections

@stof

This comment has been minimized.

Copy link
Member

stof commented Mar 12, 2019

* `NativeHttpClient` currently relies on the native `http` stream wrappers. This means sending a request is blocking until the response headers come back. We discussed about that with @joelwurtz: porting his [socket client](https://github.com/php-http/socket-client) to the component could remove this limitation. It would also allow sending requests through unix sockets, which only `CurlHttpClient` can do right now.

should it be a separate client implementation instead ?

@antonch1989

This comment has been minimized.

Copy link
Contributor

antonch1989 commented Mar 14, 2019

I'll work on giving access to the full headers that were sent to the remote server

@nicolas-grekas

This comment has been minimized.

Copy link
Member Author

nicolas-grekas commented Mar 16, 2019

I'll work on giving access to the full headers that were sent to the remote server

thanks - it should be done using CURLINFO_HEADER_OUT for curl - and mimicking the resulting behavior in NativeHttpClient as much as possible

(but please let's move your "logger" PR before:)

fabpot added a commit that referenced this issue Mar 19, 2019

feature #30567 [HttpClient] exceptions carry response (antonch1989)
This PR was squashed before being merged into the 4.3-dev branch (closes #30567).

Discussion
----------

[HttpClient] exceptions carry response

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no     <!-- see https://symfony.com/bc -->
| Deprecations? | no
| Tests pass?   | yes    <!-- please add some, will be required by reviewers -->
| Fixed tickets | #30502
| License       | MIT
| Doc PR        |

Commits
-------

103448c [HttpClient] exceptions carry response
@antonch1989

This comment has been minimized.

Copy link
Contributor

antonch1989 commented Mar 21, 2019

I investigated a bit with CURLINFO_HEADER_OUT. It is straightforward if there are curl_init -> curl_setopt -> curl_exec -> curl_getinfo calls, in this case the info will have request_header key if CURLINFO_HEADER_OUT option is set.

In case with multi handlers it doesn't work. There is no request_header key in CurlResponse::getInfo
if I set an option here
header_out

Probably doing this with the logger is not a bad idea in our terms

fabpot added a commit that referenced this issue Mar 23, 2019

feature #30602 [BrowserKit] Add support for HttpClient (fabpot, THERA…
…GE Kévin)

This PR was merged into the 4.3-dev branch.

Discussion
----------

[BrowserKit] Add support for HttpClient

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no <!-- don't forget to update UPGRADE-*.md and src/**/CHANGELOG.md files -->
| Tests pass?   | yes    <!-- please add some, will be required by reviewers -->
| Fixed tickets | part of #30502
| License       | MIT
| Doc PR        | not yet

When combining the power of the new HttpClient component with the BrowserKit and Mime components, we can makes something really powerful... a full/better/awesome replacement for https://github.com/FriendsOfPHP/Goutte.

So, this PR is about integrating the HttpClient component with BrowserKit to give users a high-level interface to ease usages in the most common use cases.

Scraping websites can be done like this:

```php
use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\HttpClient;

$client = HttpClient::create();
$browser = new HttpBrowser($client);

$browser->request('GET', 'https://example.com/');
$browser->clickLink('Log In');
$browser->submitForm('Sign In', ['username' => 'me', 'password' => 'pass']);
$browser->clickLink('Subscriptions')->filter('table tr:nth-child(2) td:nth-child(2)')->each(function ($node) {
    echo trim($node->text())."\n";
});
```

And voilà! Nice, isn't?

Want to add HTTP cache? Sure:

```php
use Symfony\Component\HttpKernel\HttpCache\Store;

$client = HttpClient::create();
$store = new Store(sys_get_temp_dir().'/http-cache-store');

$browser = new HttpBrowser($client, $store);

// ...
```

Want logging and debugging of HTTP Cache? Yep:

```php
use Psr\Log\AbstractLogger;

class EchoLogger extends AbstractLogger
{
    public function log($level, $message, array $context = [])
    {
        echo $message."\n";
    }
}

$browser = new HttpBrowser($client, $store, new EchoLogger());
```

The first time you run your code, you will get an output similar to:

```
Request: GET https://twig.symfony.com/
Response: 200 https://twig.symfony.com/
Cache: GET /: miss, store
Request: GET https://twig.symfony.com/doc/2.x/
Response: 200 https://twig.symfony.com/doc/2.x/
Cache: GET /doc/2.x/: miss, store
```

But then:

```
Cache: GET /: fresh
Cache: GET /doc/2.x/: fresh
```

Limit is the sky here as you get the full power of all the Symfony ecosystem.

Under the hood, these examples leverage HttpFoundation, HttpKernel (with HttpCache),
DomCrawler, BrowserKit, CssSelector, HttpClient, Mime, ...

Excited?

P.S. : Tests need to wait for the HttpClient Mock class to land into master.

Commits
-------

b5b2a25 Add tests for HttpBrowser
dd55845 [BrowserKit] added support for HttpClient
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.