Skip to content

Commit

Permalink
bug #54207 [HttpClient] Lazily initialize CurlClientState (arjenm)
Browse files Browse the repository at this point in the history
This PR was merged into the 5.4 branch.

Discussion
----------

[HttpClient] Lazily initialize CurlClientState

| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Issues        | See below
| License       | MIT

We have a dependency that recently started requiring symfony/http-client. Once that happened, our testing pipeline started to fail.

The failure turned out to be running out of open file descriptors.

It took a while to pinpoint the cause, but we identified the constructor of CurlClientState, specifically the `curl_multi_init` call in it, to cause an ever increasing number of open files (specifically, unix streams).

In our platform, we don't actually use HttpClient. But several services in Symfony have an "optional" dependency on it. Our examples were the TransportFactory-classes in the Mailer-package (including that for the NullTransport used in our testing pipeline) and the JsonManifestVersionStrategy.

I.e. as soon as the http-client package was installed, any time those services where requested, the container now first had to create the HttpClient-service. Which then selected CurlHttpClient as the best option. Which constructed its CurlClientState. And that in turn preemptively opened that stream.

We trigger the Mailer-service for every test, and given our 11k+ tests, that happened a lot of times.  And apparently, the streams didn't get closed properly since we ran out of file descriptors after something like 9k tests done...

Unfortunately I was unable to create an isolated reproducer. My guess is that the reproducer was so much smaller, that PHP's memory cleanup could keep up and quickly identify and close abandoned resources (like the curl multihandle), whereas it could not in our much bigger application. But the issue (lots of 'STREAM' rows in the output of netstat or lsof) did not appear when an 'early return' was added to the CurlClientState's constructor (or when we explicitly specified null as httpclient for the mailer and our JsonManifestVersionStrategy).

This PR delays the (imo very heavy) construction of CurlClientState (including the initial reset) in CurlHttpClient to be a "just in time" occurrence.

Commits
-------

4966c75 [HttpClient] Lazily initialize CurlClientState
  • Loading branch information
nicolas-grekas committed Mar 12, 2024
2 parents c6506c4 + 4966c75 commit 0a7825b
Show file tree
Hide file tree
Showing 2 changed files with 44 additions and 21 deletions.
61 changes: 42 additions & 19 deletions src/Symfony/Component/HttpClient/CurlHttpClient.php
Expand Up @@ -50,6 +50,9 @@ final class CurlHttpClient implements HttpClientInterface, LoggerAwareInterface,
*/
private $logger;

private $maxHostConnections;
private $maxPendingPushes;

/**
* An internal object to share state between the client and its responses.
*
Expand All @@ -70,18 +73,22 @@ public function __construct(array $defaultOptions = [], int $maxHostConnections
throw new \LogicException('You cannot use the "Symfony\Component\HttpClient\CurlHttpClient" as the "curl" extension is not installed.');
}

$this->maxHostConnections = $maxHostConnections;
$this->maxPendingPushes = $maxPendingPushes;

$this->defaultOptions['buffer'] = $this->defaultOptions['buffer'] ?? \Closure::fromCallable([__CLASS__, 'shouldBuffer']);

if ($defaultOptions) {
[, $this->defaultOptions] = self::prepareRequest(null, null, $defaultOptions, $this->defaultOptions);
}

$this->multi = new CurlClientState($maxHostConnections, $maxPendingPushes);
}

public function setLogger(LoggerInterface $logger): void
{
$this->logger = $this->multi->logger = $logger;
$this->logger = $logger;
if (isset($this->multi)) {
$this->multi->logger = $logger;
}
}

/**
Expand All @@ -91,6 +98,8 @@ public function setLogger(LoggerInterface $logger): void
*/
public function request(string $method, string $url, array $options = []): ResponseInterface
{
$multi = $this->ensureState();

[$url, $options] = self::prepareRequest($method, $url, $options, $this->defaultOptions);
$scheme = $url['scheme'];
$authority = $url['authority'];
Expand Down Expand Up @@ -161,25 +170,25 @@ public function request(string $method, string $url, array $options = []): Respo
}

// curl's resolve feature varies by host:port but ours varies by host only, let's handle this with our own DNS map
if (isset($this->multi->dnsCache->hostnames[$host])) {
$options['resolve'] += [$host => $this->multi->dnsCache->hostnames[$host]];
if (isset($multi->dnsCache->hostnames[$host])) {
$options['resolve'] += [$host => $multi->dnsCache->hostnames[$host]];
}

if ($options['resolve'] || $this->multi->dnsCache->evictions) {
if ($options['resolve'] || $multi->dnsCache->evictions) {
// First reset any old DNS cache entries then add the new ones
$resolve = $this->multi->dnsCache->evictions;
$this->multi->dnsCache->evictions = [];
$resolve = $multi->dnsCache->evictions;
$multi->dnsCache->evictions = [];
$port = parse_url($authority, \PHP_URL_PORT) ?: ('http:' === $scheme ? 80 : 443);

if ($resolve && 0x072A00 > CurlClientState::$curlVersion['version_number']) {
// DNS cache removals require curl 7.42 or higher
$this->multi->reset();
$multi->reset();
}

foreach ($options['resolve'] as $host => $ip) {
$resolve[] = null === $ip ? "-$host:$port" : "$host:$port:$ip";
$this->multi->dnsCache->hostnames[$host] = $ip;
$this->multi->dnsCache->removals["-$host:$port"] = "-$host:$port";
$multi->dnsCache->hostnames[$host] = $ip;
$multi->dnsCache->removals["-$host:$port"] = "-$host:$port";
}

$curlopts[\CURLOPT_RESOLVE] = $resolve;
Expand Down Expand Up @@ -281,16 +290,16 @@ public function request(string $method, string $url, array $options = []): Respo
$curlopts += $options['extra']['curl'];
}

if ($pushedResponse = $this->multi->pushedResponses[$url] ?? null) {
unset($this->multi->pushedResponses[$url]);
if ($pushedResponse = $multi->pushedResponses[$url] ?? null) {
unset($multi->pushedResponses[$url]);

if (self::acceptPushForRequest($method, $options, $pushedResponse)) {
$this->logger && $this->logger->debug(sprintf('Accepting pushed response: "%s %s"', $method, $url));

// Reinitialize the pushed response with request's options
$ch = $pushedResponse->handle;
$pushedResponse = $pushedResponse->response;
$pushedResponse->__construct($this->multi, $url, $options, $this->logger);
$pushedResponse->__construct($multi, $url, $options, $this->logger);
} else {
$this->logger && $this->logger->debug(sprintf('Rejecting pushed response: "%s"', $url));
$pushedResponse = null;
Expand All @@ -300,7 +309,7 @@ public function request(string $method, string $url, array $options = []): Respo
if (!$pushedResponse) {
$ch = curl_init();
$this->logger && $this->logger->info(sprintf('Request: "%s %s"', $method, $url));
$curlopts += [\CURLOPT_SHARE => $this->multi->share];
$curlopts += [\CURLOPT_SHARE => $multi->share];
}

foreach ($curlopts as $opt => $value) {
Expand All @@ -310,7 +319,7 @@ public function request(string $method, string $url, array $options = []): Respo
}
}

return $pushedResponse ?? new CurlResponse($this->multi, $ch, $options, $this->logger, $method, self::createRedirectResolver($options, $host), CurlClientState::$curlVersion['version_number']);
return $pushedResponse ?? new CurlResponse($multi, $ch, $options, $this->logger, $method, self::createRedirectResolver($options, $host), CurlClientState::$curlVersion['version_number']);
}

/**
Expand All @@ -324,9 +333,11 @@ public function stream($responses, ?float $timeout = null): ResponseStreamInterf
throw new \TypeError(sprintf('"%s()" expects parameter 1 to be an iterable of CurlResponse objects, "%s" given.', __METHOD__, get_debug_type($responses)));
}

if (\is_resource($this->multi->handle) || $this->multi->handle instanceof \CurlMultiHandle) {
$multi = $this->ensureState();

if (\is_resource($multi->handle) || $multi->handle instanceof \CurlMultiHandle) {
$active = 0;
while (\CURLM_CALL_MULTI_PERFORM === curl_multi_exec($this->multi->handle, $active)) {
while (\CURLM_CALL_MULTI_PERFORM === curl_multi_exec($multi->handle, $active)) {
}
}

Expand All @@ -335,7 +346,9 @@ public function stream($responses, ?float $timeout = null): ResponseStreamInterf

public function reset()
{
$this->multi->reset();
if (isset($this->multi)) {
$this->multi->reset();
}
}

/**
Expand Down Expand Up @@ -439,6 +452,16 @@ private static function createRedirectResolver(array $options, string $host): \C
};
}

private function ensureState(): CurlClientState
{
if (!isset($this->multi)) {
$this->multi = new CurlClientState($this->maxHostConnections, $this->maxPendingPushes);
$this->multi->logger = $this->logger;
}

return $this->multi;
}

private function findConstantName(int $opt): ?string
{
$constants = array_filter(get_defined_constants(), static function ($v, $k) use ($opt) {
Expand Down
4 changes: 2 additions & 2 deletions src/Symfony/Component/HttpClient/Tests/CurlHttpClientTest.php
Expand Up @@ -63,9 +63,9 @@ public function testHandleIsReinitOnReset()
{
$httpClient = $this->getHttpClient(__FUNCTION__);

$r = new \ReflectionProperty($httpClient, 'multi');
$r = new \ReflectionMethod($httpClient, 'ensureState');
$r->setAccessible(true);
$clientState = $r->getValue($httpClient);
$clientState = $r->invoke($httpClient);
$initialShareId = $clientState->share;
$httpClient->reset();
self::assertNotSame($initialShareId, $clientState->share);
Expand Down

0 comments on commit 0a7825b

Please sign in to comment.