100% of CPU usage on curl_multi_exec #756
When sending requests in parallel to a low-latency server, you will get higher CPU utilization because the select calls made by Guzzle to curl will return almost immediately. This is because you're basically in a tight loop, and the only way to use less CPU would be to force a sleep. Are you on a fairly low-latency connection to the server? How many requests are you running in parallel? Are you uploading data, and if so, how much on average per request? What's the average size of the response bodies you are downloading?
The servers are on EC2 in the same region, so latency is low (one server sends a group of POST requests to 3 other servers), e.g. to server 1, to server 2, and to server 3. What changes between requests are the parameters.
Thanks for the information. Yeah, that's a low-latency connection. I've been playing around with the curl handling, but I don't think there's anything wrong with what's implemented in Guzzle. To prove this, I implemented what Guzzle does in a very basic, boiled-down script that uses curl directly with no abstractions. This boiled-down script also results in nearly 100% CPU utilization when running 50 requests in parallel. When I reduced this to 2 in parallel, it resulted in around 60-70% CPU.

```php
// Yield a bunch of requests
$handleGen = function () {
    for ($i = 0; $i < 100000; $i++) {
        $h = curl_init();
        curl_setopt_array($h, [
            CURLOPT_URL            => 'http://localhost:8125/guzzle-server/perf',
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => 'foo baz bar',
            CURLOPT_CONNECTTIMEOUT => 150,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HEADER         => true,
        ]);
        yield $h;
    }
};

$mh = curl_multi_init();
$gener = $handleGen();

// Add initial handles to keep a pool size of 50 requests
foreach (new LimitIterator($gener, 0, 50) as $h) {
    curl_multi_add_handle($mh, $h);
}

// Process completed transfers and add more handles if the generator has any left
function process_messages($mh, \Iterator $gener)
{
    $addedRequest = false;
    while ($done = curl_multi_info_read($mh)) {
        curl_multi_remove_handle($mh, $done['handle']);
        curl_close($done['handle']);
        $gener->next();
        if ($gener->current()) {
            echo '.';
            $addedRequest = true;
            curl_multi_add_handle($mh, $gener->current());
        }
    }

    return $addedRequest;
}

do {
    do {
        $mrc = curl_multi_exec($mh, $active);
    } while ($mrc === CURLM_CALL_MULTI_PERFORM);

    if (process_messages($mh, $gener)) {
        $active = true;
    }

    // curl_multi_select() returns -1 when there is nothing to wait on;
    // sleep briefly so the loop does not spin at full speed.
    if (curl_multi_select($mh, 1) === -1) {
        usleep(250);
    }
} while ($active);

curl_multi_close($mh);
```
Thanks for the answer and the tests! I implemented a similar test here and got the same behavior. So I think I need to implement it as a queue of parallel processes.
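A batched queue along those lines might be sketched as follows. This is only a sketch: `send_in_batches()` is a hypothetical helper (not part of Guzzle), the `sendAll()` call follows Guzzle 4/5's API, and the batch size and pause are illustrative guesses, not tuned values.

```php
/**
 * Sketch: send $requests in fixed-size batches with a pause between
 * batches, so the process yields the CPU instead of spinning in a
 * tight select loop the whole time.
 */
function send_in_batches($client, array $requests, int $batchSize = 25, int $pauseUsec = 10000): int
{
    $batches = 0;
    foreach (array_chunk($requests, $batchSize) as $batch) {
        // Assumes a Guzzle 4/5 ClientInterface-style sendAll()
        $client->sendAll($batch, ['parallel' => count($batch)]);
        usleep($pauseUsec); // give the CPU a breather between batches
        $batches++;
    }

    return $batches;
}
```

The trade-off is throughput: each pause adds latency between batches, but the process no longer monopolizes a core while the select loop spins.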
No problem. I also had a conversation on Twitter with cURL's creator, and we came to the same conclusion: https://twitter.com/bagder/status/495320333549711360. I think you could possibly work around this by adding a usleep() to the iterator that yields the parallel requests. Something like:

```php
$gen = function (ClientInterface $client, $total, $url, $body, $skip) {
    for ($i = 0; $i < $total; $i++) {
        // After the first $skip requests, pause briefly before yielding
        // the next one so the send loop is not completely CPU-bound.
        if ($i > $skip) {
            usleep(200);
        }
        yield $client->createRequest('PUT', $url, ['body' => $body]);
    }
};

$client->sendAll(
    $gen($client, $total, $url, $body, 50),
    ['parallel' => $parallel]
);
```

You could even get more fancy and poll the system load average and sleep based on that: http://php.net/manual/en/function.sys-getloadavg.php. I was able to reduce CPU by adding CURLOPT_MAX_SEND_SPEED_LARGE and CURLOPT_MAX_RECV_SPEED_LARGE as well. All that said, there's nothing that can or should be done in Guzzle by default to work around this.
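A load-based throttle like the one suggested could be sketched as below. `throttle_delay()` and its thresholds are hypothetical, not anything from Guzzle; the idea is just to sleep longer as the 1-minute load average rises relative to the core count.

```php
// Hypothetical throttle: map the 1-minute load average (first element of
// sys_getloadavg(), which is unavailable on Windows) to a usleep() duration.
function throttle_delay(float $load1, int $cores = 1, int $maxUsec = 5000): int
{
    $pressure = $load1 / max(1, $cores); // normalized load per core

    if ($pressure < 0.5) {
        return 0; // plenty of headroom, no sleep needed
    }

    // Scale linearly up to $maxUsec once load reaches 2x the core count.
    return (int) min($maxUsec, $maxUsec * ($pressure - 0.5) / 1.5);
}

// Inside the request generator, something like:
//   $load = sys_getloadavg();
//   usleep(throttle_delay($load[0], 4));
```

Polling sys_getloadavg() on every yield is cheap, but since the load average only updates every few seconds, caching the value between polls would also be reasonable.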
I'll try to apply the workaround and analyze the results; some delay should resolve it well. When I have a solution I'll send you an e-mail to keep you informed about the situation. Thank you for your help!
@mtdowling we are having a similar issue. What were your values for CURLOPT_MAX_SEND_SPEED_LARGE and CURLOPT_MAX_RECV_SPEED_LARGE? I'm having a hard time understanding how CURLOPT_MAX_RECV_SPEED_LARGE works.
Have you tried setting CURLOPT_MAX_RECV_SPEED_LARGE to 1024?
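For context on why the speed caps can reduce CPU here (a sketch; the URL and the 1024 B/s values are illustrative, not recommendations): as I understand it, when a per-handle speed limit is set, libcurl pauses the rate-limited transfer internally, so curl_multi_select() has a real timeout to wait on instead of returning immediately.

```php
// Sketch: cap per-handle transfer speed, in bytes per second.
$h = curl_init('http://example.com/');
curl_setopt_array($h, [
    CURLOPT_RETURNTRANSFER       => true,
    CURLOPT_MAX_SEND_SPEED_LARGE => 1024, // upload at most ~1 KiB/s
    CURLOPT_MAX_RECV_SPEED_LARGE => 1024, // download at most ~1 KiB/s
]);
```

The obvious cost is that every transfer is now rate-limited, so this only makes sense when the workload tolerates slower individual requests.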
Hi,
I'm getting 100% CPU usage when executing parallel requests with sendAll(). This locks up the CPU and the whole system fails (screenshot attached).