New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Low performance at the driver level at some times #1327
Comments
Assuming these requests are spread across various web workers, you might expect different timings depending on whether the request is able to re-use an existing libmongoc client from a previous request (see: Connection Handling). The libmongoc clients are hashed based on the Manager constructor arguments, so based on the script you provided any worker running this script beyond the first time would be able to re-use a client. Drivers implement server discovery and monitoring (SDAM), which adds a bit of overhead. This is particularly noticeable when a new client object is created, as the driver needs to discover all members of the topology. And PHP, by its use of the single-threaded libmongoc client, has more impact on the application since SDAM cannot be offloaded to a background thread and must be performed during application requests. The Manager constructor performs no IO apart from SRV lookups (not relevant here), so in your example script we would expect to see a delay in the first call to I will also note that the internal architecture of the If you would like to analyze performance more meaningfully and actually try to determine an issue with the PHP driver, I would suggest the following:
|
Thanks.
Mongodb
Ok, I tested it, but I don't think it made a difference.
import pymongo
import time
from bson import ObjectId
t0 = time.time()
c = pymongo.MongoClient("mongodb://user:pass@192.168.1.2,192.168.1.3,192.168.1.4/?authSource=MyDb&replicaSet=rs1&compressors=zstd")
print(time.time() - t0);
db = c["MyDb"]
col = db["sessions"]
t1 = time.time()
doc = col.find_one({"_id" : ObjectId("6295fdd2049dcd5aca422688")})
print((time.time() - t1));
print(doc) output :
<?php
class QueryTimeCollector implements \MongoDB\Driver\Monitoring\CommandSubscriber {
static $benchmark = [];
public function commandStarted( \MongoDB\Driver\Monitoring\CommandStartedEvent $event ) {
self::$benchmark[$event->GetRequestId()] = hrtime(true);
$server = $event->getServer();
echo "**commandStarted** \n";
echo "getCommandName : {$event->getCommandName()}\n";
echo "getDatabaseName : {$event->getDatabaseName()}\n";
echo "getOperationId : {$event->getOperationId()}\n";
echo "getRequestId : {$event->getRequestId()}\n";
echo "getServiceId : {$event->getServiceId()}\n";
echo "server info :\n";
echo "-- getHost : {$server->getHost()}\n";
echo "-- getLatency : {$server->getLatency()}\n";
echo "-- getPort : {$server->getPort()}\n";
$desc = $server->getServerDescription();
echo "-- getInfo :\n";
var_dump($server->getInfo());
echo "-- getServerDescription\n";
echo "----- getHost : {$desc->getHost()}\n";
echo "----- getLastUpdateTime : {$desc->getLastUpdateTime()}\n";
echo "----- getPort : {$desc->getPort()}\n";
echo "----- getRoundTripTime : {$desc->getRoundTripTime()}\n";
echo "----- getType : {$desc->getType()}\n";
echo "----- getHelloResponse:\n";
var_dump($desc->getHelloResponse());
echo "-- getTags :\n";
var_dump($server->getTags());
echo "-- getType : {$server->getType()}\n";
echo "-- isArbiter : {$server->isArbiter()}\n";
echo "-- isHidden : {$server->isHidden()}\n";
echo "-- isPassive : {$server->isPassive()}\n";
echo "-- isPrimary : {$server->isPrimary()}\n";
echo "-- isSecondary : {$server->isSecondary()}\n";
echo "\n\n\n";
}
public function commandSucceeded( \MongoDB\Driver\Monitoring\CommandSucceededEvent $event ) {
echo "**commandSucceeded** \n";
echo "php time:" . round((hrtime(true) - self::$benchmark[$event->GetRequestId()]) / 1000000,3) . "\n";
echo "getCommandName : {$event->getCommandName()}\n";
echo "getDurationMicros : {$event->getDurationMicros()}\n";
echo "getOperationId : {$event->getOperationId()}\n";
echo "getRequestId : {$event->getRequestId()}\n";
echo "getServiceId : {$event->getServiceId()}\n";
echo "getReply :\n";
var_dump($event->getReply());
$server = $event->getServer();
echo "server info :\n";
echo "-- getHost : {$server->getHost()}\n";
echo "-- getLatency : {$server->getLatency()}\n";
echo "-- getPort : {$server->getPort()}\n";
$desc = $server->getServerDescription();
echo "-- getInfo :\n";
var_dump($server->getInfo());
echo "-- getServerDescription\n";
echo "----- getHost : {$desc->getHost()}\n";
echo "----- getLastUpdateTime : {$desc->getLastUpdateTime()}\n";
echo "----- getPort : {$desc->getPort()}\n";
echo "----- getRoundTripTime : {$desc->getRoundTripTime()}\n";
echo "----- getType : {$desc->getType()}\n";
echo "----- getHelloResponse:\n";
var_dump($desc->getHelloResponse());
echo "-- getTags :\n";
var_dump($server->getTags());
echo "-- getType : {$server->getType()}\n";
echo "-- isArbiter : {$server->isArbiter()}\n";
echo "-- isHidden : {$server->isHidden()}\n";
echo "-- isPassive : {$server->isPassive()}\n";
echo "-- isPrimary : {$server->isPrimary()}\n";
echo "-- isSecondary : {$server->isSecondary()}\n";
echo "\n\n\n";
}
public function commandFailed( \MongoDB\Driver\Monitoring\CommandFailedEvent $event ){}
}
class SDAMTimeCollector implements \MongoDB\Driver\Monitoring\SDAMSubscriber{
static $benchmark = [
'serverOpening' => [],
'topologyOpening' => [],
'serverHeartbeatStarted' => [],
];
public function serverOpening(\MongoDB\Driver\Monitoring\ServerOpeningEvent $event){
$tid = (string)$event->getTopologyId();
echo "serverOpening : $tid\n";
self::$benchmark['serverOpening'][(string)$event->getTopologyId()] = hrtime(true);
}
public function serverChanged(\MongoDB\Driver\Monitoring\ServerChangedEvent $event){
echo "ServerChanged : {$event->getTopologyId()}\n";
}
public function serverClosed(\MongoDB\Driver\Monitoring\ServerClosedEvent $event){
$tid = (string)$event->getTopologyId();
echo
"ServerClosed($tid) :".
round((hrtime(true) - self::$benchmark['serverOpening'][$tid]) / 1000000,3).
"\n"
;
}
public function serverHeartbeatFailed(\MongoDB\Driver\Monitoring\ServerHeartbeatFailedEvent $event){}
public function serverHeartbeatStarted(\MongoDB\Driver\Monitoring\ServerHeartbeatStartedEvent $event){
echo "serverHeartbeatStarted\n";
self::$benchmark['serverHeartbeatStarted'][] = hrtime(true);
}
public function serverHeartbeatSucceeded(\MongoDB\Driver\Monitoring\ServerHeartbeatSucceededEvent $event){
echo
'serverHeartbeatSucceeded :'.
round((hrtime(true) - array_pop(self::$benchmark['serverHeartbeatStarted'])) / 1000000,3).
"\n"
;
}
public function topologyOpening(MongoDB\Driver\Monitoring\TopologyOpeningEvent $event){
$tid = (string)$event->getTopologyId();
echo "topologyOpening : $tid\n";
self::$benchmark['topologyOpening'][$tid] = hrtime(true);
}
public function topologyChanged(\MongoDB\Driver\Monitoring\TopologyChangedEvent $event){
echo "topologyChanged : {$event->getTopologyId()}\n";
}
public function topologyClosed(\MongoDB\Driver\Monitoring\TopologyClosedEvent $event){
$tid = (string)$event->getTopologyId();
echo
"topologyClosed($tid) :".
round((hrtime(true) - self::$benchmark['topologyOpening'][$tid]) / 1000000,3).
"\n"
;
}
}
\MongoDB\Driver\Monitoring\addSubscriber(new QueryTimeCollector());
\MongoDB\Driver\Monitoring\addSubscriber(new SDAMTimeCollector());
$firstTime1 = hrtime(true);
$mng = new \MongoDB\Driver\Manager(
'mongodb://user:pass@192.168.1.2,192.168.1.3,192.168.1.4/?authSource=MyDb&replicaSet=rs1&compressors=zstd',
[]
//,['disableClientPersistence' => true]
);
$endTime1 = hrtime(true) - $firstTime1;
$firstTime2 = hrtime(true);
//$mng->selectServer($mng->getReadPreference());
$endTime2 = hrtime(true) - $firstTime2;
$firstTime3 = hrtime(true);
$cursor = $mng->executeQuery(
'MyDb.sessions',
new \MongoDB\Driver\Query([
'_id' => new MongoDB\BSON\ObjectId('6295fdd2049dcd5aca422688')
])
);
$endTime3 = hrtime(true) - $firstTime3;
$cursor->SetTypeMap([
'root' => 'array',
'document' => 'array'
]);
$cursor->rewind();
$firstTime4 = hrtime(true);
$row = $cursor->current();
$endTime4 = hrtime(true) - $firstTime4;
// var_dump($row);
var_dump('endTime1: ' . (round($endTime1 / 1000000,3)));
var_dump('endTime2: ' . (round($endTime2 / 1000000,3)));
var_dump('endTime3: ' . (round($endTime3 / 1000000,3)));
var_dump('endTime4: ' . (round($endTime4 / 1000000,3))); Outputs :With
|
I've chosen to use milliseconds for the following measurements. I'll also note that PyMongo's MongoClient the PHP driver's Manager class are equivalent despite the different names.
Constructing the client object is very fast, which makes sense as no IO is performed (since you're not using a I expect PyMongo immediately spawns a background thread for SDAM, but since the first operation you perform in the Python script is a I've included all four of your PHP variations for completeness, but there is no noticeably difference between them. Since the script itself only constructs a single Manager instance, I also rewrote your original Python and PHP scripts while investigating this, which I've published in https://gist.github.com/jmikola/dfcad9bc4e512b22dbb04beed4dc0a99. The main changes were:
The full scripts are in the Gist above, but the results are reproduced below:
Both scripts were run against a local 6.0.0-rc8 replica set (one primary and one secondary) w/o authentication. AFAIK, your scripts were run against a replica set w/ authentication. These results aren't far from your own measurements. Constructing a client object appears to be faster in PHP (irrelevant since no IO is performed). In the PHP script, I also included an additional call to Given your original measurements and my own above, I see no indication that the PHP driver is behaving incorrectly. |
From web server :
From CLI :
In python :
|
In the "web server" example you shared, The query execution is still comparable to what you observed with PyMongo. I don't have an answer for why the CLI script's I realize I overlooked two possible factors in my original gist:
In my tests, the collection is empty apart from the one document we are deleting, inserting, and querying. That is something to consider if you were running this against an existing collection with data. The gist has been updated and my most recent results follow:
I did notice that running both scripts quickly in succession resulted in a much shorter timing being reported for Manager construction (other timings were not affected). This was likely due to libmongoc using a cached DNS result from the first lookup performed by PyMongo. For the above timings, I waited a bit between both scripts to demonstrate comparable DNS lookup times across both drivers. Here, the 379ms spent on the first The subsequent timings for PyMongo's Lastly, I should point out that since we're benchmarking here it's important not to focus on results from a single run, where there may be anomalies due to network latency. In my local testing, I ran the PHP script dozens of times and noted the measurements for Taking a step back and comparing the average results for both drivers against a common, remote server (Atlas free tier in this case) does not reveal any significant differences for me. |
jmikola i found a clue ! In my first comment, I talked about the following discussion :
In the last test, I did not complete the test. So I tested on the web server more than once: First request
Second request
Third request
Forth request (the clue is here)
I think the issue is in the Heartbeat. |
@jmikola I tested on localhost on my server (with drop web server 1 from replica set) and computer(by xampp application) on single instance of mongod but there is still this problem and this delay in heartbeat occurs. The important point in testing on xampp is that there is a delay in the first execution of the php script on the Apache web server, and in subsequent requests (for example, the next 10) there is no delay, and again after a while There may be another minute delay, but what is clear is that the number of occurrences of this delay is much lower than the server. This delay is always present in the computer's php console environment. |
When you say "there may be another minute delay", do you mean that a delay in the hearbeat event (e.g. 175ms instead of the typical 2ms) reoccurs after one minute has elapsed? If the delay occurs after one minute has elapsed, I expect that is tied to the heartbeat interval, which defaults to 60 seconds for single-threaded drivers (see: heartbeatFrequencyMS defaults to 10 seconds or 60 seconds).
If each test is running in a fresh CLI process, no state is inherited (unlike a PHP worker in a web environment, which might re-use a persisted libmongoc client), so I'd expect each process to start with a series of heartbeats as the topology is discovered.
In your most recent testing, is the delay only observable with the I'd also be curious to see what results you observe when connecting to a free tier Atlas cluster. If we can observe an error there, it would rule out any issues in your local MongoDB deployment and assure me that there is actually an issue with the driver or libmongoc. |
Yes you are right. Also this delay occurs on the product server(
According to what you said before, exactly.
No, In my last test, I removed
I tested this script in the free layer Atlas cluster with the following connection string:
Result from web server enviorment :First request
Second request (after a few second)
Third request (after a few second)
Result from CLI enviorment :
There still seems to be a problem with the heartbeat. |
The only processing happening in the PHP driver itself is going to be BSON conversion and invocation of event subscribers, neither of which would account for multiple seconds. The only other thing I can think of that would potentially add multiple seconds to these initial requests might be DNS resolution. That would not be included in the event durations reported by libmongoc, but would certainly appear in the timings we're generating from the event subscriber methods. AFAIK, there's no way to avoid DNS resolution for Atlas, so for this next experiment we should return to testing a local replica set. I'd like to repeat the replica set experiment without DNS resolution (i.e. using IP addresses for everything). Please do the following:
Repeat the tests through the web server environment, and stagger them in the same fashion so one of the later requests occurs after the point where we'd expect additional heartbeats to be issued (at least one minute from the first request within the same web worker process). |
That's right, the major speed problem was solved. But I still feel the speed can get better. I change the server name to ip address in two point :
I used this javascript test code: let
output = '',
maxCount = 0,
count = 0
;
function Test(start){
if(start){
output = '';
maxCount = count = start;
}
count--;
$.get('https://example.com/test.php').done(function(response){
output += `Response ${maxCount - count} : \n ${response}\n\n`;
if(count !== 0)
setTimeout(Test,1000);
else
$('body').html(output);
});
}
Test(180); output
Server info :
|
If so, that hints at DNS being at least one of the culprits here. It's not clear to me if the JavaScript code was hitting a single web server or if requests were being distributed between two, but I don't think that makes much difference. Grepping the output for heartbeats beyond the norm shows the following:
The fact that libmongoc's calculated event durations fall very closely in line with the Offhand, I'm not sure if Logging Slow Operations applies to If it turns out the server isn't spending a lot of time responding to |
What happened?
I have a problem with mongodb and I can not get the best performance.
Sometime my one query have a long delay(10ms or 50ms or 200ms) in the php program.
Environment
Testing on a
web server 1
https://example.com/test.php
Results
Problems with the driver?
I did the following to find the problem area and came to the conclusion that the problem should be with the driver.
I ran the following query in the secondary node mongodb shell in the
web server 1
.mongo 'mongodb://user:pass@192.168.1.2,192.168.1.3,192.168.1.4/?authSource=MyDb&replicaSet=rs1&compressors=zstd'
Thanks.
The text was updated successfully, but these errors were encountered: