No suitable servers found #791

Closed
RileyJia opened this issue Apr 2, 2018 · 16 comments

@RileyJia

RileyJia commented Apr 2, 2018

Description

The message is:

Fatal error: Uncaught exception 'MongoDB\Driver\Exception\ConnectionTimeoutException' with message 'No suitable servers found (`serverSelectionTryOnce` set): [connection timeout calling ismaster on 'cluster0-shard-00-00-b2gpc.mongodb.net:27017'] [connection timeout calling ismaster on 'cluster0-shard-00-01-b2gpc.mongodb.net:27017'] [TLS handshake failed: -9806 calling ismaster on 'cluster0-shard-00-02-b2gpc.mongodb.net:27017']' in /Users/jia/Sites/vendor/mongodb/mongodb/src/Collection.php:515
Stack trace:
#0 /Users/jia/Sites/vendor/mongodb/mongodb/src/Collection.php(515): MongoDB\Driver\Manager->selectServer(Object(MongoDB\Driver\ReadPreference))
#1 /Users/jia/Sites/testPem.php(25): MongoDB\Collection->find()
#2 {main} thrown in /Users/jia/Sites/vendor/mongodb/mongodb/src/Collection.php on line 515

It doesn't happen every time, and I can't figure out what causes it. The code is:

<?php
 
require 'vendor/autoload.php';
 
$server = "mongodb://csci571:<password>@cluster0-shard-00-00-b2gpc.mongodb.net:27017,cluster0-shard-00-01-b2gpc.mongodb.net:27017,cluster0-shard-00-02-b2gpc.mongodb.net:27017/test?ssl=true&replicaSet=Cluster0-shard-0&authSource=admin";
 
$ctx = stream_context_create(array(
    "ssl" => array(
        "cafile"            => "/usr/local/etc/openssl/cert.pem",
        "allow_self_signed" => false,
        "verify_peer"       => true, 
        "verify_peer_name"  => true,
        "verify_expiry"     => true, 
    ),
  )
);
 
$client = new MongoDB\Client(
        $server,
        array("ssl" => true),
        array("context" => $ctx));
 
$database = $client->test;
$collection = $database->student;
$cursor = $collection->find(['mail' => '']);

function console_log( $data ){   
  echo '<script>';
  echo 'console.log('. json_encode( $data ) .')';
  echo '</script>';
}
?>

This is my first time using MongoDB, and I can't find where the log is.

@derickr
Contributor

derickr commented Apr 3, 2018

Hi!

When you say "it's not happening all the time", could you be more precise? Does it happen once out of a hundred times, or once out of every 3?

Which versions of PHP and the MongoDB driver are you using? You can use phpversion() and phpversion("mongodb") to find out.

@RileyJia
Author

RileyJia commented Apr 3, 2018

If I connect to the server as in the code I provided above (the $server line), it fails about 4 times out of 5.
I then searched for solutions and appended &connectTimeoutMS=2000&readPreference=primaryPreferred to the URI (I don't know why it helps), which reduced the failure rate to about 1 out of 5. That is still not a good solution.

Some version info:

PHP Version => 5.6.35
mongodb

MongoDB support => enabled
MongoDB extension version => 1.3.4
MongoDB extension stability => stable
libbson headers version => 1.9.2
libbson library version => 1.9.3
libmongoc headers version => 1.9.2
libmongoc library version => 1.9.3
libmongoc SSL => enabled
libmongoc SSL library => Secure Transport
libmongoc crypto => enabled
libmongoc crypto library => Common Crypto
libmongoc crypto system profile => disabled
libmongoc SASL => enabled

Directive => Local Value => Master Value
mongodb.debug => no value => no value

When I enter "php -v" in my terminal, it says

PHP Warning:  PHP Startup: Unable to load dynamic library '/usr/local/opt/php56-mongodb/mongodb.so' - dlopen(/usr/local/opt/php56-mongodb/mongodb.so, 9): image not found in Unknown on line 0

Warning: PHP Startup: Unable to load dynamic library '/usr/local/opt/php56-mongodb/mongodb.so' - dlopen(/usr/local/opt/php56-mongodb/mongodb.so, 9): image not found in Unknown on line 0
PHP 5.6.35 (cli) (built: Mar 31 2018 20:21:31) 

However, I already copied mongodb.so to the extension directory inside the php56 directory, and set the following in php.ini:
extension=/usr/local/Cellar/php56/5.6.33_9/lib/php/extensions/no-debug-non-zts-20131226/mongodb.so

I am not sure why it still looks for the original mongodb.so, because I already deleted it.
I used Homebrew to install everything, and this warning didn't appear when I first installed it. So I figured that reinstalling php56-mongodb would make the warning disappear. But right now when I type
brew install php56-mongodb

it says

Error: No available formula with the name "php56-mongodb" 
==> Searching for a previously deleted formula (in the last month)...
Warning: homebrew/core is shallow clone. To get complete history run:
  git -C "$(brew --repo homebrew/core)" fetch --unshallow

Error: No previously deleted formula found.
==> Searching for similarly named formulae...
==> Searching local taps...
Error: No similarly named formulae found.
==> Searching taps...
==> Searching taps on GitHub...
Error: No formulae found in taps.

@jmikola
Member

jmikola commented Apr 9, 2018

Connection Error

If we break down the original connection error, we see that server selection fails with the following individual errors for each node in the connection string's seed list:

  • connection timeout calling ismaster on 'cluster0-shard-00-00-b2gpc.mongodb.net:27017'
  • connection timeout calling ismaster on 'cluster0-shard-00-01-b2gpc.mongodb.net:27017'
  • TLS handshake failed: -9806 calling ismaster on 'cluster0-shard-00-02-b2gpc.mongodb.net:27017'

Connection timeout is indicative of a generic socket timeout; however, I don't think lowering connectTimeoutMS from its 10000 default to 2000 would help with that. If the driver can't reach the server in 10 seconds, lowering the timeout to two seconds won't improve the situation. That said, it is advisable to tune connection timeout to the smallest value your deployment can tolerate (e.g. maximum expected latency plus a bit of overhead), as that will improve application responsiveness in the event of downtime (rather than block while waiting for connections to an inaccessible server to time out).
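To make the tuning advice concrete, here is a minimal sketch of deriving a connectTimeoutMS value and appending it to a connection string. The helper name withUriOptions, the example hosts, and the latency figures are all assumptions for illustration; only the connectTimeoutMS option itself comes from the discussion above.

```php
<?php
// Hypothetical helper (not part of the driver): append options to a MongoDB
// connection string without disturbing the options already present.
// Assumes the URI already contains the "/database" part before any "?".
function withUriOptions($uri, array $options)
{
    $separator = (strpos($uri, '?') === false) ? '?' : '&';

    return $uri . $separator . http_build_query($options);
}

// Placeholder hosts; substitute your own seed list.
$uri = 'mongodb://host1.example.com:27017,host2.example.com:27017/test?ssl=true&replicaSet=rs0';

// Assumed figures: 1500 ms worst-case latency plus 500 ms of overhead.
$connectTimeoutMS = 1500 + 500;

echo withUriOptions($uri, ['connectTimeoutMS' => $connectTimeoutMS]), "\n";
```

Keeping the calculation next to the URI makes it easier to revisit the timeout when the measured latency changes.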

TLS handshake failure is more interesting and originates from mongoc_stream_tls_secure_transport_handshake(), since phpinfo() output indicates that you're using Secure Transport for TLS. Deciphering the -9806 error code requires us to consult the documentation for SSLHandshake() and Secure Transport Result Codes. This corresponds to errSSLClosedAbort, which is briefly described as:

The connection closed due to an error.

I'm not familiar with the inner workings of Secure Transport, but I expect this may be another manifestation of the connection timeout experienced with the other two nodes, but occurring during the TLS handshake process. Best I can tell, the only problem here is a connection error between your application server and the Atlas server.

As an aside, I've opened CDRIVER-2602 to request that libmongoc translate the codes on its own -- which should save us some time in the future.
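Until the C driver translates these codes itself, the breakdown above can be approximated in userland. The following is only a sketch: both function names are hypothetical, the regular expression assumes the message format shown in the original report, and the lookup table lists just a few Secure Transport result codes rather than the full set.

```php
<?php
// Hypothetical helpers for illustration; neither is part of the driver.

// Split a server selection error into host => reason pairs. Each per-host
// failure is reported as: [<reason> calling ismaster on '<host:port>']
function parseServerSelectionErrors($message)
{
    $errors = [];
    $pattern = "/\\[(.+?) calling ismaster on '([^']+)'\\]/";

    if (preg_match_all($pattern, $message, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $errors[$match[2]] = $match[1];
        }
    }

    return $errors;
}

// Translate a Secure Transport result code embedded in a reason string.
// Only a handful of codes are listed; unknown codes yield null.
function describeSecureTransportCode($reason)
{
    $names = [
        -9800 => 'errSSLProtocol',
        -9805 => 'errSSLClosedGraceful',
        -9806 => 'errSSLClosedAbort',
    ];

    if (preg_match('/TLS handshake failed: (-\d+)/', $reason, $m)) {
        $code = (int) $m[1];

        return isset($names[$code]) ? $names[$code] : null;
    }

    return null;
}

$message = "No suitable servers found (`serverSelectionTryOnce` set): "
    . "[connection timeout calling ismaster on 'cluster0-shard-00-00-b2gpc.mongodb.net:27017'] "
    . "[connection timeout calling ismaster on 'cluster0-shard-00-01-b2gpc.mongodb.net:27017'] "
    . "[TLS handshake failed: -9806 calling ismaster on 'cluster0-shard-00-02-b2gpc.mongodb.net:27017']";

foreach (parseServerSelectionErrors($message) as $host => $reason) {
    $name = describeSecureTransportCode($reason);
    echo $host, ': ', $reason, $name !== null ? " ($name)" : '', "\n";
}
```

In practice the message would come from the caught exception's getMessage() rather than a hard-coded string.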

Observations

I do have a few observations about your script. Since your connection URI includes ssl=true, specifying ['ssl' => true] as the second options array to the Client constructor is redundant. Additionally, most of the SSL Context Options you specified are not actually used by the driver. If you consider the MongoDB\Driver\Manager::__construct() documentation for the third, $driverOptions array, only cafile and allow_self_signed are being used. This can be attested by examining the logic in php_phongo_make_ssl_opt() within the extension. I realize it's written in C, but the logic for checking the primary and fallback context options should be readable and jibe with what we read in the documentation.
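As a rough illustration of that observation, the sketch below reduces the SSL context array from the original script to the equivalent driver options. The function name is hypothetical, and the alias table covers only the two options named in this thread (cafile as ca_file, allow_self_signed as weak_cert_validation); consult the Manager constructor documentation for the complete list.

```php
<?php
// Hypothetical helper: keep only the SSL context options discussed in this
// thread and rename them to their driver option equivalents. Everything else
// (verify_peer, verify_peer_name, verify_expiry, ...) is dropped.
function contextToDriverOptions(array $sslContext)
{
    $aliases = [
        'cafile'            => 'ca_file',
        'allow_self_signed' => 'weak_cert_validation',
    ];

    $driverOptions = [];
    foreach ($aliases as $contextKey => $driverKey) {
        if (array_key_exists($contextKey, $sslContext)) {
            $driverOptions[$driverKey] = $sslContext[$contextKey];
        }
    }

    return $driverOptions;
}

// The "ssl" context array from the original script.
$sslContext = [
    'cafile'            => '/usr/local/etc/openssl/cert.pem',
    'allow_self_signed' => false,
    'verify_peer'       => true,
    'verify_peer_name'  => true,
    'verify_expiry'     => true,
];

var_dump(contextToDriverOptions($sslContext));
```

The resulting array could then be passed as the third argument to the Client constructor in place of the "context" entry.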

Lastly, I find it a bit curious that you're using driver version 1.3.4 but libmongoc/libbson 1.9.3. The 1.3.4 driver bundles libmongoc 1.8.2. It's clear that you're using libmongoc and libbson as system libraries, but I wonder why you've not upgraded to driver version 1.4.2 in this case (which bundles libmongoc 1.9.3). That said, there should be no issues using a newer version of libmongoc/libbson with the older driver, since the shared libraries preserve ABI.

Installation Issues

Other users have reported similar issues installing the driver with Homebrew (Homebrew/brew#4021). I believe this is related to https://github.com/Homebrew/homebrew-php being recently deprecated, so we may need to update the macOS and Homebrew install docs to promote PECL as the correct install mechanism. That said, there appear to be some challenges with the move to PECL (Homebrew/homebrew-core#26108), so we'll have to consider that as well before rewriting the install docs.

@RileyJia
Author

RileyJia commented Apr 10, 2018

I filed an issue at https://jira.mongodb.org/browse/MMSSUPPORT-19732 several weeks ago about not being able to connect to MongoDB, and the connection code above was provided by an assignee.

Also, I don't know why the driver version does not match libmongoc. I don't recall intentionally specifying any version during installation. I just installed php56-mongodb using Homebrew and added the .so path to php.ini, and all of the above information appeared in phpinfo(). So what should I do now? How do I upgrade the driver?

@jmikola
Member

jmikola commented Apr 10, 2018

I filed an issue at https://jira.mongodb.org/browse/MMSSUPPORT-19732 several weeks ago about not being able to connect to MongoDB, and the connection code above was provided by an assignee.

Thanks for sharing. I've left a private comment on that issue to inform the support assignee of the mistake.

Looking at that issue, it appears that the connection issues were the result of Atlas' IP whitelist and/or macOS Secure Transport not having the necessary certs available, which required you to manually provide a CA file.

If this issue is not resolved, it may be helpful to see log output for a basic script that connects to the server and issues a basic ping command (sufficient to initialize all connections).

<?php

// See: http://php.net/manual/en/mongodb.configuration.php
ini_set('mongodb.debug', 'stderr');
 
require 'vendor/autoload.php';
 
$uri = 'mongodb://csci571:<password>@cluster0-shard-00-00-b2gpc.mongodb.net:27017,cluster0-shard-00-01-b2gpc.mongodb.net:27017,cluster0-shard-00-02-b2gpc.mongodb.net:27017/test?ssl=true&replicaSet=Cluster0-shard-0&authSource=admin';

$driverOptions = ['ca_file' => '/usr/local/etc/openssl/cert.pem'];
 
$client = new MongoDB\Client($uri, [], $driverOptions);
$cursor = $client->test->command(['ping' => 1]);
var_dump($cursor->toArray()[0]);

Since this logs to stderr, I'd suggest running this from the CLI environment. Just make sure that php.ini for the CLI environment is loading the driver in the same manner as your web environment.


Also, I don't know why the driver version does not match libmongoc. I don't recall intentionally specifying any version during installation. I just installed php56-mongodb using Homebrew and added the .so path to php.ini, and all of the above information appeared in phpinfo(). So what should I do now? How do I upgrade the driver?

Now that the PHP extension formulae have been removed from Homebrew, the suggested installation method is to use pecl install mongodb. The PECL binary should match that of the PHP environment with which you intend to use the driver.

You should be able to follow the standard PECL install instructions.

@RileyJia
Author

RileyJia commented Apr 19, 2018

I have reinstalled the mongodb extension using pecl.

What information do you need to solve the original problem of failing to connect to the server?

@jmikola
Member

jmikola commented Apr 19, 2018

What information do you need to solve the original problem of failing to connect to the server?

The first half of my previous comment (#791 (comment)) walks through steps to investigate a connection issue by running a basic script that connects and pings from the CLI with full debugging logged to stderr.

@RileyJia
Author

I ran the code that you provided. The output was:

object(MongoDB\Model\BSONDocument)#5 (1) { ["storage":"ArrayObject":private]=> array(1) { ["ok"]=> int(1) } }

@RileyJia
Author

Since I reinstalled the mongodb extension, here is the new version info:

PHP Version => 5.6.35

mongodb

MongoDB support => enabled
MongoDB extension version => 1.5.0-dev
MongoDB extension stability => devel
libbson bundled version => 1.9.4
libmongoc bundled version => 1.9.4
libmongoc SSL => enabled
libmongoc SSL library => Secure Transport
libmongoc crypto => enabled
libmongoc crypto library => Common Crypto
libmongoc crypto system profile => disabled
libmongoc SASL => enabled
libmongoc compression => enabled
libmongoc compression snappy => disabled
libmongoc compression zlib => enabled

Directive => Local Value => Master Value
mongodb.debug => no value => no value

@jmikola
Member

jmikola commented Apr 20, 2018

I'm not sure how your extension version is 1.5.0-dev if you've installed the driver through PECL. The most recent stable release in PECL is 1.4.3.

Aside from that, there appears to be no issue connecting to the cluster if you're able to issue a ping command successfully (as indicated by the { "ok": 1 } response document).

@RileyJia
Author

It doesn't fail to connect every time, and I am not sure how often it fails. It seems random to me.

@jmikola
Member

jmikola commented Apr 21, 2018

If the only error observed is a connection timeout or "TLS handshake failed: -9806" (likely also a connection error per analysis in #791 (comment)), I expect the root cause is outside of the driver's control. For instance, this could be caused by a poor network connection between the application server and Atlas. While the connectTimeoutMS URI option could be increased from its default of 10 seconds, that is already a fairly long duration and I don't expect that asking the driver to wait longer will help.

The only other diagnostic tool available within the driver is trace logs, which would record the flow of functions called throughout libmongoc leading up to the error. These can be enabled via the mongodb.debug INI option. For a PHP script executed on the CLI, mongodb.debug can be set to "stderr" to log trace information to the STDERR stream. For web environments, a directory may be specified for the INI option to instruct the driver to write one trace log per request into that directory. This output is very verbose, so you will want to be selective about when you enable this trace mechanism (best during debugging sessions only and certainly not for production). That said, while trace logs could provide some insight into the exact point that libmongoc encounters the connection/socket error, I doubt it will be actionable information.
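A minimal sketch of choosing the trace target per environment follows. The helper name and the /tmp/mongodb-trace directory are assumptions for illustration; the "stderr" value and the use of a directory for web environments are as described above. The directory would need to exist and be writable by the web server user.

```php
<?php
// Hypothetical helper: pick a mongodb.debug target for the given SAPI name.
// CLI scripts log to STDERR; anything else gets a per-request trace directory.
function debugTargetFor($sapi, $traceDir)
{
    return ($sapi === 'cli') ? 'stderr' : $traceDir;
}

// '/tmp/mongodb-trace' is an assumed location, not a driver default.
$target = debugTargetFor(PHP_SAPI, '/tmp/mongodb-trace');

// ini_set() returns false when the directive cannot be set, for example
// when the mongodb extension is not loaded in this environment.
if (ini_set('mongodb.debug', $target) === false) {
    fwrite(STDERR, "mongodb.debug could not be set; is the extension loaded?\n");
}
```

As in the earlier ping script, this would run before the client is constructed, and it should be limited to debugging sessions given how verbose the output is.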

@jmikola
Member

jmikola commented Apr 24, 2018

An issue recently popped up for libmongoc, which the PHP driver uses, that I believe may be related to the errors you're experiencing. In CDRIVER-2624, the client and server may fail when attempting TLS renegotiation on an existing connection. The issue appears to be related to a non-OpenSSL library being used by either the driver or server. At the moment, 3.6 servers on Atlas should all be using OpenSSL, so I expect this is related to the driver using Secure Transport on macOS.

With regard to the PHP driver, this would approximately happen at intervals >= heartbeatFrequencyMS, which defaults to 60 seconds. That is the interval with which sockets are monitored by the driver. Therefore, this is unlikely to come up in a short-lived CLI script, but it would be more common with a web server where PHP workers (e.g. Apache httpd or FPM worker) are persisting sockets between HTTP requests.

That said, I think you can easily verify whether you are hitting this bug by creating a CLI script based on this example in CDRIVER-2624:

<?php
 
require 'vendor/autoload.php';
 
$server = 'mongodb://csci571:<password>@cluster0-shard-00-00-b2gpc.mongodb.net:27017,cluster0-shard-00-01-b2gpc.mongodb.net:27017,cluster0-shard-00-02-b2gpc.mongodb.net:27017/test?ssl=true&replicaSet=Cluster0-shard-0&authSource=admin';

/* Per my earlier comment about which SSL context options actually apply to the
 * driver, I've reduced this considerably. Note that weak_cert_validation, the
 * canonical alias of the allow_self_signed SSL context option, defaults to false
 * so I've omitted it here. */
$driverOptions = ['ca_file' => '/usr/local/etc/openssl/cert.pem'];

while (true) {
    /* We could easily create one client outside of the loop, but I wanted to
     * demonstrate that each client we construct will share the same persisted
     * sockets since the constructor arguments are the same. By lowering the
     * monitoring interval to 500ms from 60s, we should see the error sooner. */
    $client = new MongoDB\Client($server, ['heartbeatFrequencyMS' => 500], $driverOptions);

    $client->test->student->findOne();

    /* Provide some indication of how many successful queries are performed */
    echo '.';
}

Running this from the CLI environment will hopefully reproduce the error within the first second of execution.

@jmikola jmikola added the bug label Apr 24, 2018
@jmikola
Member

jmikola commented Apr 24, 2018

I've opened PHPC-1169 as a tracking ticket. Once we know whether the libmongoc fix will be backported or not, we can triage our ticket accordingly so that it can be scheduled for a 1.4.x patch release or (worst case) upcoming 1.5.0 release when we bump our libmongoc dependency.

@kaeverens

I had a similar issue recently and found the cause was stale DNS caches upstream caused by recent changes to A records. The client would try to translate the domain name in the connection string into IP addresses. Sometimes this would work and sometimes not, as some DNS caches had the correct IP address and others had not yet updated. If that's the issue, you can simply replace the domain names with IP addresses in the connection string.

@jmikola
Member

jmikola commented Oct 2, 2018

PHPC-1169 was resolved in driver version 1.5.0, so I am closing this out.

@jmikola jmikola closed this as completed Oct 2, 2018