Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(topology): connect server only on next tick #2685

Conversation

rose-m
Copy link
Contributor

@rose-m rose-m commented Jan 6, 2021

Description

The PR defers connecting to a server until the next tick.

If the connection is triggered immediately, an infinite recursion might occur. This is caused for example by an invalid TLS client certificate (i.e. invalid content) that can be specified via the tlsClientCertificateFile option during MongoClient.connect.

The recursion occurs due to the order of operations in connectServers (https://github.com/mongodb/node-mongodb-native/blob/master/src/sdam/topology.ts#L824). The server is connected to before it is added to the internal known server descriptions. An invalid TLS certificate file however causes an immediate error in server.connect() (https://github.com/mongodb/node-mongodb-native/blob/master/src/sdam/topology.ts#L810). This error is caught and tried to react synchronously in an event listener which tries to access the known server descriptions - which have not been updated yet.

A simple reproduction of the stack overflow is the following:

const MongoClient = require('mongodb').MongoClient;

(async() => {
    try {
        const client = await MongoClient.connect('mongodb://localhost/?directConnection=true', {
            tlsCertificateKeyFile: './package.json'
        });
        console.log('that should not work.');
    } catch (e) {
        console.error(e);
    }
})();

What changed?
Connecting to a server is not triggered immediately in the createAndConnectServer
call but on next tick afterwards.

Are there any files to ignore?
No.

NODE-2984

@kmahar kmahar requested review from nbbeeken and emadum January 8, 2021 16:27
@agolin95 agolin95 added the tracked-in-jira Ticket filed in MongoDB's Jira system label Jan 11, 2021
@nbbeeken
Copy link
Contributor

There seems to be a number of tests failing in the "Server Discovery and Monitoring (spec)" suite, looks like there's a few issues, the pool generation isn't changed in time for the test's assertion and there's an issue with the topology type still being unknown when the test runs. This could be something where the runner has to wait for the nextTick to occur but it seems like that internal tracking should be all correctly resolved by the time the callbacks execute the tests assertions. @rose-m you can explore test/unit/sdam/spec.test.js if you'd like to see if there's a fix but otherwise one of us will take a look when we get the chance to.

@rose-m
Copy link
Contributor Author

rose-m commented Jan 12, 2021

I've looked into it and the best way I could see was to also defer the serverSelection call inside Topology.connect to the next tick. Let me know if it still looks good to you! 😊

If the connection is triggered immediately, an infinite recursion
might occur. This is caused for example by an invalid TLS client
certificate (i.e. invalid content) that can be specified via the
tlsClientCertificateFile option during MongoClient.connect.
@rose-m rose-m force-pushed the fix/prevent-infinite-recursion-when-tls-file-invalid branch from cc6329c to 50c8b9e Compare January 12, 2021 07:35
Copy link
Contributor

@emadum emadum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests are leaking resources as a result of this change (causing timeouts in Evergreen), probably related to that close call.

if (err) {
this.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this close should be here, did adding it fix a test that was failing or something?

Copy link
Contributor Author

@rose-m rose-m Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is just due to the diff view - it is essentially the same this.close() call as https://github.com/mongodb/node-mongodb-native/pull/2685/files#diff-e2ebfc23930db405572db68fb3054e441b2c45f9d7ac1e62b0806862e39de8ddL353.

It's now just part of the nextTick callback.

@rose-m
Copy link
Contributor Author

rose-m commented Jan 13, 2021

Unfortunately I've absolutely no idea how that happens or where it fails... even without that it locally fails for me right now...

@emadum
Copy link
Contributor

emadum commented Jan 15, 2021

Unfortunately I've absolutely no idea how that happens or where it fails... even without that it locally fails for me right now...

I just updated the branch I have for investigating this issue to the latest version of master, and am no longer able to reproduce the infinite recursion. @rose-m could you confirm if it's still an issue on your end? Thanks!

@rose-m
Copy link
Contributor Author

rose-m commented Jan 18, 2021

@emadum Indeed - I also tried it locally and it seems the problem has vanished 👏 I'll leaving closing the issue and JIRA ticket to you?

@emadum
Copy link
Contributor

emadum commented Jan 19, 2021

This is no longer an issue on latest master

@emadum emadum closed this Jan 19, 2021
@rose-m rose-m deleted the fix/prevent-infinite-recursion-when-tls-file-invalid branch January 19, 2021 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tracked-in-jira Ticket filed in MongoDB's Jira system
Projects
None yet
4 participants