SPEC-1555 Consider connection pool health during server selection #876
Conversation
Due to discussion on the other draft PR, I updated the "health" determination to include the WaitQueue length of a particular pool. This will help in situations where all pools are full and have WaitQueues, and it will also help in situations where a non-full pool has developed a WaitQueue due to waiting on maxConnecting.
I suggest adding rationales/descriptions for:
I've implemented the tests and they are passing as expected. A few comments on the test format.
``in_window`` array and values are numbers in [0, 1] indicating the frequency
at which the server should have been selected.

For each file, pass the information from `in_window` to whatever function is
The test should also configure the read preference used for server selection. I suggest adding a read_preference field to be consistent with the existing tests.
I think we discussed this in the meeting, but since this test is verifying what happens after the set of available servers is determined and the topologies will always be sharded, a read preference shouldn't be necessary here.
Hmm maybe my test runner implementation is different. In mine the read preference does actually have an impact. Right now I'm arbitrarily using Nearest. My implementation is roughly:
topology = create_mock_topology_from(test['topology_description'])
mock_operation_counts(topology, test['in_window'])  # Mock operationCount for each server in 'in_window'
read_preference = Nearest()
counts = ...
for _ in range(ITERATIONS):
    server = topology.select_server(read_preference)
    counts[server.address] += 1
...
Ah I see, yeah that makes sense. In my implementation I skipped right to the in window portion, so I didn't need a TopologyDescription or a read preference. I imagine most drivers will probably implement your way though.
Rather than including a read preference in each spec test, I just updated the runner's description to say to use a default or "primary" read preference, since it shouldn't matter for sharded topologies.
It still feels a little odd that we have no explicit tests for replica sets. Perhaps the prose test can cover it?
The server selection logic tests already ensure that drivers properly determine what is in the latency window across different topology types, and these tests ensure that a server is selected properly from what's within the latency window. Putting those together should provide us full coverage, so I don't think it's necessary to test all topologies in these tests.
Given that selecting from within the window is topology agnostic, the topology is provided in these tests purely as a convenience for implementing the test runners; it doesn't really have any bearing on the results of the tests.
On the other hand, it's not unreasonable for a driver to have two code paths for server selection, and adding one test for a replica set is trivial, so why not add a single replica set test? The test format supports it as long as we say to use a Nearest read preference instead of Primary.
My main hesitation for adding it was that if we're treating the provided topology as more than just a convenience (i.e. as part of the tests' coverage), then drivers would be required to use it to implement the tests, even though it isn't necessary since the feature we're testing should ideally be topology agnostic. That said, if we think there could be separate implementations for selection based on topology, then it's probably worth having full coverage of that. Updated to include a few replset tests.
also only require multithreaded drivers to use new algorithm
This PR has been rewritten to consolidate the various counts we retrieved from the pool into a single operationCount that is incremented after a server is selected and decremented once the operation it was selected for is completed. Additionally, the consideration of availableConnectionCount was dropped, so no changes to or interaction with CMAP are required for this work. This should make the changes here much simpler and easier to understand.
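To illustrate that bookkeeping, here is a rough Python sketch of how a driver might maintain such an operationCount. The Server class, topology.select_server, and run_operation names are hypothetical stand-ins, not taken from the spec or any particular driver:

import threading


class Server:
    # Hypothetical server handle; only operation_count matters for this sketch.
    def __init__(self, address):
        self.address = address
        self._lock = threading.Lock()
        self.operation_count = 0  # operations selected for but not yet completed

    def increment_operation_count(self):
        with self._lock:
            self.operation_count += 1

    def decrement_operation_count(self):
        with self._lock:
            self.operation_count -= 1


def run_operation(topology, read_preference, operation):
    # Increment right after selection, decrement when the operation finishes.
    server = topology.select_server(read_preference)
    server.increment_operation_count()
    try:
        return operation(server)
    finally:
        # Decrement even on error so the count does not leak.
        server.decrement_operation_count()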
@@ -65,3 +65,37 @@ the TopologyDescription. Each YAML file contains a key for these stages of serve
Drivers implementing server selection MUST test that their implementation
correctly returns the set of servers in ``in_latency_window``. Drivers SHOULD also test
against ``suitable_servers`` if possible.

Selection Within Latency Window Tests
Now that the "algorithm" is much simpler, is it still worth it to have all drivers implement these unit tests?
I think it's still worthwhile. I very much prefer these tests to no tests. That said, do you have anything in mind for real end-to-end tests for this feature?
We're limited a bit by the topology. We could include some of those experiments I did as prose tests--for example, describe a test against a two mongos sharded topology where one of the mongoses has a failpoint that makes every operation take 500ms, then do a ton of concurrent stuff and then assert that the non-failpoint node got picked a lot more. While this does verify the operationCount behavior, I'm not sure whether this is preferable to having a bunch of unit spec tests, though.
Please open a jira ticket so we can further discuss adding that prose test. I'm in favor of it.
Should I just go ahead and add it now? I think it'll be most useful if drivers can have the test case when they're implementing this for the first time.
Yeah let's just add it now. I doubt the unified test format is expressive enough for a test like this so it'll need to be a prose test like you said. You can use the blockConnection option for failCommand like this:
db.adminCommand({
    configureFailPoint: "failCommand",
    mode: {times: 100000},  // or "alwaysOn"
    data: {
        failCommands: ["find"],
        blockConnection: true,
        blockTimeMS: 500,
        appName: "loadBalancingTest",
    },
});
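For illustration, a rough Python sketch of the measurement half of that prose test. The run_find_and_report_server helper, the worker/operation counts, and the 25% threshold are assumptions for the sketch, not part of the spec:

from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def measure_distribution(run_find_and_report_server, slow_address, n_ops=1000, workers=10):
    # run_find_and_report_server is a hypothetical callable that issues one
    # find and returns the (host, port) it was routed to, e.g. by recording
    # CommandStartedEvent-style notifications.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        addresses = list(pool.map(lambda _: run_find_and_report_server(), range(n_ops)))

    counts = Counter(addresses)
    slow_share = counts[slow_address] / n_ops
    # Threshold chosen for illustration: the blocked mongos should receive
    # far fewer operations than the healthy one.
    assert slow_share < 0.25, f"slow mongos received {slow_share:.0%} of operations"
    return counts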
done. When you run it, can you let me know what you get for the % of operations routed to the slow node? Curious to see if there are any differences across drivers.
Implemented. I ran it a few times and see:
{('localhost', 27017): 11, ('localhost', 27018): 89}
{('localhost', 27017): 12, ('localhost', 27018): 88}
{('localhost', 27017): 13, ('localhost', 27018): 87}
{('localhost', 27017): 15, ('localhost', 27018): 85}
{('localhost', 27017): 17, ('localhost', 27018): 83}
Exciting to see this feature in action!
I get around 12-15% too, so it seems like our implementations are consistent, nice!
Also, our discussions made me realize these changes have no effect on single threaded drivers, so I updated the spec to require this only for multi-threaded or async drivers.
Also, our discussions made me realize these changes have no effect on single threaded drivers, so I updated the spec to require this only for multi-threaded or async drivers.
Interesting, this is in line with our scope which says:
[Non-goals]:
Change the behavior of single-threaded drivers
Single-threaded driver instances do not maintain connection pools and thus are unlikely to cause connection storms on their own
A quick note that if single-threaded drivers begin supporting any features that require connection pinning (like OP_MSG exhaust cursors), then they will actually need to implement operationCount.
The spec changes look great, thank you.
I have not attempted to implement the tests.
Spec changes look great, I'll LGTM but we should hold off on merge until Shane validates the tests
    selected = in_window[0]
else:
    server1, server2 = random two entries from in_window
    if server1.operation_count >= server2.operation_count:
This should be <=.
fixed, thank you for catching this! This would've been a pretty bad error to leave in there.
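For reference, a small runnable sketch of the corrected comparison. The Server dataclass and function name are stand-ins for illustration; only the operation_count check mirrors the spec's pseudocode:

import random
from dataclasses import dataclass


@dataclass
class Server:
    address: str
    operation_count: int


def select_within_window(in_window):
    # "Power of two choices": sample two servers and prefer the one with
    # fewer in-progress operations (the corrected <= comparison).
    if len(in_window) == 1:
        return in_window[0]
    server1, server2 = random.sample(in_window, 2)
    if server1.operation_count <= server2.operation_count:
        return server1
    return server2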
Co-authored-by: Matt Broadstone <mbroadst@gmail.com>
@@ -0,0 +1,60 @@
description: Selections from many choices occur at correct frequencies
This test includes some really small frequencies, so I upped the iteration count to 10,000 and decreased the threshold to 2%. Let me know if this works for your runner okay and doesn't take too long to run.
Hmm 10000 makes the tests take ~300ms each vs ~50ms for 2000 iterations which is pretty slow for a unit test. What if we made this configurable per test?:
in_window: ...
outcome:
  iterations: 10000
  tolerance: 0.02
  expected_frequencies:
    a:27017: 0.22
    b:27017: 0.18
    c:27017: 0.18
    d:27017: 0.125
    e:27017: 0.125
    f:27017: 0.074
    g:27017: 0.074
    h:27017: 0.0277
    i:27017: 0
I think this would be a nice solution, but the false negative you got in the other thread makes me worried about 2% tolerance with 10k iterations.
So we can still implement this proposal, let's just increase outcome.tolerance for this test and/or reduce the number of servers in the topology.
Increased tolerance to 0.03, which seems to work fine for me. Let me know if that has any issues for your runner.
Note: I left iterations outside of outcome since it had more to do with running the test than inspecting the outcome; tolerance is under there, though.
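As a rough sketch of how a runner might consume that layout (select_once is a hypothetical stand-in for one in-window selection that returns the chosen server's address; field names follow the format discussed above, with iterations at the top level and tolerance under outcome):

from collections import Counter


def check_frequencies(test, select_once):
    iterations = test["iterations"]      # top-level, per the final format
    outcome = test["outcome"]
    tolerance = outcome["tolerance"]     # e.g. 0.03 for the many-choices test

    counts = Counter(select_once() for _ in range(iterations))

    for address, expected in outcome["expected_frequencies"].items():
        observed = counts[address] / iterations
        assert abs(observed - expected) <= tolerance, (
            f"{address}: observed {observed:.3f}, expected {expected} +/- {tolerance}"
        )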
- include iterations and tolerance
- require the usage of the topology description
- add tests for replica sets
- rename in_window to mocked_topology_state
LGTM with a minor suggestion!
6. Disable the failpoint.

7. Repeat this test without any failpoints and assert that each mongos was
   selected roughly 50% of the time.
Can you add "roughly 50% of the time within +/-10% tolerance"? I initially implemented this with 5% but that failed with: AssertionError: 0.55 != 0.5 within 0.05 delta (0.050000000000000044 difference)
done
@mbroadst @ShaneHarvey
@patrickfreed looks good!
SPEC-1555
This PR updates the Server Selection algorithm to consider connection pool health as per the design in DRIVERS-781. It also updates the CMAP spec to require liveness checking of pooled connections.
A first draft of this PR was filed on my fork while the maxConnecting changes were still in flux. See the initial rounds of discussion there: patrickfreed#1