Skip to content

Commit

Permalink
DRIVERS-2035 Use minimum RTT for CSOT maxTimeMS calculation instead o…
Browse files Browse the repository at this point in the history
…f 90th percentile (#1350)

Require at least 2 RTT samples, otherwise use 0 as RTT. Only keep last 10 samples.
Update tests to wait for multiple RTTs.
  • Loading branch information
ShaneHarvey committed Feb 16, 2023
1 parent 745e486 commit c06650d
Show file tree
Hide file tree
Showing 6 changed files with 323 additions and 86 deletions.
Expand Up @@ -299,7 +299,7 @@ Command Execution
~~~~~~~~~~~~~~~~~

If ``timeoutMS`` is set, drivers MUST append a ``maxTimeMS`` field to
commands executed against a MongoDB server using the 90th percentile RTT of
commands executed against a MongoDB server using the ``minRoundTripTime`` field of
the selected server. Note that this value MUST be retrieved during server
selection using the ``servers`` field of the same `TopologyDescription
<../server-discovery-and-monitoring/server-discovery-and-monitoring.rst#TopologyDescription>`__
Expand All @@ -309,17 +309,17 @@ server is reset to the default description (e.g. due to an error in the
monitoring thread) after it has been selected but before the RTT is
retrieved.

If the 90th percentile RTT of the selected server is less than the remaining
timeoutMS, the value of this field MUST be ``remaining timeoutMS - 90th
percentile RTT``. If not, drivers MUST return a timeout error without
If the ``minRoundTripTime`` is less than the remaining timeoutMS,
the value of this field MUST be ``remaining timeoutMS - minRoundTripTime``.
If not, drivers MUST return a timeout error without
attempting to send the message to the server. This is done to ensure that an
operation is not routed to the server if it will likely fail with a socket
timeout as that could cause connection churn. The ``maxTimeMS`` field MUST be
appended after all blocking work is complete.

After wire message construction, drivers MUST check for timeout before
writing the message to the server. If the timeout has expired or the amount
of time remaining is less than the selected server's 90th percentile RTT,
of time remaining is less than the selected server's minimum RTT,
drivers MUST return the connection to the pool and raise a timeout exception.
Otherwise, drivers MUST set the connection’s write timeout to the remaining
``timeoutMS`` value before writing a message to the server. After the write
Expand Down Expand Up @@ -899,6 +899,16 @@ introduce a new knob and increase the API surface of drivers without providing
a significant benefit.


Drivers use minimum RTT to short circuit operations
---------------------------------------------------

A previous version of this spec used the 90th percentile RTT to short
circuit operations that might otherwise fail with a socket timeout.
We decided to change this logic to avoid canceling operations that may
have a high chance of succeeding and also remove a dependency on t-digest.
Instead, drivers use the minimum RTT from the last 10 samples, or 0 until
at least 2 samples have been recorded.

Future work
===========

Expand All @@ -922,3 +932,4 @@ Changelog

:2022-10-05: Remove spec front matter.
:2022-01-19: Initial version.
:2022-11-17: Use minimum RTT for maxTimeMS calculation instead of 90th percentile RTT.
206 changes: 170 additions & 36 deletions source/client-side-operations-timeout/tests/command-execution.json
Expand Up @@ -3,7 +3,14 @@
"schemaVersion": "1.9",
"runOnRequirements": [
{
"minServerVersion": "4.9"
"minServerVersion": "4.9",
"topologies": [
"single",
"replicaset",
"sharded-replicaset",
"sharded"
],
"serverless": "forbid"
}
],
"createEntities": [
Expand Down Expand Up @@ -45,7 +52,7 @@
],
"appName": "reduceMaxTimeMSTest",
"blockConnection": true,
"blockTimeMS": 20
"blockTimeMS": 50
}
}
}
Expand All @@ -61,7 +68,9 @@
"useMultipleMongoses": false,
"uriOptions": {
"appName": "reduceMaxTimeMSTest",
"w": 1
"w": 1,
"timeoutMS": 500,
"heartbeatFrequencyMS": 500
},
"observeEvents": [
"commandStartedEvent"
Expand All @@ -75,33 +84,31 @@
"databaseName": "test"
}
},
{
"collection": {
"id": "regularCollection",
"database": "database",
"collectionName": "coll"
}
},
{
"collection": {
"id": "timeoutCollection",
"database": "database",
"collectionName": "timeoutColl",
"collectionOptions": {
"timeoutMS": 60
}
"collectionName": "timeoutColl"
}
}
]
}
},
{
"name": "insertOne",
"object": "regularCollection",
"object": "timeoutCollection",
"arguments": {
"document": {
"_id": 1
}
},
"timeoutMS": 100000
}
},
{
"name": "wait",
"object": "testRunner",
"arguments": {
"ms": 1000
}
},
{
Expand All @@ -123,10 +130,7 @@
"commandName": "insert",
"databaseName": "test",
"command": {
"insert": "coll",
"maxTimeMS": {
"$$exists": false
}
"insert": "timeoutColl"
}
}
},
Expand All @@ -137,7 +141,7 @@
"command": {
"insert": "timeoutColl",
"maxTimeMS": {
"$$lte": 60
"$$lte": 450
}
}
}
Expand All @@ -164,7 +168,7 @@
],
"appName": "rttTooHighTest",
"blockConnection": true,
"blockTimeMS": 20
"blockTimeMS": 50
}
}
}
Expand All @@ -180,7 +184,9 @@
"useMultipleMongoses": false,
"uriOptions": {
"appName": "rttTooHighTest",
"w": 1
"w": 1,
"timeoutMS": 10,
"heartbeatFrequencyMS": 500
},
"observeEvents": [
"commandStartedEvent"
Expand All @@ -196,31 +202,153 @@
},
{
"collection": {
"id": "regularCollection",
"id": "timeoutCollection",
"database": "database",
"collectionName": "coll"
"collectionName": "timeoutColl"
}
}
]
}
},
{
"name": "insertOne",
"object": "timeoutCollection",
"arguments": {
"document": {
"_id": 1
},
"timeoutMS": 100000
}
},
{
"name": "wait",
"object": "testRunner",
"arguments": {
"ms": 1000
}
},
{
"name": "insertOne",
"object": "timeoutCollection",
"arguments": {
"document": {
"_id": 2
}
},
"expectError": {
"isTimeoutError": true
}
},
{
"name": "insertOne",
"object": "timeoutCollection",
"arguments": {
"document": {
"_id": 3
}
},
"expectError": {
"isTimeoutError": true
}
},
{
"name": "insertOne",
"object": "timeoutCollection",
"arguments": {
"document": {
"_id": 4
}
},
"expectError": {
"isTimeoutError": true
}
}
],
"expectEvents": [
{
"client": "client",
"events": [
{
"commandStartedEvent": {
"commandName": "insert",
"databaseName": "test",
"command": {
"insert": "timeoutColl"
}
}
}
]
}
]
},
{
"description": "short-circuit is not enabled with only 1 RTT measurement",
"operations": [
{
"name": "failPoint",
"object": "testRunner",
"arguments": {
"client": "failPointClient",
"failPoint": {
"configureFailPoint": "failCommand",
"mode": "alwaysOn",
"data": {
"failCommands": [
"hello",
"isMaster"
],
"appName": "reduceMaxTimeMSTest",
"blockConnection": true,
"blockTimeMS": 100
}
}
}
},
{
"name": "createEntities",
"object": "testRunner",
"arguments": {
"entities": [
{
"client": {
"id": "client",
"useMultipleMongoses": false,
"uriOptions": {
"appName": "reduceMaxTimeMSTest",
"w": 1,
"timeoutMS": 90,
"heartbeatFrequencyMS": 100000
},
"observeEvents": [
"commandStartedEvent"
]
}
},
{
"database": {
"id": "database",
"client": "client",
"databaseName": "test"
}
},
{
"collection": {
"id": "timeoutCollection",
"database": "database",
"collectionName": "timeoutColl",
"collectionOptions": {
"timeoutMS": 2
}
"collectionName": "timeoutColl"
}
}
]
}
},
{
"name": "insertOne",
"object": "regularCollection",
"object": "timeoutCollection",
"arguments": {
"document": {
"_id": 1
}
},
"timeoutMS": 100000
}
},
{
Expand All @@ -230,9 +358,6 @@
"document": {
"_id": 2
}
},
"expectError": {
"isTimeoutError": true
}
}
],
Expand All @@ -245,9 +370,18 @@
"commandName": "insert",
"databaseName": "test",
"command": {
"insert": "coll",
"insert": "timeoutColl"
}
}
},
{
"commandStartedEvent": {
"commandName": "insert",
"databaseName": "test",
"command": {
"insert": "timeoutColl",
"maxTimeMS": {
"$$exists": false
"$$lte": 450
}
}
}
Expand Down

0 comments on commit c06650d

Please sign in to comment.