New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CSOT #1215
base: master
Are you sure you want to change the base?
Conversation
Initial client side operation timeout (CSOT) work. The CSOT class is passed to all operations and currently encapsulate the following timeouts: - `timeoutMS` the new optional client side operation timeout - `maxTimeMS` the legacy maxTimeMS operation value. Ignored if CSOT is set. - `maxCommitTimeMS` the legacy commit timeout. Ignored if CSOT is set. - `maxAwaitTimeMS` the getMore await timeout. This initial work allows for the CSOT to be available to all operations and later work will pass / apply this timeout where required by the Spec. JAVA-4086
# Conflicts: # driver-core/src/main/com/mongodb/ConnectionString.java # driver-core/src/test/unit/com/mongodb/MongoClientSettingsSpecification.groovy
An immutable class that contains all user configured timeouts. This will eventaually allow access to user configuration down the stack into Bindings / Cluster.selectServer and Connection.command. JAVA-5169
In preparation for adding all contexts to OperationContext JAVA-5170
This will allow the timeout settings to be available to the Binding and this will allow the binding to create the TimeoutContext for the operation. JAVA-5170
Now contains RequestContext, SessionContext and TimeoutContext JAVA-5170
Operations now supply TimeoutSettings JAVA-5170
Pass OperationContext as a whole instead of RequestContext, SessionContext, TimeoutContext and ServerAPI JAVA-5170
Adds support for Explainable operations JAVA-5172
# Conflicts: # driver-core/src/main/com/mongodb/internal/connection/BaseCluster.java # driver-core/src/main/com/mongodb/internal/connection/LoadBalancedCluster.java # driver-core/src/test/functional/com/mongodb/ClusterFixture.java # driver-core/src/test/unit/com/mongodb/internal/connection/BaseClusterSpecification.groovy # driver-reactive-streams/src/main/com/mongodb/reactivestreams/client/internal/ClientSessionBinding.java # driver-reactive-streams/src/test/unit/com/mongodb/reactivestreams/client/internal/ClientSessionBindingSpecification.groovy # driver-sync/src/test/unit/com/mongodb/client/internal/ClientSessionBindingSpecification.groovy
if (this == o) { | ||
return true; | ||
static Timeout expiresIn(final long duration, final TimeUnit unit) { | ||
// TODO (CSOT) confirm that all usages in final PR are non-negative |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to address all TODOs prior to merging. (All files in github can be loaded using document.querySelectorAll('.load-diff-button').forEach(node => node.click())
in console, and then ctrl-f for TODO
)
JAVA-5290 Co-authored-by: Ross Lawley <ross.lawley@gmail.com>
protected abstract MongoClient createMongoClient(MongoClientSettings.Builder builder); | ||
|
||
@Test | ||
@Tag("setsFailPoint") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a proper use of the @Tag
functionality. We should either use something like com.mongodb.client.FailPoint
, or implement automatic turning off of failpoints without using the @Tag
functionality.
The original discussion is #1333 (comment).
@@ -680,7 +699,11 @@ private void updateSessionContext(final SessionContext sessionContext, final Res | |||
} | |||
} | |||
|
|||
private MongoException translateWriteException(final Throwable e) { | |||
private MongoException translateWriteException(final Throwable e, final OperationContext operationContext) { | |||
if (e instanceof MongoSocketWriteTimeoutException && operationContext.getTimeoutContext().hasExpired()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, we check hasExpired, but below in translateReadException, we check hasTimeoutMS, though it seems both methods should have the same logic.
private long minRoundTripTimeMS = 0; | ||
|
||
public static MongoOperationTimeoutException createMongoTimeoutException() { | ||
return createMongoTimeoutException("Remaining timeoutMS is less than the servers minimum round trip time."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found it more difficult to follow / work with code where an exception was created in one method, but thrown in another. The non-throwing usage of this method was:
return operationContext.getTimeoutContext().validateHasTimedOutForCommandExecution().map(e -> {
close();
commandEventSender.sendFailedEvent(e);
return e;
});
I cannot immediately understand if close can be called multiple times for the same exception. This differs from the usual pattern of handling close in a catch.
I understand that this might save having to write a try-catch in async code, but I don't think the cost to readability is worth this benefit.
* @see MongoClientSettings#getTimeout(TimeUnit) | ||
* @since CSOT | ||
*/ | ||
public final class MongoOperationTimeoutException extends MongoTimeoutException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed here, consider moving all messages to a central set of fields. At least, the various messages should be checked for consistency, for example, there are at least the following ways that timeouts are described:
- exceeded the timeout limit
- Operation timed out:
- Remaining timeoutMS is less than the server's minimum round trip time.
- The operation timeout has expired
- Timeout occurred during socket write
It seems they should all use the same terminology.
|
||
public static MongoDatabase databaseWithTimeout(final MongoDatabase database, | ||
final String message, | ||
@Nullable final Timeout timeout) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In these databaseWithTimeout
methods, I would expect a null timeout to result in a database without a timeout, but below it returns the database itself. I think we should avoid making these with
methods nullable.
long writeTimeoutMS = operationContext.getTimeoutContext().getWriteTimeoutMS(); | ||
Optional<WriteTimeoutHandler> writeTimeoutHandler = writeTimeoutMS != NO_SCHEDULE_TIME | ||
? Optional.of(new WriteTimeoutHandler(writeTimeoutMS, MILLISECONDS)) : Optional.empty(); | ||
writeTimeoutHandler.map(w -> channel.pipeline().addBefore("ChannelInboundHandlerAdapter", "WriteTimeoutHandler", w)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found uses of Optional difficult to reason about when attempting to inline/invert the write timeout. I think the above section can be re-written as something like:
if (writeTimeoutMS == 0) {
channel...addBefore...
}
} | ||
|
||
public long getMaxTimeMS() { | ||
return notNull("Should never be null", maxTimeSupplier.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is used only to set the maxTimeMS
value in CommandMessage#getExtraElements
, when non-zero. The value is:
- By default,
calculateMaxTimeMS
: if the timeout not present, use the legacymaxTimeMS
setting. Otherwise, shorten the timeout by the round trip time, throwing a timeout exception as needed. - In
CommitTransactionOperation
, the supplier is overridden to usegetMaxCommitTimeMS
, which uses the timeout if present, but otherwise uses the legacymaxCommitTimeMS
setting. - Elsewhere, the supplier is overridden to return a fixed value of 0, denoting infinite, or to a fixed value like
MaxAwaitTimeMS
.
In the second path, when the timeout
is present, it is not shortened by the round trip time. In the first, it is shortened, which seems correct. Should it be shortened in the second path?
In the third path, the value (0, or maxAwaitTime) is not overriden by timeout. In the first and second paths, fixed values are overridden by the timeout. Should they be overridden on this third path?
} | ||
|
||
boolean lastAttempt() { | ||
if (timeoutContext.hasTimeoutMS()){ | ||
return timeoutContext.hasExpired(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change makes lastAttempt non-deterministic within a particular attempt. The following code calls lastAttempt:
if (currentBulkWriteTracker.lastAttempt()) {
addRetryableWriteErrorLabel(writeConcernBasedError, maxWireVersion);
addErrorLabelsToWriteConcern(result.getDocument("writeConcernError"), writeConcernBasedError.getErrorLabels());
}
The intent is to add error labels on the last attempt. However, now, the above lastAttempt check can return "false", because the timeout has not yet timed out. So labels are not attached. At the end of the iteration, other code will check lastAttempt again, but now the timer will have run out, so no further iterations are performed, and labels are never attached. This appears to be a bug.
* @since CSOT | ||
* @see #getTimeout | ||
*/ | ||
MongoCollection<TDocument> withTimeout(long timeout, TimeUnit timeUnit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #1349 (comment) - clarify regarding use of times below 1ms.
public void resetTimeout() { | ||
assertNotNull(timeout); | ||
timeout = calculateTimeout(timeoutSettings.getTimeoutMS()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider changing to resetTimeoutIfPresent
, with an if check on null rather than the assert. All prod usages perform this check before calling.
Also prevent NPE leak from TlsChannelImpl Also added temp fix for timeoutRemainingMS (scheduled to be removed) JAVA-5401 Added temp fix for timeoutRemainingMS which will eventually be removed
Add CSOT error transformation. JAVA-5248 Co-authored-by: Ross Lawley <ross@mongodb.com>
- Resolve regression where CSOT exception is exposed despite CSOT being disabled. - Correct premature decrease in connect timeout before connection initiation. - Encapsulate logic within TimeoutContext. JAVA-5439
JAVA-5402 Co-authored-by: Valentin Kovalenko <valentin.male.kovalenko@gmail.com>
JAVA-3828