Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSOT #1215

Draft
wants to merge 156 commits into
base: master
Choose a base branch
from
Draft

CSOT #1215

wants to merge 156 commits into from

Conversation

katcharov
Copy link
Contributor

rozza and others added 28 commits August 22, 2023 14:43
Initial client side operation timeout (CSOT) work.
The CSOT class is passed to all operations and currently encapsulate the following timeouts:

  - `timeoutMS` the new optional client side operation timeout
  - `maxTimeMS` the legacy maxTimeMS operation value. Ignored if CSOT is set.
  - `maxCommitTimeMS` the legacy commit timeout. Ignored if CSOT is set.
  - `maxAwaitTimeMS` the getMore await timeout.

This initial work allows for the CSOT to be available to all operations and later
work will pass / apply this timeout where required by the Spec.

JAVA-4086
# Conflicts:
#	driver-core/src/main/com/mongodb/ConnectionString.java
#	driver-core/src/test/unit/com/mongodb/MongoClientSettingsSpecification.groovy
An immutable class that contains all user configured timeouts.
This will eventaually allow access to user configuration down the
stack into Bindings / Cluster.selectServer and Connection.command.

JAVA-5169
In preparation for adding all contexts to OperationContext

JAVA-5170
This will allow the timeout settings to be available to the Binding
and this will allow the binding to create the TimeoutContext for the
operation.

JAVA-5170
Now contains RequestContext, SessionContext and TimeoutContext

JAVA-5170
Operations now supply TimeoutSettings

JAVA-5170
Pass OperationContext as a whole instead of RequestContext,
SessionContext, TimeoutContext and ServerAPI

JAVA-5170
Adds support for Explainable operations

JAVA-5172
# Conflicts:
#	driver-core/src/main/com/mongodb/internal/connection/BaseCluster.java
#	driver-core/src/main/com/mongodb/internal/connection/LoadBalancedCluster.java
#	driver-core/src/test/functional/com/mongodb/ClusterFixture.java
#	driver-core/src/test/unit/com/mongodb/internal/connection/BaseClusterSpecification.groovy
#	driver-reactive-streams/src/main/com/mongodb/reactivestreams/client/internal/ClientSessionBinding.java
#	driver-reactive-streams/src/test/unit/com/mongodb/reactivestreams/client/internal/ClientSessionBindingSpecification.groovy
#	driver-sync/src/test/unit/com/mongodb/client/internal/ClientSessionBindingSpecification.groovy
if (this == o) {
return true;
static Timeout expiresIn(final long duration, final TimeUnit unit) {
// TODO (CSOT) confirm that all usages in final PR are non-negative
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to address all TODOs prior to merging. (All files in github can be loaded using document.querySelectorAll('.load-diff-button').forEach(node => node.click()) in console, and then ctrl-f for TODO)

JAVA-5290

Co-authored-by: Ross Lawley <ross.lawley@gmail.com>
protected abstract MongoClient createMongoClient(MongoClientSettings.Builder builder);

@Test
@Tag("setsFailPoint")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a proper use of the @Tag functionality. We should either use something like com.mongodb.client.FailPoint, or implement automatic turning off of failpoints without using the @Tag functionality.

The original discussion is #1333 (comment).

@@ -680,7 +699,11 @@ private void updateSessionContext(final SessionContext sessionContext, final Res
}
}

private MongoException translateWriteException(final Throwable e) {
private MongoException translateWriteException(final Throwable e, final OperationContext operationContext) {
if (e instanceof MongoSocketWriteTimeoutException && operationContext.getTimeoutContext().hasExpired()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we check hasExpired, but below in translateReadException, we check hasTimeoutMS, though it seems both methods should have the same logic.

private long minRoundTripTimeMS = 0;

public static MongoOperationTimeoutException createMongoTimeoutException() {
return createMongoTimeoutException("Remaining timeoutMS is less than the servers minimum round trip time.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it more difficult to follow / work with code where an exception was created in one method, but thrown in another. The non-throwing usage of this method was:

        return operationContext.getTimeoutContext().validateHasTimedOutForCommandExecution().map(e -> {
            close();
            commandEventSender.sendFailedEvent(e);
            return e;
        });

I cannot immediately understand if close can be called multiple times for the same exception. This differs from the usual pattern of handling close in a catch.

I understand that this might save having to write a try-catch in async code, but I don't think the cost to readability is worth this benefit.

* @see MongoClientSettings#getTimeout(TimeUnit)
* @since CSOT
*/
public final class MongoOperationTimeoutException extends MongoTimeoutException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed here, consider moving all messages to a central set of fields. At least, the various messages should be checked for consistency, for example, there are at least the following ways that timeouts are described:

  • exceeded the timeout limit
  • Operation timed out:
  • Remaining timeoutMS is less than the server's minimum round trip time.
  • The operation timeout has expired
  • Timeout occurred during socket write

It seems they should all use the same terminology.


public static MongoDatabase databaseWithTimeout(final MongoDatabase database,
final String message,
@Nullable final Timeout timeout) {
Copy link
Contributor Author

@katcharov katcharov Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In these databaseWithTimeout methods, I would expect a null timeout to result in a database without a timeout, but below it returns the database itself. I think we should avoid making these with methods nullable.

rozza and others added 3 commits March 26, 2024 10:23
Ensure that users can't accidently set a 0 value
which in CSOT represents infinite timeouts.

JAVA-5351
Move appending of maxTimeMS to CommandMesage.

JAVA-5322

---------

Co-authored-by: Valentin Kovalenko <valentin.male.kovalenko@gmail.com>
long writeTimeoutMS = operationContext.getTimeoutContext().getWriteTimeoutMS();
Optional<WriteTimeoutHandler> writeTimeoutHandler = writeTimeoutMS != NO_SCHEDULE_TIME
? Optional.of(new WriteTimeoutHandler(writeTimeoutMS, MILLISECONDS)) : Optional.empty();
writeTimeoutHandler.map(w -> channel.pipeline().addBefore("ChannelInboundHandlerAdapter", "WriteTimeoutHandler", w));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found uses of Optional difficult to reason about when attempting to inline/invert the write timeout. I think the above section can be re-written as something like:

if (writeTimeoutMS == 0) {
   channel...addBefore...
}

}

public long getMaxTimeMS() {
return notNull("Should never be null", maxTimeSupplier.get());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is used only to set the maxTimeMS value in CommandMessage#getExtraElements, when non-zero. The value is:

  1. By default, calculateMaxTimeMS: if the timeout not present, use the legacy maxTimeMS setting. Otherwise, shorten the timeout by the round trip time, throwing a timeout exception as needed.
  2. In CommitTransactionOperation, the supplier is overridden to use getMaxCommitTimeMS, which uses the timeout if present, but otherwise uses the legacy maxCommitTimeMS setting.
  3. Elsewhere, the supplier is overridden to return a fixed value of 0, denoting infinite, or to a fixed value like MaxAwaitTimeMS.

In the second path, when the timeout is present, it is not shortened by the round trip time. In the first, it is shortened, which seems correct. Should it be shortened in the second path?

In the third path, the value (0, or maxAwaitTime) is not overriden by timeout. In the first and second paths, fixed values are overridden by the timeout. Should they be overridden on this third path?

}

boolean lastAttempt() {
if (timeoutContext.hasTimeoutMS()){
return timeoutContext.hasExpired();
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change makes lastAttempt non-deterministic within a particular attempt. The following code calls lastAttempt:

if (currentBulkWriteTracker.lastAttempt()) {
    addRetryableWriteErrorLabel(writeConcernBasedError, maxWireVersion);
    addErrorLabelsToWriteConcern(result.getDocument("writeConcernError"), writeConcernBasedError.getErrorLabels());
} 

The intent is to add error labels on the last attempt. However, now, the above lastAttempt check can return "false", because the timeout has not yet timed out. So labels are not attached. At the end of the iteration, other code will check lastAttempt again, but now the timer will have run out, so no further iterations are performed, and labels are never attached. This appears to be a bug.

* @since CSOT
* @see #getTimeout
*/
MongoCollection<TDocument> withTimeout(long timeout, TimeUnit timeUnit);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1349 (comment) - clarify regarding use of times below 1ms.

Comment on lines 243 to 246
public void resetTimeout() {
assertNotNull(timeout);
timeout = calculateTimeout(timeoutSettings.getTimeoutMS());
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider changing to resetTimeoutIfPresent, with an if check on null rather than the assert. All prod usages perform this check before calling.

rozza and others added 17 commits April 3, 2024 13:40
Also prevent NPE leak from TlsChannelImpl
Also added temp fix for timeoutRemainingMS (scheduled to be removed)

JAVA-5401

Added temp fix for timeoutRemainingMS which will eventually be removed
Added createOptions support for collection creation
Fixed error with awaitable tailable cursors
Manually change tests to use defaultTimeoutMS for the session as there is no operation overrdie
Disable test JAVA-5406 - ClientSideEncryptionTest timeoutMS.json

JAVA-5374
Add CSOT error transformation.

JAVA-5248

Co-authored-by: Ross Lawley <ross@mongodb.com>
- Resolve regression where CSOT exception is exposed despite CSOT being disabled.
- Correct premature decrease in connect timeout before connection initiation.
- Encapsulate logic within TimeoutContext.

JAVA-5439
JAVA-5402

Co-authored-by: Valentin Kovalenko <valentin.male.kovalenko@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants