-
Notifications
You must be signed in to change notification settings - Fork 1.5k
JAVA-4034 Refactor read/write retries to accommodate for both PoolClearedError
and CSOT
#782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Since the target brach is not P.S. The |
PoolClearedError
and CSOT
PoolClearedError
and CSOT
PoolClearedError
and CSOT
PoolClearedError
and CSOT
bulkWriteTracker.batch().ifPresent(bulkWriteBatch -> { | ||
assertTrue(prospectiveFailedResult instanceof MongoWriteConcernWithResponseException); | ||
bulkWriteBatch.addResult((BsonDocument) ((MongoWriteConcernWithResponseException) prospectiveFailedResult) | ||
.getResponse()); | ||
BulkWriteTracker.attachNext(retryState, bulkWriteBatch); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This piece of code seems to be dead from the standpoint of the tests we have, i.e., the breakAndThrowIfRetryAnd
always breaks retrying in tests when it is executed. The same is true for executeSync
.
The rest of the code seem to be covered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like it, mostly nits and a few change requests.
driver-core/src/main/com/mongodb/internal/async/function/AsyncCallbackFunction.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/async/function/AsyncCallbackFunction.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/async/function/AsyncCallbackFunction.java
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/async/function/AsyncCallbackBiFunction.java
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/async/function/AsyncCallbackSupplier.java
Outdated
Show resolved
Hide resolved
*/ | ||
public RetryState(final int retries) { | ||
assertTrue(retries >= 0); | ||
assertTrue(retries < INFINITE_ATTEMPTS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could just reuse this constructor and default retries to INFINITE_ATTEMPTS in the no args constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
driver-sync/src/test/functional/com/mongodb/client/RetryableWritesProseTest.java
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/async/function/RetryState.java
Show resolved
Hide resolved
import java.util.function.Supplier; | ||
|
||
@ThreadSafe | ||
public final class SupplyingCallback<R> implements SingleResultCallback<R>, Supplier<R> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fantastic. Just a few nits, and one substantial refactoring request.
driver-core/src/main/com/mongodb/internal/binding/SingleServerBinding.java
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/operation/MixedBulkWriteOperation.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/operation/MixedBulkWriteOperation.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/operation/MixedBulkWriteOperation.java
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/operation/CommandOperationHelper.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
* | ||
* @see #iteration() | ||
*/ | ||
public boolean firstIteration() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our convention seems to be to prefix with is
even in internal code, so might as well follow that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
driver-core/src/main/com/mongodb/internal/async/function/AsyncCallbackFunction.java
Show resolved
Hide resolved
* | ||
* @see #iteration() | ||
*/ | ||
public boolean firstIteration() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats good to know, I'll defer to @jyemin but I'm happy either way.
driver-core/src/main/com/mongodb/internal/async/function/RetryState.java
Show resolved
Hide resolved
static BsonDocument executeCommand(final ReadBinding binding, final String database, final CommandCreator commandCreator, | ||
final boolean retryReads) { | ||
return executeCommand(binding, database, commandCreator, new BsonDocumentCodec(), retryReads); | ||
private static Throwable chooseAndMutateRetryableReadException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems confusing as this method doesn't mutate anything. It seems more a kin to a concrete BiFunction implementation.
Is there some expectation in the future from within these methods that an existing exception will be mutated?
driver-core/src/main/com/mongodb/internal/operation/CommandOperationHelper.java
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/operation/CommandOperationHelper.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/operation/retry/AttachmentKeys.java
Show resolved
Hide resolved
Apparently, the diff displayed by the GitHub PR UI is calculated not based on the state of the last commits in the "into"/"from" branches, but based on the state of the last commit that is common for the two aforementioned branches, and the last commits in the "from" branch. I thought this is not how the PR diff is calculated. Anyway, this prevents me from simply changing the "into" branch, because the PR diff now combines a diff for both its predecessor #701 and the changes that are actually proposed in the current PR. In order to redeem the diff for this PR, I will rebase the "from" branch on top of the new "into" ( |
2c08fd1
to
165568a
Compare
…and CSOT Changes in this commit do the following two major things: 1) These changes make selecting a server (`binding.get*ConnectionSource`) and checking out a connection (`source.getConnection`) part of a retryable operation. Such a change makes `MongoConnectionPoolClearedException`s part of the operation, thus allowing us to retry the operation when the exception happens. 2) The retry limit (not more than two attempts) was hardcoded in the driver's code structure, i.e., the code was written in such a way that it was describing the first attempt, then depending on the outcome, the code for the second attempt was executed, with the code for the second attempt mostly duplicating the code for the first attempt, but having different error handling. Such an approach not only negatively affects the code readability, but also prevents changing the number of attempts, let alone making decisions on whether to retry based not on an attempt limit. The pseudocode in the specifications is also written this way, and at least some other drivers, e.g., C#, Rust, took a similar approach to structure the code. Changes in this commit represent the outcome of refactoring the code related to the retry logic. In the new code the retry logic is decoupled from the logic of a retryable operation as much as possible. Since our operations are quite complex and may decide to break retrying in the middle of an attempt, e.g., because the server selected in the attempt does not support retries, the operation logic is still aware of the fact that it may be retried. However, it is important to understand that this awareness is a direct consequence of the business logic. It cannot be gotten rid of regardless of the approach taken to structure the code. Operations that have simpler business logic can be written in a retry-agnostic way without making changes to the retry framework that was added as part of this commit. With the changes in this commit, applying the client-side timeout (CSOT) to a retryable operation is as simple as replacing the hard retry limit condition with a condition that checks whether there is time left for attempting the operation again. JAVA-4034
JAVA-4034
…n while writing JAVA-4034
…rnalException` JAVA-4034
JAVA-4034
JAVA-4034
JAVA-4034
165568a
to
6f51a60
Compare
The baseline commit for this PR is the last commit in #701. As a consequence, I had to push the
stIncMale:JAVA-3928
branch to mongodb/mongo-java-driver to be able to create a PR with diffs relevant to JAVA-4034. Once this PR is reviewed and approved, I will:DefaultConnectionPool
pausable #701;master
mongodb:JAVA-3928
.Changes in this commit do the following two major things:
Selecting a server (
binding.get*ConnectionSource
)and checking out a connection (
source.getConnection
) is now part of a retryable operation.Such a change makes
MongoConnectionPoolClearedException
s part of the operation,thus allowing us to retry the operation when the exception happens.
The retry limit (not more than two attempts) was hardcoded in the driver's code structure,
i.e., the code was written in such a way that it was describing the first attempt, then depending
on the outcome, the code for the second attempt was executed, with the code for the second attempt
mostly duplicating the code for the first attempt, but having different error handling. Such an approach
not only negatively affects the code readability, but also prevents changing the number of attempts,
let alone making decisions on whether to retry based not on an attempt limit.
The pseudocode in the specifications is also written this way, and at least some other drivers,
e.g., C#, Rust, took a similar approach to structure the code.
Changes in this commit represent the outcome of refactoring the code related to the retry logic.
In the new code the retry logic is decoupled from the logic of a retryable operation as much as possible.
Since our operations are quite complex and may decide to break retrying in the middle of an attempt,
e.g., because the server selected in the attempt does not support retries, the operation logic
is still aware of the fact that it may be retried. However, it is important to understand that
this awareness is a direct consequence of the business logic. It cannot be gotten rid of
regardless of the approach taken to structure the code. Operations that have simpler business logic
can be written in a retry-agnostic way without making changes to the retry framework
that was added as part of this commit.
With the changes in this commit, applying the client-side timeout (CSOT) to a retryable operation
is as simple as replacing the hard retry limit condition with a condition that checks
whether there is time left for attempting the operation again.
Here are also a few interesting stats:
CommandOperationHelper
changed from 865 to 620OperationHelper
changed from 768 to 734MixedBulkWriteOperation
stayed roughly the same (changed from 553 to 551, but a portion of it is comments, while previously it was all code)Given that the business logic became more complicated due to the item #1 described above, I consider the above reduction in the amount of code expressing the business logic a very good thing, especially given that it is accompanied by the code (particularly callback-based) being made more linear and requiring almost no changes to support CSOT.
JAVA-4034