Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUST-97 Support sharded transactions recovery token #398

Merged
merged 7 commits into from
Jul 28, 2021

Conversation

NBSquare
Copy link
Contributor

Implemented recovery tokens from the spec:

Every successful (ok:1) command response in a sharded transaction includes a recoveryToken field. Drivers MUST track the most recently received recoveryToken field and MUST append this field to any subsequent commitTransaction or abortTransaction commands. Tracking the most recently returned recoveryToken allows the server to update the recoveryToken mid-transaction if needed.

This ticket retrieves the recoveryToken from responses and adds it next to the pinned mongos during sharded transactions, passing it back during the appropriate operations.

@@ -271,7 +271,8 @@ pub async fn run_v2_test(test_file: TestFile) {
.unwrap_or_else(|| panic!("ClientSession is not pinned"));

fail_point_guards.push(
internal_client
client
.deref()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell it doesn't strictly matter here whether the failpoint is enabled on internal_client or the EventClient, but for consistency with the other failpoint we enable I changed this to the EventClient.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: the unified test runner specification says not to run failpoint commands on the internal client so we've decided to do the same here.

Hm I agree we should be consistent with which client we're using for failpoints but I think it would make more sense to do all of them with the internal_client rather than the EventClient. In general we just use the EventClient for operations that need to be tracked for command monitoring and the internal client is used for test setup and test runner operations.

@@ -48,10 +48,20 @@ impl FailPoint {
client: &Client,
criteria: impl Into<Option<SelectionCriteria>>,
) -> Result<FailPointGuard> {
// TODO: DRIVERS-1385 remove this logic for moving errorLabels to the top level.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary because the new spec test (mongos-recovery-token) sets up a failpoint with a RetryableWriteError, which requires errorLabels to be on the failpoint top level when it's initialized. This can also be observed here

// TODO RUST-97: Remove this logic to bypass recoveryToken
if k == "recoveryToken" {
continue;
if k == "recoveryToken" && v.is_placeholder() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@isabelatkinson isabelatkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! just a few comments/suggestions

}

impl Transaction {
pub(crate) fn start(&mut self, options: Option<TransactionOptions>) {
self.state = TransactionState::Starting;
self.options = options;
self.recovery_token = None;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this assignment necessary? Similarly to pinned_mongos I believe this field should be None by default upon the first transaction and reset to None after previous transactions have finished.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. From the spec:

Drivers MUST clear a session's cached recoveryToken when transitioning to the "no transaction" or "starting transaction" state.

So, I'm clearing the recoveryToken on Transaction::start and Transaction::reset

src/operation/run_command/mod.rs Outdated Show resolved Hide resolved
@@ -271,7 +271,8 @@ pub async fn run_v2_test(test_file: TestFile) {
.unwrap_or_else(|| panic!("ClientSession is not pinned"));

fail_point_guards.push(
internal_client
client
.deref()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: the unified test runner specification says not to run failpoint commands on the internal client so we've decided to do the same here.

Hm I agree we should be consistent with which client we're using for failpoints but I think it would make more sense to do all of them with the internal_client rather than the EventClient. In general we just use the EventClient for operations that need to be tracked for command monitoring and the internal client is used for test setup and test runner operations.

src/test/util/matchable.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@isabelatkinson isabelatkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! tagging in the rest of the team. can you run another evergreen patch and add it to the ticket?

Copy link
Contributor

@patrickfreed patrickfreed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good! I just have a few minor comments / suggestions

src/test/util/matchable.rs Show resolved Hide resolved
src/client/executor.rs Outdated Show resolved Hide resolved
src/operation/mod.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@patrickfreed patrickfreed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@NBSquare
Copy link
Contributor Author

@abr-egn @isabelatkinson please take a look at the last commit when you get the chance - i added a prose test and changed some logic for retrieving the recoveryToken which is probably worth looking at.

@NBSquare NBSquare merged commit ae2b7be into mongodb:sharded-transactions Jul 28, 2021
@NBSquare NBSquare deleted the RUST-97 branch July 28, 2021 13:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants