Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Make transaction rollback best effort. #1967

Merged
merged 2 commits into from Jan 5, 2024

Conversation

tom-andersen
Copy link
Contributor

@tom-andersen tom-andersen commented Jan 3, 2024

Internal tracking: b/316023452

A rollback is meant to be best effort. If the transaction has already expired, it is possible for the rollback to fail due to transaction no longer existing in Firestore. The retry logic will use attempts to rollback, and in the case where transaction no longer exists, all attempts to be exhausted attempted to rollback transaction.

This PR makes the rollback best effort, simply logging any rollback error and continuing.

@product-auto-label product-auto-label bot added size: s Pull request size is small. api: firestore Issues related to the googleapis/nodejs-firestore API. labels Jan 3, 2024
@tom-andersen tom-andersen changed the title Make rollback best effort. fix: Make transaction rollback best effort. Jan 3, 2024
@tom-andersen tom-andersen marked this pull request as ready for review January 3, 2024 19:37
@tom-andersen tom-andersen requested review from a team as code owners January 3, 2024 19:37
@tom-andersen tom-andersen added the owlbot:run Add this label to trigger the Owlbot post processor. label Jan 4, 2024
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Jan 4, 2024
'Best effort to rollback failed with error:',
reason
);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the only question is whether there are rollback failures that should be retried?

maybe MISSING_TRANSACTION_HANDLE should be the only one that's best-effort?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the documentation "Note that a rollback itself might fail, so the rollback should be a best-effort attempt only."

https://cloud.google.com/datastore/docs/best-practices#api_calls

However, I see your point.

This could be improved such that retryable errors from rollback cause rollback to be retried, except INTERNAL which happens when MISSING_TRANSACTION_HANDLE.

BUT, even if rollback experiences a non-retryable error, the transaction should still retry. That was not the case when I looked at Java, which seems completely wrong even if it was baked into a test case.

The phrasing "Best effort" leaves some nuance to be interpreted.

Ideally MISSING_TRANSACTION_HANDLE should just be a no-op on Firestore backend, and NOT throw an INTERNAL error. This code might simply be working around what should be a change to the backend.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally MISSING_TRANSACTION_HANDLE should just be a no-op on Firestore backend, and NOT throw an INTERNAL error. This code might simply be working around what should be a change to the backend.

+1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be improved such that retryable errors from rollback cause rollback to be retried, except INTERNAL

Let's do this then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into code, and the underlying GAPIC client does do retries according to retry settings. Effectively, that means currently there is appropriate backoff, and 5 attempts to rollback before failure.

This second level of retry on a higher level was a complete duplication and would simply multiply the number of attempts with additional backoff.

In addition, after some investigation, the GAPIC level retry respects the error codes from service config. So it actually does a better job of determining when to retry than the higher transaction level retry.

In other words, we should be fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. Thanks

@ehsannas ehsannas assigned tom-andersen and unassigned ehsannas Jan 4, 2024
'Best effort to rollback failed with error:',
reason
);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be improved such that retryable errors from rollback cause rollback to be retried, except INTERNAL

Let's do this then

@ehsannas ehsannas removed their assignment Jan 5, 2024
@tom-andersen tom-andersen merged commit 1d76546 into main Jan 5, 2024
18 checks passed
@tom-andersen tom-andersen deleted the tomandersen/b/283455016 branch January 5, 2024 21:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: firestore Issues related to the googleapis/nodejs-firestore API. size: s Pull request size is small.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants