Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: transaction retries should not timeout #1009

Merged
merged 1 commit into from Mar 24, 2021
Merged

Conversation

olavloite
Copy link
Contributor

@olavloite olavloite commented Mar 23, 2021

Transactions that are retried because of an aborted transaction use the retry settings of the Rollback RPC. This ensures reasonable backoff values. It however also meant that transactions that are retried multiple times could exceed the total timeout of the retry settings, and that again would cause the Aborted error to propagate. This change sets the total timeout for transaction retries to 24 hours and disables any max attempts in the retry settings to prevent retries to fail because the deadline is exceeded.

Transactions can still fail with timeout errors if individual RPC invocations exceed the configured timeout of that RPC. This change only prevents timeouts from occurring because of repeated retries of an entire transaction.

Fixes #1008

Transactions that are retried because of an aborted transaction use the
retry settings of the Rollback RPC. This ensures reasonable backoff values.
It however also meant that transactions that are retried multiple times
could exceed the total timeout of the retry settings, and that again would
cause the Aborted error to propagate. This change sets the total timeout
for transaction retries to 24 hours and disables any max attempts in the
retry settings to prevent retries to fail because the deadline is exceeded.

Transactions can still fail with timeout errors if individual RPC invocations
exceed the configured timeout of that RPC. This change only prevents timeouts
from occurring because of repeated retries of an entire transaction.

Fixes #1008
@olavloite olavloite requested a review from thiagotnunes Mar 23, 2021
@olavloite olavloite requested a review from as a code owner Mar 23, 2021
@google-cla google-cla bot added the cla: yes label Mar 23, 2021
@product-auto-label product-auto-label bot added the api: spanner label Mar 23, 2021
@codecov
Copy link

@codecov codecov bot commented Mar 23, 2021

Codecov Report

Merging #1009 (6e1d08f) into master (a95f6f8) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1009      +/-   ##
============================================
- Coverage     85.12%   85.11%   -0.01%     
+ Complexity     2619     2618       -1     
============================================
  Files           154      154              
  Lines         14324    14329       +5     
  Branches       1334     1334              
============================================
+ Hits          12193    12196       +3     
- Misses         1569     1570       +1     
- Partials        562      563       +1     
Impacted Files Coverage Δ Complexity Δ
...a/com/google/cloud/spanner/SpannerRetryHelper.java 84.61% <100.00%> (+3.66%) 3.00 <3.00> (ø)
...ain/java/com/google/cloud/spanner/SessionPool.java 88.93% <0.00%> (-0.20%) 71.00% <0.00%> (-1.00%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a95f6f8...6e1d08f. Read the comment docs.

@olavloite olavloite merged commit 6d9c3b8 into master Mar 24, 2021
17 of 18 checks passed
@olavloite olavloite deleted the transaction-timeout branch Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner cla: yes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants