Transaction: on retry, replays should compare checksums of prior numbered statements that succeeded #34

Closed
odeke-em opened this issue Mar 12, 2020 · 0 comments · Fixed by #168
Labels: api: spanner, priority: p2, type: feature request

@odeke-em odeke-em commented Mar 12, 2020

Coming here from a project that plans on adding Cloud Spanner as a backend for Django.

In AUTOCOMMIT=off mode, we need to hold a Transaction for perhaps an indefinitely long time.
Cloud Spanner will abort:
a) Transactions that have not been used for 10 seconds or more -- we can periodically send a SELECT 1=1 to keep them active
b) Transactions even when they are kept refreshed, because Cloud Spanner has a high abort rate

Thus we need to retry Transactions!
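
For point (a), the periodic keep-alive could be sketched roughly as follows. `KeepAlive` and its `ping` callable are hypothetical names used only for illustration; nothing here is part of python-spanner, and whether a Transaction may safely be pinged from a second thread is an assumption that would need checking.

```python
import threading


class KeepAlive:
    """Calls `ping` every `interval` seconds on a background thread until stopped.

    Hypothetical helper: `ping` would be whatever sends the cheap
    statement (e.g. SELECT 1=1) on the open Transaction.
    """

    def __init__(self, ping, interval):
        self._ping = ping
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait() returns False on timeout (time to ping again)
        # and True once stop() has been called.
        while not self._stop.wait(self._interval):
            self._ping()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

With a real Transaction, `ping` might be something like `lambda: list(txn.execute_sql("SELECT 1=1"))`, with an interval comfortably below the 10 second idle limit. This only addresses (a); it does nothing for aborts of kind (b).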

Current retry

The current retry code in this repository simply re-invokes the function that was passed into *.run_in_transaction afresh with a new Transaction, as in this excerpt:

while True:
    if self._transaction is None:
        txn = self.transaction()
    else:
        txn = self._transaction
    if txn._transaction_id is None:
        txn.begin()
    try:
        attempts += 1
        return_value = func(txn, *args, **kw)
    except Aborted as exc:
        del self._transaction
        _delay_until_retry(exc, deadline, attempts)
        continue
    except GoogleAPICallError:
        del self._transaction
        raise
    except Exception:
        txn.rollback()
        raise

    try:
        txn.commit()
    except Aborted as exc:
        del self._transaction
        _delay_until_retry(exc, deadline, attempts)
    except GoogleAPICallError:
        del self._transaction
        raise
    else:
        return return_value

Recommended retry

However, @bvandiver explained the correct way to retry Transactions to me:

You are getting quite close to the implementation in the open source JDBC driver. Rather than re-inventing things, I would suggest following their implementation. Of note, your current replay mechanism can lead to wrong answers. Imagine the canonical "transfer balance" transaction which decrements the balance in acct A, then increases the balance in acct B. However, between abort and retry someone deletes acct A - resulting in money magically appearing in acct B and no error (the update silently fails to update any rows). The long and the short of it is that you need to hash the results of all queries + DML and confirm on your retry that they give the same answers. You need query too (think a query to check if there was sufficient balance in acct A).
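
To make the failure mode in that quote concrete, here is a small sketch that uses the standard-library sqlite3 module purely as a stand-in for Cloud Spanner. The `accounts` table and the `transfer` helper are made up for illustration. The point is only that an UPDATE matching no rows raises no error, so a blind replay "succeeds".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Naive transfer; returns the rows-affected count of each UPDATE."""
    debit = conn.execute(
        "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
    ).rowcount
    credit = conn.execute(
        "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
    ).rowcount
    return [debit, credit]


# First attempt: both statements update one row, then the transaction "aborts".
first_attempt = transfer(conn, "A", "B", 40)
conn.rollback()

# Between abort and retry, someone else deletes account A.
conn.execute("DELETE FROM accounts WHERE id = 'A'")
conn.commit()

# Blind replay: the debit silently matches zero rows, the credit still applies.
retry = transfer(conn, "A", "B", 40)
conn.commit()

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

Here `first_attempt` and `retry` differ in their first element (one row updated versus none), which is exactly the kind of difference that comparing per-statement results on retry would catch.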

a) For every result returned by an operation on a Transaction, compute its checksum and append it to an ordered (FIFO) list
b) The point at which the prior Transaction aborted marks the end of that list
c) When retrying the Transaction from the first statement, compare each result's checksum with the entry at the same ordinal position in the list -- if any of them don't match, abort the Transaction as not retryable

This is what the Java spanner-jdbc implementation does.
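
The three steps above could be sketched roughly as follows. This is not a port of the spanner-jdbc code: `ReplayLog`, `NotRetryable`, and `checksum` are hypothetical names, and hashing `repr()` of a result is an assumed stand-in for a stable serialization of rows and DML row counts.

```python
import hashlib


class NotRetryable(Exception):
    """Raised when a replayed statement returns a different result."""


def checksum(result):
    # Assumption: repr() is deterministic for the result values involved.
    return hashlib.sha256(repr(result).encode("utf-8")).hexdigest()


class ReplayLog:
    """Records one checksum per successful statement, in execution order."""

    def __init__(self):
        self._checksums = []  # index i holds the checksum of statement i
        self._position = 0    # index of the next statement to observe

    def start_replay(self):
        """Call after an abort, before re-running from the first statement."""
        self._position = 0

    def observe(self, result):
        digest = checksum(result)
        if self._position < len(self._checksums):
            # Replaying a statement that succeeded before: results must match.
            if digest != self._checksums[self._position]:
                raise NotRetryable(
                    f"statement {self._position} returned a different result on retry"
                )
        else:
            # First attempt, or past the point where the prior attempt aborted.
            self._checksums.append(digest)
        self._position += 1
        return result


log = ReplayLog()
log.observe([("A", 100)])  # query: balance check on account A
log.observe(1)             # DML: one row debited
log.start_replay()         # the transaction aborted; replay from statement 0
log.observe([("A", 100)])  # same answer as before, so the replay continues
try:
    log.observe(0)         # the debit now touches no rows
    mismatch_detected = False
except NotRetryable:
    mismatch_detected = True
```

Statements executed after the replayed prefix are simply recorded, so the same log keeps working if the retried attempt is itself aborted later.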

Suggestion

Implementing this feature outside of this package involves a whole lot of hacking: we would need to consume the raw data sent to StreamedResultSet, which in turn requires proto marshalling and wrapping StreamedResultSet. That is quite non-ideal and would still involve patches to python-spanner.
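
As a rough picture of the kind of wrapping meant here, the sketch below folds each row into a running checksum as the caller iterates. `ChecksummedRows` is a hypothetical name, it accepts any iterable of rows rather than a real result set, and it sidesteps the proto marshalling question entirely by hashing `repr()` of each row.

```python
import hashlib


class ChecksummedRows:
    """Iterator wrapper that hashes every row as it is consumed."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self._hash = hashlib.sha256()

    def __iter__(self):
        return self

    def __next__(self):
        row = next(self._rows)  # StopIteration propagates when rows run out
        # A newline separates rows so that row boundaries affect the digest.
        self._hash.update(repr(row).encode("utf-8") + b"\n")
        return row

    def hexdigest(self):
        """Checksum of the rows consumed so far."""
        return self._hash.hexdigest()


first = ChecksummedRows([("A", 100), ("B", 0)])
first_rows = list(first)
replayed = ChecksummedRows([("A", 100), ("B", 0)])
list(replayed)
changed = ChecksummedRows([("B", 0)])
list(changed)
```

Because the digest only covers rows that were actually read, a retry would need to consume the same number of rows before the two checksums are compared.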

@bvandiver and I chatted about this again today, and I also briefly raised this issue with @skuruppu this afternoon.

@product-auto-label product-auto-label bot added the api: spanner label Mar 12, 2020
@yoshi-automation yoshi-automation added triage me 🚨 labels Mar 13, 2020
@larkee larkee added type: feature request priority: p2 and removed 🚨 triage me labels Mar 22, 2020
@c24t c24t closed this in #168 Nov 23, 2020
gcf-merge-on-green bot pushed a commit that referenced this issue Nov 30, 2020