storage: relax assertion about "prevented" parallel commits that succeed #37784
Merged
craig[bot] merged 3 commits into cockroachdb:master on May 24, 2019
This commit adds intent resolution to the parallel commits spec. It then uses intent resolution to more accurately model the implementation of QueryIntent requests. Doing so reveals an overly strict assertion in the `RecoverRecord` step, which fires immediately.

Release note: None
I was seeing the following assertion fail when running TPC-C:

```
programming error: found COMMITTED record for prevented implicit commit
```

This was very concerning until I realized that the assertion was not taking into account how QueryIntent requests interact with resolved writes.

If a transaction recovery process believes it successfully prevented a write that was in-flight while a transaction was performing a parallel commit then we would expect that the transaction record could only be committed if it has a higher epoch or timestamp. This is true if the recovery process did actually prevent the in-flight write.

However, due to QueryIntent's implementation, a successful intent write that was already resolved after the parallel commit finished can be mistaken for a missing in-flight write by a recovery process. This ambiguity is harmless, as the transaction stays committed either way, but it means that we can't be quite as strict about what we assert in RecoverTxn as we would like to be.

This commit removes the assertion and leaves a comment explaining the ambiguity in its place.

Release note: None
Code movement and a better comment.

Release note: None
ajwerner approved these changes on May 24, 2019 and left a comment:
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)
Author (Contributor) commented:
TFTR.
bors r=ajwerner
craig[bot] pushed a commit that referenced this pull request on May 24, 2019
37784: storage: relax assertion about "prevented" parallel commits that succeed r=ajwerner a=nvanbenschoten

I was seeing the following assertion fire when running TPC-C:

```
programming error: found COMMITTED record for prevented implicit commit
```

This was very concerning until I realized that the assertion was not taking into account how QueryIntent requests interact with resolved writes.

If a transaction recovery process believes it successfully prevented a write that was in-flight while a transaction was performing a parallel commit then we would expect that the transaction record could only be committed if it has a higher epoch or timestamp. This is true if the recovery process did actually prevent the in-flight write.

However, due to QueryIntent's implementation, a successful intent write that was already resolved after the parallel commit finished can be mistaken for a missing in-flight write by a recovery process. This ambiguity is harmless, as the transaction stays committed either way, but it means that we can't be quite as strict about what we assert in RecoverTxn as we would like to be.

The PR starts by modeling intent resolution to demonstrate that it can cause the assertion to fire. It then fixes (read: removes) the assertion in the next commit.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Build succeeded
I was seeing the following assertion fire when running TPC-C: `programming error: found COMMITTED record for prevented implicit commit`
This was very concerning until I realized that the assertion was not taking into account how QueryIntent requests interact with resolved writes.
If a transaction recovery process believes it successfully prevented a write that was in-flight while a transaction was performing a parallel commit then we would expect that the transaction record could only be committed if it has a higher epoch or timestamp. This is true if the recovery process did actually prevent the in-flight write.
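The expectation in the paragraph above can be written as a small predicate. This is a minimal sketch, not the code in `RecoverTxn`: the `txnRecord` type, its fields, and `consistentWithPrevention` are hypothetical names invented for illustration, and timestamps are reduced to plain integers.

```go
package main

import "fmt"

// NOTE: every name in this file is hypothetical. It only models the
// expectation stated in the PR description, not CockroachDB's real types.

type status int

const (
	staging status = iota
	committed
	aborted
)

// txnRecord is a toy stand-in for a transaction record.
type txnRecord struct {
	status    status
	epoch     int
	timestamp int64
}

// consistentWithPrevention reports whether the record that recovery finds is
// compatible with having truly prevented an in-flight write of the staged
// attempt. A COMMITTED record is only expected if it belongs to a later
// attempt, that is, one with a higher epoch or a higher timestamp.
func consistentWithPrevention(staged, found txnRecord) bool {
	if found.status != committed {
		return true
	}
	return found.epoch > staged.epoch || found.timestamp > staged.timestamp
}

func main() {
	staged := txnRecord{status: staging, epoch: 1, timestamp: 10}

	// Committed at a higher epoch: expected.
	fmt.Println(consistentWithPrevention(staged, txnRecord{status: committed, epoch: 2, timestamp: 10}))
	// Committed at a higher timestamp: expected.
	fmt.Println(consistentWithPrevention(staged, txnRecord{status: committed, epoch: 1, timestamp: 11}))
	// Committed at the same epoch and timestamp: the case the strict assertion rejected.
	fmt.Println(consistentWithPrevention(staged, txnRecord{status: committed, epoch: 1, timestamp: 10}))
}
```

The last call returns `false`, which corresponds to the condition under which the strict assertion would have fired.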
However, due to QueryIntent's implementation, a successful intent write that was already resolved after the parallel commit finished can be mistaken for a missing in-flight write by a recovery process. This ambiguity is harmless, as the transaction stays committed either way, but it means that we can't be quite as strict about what we assert in RecoverTxn as we would like to be.
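The ambiguity can be illustrated with a toy key-value store. This is a sketch under simplifying assumptions, not the real QueryIntent implementation: `store`, `intent`, `writeIntent`, `resolveIntent`, and `queryIntent` are hypothetical names, and the model assumes that resolving an intent replaces it with a plain committed value that no longer carries the writing transaction's identity.

```go
package main

import "fmt"

// NOTE: all types and functions here are hypothetical and exist only to
// illustrate the ambiguity described in the PR.

// intent identifies a provisional write by transaction and sequence number.
type intent struct {
	txn string
	seq int
}

// store is a toy key-value store that keeps intents apart from committed values.
type store struct {
	intents   map[string]intent
	committed map[string]string
}

func newStore() *store {
	return &store{intents: map[string]intent{}, committed: map[string]string{}}
}

// writeIntent lays down a provisional write at key.
func (s *store) writeIntent(key string, in intent) {
	s.intents[key] = in
}

// resolveIntent turns the intent at key into a committed value and drops the
// record of which transaction wrote it.
func (s *store) resolveIntent(key, value string) {
	delete(s.intents, key)
	s.committed[key] = value
}

// queryIntent reports whether the expected intent is currently present at key.
func (s *store) queryIntent(key string, want intent) bool {
	got, ok := s.intents[key]
	return ok && got == want
}

func main() {
	want := intent{txn: "txn1", seq: 1}

	// Case 1: the in-flight write never reached the store.
	neverWritten := newStore()

	// Case 2: the write succeeded, the parallel commit finished, and the
	// intent was then resolved into a committed value.
	resolved := newStore()
	resolved.writeIntent("k", want)
	resolved.resolveIntent("k", "v")

	// Both queries report a missing intent, so a recovery process in this
	// model cannot tell the two cases apart.
	fmt.Println(neverWritten.queryIntent("k", want), resolved.queryIntent("k", want))
}
```

In this model a recovery process that sees "intent missing" in the second case would conclude it prevented the write, even though the transaction had already committed at the same epoch and timestamp.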
The PR starts by modeling intent resolution to demonstrate that it can cause the assertion to fire. It then fixes (read: removes) the assertion in the next commit.