Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: Move jepsen-initialized marker out of defer #37918

Merged
merged 1 commit into from Jun 3, 2019

Conversation

bdarnell
Copy link
Member

This was running whether the test succeeded or failed, which is silly.

Fixes #37831

Release note: None

This was running whether the test succeeded or failed, which is silly.

Fixes cockroachdb#37831

Release note: None
@bdarnell bdarnell requested a review from tbg May 29, 2019 18:00
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@bdarnell
Copy link
Member Author

bors r=tbg

1 similar comment
@bdarnell
Copy link
Member Author

bors r=tbg

craig bot pushed a commit that referenced this pull request May 31, 2019
37918: roachtest: Move jepsen-initialized marker out of defer r=tbg a=bdarnell

This was running whether the test succeeded or failed, which is silly.

Fixes #37831

Release note: None

Co-authored-by: Ben Darnell <ben@bendarnell.com>
@bobvawter
Copy link
Member

bors r=tbg

craig bot pushed a commit that referenced this pull request Jun 3, 2019
37900: tla-plus: QueryTxn on ambiguous QueryIntent failure during ParallelCommit r=nvanbenschoten a=nvanbenschoten

This PR starts by adding write pipelining and concurrent intent resolution by conflicting transactions to the parallel commits TLA+ spec. This causes an assertion to be triggered by the hazard discussed in #37866.

The assertion fires when the pre-commit QueryIntent for a pipelined write gets confused about an already resolved intent. This triggers a transaction retry, at which point the transaction record is unexpectedly already committed.

This PR then proposes a medium-term solution to #37866. In doing so, it resolved the model failure from the previous commit.

The medium-term solution is to catch `IntentMissingErrors` in `DistSender`'s `divideAndSendParallelCommit` method coming from the pre-commit QueryIntent batch (right around [here](https://github.com/cockroachdb/cockroach/blob/2428567cba5e0838615521cbc9d0e1310f0ee6ad/pkg/kv/dist_sender.go#L916)). When we see one of these errors, we immediately send a `QueryTxn` request to the transaction record. This will result in one of the four statuses:
1. PENDING: Unexpected because the parallel commit `EndTransactionRequest` succeeded. Ignore.
2. STAGING: Unambiguously not the issue from #37866. Ignore.
3. COMMITTED: Unambiguously the issue from #37866. Strip the error and return the updated proto.
4. ABORTED: Still ambiguous. Transform error into an AmbiguousCommitError and return.

This solution isolates the ambiguity caused by the loss of information during intent resolution to just the case where the result of the QueryTxn is ABORTED. This is because an ABORTED record can mean either 1) the transaction was ABORTED and the missing intent was removed or 2) the transaction was COMMITTED, all intents were resolved, and the transaction record was GCed.

This is a significant reduction in the cases where an AmbiguousCommitError will be needed and I suspect it will be able to tide us over until we're able to eliminate the loss of information caused by intent resolution almost entirely (e.g. by storing transaction IDs in committed values). There will still be some loss of information if we're not careful about MVCC GC, and it's still not completely clear to me how we'll need to handle that in every case. That's a discussion for a different time.

37918: roachtest: Move jepsen-initialized marker out of defer r=tbg a=bdarnell

This was running whether the test succeeded or failed, which is silly.

Fixes #37831

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Ben Darnell <ben@bendarnell.com>
@craig
Copy link
Contributor

craig bot commented Jun 3, 2019

Build succeeded

@craig craig bot merged commit 938cc60 into cockroachdb:master Jun 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

jepsen: /root/lein: No such file or directory
4 participants