Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling for mediator time jumps in datashard #2342

Merged
merged 1 commit into from
Feb 29, 2024

Conversation

snaury
Copy link
Member

@snaury snaury commented Feb 29, 2024

Changelog entry

Better handling for mediator time jumps in datashard

Changelog category

  • Bugfix

Additional information

While investigating G2-item and G-single-item anomalies detected with Jepsen, it was discovered that datashards didn't handle mediator time jumps very well. When mediator is restarted it may replay transaction stream that has not been acknowledged yet. This in turn could cause time cast atomic variable to jump backwards, and lead to confusion, where a chosen mvcc version wouldn't later produce intended side-effects and edge promotions. For example a write version may chose the current mediator step, but later current step jumps backwards, and PromoteCompleteEdge is not called, because the write is "in the future". This could theoretically cause later reads to incorrectly choose an earlier version (based on a concurrent distributed transaction) than intended. It's unclear whether there's an actual bug though, since PromoteImmediatePostExecuteEdges currently calls MarkPlannedLogicallyCompleteUpTo (which also calls PromoteCompleteEdge for all earlier inflight distributed transactions), however we may want to remove that call later (to avoid unintended writes when performing reads concurrently with distributed transactions), and the current code is not robust enough.

This PR has two fixes. First is to never allow atomic time cast variable to go backwards (it's too difficult to reason about code correctness otherwise). Second is to unambiguously choose mvcc versions: for new reads to always include all previously replied immediate writes, and for new writes to always happen after all previously performed immediate writes.

Fixes KIKIMR-21065.

Copy link

github-actions bot commented Feb 29, 2024

2024-02-29 12:50:20 UTC Pre-commit check for bbb52c1 has started.
2024-02-29 12:50:22 UTC Build linux-x86_64-release-cmake14 is running...
🟢 2024-02-29 12:52:46 UTC Build successful.

Copy link

github-actions bot commented Feb 29, 2024

2024-02-29 12:50:28 UTC Pre-commit check for bbb52c1 has started.
2024-02-29 12:50:32 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-29 12:53:07 UTC Build successful.
2024-02-29 12:53:19 UTC Tests are running...
🔴 2024-02-29 14:36:23 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14891 14722 0 27 117 25

Copy link

github-actions bot commented Feb 29, 2024

2024-02-29 12:50:37 UTC Pre-commit check for bbb52c1 has started.
2024-02-29 12:50:39 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-29 12:52:54 UTC Build successful.
2024-02-29 12:53:07 UTC Tests are running...
🔴 2024-02-29 14:07:08 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
68025 57112 0 4 10890 19

@snaury snaury marked this pull request as ready for review February 29, 2024 14:34
@snaury snaury merged commit cd8c306 into ydb-platform:main Feb 29, 2024
5 of 7 checks passed
@snaury snaury deleted the KIKIMR-21065-fix-read-version branch February 29, 2024 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants