Fix lost coordinator steps during mediator reconnect race #2037

Merged
merged 1 commit into from
Feb 16, 2024

Conversation

snaury
Member

@snaury snaury commented Feb 16, 2024

Changelog entry

Fix lost coordinator steps during mediator reconnect race.

Changelog category

  • Bugfix

Additional information

The coordinator's mediator queues were changed in 24-1 to support efficient planning of volatile transactions. Unfortunately, the mediator queue actor did not handle mediator steps during a mediator reconnect (it never needed to, since the old code reloaded the full queue from disk on every reconnect), and in rare failure modes this could cause entire planned steps to be lost at some mediators. The fix is to handle mediator steps during re-sync. Fixes KIKIMR-21077.
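
To illustrate the race described above, here is a minimal toy model in Python. It is not the YDB implementation (which is C++), and every name in it (`MediatorQueueModel`, `on_step`, `on_disconnect`, `on_sync_complete`, `handle_steps_during_sync`) is hypothetical. It only assumes that a queue keeps steps in memory and replays them to the mediator after a reconnect, and it shows how a step arriving in the re-sync window is lost if the queue ignores it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MediatorQueueModel:
    """Toy model of a per-mediator step queue (hypothetical, not YDB code)."""

    # True models the fixed behavior; False models the bug.
    handle_steps_during_sync: bool = True
    syncing: bool = False
    # Steps kept in memory so they can be replayed after a reconnect.
    pending: List[int] = field(default_factory=list)
    # Steps the mediator has received in the current session.
    delivered: List[int] = field(default_factory=list)

    def on_step(self, step: int) -> None:
        if self.syncing and not self.handle_steps_during_sync:
            # Buggy path: the step is neither stored nor sent, so it is lost.
            return
        self.pending.append(step)
        if not self.syncing:
            self.delivered.append(step)

    def on_disconnect(self) -> None:
        # The mediator session is gone; a re-sync is needed on reconnect.
        self.syncing = True
        self.delivered = []

    def on_sync_complete(self) -> None:
        # Replay everything kept in memory, in order.
        self.syncing = False
        self.delivered.extend(self.pending)


def run_scenario(handle_steps_during_sync: bool) -> List[int]:
    q = MediatorQueueModel(handle_steps_during_sync=handle_steps_during_sync)
    q.on_step(1)
    q.on_disconnect()
    q.on_step(2)  # arrives while the queue is re-syncing
    q.on_sync_complete()
    q.on_step(3)
    return q.delivered
```

In this sketch, `run_scenario(False)` leaves the mediator without step 2, while `run_scenario(True)` delivers all three steps in order. The real fix may differ in how it stores and confirms steps; the model only captures the ordering problem.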

@snaury snaury marked this pull request as ready for review February 16, 2024 16:38

github-actions bot commented Feb 16, 2024

2024-02-16 16:39:12 UTC Pre-commit check for 0c4f274 has started.
2024-02-16 16:39:14 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-16 16:41:15 UTC Build successful.
2024-02-16 16:41:29 UTC Tests are running...
🔴 2024-02-16 17:57:13 UTC Some tests failed, follow the links below.

Test history

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 67529 | 56515  | 0      | 5      | 10976   | 33     |

github-actions bot commented Feb 16, 2024

2024-02-16 16:41:49 UTC Pre-commit check for 0c4f274 has started.
2024-02-16 16:41:52 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-16 16:44:07 UTC Build successful.
2024-02-16 16:44:25 UTC Tests are running...
🔴 2024-02-16 18:29:40 UTC Some tests failed, follow the links below.

Test history

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 14778 | 14546  | 0      | 26     | 165     | 41     |

@snaury snaury merged commit 71f7291 into ydb-platform:main Feb 16, 2024
4 of 6 checks passed
@snaury snaury deleted the KIKIMR-21077-fix-missing-steps branch February 16, 2024 18:13
@mvgorbunov mvgorbunov mentioned this pull request Feb 22, 2024
@shnikd shnikd mentioned this pull request Feb 27, 2024
@mregrock mregrock mentioned this pull request May 15, 2024
This was referenced Jun 7, 2024
@CyberROFL CyberROFL mentioned this pull request Jun 19, 2024