Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed background apply fiber race condition in raft::state_machine_manager #16850

Merged

Conversation

mmaslankaprv
Copy link
Member

@mmaslankaprv mmaslankaprv commented Mar 4, 2024

raft::state_machine_manager uses background apply fiber to
individually apply batches to state machines which are behind the main
apply fiber. When background apply fiber is active it reads and apply
batches up to current committed offset. When background apply fiber is
active it acquires the mutex. When mutex is acquired the main apply
fiber do not consider the stm as up to date.

The code was prone to very rare race condition as the background apply
was finished in one continuation but the units were release in
subsequent finally block. Normally this approach is harmless as the
semaphore is waited for and it will be signalled after the finally
fiber is executed. In state_machine_manager we only check if the
semaphore is available, this makes the solution vulnerable to timing.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Mar 4, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/45620#018e0a1b-796c-4865-ba57-5fc00b08a5e8:

"rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type=CloudStorageType.ABS.test_case=.TS_Read==True.AdjacentSegmentMergerReupload==True.SpilloverManifestUploaded==True"

new failures in https://buildkite.com/redpanda/redpanda/builds/45628#018e0ab3-5547-482a-9e1a-35279f46e5b1:

"rptest.tests.retention_policy_test.ShadowIndexingCloudRetentionTest.test_cloud_time_based_retention.cloud_storage_type=CloudStorageType.S3"

`raft::state_machine_manager` uses background apply fiber to
individually apply batches to state machines which are behind the main
apply fiber. When background apply fiber is active it reads and apply
batches up to current committed offset. When background apply fiber is
active it acquires the mutex. When mutex is acquired the main apply
fiber do not consider the stm as up to date.

The code was prone to very rare race condition as the background apply
was finished in one continuation but the units were release in
subsequent `finally` block. Normally this approach is harmless as the
semaphore is waited for and it will be signaled after the `finally`
fiber is executed. In `state_machine_manager` we only check if the
semaphore is available, this makes the solution vulnerable to timing.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
Added state machine manager test waiting for batches to be applied after
each replicate. This test is designed to detect a situation in which
background apply fiber finishes but managed stm is still behind the
others.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
@mmaslankaprv
Copy link
Member Author

/ci-repeat 1

@mmaslankaprv mmaslankaprv merged commit 14dd782 into redpanda-data:dev Mar 5, 2024
16 checks passed
@mmaslankaprv mmaslankaprv deleted the stm-manager-race-condition-fix branch March 5, 2024 11:31
@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants