Fixed not applying snapshot for new stms #17112
The state machine manager manages the aggregated Raft snapshot for all the state machines created on top of one Raft instance. The managed snapshot is a map containing individual snapshot data for each state machine. When a new STM is created after the managed snapshot was already taken, `state_machine_base::apply_raft_snapshot` should still be called even if the STM's snapshot doesn't exist in the managed snapshot map. This way the STM is informed that the log doesn't start from 0. Previously, when the snapshot was not present in the managed snapshot map, we skipped calling `apply_snapshot` on the STM and didn't advance its `_next` offset, which led to a stuck background apply fiber loop.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
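The behavior described in the commit message can be sketched in a small, self-contained model. This is not the Redpanda implementation: `toy_stm`, `managed_snapshot`, and `apply_managed_snapshot` are hypothetical stand-ins, the calls are synchronous rather than Seastar coroutines, and `std::string` replaces `iobuf`. Handing an empty buffer to an STM that has no entry in the map is an assumption about how "no data, but the log starts after `last_offset`" could be signaled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a state machine; not Redpanda's state_machine_base.
struct toy_stm {
    explicit toy_stm(std::string n) : name(std::move(n)) {}

    std::string name;
    int64_t next = 0;   // next offset the background apply loop will read
    std::string state;  // state restored from snapshot data

    int64_t last_applied_offset() const { return next - 1; }

    // Restores state (possibly from empty data) and moves past the snapshot.
    void apply_raft_snapshot(const std::string& data, int64_t last_offset) {
        state = data;
        next = last_offset + 1;
    }
};

// Hypothetical aggregated snapshot: per-STM data keyed by STM name.
struct managed_snapshot {
    std::map<std::string, std::string> snapshot_map;
};

// Applies the aggregated snapshot to every STM that is behind last_offset.
// An STM missing from the map still gets the call, with empty data (an
// assumption of this sketch), so its next offset advances past the part of
// the log covered by the snapshot.
inline void apply_managed_snapshot(
  const managed_snapshot& snapshot,
  int64_t last_offset,
  std::vector<toy_stm>& stms) {
    static const std::string empty;
    for (auto& stm : stms) {
        if (stm.last_applied_offset() >= last_offset) {
            continue; // already at or past the snapshot
        }
        auto it = snapshot.snapshot_map.find(stm.name);
        const std::string& data
          = it != snapshot.snapshot_map.end() ? it->second : empty;
        stm.apply_raft_snapshot(data, last_offset);
        assert(stm.next == last_offset + 1);
    }
}
```

In this model, an STM registered after the snapshot was taken ends up with empty state but with `next` set to `last_offset + 1`, so an apply loop that reads from `next` would not keep waiting for offsets that are no longer in the log.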
force-pushed from 079a841 to b700b95
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46697#018e749b-d2dc-4e93-b812-57f7e9255718
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46697#018e749b-d2d6-42cd-b0be-326d63d097f1
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46713#018e76be-ac00-4a30-86c2-a8b6a3209b18
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46805#018e7bea-4ae4-4267-a8c4-2d9570bf07a0
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46805#018e7bea-4adb-48eb-836e-35a47429c9a2
/ci-repeat 1
if (stm_entry->stm->last_applied_offset() < last_offset) {
    if (it != snapshot.snapshot_map.end()) {
        co_await stm_entry->stm->apply_raft_snapshot(it->second);
This looks like a use-after-free waiting to happen. `apply_raft_snapshot` takes an `iobuf&`, but if it `co_await`s internally, the snapshot map may change while it is suspended, leaving the iterator (and the associated iobuf) invalid.
The snapshot map is not changing here. Once deserialized, it is set in stone.
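For context on the exchange above: the PR relies on the map being immutable after deserialization, so holding a reference across the `co_await` is safe there. If a map could change during a suspension, one defensive alternative is to take ownership of the entry before the suspension point. The sketch below illustrates that alternative only. It is not what the PR does, `snapshot_map_t`, `apply_owned`, and `apply_from_map` are hypothetical names, `std::string` replaces `iobuf`, and a callback simulates code that would run while a coroutine is suspended.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for per-STM snapshot storage.
using snapshot_map_t = std::unordered_map<std::string, std::string>;

// Hypothetical apply step. `on_suspend` simulates other code running while a
// coroutine would be suspended at a co_await.
inline std::string
apply_owned(std::string data, const std::function<void()>& on_suspend) {
    on_suspend(); // the map may be mutated here
    return data;  // still valid: this function owns `data`
}

// Copies the entry out of the map before the simulated suspension point, so
// no iterator or reference into the map is held across it.
inline std::string apply_from_map(
  snapshot_map_t& map,
  const std::string& key,
  const std::function<void()>& on_suspend) {
    auto it = map.find(key);
    std::string owned = it != map.end() ? it->second : std::string{};
    return apply_owned(std::move(owned), on_suspend);
}
```

The cost of this pattern is a copy of the snapshot data per STM, which is unnecessary when the map is guaranteed not to change, as the author states is the case here.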
/dt
ci failure: #17247
/backport v23.3.x
The state machine manager manages the aggregated Raft snapshot for all the
state machines created on top of one Raft instance. The managed snapshot
is a map containing individual snapshot data for each state machine.
When a new STM is created after the managed snapshot was already taken,
`state_machine_base::apply_raft_snapshot` should still be called even if
the STM's snapshot doesn't exist in the managed snapshot map.
This way the STM is informed that the log doesn't start from 0.
Previously, when the snapshot was not present in the managed snapshot map, we
skipped calling `apply_snapshot` on the STM and didn't advance its `_next`
offset, which led to a stuck background apply fiber loop.

Fixes: #17086
Backports Required
Release Notes
Bug Fixes