Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async functions yield while traversing the gossiper endpoint state map #13899

Closed
bhalevy opened this issue May 16, 2023 · 4 comments · Fixed by #13900
Closed

Async functions yield while traversing the gossiper endpoint state map #13899

bhalevy opened this issue May 16, 2023 · 4 comments · Fixed by #13900
Assignees
Milestone

Comments

@bhalevy
Copy link
Member

bhalevy commented May 16, 2023

Currently, there are several places where async functions access the gossiper endpoint state map using get_endpoint_states (or internally in gossiper via _endpoint_state_map and they might preempt while traversing the map.

This is dangerous since rehashing the map keys while yielding may invalidate outstanding iterators.
See https://en.cppreference.com/w/cpp/container/unordered_map/insert

If rehashing occurs due to the insertion, all iterators are invalidated.

The places that need to be fixed according to my examination are:

co_await remove_endpoint(endpoint); // will put it in _just_removed_endpoints to respect quarantine delay

since a01c900 (5.0)

co_await coroutine::maybe_yield();

Since 84a78f4 (4.6)

sleep_abortable(std::chrono::seconds(1), _abort_source).get();

Since forever pretty much

@bhalevy bhalevy self-assigned this May 16, 2023
@bhalevy
Copy link
Member Author

bhalevy commented May 16, 2023

I think that we didn't hit this so far (that we know of) due to fortunate timing and the scarcity of node operations.

bhalevy added a commit to bhalevy/scylla that referenced this issue May 16, 2023
The map iterators might be invalidated while yielding
on insert if the map is rehashed.
See https://en.cppreference.com/w/cpp/container/unordered_map/insert

Refs scylladb#13899

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
bhalevy added a commit to bhalevy/scylla that referenced this issue May 16, 2023
The map iterators might be invalidated while yielding
on insert if the map is rehashed.
See https://en.cppreference.com/w/cpp/container/unordered_map/insert

Refs scylladb#13899

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
bhalevy added a commit to bhalevy/scylla that referenced this issue May 16, 2023
The map iterators might be invalidated while yielding
on insert if the map is rehashed.
See https://en.cppreference.com/w/cpp/container/unordered_map/insert

Refs scylladb#13899

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
bhalevy added a commit to bhalevy/scylla that referenced this issue May 16, 2023
The map iterators might be invalidated while yielding
on insert if the map is rehashed.
See https://en.cppreference.com/w/cpp/container/unordered_map/insert

Refs scylladb#13899

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
avikivity added a commit that referenced this issue May 16, 2023
… from Benny Halevy

This series introduces a new gossiper method: get_endpoints that returns a vector of endpoints (by value) based on the endpoint state map.

get_endpoints is used here by gossiper and storage_service for iterations that may preempt
instead of iterating direction over the endpoint state map (`_endpoint_state_map` in gossiper or via `get_endpoint_states()`) so to prevent use-after-free that may potentially happen if the map is rehashed while the function yields causing invalidation of the loop iterators.

Fixes #13899

Closes #13900

* github.com:scylladb/scylladb:
  storage_service: do not preempt while traversing endpoint_state_map
  gossiper: do not preempt while traversing endpoint_state_map
avikivity added a commit that referenced this issue May 16, 2023
… from Benny Halevy

This series introduces a new gossiper method: get_endpoints that returns a vector of endpoints (by value) based on the endpoint state map.

get_endpoints is used here by gossiper and storage_service for iterations that may preempt
instead of iterating direction over the endpoint state map (`_endpoint_state_map` in gossiper or via `get_endpoint_states()`) so to prevent use-after-free that may potentially happen if the map is rehashed while the function yields causing invalidation of the loop iterators.

Fixes #13899

Closes #13900

* github.com:scylladb/scylladb:
  storage_service: do not preempt while traversing endpoint_state_map
  gossiper: do not preempt while traversing endpoint_state_map
@DoronArazii DoronArazii added this to the 5.3 milestone Jun 20, 2023
@denesb
Copy link
Contributor

denesb commented Dec 15, 2023

Doesn't apply to 5.2 cleanly, @bhalevy please provide a backport PR.

bhalevy pushed a commit to bhalevy/scylla that referenced this issue Dec 17, 2023
… from Benny Halevy

This series introduces a new gossiper method: get_endpoints that returns a vector of endpoints (by value) based on the endpoint state map.

get_endpoints is used here by gossiper and storage_service for iterations that may preempt
instead of iterating direction over the endpoint state map (`_endpoint_state_map` in gossiper or via `get_endpoint_states()`) so to prevent use-after-free that may potentially happen if the map is rehashed while the function yields causing invalidation of the loop iterators.

\Fixes scylladb#13899

\Closes scylladb#13900

* github.com:scylladb/scylladb:
  storage_service: do not preempt while traversing endpoint_state_map
  gossiper: do not preempt while traversing endpoint_state_map

(cherry picked from commit d2d53fc)
@bhalevy
Copy link
Member Author

bhalevy commented Dec 17, 2023

Doesn't apply to 5.2 cleanly, @bhalevy please provide a backport PR.

@denesb: #16431

denesb pushed a commit that referenced this issue Dec 18, 2023
… from Benny Halevy

This series introduces a new gossiper method: get_endpoints that returns a vector of endpoints (by value) based on the endpoint state map.

get_endpoints is used here by gossiper and storage_service for iterations that may preempt
instead of iterating direction over the endpoint state map (`_endpoint_state_map` in gossiper or via `get_endpoint_states()`) so to prevent use-after-free that may potentially happen if the map is rehashed while the function yields causing invalidation of the loop iterators.

\Fixes #13899

\Closes #13900

* github.com:scylladb/scylladb:
  storage_service: do not preempt while traversing endpoint_state_map
  gossiper: do not preempt while traversing endpoint_state_map

(cherry picked from commit d2d53fc)

Closes #16431
@denesb
Copy link
Contributor

denesb commented Dec 18, 2023

Backport to 5.2 queued.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants