Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reader_concurrency_semaphore: Fix stop() in face of evictable reads becoming inactive #15535

Conversation

raphaelsc
Copy link
Member

Scylla can crash due to a complicated interaction of service level drop, evictable readers, inactive read registration path.

  1. service level drop invoke stop of reader concurrency semaphore, which will wait for in flight requests

  2. turns out it stops first the gate used for closing readers that will become inactive.

  3. proceeds to wait for in-flight reads by closing the reader permit gate.

  4. one of evictable reads take the inactive read registration path, and finds the gate for closing readers closed.

  5. flat mutation reader is destroyed, but finds the underlying reader was not closed gracefully and triggers the abort.

By closing permit gate first, evictable readers becoming inactive will be able to properly close underlying reader, therefore avoiding the crash.

Fixes #15534.

…ecoming inactive

Scylla can crash due to a complicated interaction of service level drop,
evictable readers, inactive read registration path.

1) service level drop invoke stop of reader concurrency semaphore, which will
wait for in flight requests

2) turns out it stops first the gate used for closing readers that will
become inactive.

3) proceeds to wait for in-flight reads by closing the reader permit gate.

4) one of evictable reads take the inactive read registration path, and
finds the gate for closing readers closed.

5) flat mutation reader is destroyed, but finds the underlying reader was
not closed gracefully and triggers the abort.

By closing permit gate first, evictable readers becoming inactive will
be able to properly close underlying reader, therefore avoiding the
crash.

Fixes scylladb#15534.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
@github-actions github-actions bot deleted a comment from aws-amplify-us-east-2 bot Sep 24, 2023
@scylladb-promoter
Copy link
Contributor

denesb pushed a commit that referenced this pull request Sep 29, 2023
…ecoming inactive

Scylla can crash due to a complicated interaction of service level drop,
evictable readers, inactive read registration path.

1) service level drop invoke stop of reader concurrency semaphore, which will
wait for in flight requests

2) turns out it stops first the gate used for closing readers that will
become inactive.

3) proceeds to wait for in-flight reads by closing the reader permit gate.

4) one of evictable reads take the inactive read registration path, and
finds the gate for closing readers closed.

5) flat mutation reader is destroyed, but finds the underlying reader was
not closed gracefully and triggers the abort.

By closing permit gate first, evictable readers becoming inactive will
be able to properly close underlying reader, therefore avoiding the
crash.

Fixes #15534.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #15535

(cherry picked from commit 914cbc1)
denesb pushed a commit that referenced this pull request Sep 29, 2023
…ecoming inactive

Scylla can crash due to a complicated interaction of service level drop,
evictable readers, inactive read registration path.

1) service level drop invoke stop of reader concurrency semaphore, which will
wait for in flight requests

2) turns out it stops first the gate used for closing readers that will
become inactive.

3) proceeds to wait for in-flight reads by closing the reader permit gate.

4) one of evictable reads take the inactive read registration path, and
finds the gate for closing readers closed.

5) flat mutation reader is destroyed, but finds the underlying reader was
not closed gracefully and triggers the abort.

By closing permit gate first, evictable readers becoming inactive will
be able to properly close underlying reader, therefore avoiding the
crash.

Fixes #15534.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #15535

(cherry picked from commit 914cbc1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scylla can crash when stopping reader concurrency semaphore
3 participants