[DNM] util/quotapool: add LIFO queueing discipline #46655
Draft
ajwerner wants to merge 1 commit into cockroachdb:master
Conversation
Member
7a02b3a to 3a5ae6d
Contributor
Author
@nvanbenschoten @tbg the LIFO generalization of the quotapool.
Contributor
Author
It should be easy enough to replace the notifyQueue with an interface backed by more generalized implementations. This work helps along that path by lifting some of the responsibility up.
Sometimes you care more about something getting done than things getting done in order. This change is motivated by range snapshots. We currently have a timeout when sending a snapshot. In the short term, the easier fix for bad queueing behavior on the snapshot semaphore is to not wait on it for too long. Nevertheless, it motivated the typing of this change so I figure I'll post it.

One point of discussion is the behavior when an acquire call is at the front of the queue and then fails to acquire because there was insufficient quota. Upon release of later quota, if there had been

Release justification: doesn't have one, will wait to merge it.

Release note: None
3a5ae6d to df5d7dd
nvb added a commit to nvb/cockroach that referenced this pull request on Nov 30, 2021

Alternative to cockroachdb#46655.

This commit introduces a new cluster setting called `kv.snapshot_receiver.queue_timeout_fraction` which dictates the fraction of a snapshot's total timeout that it is allowed to spend queued on the receiver waiting for a reservation. Enforcement of this snapshotApplySem-scoped timeout is intended to prevent starvation of snapshots in cases where a queue of snapshots waiting for reservations builds and no single snapshot acquires the semaphore with sufficient time to complete, but each holds the semaphore long enough to ensure that later snapshots in the queue encounter this same situation. This is a case of FIFO queuing + timeouts leading to starvation. By rejecting snapshot attempts earlier, we ensure that those that do acquire the semaphore have sufficient time to complete.

The commit adds a new test called `TestReserveSnapshotQueueTimeoutAvoidsStarvation` which reproduces this starvation without the fix. With the fix, the test passes and goodput never collapses to 0.

This is an alternative to strict LIFO queueing (cockroachdb#46655) and an alternative to Adaptive LIFO queueing (https://queue.acm.org/detail.cfm?id=2839461). The former avoids starvation but at the expense of fairness even under low but steady concurrency. The latter avoids compromising on fairness until it switches from FIFO to LIFO, but is fairly complex. The approach taken in this PR is a compromise that does not trade fairness under low concurrency and is still relatively simple, but does retain some risk of starvation in the case where `totalTimeout - queueTimeout < processingTime`. The default settings ensure that `processingTime` needs to be at least `30s` (assuming `kv.queue.process.guaranteed_time_budget` is used) before this will become a problem in practice.

Release notes (bug fix): Raft snapshots no longer risk starvation under very high concurrency. Before this fix, it was possible that a thundering herd of Raft snapshots could be starved and prevented from succeeding due to timeouts, which were accompanied by errors like `error rate limiting bulk io write: context deadline exceeded`.
nvb added a commit to nvb/cockroach that referenced this pull request on Nov 30, 2021 (same commit message as above)
nvb added a commit to nvb/cockroach that referenced this pull request on Dec 23, 2021 (same commit message as above)
craig bot pushed a commit that referenced this pull request on Dec 23, 2021

73288: kv: apply limited timeout to snapshots waiting in reservation queue r=tbg,erikgrinaker a=nvanbenschoten (same commit message as above)

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
gustasva pushed a commit to gustasva/cockroach that referenced this pull request on Jan 4, 2022 (same commit message as above)
nvb added a commit to nvb/cockroach that referenced this pull request on May 16, 2022 (same commit message as above)