Skip to content

[DNM] util/quotapool: add LIFO queueing discipline#46655

Draft
ajwerner wants to merge 1 commit intocockroachdb:masterfrom
ajwerner:ajwerner/lifo-quotapool-queueing
Draft

[DNM] util/quotapool: add LIFO queueing discipline#46655
ajwerner wants to merge 1 commit intocockroachdb:masterfrom
ajwerner:ajwerner/lifo-quotapool-queueing

Conversation

@ajwerner
Copy link
Copy Markdown
Contributor

Sometimes you care more about something getting done than things getting done
in order. This change is motivated by range snapshots. We currently have a
timeout when sending a snapshot. In the short term, the easier fix for bad
queueing behavior on the snapshot semaphore is to not wait on it for too long.
Nevertheless, it motivated the typing of this change so I figure I'll post it.

One point of discussion is the behavior when an acquire call is at the front of
the queue and then fails to acquire because there was insufficient quota. Upon
release of later quota, if there had been

Release justification: doesn't have one, will wait to merge it.

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@ajwerner ajwerner force-pushed the ajwerner/lifo-quotapool-queueing branch from 7a02b3a to 3a5ae6d Compare June 17, 2020 15:55
@ajwerner
Copy link
Copy Markdown
Contributor Author

@nvanbenschoten @tbg the LIFO generalization of the quotapool.

@ajwerner
Copy link
Copy Markdown
Contributor Author

It should be easy enough to replace the notifyQueue with an interface and replace it with more generalized implementations. This work helps along that path by lifting some of the responsibility up.

Sometimes you care more about something getting done than things getting done
in order. This change is motivated by range snapshots. We currently have a
timeout when sending a snapshot. In the short term, the easier fix for bad
queueing behavior on the snapshot semaphore is to not wait on it for too long.
Nevertheless, it motivated the typing of this change so I figure I'll post it.

One point of discussion is the behavior when an acquire call is at the front of
the queue and then fails to acquire because there was insufficient quota. Upon
release of later quota, if there had been

Release justification: doesn't have one, will wait to merge it.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/lifo-quotapool-queueing branch from 3a5ae6d to df5d7dd Compare October 20, 2020 03:11
@tbg tbg added the X-noremind Bots won't notify about PRs with X-noremind label May 6, 2021
nvb added a commit to nvb/cockroach that referenced this pull request Nov 30, 2021
Alternative to cockroachdb#46655.

This commit introduces a new cluster setting called
`kv.snapshot_receiver.queue_timeout_fraction` which dictates the fraction of a
snapshot's total timeout that it is allowed to spend queued on the receiver
waiting for a reservation. Enforcement of this snapshotApplySem-scoped timeout
is intended to prevent starvation of snapshots in cases where a queue of
snapshots waiting for reservations builds and no single snapshot acquires the
semaphore with sufficient time to complete, but each holds the semaphore long
enough to ensure that later snapshots in the queue encounter this same
situation. This is a case of FIFO queuing + timeouts leading to starvation. By
rejecting snapshot attempts earlier, we ensure that those that do acquire the
semaphore have sufficient time to complete.

The commit adds a new test called `TestReserveSnapshotQueueTimeoutAvoidsStarvation`
which reproduces this starvation without the fix. With the fix, the test passes
and goodput never collapses to 0.

This is an alternative to strict LIFO queueing (cockroachdb#46655) and an alternative to
Adaptive LIFO queueing (https://queue.acm.org/detail.cfm?id=2839461). The former
avoids starvation but at the expense of fairness even under low but steady
concurrency. The later avoids compromising on fairness until it switches from
FIFO to LIFO, but is fairly complex. The approach taken in this PR is a
compromise that does not trade fairness under low concurrency and is still
relatively simple, but does retain some risk of starvation in the case where
`totalTimeout - queueTimeout < processingTime`. The default settings ensure that
`processingTime` needs to be at least `30s` (assuming `kv.queue.process.guaranteed_time_budget`
is used) before this will become a problem in practice.

Release notes (bug fix): Raft snapshots no longer risk starvation under
very high concurrency. Before this fix, it was possible that a thundering
herd of Raft snapshots could be starved and prevented from succeeding due
to timeouts, which were accompanied by errors like `error rate limiting
bulk io write: context deadline exceeded`.
nvb added a commit to nvb/cockroach that referenced this pull request Nov 30, 2021
Alternative to cockroachdb#46655.

This commit introduces a new cluster setting called
`kv.snapshot_receiver.queue_timeout_fraction` which dictates the fraction of a
snapshot's total timeout that it is allowed to spend queued on the receiver
waiting for a reservation. Enforcement of this snapshotApplySem-scoped timeout
is intended to prevent starvation of snapshots in cases where a queue of
snapshots waiting for reservations builds and no single snapshot acquires the
semaphore with sufficient time to complete, but each holds the semaphore long
enough to ensure that later snapshots in the queue encounter this same
situation. This is a case of FIFO queuing + timeouts leading to starvation. By
rejecting snapshot attempts earlier, we ensure that those that do acquire the
semaphore have sufficient time to complete.

The commit adds a new test called `TestReserveSnapshotQueueTimeoutAvoidsStarvation`
which reproduces this starvation without the fix. With the fix, the test passes
and goodput never collapses to 0.

This is an alternative to strict LIFO queueing (cockroachdb#46655) and an alternative to
Adaptive LIFO queueing (https://queue.acm.org/detail.cfm?id=2839461). The former
avoids starvation but at the expense of fairness even under low but steady
concurrency. The later avoids compromising on fairness until it switches from
FIFO to LIFO, but is fairly complex. The approach taken in this PR is a
compromise that does not trade fairness under low concurrency and is still
relatively simple, but does retain some risk of starvation in the case where
`totalTimeout - queueTimeout < processingTime`. The default settings ensure that
`processingTime` needs to be at least `30s` (assuming `kv.queue.process.guaranteed_time_budget`
is used) before this will become a problem in practice.

Release notes (bug fix): Raft snapshots no longer risk starvation under
very high concurrency. Before this fix, it was possible that a thundering
herd of Raft snapshots could be starved and prevented from succeeding due
to timeouts, which were accompanied by errors like `error rate limiting
bulk io write: context deadline exceeded`.
nvb added a commit to nvb/cockroach that referenced this pull request Dec 23, 2021
Alternative to cockroachdb#46655.

This commit introduces a new cluster setting called
`kv.snapshot_receiver.queue_timeout_fraction` which dictates the fraction of a
snapshot's total timeout that it is allowed to spend queued on the receiver
waiting for a reservation. Enforcement of this snapshotApplySem-scoped timeout
is intended to prevent starvation of snapshots in cases where a queue of
snapshots waiting for reservations builds and no single snapshot acquires the
semaphore with sufficient time to complete, but each holds the semaphore long
enough to ensure that later snapshots in the queue encounter this same
situation. This is a case of FIFO queuing + timeouts leading to starvation. By
rejecting snapshot attempts earlier, we ensure that those that do acquire the
semaphore have sufficient time to complete.

The commit adds a new test called `TestReserveSnapshotQueueTimeoutAvoidsStarvation`
which reproduces this starvation without the fix. With the fix, the test passes
and goodput never collapses to 0.

This is an alternative to strict LIFO queueing (cockroachdb#46655) and an alternative to
Adaptive LIFO queueing (https://queue.acm.org/detail.cfm?id=2839461). The former
avoids starvation but at the expense of fairness even under low but steady
concurrency. The later avoids compromising on fairness until it switches from
FIFO to LIFO, but is fairly complex. The approach taken in this PR is a
compromise that does not trade fairness under low concurrency and is still
relatively simple, but does retain some risk of starvation in the case where
`totalTimeout - queueTimeout < processingTime`. The default settings ensure that
`processingTime` needs to be at least `30s` (assuming `kv.queue.process.guaranteed_time_budget`
is used) before this will become a problem in practice.

Release notes (bug fix): Raft snapshots no longer risk starvation under
very high concurrency. Before this fix, it was possible that a thundering
herd of Raft snapshots could be starved and prevented from succeeding due
to timeouts, which were accompanied by errors like `error rate limiting
bulk io write: context deadline exceeded`.
craig bot pushed a commit that referenced this pull request Dec 23, 2021
73288: kv: apply limited timeout to snapshots waiting in reservation queue r=tbg,erikgrinaker a=nvanbenschoten

Alternative to #46655.

This commit introduces a new cluster setting called `kv.snapshot_receiver.queue_timeout_fraction` which dictates the fraction of a snapshot's total timeout that it is allowed to spend queued on the receiver waiting for a reservation. Enforcement of this snapshotApplySem-scoped timeout is intended to prevent starvation of snapshots in cases where a queue of snapshots waiting for reservations builds and no single snapshot acquires the semaphore with sufficient time to complete, but each holds the semaphore long enough to ensure that later snapshots in the queue encounter this same situation. This is a case of FIFO queuing + timeouts leading to starvation. By rejecting snapshot attempts earlier, we ensure that those that do acquire the semaphore have sufficient time to complete.

The commit adds a new test called `TestReserveSnapshotQueueTimeoutAvoidsStarvation` which reproduces this starvation without the fix. With the fix, the test passes and goodput never collapses to 0.

This is an alternative to strict LIFO queueing (#46655) and an alternative to Adaptive LIFO queueing (https://queue.acm.org/detail.cfm?id=2839461). The former avoids starvation but at the expense of fairness even under low but steady concurrency. The latter avoids compromising on fairness until it switches from FIFO to LIFO, but is fairly complex. The approach taken in this PR is a compromise that does not trade fairness under low concurrency and is still relatively simple, but does retain some risk of starvation in the case where `totalTimeout - queueTimeout < processingTime`. The default settings ensure that `processingTime` needs to be at least `30s` (assuming `kv.queue.process.guaranteed_time_budget` is used) before this will become a problem in practice.

Release notes (bug fix): Raft snapshots no longer risk starvation under very high concurrency. Before this fix, it was possible that a thundering herd of Raft snapshots could be starved and prevented from succeeding due to timeouts, which were accompanied by errors like `error rate limiting bulk io write: context deadline exceeded`.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
gustasva pushed a commit to gustasva/cockroach that referenced this pull request Jan 4, 2022
Alternative to cockroachdb#46655.

This commit introduces a new cluster setting called
`kv.snapshot_receiver.queue_timeout_fraction` which dictates the fraction of a
snapshot's total timeout that it is allowed to spend queued on the receiver
waiting for a reservation. Enforcement of this snapshotApplySem-scoped timeout
is intended to prevent starvation of snapshots in cases where a queue of
snapshots waiting for reservations builds and no single snapshot acquires the
semaphore with sufficient time to complete, but each holds the semaphore long
enough to ensure that later snapshots in the queue encounter this same
situation. This is a case of FIFO queuing + timeouts leading to starvation. By
rejecting snapshot attempts earlier, we ensure that those that do acquire the
semaphore have sufficient time to complete.

The commit adds a new test called `TestReserveSnapshotQueueTimeoutAvoidsStarvation`
which reproduces this starvation without the fix. With the fix, the test passes
and goodput never collapses to 0.

This is an alternative to strict LIFO queueing (cockroachdb#46655) and an alternative to
Adaptive LIFO queueing (https://queue.acm.org/detail.cfm?id=2839461). The former
avoids starvation but at the expense of fairness even under low but steady
concurrency. The later avoids compromising on fairness until it switches from
FIFO to LIFO, but is fairly complex. The approach taken in this PR is a
compromise that does not trade fairness under low concurrency and is still
relatively simple, but does retain some risk of starvation in the case where
`totalTimeout - queueTimeout < processingTime`. The default settings ensure that
`processingTime` needs to be at least `30s` (assuming `kv.queue.process.guaranteed_time_budget`
is used) before this will become a problem in practice.

Release notes (bug fix): Raft snapshots no longer risk starvation under
very high concurrency. Before this fix, it was possible that a thundering
herd of Raft snapshots could be starved and prevented from succeeding due
to timeouts, which were accompanied by errors like `error rate limiting
bulk io write: context deadline exceeded`.
nvb added a commit to nvb/cockroach that referenced this pull request May 16, 2022
Alternative to cockroachdb#46655.

This commit introduces a new cluster setting called
`kv.snapshot_receiver.queue_timeout_fraction` which dictates the fraction of a
snapshot's total timeout that it is allowed to spend queued on the receiver
waiting for a reservation. Enforcement of this snapshotApplySem-scoped timeout
is intended to prevent starvation of snapshots in cases where a queue of
snapshots waiting for reservations builds and no single snapshot acquires the
semaphore with sufficient time to complete, but each holds the semaphore long
enough to ensure that later snapshots in the queue encounter this same
situation. This is a case of FIFO queuing + timeouts leading to starvation. By
rejecting snapshot attempts earlier, we ensure that those that do acquire the
semaphore have sufficient time to complete.

The commit adds a new test called `TestReserveSnapshotQueueTimeoutAvoidsStarvation`
which reproduces this starvation without the fix. With the fix, the test passes
and goodput never collapses to 0.

This is an alternative to strict LIFO queueing (cockroachdb#46655) and an alternative to
Adaptive LIFO queueing (https://queue.acm.org/detail.cfm?id=2839461). The former
avoids starvation but at the expense of fairness even under low but steady
concurrency. The later avoids compromising on fairness until it switches from
FIFO to LIFO, but is fairly complex. The approach taken in this PR is a
compromise that does not trade fairness under low concurrency and is still
relatively simple, but does retain some risk of starvation in the case where
`totalTimeout - queueTimeout < processingTime`. The default settings ensure that
`processingTime` needs to be at least `30s` (assuming `kv.queue.process.guaranteed_time_budget`
is used) before this will become a problem in practice.

Release notes (bug fix): Raft snapshots no longer risk starvation under
very high concurrency. Before this fix, it was possible that a thundering
herd of Raft snapshots could be starved and prevented from succeeding due
to timeouts, which were accompanied by errors like `error rate limiting
bulk io write: context deadline exceeded`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

X-noremind Bots won't notify about PRs with X-noremind

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants