kv,kvcoord: assign high priority for admission control if the txn has…#69337
Merged
craig[bot] merged 1 commit into cockroachdb:master on Aug 25, 2021
Member
RaduBerinde
approved these changes
Aug 24, 2021
What a dramatic difference in tail latencies!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)
590e667 to 120ac09
sumeerbhola
commented
Aug 25, 2021
Collaborator
Author
TFTR!
I've also done a run with no admission control, since the lack of prioritization of lock holders could make that case worse. Throughput and latency are both worse with no admission control. See attached screenshots. In this experiment both CPU and storage became a bottleneck on one node, whose runnable goroutines exceeded 100 and whose L0 sub-levels exceeded 30.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner)
Member
Very cool results!
Collaborator
Author
bors r+
Contributor
Build succeeded:
… locks
This is a crude way to limit priority inversion where a txn holding locks
could be waiting in an admission queue while admitted requests are waiting
in the lock table queues for this txn to make progress and release locks.
It can also fare better than no admission control, since work from txns
holding locks gets prioritized, whereas the goroutine scheduler does no
such prioritization.
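As a rough illustration of the idea above (not the code in this PR), the sketch below models an admission queue in which a request from a txn that already holds locks is bumped to high priority, so it is admitted ahead of earlier-arriving normal-priority work. All names here (`WorkPriority`, `priorityFor`, `admissionQueue`, `admitOrder`, `request`) are hypothetical and chosen for the example; the real admission control package is considerably more involved.

```go
package main

import (
	"container/heap"
	"fmt"
)

// WorkPriority is a hypothetical admission priority; larger is admitted first.
type WorkPriority int

const (
	NormalPri WorkPriority = 0
	HighPri   WorkPriority = 1
)

// priorityFor raises the priority to HighPri when the txn already holds
// locks, so that it is less likely to sit in the admission queue while other
// requests wait on its locks. Otherwise the base priority is kept.
func priorityFor(base WorkPriority, holdsLocks bool) WorkPriority {
	if holdsLocks && base < HighPri {
		return HighPri
	}
	return base
}

// request is a hypothetical incoming unit of work.
type request struct {
	txn        string
	holdsLocks bool
}

// work is a queued request with its assigned priority and arrival order.
type work struct {
	txn string
	pri WorkPriority
	seq int // arrival order; breaks ties FIFO
}

// admissionQueue is a heap ordered by priority (descending), then arrival.
type admissionQueue []work

func (q admissionQueue) Len() int { return len(q) }
func (q admissionQueue) Less(i, j int) bool {
	if q[i].pri != q[j].pri {
		return q[i].pri > q[j].pri
	}
	return q[i].seq < q[j].seq
}
func (q admissionQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *admissionQueue) Push(x interface{}) { *q = append(*q, x.(work)) }
func (q *admissionQueue) Pop() interface{} {
	old := *q
	n := len(old)
	w := old[n-1]
	*q = old[:n-1]
	return w
}

// admitOrder enqueues requests in arrival order and returns the txn names in
// the order the queue would admit them.
func admitOrder(reqs []request) []string {
	q := &admissionQueue{}
	for i, r := range reqs {
		heap.Push(q, work{txn: r.txn, pri: priorityFor(NormalPri, r.holdsLocks), seq: i})
	}
	var order []string
	for q.Len() > 0 {
		order = append(order, heap.Pop(q).(work).txn)
	}
	return order
}

func main() {
	order := admitOrder([]request{
		{txn: "A", holdsLocks: false},
		{txn: "B", holdsLocks: false},
		{txn: "C", holdsLocks: true}, // arrives last but holds locks
	})
	fmt.Println(order) // prints [C A B]
}
```

In this toy model, txn C is admitted first even though it arrived last, which is the effect the change is after: the lock holder makes progress and releases its locks sooner, so fewer admitted requests pile up in the lock table queues.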
A tpcc run with 3000 warehouses shows 2x reduction in lock waiters and
10+% improvement in txn throughput with this change (both before and
after experiments running with admission control enabled). When compared
with admission control disabled, we see even higher improvements in lock
waiters and txn throughput.
Some before/after graphs from running tpcc with 3000 warehouses, and
comparison with no admission control.
before (lock waiters): [screenshot]
after (lock waiters): [screenshot]
no admission control (lock waiters): [screenshot]
before (txn throughput and latency): [screenshot]
after (txn throughput and latency): [screenshot]
no admission control (txn throughput and latency): [screenshot]
Informs #65955
Release note: None
Release justification: Low-risk update to new functionality.