Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: simplify and track entire set of gossiped IOThresholds #85739

Merged
merged 1 commit into from
Aug 15, 2022

Conversation

tbg
Copy link
Member

@tbg tbg commented Aug 8, 2022

This commit makes the following changes:

  • track all IOThresholds in the store's map, not just the ones for
    overloaded stores.
  • improve the container for these IOThresholds to be easier to work
    with.
  • Rather than "hard-coding" a value of "1.0" to mean overloaded,
    use (and plumb) the value of the cluster setting. "1.0" is the
    value at which I/O admission control chooses to engage; but
    the cluster setting is usually smaller and determines when to
    consider followers on a remote store pausable. The API now
    reflects that and avoids this kind of confusion.
  • Rename all uses of the container away from "overload" towards
    "IOThreshold".
  • add a Sequence() method that is bumped whenever the set of Stores
    whose IOThreshold score indicates pausability changes.

I originally started to work on this to address #84465, but realized
that we couldn't "just" leave the set of paused followers untouched
absent sequence changes. This is because the set of paused followers
has additional inputs, most importantly the set of live followers.
This set is per-Replica and subject to change, so we can't be too
sure the outcome would be the same, and we do want to be reactive
to followers becoming nonresponsive by, if necessary, unpausing
followers.

I think we will have to address #84465 by reducing the frequency
at which the paused stores are revisited, but adding an eager
pass whenever the sequence is bumped.

Additionally, for #84252, we are likely also going to be able to rely on
the sequence number to trigger unquiescing of ranges that were
previously quiesced in the presence of a paused follower.

Regardless of these future possible uses, this is a nice conceptual
clean-up and a good last PR to land for pausing in the 22.2 cycle
(unless we find time to add quiescence in presence of paused followers,
in which case that would be worthy follow-up).

I verified that with this commit, the roachtest still works and
effectively avoids I/O admission control activation a large percentage
of the time at a setting of 0.8. This gives good confidence - at least
for this exact test - that with 0.5 we'd probably never see admission
control throttle foreground writes. However, the test is fairly
specific since it severely constrains n3's disk throughput, so
0.8 might be perfectly appropriate in practice still. We'll need
some more experience to tell.

Touches #84465.
Touches #84252.

Release note: None
Release justification: low-risk improvement to new functionality

@tbg tbg requested a review from a team August 8, 2022 13:39
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg tbg force-pushed the lazy-update-paused-followers branch 2 times, most recently from 33d307c to 05e3d12 Compare August 11, 2022 14:34
@tbg tbg force-pushed the lazy-update-paused-followers branch from 05e3d12 to 21eb10b Compare August 12, 2022 10:42
@tbg tbg changed the title kvserver: track when the set of overloaded stores changes kvserver: simplify and track entire set of gossiped IOThresholds Aug 12, 2022
@tbg tbg force-pushed the lazy-update-paused-followers branch from 21eb10b to c4df7db Compare August 12, 2022 11:16
@tbg tbg marked this pull request as ready for review August 12, 2022 13:31
@tbg tbg requested a review from a team as a code owner August 12, 2022 13:31
@tbg tbg force-pushed the lazy-update-paused-followers branch from c4df7db to 4eb0c8f Compare August 12, 2022 13:33
@tbg
Copy link
Member Author

tbg commented Aug 12, 2022

@erikgrinaker this is now ready for review. I've run the latest round of tests with this PR so I'm confident it "works" and I think it leaves things in much cleaner place than before. I propose that a follow-up PR enables the cluster setting with a value of 0.8.
It would be nice to also get the gossip reactivity change in for quiescing, but it's unclear whether there is enough bandwidth left given competing priorities. If absence of quiescence is a problem, at least we can disable the cluster setting and will not get pausing, but at least get quiescence, so we don't forcibly regress against the status quo.

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of these future possible uses, this is a nice conceptual clean-up and a good last PR to land for pausing in the 22.2 cycle (unless we find time to add quiescence in presence of paused followers, in which case that would be worthy follow-up).

This change seems fine, but we're planning on doing something about the following before 22.2 right?

I think we will have to address #84465 by reducing the frequency at which the paused stores are revisited, but adding an eager pass whenever the sequence is bumped.

pkg/kv/kvserver/replica_raft_overload.go Outdated Show resolved Hide resolved
@tbg tbg force-pushed the lazy-update-paused-followers branch from 4eb0c8f to b44a21b Compare August 15, 2022 07:22
@tbg
Copy link
Member Author

tbg commented Aug 15, 2022

This change seems fine, but we're planning on doing something about the following before 22.2 right?

We would given enough bandwidth, but if I am to join the fun on MVCC tombstones this may be something that we won't get to. Let's discuss separately.

@tbg tbg force-pushed the lazy-update-paused-followers branch from b44a21b to bb277cc Compare August 15, 2022 10:09
This commit makes the following changes:

- track *all* IOThresholds in the store's map, not just the ones for
  overloaded stores.
- improve the container for these IOThresholds to be easier to work
  with.
- Rather than "hard-coding" a value of "1.0" to mean overloaded,
  use (and plumb) the value of the cluster setting. "1.0" is the
  value at which I/O admission control chooses to engage; but
  the cluster setting is usually smaller and determines when to
  consider followers on a remote store pausable. The API now
  reflects that and avoids this kind of confusion.
- Rename all uses of the container away from "overload" towards
  "IOThreshold".
- add a Sequence() method that is bumped whenever the set of Stores
  whose IOThreshold score indicates pausability changes.

I originally started to work on this to address cockroachdb#84465, but realized
that we couldn't "just" leave the set of paused followers untouched
absent sequence changes. This is because the set of paused followers
has additional inputs, most importantly the set of live followers.
This set is per-Replica and subject to change, so we can't be too
sure the outcome would be the same, and we do want to be reactive
to followers becoming nonresponsive by, if necessary, unpausing
followers.

I think we will have to address cockroachdb#84465 by reducing the frequency
at which the paused stores are revisited, but adding an eager
pass whenever the sequence is bumped.

Additionally, for cockroachdb#84252, we are likely also going to be able to rely on
the sequence number to trigger unquiescing of ranges that were
previously quiesced in the presence of a paused follower.

Regardless of these future possible uses, this is a nice conceptual
clean-up and a good last PR to land for pausing in the 22.2 cycle
(unless we find time to add quiescence in presence of paused followers,
in which case that would be worthy follow-up).

I verified that with this commit, the [roachtest] still works and
effectively avoids I/O admission control activation a large percentage
of the time at a setting of 0.8. This gives good confidence - at least
for this exact test - that with 0.5 we'd probably never see admission
control throttle foreground writes. However, the test is fairly
specific since it severely constrains n3's disk throughput, so
0.8 might be perfectly appropriate in practice still. We'll need
some more experience to tell.

[roachtest]: cockroachdb#81516

Touches cockroachdb#84465.
Touches cockroachdb#84252.

Release note: None
@tbg tbg force-pushed the lazy-update-paused-followers branch from bb277cc to 892d5b4 Compare August 15, 2022 10:09
@tbg
Copy link
Member Author

tbg commented Aug 15, 2022

bors r=erikgrinaker

@craig
Copy link
Contributor

craig bot commented Aug 15, 2022

Build succeeded:

@craig craig bot merged commit 1d8f18c into cockroachdb:master Aug 15, 2022
@tbg tbg deleted the lazy-update-paused-followers branch August 15, 2022 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants