storage: Maintain a separate set of unquiesced replicas #24956
Reviewed 2 of 2 files at r1, 4 of 4 files at r2, 1 of 1 files at r3, 1 of 1 files at r4.
pkg/storage/replica.go, line 592 at r4 (raw file):
We've historically had subtle errors in logic like this. Perhaps this code should be restructured for testability. For example, the logic could be placed in a
pkg/storage/replica.go, line 3914 at r8 (raw file):
Is this method being used anymore?
pkg/storage/store.go, line 511 at r8 (raw file):
pkg/storage/store.go, line 512 at r8 (raw file):
mod @petermattis' comments and waiting until the precursor has merged. Reviewed 2 of 4 files at r2, 2 of 2 files at r6, 1 of 1 files at r7, 2 of 2 files at r8. pkg/storage/store.go, line 2308 at r8 (raw file):
This is no longer used Release note: None
Release note: None
646debb to 0fbcbd7
This means that idle replicas no longer have a per-tick CPU cost, which is one of the bottlenecks limiting the amount of data we can handle per store.
Fixes cockroachdb#17609
Release note (performance improvement): Reduced CPU overhead of idle ranges
0fbcbd7 to f4c5e3c
Review status: 0 of 3 files reviewed at latest revision, 5 unresolved discussions.
pkg/storage/replica.go, line 3914 at r8 (raw file):
Previously, petermattis (Peter Mattis) wrote…
No. Removed.
pkg/storage/store.go, line 511 at r8 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Done.
pkg/storage/store.go, line 512 at r8 (raw file):
Previously, petermattis (Peter Mattis) wrote…
OK, changed to a regular map with a mutex (a new mutex, since lock ordering gets tricky if we add it to
pkg/storage/store.go, line 2308 at r8 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
This has been changed to a regular map.
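For readers following the thread, here is a minimal Go sketch of the idea being discussed: a regular map guarded by its own mutex that tracks which replicas are unquiesced, so a tick only visits those. It is not the code from this PR. Every name in it (`store`, `replica`, `setQuiescent`, `tickUnquiesced`) is hypothetical, and it assumes replicas are registered before any concurrent use.

```go
package main

import "sync"

// RangeID is a stand-in identifier type for this sketch.
type RangeID int64

// replica is a hypothetical, stripped-down replica: just a quiescent flag
// and a counter of how many times it has been ticked.
type replica struct {
	mu        sync.Mutex
	quiescent bool
	ticks     int
}

// store is a hypothetical store. The unquiesced set has a dedicated mutex,
// separate from any other store-level lock.
type store struct {
	replicas map[RangeID]*replica // assumed fully populated before concurrent use

	unquiesced struct {
		sync.Mutex
		m map[RangeID]struct{}
	}
}

func newStore() *store {
	s := &store{replicas: map[RangeID]*replica{}}
	s.unquiesced.m = map[RangeID]struct{}{}
	return s
}

// addReplica registers a replica that starts out quiesced.
func (s *store) addReplica(id RangeID) {
	s.replicas[id] = &replica{quiescent: true}
}

// setQuiescent updates the per-replica flag and the store-level set while
// holding the replica mutex, so the two cannot be observed out of sync.
// Lock order in this sketch: replica.mu first, then unquiesced mutex.
func (s *store) setQuiescent(id RangeID, q bool) {
	r := s.replicas[id]
	r.mu.Lock()
	defer r.mu.Unlock()
	r.quiescent = q

	s.unquiesced.Lock()
	defer s.unquiesced.Unlock()
	if q {
		delete(s.unquiesced.m, id)
	} else {
		s.unquiesced.m[id] = struct{}{}
	}
}

// tickUnquiesced ticks only replicas in the unquiesced set and returns how
// many were ticked. The set is snapshotted and its mutex released before any
// replica mutex is taken, which avoids a lock-order inversion.
func (s *store) tickUnquiesced() int {
	s.unquiesced.Lock()
	ids := make([]RangeID, 0, len(s.unquiesced.m))
	for id := range s.unquiesced.m {
		ids = append(ids, id)
	}
	s.unquiesced.Unlock()

	ticked := 0
	for _, id := range ids {
		r := s.replicas[id]
		r.mu.Lock()
		if !r.quiescent { // re-check: the replica may have quiesced since the snapshot
			r.ticks++
			ticked++
		}
		r.mu.Unlock()
	}
	return ticked
}
```

In this sketch the cost of a tick scales with the number of unquiesced replicas rather than with the total number of replicas on the store.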
Also, did you do any testing that shows that this reduces the CPU overhead of idle ranges? I believe I filed this issue when doing some scalability testing where I artificially created 500k ranges on a cluster. See the
Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.
pkg/storage/replica.go, line 592 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Ping.
Release note: None
Review status: 0 of 4 files reviewed at latest revision, 2 unresolved discussions.
pkg/storage/replica.go, line 592 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Done
Review status: 0 of 4 files reviewed at latest revision, 1 unresolved discussion.
bors r=petermattis
I have not done any testing with large numbers of ranges, FYI.
24956: storage: Maintain a separate set of unquiesced replicas r=petermattis a=bdarnell
This means that idle replicas no longer have a per-tick CPU cost, which is one of the bottlenecks limiting the amount of data we can handle per store.
Fixes #17609
Release note (performance improvement): Reduced CPU overhead of idle ranges
The first five commits are from #24920; that PR should be merged and tested in isolation first.

25735: sql: fix null normalization r=RaduBerinde a=RaduBerinde
The normalization rules are happy to convert `NULL::TEXT` to `NULL`. While both expressions evaluate to `DNull`, the `ResolvedType()` is different. It seems unsound for normalization to change the type.
This issue is shown by trying to run a query containing `ARRAY_AGG(NULL::TEXT)` through distsql planning: by the time the distsql planner looks at it, the `NULL::TEXT` is just `DNull` (with the `Unknown` type) and the distsql planner cannot find the builtin.
This change fixes the normalization rules by retaining the cast in this case. In general, any expression that statically evaluates to NULL gets a cast to the original expression type. The same is done in the opt execbuilder.
Fixes #25724.
Release note (bug fix): Fixed query errors in some cases involving a NULL constant that is cast to a specific type.

Co-authored-by: Ben Darnell <ben@cockroachlabs.com>
Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
Build succeeded
I suspect that cockroachdb#26257 is caused by the unquiescedReplicas map introduced in cockroachdb#24956 getting out of sync with the per-replica quiescent flag. Add debug pages to help us see if that's happening. Release note: None
26269: server,ui: Add debugging for quiesced ranges r=bdarnell a=bdarnell I suspect that #26257 is caused by the unquiescedReplicas map introduced in #24956 getting out of sync with the per-replica quiescent flag. Add debug pages to help us see if that's happening. Release note: None Co-authored-by: Ben Darnell <ben@cockroachlabs.com>
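To illustrate what it means for the unquiesced set to get out of sync with the per-replica quiescent flag, here is a small Go sketch of a consistency check. It is an assumption-laden illustration, not the debug page added in #26269: `findMismatches`, `mismatch`, and the map-based inputs are all hypothetical.

```go
package main

import "sort"

// RangeID is a stand-in identifier type for this sketch.
type RangeID int64

// mismatch describes one disagreement between the per-replica flag and the
// store-level unquiesced set. Hypothetical type, for illustration only.
type mismatch struct {
	ID     RangeID
	Reason string
}

// findMismatches compares a snapshot of per-replica quiescent flags against
// a snapshot of the unquiesced set. The invariant checked is that a replica
// is in the set exactly when its flag says it is not quiescent.
func findMismatches(quiescent map[RangeID]bool, unquiesced map[RangeID]struct{}) []mismatch {
	var out []mismatch
	for id, q := range quiescent {
		_, inSet := unquiesced[id]
		switch {
		case q && inSet:
			out = append(out, mismatch{id, "quiescent but present in unquiesced set"})
		case !q && !inSet:
			out = append(out, mismatch{id, "unquiesced but missing from unquiesced set"})
		}
	}
	for id := range unquiesced {
		if _, ok := quiescent[id]; !ok {
			out = append(out, mismatch{id, "in unquiesced set but no such replica"})
		}
	}
	// Sort by range ID so the report is deterministic despite map iteration order.
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}
```

The second case is the harmful one for a tick loop that only visits the set: a replica whose flag says it is unquiesced but which is absent from the set would never be ticked.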
This reverts the main effect of cockroachdb#24956. The supporting infrastructure is left in place to minimize merge conflicts and to aid in diagnosing why the maps get out of sync. Fixes cockroachdb#26257 Release note: None
26291: storage: Tick all replicas, not just unquiesced ones r=bdarnell a=bdarnell This reverts the main effect of #24956. The supporting infrastructure is left in place to minimize merge conflicts and to aid in diagnosing why the maps get out of sync. Fixes #26257 Release note: None Co-authored-by: Ben Darnell <ben@cockroachlabs.com>