roachtest: add tpccbench style workload to measure throughput while sustaining an avg latency threshold.#42203
Conversation
|
@ajwerner, once you feel that the methodology is correct, I can spin up a few runs of the benchmark for the schema variations we're interested in (sharded/unsharded with/without secondary index). |
ajwerner
left a comment
There was a problem hiding this comment.
@ajwerner, once you feel that the methodology is correct, I can spin up a few runs of the benchmark for the schema variations we're interested in (sharded/unsharded with/without secondary index).
I encourage you to try it out with uniform runs and see if the data it produces line up with data in https://roachperf.crdb.dev/
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
ajwerner
left a comment
There was a problem hiding this comment.
My first pass is going to mean a bunch of refactoring so here it is
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @ajwerner)
pkg/cmd/roachtest/kvbench.go, line 193 at r1 (raw file):
initCmd := strings.Builder{} initCmd.WriteString("CREATE TABLE kv (k BIGINT NOT NULL, v BYTES NOT NULL")
This stuff feels like it should live in workload. This is the workload init stuff. While the search itself should live here, the workload setup and run should live in workload. We want the sharded sequential kv workload to be just like any other workload.
nvb
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @ajwerner)
pkg/cmd/roachtest/kvbench.go, line 193 at r1 (raw file):
Previously, ajwerner wrote…
This stuff feels like it should live in workload. This is the
workload initstuff. While the search itself should live here, the workload setup and run should live in workload. We want the sharded sequential kv workload to be just like any other workload.
Yes, I agree with this.
pkg/cmd/roachtest/kvbench.go, line 303 at r1 (raw file):
if failErr == nil { ttycolor.Stdout(ttycolor.Green) t.l.Printf(`--- PASS: kv workload maintained a p50 latency of %0.1fms`+
p50 != avg
895f8e8 to
6c675cb
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, and @nvanbenschoten)
pkg/cmd/roachtest/kvbench.go, line 193 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Yes, I agree with this.
Moved the initialization logic into workload.
pkg/cmd/roachtest/kvbench.go, line 303 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
p50 != avg
Fixed.
09d0a97 to
51e7af9
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 3 of 3 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @ajwerner)
pkg/cmd/roachtest/kvbench.go, line 39 at r2 (raw file):
type kvBenchSpec struct { Nodes int
Separate these fields into a group of fields that go into the name and a group of fields that don't. That will make it easier to understand which fields are configuration options and which aren't. This is important because we can't have duplicate test names.
pkg/cmd/roachtest/kvbench.go, line 51 at r2 (raw file):
// splitting mechanism should achieve something similar to this, but that would take // too long for it to be feasibly relied upon for the purposes of this benchmark. ShouldPreSplitTable bool
Why is this a separate config flag? It's always the same as NumShards > 0.
pkg/cmd/roachtest/kvbench.go, line 51 at r2 (raw file):
// splitting mechanism should achieve something similar to this, but that would take // too long for it to be feasibly relied upon for the purposes of this benchmark. ShouldPreSplitTable bool
Why do we need to presplit when using a hash index? Isn't load-based splitting sufficient once hash indexing gives it clear split points?
pkg/cmd/roachtest/kvbench.go, line 54 at r2 (raw file):
SecondaryIndex bool RampDurationSecs int LoadDurationSecs int
Do these two need to be configurable?
pkg/cmd/roachtest/kvbench.go, line 56 at r2 (raw file):
LoadDurationSecs int MinVersion string Tags []string
If we don't need Tags or MinVersion, let's remove them.
pkg/cmd/roachtest/kvbench.go, line 61 at r2 (raw file):
func registerKVBenchSpec(r *testRegistry, b kvBenchSpec) { nameParts := []string{ "kvbench",
We should make it clear in the name that this is running kv0.
pkg/cmd/roachtest/kvbench.go, line 64 at r2 (raw file):
fmt.Sprintf("nodes=%d", b.Nodes), fmt.Sprintf("cpu=%d", b.CPUs), fmt.Sprintf("shards=%d", b.NumShards),
Don't include this if the table isn't sharded. This isn't really a core property of the workload, it's more of an auxiliary option that will be useful for testing out hash indexes.
pkg/cmd/roachtest/kvbench.go, line 79 at r2 (raw file):
} switch b.SecondaryIndex {
if ... else ...
And we prefer to keep the name short. How about something like if b.SecondaryIndex { nameParts = append(nameParts, "2nd_idx") }
pkg/cmd/roachtest/kvbench.go, line 234 at r2 (raw file):
// sustaining an avg latency of less than `LatencyThresholdMs`. // TODO(aayush): `avg` here is just an arbitrary statistic, should I make the // concerned statistic a config option?
I wouldn't. Let's keep the configs to a minimum wherever possible.
pkg/cmd/roachtest/kvbench.go, line 239 at r2 (raw file):
t.Fatal(errors.Wrapf(err, `failed to create temp results dir`)) } s := search.NewLineSearcher(100, 10000000, b.EstimatedMaxThroughput, initStepSize, precision)
Use /* name_of_arg */ for the inline values, per our style guide.
pkg/cmd/roachtest/kvbench.go, line 253 at r2 (raw file):
db := c.Conn(ctx, 1) initCmd := strings.Builder{}
var initCmd strings.Builder
pkg/cmd/roachtest/kvbench.go, line 258 at r2 (raw file):
// load-based splitting, which can significantly change the underlying layout // of the table and affect benchmark results. fmt.Fprintf(&initCmd, `./workload init kv --drop --num-shards=%d {pgurl%s}`,
Since we're not maintaining any state across runs, I think we can do better. Let's just c.Wipe before each iteration.
pkg/cmd/roachtest/kvbench.go, line 271 at r2 (raw file):
// but failing seems like a better idea since this is almost certainly a // configuration error. panic(`ShouldPreSplitTable only applies to sharded` +
Return an error instead of panic-ing.
pkg/cmd/roachtest/kvbench.go, line 274 at r2 (raw file):
`kv schemas (ie NumShards > 0)`) } splitCmd := strings.Builder{}
var splitCmd strings.Builder
pkg/cmd/roachtest/kvbench.go, line 307 at r2 (raw file):
// would lead to the `searchPredicate` always returning true as long as the // latency at this throughput ceiling was below the configured threshold. const loadConcurrency int = 128
nit: no need for int
pkg/cmd/roachtest/kvbench.go, line 310 at r2 (raw file):
fmt.Fprintf(&workloadCmd, `./workload run kv --ramp=%ds --duration=%ds {pgurl%s}`+
Pass --read-percent=0 to be clear that this is kv0.
pkg/cmd/roachtest/kvbench.go, line 320 at r2 (raw file):
workloadCmd.WriteString(` --zipfian`) case random: // Nothing, since random is the default.
Might as well pass the flag so that we can change the default without breaking this.
pkg/cmd/roachtest/kvbench.go, line 372 at r2 (raw file):
} type result struct {
This is being leaked into the entire roachtest packages scope. Because of that, it's probably not appropriately named.
pkg/cmd/roachtest/kvbench.go, line 380 at r2 (raw file):
// so this could definitely be cleaner and better abstracted. func newResultFromSnapshots(maxrate int, snapshots map[string][]histogram.SnapshotTick) *result { var start time.Time
nit: var start, end time.Time
pkg/workload/kv/kv.go, line 186 at r2 (raw file):
} checkConstraint := strings.Builder{} checkConstraint.WriteString(`(shard in (`)
nit: IN
pkg/workload/kv/kv.go, line 193 at r2 (raw file):
fmt.Fprintf(&checkConstraint, "%d", i) } checkConstraint.WriteString("))")
nit: move the second parentheses into the template.
pkg/workload/kv/kv.go, line 194 at r2 (raw file):
} checkConstraint.WriteString("))") table.Schema = fmt.Sprintf(schema, w.shards, checkConstraint.String())
nit: no need for .String()
pkg/workload/kv/kv.go, line 256 at r2 (raw file):
// are available. See // https://github.com/cockroachdb/cockroach/issues/39340#issuecomment-535338071 // for details.
Add a TODO here to fix this once that's addressed.
6ba0e81 to
fc9450a
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @ajwerner, and @nvanbenschoten)
pkg/cmd/roachtest/kvbench.go, line 39 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Separate these fields into a group of fields that go into the name and a group of fields that don't. That will make it easier to understand which fields are configuration options and which aren't. This is important because we can't have duplicate test names.
Done.
pkg/cmd/roachtest/kvbench.go, line 51 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Why is this a separate config flag? It's always the same as
NumShards > 0.
I thought it could be useful to demonstrate / measure how long load-based splitting takes to achieve a "good" set of splits. Maybe this is too vague. Removed for now.
pkg/cmd/roachtest/kvbench.go, line 51 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Why do we need to presplit when using a hash index? Isn't load-based splitting sufficient once hash indexing gives it clear split points?
Relying on load-based splitting made the benchmarks require a way longer ramp duration to achieve an even load across nodes. Let me know if you'd prefer not splitting in the benchmark.
pkg/cmd/roachtest/kvbench.go, line 54 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do these two need to be configurable?
In some sense, having these configurable is sort of a footgun. While I was playing with this stuff, I noticed that if these durations are too short, the unsharded/unsplit schema (with secondary index) performed better than the sharded/split schema (also with secondary index); since we go through the single-range fast path until the size of the data exceeds our 64mb threshold. Removed.
pkg/cmd/roachtest/kvbench.go, line 56 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
If we don't need
TagsorMinVersion, let's remove them.
Done.
pkg/cmd/roachtest/kvbench.go, line 61 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We should make it clear in the name that this is running kv0.
Done.
pkg/cmd/roachtest/kvbench.go, line 64 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Don't include this if the table isn't sharded. This isn't really a core property of the workload, it's more of an auxiliary option that will be useful for testing out hash indexes.
Done.
pkg/cmd/roachtest/kvbench.go, line 79 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
if ... else ...And we prefer to keep the name short. How about something like
if b.SecondaryIndex { nameParts = append(nameParts, "2nd_idx") }
Done.
pkg/cmd/roachtest/kvbench.go, line 234 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I wouldn't. Let's keep the configs to a minimum wherever possible.
Done.
pkg/cmd/roachtest/kvbench.go, line 239 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Use
/* name_of_arg */for the inline values, per our style guide.
Done.
pkg/cmd/roachtest/kvbench.go, line 253 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
var initCmd strings.Builder
Done.
pkg/cmd/roachtest/kvbench.go, line 258 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Since we're not maintaining any state across runs, I think we can do better. Let's just
c.Wipebefore each iteration.
Done.
pkg/cmd/roachtest/kvbench.go, line 271 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Return an error instead of panic-ing.
Removed the ShouldPreSplitTable option.
pkg/cmd/roachtest/kvbench.go, line 274 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
var splitCmd strings.Builder
Done.
pkg/cmd/roachtest/kvbench.go, line 307 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: no need for
int
Done.
pkg/cmd/roachtest/kvbench.go, line 310 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Pass
--read-percent=0to be clear that this is kv0.
Done.
pkg/cmd/roachtest/kvbench.go, line 320 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Might as well pass the flag so that we can change the default without breaking this.
Done.
pkg/cmd/roachtest/kvbench.go, line 372 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This is being leaked into the entire
roachtestpackages scope. Because of that, it's probably not appropriately named.
Renamed.
pkg/cmd/roachtest/kvbench.go, line 380 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
var start, end time.Time
Done.
pkg/workload/kv/kv.go, line 186 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
IN
Done.
pkg/workload/kv/kv.go, line 193 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: move the second parentheses into the template.
Done.
pkg/workload/kv/kv.go, line 256 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Add a TODO here to fix this once that's addressed.
Done.
56877a7 to
a0f945f
Compare
|
How do we get this merged in before I leave? |
nvb
left a comment
There was a problem hiding this comment.
this is great! Mind posting a few perf numbers on this PR before merging though? What are the results of these new benchmarks?
Reviewed 2 of 2 files at r3.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @ajwerner)
pkg/cmd/roachtest/kvbench.go, line 220 at r3 (raw file):
// affect benchmark results. c.Wipe(ctx, roachNodes) c.Stop(ctx, roachNodes)
I think Wipe also Stops the cluster, so we shouldn't need this.
pkg/cmd/roachtest/kvbench.go, line 274 at r3 (raw file):
const loadConcurrency = 192 const ramp = 300 const duration = 300
nit: make these two typed constants (time.Duration) and use .Second() when passing to the cmd.
ajwerner
left a comment
There was a problem hiding this comment.
Sorry for the delay, this fell off my radar.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @aayushshah15)
pkg/workload/kv/kv.go, line 42 at r3 (raw file):
INDEX (v) )` // TODO(aayush): Change this to use the "easier" hash sharded index syntax once that
feel free to make this ajwerner
pkg/workload/kv/kv.go, line 254 at r3 (raw file):
} } else { // TODO(aayush): We're currently manually plumbing down the computed shard column
ditto.
sustaining an avg latency threshold. This PR adds a new benchmarking workload that can be used to measure max throughput on the `kv` workload (under various distributions) that be sustained while maintaining an avg latency lower than a configured threshold. This tool was primarily written in order to demonstrate the performance characteristics of hash sharded indexes under write-heavy sequential worklaods, which would, in the absence of a hash sharded index, cause a hotspot on a single range (and consequently, a single node). Release note: None
a0f945f to
7026018
Compare
|
TFTRs! The perf numbers on my RFC were all generated with |
|
bors r+ |
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner and @nvanbenschoten)
pkg/cmd/roachtest/kvbench.go, line 220 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I think
WipealsoStops the cluster, so we shouldn't need this.
Done.
pkg/cmd/roachtest/kvbench.go, line 274 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: make these two typed constants (
time.Duration) and use.Second()when passing to the cmd.
Done.
pkg/workload/kv/kv.go, line 42 at r3 (raw file):
Previously, ajwerner wrote…
feel free to make this
ajwerner
Done.
pkg/workload/kv/kv.go, line 254 at r3 (raw file):
Previously, ajwerner wrote…
ditto.
Done.
42203: roachtest: add tpccbench style workload to measure throughput while sustaining an avg latency threshold. r=aayushshah15 a=aayushshah15 This PR adds a new benchmarking workload that can be used to measure max throughput on the `kv` workload (under various distributions) that be sustained while maintaining an avg latency lower than a configured threshold. This tool was primarily written in order to demonstrate the performance characteristics of hash sharded indexes under write-heavy sequential worklaods, which would, in the absence of a hash sharded index, cause a hotspot on a single range (and consequently, a single node). Release note: None 43380: changefeedccl: transfer todos from aayush to dan/ajwerner r=aayushshah15 a=aayushshah15 Transferring my TODOs to dan/ajwerner before I leave. Release note: None Co-authored-by: Aayush Shah <aayush.shah15@gmail.com>
Build succeeded |
This PR adds a new benchmarking workload that can be used to measure max
throughput on the
kvworkload (under various distributions) that besustained while maintaining an avg latency lower than a configured
threshold.
This tool was primarily written in order to demonstrate the performance
characteristics of hash sharded indexes under write-heavy sequential
worklaods, which would, in the absence of a hash sharded index, cause
a hotspot on a single range (and consequently, a single node).
Release note: None