-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachtest: add roachtest for snapshot ingest with excises #124591
roachtest: add roachtest for snapshot ingest with excises #124591
Conversation
357fde6
to
765053a
Compare
Charts for this roachtest can be found in this internal thread. |
I will hold the backport until this test bakes in master for a while. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aadityasondhi, @DarrylWong, and @herkolategan)
pkg/cmd/roachtest/tests/admission_control_snapshot_overload_excise.go
line 57 at r1 (raw file):
t.Fatalf("expected at least 4 nodes, found %d", c.Spec().NodeCount) }
Worth adding a comment like:
// COCKROACH_CONCURRENT_COMPACTIONS is set to 1 since we want to ensure that snapshot ingests don't result in LSM inversion even with a very low compaction rate. With Pebble's IngestAndExcise all the ingested sstables should ingest into L6.
// COCKROACH_CONCURRENT_SNAPSHOT* is increased so that the rate of snapshot application is high.
// COCKROACH_RAFT_LOG_TRUNCATION_THRESHOLD is reduced so that there is certainty that the restarted node will be caught up via snapshots, and not via raft log replay.
pkg/cmd/roachtest/tests/admission_control_snapshot_overload_excise.go
line 186 at r1 (raw file):
// Assert on l0 sublevel count and p99 latencies. latencyMetric := divQuery("histogram_quantile(0.99, sum by(le) (rate(sql_service_latency_bucket[2m])))", 1<<20 /* 1ms */) const latencyThreshold = 100
So 100ms because of the divQuery
. Worth adding a code comment.
765053a
to
677bf83
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r=sumeerbhola
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DarrylWong, @herkolategan, and @sumeerbhola)
pkg/cmd/roachtest/tests/admission_control_snapshot_overload_excise.go
line 57 at r1 (raw file):
Previously, sumeerbhola wrote…
Worth adding a comment like:
// COCKROACH_CONCURRENT_COMPACTIONS is set to 1 since we want to ensure that snapshot ingests don't result in LSM inversion even with a very low compaction rate. With Pebble's IngestAndExcise all the ingested sstables should ingest into L6. // COCKROACH_CONCURRENT_SNAPSHOT* is increased so that the rate of snapshot application is high. // COCKROACH_RAFT_LOG_TRUNCATION_THRESHOLD is reduced so that there is certainty that the restarted node will be caught up via snapshots, and not via raft log replay.
Done.
pkg/cmd/roachtest/tests/admission_control_snapshot_overload_excise.go
line 186 at r1 (raw file):
Previously, sumeerbhola wrote…
So 100ms because of the
divQuery
. Worth adding a code comment.
Done.
Build failed: |
This patch adds a roachtest for running snapshots with excises enabled. In this workload, when splits and excises are disabled, we see an inverted LSM and degraded p99 latencies. The test asserts that the LSM stays healthy while doing the snapshot ingest, and p99 latencies don't spike over a threshold. Informs cockroachdb#80607. Release note: None
677bf83
to
bb693c8
Compare
lint failure 🫤 will wait on branch passing before merging |
bors r=sumeerbhola |
This patch adds a roachtest for running snapshots with excises enabled. In this workload, when splits and excises are disabled, we see an inverted LSM and degraded p99 latencies.
The test asserts that the LSM stays healthy while doing the snapshot ingest, and p99 latencies don't spike over a threshold.
Informs #80607.
Release note: None