pebble: speed up root package tests #6007
Merged
RaduBerinde merged 11 commits into cockroachdb:master on Apr 30, 2026
xxmplus (Contributor) approved these changes on Apr 30, 2026.
Reviewable status: 0 of 10 files reviewed, all discussions resolved (waiting on annrpom and sumeerbhola).
Author (Member): TFTR!
Speed up the slowest tests in the root package.
Total package time dropped from 40.5s to 24.9s (~38% faster).
pebble: speed up TestDBCompactionCrash
Increase the outer-loop step from `rng.Int32N(5)+1` to `rng.Int32N(10)+3`.
Each iteration opens a database and replays writes up to a crash point;
sampling crash points at a coarser stride reduces test time roughly
2-3x while still exercising a representative range of crash positions.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
pebble: speed up TestMetricsWALBytesWrittenMonotonicity
Cut the run-time budget from 1s to 200ms. The test asserts that
`Metrics.WAL.BytesWritten` never decreases under concurrent writers and
flushes; many violations would surface within tens of milliseconds, so
200ms is sufficient to detect the regressed behavior tracked in #3505.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
pebble: speed up TestOpenCloseOpenClose
Drop the 100,000-byte value length and parallelize the
`(startFromEmpty, walDirname)` subtests. The four resulting subtests are
independent on disk; running them concurrently halves wall time while
preserving the sequential-reopen semantics within each group.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
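The pattern of groups that run concurrently while each group's steps stay sequential can be sketched as a standalone program. It uses goroutines instead of `t.Parallel` so that it runs outside a test binary; `reopenSequence`, `runGroups`, the group labels, and the marker files are hypothetical stand-ins for opening and closing a database.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// reopenSequence stands in for one subtest group: it performs its "reopen"
// steps strictly in order inside its own directory and returns the number
// of marker files left behind.
func reopenSequence(dir string, reopens int) (int, error) {
	for i := 0; i < reopens; i++ {
		name := filepath.Join(dir, fmt.Sprintf("open-%d", i))
		if err := os.WriteFile(name, []byte("x"), 0o644); err != nil {
			return 0, err
		}
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

// runGroups runs every group concurrently, each in a private temp directory,
// so the groups cannot interfere with one another on disk.
func runGroups(groups []string, reopens int) (map[string]int, error) {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		firstErr error
	)
	results := make(map[string]int, len(groups))
	for _, g := range groups {
		wg.Add(1)
		go func(group string) {
			defer wg.Done()
			n := 0
			dir, err := os.MkdirTemp("", "reopen-*")
			if err == nil {
				defer os.RemoveAll(dir)
				n, err = reopenSequence(dir, reopens)
			}
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			results[group] = n
		}(g)
	}
	wg.Wait()
	return results, firstErr
}

func main() {
	// Hypothetical labels for the four (startFromEmpty, walDirname) combinations.
	groups := []string{
		"empty=false,wal=", "empty=false,wal=wal",
		"empty=true,wal=", "empty=true,wal=wal",
	}
	res, err := runGroups(groups, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, g := range groups {
		fmt.Printf("%s: %d files\n", g, res[g])
	}
}
```

Giving each group its own directory is what makes the concurrency safe; the ordering inside `reopenSequence` is untouched.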
pebble: shrink testdata/large_keys
Reduce the shared key prefix from 1MB to 100KB and drop the SST target
file size from 4MB to 400KB. Multi-block index partitioning (the
property `TestLargeKeys` exercises, see #4518) is preserved at the
smaller scale, with the same number of files and index partitions.
To avoid flakes, we disable compactions and set `MemTableSize` to
16MiB to keep the SET batch (~3MB) below `largeBatchThreshold`.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
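The `MemTableSize` choice can be illustrated with a little arithmetic. The sketch assumes that `largeBatchThreshold` is roughly half of `MemTableSize`, ignoring any fixed overhead; that relationship is an assumption here and should be checked against the Pebble version in use. The 4MiB comparison value is likewise only an assumed smaller setting for contrast.

```go
package main

import "fmt"

// approxLargeBatchThreshold is a hypothetical approximation: it ASSUMES the
// threshold is about half the memtable size and ignores fixed overhead.
func approxLargeBatchThreshold(memTableSize uint64) uint64 {
	return memTableSize / 2
}

// staysBelowThreshold reports whether a batch of the given size would be
// treated as a regular (non-large) batch under the assumption above.
func staysBelowThreshold(batchBytes, memTableSize uint64) bool {
	return batchBytes < approxLargeBatchThreshold(memTableSize)
}

func main() {
	const batch = 3 << 20 // the ~3MB SET batch mentioned in the text
	for _, memTableSize := range []uint64{4 << 20, 16 << 20} {
		fmt.Printf("MemTableSize=%2dMiB threshold~%dMiB batch below threshold: %v\n",
			memTableSize>>20,
			approxLargeBatchThreshold(memTableSize)>>20,
			staysBelowThreshold(batch, memTableSize))
	}
}
```

Under that assumption a 16MiB memtable leaves comfortable headroom above the ~3MB batch, while a 4MiB one would not.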
pebble: speed up TestOpenAlreadyLocked
Move the WAL recovery directory setup inside `runTest` so each subtest
gets its own pre-acquired lock and unlocked recovery directory, then run
the `memfs` and `disk/absolute` subtests in parallel. The `disk/relative`
subtests can't be parallelized because they call `t.Chdir`.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
pebble: speed up TestCheckpointCompaction
Reduce the number of checkpoints created from 200 to 50. The test
exercises concurrent writes, compactions, and checkpoint creation/open;
50 iterations is enough to catch concurrency issues without the extra
cost of validating the remaining 150.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
pebble: speed up TestElevateThresholdAfterWriteStallUnblocksStall
Reduce the number of 1MB writes from 200 to 50. The write stall fires
after ~8 writes; once the elevation goroutine clears it, the remaining
writes only verify that the elevated threshold continues to admit
writes; 50 is enough to exercise that.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
pebble: speed up TestReadOnlyRecovery
Reduce the metamorphic operation count from 10,000 to 3,000. The test
generates a random database state and verifies a read-only re-open
doesn't mutate the filesystem; 3,000 ops produce a sufficiently rich
state with multiple SSTs and WAL segments.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
pebble: speed up TestWALFailoverAvoidsWriteStall
Drop `UnhealthySamplingInterval` from 100ms to 10ms and
`UnhealthyOperationLatencyThreshold` from 1s to 50ms. The test runs
entirely against in-memory `vfs.NewMem`, so the latency thresholds only
serve to gate when the failover decision triggers; tightening them
shortens the time-to-failover and cuts the test runtime ~3x.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
pebble: speed up TestSnapshotRangeDeletionStress
Reduce the number of runs from 200 to 50. Each run writes O(runs) keys
and takes a snapshot, so total work scales as O(runs^2). The expanding
DeleteRange pattern is fully exercised at 50 runs.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
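The quadratic scaling can be made concrete with a small calculation. The sketch assumes, purely for illustration, that run `i` writes exactly `i` keys; the real test's per-run key count may have a different constant factor, and `totalKeyWrites` is a hypothetical helper.

```go
package main

import "fmt"

// totalKeyWrites sums the keys written over all runs, ASSUMING run i
// (1-based) writes i keys. The total is runs*(runs+1)/2, i.e. O(runs^2).
func totalKeyWrites(runs int) int {
	total := 0
	for i := 1; i <= runs; i++ {
		total += i
	}
	return total
}

func main() {
	before := totalKeyWrites(200)
	after := totalKeyWrites(50)
	fmt.Printf("200 runs: %d key writes\n", before)
	fmt.Printf(" 50 runs: %d key writes\n", after)
	fmt.Printf("reduction: %.1fx\n", float64(before)/float64(after))
}
```

Under this assumption, 200 runs write 20100 keys and 50 runs write 1275, so cutting the run count by 4x cuts the write volume by nearly 16x.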
pebble: speed up TestFlushDelayStress
Drop the run count from 100 to 25 and replace the random 1-10ms flush
delays with a fixed 1ms. The test runs inside `synctest.Test`, which
makes wall-clock variance irrelevant, so the random delay range adds
nothing but extra simulated time per iteration.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>