release-25.4: roachtest: reduce the number of concurrent works in tpcc/create #166165
Thanks for opening a backport. Before merging, please confirm that it falls into one of the following categories (select one):
Add a brief release justification to the PR description explaining your selection. Also, confirm that the change does not break backward compatibility and complies with all aspects of the backport policy. All backports must be reviewed by the TL and EM for the owning area.
✅ PR #166165 is compliant with backport policy
Confidence: high
✅ ENGREQ Check Passed: No ENGREQ required (non-production code or serious issues).
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Thanks for the review!
Backport 1/1 commits from #159175.
/cc @cockroachdb/release
Fixes: #166164
Informs: #162008
The tpcc/create test is a bit of a stress test for the kv writer, and it's an even bigger challenge for the crud writer. In steady state, the kv writer uses ~40% CPU to replicate the workload and the crud writer uses ~60% CPU. But this test isn't checking the steady state: it builds up a backlog while the initial scan runs. Replication then has to catch up on that backlog using only the remaining CPU headroom, which is what causes this test to fail due to high replication latency.
Now, the test uses half the warehouse count when running the workload. This still lets us use a large data load when testing the create replicated table statement, while producing a manageable backlog for the crud writer to catch up on.
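
To make the headroom argument concrete, here is a rough back-of-the-envelope sketch. It is not code from this PR, and the function name `catchUpFactor` is hypothetical. It assumes that nothing is replicated while the initial scan runs, that the writer can use the whole CPU once the scan finishes, and that CPU cost scales linearly with the warehouse count. None of these assumptions is verified here.

```go
package main

import "fmt"

// catchUpFactor estimates how long the writer needs to drain the backlog,
// expressed as a multiple of the initial scan duration.
//
// Assumed model (hypothetical, not taken from the roachtest code):
//   - util is the fraction of CPU the writer needs in steady state.
//   - During the scan, a backlog worth util * scanDuration of CPU time builds up.
//   - After the scan, new work keeps arriving at util, so only the
//     remaining (1 - util) of the CPU goes toward draining the backlog.
func catchUpFactor(util float64) float64 {
	return util / (1 - util)
}

func main() {
	cases := []struct {
		name string
		util float64
	}{
		{"kv writer, full workload", 0.40},
		{"crud writer, full workload", 0.60},
		// Assumes halving the warehouse count halves the CPU cost.
		{"kv writer, half workload", 0.20},
		{"crud writer, half workload", 0.30},
	}
	for _, c := range cases {
		fmt.Printf("%-28s catch-up = %.2fx scan duration\n", c.name, catchUpFactor(c.util))
	}
}
```

Under this simplified model, the crud writer at ~60% CPU needs about 1.5 times the scan duration to catch up. At half the workload (~30% CPU, if cost scales linearly) it needs about 0.43 times the scan duration, which is the kind of reduction the smaller warehouse count is meant to buy.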
Release note: none
Fixes: #156185
Part of: #158439
Release justification: test-only change.