roachtest: import/tpcc/warehouses=1000/nodes=32 failed [replication failures and clock latency jumps] #122348
Comments
Looks like SQL Foundations
The error message is the workload health check failing at 15:59:07:
The first step of the healthcheck is to connect to the cluster. Part of authenticating is to run the
The logs from n4 show some replication errors happening earlier:
There are also some clock latency jumps:
This might mean the cluster is overloaded, or there's some other problem at a lower level. @rytaft I'm moving this back to SQL Queries so you can decide which direction to pursue.
This shouldn't block a beta, but marking as GA-blocker for now since we'd like to dig a bit deeper.
Looking closer at n4, here are some interesting messages:
Then at 15:59:07 we get this error. I think the KV team should take a closer look at whether this behavior is expected.
roachtest.import/tpcc/warehouses=1000/nodes=32 failed with artifacts on release-24.1 @ 88c9d88cb0093d35f2d630137fd178518aa48569:
Parameters:
The backported fix just merged in; this should be fixed.
roachtest.import/tpcc/warehouses=1000/nodes=32 failed with artifacts on release-24.1 @ 6e3e9e6012dc04dabc64333a87e77a80d9f9d46b:
Parameters:
ROACHTEST_arch=amd64
ROACHTEST_cloud=gce
ROACHTEST_coverageBuild=false
ROACHTEST_cpu=4
ROACHTEST_encrypted=false
ROACHTEST_fs=ext4
ROACHTEST_localSSD=true
ROACHTEST_metamorphicBuild=false
ROACHTEST_ssd=0
Help
See: roachtest README
See: How To Investigate (internal)
See: Grafana
Same failure on other branches
This test on roachdash | Improve this report!
Jira issue: CRDB-37821