roachtest: improve handling of cluster creation errors from cloud provider #114523
Labels
A-testing: Testing tools and infrastructure
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-testeng: TestEng Team
Comments
renatolabs added the C-enhancement, A-testing, and T-testeng labels on Nov 15, 2023.
cc @cockroachdb/test-eng
renatolabs added a commit to renatolabs/cockroach that referenced this issue on Mar 19, 2024:

As a stopgap measure to reduce the chances of "zone exhausted" errors we see during roachtest runs[^1], we randomize the default zone used when creating clusters with roachprod.

[^1]: for an example, see cockroachdb#120621 (comment)

Informs: cockroachdb#114523

Release note: None
craig bot pushed a commit that referenced this issue on Mar 19, 2024:

120714: roachprod: randomize default zone r=srosenberg a=renatolabs

As a stopgap measure to reduce the chances of "zone exhausted" errors we see during roachtest runs[^1], we randomize the default zone used when creating clusters with roachprod.

[^1]: for an example, see #120621 (comment)

Informs: #114523

Release note: None

Co-authored-by: Renato Costa <renato@cockroachlabs.com>
blathers-crl bot pushed two commits that referenced this issue on Mar 20, 2024, both with the same message:

As a stopgap measure to reduce the chances of "zone exhausted" errors we see during roachtest runs[^1], we randomize the default zone used when creating clusters with roachprod.

[^1]: for an example, see #120621 (comment)

Informs: #114523

Release note: None
We have been seeing a number of cluster creation errors every night on roachtest nightly runs. Part of this has to do with GCP not having enough resources on `us-east1-b`, where we create VMs by default [1]. These errors typically look like the one in this recent example: #108629 (comment).
The error message includes a computer-readable payload that indicates other AZs where there are resources available for the request that failed; that information should be in `errorInfo.metadatas.zonesAvailable` (as seen in the error message linked above).
Roachtest could be smarter about its cluster creation retry mechanism and take this information into account.
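A minimal sketch of what such a retry could look like, in Go. It is not roachtest's or roachprod's actual code. It assumes that the error text contains a line of the form `zonesAvailable: zone-a,zone-b` (a comma-separated list), which may not match the real payload exactly. The names `parseZonesAvailable`, `createWithZoneFallback`, `createFunc`, and `fakeCreate` are hypothetical, and the error string in `fakeCreate` is made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// ASSUMPTION: the provider error text contains "zonesAvailable: z1,z2,...".
// The real payload layout may differ; adjust the pattern accordingly.
var zonesAvailableRE = regexp.MustCompile(`zonesAvailable:[ \t]*([a-z0-9-]+(?:,[a-z0-9-]+)*)`)

// parseZonesAvailable returns the zones suggested by the error text,
// or nil if the text carries no such hint.
func parseZonesAvailable(errMsg string) []string {
	m := zonesAvailableRE.FindStringSubmatch(errMsg)
	if m == nil {
		return nil
	}
	return strings.Split(m[1], ",")
}

// createFunc is a hypothetical stand-in for whatever creates a cluster in a zone.
type createFunc func(zone string) error

// createWithZoneFallback tries the given zone first and, on failure, retries
// in the first suggested zone that has not been tried yet.
func createWithZoneFallback(create createFunc, zone string, maxAttempts int) (string, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	tried := map[string]bool{}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		tried[zone] = true
		if err = create(zone); err == nil {
			return zone, nil
		}
		next := ""
		for _, z := range parseZonesAvailable(err.Error()) {
			if !tried[z] {
				next = z
				break
			}
		}
		if next == "" {
			break // the error suggested no untried zone; give up
		}
		zone = next
	}
	return "", fmt.Errorf("cluster creation failed (last zone tried: %s): %w", zone, err)
}

// fakeCreate simulates exhaustion in us-east1-b only (illustrative error text).
func fakeCreate(zone string) error {
	if zone == "us-east1-b" {
		return errors.New("resource exhausted\n  zonesAvailable: us-east1-c,us-east1-d")
	}
	return nil
}

func main() {
	zone, err := createWithZoneFallback(fakeCreate, "us-east1-b", 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("created cluster in", zone)
}
```

Tracking the zones already tried keeps the loop from bouncing between two exhausted zones that suggest each other.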
It also wouldn't hurt to rotate the default AZ (i.e., use `us-east1-{b,c,d}`), or even use `us-central` as well.

[1] cockroach/pkg/roachprod/vm/gce/gcloud.go, line 969 in bab4335
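Rotating the default AZ could be as simple as shuffling a list of candidate zones and using the first entry as the default. The Go sketch below illustrates the idea only; it is not the code in `gcloud.go`. The names `candidateZones` and `shuffledZones` are hypothetical, and the zone list is just the `us-east1-{b,c,d}` set mentioned above.

```go
package main

import (
	"fmt"
	"math/rand"
)

// candidateZones is the illustrative set of zones from the issue text.
var candidateZones = []string{"us-east1-b", "us-east1-c", "us-east1-d"}

// shuffledZones returns a randomly ordered copy of candidateZones.
// The first element can serve as the default zone, the rest as fallbacks.
func shuffledZones() []string {
	zones := append([]string(nil), candidateZones...) // copy; leave the original untouched
	// The global generator is randomly seeded as of Go 1.20.
	rand.Shuffle(len(zones), func(i, j int) {
		zones[i], zones[j] = zones[j], zones[i]
	})
	return zones
}

func main() {
	zones := shuffledZones()
	fmt.Println("default zone:", zones[0])
	fmt.Println("fallback zones:", zones[1:])
}
```

Returning the whole shuffled list rather than a single zone leaves room to combine this with a retry that walks through the remaining zones.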
Jira issue: CRDB-33544