roachtest: cluster_creation failed #123874

Closed
cockroach-teamcity opened this issue May 9, 2024 · 5 comments
Labels

  • branch-release-24.1.0-rc: Used to mark GA and release blockers and technical advisories for 24.1.0-rc
  • O-roachtest
  • O-robot: Originated from a bot.
  • T-testeng: TestEng Team
  • X-infra-flake: The automatically generated issue was closed due to an infrastructure problem, not a product issue
Milestone: 24.1

Comments

cockroach-teamcity commented May 9, 2024

roachtest.cluster_creation failed with artifacts on release-24.1.0-rc @ 6205244e922606f85761dad2137b842f43a53716:

test kv50/enc=false/nodes=4/cpu=96/batch=64 failed: (test_runner.go:779).func4: in provider: gce: Command: gcloud [compute instances create --subnet default --labels usage=roachtest,roachprod=true,cluster=teamcity-15175684-1715233756-98-n5cpu96,lifetime=12h0m0s,arch=amd64,created=2024-05-09t12_56_06z --scopes cloud-platform --image ubuntu-2204-jammy-v20230727 --image-project ubuntu-os-cloud --boot-disk-type pd-ssd --service-account 21965078311-compute@developer.gserviceaccount.com --maintenance-policy MIGRATE --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --machine-type n2d-custom-96-196608 --min-cpu-platform AMD Milan --metadata-from-file startup-script=/tmp/gce-startup-script1897167942 --project cockroach-ephemeral --boot-disk-size=32GB --zone us-east1-c teamcity-15175684-1715233756-98-n5cpu96-0001 teamcity-15175684-1715233756-98-n5cpu96-0002 teamcity-15175684-1715233756-98-n5cpu96-0003 teamcity-15175684-1715233756-98-n5cpu96-0004 teamcity-15175684-1715233756-98-n5cpu96-0005]
Output: Created [https://www.googleapis.com/compute/v1/projects/cockroach-ephemeral/zones/us-east1-c/instances/teamcity-15175684-1715233756-98-n5cpu96-0001].
WARNING: Some requests generated warnings:
 - Disk size: '32 GB' is larger than image size: '10 GB'. You might need to resize the root repartition manually if the operating system does not support automatic resizing. See https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd for details.
 - The resource 'projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20230727' is deprecated. A suggested replacement is 'projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20240501'.
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
 - Quota 'LOCAL_SSD_TOTAL_GB_PER_VM_FAMILY' exceeded.  Limit: 80000.0 in region us-east1.
	metric name = compute.googleapis.com/local_ssd_total_storage_per_vm_family
	limit name = LOCAL-SSD-TOTAL-GB-PER-VM-FAMILY-per-project-region
	limit = 80000.0
	dimensions = region: us-east1
vm_family: N2D
Try your request in another zone, or view documentation on how to increase quotas: https://cloud.google.com/compute/quotas.
 - Quota 'N2D_CPUS' exceeded.  Limit: 1000.0 in region us-east1.
	metric name = compute.googleapis.com/n2d_cpus
	limit name = N2D-CPUS-per-project-region
	limit = 1000.0
	dimensions = region: us-east1
Try your request in another zone, or view documentation on how to increase quotas: https://cloud.google.com/compute/quotas.: exit status 1 [owner=test-eng]
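
As a rough sanity check on the quota errors above, the footprint of this single `gcloud compute instances create` request can be worked out from its flags: five instance names, sixteen `--local-ssd` flags per VM, and 96 vCPUs per VM (`n2d-custom-96-196608`). The sketch below assumes 375 GB per local SSD partition, which is the usual GCE partition size but is not stated anywhere in the log; the function name is illustrative only.

```python
# Assumption: each GCE local SSD partition is 375 GB (not stated in the log).
LOCAL_SSD_GB = 375


def cluster_footprint(vms: int, vcpus_per_vm: int, local_ssds_per_vm: int) -> dict:
    """Return the regional quota consumed by one cluster-creation request."""
    return {
        "n2d_cpus": vms * vcpus_per_vm,
        "local_ssd_gb": vms * local_ssds_per_vm * LOCAL_SSD_GB,
    }


# Values read from the failing command: 5 instance names, 96 vCPUs, 16 --local-ssd flags.
footprint = cluster_footprint(vms=5, vcpus_per_vm=96, local_ssds_per_vm=16)

# Limits quoted in the error output for region us-east1.
limits = {"n2d_cpus": 1000, "local_ssd_gb": 80000}

for name, used in footprint.items():
    print(f"{name}: {used} of {limits[name]} requested by this cluster alone")
```

Under that assumption, one cluster of this shape asks for 480 of the 1000 N2D vCPUs and 30,000 of the 80,000 GB of local SSD, so a third concurrent cluster of the same shape in us-east1 would exceed both limits.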

Parameters:

  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=96
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

/cc @cockroachdb/test-eng

This test on roachdash | Improve this report!

Jira issue: CRDB-38583

cockroach-teamcity added the branch-release-24.1.0-rc, O-roachtest, O-robot, T-testeng, and X-infra-flake labels May 9, 2024
cockroach-teamcity added this to the 24.1 milestone May 9, 2024
cockroach-teamcity commented

roachtest.cluster_creation failed with artifacts on release-24.1.0-rc @ 6205244e922606f85761dad2137b842f43a53716:

test tpccbench/nodes=9/cpu=4/multi-region failed: (test_runner.go:779).func4: cluster.PutE: put "/go/src/github.com/cockroachdb/cockroach/bin/cockroach.linux-amd64" failed: error persisted after 2 attempts: TRANSIENT_ERROR(ssh_problem): ~ scp -r -C -o StrictHostKeyChecking=no -o ConnectTimeout=10 -i /home/roach/.ssh/id_rsa -i /home/roach/.ssh/google_compute_engine /go/src/github.com/cockroachdb/cockroach/bin/cockroach.linux-amd64 ubuntu@34.168.29.254:./cockroach
Warning: Permanently added '34.168.29.254' (ECDSA) to the list of known hosts.
client_loop: send disconnect: Broken pipe
lost connection: exit status 1 [owner=test-eng]
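
The message above reports that the error "persisted after 2 attempts", which suggests the copy is retried when a failure is classified as transient. The following is a minimal sketch of that general pattern, not the actual roachprod implementation: `TransientError`, `retry_transient`, and the backoff parameters are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a dropped SSH connection)."""


def retry_transient(op: Callable[[], None], attempts: int = 2, base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Run op, retrying on TransientError; return the number of attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            op()
            return attempt
        except TransientError as err:
            if attempt == attempts:
                # Out of attempts: surface the last transient error to the caller.
                raise RuntimeError(f"error persisted after {attempts} attempts: {err}") from err
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff between attempts
    raise ValueError("attempts must be at least 1")


# Usage: a copy that fails once with a broken pipe, then succeeds.
calls = []


def flaky_copy() -> None:
    calls.append(1)
    if len(calls) < 2:
        raise TransientError("client_loop: send disconnect: Broken pipe")


print(retry_transient(flaky_copy, attempts=3, sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a non-transient exception is deliberately not caught and propagates immediately.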

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=false
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=true
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

Same failure on other branches

This test on roachdash | Improve this report!

cockroach-teamcity commented

roachtest.cluster_creation failed with artifacts on release-24.1.0-rc @ 5e4ca9e26f1a25681de9c944298cfa139c344466:

test kv95/enc=false/nodes=3/cpu=96 failed: (test_runner.go:779).func4: in provider: gce: Command: gcloud [compute instances create --subnet default --labels usage=roachtest,created=2024-05-23t11_28_51z,roachprod=true,cluster=teamcity-15371638-1716443308-84-n4cpu96,lifetime=12h0m0s,arch=amd64 --scopes cloud-platform --image ubuntu-2204-jammy-v20230727 --image-project ubuntu-os-cloud --boot-disk-type pd-ssd --service-account 21965078311-compute@developer.gserviceaccount.com --maintenance-policy MIGRATE --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --machine-type n2d-custom-96-196608 --min-cpu-platform AMD Milan --metadata-from-file startup-script=/tmp/gce-startup-script3428712998 --project cockroach-ephemeral --boot-disk-size=32GB --zone us-east1-d teamcity-15371638-1716443308-84-n4cpu96-0001 teamcity-15371638-1716443308-84-n4cpu96-0002 teamcity-15371638-1716443308-84-n4cpu96-0003 teamcity-15371638-1716443308-84-n4cpu96-0004]
Output: WARNING: Some requests generated warnings:
 - Disk size: '32 GB' is larger than image size: '10 GB'. You might need to resize the root repartition manually if the operating system does not support automatic resizing. See https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd for details.
 - The resource 'projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20230727' is deprecated. A suggested replacement is 'projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20240519'.
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
 - Quota 'LOCAL_SSD_TOTAL_GB_PER_VM_FAMILY' exceeded.  Limit: 80000.0 in region us-east1.
	metric name = compute.googleapis.com/local_ssd_total_storage_per_vm_family
	limit name = LOCAL-SSD-TOTAL-GB-PER-VM-FAMILY-per-project-region
	limit = 80000.0
	dimensions = region: us-east1
vm_family: N2D
Try your request in another zone, or view documentation on how to increase quotas: https://cloud.google.com/compute/quotas.
 - Quota 'N2D_CPUS' exceeded.  Limit: 1000.0 in region us-east1.
	metric name = compute.googleapis.com/n2d_cpus
	limit name = N2D-CPUS-per-project-region
	limit = 1000.0
	dimensions = region: us-east1
Try your request in another zone, or view documentation on how to increase quotas: https://cloud.google.com/compute/quotas.: exit status 1 [owner=test-eng]
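
When triaging output like the above, it can help to pull the exceeded quotas out of the gcloud error text programmatically. The sketch below is one possible approach using a regular expression written against the exact wording seen in this log; the pattern and the `exceeded_quotas` helper are assumptions for illustration and may not match other gcloud versions or error formats.

```python
import re

# Matches lines such as:
#  - Quota 'N2D_CPUS' exceeded.  Limit: 1000.0 in region us-east1.
QUOTA_RE = re.compile(
    r"Quota '(?P<quota>[A-Z0-9_]+)' exceeded\.\s+"
    r"Limit: (?P<limit>[0-9.]+) in region (?P<region>[a-z0-9-]+)"
)


def exceeded_quotas(output: str) -> list[dict]:
    """Extract every exceeded quota reported in gcloud's error output."""
    return [
        {"quota": m["quota"], "limit": float(m["limit"]), "region": m["region"]}
        for m in QUOTA_RE.finditer(output)
    ]


# Excerpt of the error output shown in this report.
sample = """\
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
 - Quota 'LOCAL_SSD_TOTAL_GB_PER_VM_FAMILY' exceeded.  Limit: 80000.0 in region us-east1.
 - Quota 'N2D_CPUS' exceeded.  Limit: 1000.0 in region us-east1.
"""

for q in exceeded_quotas(sample):
    print(q)
```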

Parameters:

  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=96
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

This test on roachdash | Improve this report!

cockroach-teamcity commented

roachtest.cluster_creation failed with artifacts on release-24.1.0-rc @ 5e4ca9e26f1a25681de9c944298cfa139c344466:

test kv50/enc=false/nodes=4/cpu=96/batch=64 failed: (test_runner.go:779).func4: in provider: gce: Command: gcloud [compute instances create --subnet default --labels usage=roachtest,roachprod=true,cluster=teamcity-15371638-1716443308-105-n5cpu96,lifetime=12h0m0s,arch=amd64,created=2024-05-23t12_24_27z --scopes cloud-platform --image ubuntu-2204-jammy-v20230727 --image-project ubuntu-os-cloud --boot-disk-type pd-ssd --service-account 21965078311-compute@developer.gserviceaccount.com --maintenance-policy MIGRATE --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --local-ssd interface=NVME --machine-type n2d-custom-96-196608 --min-cpu-platform AMD Milan --metadata-from-file startup-script=/tmp/gce-startup-script37976578 --project cockroach-ephemeral --boot-disk-size=32GB --zone us-east1-c teamcity-15371638-1716443308-105-n5cpu96-0001 teamcity-15371638-1716443308-105-n5cpu96-0002 teamcity-15371638-1716443308-105-n5cpu96-0003 teamcity-15371638-1716443308-105-n5cpu96-0004 teamcity-15371638-1716443308-105-n5cpu96-0005]
Output: Created [https://www.googleapis.com/compute/v1/projects/cockroach-ephemeral/zones/us-east1-c/instances/teamcity-15371638-1716443308-105-n5cpu96-0004].
WARNING: Some requests generated warnings:
 - Disk size: '32 GB' is larger than image size: '10 GB'. You might need to resize the root repartition manually if the operating system does not support automatic resizing. See https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd for details.
 - The resource 'projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20230727' is deprecated. A suggested replacement is 'projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20240519'.
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
 - Quota 'LOCAL_SSD_TOTAL_GB_PER_VM_FAMILY' exceeded.  Limit: 80000.0 in region us-east1.
	metric name = compute.googleapis.com/local_ssd_total_storage_per_vm_family
	limit name = LOCAL-SSD-TOTAL-GB-PER-VM-FAMILY-per-project-region
	limit = 80000.0
	dimensions = region: us-east1
vm_family: N2D
Try your request in another zone, or view documentation on how to increase quotas: https://cloud.google.com/compute/quotas.
 - Quota 'N2D_CPUS' exceeded.  Limit: 1000.0 in region us-east1.
	metric name = compute.googleapis.com/n2d_cpus
	limit name = N2D-CPUS-per-project-region
	limit = 1000.0
	dimensions = region: us-east1
Try your request in another zone, or view documentation on how to increase quotas: https://cloud.google.com/compute/quotas.: exit status 1 [owner=test-eng]

Parameters:

  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=96
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

This test on roachdash | Improve this report!

cockroach-teamcity commented

roachtest.cluster_creation failed with artifacts on release-24.1.0-rc @ 5e4ca9e26f1a25681de9c944298cfa139c344466:

test cdc/scan/catchup/nodes=5/cpu=16/rows=1G/ranges=100/protocol=mux/format=json/sink=null failed: (test_runner.go:779).func4: cluster.PutE: put "/go/src/github.com/cockroachdb/cockroach/bin/cockroach.linux-amd64" failed: TRANSIENT_ERROR(ssh_problem): ~ scp -r -C -o StrictHostKeyChecking=no -o ConnectTimeout=10 -i /home/roach/.ssh/id_rsa -i /home/roach/.ssh/google_compute_engine /go/src/github.com/cockroachdb/cockroach/bin/cockroach.linux-amd64 ubuntu@34.23.7.206:./cockroach
ssh: connect to host 34.23.7.206 port 22: Connection timed out
lost connection: exit status 1 [owner=test-eng]
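
The five reports in this issue fall into two groups: GCE quota exhaustion and SSH transfer problems. A small classifier along the following lines could bucket such messages during triage. It is only a sketch: the `classify` function and its category names are hypothetical, and the substring checks are based solely on the messages quoted in this issue.

```python
from collections import Counter


def classify(message: str) -> str:
    """Bucket a cluster_creation failure message by its likely infrastructure cause."""
    if "Quota '" in message and "exceeded" in message:
        return "quota_exceeded"
    if "TRANSIENT_ERROR(ssh_problem)" in message:
        return "ssh_problem"
    return "unknown"


# Abbreviated excerpts of the failure messages posted in this issue, in order.
reports = [
    "Quota 'N2D_CPUS' exceeded.  Limit: 1000.0 in region us-east1.",
    "TRANSIENT_ERROR(ssh_problem): ... client_loop: send disconnect: Broken pipe",
    "Quota 'LOCAL_SSD_TOTAL_GB_PER_VM_FAMILY' exceeded.  Limit: 80000.0 in region us-east1.",
    "Quota 'N2D_CPUS' exceeded.  Limit: 1000.0 in region us-east1.",
    "TRANSIENT_ERROR(ssh_problem): ... ssh: connect to host 34.23.7.206 port 22: Connection timed out",
]

print(Counter(classify(r) for r in reports))
```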

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=16
  • ROACHTEST_encrypted=false
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=true
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

Same failure on other branches

This test on roachdash | Improve this report!

renatolabs commented

Branch deleted.
