Router quota exceeded error causing GCE tests to fail #14611
Comments
from boskos janitor logs:
cc @krzyzacy
It looks like the image was last updated a month ago (0fd634d).
I can't yet tell what actually broke, or when. AFAICT we've been running an image from August since then without issues; also, the previous image has the same missing
I exec'd into the pods and, unsurprisingly, they do seem to be running the janitor script from when the image was last updated, so I don't think any terribly recent changes were actually deployed.
We are obviously not cleaning up routers in https://github.com/kubernetes/test-infra/blob/master/boskos/janitor/gcp_janitor.py#L35-L67. It also seems gcloud deprecated some flags (the --global one), but that should be unrelated.
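For illustration, here is a minimal sketch of what a router cleanup step could look like. It is not the actual gcp_janitor.py code: the helper names (`router_region`, `list_routers_cmd`, `delete_router_cmd`, `clean_routers`) are hypothetical, and it assumes `gcloud compute routers list --format=json` returns entries with `name` and a `region` resource URL. It defaults to a dry run that only prints the delete commands.

```python
import json
import subprocess


def router_region(router):
    """Return the short region name from a router's 'region' resource URL."""
    # Assumption: 'region' is a full URL ending in .../regions/<name>.
    return router["region"].rsplit("/", 1)[-1]


def list_routers_cmd(project):
    """Build the gcloud command that lists all routers in a project as JSON."""
    return ["gcloud", "compute", "routers", "list",
            "--project", project, "--format=json"]


def delete_router_cmd(project, router):
    """Build the gcloud command that deletes one router without prompting."""
    return ["gcloud", "compute", "routers", "delete", router["name"],
            "--project", project,
            "--region", router_region(router),
            "--quiet"]


def clean_routers(project, dry_run=True):
    """List routers in the project and delete them (or print the commands)."""
    routers = json.loads(subprocess.check_output(list_routers_cmd(project)))
    cmds = [delete_router_cmd(project, r) for r in routers]
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)
    return cmds
```

Routers are regional, so the delete call needs the region taken from each listed router. A real janitor would also want to filter by age so it does not delete routers belonging to runs still in progress.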
Thanks @krzyzacy Sen! @BenTheElder the new CAPG job uses boskos to acquire a project in which to create the actual cluster (it uses kind to bootstrap and then GCP to run the actual cluster), and it seems to have run into some problems. I do try to clean that up here, but some runs may have hit trouble and ended up leaking resources.
@BenTheElder @krzyzacy Here's a fix for one more thing that could leak:
waiting for #14617 to merge and then we need to update the deployment |
if anyone can see https://github.com/kubernetes/test-infra/compare/master...BenTheElder:github-compare-is-broken-ugh?expand=1 or https://github.com/kubernetes/test-infra/compare/master...BenTheElder:upgrade-gcloud-bazel?expand=1 I can't file the PR because github is erroring |
OK, so we have the gcloud bump in. I'm kicking off a new run of https://prow.k8s.io/?job=ci-test-infra-autobump-prow and will then let prow bump / deploy.
See kubernetes/kubernetes#83493 for the real root cause 🤦 TL;DR: these scale presubmits are using a fixed GCP project. I've bumped the router quota 3x, from 10 -> 30, but I have no idea if that's sufficient. So far, through manual polling, I've observed a max usage of 16/30.
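The manual polling could be scripted roughly as below. This is a sketch, not what was actually run: `quota_usage` and `fetch_project_info` are hypothetical helpers, and it assumes that `gcloud compute project-info describe --format=json` reports the ROUTERS quota in its `quotas` list with `metric`, `usage`, and `limit` fields.

```python
import json
import subprocess


def fetch_project_info(project):
    """Fetch project metadata, including project-wide quotas, as a dict."""
    out = subprocess.check_output(
        ["gcloud", "compute", "project-info", "describe",
         "--project", project, "--format=json"])
    return json.loads(out)


def quota_usage(project_info, metric="ROUTERS"):
    """Return (usage, limit) for the given quota metric, or None if absent."""
    for quota in project_info.get("quotas", []):
        if quota.get("metric") == metric:
            return quota["usage"], quota["limit"]
    return None
```

Calling `quota_usage(fetch_project_info(project))` in a loop and keeping the maximum usage seen would give the same kind of number as the 16/30 above, without polling by hand.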
AFAICT this is fixed. |
What happened:
W1004 01:31:15.149] Creating router [e2e-51036-95a39-nat-router]...
W1004 01:31:18.991] ....................failed.
W1004 01:31:19.173] ERROR: (gcloud.compute.routers.create) Quota 'ROUTERS' exceeded. Limit: 10.0 globally.
Please provide links to example occurrences, if any:
https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/51036/pull-kubernetes-e2e-gce-100-performance/1179926617770692608/
https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/51036/pull-kubernetes-kubemark-e2e-gce-big/1179926617825218560/
Anything else we need to know?:
Potential boskos cleaning issue