Deployment Integration Test Goroutine Limit Exceeded #53617
Comments
@kubernetes/sig-api-machinery-bugs Is it expected that an integration test would exceed 8192 goroutines (mostly started in apiserver code) if it starts a number of apiservers? That seems excessive to me, but if it's normal we should probably limit the concurrency of integration tests. If it's not normal, it seems like we are leaking goroutines. Some examples of what those 8192 goroutines are doing:
|
Do you have a full stack dump of all the goroutines? We could run them through panicparse to get some summarized details. |
@ncdc Yes, but the file is very big (around 4MB). Most of the goroutines have the same output. The partial stack dump above highlights most of the output. |
The goroutines sampled above seem to be part of the client sitting between the REST Store and etcd. Perhaps it will help to incorporate calls to kubernetes/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/store.go Lines 173 to 174 in 77c8b6e
|
cc @jpbetz |
Issues go stale after 90d of inactivity. Prevent issues from auto-closing with an If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or |
To me, this looks like a duplicate of the root cause in #49489 |
/remove-lifecycle stale |
I also encountered the error when working on a DaemonSet's integration test: #59013 |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle stale |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle stale Removing help-wanted because the direction is not clear. /remove-help |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What happened:
When the number of deployment integration tests grows past a threshold, running the tests locally with bazel fails with the error
race: limit on 8192 simultaneously alive goroutines is exceeded, dying
The error does not occur when the number of tests is small.
What you expected to happen:
The integration tests should not keep more than 8192 goroutines alive at the same time.
How to reproduce it (as minimally and precisely as possible):
(1) Duplicate each deployment test under the test/integration/deployment directory twice, giving each copy a digit identifier.
(2) bazel build //test/integration/deployment/...
(3) bazel test //test/integration/deployment/...
Anything else we need to know?:
The error also happens for replicaset. It may be related to how integration test environment is set up.
/kind bug
/sig apps