
Apiserver does not shutdown gracefully in integration tests after closing configmaps rest storage connection #85948

Closed
ingvagabund opened this issue Dec 5, 2019 · 8 comments
Labels
area/apiserver area/test kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one.

Comments


ingvagabund commented Dec 5, 2019

What happened:

Currently, master does not allow calling the cleanup functions of created REST storages. Each REST storage carries a connection to etcd, so when an API server is created in a loop, there is no way to close all the connections. The number of open connections therefore grows until it exceeds the configured upper limit, and new REST storages fail to be created.

Given that every API server normally runs as a separate service (as a single instance), there is usually no need to close the etcd connections. However, Go benchmark semantics require creating multiple instances of the API server in sequence, so it is necessary to close all open connections when an API server is torn down. Otherwise, the number of open connections eventually exceeds the configured upper limit and API server initialization fails.
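To illustrate the leak, here is a minimal self-contained sketch (the types and names are hypothetical stand-ins, not the actual Kubernetes code): each "storage" holds a connection and a cleanup hook, and constructing storages in a loop without invoking the hook accumulates open connections exactly as described above.

```go
package main

import "fmt"

// fakeConn stands in for an etcd client connection held by a REST storage.
type fakeConn struct{ closed bool }

// store mimics a REST storage that owns a connection and a cleanup hook,
// loosely analogous to a DestroyFunc on the real storage types.
type store struct {
	conn        *fakeConn
	DestroyFunc func()
}

// newStore opens a connection and registers an idempotent cleanup hook.
// open tracks the number of currently open connections.
func newStore(open *int) *store {
	c := &fakeConn{}
	*open++
	return &store{
		conn: c,
		DestroyFunc: func() {
			if !c.closed {
				c.closed = true
				*open--
			}
		},
	}
}

func main() {
	open := 0

	// Without calling DestroyFunc, every iteration leaks a connection.
	for i := 0; i < 5; i++ {
		_ = newStore(&open)
	}
	fmt.Println("leaked:", open) // leaked: 5

	// Calling DestroyFunc on teardown keeps the count bounded: these
	// five iterations add nothing on top of the earlier leaks.
	for i := 0; i < 5; i++ {
		s := newStore(&open)
		s.DestroyFunc()
	}
	fmt.Println("after destroy:", open) // after destroy: 5
}
```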

To allow the performance tests to iterate cleanly, I opened PR #84667, which closes all REST storage connections. However, after #82705 merged, the integration tests started to fail in #84667. PR #82705 introduces a cluster authentication trust controller that, among other things, creates a shared informer for config maps. After disabling the controller (i.e., commenting out the go c.kubeSystemConfigMapInformer.Run(stopCh) line), the integration tests start to pass. I managed to track the issue down to the point where a watch for config maps is created: replacing the config map watch with an empty one makes the integration tests pass, while putting the original watch back makes them fail.

I am opening this issue so that #84667 can be merged by closing 54 (of 55 total) open REST storage connections, skipping the config map REST storage for now. We can then figure out later why the remaining connection is not closed properly, and unblock #84667.
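The temporary workaround amounts to iterating over all REST storages and invoking every destroy hook except the one for configmaps. A self-contained sketch of that pattern (the function and storage names here are illustrative, not the actual storage_core.go code):

```go
package main

import "fmt"

// destroyFunc mirrors the shape of a REST storage cleanup hook.
type destroyFunc func()

// destroyAllExcept invokes every storage's destroy hook except those named
// in skip, returning how many were closed. This sketches the workaround
// described above: close 54 of 55 connections, leaving configmaps open.
func destroyAllExcept(storages map[string]destroyFunc, skip map[string]bool) int {
	closed := 0
	for name, destroy := range storages {
		if skip[name] {
			continue // left open for now; see the issue body
		}
		destroy()
		closed++
	}
	return closed
}

func main() {
	calls := 0
	storages := map[string]destroyFunc{}
	for _, name := range []string{"pods", "services", "configmaps"} {
		storages[name] = func() { calls++ }
	}
	n := destroyAllExcept(storages, map[string]bool{"configmaps": true})
	fmt.Println(n, calls) // 2 2
}
```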

What you expected to happen:

Integration tests pass after closing the connection to the configmap REST storage.

How to reproduce it (as minimally and precisely as possible):

Do not skip the configMaps REST storage destroy method in NewLegacyRESTStorage under pkg/registry/core/rest/storage_core.go in #84667, then run the integration tests.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@ingvagabund ingvagabund added the kind/bug Categorizes issue or PR as related to a bug. label Dec 5, 2019
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Dec 5, 2019
@k8s-ci-robot
Contributor

@ingvagabund: There are no sig labels on this issue. Please add a sig label by either:

  1. mentioning a sig: @kubernetes/sig-<group-name>-<group-suffix>
    e.g., @kubernetes/sig-contributor-experience-<group-suffix> to notify the contributor experience sig, OR

  2. specifying the label manually: /sig <group-name>
    e.g., /sig scalability to apply the sig/scalability label

Note: Method 1 will trigger an email to the group. See the group list.
The <group-suffix> in method 1 has to be replaced with one of these: bugs, feature-requests, pr-reviews, test-failures, proposals.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ingvagabund
Contributor Author

/sig apiserver

@ingvagabund ingvagabund changed the title Apiserver does not shutdown gracefully in integration tests after destroy configmaps rest storage connection Apiserver does not shutdown gracefully in integration tests after closing configmaps rest storage connection Dec 5, 2019
@liggitt
Member

liggitt commented Dec 5, 2019

duplicate of #49489?

@ingvagabund
Contributor Author

duplicate of #49489?

Quickly reading the issue, I think it is. Checking @frobware's changes in frobware@345af91, it looks like we are doing the same thing, i.e., calling DestroyFunc() of k8s.io/apiserver/pkg/registry/generic/registry.Store. Though, I can't find any PR committing those changes. @frobware, can you still recall how (and whether) you resolved the issue?
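For readers following along, the pattern referenced here is invoking a storage's DestroyFunc on teardown. A simplified, self-contained sketch of that shape (the real registry.Store has many more fields, and the once-guard below is an added safety assumption, not necessarily how the actual Kubernetes code works):

```go
package main

import (
	"fmt"
	"sync"
)

// Store loosely mirrors the relevant slice of
// k8s.io/apiserver/pkg/registry/generic/registry.Store: a cleanup hook
// that releases the underlying etcd connection.
type Store struct {
	DestroyFunc func()
	once        sync.Once
}

// Destroy invokes DestroyFunc at most once, so that multiple teardown
// paths (test cleanup, server shutdown) can call it safely.
func (s *Store) Destroy() {
	s.once.Do(func() {
		if s.DestroyFunc != nil {
			s.DestroyFunc()
		}
	})
}

func main() {
	closed := 0
	s := &Store{DestroyFunc: func() { closed++ }}
	s.Destroy()
	s.Destroy() // second call is a no-op
	fmt.Println(closed) // 1
}
```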

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 4, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 3, 2020
@ingvagabund
Contributor Author

/close

@k8s-ci-robot
Contributor

@ingvagabund: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
