Pods are repeatedly created and deleted #3601

Open
Wang-Kai opened this issue Jul 16, 2024 · 0 comments · May be fixed by #3686
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@Wang-Kai
Contributor

What happened:

Due to a bug, the job was deleted from etcd but still remains in the Volcano controller's cache. As a result, the Volcano controller creates a pod for the job, and the GC controller deletes that pod immediately because its owner no longer exists. This cycle repeats indefinitely and puts a heavy load on the apiserver.
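
To make the loop concrete, here is a minimal toy simulation of the behavior described above. It is only a sketch and does not use Volcano or client-go code: `Cluster`, `syncJob`, `collectGarbage`, and `runRounds` are hypothetical names, and the "apiserver load" is modeled as a simple write counter.

```go
package main

import "fmt"

// Pod is a toy stand-in for a Kubernetes pod owned by a job.
type Pod struct {
	Name  string
	Owner string
}

// Cluster models the authoritative state (what etcd holds) plus a
// controller-side job cache that may lag behind it. All names here are
// hypothetical and exist only for illustration.
type Cluster struct {
	Jobs     map[string]bool // jobs that really exist in the store
	JobCache map[string]bool // jobs the controller believes exist
	Pods     map[string]Pod
	Writes   int // create/delete requests sent to the "apiserver"
}

// syncJob mimics a controller that trusts only its cache: it creates the
// missing pod for any job that is still cached.
func (c *Cluster) syncJob(job string) {
	if !c.JobCache[job] {
		return
	}
	name := job + "-0"
	if _, ok := c.Pods[name]; !ok {
		c.Pods[name] = Pod{Name: name, Owner: job}
		c.Writes++
	}
}

// collectGarbage mimics the GC controller: it deletes every pod whose
// owner job no longer exists in the store.
func (c *Cluster) collectGarbage() {
	for name, p := range c.Pods {
		if !c.Jobs[p.Owner] {
			delete(c.Pods, name)
			c.Writes++
		}
	}
}

// runRounds runs n controller/GC rounds and returns the total write count.
func runRounds(c *Cluster, job string, n int) int {
	for i := 0; i < n; i++ {
		c.syncJob(job)
		c.collectGarbage()
	}
	return c.Writes
}

func main() {
	// job-a is gone from the store but still present in the cache.
	c := &Cluster{
		Jobs:     map[string]bool{},
		JobCache: map[string]bool{"job-a": true},
		Pods:     map[string]Pod{},
	}
	fmt.Println(runRounds(c, "job-a", 5)) // prints 10: one create and one delete per round
}
```

In this model every round costs two writes and never converges, because nothing ever removes the stale cache entry.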

[Image: WeCom screenshot 2243befd-7db5-44f9-9ffe-347384317ebd]

What you expected to happen:

The Volcano controller's cache should stay consistent with etcd, and the controller should not fight with the GC controller over pods.
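
One way to picture the expected behavior is a guard that re-checks the job against the authoritative store before creating a pod, and evicts the stale cache entry when the job is gone. The sketch below is an assumption for illustration only, not Volcano's actual code or the fix in the linked pull request: `JobGetter`, `liveStore`, `Controller`, and `shouldCreatePod` are all hypothetical names, and a real controller would perform the live lookup through the Kubernetes API.

```go
package main

import "fmt"

// JobGetter is a hypothetical interface for a live lookup against the
// authoritative store, bypassing the controller's cache.
type JobGetter interface {
	JobExists(name string) bool
}

// liveStore is a toy in-memory implementation of JobGetter.
type liveStore map[string]bool

func (s liveStore) JobExists(name string) bool { return s[name] }

// Controller holds a job cache that may be stale.
type Controller struct {
	cache map[string]bool
	live  JobGetter
}

// shouldCreatePod reports whether a pod may be created for job. If the job
// is cached but missing from the live store, the stale cache entry is
// dropped so later syncs stop recreating pods that GC would delete.
func (c *Controller) shouldCreatePod(job string) bool {
	if !c.cache[job] {
		return false
	}
	if !c.live.JobExists(job) {
		delete(c.cache, job)
		return false
	}
	return true
}

func main() {
	c := &Controller{
		cache: map[string]bool{"job-a": true, "job-b": true},
		live:  liveStore{"job-b": true}, // job-a was deleted from the store
	}
	fmt.Println(c.shouldCreatePod("job-a"), c.shouldCreatePod("job-b")) // prints: false true
}
```

The extra lookup costs one read per pod creation, which is the trade-off against an unbounded create/delete loop.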

How to reproduce it (as minimally and precisely as possible):

While a pod is being updated, immediately delete its owner job.

Anything else we need to know?:

Environment: linux

  • Volcano Version: v1.8.2
  • Kubernetes version (use kubectl version): v1.20
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 9 (stretch)
  • Kernel (e.g. uname -a): Linux 5.10.0-103-bili-colo x86_64
  • Install tools:
  • Others:
@Wang-Kai Wang-Kai added the kind/bug Categorizes issue or PR as related to a bug. label Jul 16, 2024
@Wang-Kai Wang-Kai linked a pull request Aug 21, 2024 that will close this issue