wait for all pods to be deleted before deleting serviceaccount/clusterrole/clusterrolebinding #4035

Merged
merged 1 commit into from
May 20, 2024

Conversation

zhangzujian
Member

Pull Request

What type of this PR

Examples of user facing changes:

  • Features
  • Bug fixes
  • Docs
  • Tests

Which issue(s) this PR fixes

Sometimes the ovs-ovn pod cannot be terminated normally:

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  17m                   default-scheduler  Successfully assigned kube-system/ovs-ovn-wjqj5 to kube-ovn-worker2
  Normal   Pulled     17m                   kubelet            Container image "docker.io/kubeovn/kube-ovn:v1.13.0" already present on machine
  Normal   Created    17m                   kubelet            Created container openvswitch
  Normal   Started    17m                   kubelet            Started container openvswitch
  Warning  Unhealthy  15m                   kubelet            Liveness probe failed: ovsdb-client: failed to connect to "unix:/var/run/openvswitch/db.sock" (No such file or directory)
  Normal   Killing    15m                   kubelet            Stopping container openvswitch
  Warning  Unhealthy  15m (x2 over 15m)     kubelet            Readiness probe failed: ovsdb-client: failed to connect to "unix:/var/run/openvswitch/db.sock" (No such file or directory)
  Warning  Unhealthy  15m                   kubelet            Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "3ec0fc74b2d03cacb1ff54dcaee63d8e29d94a2b0993b2160335743f38d3307e": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
  Warning  Unhealthy  15m                   kubelet            Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "5c1d72340480f545dd885009e84b2e93eeaf537f0014c3d866801e5bf85ceb75": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
  Warning  Unhealthy  15m                   kubelet            Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "3799c13cac4bfd7189515a7c15e5d185ca8689c3221fff60be462d8ccd7c926c": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
  Warning  Unhealthy  15m                   kubelet            Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "e501dafe346b1cdf5103062e615449400ddc401d6c9bfeef503f68a2cd6386cf": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
  Warning  Unhealthy  15m                   kubelet            Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "d211d8245cb11042631ff303fd17c7a0281775b4cce7d98e1285847be9159221": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
  Warning  Unhealthy  15m                   kubelet            Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "1d444354c7dbe448c9cbfc0f37cf8c1ad0ba5baa5ca9674cf94f0018e68ae5e6": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
  Warning  Unhealthy  14m                   kubelet            Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "8de8c8bbbb52009ccd8959ef529ada0cbe696ac3d021a73d56d56eebc037639e": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown
  Warning  Unhealthy  119s (x156 over 14m)  kubelet            (combined from similar events): Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "bd664559ec19be5798316753c5950999179c833ca0a4a0163a374250ca59f743": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown

kubelet logs:

May 16 07:51:45 kube-ovn-worker2 kubelet[221]: E0516 07:51:45.963955     221 remote_runtime.go:496] "ExecSync cmd from runtime service failed" err="rpc error: code = Unknown desc = failed to exec in container: failed to start exec \"e407f76790fca145a2325e46ceb08e018203e6e358558b10bed937de64e1237d\": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown" containerID="8416d0b32d306528894b325aa34b1abff365f9f930ea42623e003565cfd7d491" cmd=["bash","-c","LOG_ROTATE=true /kube-ovn/ovs-healthcheck.sh"]

Container:

root@kube-ovn-worker2:/# crictl ps
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD
8416d0b32d306       c64cb14aec49c       23 minutes ago      Running             openvswitch         0                   cfa6648337a01       ovs-ovn-wjqj5

Container processes:

containerd-shim(1166)-+-monitor(2359)---ovn-controller(2360)-+-{ovn-controller}(2364)
                      |                                      |-{ovn-controller}(2365)
                      |                                      `-{ovn-controller}(2373)
                      |-pause(1284)
                      |-tail(2384)

Container logs:

tail: '/var/log/ovn/ovn-controller.log' has become inaccessible: No such file or directory
E0516 07:34:10.829615   11031 memcache.go:265] couldn't get current server API group list: the server has asked for the client to provide credentials
E0516 07:34:10.862236   11031 memcache.go:265] couldn't get current server API group list: the server has asked for the client to provide credentials
E0516 07:34:10.886667   11031 memcache.go:265] couldn't get current server API group list: the server has asked for the client to provide credentials
E0516 07:34:10.932248   11031 memcache.go:265] couldn't get current server API group list: the server has asked for the client to provide credentials
E0516 07:34:10.973829   11031 memcache.go:265] couldn't get current server API group list: the server has asked for the client to provide credentials
error: You must be logged in to the server (the server has asked for the client to provide credentials)
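The credential errors above are consistent with the pod's own API calls failing once its service account has been removed. A minimal sketch of the ordering named in the PR title follows: poll until no matching pods remain, and only then delete the RBAC objects. This is not the actual change in this PR. The function name, the `KUBECTL` override, the `app=ovs` label selector, and the object names in the usage comment are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: names, selector, and timeouts below are illustrative assumptions,
# not taken from the kube-ovn cleanup script.

# KUBECTL can be overridden (for example with a stub) for dry runs.
KUBECTL="${KUBECTL:-kubectl}"

# wait_for_pods_deleted NAMESPACE SELECTOR [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls until no pod in NAMESPACE matches SELECTOR.
# Returns 0 once the list is empty, 1 if TIMEOUT_SECONDS elapses first.
wait_for_pods_deleted() {
  local ns="$1" selector="$2" timeout="${3:-300}" interval="${4:-2}"
  local waited=0 remaining
  while true; do
    # Succeed only when the query itself worked AND returned no pods,
    # so an unreachable API server is not mistaken for "all pods gone".
    if remaining="$($KUBECTL get pods -n "$ns" -l "$selector" -o name 2>/dev/null)" \
        && [ -z "$remaining" ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for pods to be deleted: ${remaining:-<query failed>}" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Hypothetical usage (object names are placeholders):
#   wait_for_pods_deleted kube-system app=ovs 300 2 &&
#     kubectl delete serviceaccount <name> -n kube-system --ignore-not-found &&
#     kubectl delete clusterrole <name> --ignore-not-found &&
#     kubectl delete clusterrolebinding <name> --ignore-not-found
```

Deleting the service account, cluster role, and cluster role binding only after the wait succeeds keeps credentials valid while containers are still shutting down. `kubectl wait --for=delete pod -l <selector> --timeout=<duration>` is a built-in alternative to the polling loop.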

…rrole/clusterrolebinding

Signed-off-by: zhangzujian <zhangzujian.7@gmail.com>
@zhangzujian zhangzujian marked this pull request as ready for review May 16, 2024 08:43
@zhangzujian
Member Author

@oilbeater

@zhangzujian zhangzujian merged commit bb46f57 into kubeovn:master May 20, 2024
62 checks passed
@zhangzujian zhangzujian deleted the fix-cleanup branch May 20, 2024 08:01
zhangzujian added a commit that referenced this pull request May 20, 2024
…rrole/clusterrolebinding (#4035)

Signed-off-by: zhangzujian <zhangzujian.7@gmail.com>
bobz965 pushed a commit that referenced this pull request May 21, 2024
…rrole/clusterrolebinding (#4035)

Signed-off-by: zhangzujian <zhangzujian.7@gmail.com>