cilium-cni gets deleted by delayed cni-uninstall.sh invocation #11828

Closed
nebril opened this issue Jun 2, 2020 · 0 comments · Fixed by #11830
Labels
kind/bug This is a bug in the Cilium logic.

nebril commented Jun 2, 2020

Bug report

How to reproduce the issue

  1. Edit quick-install.yaml:
     • increase terminationGracePeriodSeconds to 120
     • change the preStop command to command: ["sh", "-c", "sleep 60 && /cni-uninstall.sh"]
  2. Apply quick-install.yaml: kubectl apply -f install/kubernetes/quick-install.yaml
  3. Wait for the Cilium pods to be running.
  4. Run: kubectl delete ds cilium -n kube-system && kubectl apply -f install/kubernetes/quick-install.yaml

The last step deletes and recreates the Cilium daemonset. Two Cilium pods can then coexist on the same node: one terminating (from the deleted daemonset) and one initializing (from the new daemonset). If the terminating pod's cni-uninstall.sh runs after the new pod's cni-install.sh, it deletes the freshly installed cilium-cni and causes breakage.
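The ordering problem can be illustrated with a toy simulation that does not touch a real node. This is only a sketch: `install_cni` and `uninstall_cni` are hypothetical stand-ins for cni-install.sh and cni-uninstall.sh that create and remove a marker file in a temporary directory, and the 60 second sleep is shortened to 1 second.

```shell
#!/bin/sh
# Toy simulation of the race; nothing here touches a real CNI directory.
# install_cni / uninstall_cni are hypothetical stand-ins for
# cni-install.sh / cni-uninstall.sh and only create/remove a marker file.

CNI_BIN_DIR=$(mktemp -d)   # stand-in for the host CNI binary directory

install_cni()   { touch "$CNI_BIN_DIR/cilium-cni"; }
uninstall_cni() { rm -f "$CNI_BIN_DIR/cilium-cni"; }

# Old (terminating) pod: its preStop hook is delayed, as in the modified
# quick-install.yaml (1 second here instead of 60 to keep the demo short).
( sleep 1 && uninstall_cni ) &
old_pod=$!

# New (initializing) pod: starts right away and installs the plugin.
install_cni

# Wait for the old pod's delayed preStop hook to finish.
wait "$old_pod"

if [ -e "$CNI_BIN_DIR/cilium-cni" ]; then
  RESULT="present"
else
  RESULT="missing"
fi
echo "cilium-cni is $RESULT after both hooks have run"

rm -rf "$CNI_BIN_DIR"
```

Because the delayed uninstall runs last, the script reports the marker file as missing, which mirrors the state the node ends up in when the race is hit.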

The timing window for this race is very tight, so it rarely manifests, but it does happen in our CI, where we delete and recreate daemonsets frequently.

Increasing terminationGracePeriodSeconds and adding the sleep makes the race reproduce every time.

nebril self-assigned this Jun 2, 2020
nebril added the kind/bug label Jun 2, 2020