Unable to perform a rolling upgrade #53

Closed · lchdev opened this issue Aug 24, 2021 · 1 comment · Fixed by #54

lchdev commented Aug 24, 2021

Describe the bug
Unable to perform a rolling upgrade: new pods are never marked ready because the liveness and readiness probes fail continuously.

To Reproduce

  1. Install the chart with Helm: helm upgrade --install kubernetes-secret-generator mittwald/kubernetes-secret-generator --version 3.3.2
  2. Attempt any upgrade, e.g. by changing a value: helm upgrade --install kubernetes-secret-generator mittwald/kubernetes-secret-generator --version 3.3.2 --set secretLength=50

The deployment will try to create a new pod, but the container will enter a crash loop and never become ready:

$ kubectl get po
NAME                                           READY   STATUS             RESTARTS   AGE
kubernetes-secret-generator-6f79f56667-v8n7l   0/1     CrashLoopBackOff   6          4m41s
kubernetes-secret-generator-b55758744-hts96    1/1     Running            0          6m32s
$ kubectl describe po kubernetes-secret-generator-6f79f56667-v8n7l
Name:         kubernetes-secret-generator-6f79f56667-v8n7l
(...)
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  55s                default-scheduler  Successfully assigned default/kubernetes-secret-generator-6f79f56667-v8n7l to docker-desktop
  Normal   Pulled     52s                kubelet            Successfully pulled image "quay.io/mittwald/kubernetes-secret-generator:v3.3.2" in 1.3045537s
  Normal   Started    41s (x2 over 52s)  kubelet            Started container kubernetes-secret-generator
  Normal   Pulled     41s                kubelet            Successfully pulled image "quay.io/mittwald/kubernetes-secret-generator:v3.3.2" in 1.3209008s
  Normal   Pulling    31s (x3 over 54s)  kubelet            Pulling image "quay.io/mittwald/kubernetes-secret-generator:v3.3.2"
  Warning  Unhealthy  31s (x6 over 49s)  kubelet            Liveness probe failed: Get "http://10.1.0.11:8080/healthz": dial tcp 10.1.0.11:8080: connect: connection refused
  Warning  Unhealthy  31s (x6 over 49s)  kubelet            Readiness probe failed: Get "http://10.1.0.11:8080/readyz": dial tcp 10.1.0.11:8080: connect: connection refused
  Normal   Killing    31s (x2 over 43s)  kubelet            Container kubernetes-secret-generator failed liveness probe, will be restarted
  Normal   Pulled     30s                kubelet            Successfully pulled image "quay.io/mittwald/kubernetes-secret-generator:v3.3.2" in 1.2851037s
  Normal   Created    29s (x3 over 52s)  kubelet            Created container kubernetes-secret-generator

Container logs:

$ kubectl logs kubernetes-secret-generator-6f79f56667-v8n7l
{"level":"info","ts":1629819473.2746468,"logger":"cmd","msg":"Operator Version: 0.0.1"}
{"level":"info","ts":1629819473.2746885,"logger":"cmd","msg":"Go Version: go1.15.14"}
{"level":"info","ts":1629819473.2746935,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1629819473.2746956,"logger":"cmd","msg":"Version of operator-sdk: v0.16.0"}
{"level":"info","ts":1629819473.276408,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1629819473.8457088,"logger":"leader","msg":"Found existing lock","LockOwner":"kubernetes-secret-generator-b55758744-hts96"}
{"level":"info","ts":1629819473.8563178,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1629819474.994521,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1629819477.3851807,"logger":"leader","msg":"Not the leader. Waiting."}
{"level":"info","ts":1629819481.9319456,"logger":"leader","msg":"Not the leader. Waiting."}

If I manually kill the old instance, the new pod is able to become the leader and to start successfully.
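
The wait loop in the logs appears to come from operator-sdk's "leader for life" election: the lock is a ConfigMap owned by the old pod, and it is only released when that pod is deleted. A rough way to inspect the lock while the new pod is stuck (the ConfigMap name below is a guess and may differ for your release):

$ kubectl get configmaps
$ kubectl get configmap kubernetes-secret-generator-lock -o yaml
# The ownerReferences entry should point at the old, still-running pod,
# which is why the new pod cannot acquire the lock during a rolling update.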

elenz97 (Contributor) commented Aug 30, 2021

Hey @lchdev,
the upstream Helm chart now uses "Recreate" as its deployment strategy, which should result in upgraded deployments starting as expected.
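
Concretely, the rendered Deployment should now carry the following (standard Deployment fields, shown only as a sketch; the chart's exact output may differ):

spec:
  strategy:
    type: Recreate   # the old pod is terminated before the new one starts,
                     # so the leader lock is released and the new pod can acquire it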

I could successfully test this on a local cluster (using the upstream master codebase) by running:

$ helm install kubernetes-secret-generator deploy/helm-chart/kubernetes-secret-generator/.
$ helm upgrade --install kubernetes-secret-generator deploy/helm-chart/kubernetes-secret-generator --set secretLength=50
$ kubectl get po
NAME                                           READY   STATUS    RESTARTS   AGE
kubernetes-secret-generator-84946bf455-q8wjn   1/1     Running   0          30s

The deployment strategy can also be set explicitly via helm install [...] --set deploymentStrategy="Recreate".
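
To check which strategy an installed release ended up with (assuming the deployment name matches the release name):

$ kubectl get deployment kubernetes-secret-generator -o jsonpath='{.spec.strategy.type}'
# should print "Recreate" once the change is applied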

I hope this fixes your issue! If there's anything else we can help you with, please let us know.
