
StatefulSet does not honor podManagementPolicy=OrderedReady when restarting failed pods #82529

Open
Benjamin-Riefenstahl-mecom opened this issue Sep 10, 2019 · 11 comments


@Benjamin-Riefenstahl-mecom

commented Sep 10, 2019

What happened:

We have a StatefulSet with podManagementPolicy=OrderedReady. For testing, I deleted all of its pods with kubectl delete --force --grace-period=0. Kubernetes restarted the pods all at the same time, i.e. in parallel.

What you expected to happen:

The pods should be restarted one by one. They depend on coming up in order, which is why we use OrderedReady.

How to reproduce it (as minimally and precisely as possible):

See above.
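Concretely, a minimal sketch of the test (assuming a three-replica StatefulSet whose pods are named test-pods-0 through test-pods-2, matching the names used later in this thread):

# Force-delete all pods of the StatefulSet at once, skipping graceful shutdown.
kubectl delete pod test-pods-0 test-pods-1 test-pods-2 --force --grace-period=0

# Watch the replacements; with OrderedReady one would expect test-pods-0 to
# become Ready before test-pods-1 is created, but they come back in parallel.
kubectl get pods -w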

Anything else we need to know?:

If there is another method to achieve what we want, please let us know.

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-02-28T13:37:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.12", GitCommit:"c757b93cf034d49af3a3b8ecee3b9639a7a11df7", GitTreeState:"clean", BuildDate:"2018-12-19T11:04:29Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    Self-hosted
  • OS (e.g: cat /etc/os-release):
    Ubuntu 16.04.5 LTS
  • Kernel (e.g. uname -a):
    4.4.0-140-generic
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@Benjamin-Riefenstahl-mecom

Author

commented Sep 10, 2019

/sig bugs

@k8s-ci-robot

Contributor

commented Sep 10, 2019

@Benjamin-Riefenstahl-mecom: The label(s) sig/bugs cannot be applied. These labels are supported: api-review, community/discussion, community/maintenance, community/question, cuj/build-train-deploy, cuj/multi-user, platform/aws, platform/azure, platform/gcp, platform/minikube, platform/other

In response to this:

/sig bugs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Joseph-Irving

Contributor

commented Sep 10, 2019

/sig apps

@k8s-ci-robot added sig/apps and removed needs-sig labels Sep 10, 2019

@Joseph-Irving

Contributor

commented Sep 10, 2019

I've tried reproducing this on Kubernetes v1.10.12, but I could not get the same behaviour. I created a basic sample StatefulSet:

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        readinessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 10
        ports:
        - containerPort: 80
          name: web

I deleted all the pods:

kubectl delete pods -l app=nginx --force --grace-period=0

The pods started getting recreated:

kubectl get pods
NAME    READY   STATUS              RESTARTS   AGE
web-0   0/1     ContainerCreating   0          2s

The pods were recreated in order, as expected:

kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          20s
web-1   0/1     Running   0          2s

Can you share your StatefulSet manifest (kubectl get statefulset <statefulset name> -o yaml)?

Also worth noting that Kubernetes v1.10.12 is EOL; the currently supported versions are 1.13, 1.14, and 1.15, so it might be worth upgrading to see if that fixes your issue.

@Benjamin-Riefenstahl-mecom

Author

commented Sep 11, 2019

This comment has been minimized.

@Benjamin-Riefenstahl-mecom

This comment has been minimized.

@Joseph-Irving

Contributor

commented Sep 11, 2019

I think what's happening there is all working as expected. When you execute this command:

for i in 2 1 0; do kubectl delete --force --grace-period=0 pod/test-pods-$i; done

First you delete test-pods-2, which is recreated immediately because test-pods-0 and test-pods-1 are still running. Then you delete test-pods-1, which is recreated because test-pods-0 is still running. Finally you delete test-pods-0, which is also recreated immediately. So the pods are correctly following the order.

If you change it to:

for i in 2 0 1; do kubectl delete --force --grace-period=0 pod/test-pods-$i; done

test-pods-2 will be recreated immediately because test-pods-1 is still running, but test-pods-1 will not: test-pods-0 was deleted before it, so test-pods-1 has to wait until test-pods-0 becomes Ready before it starts.
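A sketch of what a watch would show in that case (the output described in the comments is illustrative, not captured from a real run; it assumes the same test-pods StatefulSet):

kubectl get pods -w
# test-pods-2 reappears immediately, since test-pods-1 was still Running when
# test-pods-2 was deleted.
# test-pods-0 reappears next, and only once it reports Ready does the
# controller create test-pods-1 again.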

@Benjamin-Riefenstahl-mecom

Author

commented Sep 11, 2019

This comment has been minimized.

@Joseph-Irving

Contributor

commented Sep 11, 2019

Right, so my understanding of your problem is that you want to be able to delete pods in an arbitrary order and have them recreated in that same order, each waiting for the previous one to become ready? I don't believe there's any way to do this in Kubernetes, and I don't think I fully understand the use case for it.

You could open a feature request issue and explain what you're proposing, with a user story for why it's needed; if enough people in the community agree, it could end up being built into Kubernetes. Otherwise you're probably better off implementing your own solution to the problem.
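For the "own solution" route, one possible shape is a script that restarts one pod at a time. A minimal sketch, assuming the pods are named test-pods-0 through test-pods-2 and that kubectl v1.11+ is available (for kubectl wait):

# Restart the StatefulSet's pods strictly one at a time, lowest ordinal first.
for i in 0 1 2; do
  kubectl delete pod "test-pods-$i" --force --grace-period=0
  # The controller needs a moment to recreate the pod object, so poll for it.
  until kubectl get pod "test-pods-$i" >/dev/null 2>&1; do sleep 1; done
  # Block until the replacement reports Ready before touching the next pod.
  kubectl wait --for=condition=Ready "pod/test-pods-$i" --timeout=300s
done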

@Benjamin-Riefenstahl-mecom

Author

commented Sep 11, 2019

This comment has been minimized.

@Benjamin-Riefenstahl-mecom

Author

commented Sep 16, 2019

I added a feature request for this (see above).

In the meantime, I would suggest that this problem (OrderedReady does not mean strictly serial restarts in error cases) should be noted in the documentation.
