Scale down may cause downtime #40304

Closed
caarlos0 opened this issue Jan 23, 2017 · 11 comments
@caarlos0
Contributor

caarlos0 commented Jan 23, 2017

I created a service ops with 10 replicas, and a strategy.rollingUpdate.maxUnavailable = 1:

$ kubectl scale deployments ops --replicas 10
deployment "ops" scaled

$ kubectl get pods
NAME                  READY     STATUS    RESTARTS   AGE
ops-886507735-4h9ad   2/2       Running   0          9s
ops-886507735-9ng1p   2/2       Running   0          29s
ops-886507735-evawq   2/2       Running   0          44s
ops-886507735-ffjhx   2/2       Running   0          9s
ops-886507735-k6mmo   2/2       Running   0          9s
ops-886507735-m2nvl   2/2       Running   0          9s
ops-886507735-pke96   2/2       Running   0          9s
ops-886507735-veazd   2/2       Running   0          9s
ops-886507735-visci   2/2       Running   0          9s
ops-886507735-zdp26   2/2       Running   0          9s
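
For reference, a Deployment roughly matching this setup could look like the sketch below. This is a minimal, hypothetical manifest (the apiVersion, labels, maxSurge value, and image tag are assumptions for illustration; the actual pods in this report run two containers, not one):

$ kubectl apply -f - <<EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ops
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one replica may be unavailable during a rollout
      maxSurge: 1
  template:
    metadata:
      labels:
        app: ops
    spec:
      containers:
      - name: ops
        image: user/ops:good   # placeholder tag for a working image
EOF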

If I deploy something wrong (let's suppose a bad Docker image that won't come up, or a tag that doesn't exist), some of the replicas will still be running:

$ kubectl set image deployment/ops ops=user/ops:nope
deployment "ops" image updated

$ kubectl get pods
NAME                   READY     STATUS             RESTARTS   AGE
ops-2122582723-10aep   1/2       ImagePullBackOff   0          31s
ops-2122582723-tyvqq   1/2       ImagePullBackOff   0          31s
ops-886507735-4h9ad    2/2       Running            0          1m
ops-886507735-9ng1p    2/2       Running            0          1m
ops-886507735-evawq    2/2       Running            0          1m
ops-886507735-ffjhx    2/2       Running            0          1m
ops-886507735-m2nvl    2/2       Running            0          1m
ops-886507735-pke96    2/2       Running            0          1m
ops-886507735-veazd    2/2       Running            0          1m
ops-886507735-visci    2/2       Running            0          1m
ops-886507735-zdp26    2/2       Running            0          1m

Now, if I scale down (for some reason) to 2 replicas, for example, what happens is:

$ kubectl scale deployments ops --replicas 2
deployment "ops" scaled

$ kubectl get pods
NAME                   READY     STATUS             RESTARTS   AGE
ops-2122582723-tyvqq   1/2       ImagePullBackOff   0          2m
ops-2122582723-yuruf   1/2       ErrImagePull       0          33s
ops-886507735-evawq    2/2       Running            0          4m

Which is OK; the service is still up.

Now, if I scale down to 1 replica, things get ugly:

$ kubectl scale deployments ops --replicas 1
deployment "ops" scaled

$ kubectl get pods
NAME                   READY     STATUS             RESTARTS   AGE
ops-2122582723-tyvqq   1/2       ImagePullBackOff   0          3m
ops-886507735-evawq    2/2       Terminating        0          4m

Downtime!

Why doesn't Kubernetes keep the container that was working instead of killing it and trying to launch a new one?

Is there a way of auto-rolling back when this kind of thing happens (or, even better, preventing it)?

@0xmichalis
Contributor

@caarlos0 this is how the initial pass on proportional scaling was implemented. Also, when scaling down, we always try to remove from old replica sets first. It's definitely desirable to enhance the deployment controller to scale down broken pods first.

@0xmichalis 0xmichalis self-assigned this Jan 23, 2017
@0xmichalis
Contributor

Is there a way of auto-rolling back when this kind of things happen (or even better, prevent it)?

Autorollback (#23211) is yet to be implemented, but in 1.5 you can use progressDeadlineSeconds to identify stuck deployments.

https://kubernetes.io/docs/user-guide/deployments/#deployment-status
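
As a rough sketch of how that flow could look (600 is just an example deadline in seconds, not a recommended value):

$ kubectl patch deployment ops -p '{"spec":{"progressDeadlineSeconds":600}}'

# rollout status watches the rollout and returns an error once the deadline is exceeded
$ kubectl rollout status deployment/ops

# the Progressing condition in the Deployment status reports why the rollout stalled
$ kubectl describe deployment ops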

@0xmichalis 0xmichalis added this to the 1.6 milestone Jan 23, 2017
@0xmichalis 0xmichalis added area/workload-api/deployment, sig/apps, kind/enhancement and removed sig/apps labels Jan 23, 2017
@caarlos0
Contributor Author

Got it, thanks @Kargakis 👍

@0xmichalis
Contributor

@kubernetes/sig-apps-misc

@0xmichalis
Contributor

@caarlos0 one suggestion for now: since it's hard to act on perma-failed errors (e.g. somebody may not care about ImagePullBackOff and expect the image to land at some point in the future), if you are going to scale down manually, first make sure that your Deployment is healthy. In this case you should roll back with kubectl rollout undo before scaling down. Eventually, we should make sure that scaling down removes broken pods first, because you may be using an autoscaler.
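
Put together, the suggested manual workflow is roughly this (a sketch built from the commands already used in this thread):

# 1. check whether the latest rollout actually succeeded
$ kubectl rollout status deployment/ops

# 2. if it is stuck, roll back to the previous working revision first
$ kubectl rollout undo deployment/ops

# 3. only then scale down
$ kubectl scale deployments ops --replicas 2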

@0xmichalis
Contributor

Anybody from @kubernetes/sig-apps-misc have time to take a stab at this one? Basically, we should clean up unhealthy pods before estimating proportions when we scale down in scale().

@caarlos0
Contributor Author

caarlos0 commented Feb 2, 2017

If someone points me in the right direction, I can try to tackle this...

@0xmichalis
Contributor

Ok, I just realized that trying to clean up the new replica set will do no good. The system always tries to deploy the latest replica set, so having one part of the controller scale down the new replica set (cleanup) and another part scale it up (the strategy) would drive the controller into hotlooping. That being said, I think this is not an issue: we provide you with ways/tools to diagnose failures (d.spec.progressDeadlineSeconds) and roll back (kubectl rollout undo).

@caarlos0
Contributor Author

caarlos0 commented Feb 2, 2017

@Kargakis maybe just fail then? Show some error message saying that it is not possible to scale down because there are no healthy instances in the new version, or something like that?

Of course, the user can check before scaling down... this would just be some kind of safeguard...

@0xmichalis
Contributor

We cannot special-case the operation; otherwise we may drive autoscalers into hotloops. There is no reason not to roll back in this case, other than when you expect the new image to become available at some point in the future.

@caarlos0
Contributor Author

caarlos0 commented Feb 3, 2017

OK, makes sense. Thanks @Kargakis =D

@calebamiles calebamiles modified the milestones: v1.6, 1.6 Feb 13, 2017