
Kubernetes, disable dask scheduler and workers auto rescheduling #112

Closed
VMois opened this issue Nov 21, 2018 · 18 comments

@VMois commented Nov 21, 2018

Hello,

The issue is more related to Kubernetes and GCP, but I would still like some advice. I have created a dynamic Dask Kubernetes cluster (using dask-kubernetes) on GCP with node autoscaling enabled. The initial state of the Dask cluster is one scheduler pod (running KubeCluster) and one worker pod (created by the scheduler). Everything works well, but when the scheduler starts adding new workers (due to high load) and GCP begins to scale up nodes, quite often the scheduler pod itself gets rescheduled: Kubernetes or GCP decides to delete the scheduler and recreate it on another node. Because of that, all tasks are lost, I receive an error, and the cluster becomes unstable. Have you ever experienced such behavior?

To tackle this problem, I added a nodeSelector to the scheduler pod definition, and it seems to work (at least it looks that way); a sketch of this is shown below. But the same situation happens to the workers: they are deleted and recreated, and you lose your results. For the workers, you cannot easily set up nodeSelector labels, because they are created dynamically.
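
Roughly, the pinning is just a nodeSelector in the scheduler Deployment's pod template; a minimal sketch, assuming a dedicated GKE node pool (the pool name, image, and args below are placeholders, not my actual manifest):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dask-scheduler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dask
      component: scheduler
  template:
    metadata:
      labels:
        app: dask
        component: scheduler
    spec:
      # Keep the scheduler on a dedicated node pool so the GKE autoscaler
      # is less likely to remove the node underneath it (pool name is an assumption)
      nodeSelector:
        cloud.google.com/gke-nodepool: scheduler-pool
      containers:
      - name: scheduler
        image: daskdev/dask:latest   # placeholder; I use my own image
        args: ["dask-scheduler"]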

It would be really great to have a property in the pod definition that says: "Don't delete/move this pod until it fails or succeeds." Would it make sense to add such functionality to the Kubernetes project?

Maybe you have ideas for a different solution?

Thank you for your attention.

@jacobtomlinson (Member)

That is an interesting problem! The way to solve this would be with pod disruption budgets on your scheduler. You should set your scheduler to require a minimum of 1 pod in the budget.
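
For example, a minimal budget of that shape, assuming the scheduler pod carries the app: dask and component: scheduler labels used elsewhere in this thread, would look something like:

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: dask-scheduler
spec:
  # Keep at least one scheduler pod available during voluntary disruptions
  minAvailable: 1
  selector:
    matchLabels:
      app: dask
      component: scheduler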

@VMois (Author) commented Nov 22, 2018

I have already tried pod disruption budgets with minAvailable = 1 and it didn't work: Kubernetes just creates a second pod on another node and deletes the current one. It also doesn't solve the problem of worker rescheduling.

But I have checked the disruption budget docs one more time, more carefully, and found that you can set maxUnavailable=0 as the budget parameter (info). I will test it tomorrow. Thanks for the pointer.

@jacobtomlinson (Member)

I'm going to close this as it sounds like you have a solution. But please feed back here to let us know how you get on.

@VMois (Author) commented Nov 23, 2018

maxUnavailable=0 is not working. I will check the PDB one more time, but I doubt it will change anything.

Edit:
I tested maxUnavailable=0 one more time. The result is the same; the PDB is not working.

@jacobtomlinson (Member)

That's a shame! I would've expected that to work. Did you set both minAvailable and maxUnavailable at the same time?

Another option could be to explore stateful sets for the scheduler pod. But that would be a last resort.
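
Very roughly, that would mean something like the following (a sketch only; the headless dask-scheduler Service, image, and args are assumptions, not part of this project):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dask-scheduler
spec:
  serviceName: dask-scheduler   # assumes a matching headless Service exists
  replicas: 1
  selector:
    matchLabels:
      app: dask
      component: scheduler
  template:
    metadata:
      labels:
        app: dask
        component: scheduler
    spec:
      containers:
      - name: scheduler
        image: daskdev/dask:latest   # placeholder image
        args: ["dask-scheduler"]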

@VMois (Author) commented Nov 23, 2018

No, I set only maxUnavailable. I can try setting both of them later.

Edit:
you cannot set both maxUnavailable and minAvailable at the same time. I have no idea why maxUnavailable is not working.

@VMois (Author) commented Nov 27, 2018

Is it possible to reopen the issue? The problem is still relevant. I think it's more related to Kubernetes, but it still affects dask-kubernetes.

@jacobtomlinson (Member)

I would like to get to the bottom of this problem, so let's reopen it.

But to be clear, this isn't a problem with dask-kubernetes, as the issue you are having is with your own custom scheduler pod, which is not created as part of this project.

I'm very surprised that setting maxUnavailable to 0 isn't working. Are you sure you are using the right labels, etc?

@VMois (Author) commented Nov 27, 2018

The scheduler pod is based on dask-kubernetes and my own image (link). When I tested maxUnavailable last time, I checked all the labels a few times, and my pod is part of a Deployment. Maybe I missed something. I will test it one more time today and post the results here. I'm also surprised that maxUnavailable is not working as expected.

Thanks for reopening this issue.

@VMois (Author) commented Nov 27, 2018

I have just tested it. Not working. I can clearly see in the GCP logs that the container was rescheduled, and my tasks failed. Does it make sense to open an issue in the Kubernetes project? I'm sure I have configured everything correctly. Example of the PDB:

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: dask-scheduler
spec:
  maxUnavailable: 0%
  selector:
    matchLabels:
      component: scheduler
      app: dask

Part of the describe pod output for the scheduler:

Labels:         
  app=dask
  component=scheduler
  pod-template-hash=3761245660
  release=test

Edit:

It's also worth pointing out one more time that I'm using Google Kubernetes Engine with its autoscaler. Their docs mention that during scaling DOWN the autoscaler will respect PDBs and other limitations. Since my problem occurs during scaling UP, it may be that PDBs are not honored during scale up.

@jacobtomlinson (Member)

You may want to ask this somewhere more Kubernetes-related. Perhaps on Stack Overflow with the kubernetes and gke tags?

@VMois (Author) commented Nov 27, 2018

@mamoit commented Nov 27, 2018

You really have to set up a pod disruption budget for your one-pod deployment not to be rescheduled.

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: dask-scheduler
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      component: scheduler
      app: dask

I don't have the % and don't know if it changes anything, and I'm using GKE too.
I had the same problem as you, but on scale down, and it went away when I added the PDB.

The correct way to solve this, though, is to change your deployment to a plain pod, since it doesn't seem that you are making use of any of the ReplicaSet properties and you are running single-pod deployments.

EDIT: The % doesn't seem to change anything.
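
As a sketch, replacing the Deployment with a bare pod would look roughly like this (same labels; the image and args are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: dask-scheduler
  labels:
    app: dask
    component: scheduler
spec:
  restartPolicy: Always   # restart in place instead of relying on a ReplicaSet
  containers:
  - name: scheduler
    image: daskdev/dask:latest   # placeholder image
    args: ["dask-scheduler"]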

@VMois (Author) commented Nov 30, 2018

@mamoit Still the same problem: the PDB is not respected during scale up, only scale down. Also, there is no difference between a Deployment and a Pod. The only solution I have found for the scheduler is to manually add a nodeSelector, but the problem still exists for the workers.

@jacobtomlinson (Member)

There is a difference between a deployment and a pod. If you create a pod that is not part of a deployment, the cluster scheduler will not move it around for fear of breaking it.

@mamoit commented Dec 11, 2018

@VMois The "not move it around for fear of breaking it" part that @jacobtomlinson mentioned is what I thought could help in your case.
Did you manage to solve this?

@VMois (Author) commented Dec 14, 2018

@mamoit I have not managed to solve it with a pod, as far as I remember :) The only working solution is still manually adding a nodeSelector. I will be able to test all your suggestions one more time next week.

@jacobtomlinson (Member)

As this is still ticking along and is not actually an issue with dask-kubernetes, but rather a Dask-on-Kubernetes use case, I'm going to close this again in favor of the Stack Overflow question.
