This repository has been archived by the owner on May 25, 2023. It is now read-only.

[Question] How does kube-batch ensure cache consistency when another pod is scheduled by the default kube-scheduler? #922

Closed
qw2208 opened this issue Dec 4, 2019 · 16 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

qw2208 commented Dec 4, 2019

/kind feature

Since kube-batch opens a session and seems to refresh its cache snapshot every 1 second (please correct me if I've misunderstood), what happens if another normal pod is scheduled by the default Kubernetes scheduler during that 1-second window? Will the kube-batch cache still be consistent with the cluster state?

Or will kube-batch and kube-scheduler schedule pods at the same time?

Thanks in advance; an explanation or a pointer to the relevant code would work just as well.
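For context on the flow being asked about, here is a minimal Go sketch of the behavior described above (take a cache snapshot, open a session against it, run the scheduling actions, close the session, repeat on a fixed period). The names `takeSnapshot` and `runOnce` are illustrative stand-ins, not kube-batch's actual functions, and the 1-second period is simply the value mentioned above:

```go
package main

import (
	"fmt"
	"time"
)

// snapshot is an illustrative stand-in for the scheduler cache's
// point-in-time view of nodes and pods.
type snapshot struct {
	takenAt time.Time
}

func takeSnapshot() snapshot { return snapshot{takenAt: time.Now()} }

// runOnce mirrors the described flow: take a snapshot, open a session
// against it, run the scheduling actions, then close the session.
func runOnce() {
	snap := takeSnapshot()
	fmt.Println("session opened against snapshot taken at", snap.takenAt)
	// ... run actions (allocate, preempt, backfill, ...) against snap ...
	fmt.Println("session closed")
}

func main() {
	schedulePeriod := time.Second // the 1s period mentioned above
	ticker := time.NewTicker(schedulePeriod)
	defer ticker.Stop()
	for range ticker.C {
		// Any binding done by another scheduler between two ticks is
		// only observed at the next snapshot.
		runOnce()
	}
}
```

The sketch is only meant to show that decisions within one session are made against the snapshot; what happens when both schedulers race is discussed in the replies below.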

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 4, 2019
k82cn (Contributor) commented Dec 5, 2019

If kube-batch and the default scheduler schedule pods onto the same node, the kubelet will reject a pod when there are no resources left, and either kube-batch or the default scheduler will re-schedule it later :)

qw2208 (Author) commented Dec 5, 2019

Thanks @k82cn for your reply. What if, within that 1-second window, the other tasks of a PodGroup are scheduled but one task is rejected by the kubelet? Then gang scheduling might fail?

xial-thu commented Dec 5, 2019

My understanding is that gang scheduling will fail, and it's not a big deal: we still have the next round. Asynchronous binding in the default scheduler may fail too, and the default scheduler will add that pod to the backoffQ or unschedulableQ and wait for the next chance to re-schedule it. If you really mind the race condition, the scheduling framework proposed in the community, which aims to extend the abilities of the default scheduler via plugins, may help. @k82cn is one of the proposers, and the code is already merged into k8s.
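To make the scheduling-framework suggestion concrete, the self-contained Go sketch below shows the gang-scheduling idea behind a Permit-style plugin: hold each pod of a group at the permit stage until enough members have passed scheduling, otherwise time out and let the whole group retry in a later round. It deliberately uses its own toy types (`gangPermit`, `decision`) instead of the real scheduling framework interfaces, whose package paths and signatures have changed across Kubernetes releases:

```go
package main

import (
	"fmt"
	"time"
)

// decision models the outcomes a Permit-style plugin can return:
// allow binding, wait (park the pod), or reject it.
type decision string

const (
	allow  decision = "allow"
	wait   decision = "wait"
	reject decision = "reject"
)

// gangPermit parks pods of the same group at the permit stage until
// minMember of them have arrived, then releases them together.
type gangPermit struct {
	minMember int
	waiting   map[string]bool // pod name -> parked at the permit stage
	timeout   time.Duration
}

func (g *gangPermit) permit(podName string) (decision, time.Duration) {
	g.waiting[podName] = true
	if len(g.waiting) >= g.minMember {
		// Enough members reached the permit stage; a real plugin would
		// now release every waiting pod of the group for binding.
		return allow, 0
	}
	// Not enough members yet: park this pod. If the timeout expires
	// first, the gang fails as a whole and is retried next round.
	return wait, g.timeout
}

func main() {
	g := &gangPermit{minMember: 3, waiting: map[string]bool{}, timeout: 30 * time.Second}
	for _, pod := range []string{"task-0", "task-1", "task-2"} {
		d, t := g.permit(pod)
		fmt.Println(pod, "->", d, t)
	}
}
```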

qw2208 (Author) commented Dec 5, 2019

All right. I'm just trying to estimate the risk of adopting kube-batch as the batch scheduler.
So,

  1. In a large cluster, does it still seem all right for the default kube-scheduler and kube-batch (for gang scheduling) to co-exist?
  2. Could you please give me some references on extending the default scheduler?

xial-thu commented Dec 5, 2019

  1. It depends on your scenario, because the resource race condition exists. Things get out of control if both schedulers are busy in a resource-limited environment. I apply kube-batch/Volcano only to AI jobs, so CPU/memory contention does not bother me.
  2. https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/20180409-scheduling-framework.md

qw2208 (Author) commented Dec 6, 2019

Thanks @xial-thu for your reply.

  1. Are there any side effects if all normal pods are scheduled by kube-batch?
  2. For normal pods, my understanding is that if a normal pod's schedulerName is kube-batch and there is no PodGroup for it, kube-batch will create a shadow PodGroup and schedule it?

xial-thu commented Dec 6, 2019

  1. Are there any side effects if all normal pods are scheduled by kube-batch?
  2. For normal pods, my understanding is that if a normal pod's schedulerName is kube-batch and there is no PodGroup for it, kube-batch will create a shadow PodGroup and schedule it?

Pods are associated with a PodGroup only by annotations; normal pods will be scheduled anyway. The logic is in the code.
kube-batch re-uses the predicate and prioritize functions, so all the algorithms are available. I would suggest testing the stability of kube-batch itself first. Besides, the re-scheduling mechanism of kube-batch differs somewhat from the default scheduler's. I can't give an exact answer because my team hacks a lot on top of kube-batch, but sometimes it crashes.
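To make the association concrete, here is a hedged Go sketch of a pod that opts into kube-batch via `spec.schedulerName` and points at a PodGroup through an annotation. The annotation key `scheduling.k8s.io/group-name` and the group name `my-podgroup` are assumptions for illustration; verify the key against the kube-batch version you run:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	pod := corev1.Pod{
		TypeMeta: metav1.TypeMeta{APIVersion: "v1", Kind: "Pod"},
		ObjectMeta: metav1.ObjectMeta{
			Name: "worker-0",
			// Assumed annotation associating the pod with a PodGroup;
			// check the exact key used by your kube-batch version.
			Annotations: map[string]string{"scheduling.k8s.io/group-name": "my-podgroup"},
		},
		Spec: corev1.PodSpec{
			// Route this pod to kube-batch instead of the default scheduler.
			SchedulerName: "kube-batch",
			Containers: []corev1.Container{
				{Name: "main", Image: "busybox", Command: []string{"sleep", "3600"}},
			},
		},
	}
	out, err := yaml.Marshal(pod)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

If no matching PodGroup exists, the shadow-PodGroup behavior described above applies; with one, the gang constraint comes from the PodGroup's minMember.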

qw2208 (Author) commented Dec 9, 2019

Hah, it seems kube-batch can still be a feasible option for gang scheduling, but it is not quite production-ready.

k82cn (Contributor) commented Dec 9, 2019

I think you can try Volcano, which we have already put into a production environment :)

qw2208 (Author) commented Dec 9, 2019

Two questions then:

  1. If we use Volcano, should all pods be scheduled by Volcano?
  2. Will kube-batch be able to support large-scale scenarios someday? Is it still being updated?

k82cn (Contributor) commented Dec 10, 2019

  1. If we use Volcano, should all pods be scheduled by Volcano?

It's up to you; Volcano supports a multi-scheduler setup, and it also supports the features of the default scheduler.

  2. Will kube-batch be able to support large-scale scenarios someday? Is it still being updated?

It depends on how you define "large-scale scenarios"; it is still being updated.
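On the multi-scheduler point: each scheduler only picks up pods whose `spec.schedulerName` matches its own name, and an unset name falls to the default scheduler, so Volcano-scheduled and default-scheduled pods can coexist in one cluster. Below is a minimal Go sketch of that selection rule, using `volcano` and `default-scheduler` as the conventional scheduler names (confirm them against your deployment):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// responsibleFor reports whether a scheduler with the given name should
// handle this pod; an empty spec.schedulerName means the default scheduler.
func responsibleFor(schedulerName string, pod corev1.Pod) bool {
	name := pod.Spec.SchedulerName
	if name == "" {
		name = "default-scheduler"
	}
	return name == schedulerName
}

func main() {
	batchPod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "training-worker"},
		Spec:       corev1.PodSpec{SchedulerName: "volcano"}, // handled by Volcano only
	}
	webPod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "web"},
		Spec:       corev1.PodSpec{}, // no schedulerName: the default scheduler handles it
	}
	for _, p := range []corev1.Pod{batchPod, webPod} {
		fmt.Printf("%s: volcano=%v default=%v\n", p.Name,
			responsibleFor("volcano", p), responsibleFor("default-scheduler", p))
	}
}
```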

k82cn (Contributor) commented Dec 10, 2019

I can't give an exact answer because my team hacks a lot on top of kube-batch, but sometimes it crashes.

@xial-thu, any detail on that?

xial-thu commented

@xial-thu, any detail on that?

Sorry, I lost the screenshot and logs. I remember the pods showed something like “admissionError”, but I'm not sure.

I'm planning to do some benchmarking with kubemark. I'll update here if I hit that problem again.

k82cn (Contributor) commented Dec 11, 2019

I'm planning to do some benchmarking with kubemark. I'll update here if I hit that problem again.

That's great!

fejta-bot commented

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 10, 2020
fejta-bot commented

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 9, 2020
@qw2208 qw2208 closed this as completed May 5, 2020