[Feature] [scheduler-plugins] Support second scheduler mode #3852
base: master
Conversation
Signed-off-by: Cheyu Wu <cheyu1220@gmail.com>
09c3853 to ea71807
Hi @kevin85421, PTAL.
Why do you use a single scheduler for the manual test?
cc @troychiu for review
Hi @kevin85421, I used the following labels:

```yaml
ray.io/scheduler-name: scheduler-plugins
```

and verified via:

```
$ kubectl get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.schedulerName}{"\n"}{end}'
```

This setup follows the multi-scheduler setup. I'll revise the wording in the PR description to avoid confusion around the "single scheduler" statement.
As I understand it, you deploy scheduler-plugins in "single scheduler" mode to replace the default scheduler. For "second scheduler" mode, you need to use the Helm chart to install scheduler-plugins in a separate Pod.
https://github.com/kubernetes-sigs/scheduler-plugins/blob/93126eabdf526010bf697d5963d849eab7e8e898/doc/install.md#as-a-second-scheduler
Oops, I had a misunderstanding. I will use the second scheduler mode instead.
Hi @kevin85421 @troychiu, I have updated the manual testing procedure, PTAL.
I have also updated the 100-Pod manual test, and all of the Pods are in Pending status.
Have you tested both single scheduler and second scheduler modes with this 100-Pod RayCluster CR?
```diff
@@ -90,8 +90,7 @@ func (k *KubeScheduler) AddMetadataToPod(_ context.Context, app *rayv1.RayCluste
 	if k.isGangSchedulingEnabled(app) {
 		pod.Labels[kubeSchedulerPodGroupLabelKey] = app.Name
 	}
-	// TODO(kevin85421): Currently, we only support "single scheduler" mode. If we want to support
-	// "second scheduler" mode, we need to add `schedulerName` to the pod spec.
+	pod.Spec.SchedulerName = k.Name()
```
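For reference, the full method after this hunk looks roughly like the sketch below. The parameters after `app` and their names are assumptions (the hunk header is truncated); they are reconstructed from the test call `AddMetadataToPod(context.TODO(), &cluster, "worker", pod)` shown further down.

```go
// Sketch only, within the existing scheduler-plugins package; trailing parameter names are assumed.
func (k *KubeScheduler) AddMetadataToPod(_ context.Context, app *rayv1.RayCluster, _ string, pod *corev1.Pod) {
	// Gang scheduling: label the Pod with its PodGroup so scheduler-plugins can co-schedule it.
	if k.isGangSchedulingEnabled(app) {
		pod.Labels[kubeSchedulerPodGroupLabelKey] = app.Name
	}
	// Setting schedulerName on the Pod spec is what enables "second scheduler" mode:
	// the default kube-scheduler skips the Pod and the scheduler-plugins scheduler handles it.
	pod.Spec.SchedulerName = k.Name()
}
```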
Should we change the name to scheduler-plugins-scheduler to match the default name in https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/manifests/install/charts/as-a-second-scheduler/values.yaml#L6C9-L6C36?
```go
if cluster.Labels == nil {
	cluster.Labels = make(map[string]string)
}
if tt.enableGang {
	cluster.Labels["ray.io/gang-scheduling-enabled"] = "true"
} else {
	delete(cluster.Labels, "ray.io/gang-scheduling-enabled")
}
```
Will this be cleaner?
Suggested change:

```diff
-if cluster.Labels == nil {
-	cluster.Labels = make(map[string]string)
-}
-if tt.enableGang {
-	cluster.Labels["ray.io/gang-scheduling-enabled"] = "true"
-} else {
-	delete(cluster.Labels, "ray.io/gang-scheduling-enabled")
-}
+cluster.Labels = make(map[string]string)
+if tt.enableGang {
+	cluster.Labels["ray.io/gang-scheduling-enabled"] = "true"
+}
```
```go
scheduler := &KubeScheduler{}
scheduler.AddMetadataToPod(context.TODO(), &cluster, "worker", pod)

if tt.expectedPodGroup {
```
Can we simply use enableGang instead of having another parameter? I think they have similar intentions.
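For example, something along these lines might work (a sketch only: it assumes the table-driven structure shown above and testify-style assertions; it is not the actual test code):

```go
// Sketch: derive the expectation from enableGang directly, instead of a separate
// expectedPodGroup field in the test table.
if tt.enableGang {
	// Gang scheduling enabled: the pod-group label should point at the RayCluster.
	assert.Equal(t, cluster.Name, pod.Labels[kubeSchedulerPodGroupLabelKey])
} else {
	// Gang scheduling disabled: no pod-group label should be added.
	assert.NotContains(t, pod.Labels, kubeSchedulerPodGroupLabelKey)
}
```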
As @kevin85421 mentioned, can you also double check if both modes work fine?
```diff
@@ -21,7 +21,7 @@ import (
 )

 const (
-	schedulerName string = "scheduler-plugins"
+	schedulerName string = "scheduler-plugins-scheduler"
```
Do you mind adding a comment mentioning the source of the name? https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/manifests/install/charts/as-a-second-scheduler/values.yaml#L6C9-L6C36
Yes, this is important. I will add the comment.
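For example, something like the following (a sketch only; the exact comment wording is still up for review):

```go
const (
	// schedulerName matches the default scheduler name installed by the scheduler-plugins
	// "as-a-second-scheduler" Helm chart:
	// https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/manifests/install/charts/as-a-second-scheduler/values.yaml#L6C9-L6C36
	schedulerName string = "scheduler-plugins-scheduler"
)
```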
```diff
@@ -69,13 +69,13 @@ logging:
 #
 # 4. Use PodGroup
 # batchScheduler:
-#   name: scheduler-plugins
+#   name: scheduler-plugins-scheduler
```
For user-facing config, I am not sure if we should use "scheduler-plugins" or "scheduler-plugins-scheduler". Wdyt?
You are right, and it's easier to understand.
But I think this is a little awkward: we cannot directly change GetPluginName, because of

```go
case schedulerplugins.GetPluginName():
```

If we need to change batchScheduler to scheduler-plugins, the code will probably be:

```diff
 const (
 	schedulerName                 string = "scheduler-plugins"
+	defaultSchedulerName          string = "scheduler-plugins-scheduler"
 	kubeSchedulerPodGroupLabelKey string = "scheduling.x-k8s.io/pod-group"
 )

 func GetPluginName() string {
 	return schedulerName
 }

 func (k *KubeScheduler) Name() string {
 	return defaultSchedulerName
 }
```

Is it fine to change it to something like this? I am not sure if there is a better idea.
IMO, user experience is more important, so this is fine to me. However, we'll need good variable naming and comments explaining why there are two names and their corresponding responsibilities.
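For example, something along these lines (a sketch based on the snippet above; the doc comments are illustrative, not final):

```go
const (
	// schedulerName is the user-facing name: it is what users set in KubeRay's
	// batchScheduler.name config and what GetPluginName() returns.
	schedulerName string = "scheduler-plugins"

	// defaultSchedulerName is written into Pod specs (pod.Spec.SchedulerName). It matches
	// the default scheduler name installed by the scheduler-plugins
	// "as-a-second-scheduler" Helm chart.
	defaultSchedulerName string = "scheduler-plugins-scheduler"

	kubeSchedulerPodGroupLabelKey string = "scheduling.x-k8s.io/pod-group"
)

// GetPluginName returns the name used to select this batch scheduler in KubeRay config.
func GetPluginName() string {
	return schedulerName
}

// Name returns the scheduler name that KubeRay sets on Ray Pod specs.
func (k *KubeScheduler) Name() string {
	return defaultSchedulerName
}
```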
Why are these changes needed?
Currently, KubeRay only supports scheduler plugins when it is deployed as a single scheduler.
This change adds support for using scheduler-plugins as a second scheduler.
Manual Testing

Common Portion

Ray operator setup:
- Set helm-chart/kuberay-operator/values.yaml's batchScheduler.name to scheduler-plugins-scheduler

Testing YAML file:
- deploy.yaml
Single scheduler

CoScheduler setup:
- Follow the instructions - Reference
- Things that differ from the instructions:
  - Install vim in kube-scheduler-kind-control-plane
  - Fix the permission problem in kube-scheduler-kind-control-plane
  - Apply the missing YAML /etc/kubernetes/sched-cc.yaml
- Apply deploy.yaml
- Run the command to deploy the RayCluster with scheduler-plugins-scheduler and gang-scheduling-enabled
Result

Get Status

Get scheduler name - Ray operator & Ray head & Ray workers:

```
$ k get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.schedulerName}{"\n"}{end}'
kuberay-operator-5f997dbf6c-gf9g2             default-scheduler
raycluster-kuberay-head                       scheduler-plugins-scheduler
raycluster-kuberay-workergroup-worker-44jd4   scheduler-plugins-scheduler
raycluster-kuberay-workergroup-worker-cqbks   scheduler-plugins-scheduler
raycluster-kuberay-workergroup-worker-nnl79   scheduler-plugins-scheduler
```
Modify the deploy.yaml and apply it. Run the command to check that all of the Pods are in Pending status:
```
$ kubectl get pods -A
NAMESPACE            NAME                                            READY   STATUS    RESTARTS   AGE
default              kuberay-operator-5f997dbf6c-gf9g2               1/1     Running   0          51m
default              raycluster-kuberay-head                         0/1     Pending   0          2m58s
default              raycluster-kuberay-workergroup-worker-2h25p     0/1     Pending   0          2m57s
... -> skip lots of pending worker pods
default              raycluster-kuberay-workergroup-worker-xqvzv     0/1     Pending   0          2m57s
default              raycluster-kuberay-workergroup-worker-z4dhc     0/1     Pending   0          2m54s
default              raycluster-kuberay-workergroup-worker-zbl9f     0/1     Pending   0          2m54s
kube-system          coredns-6f6b679f8f-4fcbz                        1/1     Running   0          74m
kube-system          coredns-6f6b679f8f-v8fm6                        1/1     Running   0          74m
kube-system          etcd-kind-control-plane                         1/1     Running   0          74m
kube-system          kindnet-9p4st                                   1/1     Running   0          74m
kube-system          kube-apiserver-kind-control-plane               1/1     Running   0          74m
kube-system          kube-controller-manager-kind-control-plane      1/1     Running   0          74m
kube-system          kube-proxy-5xv2w                                1/1     Running   0          74m
kube-system          kube-scheduler-kind-control-plane               1/1     Running   0          57m
local-path-storage   local-path-provisioner-57c5987fd4-sfx5n         1/1     Running   0          74m
scheduler-plugins    scheduler-plugins-controller-845cfd89c6-vvg4p   1/1     Running   0          60m
```
Second scheduler

According to the instructions - Reference:
- Install scheduler-plugins
- Check that scheduler-plugins is running

Ray operator and apply config:
- Set helm-chart/kuberay-operator/values.yaml batchScheduler.name to scheduler-plugins
- Apply deploy.yaml
- Run the command to deploy the RayCluster with scheduler-plugins-scheduler and gang-scheduling-enabled
Result

Get Status

Get scheduler name - Ray operator & Ray head & Ray workers:

```
$ k get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.schedulerName}{"\n"}{end}'
kuberay-operator-5f997dbf6c-mdj8c               default-scheduler
raycluster-kuberay-head                         scheduler-plugins-scheduler
raycluster-kuberay-workergroup-worker-kgjbc     scheduler-plugins-scheduler
raycluster-kuberay-workergroup-worker-rgml8     scheduler-plugins-scheduler
raycluster-kuberay-workergroup-worker-xlpwp     scheduler-plugins-scheduler
scheduler-plugins-controller-845cfd89c6-f7bqt   default-scheduler
scheduler-plugins-scheduler-5dd667cb77-99t6x    default-scheduler
```
Modify the deploy.yaml and apply it. Run the command to check that all of the Pods are in Pending status:
```
$ kubectl get pods -A
NAMESPACE            NAME                                            READY   STATUS    RESTARTS   AGE
default              kuberay-operator-5f997dbf6c-mdj8c               1/1     Running   0          13m
default              raycluster-kuberay-head                         0/1     Pending   0          22s
default              raycluster-kuberay-workergroup-worker-24rss     0/1     Pending   0          15s
default              raycluster-kuberay-workergroup-worker-2l7cp     0/1     Pending   0          19s
default              raycluster-kuberay-workergroup-worker-2lrxx     0/1     Pending   0          15s
... -> skip lots of pending worker pods
default              raycluster-kuberay-workergroup-worker-xrw7w     0/1     Pending   0          15s
default              raycluster-kuberay-workergroup-worker-xz88j     0/1     Pending   0          16s
default              raycluster-kuberay-workergroup-worker-z98xm    0/1     Pending   0          22s
default              raycluster-kuberay-workergroup-worker-zl6qs     0/1     Pending   0          17s
default              raycluster-kuberay-workergroup-worker-zv4md     0/1     Pending   0          17s
default              scheduler-plugins-controller-845cfd89c6-f7bqt   1/1     Running   0          14m
default              scheduler-plugins-scheduler-5dd667cb77-99t6x    1/1     Running   0          14m
kube-system          coredns-6f6b679f8f-gjcvt                        1/1     Running   0          22m
kube-system          coredns-6f6b679f8f-ldrt2                        1/1     Running   0          22m
kube-system          etcd-kind-control-plane                         1/1     Running   0          22m
kube-system          kindnet-mpbzl                                   1/1     Running   0          22m
kube-system          kube-apiserver-kind-control-plane               1/1     Running   0          22m
kube-system          kube-controller-manager-kind-control-plane      1/1     Running   0          22m
kube-system          kube-proxy-ftjnh                                1/1     Running   0          22m
kube-system          kube-scheduler-kind-control-plane               1/1     Running   0          22m
local-path-storage   local-path-provisioner-57c5987fd4-w2sx9         1/1     Running   0          22m
```
Related issue number
Closes #3769
Checks