chore: remove ConfigMap from ray-job.kueue-toy-sample.yaml #1976
Conversation
cc @andrewsykim
@@ -7,7 +7,7 @@ spec:
   # The default value is "K8sJobMode", meaning RayJob will submit the Ray job via a submitter Kubernetes Job.
   # The alternative value is "HTTPMode", indicating that KubeRay will submit the Ray job by sending an HTTP request to the RayCluster.
   # submissionMode: "K8sJobMode"
-  entrypoint: python /home/ray/samples/sample_code.py
+  entrypoint: python -c "import time; [print(i) or time.sleep(1) for i in range(600)]"
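For readability, the new inline entrypoint is equivalent to the following loop (a sketch; the range is shortened here for illustration, while the sample uses range(600), i.e. roughly ten minutes of output):

```python
import time

# Readable equivalent of the one-liner entrypoint. print(i) returns None,
# so in the one-liner `print(i) or time.sleep(1)` evaluates both calls.
for i in range(3):  # the sample uses range(600)
    print(i)
    time.sleep(1)
```

The long runtime is deliberate: the job must stay running long enough for users to observe the queueing behavior described below.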
I like the example used here: https://docs.ray.io/en/master/cluster/kubernetes/getting-started/raycluster-quick-start.html#method-1-execute-a-ray-job-in-the-head-pod
Can this be:
entrypoint: python -c "import ray; ray.init(); print(ray.cluster_resources())"
I think an example that uses the Ray library would be better, even if it's very simple.
The ray.cluster_resources() example is not suitable for showcasing gang scheduling. In this doc, we create two RayJob custom resources and ensure that no Pods belonging to the second RayJob are created because the ClusterQueue doesn't have enough resources. If the RayJob finishes very quickly, users may not be able to observe this behavior.
@andrewsykim does it make sense to you?
Yes, that makes sense. This was more of a general comment that we don't need a ConfigMap for most examples. ray-job.sample.yaml, for example, can probably inline the entrypoint, right?
> The ray.cluster_resources() example is not suitable for showcasing gang scheduling.

Actually, even for this example, can we run ray.cluster_resources() in between each sleep?
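A sketch of what such a combined entrypoint could look like, keeping the long-running loop while also exercising the Ray API each iteration (this is an illustration, not necessarily the wording that landed in the commit):

```yaml
spec:
  entrypoint: python -c "import ray, time; ray.init(); [print(ray.cluster_resources()) or time.sleep(1) for i in range(600)]"
```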
Updated 6093435
LGTM, my other comments are not blocking
Why are these changes needed?
#1956 (comment)
This PR removes the ConfigMap from the YAML. In this doc, we need to run
kubectl create -f ray-job.kueue-toy-sample.yaml
multiple times. If the YAML includes a ConfigMap, users may see a warning message indicating that the ConfigMap already exists. Removing the ConfigMap improves the UX.
Related issue number
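For context, a sketch of the failure mode (command sequence and error text are illustrative; kubectl create, unlike kubectl apply, does not tolerate resources that already exist, and a ConfigMap with a fixed name collides on the second submission):

```shell
# First submission: all resources in the manifest are created.
kubectl create -f ray-job.kueue-toy-sample.yaml

# Second submission: a RayJob using generateName gets a fresh name, but a
# fixed-name ConfigMap would collide, producing an error along the lines of:
#   Error from server (AlreadyExists): ... configmaps "<name>" already exists
kubectl create -f ray-job.kueue-toy-sample.yaml
```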
Checks