
chore: remove ConfigMap from ray-job.kueue-toy-sample.yaml #1976

Merged · 2 commits · Mar 11, 2024

Conversation

@kevin85421 (Member) commented Mar 9, 2024

Why are these changes needed?

#1956 (comment)

This PR removes ConfigMap from the YAML. In this doc, we need to run kubectl create -f ray-job.kueue-toy-sample.yaml multiple times. If the YAML includes a ConfigMap, users may see a warning message indicating that the ConfigMap already exists. Removing the ConfigMap improves the UX.

Related issue number

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421 kevin85421 marked this pull request as ready for review March 9, 2024 02:15
@kevin85421 kevin85421 requested a review from jjyao March 9, 2024 02:15
@kevin85421 (Member, Author):

cc @andrewsykim

@@ -7,7 +7,7 @@ spec:
   # The default value is "K8sJobMode", meaning RayJob will submit the Ray job via a submitter Kubernetes Job.
   # The alternative value is "HTTPMode", indicating that KubeRay will submit the Ray job by sending an HTTP request to the RayCluster.
   # submissionMode: "K8sJobMode"
-  entrypoint: python /home/ray/samples/sample_code.py
+  entrypoint: python -c "import time; [print(i) or time.sleep(1) for i in range(600)]"
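Expanded out of the `python -c` one-liner, the new entrypoint's logic is roughly the following sketch (the duration is parameterized here only so it can be shortened; the actual entrypoint hardcodes 600 iterations):

```python
import time

def heartbeat(duration_s: int = 600) -> int:
    """Print one counter value per second, mirroring the entrypoint's
    `[print(i) or time.sleep(1) for i in range(600)]` one-liner.

    Keeping the job alive for ~10 minutes gives users time to observe
    Kueue holding back the second RayJob's pods while the first one runs.
    """
    for i in range(duration_s):
        print(i)
        time.sleep(1)
    return duration_s
```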
Contributor:
I like the example used here: https://docs.ray.io/en/master/cluster/kubernetes/getting-started/raycluster-quick-start.html#method-1-execute-a-ray-job-in-the-head-pod

Can this be

entrypoint: python -c "import ray; ray.init(); print(ray.cluster_resources())"

I think an example that uses the Ray library would be better, even if it's very simple.

Member Author:
The ray.cluster_resources() example is not suitable for showcasing gang scheduling. In this doc, we create two RayJob custom resources and ensure that no Pods belonging to the second RayJob will be created because the ClusterQueue doesn't have enough resources. If the RayJob finishes very quickly, it is possible that users may not be able to observe this behavior.
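The behavior described above can be reproduced with a workflow along these lines (a sketch based on this PR's description; the manifest name is from this PR, and the comments describe the expected outcome rather than captured output):

```shell
# Create the first RayJob; Kueue admits it and its pods start.
kubectl create -f ray-job.kueue-toy-sample.yaml

# Create a second RayJob from the same manifest. The ClusterQueue is
# out of quota, so Kueue keeps the whole RayJob suspended: none of its
# pods are created until the first job finishes and releases quota.
kubectl create -f ray-job.kueue-toy-sample.yaml

# Only the first RayJob's pods should appear here.
kubectl get pods
```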

Member Author:
@andrewsykim does it make sense to you?

Contributor:
Yes, that makes sense. This was more of a general comment that we don't need a ConfigMap for most examples. ray-job.sample.yaml, for example, could probably inline its sample code too, right?

Contributor:
> The ray.cluster_resources() example is not suitable for showcasing gang scheduling.

Actually, even for this example, can we run ray.cluster_resources() in between each sleep?

Member Author:
Updated in 6093435.
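One way to combine both suggestions is to report cluster resources inside the sleep loop. The commit itself is not shown in this thread, so the following entrypoint line is a hypothetical sketch, not the exact committed change:

```yaml
# Sketch: interleave ray.cluster_resources() with the one-second sleeps
# so the job both exercises the Ray API and runs long enough to observe
# gang scheduling. The line actually landed in 6093435 may differ.
entrypoint: python -c "import ray; import time; ray.init(); [print(ray.cluster_resources()) or time.sleep(1) for i in range(600)]"
```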

@andrewsykim (Contributor):

LGTM, my other comments are not blocking

@kevin85421 kevin85421 merged commit 8c9c732 into ray-project:master Mar 11, 2024
23 checks passed
kevin85421 added a commit to kevin85421/kuberay that referenced this pull request Mar 15, 2024
3 participants