Remove ray-cluster.without-block.yaml #675

kevin85421 · 2022-11-03T20:28:46Z

Why are these changes needed?

We do not encourage users to run ray start without --block.

Without --block, we need to append sleep infinity to the end of the ray start command to keep the container running.
With --block, when the Ray process crashes, the container exits immediately, so the KubeRay operator can detect the unhealthy condition quickly. Without --block, the readiness and liveness probes can still detect the unhealthy condition, but detection may take longer.
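
The two start-up styles compared above can be sketched as follows. This is an illustrative helper, not KubeRay code: the function name `container_command` is hypothetical, and only `ray start`, `--head`, `--block`, and `sleep infinity` are taken from Ray and this discussion.

```python
def container_command(block: bool) -> str:
    """Return a shell command line for a Ray head container (illustrative only)."""
    if block:
        # With --block, `ray start` stays in the foreground, so the container's
        # main process ends when Ray does and the failure is visible right away.
        return "ray start --head --block"
    # Without --block, `ray start` returns once Ray is running in the background,
    # so `sleep infinity` is appended to keep the container alive.
    return "ray start --head && sleep infinity"


print(container_command(block=True))
print(container_command(block=False))
```

In the second form the container keeps running even if Ray has crashed, which is why detection then falls back to the readiness and liveness probes.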

Note for those who are still interested in ray-cluster.without-block.yaml

The configuration test framework (#605) detected two bugs in ray-cluster.without-block.yaml. See the changes to ray-cluster.without-block.yaml in kevin85421@04bdd77 for the fixes.

- object-manager-port, node-manager-port: see "Update ray-operator documentation and image version in ray-cluster.heterogeneous.yaml" #585.
- command / args: The original YAML file assumes that command and args are appended after the ray start command. However, they are executed before the ray start command (see "[Feature][Docs] Explain how to specify container command for head pod" #651 for more details). As a result, the YAML tries to connect to the Ray cluster before the cluster has started.
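
The ordering problem in the command / args item can be illustrated with a small sketch. It only models the order of the two steps; how the operator actually joins them is not shown in this thread. The script name `connect_to_ray.py` and all function names are hypothetical.

```python
from typing import List

RAY_START = "ray start --head"
# Hypothetical user command that needs an already running Ray cluster.
USER_COMMAND = "python connect_to_ray.py"


def order_assumed_by_yaml() -> List[str]:
    # What the removed sample expected: Ray starts, then the user command runs.
    return [RAY_START, USER_COMMAND]


def order_reported_in_thread() -> List[str]:
    # What the thread reports: the user command runs before `ray start`.
    return [USER_COMMAND, RAY_START]


def cluster_started_first(steps: List[str]) -> bool:
    """True if `ray start` comes before the user command in `steps`."""
    return steps.index(RAY_START) < steps.index(USER_COMMAND)


print(cluster_started_first(order_assumed_by_yaml()))     # True
print(cluster_started_first(order_reported_in_thread()))  # False
```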

Related issue number

Checks

I've made sure the tests are passing.
Testing Strategy
- Unit tests
- Manual tests
- This PR is not tested :(

kevin85421 · 2022-11-03T20:33:41Z

> Without --block, we need to append sleep infinity to the end of the ray start command to keep the container running.
>
> With --block, when the Ray process crashes, the container exits immediately, so the KubeRay operator can detect the unhealthy condition quickly. Without --block, the readiness and liveness probes can still detect the unhealthy condition, but detection may take longer.

@DmitriGekhtman I'd like to double-check whether I have misunderstood the reasons why we do not encourage users to run ray start without --block. Thank you!

DmitriGekhtman · 2022-11-03T21:11:45Z

Your understanding is correct.
We prefer to separate the process of deploying a Ray cluster (kubectl apply -f raycluster.yaml) from submitting work (e.g. ray job submit stuff.py).

For users who need custom entrypoints, we had another discussion. The conclusion was that custom entrypoints should be supported in the obvious way: if an entrypoint is specified, honor it; otherwise, format the relevant ray start command.
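
The rule stated above could look roughly like this. It is a minimal sketch under stated assumptions, not the operator's implementation: `resolve_entrypoint` is a hypothetical name, the start parameters are assumed to be a string-to-string mapping, and the flag formatting is simplified (for example, it does not add `--head`).

```python
from typing import List, Mapping, Optional, Sequence


def resolve_entrypoint(
    user_entrypoint: Optional[Sequence[str]],
    ray_start_params: Mapping[str, str],
) -> List[str]:
    """Honor a user-specified entrypoint; otherwise build a `ray start` command."""
    if user_entrypoint:
        # An entrypoint was specified, so use it unchanged.
        return list(user_entrypoint)
    # No entrypoint: format `ray start` from the parameters (sorted for stable output).
    flags = [f"--{key}={value}" for key, value in sorted(ray_start_params.items())]
    return ["ray", "start", *flags]


params = {"dashboard-host": "0.0.0.0"}
print(resolve_entrypoint(["python", "serve.py"], params))
print(resolve_entrypoint(None, params))
```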

kevin85421 · 2022-11-03T21:13:32Z

> Your understanding is correct. We prefer to separate the process of deploying a Ray cluster (kubectl apply -f raycluster.yaml) from submitting work (e.g. ray job submit stuff.py).
>
> For users who need custom entrypoints, we had another discussion. The conclusion was that custom entrypoints should be supported in the obvious way: if an entrypoint is specified, honor it; otherwise, format the relevant ray start command.

Got it. Thank you!

DmitriGekhtman · 2022-11-03T21:14:03Z

Later, we can consider injecting --block automatically.
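
One way such an injection might work is sketched below. This is an assumption for illustration, not the design that was later implemented: `with_block` is a hypothetical name, the start parameters are assumed to be a string-to-string mapping, and the key/value form `"block": "true"` is assumed.

```python
from typing import Dict, Mapping


def with_block(ray_start_params: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the parameters with "block" defaulted to "true"."""
    params = dict(ray_start_params)  # copy so the caller's mapping is not modified
    # Only fill in the default; an explicit user setting is left untouched.
    params.setdefault("block", "true")
    return params


print(with_block({"dashboard-host": "0.0.0.0"}))
```

Using a default rather than overwriting is a design choice in this sketch; an operator could equally decide to force the option.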

We did not encourage users to run ray start without --block. This PR removes the example yaml that demonstrates that workflow.

remove

96e50fa

kevin85421 requested a review from DmitriGekhtman November 3, 2022 20:29

DmitriGekhtman approved these changes Nov 3, 2022


DmitriGekhtman merged commit 7850773 into ray-project:master Nov 3, 2022

kevin85421 mentioned this pull request Nov 3, 2022

[Feature] Test sample RayCluster YAMLs to catch invalid or out of date ones #678

Merged

4 tasks

This was referenced Feb 20, 2023

[Feature][Docs] Explain how to specify container command for head pod #912

Merged

[Feature] Inject the --block option to ray start command automatically #915

Closed

Yicheng-Lu-llll mentioned this pull request Feb 26, 2023

Inject the --block option to ray start command automatically #932

Merged

Yicheng-Lu-llll mentioned this pull request May 3, 2023

Add a document to outline the default settings for rayStartParams in Kuberay #1057

Merged

lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023

Remove ray-cluster.without-block.yaml (ray-project#675)

291c9cf

