Fix issues in heterogeneous sample #45

Merged
Jeffwan merged 4 commits into ray-project:master on Oct 2, 2021

Conversation

Contributor

@anencore94 anencore94 commented Sep 30, 2021

  1. Update ray-cluster.mini.yaml
  2. Update ray-cluster.heterogeneous.yaml
  • The original ray-cluster.heterogeneous.yaml doesn't work as-is, so I changed some fields to make it work.
  • I'll explain each changed line with some screenshots.

Please take a look and give some feedback :)

Comment on lines -49 to +51
- command: ["python"]
+ command: ["sleep"]
  args:
- - '/opt/code.py'
+ - 'infinity'
Contributor Author

@anencore94 anencore94 Sep 30, 2021

[Screenshot: hetero-autoscaler-image-ray-version]

Using this configmap doesn't work with the rayproject/autoscaler image, as the error above shows. (Maybe the Ray version in the rayproject/autoscaler:latest image is off.)
So I changed the sample so that it no longer runs python /opt/code.py.
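
For reference, the container section after this change would look roughly like the fragment below. This is only a sketch: the container name `ray-container` is a hypothetical placeholder, and the image is the one mentioned in this thread rather than a line copied from the sample file.

```yaml
# Sketch of the container section after the change (not the full sample file).
containers:
  - name: ray-container                   # hypothetical name, for illustration only
    image: rayproject/autoscaler:latest   # image mentioned in this thread
    # Keep the container alive instead of running the mounted script.
    command: ["sleep"]
    args:
      - 'infinity'
```

With `sleep infinity` the container no longer depends on /opt/code.py running successfully, so the pod can stay up while the cluster is inspected.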

Collaborator

You hit this failure because the sample requires the user to create configmap.yaml earlier.

@akanso WDYT? Does Ray core have example code we could use, so we don't need to mount our own code?

Contributor Author

> You hit this failure because the sample requires the user to create configmap.yaml earlier.

No, actually that isn't the cause. I already hit that failure case when I hadn't created configmap.yaml beforehand. In that case, the raycluster-heterogeneous-head pod gets stuck at ContainerCreating with the following error:
[Screenshot: when the ConfigMap is missing]

As you can see in the earlier screenshot, the heterogeneous-head pod's status changed from Running to Error, which is why I could see the log above. (If the pod had been stuck at ContainerCreating, I couldn't have seen the pod's log at all.)

If the configmap weren't mounted, the error would be the one shown just above. And if something were wrong with configmap.yaml itself, the error would be something different, such as 'there is no such file at /opt/code.py'. That's why I suspect the Ray version in the rayproject/autoscaler:latest image is off.

I used the configmap.yaml file from here
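
The difference between the two failure modes comes from how the script reaches the container. A minimal sketch of such a mount, assuming a ConfigMap named `ray-code` with a key `code.py` (all names here are hypothetical and not taken from the sample):

```yaml
# Pod spec fragment: mount a ConfigMap so its key appears as /opt/code.py.
containers:
  - name: ray-container          # hypothetical name
    image: rayproject/autoscaler:latest
    volumeMounts:
      - name: code-volume
        mountPath: /opt
volumes:
  - name: code-volume            # hypothetical name
    configMap:
      name: ray-code             # hypothetical; must exist before the pod starts
      items:
        - key: code.py           # hypothetical key inside the ConfigMap
          path: code.py          # file name under the mount path
```

If the referenced ConfigMap does not exist, the volume cannot be mounted and the pod waits in ContainerCreating. Once the volume mounts, any remaining failure shows up in the container log instead.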

Collaborator

Sounds good to me.

@Jeffwan Jeffwan changed the title from "Update samples" to "Fix issues in heterogeneous sample" on Oct 2, 2021
Collaborator

Jeffwan commented Oct 2, 2021

@anencore94 Thanks for the contribution. We are still at an early stage. Feel free to raise more feature requests to meet your needs.

@Jeffwan Jeffwan merged commit e565902 into ray-project:master Oct 2, 2021
@anencore94 anencore94 deleted the feature/update-samples branch October 7, 2021 04:16
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023
* [feat]: update heterogeneous sample

* [feat]: update ray-cluster.mini.yaml

* update to use ports

* rollback the rayVersion field
2 participants