Fix issues in heterogeneous sample #45
command: ["python"]
command: ["sleep"]
args:
- '/opt/code.py'
- 'infinity'
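For readers skimming the diff, the changed lines appear to swap a `python /opt/code.py` entrypoint for `sleep infinity`, so the container stays up without depending on a mounted script. A minimal sketch of the resulting fragment, assuming the lines belong to a container spec; the container name is hypothetical and the image is simply the one mentioned later in this thread:

```yaml
# Sketch only: container name is a placeholder, not taken from the PR.
containers:
  - name: ray-node                          # hypothetical name
    image: rayproject/autoscaler:latest     # image discussed in this thread
    command: ["sleep"]                      # previously ["python"]
    args:
      - 'infinity'                          # previously '/opt/code.py'
```

With this entrypoint the container no longer needs `/opt/code.py` to exist at startup.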
The reason you hit this failure is that the sample requires the user to create configmap.yaml earlier.
@akanso WDYT? Does Ray core have example code we can use, so we don't need to mount our own code?
> The reason you hit this failure is that the sample requires the user to create configmap.yaml earlier.
No, actually it isn't. I already hit that failure case when I hadn't created the configmap.yaml file beforehand. In that case, the raycluster-heterogeneous-head pod got stuck at ContainerCreating with the following error:
As you can see in the earlier screenshot, the heterogeneous-head pod's status changed from Running to Error, which is why I could see the log above. (If the pod had been stuck at ContainerCreating, I couldn't have seen the pod's log at all.)
If the configmap weren't mounted, the pod would stay stuck at ContainerCreating as described above. And if something were wrong with configmap.yaml itself, the error would be something different, such as 'there is no such file at /opt/code.py'. That's why I suspect the Ray version in the rayproject/autoscaler:latest image is the problem.
I used the configmap.yaml file from here
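To make the two failure modes concrete, here is a minimal sketch of how a script might be delivered to `/opt/code.py` through a ConfigMap. The ConfigMap name, volume name, container name, and script body are all hypothetical and are not taken from the repository's configmap.yaml. If the referenced ConfigMap does not exist, Kubernetes cannot set up the volume and the pod stays in ContainerCreating; if it exists but lacks the expected key, the file is missing inside the container instead.

```yaml
# Sketch only: every name below is a placeholder.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ray-code                  # hypothetical ConfigMap name
data:
  code.py: |
    print("hello from the mounted script")
---
# Pod spec fragment (not a complete manifest) showing the mount.
spec:
  containers:
    - name: ray-head              # hypothetical container name
      volumeMounts:
        - name: code
          mountPath: /opt/code.py
          subPath: code.py        # mount only this key as a single file
  volumes:
    - name: code
      configMap:
        name: ray-code            # must exist before the pod can start
```

Using `subPath` here is one design choice among several; it avoids shadowing the rest of `/opt` in the image.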
Sounds good to me.
@anencore94 Thanks for the contribution. We are still at an early stage. Feel free to raise more feature requests to meet your needs.
* [feat]: update heterogeneous sample
* [feat]: update ray-cluster.mini.yaml
* update to use ports
* rollback the rayVersion field
ray-cluster.mini.yaml
I've changed it to specify a name for each port and added port 10001 so it can be used with an external Python client.
ray-cluster.heterogeneous.yaml
ray-cluster.heterogeneous.yaml doesn't work as-is, so I changed some fields to make it work.
Please take a look and give some feedback :)
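As an illustration of the port change described for ray-cluster.mini.yaml, a named-ports fragment could look roughly like the sketch below. Only port 10001 comes from this PR description; the other two ports are Ray's usual defaults and all three port names are assumptions, not values copied from the sample file.

```yaml
# Sketch only: port names are illustrative placeholders.
ports:
  - containerPort: 6379       # assumed: Ray head (GCS/Redis) default port
    name: redis
  - containerPort: 8265       # assumed: Ray dashboard default port
    name: dashboard
  - containerPort: 10001      # from the PR: port for an external Python client
    name: client
```

Kubernetes port names must be at most 15 characters of lowercase letters, digits, and hyphens, which is why the names are kept short.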