Add Ray address env. #388
Conversation
Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
// if worker, use the service name of the head
    ip.Value = svcName
}
ip := v1.EnvVar{Name: RAY_IP, Value: rayIP}
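The hunk above picks the `RAY_IP` value differently for head and worker pods. A minimal standalone sketch of that selection logic (the function name and the `127.0.0.1` head value are assumptions for illustration, not the exact KubeRay helper):

```go
package main

import "fmt"

// rayIPValue sketches how the RAY_IP env value could be chosen:
// the head pod points at itself, while worker pods point at the
// head's Kubernetes Service name. (Illustrative, not KubeRay's API.)
func rayIPValue(isHead bool, headSvcName string) string {
	if isHead {
		// The head reaches its own GCS locally (assumed value).
		return "127.0.0.1"
	}
	// Workers reach the head through its Service DNS name.
	return headSvcName
}

func main() {
	fmt.Println(rayIPValue(true, "raycluster-head-svc"))
	fmt.Println(rayIPValue(false, "raycluster-head-svc"))
}
```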
Not sure what RAY_IP or RAY_PORT envs are for.
Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
@@ -605,8 +607,10 @@ func (r *RayClusterReconciler) buildWorkerPod(instance rayiov1alpha1.RayCluster,
	podName := strings.ToLower(instance.Name + common.DashSymbol + string(rayiov1alpha1.WorkerNode) + common.DashSymbol + worker.GroupName + common.DashSymbol)
	podName = utils.CheckName(podName) // making sure the name is valid
	svcName := utils.GenerateServiceName(instance.Name)
	podTemplateSpec := common.DefaultWorkerPodTemplate(instance, worker, podName, svcName)
	pod := common.BuildPod(podTemplateSpec, rayiov1alpha1.WorkerNode, worker.RayStartParams, svcName, instance.Spec.EnableInTreeAutoscaling)
	headPort := common.GetHeadPort(instance.Spec.HeadGroupSpec.RayStartParams)
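The key change here is that the worker pod's port now comes from the head group's `RayStartParams` rather than the worker's own. A self-contained sketch of what a `GetHeadPort`-style lookup could do (the `"port"` key and the `6379` default are assumptions for illustration, not the verified KubeRay implementation):

```go
package main

import "fmt"

// defaultGCSPort is an assumed fallback when the head's RayStartParams
// do not specify a port explicitly.
const defaultGCSPort = "6379"

// headPort sketches the bug fix: the GCS server port must be read from
// the *head's* RayStartParams, never from the worker's own params.
func headPort(headStartParams map[string]string) string {
	if port, ok := headStartParams["port"]; ok {
		return port
	}
	return defaultGCSPort
}

func main() {
	fmt.Println(headPort(map[string]string{"port": "6380"}))
	fmt.Println(headPort(map[string]string{}))
}
```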
We could try to refactor to deduplicate the buildWorkerPod and buildHeadPod logic.
The fact that HeadGroupSpec and WorkerGroupSpec are different types makes that a bit awkward, though.
Haha, once we switch to Go 1.18 we could possibly use generics for this if they support this use case :)
Kind of overwhelmed today. I will help review the change tomorrow morning.
Looks great, thanks a lot for doing this! I left some small suggestions :)
Will let GCS HA #294 get merged first to avoid blocking it with a merge conflict.
@DmitriGekhtman The HA PR has been merged. This one is unblocked now. It seems to have a few conflicts.
The nightly compatibility test has been unhappy lately.
Why are these changes needed?
Adds a `RAY_ADDRESS` env variable for Ray containers. This allows using the natural-looking `ray.init()` to connect to Ray from within the cluster, rather than the mystical `ray.init("auto")`.

Fixes a bug: currently workers look to their own RayStartParams for the Ray GCS server port. They should look at the head's RayStartParams instead. This bug fix required modifying function signatures a bit.
Related issue number
Closes #373
Checks
I tested connecting to Ray with `ray.init()` on the head node.