-
Notifications
You must be signed in to change notification settings - Fork 347
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support using proxy subresources when connecting to Ray head node #1980
Conversation
b599778
to
530e237
Compare
0117668
to
1a03e48
Compare
@@ -218,9 +222,9 @@ func main() { | |||
ctx := ctrl.SetupSignalHandler() | |||
exitOnError(ray.NewReconciler(ctx, mgr, rayClusterOptions).SetupWithManager(mgr, config.ReconcileConcurrency), | |||
"unable to create controller", "controller", "RayCluster") | |||
exitOnError(ray.NewRayServiceReconciler(ctx, mgr, utils.GetRayDashboardClient, utils.GetRayHttpProxyClient).SetupWithManager(mgr), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be better if we pass configapi.Configuration
to the NewRayServiceReconciler
and NewRayJobReconciler
functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary at the moment since we're only using a single field there. But we should revisit this in the future and when config API graduates to beta. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recently worked on an open-source event for newbies. This would be a good first issue for the event. I will open an issue later.
r.dashboardURL = "http://" + url | ||
} | ||
|
||
func (r *RayDashboardClient) WithKubernetesServiceProxy(svcNamespace, svcName string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function initializes the dashboard client. Perhaps we could update InitClient instead of creating a new function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point -- I'll try to update InitClient
to support both proxy / non-proxy clients and see if it make sense to combine them.
@@ -640,6 +664,21 @@ func (r *RayJobReconciler) getOrCreateRayClusterInstance(ctx context.Context, ra | |||
return rayClusterInstance, nil | |||
} | |||
|
|||
func (r *RayJobReconciler) getRayClusterInstance(ctx context.Context, rayJobInstance *rayv1.RayJob) (*rayv1.RayCluster, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we need this function. Perhaps we could directly use RayJobRayClusterNamespacedName
and r.Get(...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch, will update to just call RayJobRayClusterNamespacedName
Timeout: 20 * time.Millisecond, | ||
} | ||
} | ||
|
||
func (r *RayHttpProxyClient) WithKubernetesPodProxy(podNamespace, podName string, port int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function initializes the client. Perhaps we could update InitClient instead of creating a new function?
34b43a6
to
df8a3aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevin85421 addressed your comments, please take another look
@@ -218,9 +222,9 @@ func main() { | |||
ctx := ctrl.SetupSignalHandler() | |||
exitOnError(ray.NewReconciler(ctx, mgr, rayClusterOptions).SetupWithManager(mgr, config.ReconcileConcurrency), | |||
"unable to create controller", "controller", "RayCluster") | |||
exitOnError(ray.NewRayServiceReconciler(ctx, mgr, utils.GetRayDashboardClient, utils.GetRayHttpProxyClient).SetupWithManager(mgr), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary at the moment since we're only using a single field there. But we should revisit this in the future and when config API graduates to beta. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Note for myself
GetHTTPClient()
: controller-runtimeGetConfig().Host
: client-go
httpProxyURL string | ||
|
||
mgr ctrl.Manager | ||
useProxy bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: It's better to use UseKubernetesProxy
for consistency.
No need to update this in this PR. I recently worked on an open-source event for newbies. This would be a good first issue for the event. I will open an issue later.
} | ||
|
||
func (r *RayHttpProxyClient) InitClient() { | ||
r.client = http.Client{ | ||
r.client = &http.Client{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mainly because GetHTTPClient() returns a pointer
dashboardURL string | ||
} | ||
|
||
func GetRayDashboardClient() RayDashboardClientInterface { | ||
return &RayDashboardClient{} | ||
func GetRayDashboardClientFunc(mgr ctrl.Manager, useProxy bool) func() RayDashboardClientInterface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to change from GetRayDashboardClient()
to GetRayDashboardClientFunc()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because the function signature is changed to now return a func that returns a RayDashboardClientInterface
func (r *RayDashboardClient) InitClient(url string) { | ||
r.client = http.Client{ | ||
func (r *RayDashboardClient) InitClient(url, svcNamespace, svcName string) { | ||
if r.useProxy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: It's better to use UseKubernetesProxy
for consistency.
No need to update this in this PR. I recently worked on an open-source event for newbies. This would be a good first issue for the event. I will open an issue later.
@@ -1028,8 +1030,13 @@ func (r *RayServiceReconciler) updateStatusForActiveCluster(ctx context.Context, | |||
return err | |||
} | |||
|
|||
headSvcName, err := utils.GenerateHeadServiceName(utils.RayServiceCRD, rayClusterInstance.Spec, rayClusterInstance.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a more elegant method to initialize the dashboard client. Calling GenerateHeadServiceName every time we want to initialize the client makes the codebase complex and hard to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I thought this was kind of ugly too. Maybe we can pass the entire RayCluster to InitClient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can pass the entire RayCluster to InitClient?
I can make this change in this PR if you want
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can make this change in this PR if you want
Thanks! It is helpful! Maybe we can consider adding the head service's name to RayCluster's status in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated InitClient to receive a RayCluster
so we can call GenerateHeadServiceName from inside InitClient
. This requires InitClient to now return an error
df8a3aa
to
a44271e
Compare
…ad node Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
a44271e
to
6dc84d3
Compare
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
6dc84d3
to
ef5db3d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks! I will test it manually before I merge this PR.
I tested this PR manually.
|
thanks @kevin85421! I'll look into adding an e2e test as well |
…#1980 Signed-off-by: TingYi <a75896453@gmail.com>
Signed-off-by: TingYi <a75896453@gmail.com>
Signed-off-by: TingYi <a75896453@gmail.com>
…#1980 Signed-off-by: TingYi <a75896453@gmail.com>
Signed-off-by: TingYi <a75896453@gmail.com>
Signed-off-by: TingYi <a75896453@gmail.com>
Why are these changes needed?
There are some cases where Kuberay may not be able to directly connect to a Ray head node. For example, there might be a NetworkPolicy disallowing ingress from all Pods or KubeRay is running on a network with no connectivity to Pods. This PR allows Kuberay to use the
services/proxy
subresource to proxy HTTP requests to the Ray head node. This allows Kuberay to make requests to the head node without every connecting to it directly.Here are some sample HTTP requests in apiserver from my testing using the proxy subresource:
Checks