
feat: enrich RayCluster status with head IPs #468

Merged (1 commit) on Nov 4, 2022

Conversation

davidxia
Contributor

@davidxia commented Aug 13, 2022

like so. Missing IPs will be excluded.

```
status:
  head:
    podIP: string
    serviceIP: string
```

Motivation for this change is so that users can get the head IP from the
RayCluster CRD directly without knowing the underlying implementation
details of kuberay.

In a later commit we update the API server to parse the status
and return the IPs for each cluster.
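To illustrate how a client could consume the proposed fields, here is a minimal Python sketch (not kuberay's actual Go code). It assumes the schema above, with absent IPs simply omitted from `status.head`; the helper name is hypothetical:

```python
def head_ips(raycluster: dict) -> dict:
    """Extract head IPs from a RayCluster custom resource dict.

    Missing IPs are excluded, matching the proposed behavior.
    """
    head = raycluster.get("status", {}).get("head", {}) or {}
    return {k: head[k] for k in ("podIP", "serviceIP") if head.get(k)}

cr = {"status": {"head": {"podIP": "10.244.12.74", "serviceIP": "10.96.0.15"}}}
print(head_ips(cr))  # {'podIP': '10.244.12.74', 'serviceIP': '10.96.0.15'}
```

The same call works whether one, both, or neither IP has been populated yet, so callers never need to know how KubeRay names its head pod or service.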

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@ulfox
Contributor

ulfox commented Aug 13, 2022

Hi,

Not sure how the ApiServer communicates with the head nodes, but after viewing this PR I thought the head node's Service could be a safer choice than the head node's IP. A Kubernetes Service is resolvable automatically within the Kubernetes network domain, and it should guarantee delivery to the pod since it is a resource managed by Kubernetes itself.

@akanso
Collaborator

akanso commented Aug 13, 2022

I am not sure that adding the IP address of the head node as an annotation is good practice, especially since IP addresses are ephemeral and can change. Plus, the IP address is already part of the pod status:

```
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-08-09T09:17:46Z"
    status: "True"
    type: Initialized
  hostIP: 10.1.16.39
  phase: Running
  podIP: 10.244.12.74
  podIPs:
  - ip: 10.244.12.74
```

Why not, as Christos mentioned, use the service name? Also, if a component can read the annotations, is there a reason it cannot read the status as well and get the IP address?

Finally, the IP is assigned at a later stage, after the pod is created by the Ray Operator. So the Ray Operator might not be the best entity to read the IP and annotate the pod with it (unless we poll the API server to get the IP).

Can you please elaborate more on why this change is needed?

Thanks!
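The timing point above is the crux for any implementation: `podIP` is absent from the pod status until the scheduler places the pod and the kubelet reports its address, so a reconciler has to read it defensively on each pass. A minimal Python illustration (plain dicts stand in for Kubernetes client objects; the helper name is hypothetical):

```python
from typing import Optional

def pod_ip(pod: dict) -> Optional[str]:
    """Return the pod's IP from its status, or None if not yet assigned."""
    return pod.get("status", {}).get("podIP") or None

pending = {"status": {"phase": "Pending"}}  # scheduled but no IP yet
running = {"status": {"phase": "Running", "podIP": "10.244.12.74"}}
print(pod_ip(pending), pod_ip(running))  # None 10.244.12.74
```

A controller that re-reconciles when the pod object changes will naturally pick up the IP once it appears, without explicit polling.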

@davidxia
Contributor Author

Thanks for reviewing. I just want the API to include the head IP when getting or listing clusters, so that users don't have to look it up themselves, which requires knowledge of Kubernetes.

Should the API itself do the work of getting the IP by looking up the head Pod, instead of the controller? If so, is it OK for the API to do this work each time it lists or gets clusters, or should it save the IP somewhere?

Regarding Pod vs. Service IP: my use case is GKE with clients that connect to the head pod from outside the Kubernetes cluster. In this case the Service IP is not routable outside the cluster; we need to connect directly to the Pod IP. Any ideas how to accommodate this use case?

@davidxia
Contributor Author

davidxia commented Aug 13, 2022

Any ideas how to accommodate this use case?

Can we make API return both Pod and Service IP?

@akanso
Collaborator

akanso commented Aug 14, 2022

@davidxia for the GKE case, when you need external access to the cluster, an alternative is a Kubernetes Service of type LoadBalancer or NodePort. Exposing an individual pod's IP address externally is a very unusual case. Of all the customers I worked with at IBM and Microsoft, not a single one exposed a pod externally directly.
You can do it through an Ingress, a Service, or a cloud load balancer/gateway.

I am not exactly sure what you mean by wanting the API to expose the IP address. Would something like `kubectl get pod -owide -l=rayNodeType=head` work in your case, or do you need more?

@davidxia
Contributor Author

I am not exactly sure what you mean by wanting the API to expose the IP address. Would something like `kubectl get pod -owide -l=rayNodeType=head` work in your case, or do you need more?

Yes, that kubectl command is what I'm doing now. But I'm hoping KubeRay can abstract away more of the Kubernetes implementation details from the user. In particular, I wanted API endpoints like those in https://github.com/ray-project/kuberay/tree/master/apiserver to return the head IP in cluster entities. Does that make sense?

Exposing an individual pod IP address externally is a very unusual case.

At Spotify we use a single GCP VPC with GKE clusters connected to that VPC, so all Pod IPs are routable from within the VPC, including from offices and VPN. We don't use Services with load balancers because for a long time GCP didn't support internal load balancers whose IPs are routable globally: they were either routable globally but exposed to the public Internet, or internal but regional. Because of how we set up our project permissions, creating Services with internal load balancers for many RayClusters is difficult now, and I'm also not sure the KubeRay operator supports this. That's why I'm hoping you're open to adding both the Service and Pod IPs here. Otherwise, we can't really use this feature even if it's accepted.

Happy to provide more context!

@DmitriGekhtman
Collaborator

DmitriGekhtman commented Sep 12, 2022

@davidxia I'm just going over the open PRs.
Are you still potentially interested in merging this?

My two cents:
To me, it seems reasonable to include the head pod's IP under the RayCluster CR's status field.
I agree that putting it in annotations is a little odd.

cc @akanso @Jeffwan for thoughts

@akanso
Collaborator

akanso commented Sep 12, 2022

@DmitriGekhtman, I agree: the IP is a runtime-generated value, and having it in the status field makes sense to me.

@davidxia
Contributor Author

Great, I can rework this to use the status field. How does the following look? Mind if I include both the Pod and Service IPs, for the reasons described above?

```
status:
  head:
    podIP: string
    serviceIP: string
```

@DmitriGekhtman
Collaborator

That schema looks good to me.

@davidxia force-pushed the patch1 branch 3 times, most recently from f4338b4 to ab1301b on October 19, 2022
@davidxia
Contributor Author

@DmitriGekhtman I just updated it to use `.status.head` instead of an annotation. Lmk if this draft looks OK and I'll polish it up and mark it ready for review by more people.

@akanso
Collaborator

akanso commented Oct 19, 2022

Can we change the PR name from "annotate" to "enrich cluster status with head pod IP"?

@davidxia changed the title from "feat: annotate RayCluster with head node IP" to "feat: enrich RayCluster status with head IPs" on Oct 19, 2022
@davidxia
Contributor Author

Thanks for the initial review. Shall I go ahead and write tests? I also noticed the compatibility test failing. Is that expected?

@davidxia marked this pull request as ready for review on November 1, 2022
@davidxia
Contributor Author

davidxia commented Nov 1, 2022

Added unit tests. Ready for review.

@DmitriGekhtman merged commit c1d303e into ray-project:master on Nov 4, 2022
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023