Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][raycluster controller] No error if head ip cannot be determined. #701

Merged

Conversation

DmitriGekhtman
Copy link
Collaborator

@DmitriGekhtman DmitriGekhtman commented Nov 9, 2022

Signed-off-by: Dmitri Gekhtman dmitri.m.gekhtman@gmail.com

Why are these changes needed?

#468 has the RayCluster controller throw an error if head ip can't be determined while updating RayCluster status
That is problematic for two reasons:

  1. Head pod ip is not determined while the head pod is pending placement.
  2. Status updating logic should be best-effort. If there's a brief inconsistency leading to multiple or no head pods listed, the reconciler should take appropriate action, but the status processing logic doesn't need to fail.

Throwing an error in the first situation might be leading to test failures:
https://github.com/ray-project/kuberay/actions/runs/3424423296/jobs/5706807752

This PR modifies the headIP status logic to be best effort -- if the IP cannot be determined, it is recorded as an empty string.

cc @davidxia

Related issue number

#468

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
Copy link
Member

@kevin85421 kevin85421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank @DmitriGekhtman for this contribution!

Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
Copy link
Contributor

@davidxia davidxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

ray-operator/controllers/ray/raycluster_controller.go Outdated Show resolved Hide resolved
Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
@DmitriGekhtman DmitriGekhtman merged commit 0579c90 into ray-project:master Nov 10, 2022
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023
ray-project#701)

This PR modifies headIP status logic to be best effort -- if the IP cannot be determined, it is recorded as an empty string.

Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants