
RayService object's Status is being updated due to frequent reconciliation #1065

Merged
merged 8 commits into from
May 8, 2023

Conversation

kevin85421
Member

@kevin85421 kevin85421 commented May 5, 2023

Why are these changes needed?

The experiment result in #1062 shows that r.Status().Update() was called 48 times during the RayCluster initialization phase and 239 times within the 10 minutes after the RayCluster completed initialization. Hence, this PR focuses on reducing the frequent status updates caused by the r.Status().Update() call ("Final status update for any CR modification.").

The root cause is that r.Status().Update() is called at the end of the Reconcile function regardless of whether the status actually changed. These frequent status updates hurt the performance of the Kubernetes cluster.

Refer to the source code of ReplicaSet and StatefulSet: both compare the original status with the new status to decide whether to update it. Hence, this PR implements a similar function, inconsistentRayServiceStatuses, to determine whether to update the status.

Implementation of inconsistentRayServiceStatuses

  • inconsistentRayServiceStatuses does not check LastUpdateTime and HealthLastUpdateTime:

    • If the only differences between the old and new status are the LastUpdateTime and HealthLastUpdateTime fields, the status update will not be triggered.
    • The RayService controller uses HealthLastUpdateTime to determine whether the RayService is healthy (example). If the RayService is unhealthy, the controller triggers a new RayCluster preparation. The controller only consults HealthLastUpdateTime when the status is not HEALTHY, so it is safe to skip updates that only change HealthLastUpdateTime.
  • inconsistentRayServiceStatuses ignores RayClusterStatus.

    • rayservice_controller.go does not use any information in RayClusterStatus; it exists purely for observability.

This solution is not ideal because the status is quite complicated, so I have listed some items in the Follow-up section.

Related issue number

Closes #1061
#1062

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

I performed an experiment similar to #1062; the only difference is that this experiment ran for 15 minutes while #1062 ran for 10 minutes. The result is (see this gist for more details):

Let the number of reconciliations be N.

  • 47 status updates during the RayCluster initialization. => O(1) -> O(1)
  • 2 status updates after the initialization finishes. => O(N) -> O(1)
    • The second one is caused by inconsistency between the stale informer cache and the Kubernetes API server. This is common and handled safely by Kubernetes optimistic concurrency control (OCC).
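The OCC behavior mentioned above can be illustrated with a toy model: every Kubernetes object carries a resourceVersion, and a write based on a stale version is rejected with a Conflict, so the controller re-reads and retries. The types below are illustrative only, not the real client-go API.

```go
package main

import (
	"errors"
	"fmt"
)

// object is a toy Kubernetes object: a resourceVersion plus a status.
type object struct {
	resourceVersion int
	status          string
}

// apiServer is a toy API server holding the authoritative copy.
type apiServer struct{ stored object }

var errConflict = errors.New("conflict: stale resourceVersion")

func (s *apiServer) get() object { return s.stored }

// updateStatus succeeds only if the client's copy is based on the
// server's current resourceVersion; otherwise it returns a Conflict.
func (s *apiServer) updateStatus(o object) error {
	if o.resourceVersion != s.stored.resourceVersion {
		return errConflict
	}
	o.resourceVersion++
	s.stored = o
	return nil
}

func main() {
	srv := &apiServer{stored: object{resourceVersion: 1, status: "WAIT_FOR_SVC_READY"}}

	stale := object{resourceVersion: 0, status: "RUNNING"} // from a lagging informer cache
	fmt.Println(srv.updateStatus(stale)) // conflict: stale resourceVersion

	fresh := srv.get() // re-read, then retry
	fresh.status = "RUNNING"
	fmt.Println(srv.updateStatus(fresh)) // <nil>
}
```

The second status update observed in the experiment corresponds to exactly this retry path: the first write hits a Conflict against the stale cache and the reconcile runs again with a fresh read.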

Follow-up

  • The status of RayService is too complicated. We should revisit the status design based on the principle of Kubernetes API conventions.

  • Add observedGeneration to the logic of inconsistentRayServiceStatuses.

  • "47 status updates during the RayCluster initialization." => This is definitely not good enough; I will optimize it.

  • Currently, RayService decides whether to trigger a new RayCluster preparation based on data plane logic (i.e. Ray). This is odd and makes everything very complex because Ray already provides similar functionality. To me, the better solution would be:

    • Use a readiness/liveness probe on the Ray head Pod to monitor the data plane.
    • RayCluster will handle the control plane logic (i.e. unhealthy Pods).
    • GCS fault tolerance will handle the data plane logic after the Pod is up again.
  • The cache key may not be unique in some special cases. In addition, a Kubernetes operator should be stateless, but the cache makes the operator stateful.

@kevin85421 kevin85421 changed the title WIP RayService object's Status is being updated due to frequent reconciliation May 8, 2023
@kevin85421 kevin85421 marked this pull request as ready for review May 8, 2023 06:06
@kevin85421
Member Author

cc @msumitjain would you mind giving some feedback? Thanks!

@architkulkarni architkulkarni self-assigned this May 8, 2023
Contributor

@shrekris-anyscale shrekris-anyscale left a comment


Great work! Thanks for the detailed PR description. The behavior looks good to me. I left a few suggestions around wording.

Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
Signed-off-by: Kai-Hsun Chen <kaihsun@apache.org>
Contributor

@msumitjain msumitjain left a comment


LGTM

Contributor

@shrekris-anyscale shrekris-anyscale left a comment


Nice work! Thanks for responding to all my comments.

}

return ctrl.Result{RequeueAfter: ServiceDefaultRequeueDuration}, nil
}

func (r *RayServiceReconciler) inconsistentRayServiceStatus(oldStatus rayv1alpha1.RayServiceStatus, newStatus rayv1alpha1.RayServiceStatus) bool {
Contributor


Add func comment to list what is decided to be inconsistent.

Member Author


Added. 36f1050

@kevin85421 kevin85421 merged commit 795db0d into ray-project:master May 8, 2023
19 checks passed
@msumitjain
Contributor

cc @msumitjain would you mind giving some feedback? Thanks!

Just a quick question: the frequency of the Reconcile function itself is too high. Does this Reconcile function get called on some event? Regardless, should we limit the frequency at which this function can be called per second, maybe with a RateLimiter?

@kevin85421
Member Author

Just a quick question: the frequency of the Reconcile function itself is too high. Does this Reconcile function get called on some event? Regardless, should we limit the frequency at which this function can be called per second, maybe with a RateLimiter?

@msumitjain would you mind performing a similar experiment to #1062? We can add a RateLimiter; there is an issue tracking progress: #820.

lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023
…ation (ray-project#1065)

RayService object's Status is being updated due to frequent reconciliation