-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug][k8s compatibility] k8s v1.20.7 ClusterIP svc do not updated under RayService #1110
[Bug][k8s compatibility] k8s v1.20.7 ClusterIP svc do not updated under RayService #1110
Conversation
@@ -118,6 +118,11 @@ func (in *DashboardStatus) DeepCopy() *DashboardStatus { | |||
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. | |||
func (in *HeadGroupSpec) DeepCopyInto(out *HeadGroupSpec) { | |||
*out = *in | |||
if in.HeadService != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open an issue: #1111
if err != nil { | ||
return err | ||
} | ||
raySvc.Name = utils.CheckName(raySvc.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functions common.BuildHeadServiceForRayService
and common.BuildServeServiceForRayService
utilize utils.GenerateServiceName
and utils.GenerateServeServiceName
, respectively, which internally invoke CheckName
. Hence, delete this line.
if newSvc.Spec.ClusterIP == "" { | ||
newSvc.Spec.ClusterIP = oldSvc.Spec.ClusterIP | ||
} | ||
oldSvc.Spec = *newSvc.Spec.DeepCopy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to consider the updates of ObjectMeta
in subsequent PRs. We currently only take Spec
into consideration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add a TODO in the code comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added 3119e67
cc @jamm1985 |
It works fine. Is it possible to release v0.5.1 fix? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, just a few minor questions (non-blocking)
r.Log.V(1).Info("reconcileServices update service") | ||
if updateErr := r.Update(ctx, rayService); updateErr != nil { | ||
r.Log.Error(updateErr, "raySvc Update error!", "raySvc.Error", updateErr) | ||
// Only update the service if the RayCluster switches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "Only update the service if the RayCluster switches" a new behavior change in this PR? What's the reasoning behind it? If the reason is easy to state it might be worth adding it to this code comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "Only update the service if the RayCluster switches" a new behavior change in this PR?
Yes
What's the reasoning behind it? If the reason is easy to state it might be worth adding it to this code comment.
Similar to #1065, the redundant Update
invocations will cause an unnecessary burden on the Kubernetes API Server.
if newSvc.Spec.ClusterIP == "" { | ||
newSvc.Spec.ClusterIP = oldSvc.Spec.ClusterIP | ||
} | ||
oldSvc.Spec = *newSvc.Spec.DeepCopy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add a TODO in the code comment?
@kevin85421 ok. Thank you a lot for the quick fix. I'm looking forward to the new v0.6.0 release and wish you a pleasant work on it. |
…er RayService (ray-project#1110) [Bug][k8s compatibility] k8s v1.20.7 ClusterIP svc do not updated under RayService
Why are these changes needed?
As mentioned in #1105, in Kubernetes 1.20.7, the selector in the Serve service of RayService cannot be updated, but Kubernetes 1.23.0 does not have this issue.
The root cause of this issue is the immutability of the
ClusterIP
field. If a service is created without specifying theClusterIP
, Kubernetes automatically assigns it. Therefore, in thereconcileServices
function, theoldSvc
object retrieved from the Kubernetes cluster contains theClusterIP
value.Without this PR (as shown in the following code snippet),
raySvc
is derived from the CR spec, so the ClusterIP is""
. However, theClusterIP
in therayService
variable is assigned by Kubernetes. Therefore, this update is invalid since theClusterIP
field is immutable.kuberay/ray-operator/controllers/ray/rayservice_controller.go
Lines 804 to 830 in 7fe4050
Optional: Why does Kubernetes 1.23.0 not have this issue?
I used binary search to identify that the issue was resolved between Kubernetes v1.21.2 and Kubernetes v1.21.10.
Trace code (Kubernetes v1.21.10):
Kubernetes Service
Update
RESTful API (code) -> CallBeforeUpdate()
(code) -> Callstrategy.PrepareForUpdate
to prepare for update (code)Kubernetes Service's
PrepareForUpdate
strategy (code) -> CallpatchAllocatedValues
(code)patchAllocatedValues
: if the new service does not specify a ClusterIP, the function will assign the ClusterIP of the old service to the new one. Hence, the update will not updateClusterIP
from non-empty to empty.The function
patchAllocatedValues
is added in Kubernetes v1.21.5.Related issue number
Closes #1105
Checks