Add apply configurations to generated client #1818

Merged
merged 4 commits into from
Jan 18, 2024

astefanutti
Contributor

Why are these changes needed?

This PR generates apply configurations for the KubeRay API client.

Related issue number

Closes #1811.

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421 kevin85421 self-requested a review January 9, 2024 18:03
@kevin85421 kevin85421 self-assigned this Jan 9, 2024
@astefanutti
Contributor Author

@kevin85421 the new e2e test, which covers SSA, currently fails because of the creationTimestamp fields of the corev1.PodTemplateSpec structs that are embedded in the KubeRay API.

I need to review the possible solutions, but I might request your feedback on this if I fail to find a good solution.
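The thread does not spell out why creationTimestamp breaks the test. One plausible reading, offered here as an assumption, is the well-known Go behaviour that `omitempty` never omits a struct-valued field, so a typed struct with an unset timestamp still serialises `"creationTimestamp":null`, whereas a pointer-based apply configuration leaves the field out. The sketch below uses only the standard library; `stamp`, `typedMeta` and `applyMeta` are hypothetical stand-ins, not Kubernetes or KubeRay types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stamp is a hypothetical struct-typed timestamp that marshals its zero
// value as JSON null.
type stamp struct{ time.Time }

func (s stamp) MarshalJSON() ([]byte, error) {
	if s.IsZero() {
		return []byte("null"), nil
	}
	return json.Marshal(s.Time)
}

// typedMeta imitates a typed API struct: the timestamp is a struct value,
// so omitempty has no effect on it.
type typedMeta struct {
	Name              string `json:"name,omitempty"`
	CreationTimestamp stamp  `json:"creationTimestamp,omitempty"`
}

// applyMeta imitates an apply configuration: every field is a pointer,
// so unset fields are really absent from the payload.
type applyMeta struct {
	Name              *string `json:"name,omitempty"`
	CreationTimestamp *stamp  `json:"creationTimestamp,omitempty"`
}

// mustJSON marshals v and panics on error, to keep the example short.
func mustJSON(v any) string {
	out, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	name := "head"
	fmt.Println(mustJSON(typedMeta{Name: name}))  // {"name":"head","creationTimestamp":null}
	fmt.Println(mustJSON(applyMeta{Name: &name})) // {"name":"head"}
}
```

If this reading is right, it would also explain why the generated code needs the external PodTemplateSpec apply configuration rather than the typed struct.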

@kevin85421
Member

@astefanutti Do you mind giving me some keywords that I need to understand in order to have enough context to review this PR? Thanks!

@astefanutti
Contributor Author

@kevin85421 you'll find a very good introduction to SSA here: https://kubernetes.io/blog/2022/10/20/advanced-server-side-apply/.

Here is the breakdown of the changes introduced by this PR:

  • Adapt update-codegen.sh to generate apply configurations. Note that generate-groups.sh cannot be used because it lacks support for external apply configurations (such as PodTemplateSpec), so the code-generator binaries are invoked directly
  • Re-generate the client, with the new apply configuration packages
  • Add an e2e test using SSA

)

var (
	TestApplyOptions = metav1.ApplyOptions{FieldManager: "kuberay-test", Force: true}
Member

I read this article and it said "Controllers typically should unconditionally set all the fields they own by setting Force: true in the ApplyOptions.". I can't understand what "the fields they own" means. Would you mind explaining why we need to add Force: true and what's the difference with and without Force: true? Thanks!

Contributor Author

The Force option is used to control the resolution logic when conflicts occur. The possible strategies are documented at:

https://kubernetes.io/docs/reference/using-api/server-side-apply/#conflicts

In the case of e2e tests here, the recommendation to use Force: true applies well. The e2e tests impersonate / emulate a real client / end-user, and what the tests specify using SSA apply configurations (e.g. RayCluster / RayJob spec) is what's owned by that client / end-user.
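To make the difference between applying with and without Force concrete, here is a toy model of field ownership in plain Go. It is a sketch under simplifying assumptions, not the API server's merge logic: the `object` type, its `apply` method and the field path `spec.replicas` are all hypothetical, and real SSA shares ownership when two managers apply the same value, which this model skips.

```go
package main

import "fmt"

// object is a toy resource: field path -> value, plus the manager that
// currently owns each field.
type object struct {
	fields map[string]string
	owners map[string]string
}

func newObject() *object {
	return &object{fields: map[string]string{}, owners: map[string]string{}}
}

// apply sets the desired fields on behalf of manager. A field owned by a
// different manager with a different value is a conflict: without force the
// whole apply is rejected and nothing changes; with force the caller
// overwrites the value and takes ownership.
func (o *object) apply(manager string, force bool, desired map[string]string) error {
	if !force {
		for path, value := range desired {
			owner, owned := o.owners[path]
			if owned && owner != manager && o.fields[path] != value {
				return fmt.Errorf("conflict on %q: owned by %q", path, owner)
			}
		}
	}
	for path, value := range desired {
		o.fields[path] = value
		o.owners[path] = manager
	}
	return nil
}

func main() {
	obj := newObject()
	if err := obj.apply("operator", false, map[string]string{"spec.replicas": "1"}); err != nil {
		panic(err)
	}

	// A second manager changing the same field is rejected without force.
	err := obj.apply("kuberay-test", false, map[string]string{"spec.replicas": "3"})
	fmt.Println("without force:", err)

	// With force, the second manager wins and becomes the owner.
	err = obj.apply("kuberay-test", true, map[string]string{"spec.replicas": "3"})
	fmt.Println("with force:", err, obj.fields["spec.replicas"], obj.owners["spec.replicas"])
}
```

In this model, "the fields they own" are simply the paths whose owner entry names the manager, which is why a test client that always forces its own spec never sees a conflict.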

}
}

func headPodTemplateApplyConfiguration() *corev1ac.PodTemplateSpecApplyConfiguration {
Member

It seems we can create a RayJob custom resource by SSA using an applyconfiguration (e.g., headPodTemplateApplyConfiguration()), or we can directly construct the CR using headPodTemplate(). Should we consider gradually updating existing tests to use SSA for creating RayJob CRs in the future?

Contributor Author

That would be my suggestion / recommendation too. I find the fluent API style of apply configurations and SSA conflict management very convenient.

My original intention with the e2e tests was to use SSA, but that needed the work introduced by this PR first. We could gradually update the existing tests as you suggest.

Note the benefits of SSA are also applicable to the operator itself, but that would require more substantial work.
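The fluent style mentioned above can be illustrated with a much-reduced imitation of a generated apply configuration. This is a sketch only: `containerApplyConfiguration`, `container` and `render` are hypothetical names, and the values are placeholders. The pattern it imitates is the generated one, where every field is a pointer tagged `omitempty` and each `With` method returns the receiver so calls can be chained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// containerApplyConfiguration is a hypothetical, reduced imitation of a
// generated apply configuration: pointer fields, all tagged omitempty.
type containerApplyConfiguration struct {
	Name  *string `json:"name,omitempty"`
	Image *string `json:"image,omitempty"`
}

// container returns an empty configuration to start a chain from.
func container() *containerApplyConfiguration {
	return &containerApplyConfiguration{}
}

// WithName sets Name and returns the receiver so calls can be chained.
func (b *containerApplyConfiguration) WithName(v string) *containerApplyConfiguration {
	b.Name = &v
	return b
}

// WithImage sets Image and returns the receiver so calls can be chained.
func (b *containerApplyConfiguration) WithImage(v string) *containerApplyConfiguration {
	b.Image = &v
	return b
}

// render marshals the configuration; only fields that were set appear.
func render(c *containerApplyConfiguration) string {
	out, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(render(container().WithName("ray-head")))
	// {"name":"ray-head"}
	fmt.Println(render(container().WithName("ray-head").WithImage("example/image:tag")))
	// {"name":"ray-head","image":"example/image:tag"}
}
```

Because unset fields never reach the payload, the client only claims the fields it actually specifies, which is what makes the style a good fit for SSA.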


type labelSelector string

var _ Option[*metav1.ListOptions] = (*labelSelector)(nil)
Member

ChatGPT tells me that "This line asserts that labelSelector implements the Option interface for metav1.ListOptions. It's a common way in Go to ensure at compile time that a type satisfies an interface." Is that correct?

Contributor Author

It's exactly that :)
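For readers unfamiliar with the idiom, here is a minimal self-contained sketch of the same compile-time assertion. `ListOptions` is a hypothetical stand-in for metav1.ListOptions, and the `Option` interface and `buildListOptions` helper are assumptions about the shape of the test support code, not copies of it.

```go
package main

import "fmt"

// ListOptions is a hypothetical stand-in for metav1.ListOptions.
type ListOptions struct {
	LabelSelector string
}

// Option is a generic functional-option interface over a target type T.
type Option[T any] interface {
	applyTo(to T) error
}

type labelSelector string

// Compile-time check: the build fails if *labelSelector stops satisfying
// Option[*ListOptions]. Assigning to the blank identifier keeps no variable
// around at run time.
var _ Option[*ListOptions] = (*labelSelector)(nil)

func (l labelSelector) applyTo(to *ListOptions) error {
	to.LabelSelector = string(l)
	return nil
}

// buildListOptions applies each option in order to a fresh ListOptions.
func buildListOptions(opts ...Option[*ListOptions]) (*ListOptions, error) {
	lo := &ListOptions{}
	for _, o := range opts {
		if err := o.applyTo(lo); err != nil {
			return nil, err
		}
	}
	return lo, nil
}

func main() {
	lo, err := buildListOptions(labelSelector("ray.io/cluster=demo"))
	if err != nil {
		panic(err)
	}
	fmt.Println(lo.LabelSelector)
}
```

Renaming or removing the applyTo method turns the assertion line into a compile error, which is the whole point of the idiom.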


// nolint: unused
// To be removed when the false-positivity is fixed.
func (l labelSelector) applyTo(to *metav1.ListOptions) error {
Member

Would you mind sharing more details about "To be removed when the false-positivity is fixed."? Thanks!

Contributor Author

I think the root issue is dominikh/go-tools#1294.

There are still a number of false positives w.r.t. generics support: https://github.com/dominikh/go-tools/issues?q=is%3Aopen+label%3Afalse-positive+generic.

Member

@kevin85421 kevin85421 left a comment

Q1: I haven't reviewed the files under pkg/client. Are they generated files?

I want to double-check the reason why we decided to use applyconfiguration.

  • There are two field managers, the KubeRay operator and kuberay-test, in e2e tests.
  • Prior to this PR, we used Update without SSA, but it may have led to some 409 conflicts.
  • In this PR, we use SSA Apply with Force: true. The Apply request will always succeed.

Is my understanding correct? Thanks!

@astefanutti
Contributor Author

> Q1: I haven't reviewed the files under pkg/client. Are they generated files?

Yes, these files are generated, like the other ones that already exist in pkg/client.

They are "validated" by the added e2e test.

> I want to double-check the reason why we decided to use applyconfiguration.
>
>   • There are two field managers, the KubeRay operator and kuberay-test, in e2e tests.
>   • Prior to this PR, we used Update without SSA, but it may have led to some 409 conflicts.
>   • In this PR, we use SSA Apply with Force: true. The Apply request will always succeed.
>
> Is my understanding correct? Thanks!

Yes, your understanding is correct 👍🏼.

Member

@kevin85421 kevin85421 left a comment

LGTM. I learned a lot from reviewing this PR. Thank you!

@kevin85421 kevin85421 merged commit 966d9b3 into ray-project:master Jan 18, 2024
24 checks passed
@astefanutti astefanutti deleted the pr-08 branch January 18, 2024 16:42
@andrewsykim
Contributor

FYI I think CI is currently broken because the generated applyconfiguration was not updated after this change: #1839

is missing. Running 'go mod download'.
>> Using /home/runner/go/pkg/mod/k8s.io/code-generator@v0.28.4
diffing ./hack/../pkg against freshly generated codegen
diff -Naupr ./hack/../pkg/client/applyconfiguration/ray/v1/rayservicestatus.go ./hack/../_tmp/pkg/client/applyconfiguration/ray/v1/rayservicestatus.go
--- ./hack/../pkg/client/applyconfiguration/ray/v1/rayservicestatus.go	2024-01-18 17:47:09.837563713 +0000
+++ ./hack/../_tmp/pkg/client/applyconfiguration/ray/v1/rayservicestatus.go	2024-01-18 17:46:26.866143286 +0000
@@ -6,6 +6,7 @@ package v1
 // with apply.
 type RayServiceStatusApplyConfiguration struct {
 	Applications     map[string]AppStatusApplyConfiguration `json:"applicationStatuses,omitempty"`
+	DashboardStatus  *DashboardStatusApplyConfiguration     `json:"dashboardStatus,omitempty"`
 	RayClusterName   *string                                `json:"rayClusterName,omitempty"`
 	RayClusterStatus *RayClusterStatusApplyConfiguration    `json:"rayClusterStatus,omitempty"`
 }
@@ -30,6 +31,14 @@ func (b *RayServiceStatusApplyConfigurat
 	return b
 }
 
+// WithDashboardStatus sets the DashboardStatus field in the declarative configuration to the given value
+// and returns the receiver, so that objects can be built by chaining "With" function invocations.
+// If called multiple times, the DashboardStatus field is set to the value of the last call.
+func (b *RayServiceStatusApplyConfiguration) WithDashboardStatus(value *DashboardStatusApplyConfiguration) *RayServiceStatusApplyConfiguration {
+	b.DashboardStatus = value
+	return b
+}
+
 // WithRayClusterName sets the RayClusterName field in the declarative configuration to the given value
 // and returns the receiver, so that objects can be built by chaining "With" function invocations.
 // If called multiple times, the RayClusterName field is set to the value of the last call.
diff -Naupr ./hack/../pkg/client/applyconfiguration/utils.go ./hack/../_tmp/pkg/client/applyconfiguration/utils.go
--- ./hack/../pkg/client/applyconfiguration/utils.go	2024-01-18 17:47:09.849563549 +0000
+++ ./hack/../_tmp/pkg/client/applyconfiguration/utils.go	2024-01-18 17:46:26.866143286 +0000
@@ -17,6 +17,8 @@ func ForKind(kind schema.GroupVersionKin
 		return &rayv1.AppStatusApplyConfiguration{}
 	case v1.SchemeGroupVersion.WithKind("AutoscalerOptions"):
 		return &rayv1.AutoscalerOptionsApplyConfiguration{}
+	case v1.SchemeGroupVersion.WithKind("DashboardStatus"):
+		return &rayv1.DashboardStatusApplyConfiguration{}
 	case v1.SchemeGroupVersion.WithKind("HeadGroupSpec"):
 		return &rayv1.HeadGroupSpecApplyConfiguration{}
 	case v1.SchemeGroupVersion.WithKind("HeadInfo"):
./hack/../pkg is out of date. Please run hack/update-codegen.sh

@andrewsykim
Contributor

andrewsykim commented Jan 18, 2024

(I'll open a PR to fix it)

@andrewsykim
Contributor

#1847

@andrewsykim
Contributor

@astefanutti I've been finding that running ./hack/update-codegen.sh is really slow. Is that just for me or is it expected?

@astefanutti
Contributor Author

> @astefanutti I've been finding that running ./hack/update-codegen.sh is really slow. Is that just for me or is it expected?

@andrewsykim it takes around 2 minutes to complete on my setup. How long does that take on your side?

I'm not that surprised, compared to the data points I have from other projects. It may be that embedding the PodTemplateSpec struct in multiple places contributes to the generation time. It's a very large struct, which also increases the size of the CRDs significantly.

I don't see anything obvious we could do to optimise that generation time. That would probably require looking deeper into code-generator. There is kubernetes/code-generator#69 open that could be related as well.

Development

Successfully merging this pull request may close these issues.

[Feature][client] Generate apply configurations
3 participants