Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Refactor][RayCluster] RayClusterHeadPodsAssociationOptions and RayClusterWorkerPodsAssociationOptions #2023

Merged

Conversation

rueian
Copy link
Contributor

@rueian rueian commented Mar 15, 2024

Why are these changes needed?

Introduce the following association functions to centralize many ad-hoc MatchingLabels that are scattered around:

  • RayClusterHeadPodsAssociationOptions
  • RayClusterWorkerPodsAssociationOptions
  • RayClusterGroupPodsAssociationOptions
  • RayClusterAllPodsAssociationOptions

These functions return options of a new type AssociationOptions that can convert the options into either a []client.ListOption or a []client.DeleteAllOfOption so that we can use them like this:

filters := common.RayClusterWorkerPodsAssociationOptions(instance)

r.List(ctx, &corev1.Pod{}, filters.ToListOptions()...)

r.DeleteAllOf(ctx, &corev1.Pod{}, filters.ToDeleteOptions()...)

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@rueian rueian force-pushed the refactor-raycluster-association-options branch 2 times, most recently from 4870479 to f073326 Compare March 15, 2024 17:12
@rueian rueian force-pushed the refactor-raycluster-association-options branch from f073326 to 6b8f9e6 Compare March 15, 2024 17:16
@rueian rueian marked this pull request as ready for review March 16, 2024 03:08
@rueian
Copy link
Contributor Author

rueian commented Mar 16, 2024

Hi @kevin85421,

What do you think about the new AssociationOptions that allows us to convert associations to either a slice of ListOption or DeleteAllOfOption?

I'd like to have your feedback, thanks!

@kevin85421
Copy link
Member

What do you think about the new AssociationOptions that allows us to convert associations to either a slice of ListOption or DeleteAllOfOption?

LGTM

@kevin85421 kevin85421 self-requested a review March 16, 2024 06:07
@kevin85421 kevin85421 self-assigned this Mar 16, 2024
@rueian
Copy link
Contributor Author

rueian commented Mar 18, 2024

Hi @evalaiyc98, could you also help review this?

@evalaiyc98
Copy link
Contributor

Sure!

@@ -33,6 +33,66 @@ func RayClusterHeadlessServiceListOptions(instance *rayv1.RayCluster) []client.L
}
}

type AssociationOption interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not important: maybe we can use generic.

@@ -149,6 +149,113 @@ func TestRayClusterHeadlessServiceListOptions(t *testing.T) {
}
}

func TestRayClusterHeadPodsListOptions(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests can't ensure that the functions work as expected. It is very easy to forget to update listOptions below if we update labels. How about create a test in raycluster_controller_test.go to test whether the functions can get the correct number of Pods? The test will create a RayCluster with 2 worker groups so that we can test the four association functions above.

Copy link
Contributor Author

@rueian rueian Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of creating a new test, how about replacing the usages of MatchingLabels with the new AssociationOption in existing tests? I found the existing tests can cover all the four association functions above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the new AssociationOption to associate anything in the codebase (not necessary to include all of them in this PR).

I found the existing tests can cover all the four association functions above.

Which existing tests are you referring to? Do you mean envtest or others? I think envtest is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the existing envtests in the raycluster_controller_test.go.
I refactored them in the latest commit 8e7a7c3

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

workerFilterLabels := client.MatchingLabels{utils.RayClusterLabelKey: rayCluster.Name, utils.RayNodeGroupLabelKey: rayCluster.Spec.WorkerGroupSpecs[0].GroupName}
headFilterLabels := client.MatchingLabels{utils.RayClusterLabelKey: rayCluster.Name, utils.RayNodeGroupLabelKey: utils.RayNodeHeadGroupLabelValue}
workerFilters := common.RayClusterGroupPodsAssociationOptions(rayCluster, rayCluster.Spec.WorkerGroupSpecs[0].GroupName).ToListOptions()
headGroupFilters := common.RayClusterGroupPodsAssociationOptions(rayCluster, utils.RayNodeHeadGroupLabelValue).ToListOptions()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to have both headGroupFilters and headFilters? The primary goal of this refactoring is to maintain consistency in the association method by always using the same function to associate two Kubernetes resources. If there is no special reason, we should remove headGroupFilters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kevin85421, actually the original headFilterLabels were equivalent to headGroupFilters. The headFilters are newcomers.

I found that the ray.io/group is still needed for a head Pod due to this:

// RAY_NODE_TYPE_NAME is used by Ray Autoscaler V2 (alpha). See https://github.com/ray-project/kuberay/issues/1965 for more details.
nodeGroupNameEnv := corev1.EnvVar{
Name: utils.RAY_NODE_TYPE_NAME,
ValueFrom: &corev1.EnvVarSource{
FieldRef: &corev1.ObjectFieldSelector{
FieldPath: fmt.Sprintf("metadata.labels['%s']", utils.RayNodeGroupLabelKey),
},
},
}

Shouldn't we still test ray.io/group for a head Pod? We can leave a warning that we should normally use headFilters instead of headGroupFilters to do associations. The headGroupFilters are only meant to test the ray.io/group=headgroup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we still test ray.io/group for a head Pod?

We have already had unit tests for ray.io/group. It's better to remove headGroupFilters from the env tests.

@evalaiyc98
Copy link
Contributor

I've noticed that the logic used here still resembles the original approach. I believe it would be beneficial to maintain consistency with the previous refactor part. What are your thoughts on this?

headFilterLabels := client.MatchingLabels{
utils.RayClusterLabelKey: rayClusterName,
utils.RayNodeTypeLabelKey: string(rayv1.HeadNode),
}

@kevin85421 kevin85421 merged commit 83f8dcc into ray-project:master Mar 21, 2024
23 checks passed
kevin85421 pushed a commit to kevin85421/kuberay that referenced this pull request Mar 22, 2024
kevin85421 added a commit that referenced this pull request Mar 22, 2024
…ons and RayClusterWorkerPodsAssociationOptions (#2023) (#2035)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants