Why are these changes needed?
This PR improves on #946. The description below explains both the changes I made and the changes I decided against. The result may also be more memory-efficient, although I have not had a chance to benchmark it.
Indexer
This PR optimizes the indexer (cache), following kubebuilder's tutorial (tutorial, code). In #946, we created a local cache for all Pods, including Ray Pods and other Pods. With this PR, we index only Ray Pods.
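To illustrate the idea of indexing only Ray Pods, here is a minimal sketch that uses only the Go standard library. It assumes the index follows the pattern from kubebuilder's tutorial: each object is keyed by the name of its controlling owner, and the index function returns no keys when the owner is of a different kind. The types OwnerRef and Pod and the functions controllerOf, rayClusterIndexKeys, and buildIndex are hypothetical stand-ins, not the controller-runtime API or the code in this PR. Identifying Ray Pods by owner kind is also an assumption.

```go
package main

import "fmt"

// OwnerRef and Pod are hypothetical, simplified stand-ins for the Kubernetes
// OwnerReference and Pod types; only the fields this sketch needs are kept.
type OwnerRef struct {
	Kind       string
	Name       string
	Controller bool
}

type Pod struct {
	Name   string
	Owners []OwnerRef
}

// controllerOf returns the owner reference marked as the controller, or nil
// when the Pod has no controlling owner.
func controllerOf(p Pod) *OwnerRef {
	for i := range p.Owners {
		if p.Owners[i].Controller {
			return &p.Owners[i]
		}
	}
	return nil
}

// rayClusterIndexKeys is the index function. It returns the owning
// RayCluster's name for Ray Pods and no keys for every other Pod, so other
// Pods never enter the index.
func rayClusterIndexKeys(p Pod) []string {
	owner := controllerOf(p)
	if owner == nil || owner.Kind != "RayCluster" {
		return nil
	}
	return []string{owner.Name}
}

// buildIndex maps each RayCluster name to the names of the Pods it controls.
func buildIndex(pods []Pod) map[string][]string {
	index := make(map[string][]string)
	for _, p := range pods {
		for _, key := range rayClusterIndexKeys(p) {
			index[key] = append(index[key], p.Name)
		}
	}
	return index
}

func main() {
	pods := []Pod{
		{Name: "raycluster-head", Owners: []OwnerRef{{Kind: "RayCluster", Name: "demo", Controller: true}}},
		{Name: "raycluster-worker", Owners: []OwnerRef{{Kind: "RayCluster", Name: "demo", Controller: true}}},
		{Name: "web", Owners: []OwnerRef{{Kind: "ReplicaSet", Name: "web-rs", Controller: true}}},
		{Name: "standalone"},
	}
	index := buildIndex(pods)
	// Only the RayCluster "demo" appears as a key; the other Pods are skipped.
	fmt.Println(len(index), index["demo"])
}
```

In controller-runtime, an index function of this shape would be registered through the manager's field indexer, and the memory saving comes from the index holding entries only for Pods that return a key.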
Resource event handler (not updated)
Initially, I thought that we should implement all of the filter logic in the resource event handler (i.e. CreateFunc) and return true only when an event is issued by a Ray Pod. With this approach, the function eventReconcile would only handle event processing and would not need to worry about event filtering, so the label common.RayNodeLabelKey would not be necessary.
However, I have since realized that the owners of the unhealthy events are Pods, which may be Ray Pods or other Pods. Therefore, we cannot call GetControllerOf on the event and decide whether it belongs to a Ray Pod by checking whether the owner's kind is "RayCluster". The implementation would be more complex: to tell whether an event belongs to a Ray Pod, we would need to look up the Pod, call GetControllerOf(pod) to get its owner, and check whether the owner's kind is RayCluster.
This would make the function eventReconcile easier to understand, but it requires more I/O (reads from the local cache), which may cause performance overhead. In summary, we should keep using common.RayNodeLabelKey to distinguish Ray Pods in eventReconcile until we understand the cost of the look-up operation.
Fake client limitation
The fake client included in controller-runtime does not support using client.MatchingFields with ListOptions. For more details, see kubernetes-sigs/controller-runtime#866. In the future, we need to identify all of the fake client's limitations, and perhaps find a better solution.
Related issue number
Checks
Pass the test in #946.