-
Notifications
You must be signed in to change notification settings - Fork 828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor search reconciling logic #3670
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kindly add e2e case to cover this scenario.
Codecov Report
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. @@ Coverage Diff @@
## master #3670 +/- ##
==========================================
+ Coverage 55.68% 55.70% +0.01%
==========================================
Files 222 223 +1
Lines 21195 21233 +38
==========================================
+ Hits 11803 11828 +25
- Misses 8765 8774 +9
- Partials 627 631 +4
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Thanks @tedli @whitewindmills |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks~
Still reviewing.
|
||
c.queue.Add(cls) | ||
clusterSet := sets.New[string]() | ||
for _, registry := range []*searchv1alpha1.ResourceRegistry{newRR, oldRR} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This processing may result in some redundancy. For certain clusters in rr, there have been no changes and we can focus only on the ones with differences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This controller reconcile the relationship between clusters and registries, what's more the key of the controller's queue is cluster name.
At this point, we don't know cluster changed or not, the only thing we know is if reflect.DeepEqual(oldRR.Spec, newRR.Spec)
not true
, the registry must changed.
So it's need to do some reconcile. But, as I mentioned above, the key of the controller queue is cluster name, this leave us only 2 choices, 1 is to queue the cluster name, do reconcile at cluster level, that's what I did; 2 is to introduce another worker, which the key of queue is registry name, then we can reconcile at registry level.
However, if we decice to do using method 2, then we also have to retrieve clusters of old registry spec, and clusters of new registry spec, to find out what changed, then reconcile changed clusters, because finally reconciling multi cluster informer, is what we want.
It seemd not possible to reduce the redunancy, since if cluster-registry relationship not changed, the resources may also be changed, like previously we want to cache Pod
, but currently we no longer want caching Pod
, the cluster-registry relation not changed, but the resource did. That's why I use diff
logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
defer func() { | ||
updated = cr | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cr
a pointer type, do we need to handle this step separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we not have to do this.
But by doing this we have 2 benefits, 1 is that the input cr arg could be nil
; 2 is that at the callee level we know the cr may be changed by this function, which is more expressive.
That's some like append
, a = append(a, obj)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, thx~
pkg/search/controller.go
Outdated
for _, name := range added { | ||
cr.registries[name] = struct{}{} | ||
addedResources = c.getRegistryAddedResources(registries[name], cr, addedResources) | ||
} | ||
for _, name := range removed { | ||
delete(cr.registries, name) | ||
for resource, previousRegistries := range cr.resources { | ||
delete(previousRegistries, name) | ||
if len(previousRegistries) == 0 { | ||
removedResources = append(removedResources, resource) | ||
} | ||
} | ||
for _, resource := range removedResources { | ||
delete(cr.resources, resource) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to handle deletions first before handling additions, because the added elements may contain deleted ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't got you, did you mean it's better to handle removed
(L215) before added
(L211)?
Either is ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, what I mean is to delete first and then add, so as to prevent the elements from being cleared when deleting and adding elements that contain the same content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
for _, registry := range matched { | ||
addedResources = c.getRegistryAddedResources(registry, cr, addedResources) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This processing step feels redundant. All the matched RRs should have already been processed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there may be added ones, and removed ones. But still remaining ones may have resources changed.
Like, registry target cluster is aaa=bbb
, resources, Pod
, Node
, cluster label changed, and resources also changed, like adding Event
.
So we not only need to handle registries added or removed, but also need to handle resources changes.
cr = &crv | ||
addedRegistries, removedRegistries = util.DiffKey(cr.registries, matchedRegistries) | ||
} | ||
if len(addedRegistries) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we omit these judgments to improve the code readability and reduce cyclomatic complexity? Same as below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, at first I thought it's not need to stop the informer, just remove uneeded resources. So I did diff to find out what are added, what are removed.
But it's not work, we have to stop a informer if something changed, to init a new one. Unless we do a huge refactor of informer.
if resourcesChanged { | ||
klog.Infof("Create new informer for cluster %s", cluster) | ||
c.InformerManager.Stop(cluster) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to stop the informer before making changes to the resources, or can we directly modify the current informer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, like the comment above, I firstly came the same idea as you, so I did diff
to find out what are added and what are removed. But it just not work.
You can find the first commit of this PR, I did the exact same thing, modifing the current informer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
karmada/pkg/util/fedinformer/genericmanager/single-cluster-manager.go
Lines 19 to 22 in a9df865
type SingleClusterInformerManager interface { | |
// UnwatchResource removes all existing event handlers from the informer of the given | |
// resource. | |
UnwatchResource(resource schema.GroupVersionResource) |
Here, the first commit of this PR, I tried to add a UnwatchResource(resource schema.GroupVersionResource)
, without success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, thx~
} | ||
) | ||
|
||
func TestDiffKey(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this test achieve condition coverage? Consider the boundary conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. Everything has been reviewed once, and only some testing issues remain.
test/e2e/search_test.go
Outdated
var proxyList *corev1.NodeList | ||
gomega.Eventually(func(g gomega.Gomega) { | ||
var err error | ||
proxyList, err = proxyClient.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current modifications made in PR should not involve any changes in the search/proxy functionality. This test example may have been placed in the wrong location and can be moved to this side of line 344.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
test/e2e/search_test.go
Outdated
@@ -710,5 +710,72 @@ var _ = ginkgo.Describe("[karmada-search] karmada search testing", ginkgo.Ordere | |||
}) | |||
}) | |||
}) | |||
ginkgo.Describe("update cluster testing", ginkgo.Ordered, func() { | |||
ginkgo.Context("cluster labeled after registry created", ginkgo.Ordered, func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line seems redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't touch this test case, it's provided by other commiter previously. It seemed that the diff algo of github get confused, it marks this line green treated as modification of this pr, but it's not.
pkg/util/common_test.go
Outdated
t.Errorf("removed = %v, want []string{\"kubefed\"}", removed) | ||
} | ||
}) | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
echo lint errors here:
Installing golangci-lint v1.[5](https://github.com/karmada-io/karmada/actions/runs/5387138616/jobs/9778049938?pr=3670#step:5:6)2.2
golangci/golangci-lint info checking GitHub for tag 'v1.52.2'
golangci/golangci-lint info found version: 1.52.2 for v1.52.2/linux/amd[6](https://github.com/karmada-io/karmada/actions/runs/5387138616/jobs/9778049938?pr=3670#step:5:7)4
golangci/golangci-lint info installed /home/runner/go/bin/golangci-lint
Error: pkg/util/common_test.go:[8](https://github.com/karmada-io/karmada/actions/runs/5387138616/jobs/9778049938?pr=3670#step:5:9)6:2: S1023: redundant `return` statement (gosimple)
return
^
Please review the above warnings.
If the above warnings do not make sense, feel free to file an issue.
Error: Process completed with exit code 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
test/e2e/search_test.go
Outdated
@@ -710,5 +756,42 @@ var _ = ginkgo.Describe("[karmada-search] karmada search testing", ginkgo.Ordere | |||
}) | |||
}) | |||
}) | |||
ginkgo.Describe("update cluster testing", ginkgo.Ordered, func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new add test block can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
test/e2e/search_test.go
Outdated
m1Dynamic = framework.GetClusterDynamicClient(member1) | ||
gomega.Expect(m1Dynamic).ShouldNot(gomega.BeNil()) | ||
proxyConfig := *restConfig | ||
proxyConfig.Host += "/apis/search.karmada.io/v1alpha1/proxying/karmada/proxy" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tedli, this e2e might not be written correctly. The test coverage does not include the modified content. In this revision, we have added handling for cluster modification events, but this only applies to search cache and not search proxy capability.
Have you tested the new processing logic locally? If it passes the local test, please share the test results.
And then we can submit an E2E test for the modification in a subsequent PR to promote the integration of the current refactoring code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi reminder that @tedli might not be familiar with E2E tests. Can we go with a simple PR first? If you think we need an E2E test, open an issue to track.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saying that because this PR has been blocked for weeks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi all,
Sorry for late reply. I will remove e2e test case, and I'm glad to take a deeper dive with karmada e2e test, and then finish #3731
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tedli
I think we can merge now :) |
Signed-off-by: lizhen6 <lizhen6@360.cn>
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: RainbowMango The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #3669
Special notes for your reviewer:
Does this PR introduce a user-facing change?: