-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFE-1037: operator - simplify controllers #4
CFE-1037: operator - simplify controllers #4
Conversation
193a482
to
4fa33c9
Compare
// Create a new cache for tracking the DNSNameResolver resources in | ||
// the DNSNameResolverNamespace. | ||
dnsNameResolverCache, err := cache.New(mgr.GetConfig(), cache.Options{ | ||
Scheme: mgr.GetScheme(), | ||
DefaultNamespaces: map[string]cache.Config{ | ||
config.DNSNameResolverNamespace: {}, | ||
}, | ||
}) | ||
if err != nil { | ||
return nil, nil, err | ||
} | ||
|
||
// Create a new cache to track the EndpointSlices corresponding to the | ||
// CoreDNS pods. | ||
corednsEndpointsSliceCache, err := cache.New(mgr.GetConfig(), cache.Options{ | ||
Scheme: mgr.GetScheme(), | ||
DefaultNamespaces: map[string]cache.Config{ | ||
config.OperandNamespace: {}, | ||
}, | ||
DefaultLabelSelector: labels.SelectorFromSet(labels.Set{ | ||
discoveryv1.LabelServiceName: config.ServiceName, | ||
}), | ||
}) | ||
if err != nil { | ||
return nil, nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest we still use the independent caches, rather than the mgr.GetCache()
and mgr.GetClient()
.
If we don't use dnsNameResolverCache and enable watching of the config.DNSNameResolverNamespace
by default by the manager's cache, then in CDO it'll always watch for the resources in that namespace even if the DNSNameResolver feature gate is not enabled.
Regarding using the client and not the cache for the coredns endpoint slice, I am a little bit sceptical about it. The minimum TTL for DNS resolution is 5 seconds and in that scenario the client will be sending requests to the api-server every 5 seconds. On the other hand the coredns pods' ips may not change that frequently. Thus it'll be more optimal to use the cache rather than the client.
PLMKyour thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't use dnsNameResolverCache and enable watching of the config.DNSNameResolverNamespace by default by the manager's cache, then in CDO it'll always watch for the resources in that namespace even if the DNSNameResolver feature gate is not enabled.
Why creating this controller if the feature gate is disabled?
Regarding using the client and not the cache for the coredns endpoint slice, I am a little bit sceptical about it. The minimum TTL for DNS resolution is 5 seconds and in that scenario the client will be sending requests to the api-server every 5 seconds. On the other hand the coredns pods' ips may not change that frequently. Thus it'll be more optimal to use the cache rather than the client.
The default controller-runtime's client is a split one and it does use the cache for the reads. As a matter of fact, it would be more consistent to use it everywhere.
Upd: updated to use controller runtime's client
everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't use dnsNameResolverCache and enable watching of the config.DNSNameResolverNamespace by default by the manager's cache, then in CDO it'll always watch for the resources in that namespace even if the DNSNameResolver feature gate is not enabled.
Why creating this controller if the feature gate is disabled?
We won't create the controller when the feature gate is disabled, however as we are using the manager's cache it'll anyway get the events for the config.DNSNameResolverNamespace
resources. If we use the specific caches then only when the controller is created at that time the cache will be created in CDO.
Regarding using the client and not the cache for the coredns endpoint slice, I am a little bit sceptical about it. The minimum TTL for DNS resolution is 5 seconds and in that scenario the client will be sending requests to the api-server every 5 seconds. On the other hand the coredns pods' ips may not change that frequently. Thus it'll be more optimal to use the cache rather than the client.
The default controller-runtime's client is a split one and it does use the cache for the reads. As a matter of fact, it would be more consistent to use it everywhere.
This will still have the issue regarding the cache getting events for the controller when it is not created.
@alebedev87 should we just stick to removing the dnsnameresolver-crd controller in this PR? We can take up changes to the dnsnameresolver controller in a different PR. We can discuss regarding the pros and cons in that PR. I also want to have @Miciah's opinion on this matter. This is something that was previously discussed and Miciah had shared the discussion that we had here (#2 (comment)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We won't create the controller when the feature gate is disabled, however as we are using the manager's cache it'll anyway get the events for the config.DNSNameResolverNamespace resources.
config.DNSNameResolverNamespace
is a namespace you can skip adding it to the default namespaces of the manager's cache if the feature gate is disabled. Then you won't create the controller therefore no Watch
will be used, no events either.
should we just stick to removing the dnsnameresolver-crd controller in this PR?
Then how would we resolve the dilemma in C-D-O?
I also want to have @Miciah's opinion on this matter. This is something that was previously discussed and Miciah had shared the discussion that we had here (#2 (comment)).
Miciah is on PTO till the end of May, we have take a decision on our own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we just stick to removing the dnsnameresolver-crd controller in this PR?
Then how would we resolve the dilemma in C-D-O?
This is the dilemma specific for the helper controller (dnsnameresolver-crd controller). It's not related to the different caches being used in dnsnameresolver controller. We can still have multiple caches as can be seen here in cluster-ingress-operator which can be added to the manager when the controller is created: https://github.com/openshift/cluster-ingress-operator/blob/5b4f6283f48046b85f60bf9d75fa5c9222329cd1/pkg/operator/controller/route-metrics/controller.go#L43-L51
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. Yes, we can follow up on the cache logic later. Let's keep this PR related to the removal of the crd controller.
@@ -1,6 +1,6 @@ | |||
module github.com/openshift/coredns-ocp-dnsnameresolver/operator | |||
|
|||
go 1.19 | |||
go 1.21 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please also update this in the README as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
4fa33c9
to
87b9271
Compare
// want the controller to start if we need it. The dnsnameresolvercrd | ||
// controller starts it and the caches after it creates the | ||
// DNSNameResolver CRD. | ||
dnsNameResolverController, dnsNameResolverControllerCaches, err := |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we are removing the caches, can we make the manager's cache to watch for only the OperandNamespace
and DNSNameResolverNamespace
namespaces?
87b9271
to
32afcda
Compare
if err := r.dnsNameResolverCache.Get(ctx, request.NamespacedName, dnsNameResolverObj); err != nil { | ||
if err := r.client.Get(ctx, request.NamespacedName, dnsNameResolverObj); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line should be reverted then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -40,18 +39,19 @@ type Config struct { | |||
|
|||
// reconciler handles the actual DNSNameResolver reconciliation logic in response to events. | |||
type reconciler struct { | |||
dnsNameResolverCache cache.Cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache needs to be added back to the reconciler struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
32afcda
to
6660746
Compare
- Remove dnsnameresolver-crd controller - Make dnsnameresolver managed - CRDs should be installed as prerequisite
6660746
to
e235150
Compare
@alebedev87: This pull request references CFE-1037 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@alebedev87: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: arkadeepsen The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@snarayan-redhat @melvinjoseph86 @CFields651 adding the labels as the changes in this PR will be imported into openshift/cluster-dns-operator#394 /label docs-approved |
@alebedev87: This pull request references CFE-1037 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
The PR aims at removing
dnsnameresolver-crd
controller and making thednsnameresolver
controller managed. The required CRDs are installed as prerequisite so we don't need to ensure them at runtime. Thednsnameresolver
controller can be started by the controller runtime's manager.