operator: Fix logic used to sync Cilium's IngressClass on startup #28663
/test
Thanks for the PR!
I've done a first pass (skipped the tests, for now) and left a question to better understand the reason behind the changes.
Thanks a lot for this @learnitall
Some initial questions & proposals.
@pippolo84 @mhofstetter I made some changes to address your feedback. Let me know what you think!
Great! I like the way you changed the `Run` method in the ingress controller; I think this is effective and simpler!
I've left some nits about the test, but nothing blocking. The only thing left to address is the check on the `ingressClassEvents` channel.
This commit introduces changes to the ingress class manager piece of the ingress controller, in order to address bugs impacting the proper syncing of Cilium's IngressClass during startup. The following changes are made:

* Replace use of an Informer with Resource[T] for IngressClass. This helps simplify the logic used to perform the initial sync.
* Move the responsibility of tracking if Cilium should act as the default IngressClass into the ingress class manager, rather than having the ingress controller track this itself when processing IngressClass events.

After the ingress class manager is constructed, the ingress controller will be able to determine if Cilium is the default IngressClass for a cluster through the ingress class manager. The ingress controller no longer has to wait to process an event for Cilium's IngressClass to learn if Cilium should be the default.

Before this commit, the ingress controller would process all Ingress resources before processing IngressClass resources. This is because the Ingress resource informer would be started before the ingress class manager, so all events related to Ingress resources would appear in the ingress controller's event queue before events relating to IngressClass resources. This presented a problem, because the ingress controller would always believe that it was not the default IngressClass for a cluster on startup while processing each Ingress resource for the first time. This could lead to the following situation:

1. The ingress controller processes all Ingress resources.
2. The ingress controller processes IngressClass resources, and learns that it should act as the default IngressClass for the cluster.
3. A resync of Ingress resources is triggered.

This double-sync overhead can be a problem for large-scale clusters.

Signed-off-by: Ryan Drew <ryan.drew@isovalent.com>
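To illustrate the second change described in the commit message, here is a minimal sketch of a manager that learns during construction whether Cilium is the default IngressClass, so that callers can query it before processing any Ingress. It is not the code from this PR: the `IngressClass`, `Event`, and `classManager` types and the `newClassManager` function are hypothetical, simplified stand-ins (loosely modeled on an Upsert/Delete/Sync event stream), and the class name `cilium` is assumed. The annotation key is the standard Kubernetes one for marking a default IngressClass.

```go
package main

import (
	"fmt"
	"sync"
)

// Standard Kubernetes annotation that marks an IngressClass as the cluster default.
const defaultClassAnnotation = "ingressclass.kubernetes.io/is-default-class"

// Assumed name of Cilium's IngressClass for this sketch.
const ciliumIngressClassName = "cilium"

// IngressClass is a simplified, hypothetical stand-in for the Kubernetes type.
type IngressClass struct {
	Name        string
	Annotations map[string]string
}

// EventKind mimics an Upsert/Delete/Sync style event stream (hypothetical).
type EventKind int

const (
	Upsert EventKind = iota
	Delete
	Sync // emitted once after all pre-existing objects have been replayed
)

// Event carries one change to an IngressClass, or the Sync marker.
type Event struct {
	Kind   EventKind
	Object *IngressClass
}

// classManager tracks whether Cilium is the default IngressClass.
type classManager struct {
	mu        sync.Mutex
	isDefault bool
}

// newClassManager blocks until the Sync marker arrives (or the channel is
// closed), so the default status is known by the time the constructor returns.
func newClassManager(events <-chan Event) *classManager {
	m := &classManager{}
	for ev := range events {
		if ev.Kind == Sync {
			break
		}
		m.handle(ev)
	}
	return m
}

// handle updates the default flag for events that concern Cilium's IngressClass.
func (m *classManager) handle(ev Event) {
	if ev.Object == nil || ev.Object.Name != ciliumIngressClassName {
		return
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	switch ev.Kind {
	case Upsert:
		m.isDefault = ev.Object.Annotations[defaultClassAnnotation] == "true"
	case Delete:
		m.isDefault = false
	}
}

// IsDefault reports whether Cilium is currently the default IngressClass.
func (m *classManager) IsDefault() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.isDefault
}

func main() {
	events := make(chan Event, 3)
	events <- Event{Kind: Upsert, Object: &IngressClass{Name: "nginx"}}
	events <- Event{Kind: Upsert, Object: &IngressClass{
		Name:        ciliumIngressClassName,
		Annotations: map[string]string{defaultClassAnnotation: "true"},
	}}
	events <- Event{Kind: Sync}

	m := newClassManager(events)
	fmt.Println("cilium is default:", m.IsDefault())
}
```

The point of the design is that the constructor does not return until the initial sync has been observed, so the ingress controller can ask `IsDefault()` before it handles its first Ingress event.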
Great, thanks! 💯
LGTM, thanks again!
/test
The ingressClassManager emits a warning log while handling an Upsert event on IngressClasses, but this log should be at the debug level. It was set to the warning level while testing cilium#28663 and never changed back to debug.

Signed-off-by: Ryan Drew <ryan.drew@isovalent.com>
Please see the commit for more details. The TL;DR here is that the Ingress Controller in the Cilium Operator will always assume on startup that Cilium is not the default IngressClass. This is because the Ingress Controller processes events from its Ingress resource Informer before processing events from its IngressClass Informer (which is a part of the `ingressClassManager`).

This bug may cause two full reconciliations of Ingress resources on the startup of the operator: one after the Ingress resource Informer is synced, and one after the ingress controller learns Cilium should act as the default IngressClass in the cluster.
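The ordering problem described above can be sketched as a small simulation that counts full reconciliation passes under the old and new startup orders. This is an illustration only, not operator code: the `ingress` type and the `managed`, `reconcileAll`, `startupOld`, and `startupNew` functions are hypothetical, and the rule that an Ingress without a class is handled only when Cilium is the default is a simplifying assumption.

```go
package main

import "fmt"

// ingress is a hypothetical, simplified Ingress: a name and an optional class.
type ingress struct {
	name  string
	class string // empty means no ingress class was set
}

// managed reports whether Cilium should handle ing (simplified assumption).
func managed(ing ingress, ciliumIsDefault bool) bool {
	return ing.class == "cilium" || (ing.class == "" && ciliumIsDefault)
}

// reconcileAll performs one full pass and returns the names Cilium handles.
func reconcileAll(ingresses []ingress, ciliumIsDefault bool) []string {
	var handled []string
	for _, ing := range ingresses {
		if managed(ing, ciliumIsDefault) {
			handled = append(handled, ing.name)
		}
	}
	return handled
}

// startupOld models the old order: Ingress events are processed first while
// the controller still believes it is not the default. The later IngressClass
// event changes that belief and triggers a second full pass.
func startupOld(ingresses []ingress, ciliumIsDefault bool) (passes int, handled []string) {
	believedDefault := false
	handled = reconcileAll(ingresses, believedDefault)
	passes++
	if ciliumIsDefault != believedDefault {
		believedDefault = ciliumIsDefault
		handled = reconcileAll(ingresses, believedDefault)
		passes++
	}
	return passes, handled
}

// startupNew models the new order: the default status is already known when
// Ingress processing starts, so a single pass is enough.
func startupNew(ingresses []ingress, ciliumIsDefault bool) (passes int, handled []string) {
	return 1, reconcileAll(ingresses, ciliumIsDefault)
}

func main() {
	ingresses := []ingress{
		{name: "a", class: "cilium"},
		{name: "b"}, // no class: handled only if Cilium is the default
		{name: "c", class: "nginx"},
	}
	oldPasses, oldHandled := startupOld(ingresses, true)
	newPasses, newHandled := startupNew(ingresses, true)
	fmt.Println("old order:", oldPasses, "passes, handled", oldHandled)
	fmt.Println("new order:", newPasses, "passes, handled", newHandled)
}
```

Both orders end up handling the same Ingress resources; the difference is that the old order needs a second full pass whenever Cilium is the default, which is the overhead the PR description refers to.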