watchers: Fix BGP subscriber potentially getting skipped #16341
test-me-please
How reliable was the bug? Any chance we could have caught it with the right CI coverage?
@joestringer If we had CI coverage with more advanced configuration, such as setting node-selectors, we would very likely have caught this. I'll open a PR to add this configuration to the existing test so that we have regression coverage.
It is possible for the BGP speaker subscriber to be skipped in a K8s node event if the host endpoint is not yet created. This can happen at the very early stages of Cilium startup, as a K8s node add event is sent to the K8s watchers as one of the first events. It is also often sent before any endpoints have been generated. If that happens, the MetalLB integration is not seeded with the node labels, which can prevent peering with the BGP routers if the user has node-selectors defined in their BGP configuration. In other words, the MetalLB integration would try to match the selectors against empty labels, which will always fail.

The short-term fix is to move the BGP speaker logic slightly above where the host endpoint logic can return, so that it is guaranteed to always be executed. Longer term, we have an issue, cilium#15471, to refactor the subscribers so that they are executed separately, rather than bundled into one function like (*K8sWatcher).updateK8sNodeV1(). That would have prevented this bug.

Fixes: d8dbb82 ("daemon, bgp, watchers: Implement LB IP announcement via BGP")
Fixes: cilium#16340
Signed-off-by: Chris Tarazi <chris@isovalent.com>
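The ordering problem described in the commit message can be illustrated with a minimal, self-contained sketch. This is not the actual Cilium code: `Node`, `Speaker`, `updateNodeBuggy`, `updateNodeFixed`, and the `hostEndpointReady` flag are hypothetical names invented for illustration, and the selector matching is reduced to a plain key/value comparison.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a K8s node object; only labels matter here.
type Node struct {
	Name   string
	Labels map[string]string
}

// Speaker is a hypothetical stand-in for the BGP speaker subscriber.
type Speaker struct {
	labels map[string]string // labels seeded from the most recent node event
}

// OnUpdateNode seeds the speaker with the node's labels.
func (s *Speaker) OnUpdateNode(n Node) { s.labels = n.Labels }

// Matches reports whether every key/value pair in the selector is present
// in the seeded labels. With no labels seeded, a non-empty selector never matches.
func (s *Speaker) Matches(selector map[string]string) bool {
	for k, v := range selector {
		if got, ok := s.labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// updateNodeBuggy mirrors the problematic ordering: the host endpoint check
// can return before the speaker is ever notified.
func updateNodeBuggy(s *Speaker, n Node, hostEndpointReady bool) {
	if !hostEndpointReady {
		return // the speaker is skipped for this event
	}
	// ... host endpoint handling would go here ...
	s.OnUpdateNode(n)
}

// updateNodeFixed mirrors the short-term fix: notify the speaker first,
// so the early return can no longer skip it.
func updateNodeFixed(s *Speaker, n Node, hostEndpointReady bool) {
	s.OnUpdateNode(n)
	if !hostEndpointReady {
		return
	}
	// ... host endpoint handling would go here ...
}

func main() {
	node := Node{Name: "node-1", Labels: map[string]string{"bgp": "enabled"}}
	selector := map[string]string{"bgp": "enabled"}

	buggy := &Speaker{}
	updateNodeBuggy(buggy, node, false) // host endpoint not yet created
	fmt.Println("buggy ordering matches selector:", buggy.Matches(selector))

	fixed := &Speaker{}
	updateNodeFixed(fixed, node, false)
	fmt.Println("fixed ordering matches selector:", fixed.Matches(selector))
}
```

With the host endpoint not yet created, the buggy ordering leaves the speaker without labels and the selector fails to match, while the fixed ordering seeds the labels before the early return. Running each subscriber separately, as proposed in the longer-term refactor, would remove the dependency on ordering altogether.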
We have approving reviews and passing CI. Marked as ready to merge.
We recently had a regression (cilium#16340) that occurred when the user specified node-selectors in their BGP configmap. The node-selectors were not picked up due to the bug that was fixed in cilium#16341. This commit is to add regression testing for the BGP integration. Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit 8ba6c28 ] We recently had a regression (cilium#16340) that occurred when the user specified node-selectors in their BGP configmap. The node-selectors were not picked up due to the bug that was fixed in cilium#16341. This commit is to add regression testing for the BGP integration. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>