BGP route not withdrawn, if shared IP, externalTrafficPolicy=Local and no endpoints #295
Comments
probably related: #263 |
Thanks for the bug report! This sounds pretty terrible... And even more confusing because according to MetalLB's logs, it's doing exactly the right thing! I see a log entry where it triggers a withdraw of the BGP announcement for reason Digging into this, thanks. |
I think I have found the issue!
If I remove the check for |
Thanks for digging. Yeah, the refcount is important for shared IPs, because we only want to stop announcing if all services have gone away, otherwise we can end up with weird race conditions because of the lack of guarantees k8s gives us about event ordering. However, the key problem you've identified here is: when ref=0, |
But there is a separate entry for every service in svcAds, and if DeleteBalancer is called, only that service is removed from svcAds. As long as another service uses this IP, it will still be advertised, so no refcount would be needed. The bgp_controller.go contains the following comment, which I based my assumption on: |
Doh! Okay, I see the problem now. When I refactored everything for IP sharing, I used the internal model of layer2 mode as a reference, where I must refcount externally and only delete when the last service goes away. On the other hand, in BGP mode, because of the comment you noted, you have to delete an equal number of times as you create, otherwise announcements stay behind. So... Some tweaks needed, obviously :). I can't just delete the refcounting or it'll break layer2 mode. Thinking... |
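To make the BGP-side behaviour described above concrete, here is a minimal sketch of a per-service advertisement table whose announced set is the deduplicated union of all entries. It is not MetalLB's actual code: the `serviceAds` type, its `set`, `remove` and `announced` methods, the service names and the `192.0.2.10/32` prefix are all hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// serviceAds is a hypothetical stand-in for a per-service advertisement
// table: service name -> prefixes that service wants announced.
type serviceAds map[string][]string

// set records the prefixes wanted by one service, replacing earlier ones.
func (a serviceAds) set(service string, prefixes ...string) {
	a[service] = prefixes
}

// remove drops one service's entry; entries of other services are untouched.
func (a serviceAds) remove(service string) {
	delete(a, service)
}

// announced returns the deduplicated, sorted union of the prefixes wanted by
// all remaining services. A shared prefix appears once, and it disappears
// only after every service that wanted it has been removed.
func (a serviceAds) announced() []string {
	seen := map[string]bool{}
	out := []string{}
	for _, prefixes := range a {
		for _, p := range prefixes {
			if !seen[p] {
				seen[p] = true
				out = append(out, p)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	ads := serviceAds{}
	ads.set("default/svc-a", "192.0.2.10/32")
	ads.set("default/svc-b", "192.0.2.10/32")
	fmt.Println(ads.announced()) // one prefix, although two services share it

	ads.remove("default/svc-a")
	fmt.Println(ads.announced()) // still announced: svc-b keeps it alive

	ads.remove("default/svc-b")
	fmt.Println(ads.announced()) // empty: the route would be withdrawn
}
```

In this model, skipping `remove` for one of the two services (as an external refcount would do while the count is still above zero) leaves that service's entry behind, so the prefix stays announced even after both services have lost their local endpoints.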
Okay, the fix here is simply to move the refcounting into internal/layer2, and unconditionally SetBalancer in the main speaker code. It's late here, so I'll write that patch tomorrow and release 0.7.3 with the fix. Thanks for getting all the way to the root cause! |
BGP mode does its own "refcounting" through announcement deduplication. Fixes #295
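The fix discussed above moves the refcounting into the layer2 code, so the main speaker code can call set and delete unconditionally. A rough sketch of what such protocol-internal refcounting could look like follows. The `layer2Announcer` type, its fields and the method signatures are assumptions made for this example; only the names `SetBalancer` and `DeleteBalancer` come from the discussion, and their real signatures in MetalLB may differ.

```go
package main

import "fmt"

// layer2Announcer is a hypothetical announcer that keeps answering for an IP
// as long as at least one service references it.
type layer2Announcer struct {
	refs map[string]map[string]bool // IP -> set of service names using it
}

func newLayer2Announcer() *layer2Announcer {
	return &layer2Announcer{refs: map[string]map[string]bool{}}
}

// SetBalancer registers service as a user of ip. Calling it again for the
// same service changes nothing, so repeated updates do not inflate the count.
func (l *layer2Announcer) SetBalancer(service, ip string) {
	if l.refs[ip] == nil {
		l.refs[ip] = map[string]bool{}
	}
	l.refs[ip][service] = true
}

// DeleteBalancer drops service's reference to ip and reports whether the IP
// is still announced afterwards. The caller does not need its own refcount.
func (l *layer2Announcer) DeleteBalancer(service, ip string) bool {
	delete(l.refs[ip], service) // deleting from a nil map is a no-op in Go
	if len(l.refs[ip]) == 0 {
		delete(l.refs, ip)
		return false
	}
	return true
}

// Announcing reports whether ip currently has at least one user.
func (l *layer2Announcer) Announcing(ip string) bool {
	return len(l.refs[ip]) > 0
}

func main() {
	l2 := newLayer2Announcer()
	l2.SetBalancer("default/svc-a", "192.0.2.10")
	l2.SetBalancer("default/svc-b", "192.0.2.10")

	// svc-b still holds the IP after svc-a is deleted.
	fmt.Println(l2.DeleteBalancer("default/svc-a", "192.0.2.10"))
	// The last user is gone, so the IP is no longer announced.
	fmt.Println(l2.DeleteBalancer("default/svc-b", "192.0.2.10"))
}
```

Keeping the count inside the announcer means each protocol decides for itself when an IP really goes away: layer2 by counting users, BGP by deduplicating its per-service announcements.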
Is this a bug report or a feature request?:
bug
What happened:
Created a deployment with a scale of 1 and two services with a shared IP.
Initially the BGP announcements are correct, but if I set the scale to 2 and then back to 1, the routes from the second node are not withdrawn.
What you expected to happen:
The same behaviour as with non-shared IPs: the routes should be withdrawn if there are no local endpoints.
How to reproduce it (as minimally and precisely as possible):
(needs at least 2 nodes)
Create deployment and two services which share the same IP.
Set the scale to 2.
Set the scale to 1.
Check the BGP peer.
Anything else we need to know?:
When you kill the speaker container on the node that falsely announces the prefixes, the problem is temporarily fixed.
Log of the affected speaker container:
announce:
withdraw:
Deployment:
Service 1:
Service 2:
Environment:
uname -a