local-port annotation not respected by Cilium BGP #24737
Comments
I have reproduced this bug with cilium-agent built from commit 7755f39. When
level=info msg="type:STATE peer:{conf:{local_asn:65341 neighbor_address:\"172.18.0.2\" peer_asn:65340} state:{local_asn:65341 neighbor_address:\"172.18.0.2\" peer_asn:65340 session_state:ESTABLISHED router_id:\"172.18.0.2\"} transport:{local_address:\"172.18.0.3\" local_port:36577 remote_port:179}}"
When the node annotation is
level=info msg="Cilium BGP Control Plane Controller woken for reconciliation" component=Controller.Run subsys=bgp-control-plane
level=debug msg="Successfully listed CiliumBGPPeeringPolicies" component=Controller.Reconcile count=2 subsys=bgp-control-plane
level=debug msg="Comparing BGP policy node selector with node's labels" component=PolicySelection nodeLabels="beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=kind-control-plane,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node.kubernetes.io/exclude-from-external-load-balancers=" policyNodeSelector="kubernetes.io/hostname=kind-worker" subsys=bgp-control-plane
level=debug msg="Comparing BGP policy node selector with node's labels" component=PolicySelection nodeLabels="beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=kind-control-plane,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node.kubernetes.io/exclude-from-external-load-balancers=" policyNodeSelector="kubernetes.io/hostname=kind-control-plane" subsys=bgp-control-plane
level=debug msg="Asking configured BGPRouterManager to configure peering" component=Controller.Reconcile subsys=bgp-control-plane
level=debug msg="Reconciling new CiliumBGPPeeringPolicy" component=manager.ConfigurePeers diff="Registering: [65340] Withdrawing: [] Reconciling: []" subsys=bgp-control-plane
level=info msg="Registering BGP servers for policy with local ASN 65340" component=manager.registerBGPServer subsys=bgp-control-plane
level=debug msg="Preflight for virtual router with ASN 65340 not necessary, first instantiation of this BgpServer." component=manager.preflightReconciler subsys=bgp-control-plane
level=debug msg="Begin reconciling peers for virtual router with local ASN 65340" component=manager.neighborReconciler subsys=bgp-control-plane
level=info msg="Reconciling peers for virtual router with local ASN 65340" component=manager.neighborReconciler subsys=bgp-control-plane
level=info msg="Adding peer 172.18.0.3/32 65341 to local ASN 65340" component=manager.neighborReconciler subsys=bgp-control-plane
level=info msg="Add a peer configuration" Key=172.18.0.3 Topic=Peer asn=65340 component=gobgp.BgpServerInstance subsys=bgp-control-plane
level=info msg="Done reconciling peers for virtual router with local ASN 65340" component=manager.neighborReconciler subsys=bgp-control-plane
level=debug msg="Begin reconciling pod CIDR advertisements for virtual router with local ASN 65340" component=manager.exportPodCIDRReconciler subsys=bgp-control-plane
level=debug msg="pod CIDR advertisements disabled for virtual router with local ASN 65340" component=manager.exportPodCIDRReconciler subsys=bgp-control-plane
level=info msg="Successfully registered GoBGP servers for policy with local ASN 65340" component=manager.registerBGPServer subsys=bgp-control-plane
level=debug msg="Successfully completed reconciliation" component=Controller.Run subsys=bgp-control-plane
level=debug msg="IdleHoldTimer expired" Duration=0 Key=172.18.0.3 Topic=Peer asn=65340 component=gobgp.BgpServerInstance subsys=bgp-control-plane
level=debug msg="state changed" Key=172.18.0.3 Topic=Peer asn=65340 component=gobgp.BgpServerInstance new=BGP_FSM_ACTIVE old=BGP_FSM_IDLE reason=idle-hold-timer-expired subsys=bgp-control-plane
level=info msg="type:STATE peer:{conf:{local_asn:65340 neighbor_address:\"172.18.0.3\" peer_asn:65341} state:{local_asn:65340 neighbor_address:\"172.18.0.3\" peer_asn:65341 session_state:IDLE router_id:\"<nil>\"} transport:{local_address:\"<nil>\"}}"
level=info msg="type:STATE peer:{conf:{local_asn:65340 neighbor_address:\"172.18.0.3\" peer_asn:65341} state:{local_asn:65340 neighbor_address:\"172.18.0.3\" peer_asn:65341 session_state:ACTIVE router_id:\"<nil>\"} transport:{local_address:\"<nil>\"}}"
This is because the speaker keeps using port 179 to connect to the configured peer, but the peer refuses the connection since it is listening on 42424:
level=debug msg="try to connect" Key=172.18.0.3 Topic=Peer asn=65340 component=gobgp.BgpServerInstance subsys=bgp-control-plane
level=debug msg="failed to connect" Error="dial tcp 0.0.0.0:0->172.18.0.3:179: connect: connection refused" Key=172.18.0.3 Topic=Peer asn=65340 component=gobgp.BgpServerInstance subsys=bgp-control-plane
It appears that CiliumBGPNeighbor requires a field to set the neighbor port when the neighbor listens on a port other than 179. Thoughts @squeed @christarazi?
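To make the proposal above concrete, here is a minimal Python sketch of the behaviour being asked for: dial the well-known BGP port 179 unless the neighbor configuration names a different port. The `Neighbor` class, the `peer_port` field, and `dial_target` are hypothetical names made up for illustration. They are not Cilium's `CiliumBGPNeighbor` type or any existing Cilium or GoBGP API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DEFAULT_BGP_PORT = 179  # well-known TCP port for BGP


@dataclass
class Neighbor:
    """Hypothetical neighbor model; not the real CiliumBGPNeighbor type."""
    address: str
    peer_asn: int
    peer_port: Optional[int] = None  # hypothetical optional field


def dial_target(neighbor: Neighbor) -> Tuple[str, int]:
    """Return the (address, port) pair a speaker would dial for this neighbor."""
    port = neighbor.peer_port if neighbor.peer_port is not None else DEFAULT_BGP_PORT
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    return (neighbor.address, port)


# Without the field, the speaker dials 179, which is refused in the logs above.
print(dial_target(Neighbor("172.18.0.3", 65341)))
# With the field set, it would dial the port the peer actually listens on.
print(dial_target(Neighbor("172.18.0.3", 65341, peer_port=42424)))
```

Keeping the field optional means existing policies that omit it would still dial port 179.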
@danehans, I think your issue is different from the original one. I have no objection to adding such a configuration knob. Please feel free to open an issue or submit a PR.
@YutaroHayakawa thanks for the feedback. I created ^ to track the issue that I discovered while triaging this issue.
I've been able to reproduce this issue by having one speaker use port
Version:
$ kubectl exec po/cilium-t5qwd -c cilium-agent -n kube-system -- cilium version
Client: 1.13.1 a6be57eb 2023-03-15T19:39:01+01:00 go version go1.19.6 linux/arm64
Daemon: 1.13.1 a6be57eb 2023-03-15T19:39:01+01:00 go version go1.19.6 linux/arm64
Node annotations:
$ kubectl get node/kind-worker -o yaml | grep bgp-virtual-router
cilium.io/bgp-virtual-router.65341: local-port=42424,router-id=172.18.0.3
$ kubectl get node/kind-control-plane -o yaml | grep bgp-virtual-router
cilium.io/bgp-virtual-router.65340: local-port=179,router-id=172.18.0.2
Since node
Logs from the cilium-agent running on node
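For readers unfamiliar with the annotation format shown in the `kubectl` output above, this Python sketch splits a `cilium.io/bgp-virtual-router.<ASN>` value into its comma-separated `key=value` attributes. The function name is made up for illustration, and the parsing rules are inferred only from the two annotation values quoted in this thread; this is not Cilium's own parser.

```python
def parse_virtual_router_annotation(value: str) -> dict:
    """Split 'k1=v1,k2=v2' into a dict; raise ValueError on malformed items."""
    attrs = {}
    for item in value.split(","):
        key, sep, val = item.strip().partition("=")
        if not sep or not key:
            raise ValueError(f"malformed attribute: {item!r}")
        attrs[key] = val
    return attrs


# Values taken from the kind-worker and kind-control-plane annotations above.
worker = parse_virtual_router_annotation("local-port=42424,router-id=172.18.0.3")
control_plane = parse_virtual_router_annotation("local-port=179,router-id=172.18.0.2")
print(int(worker["local-port"]), int(control_plane["local-port"]))  # 42424 179
```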
I see the same behavior as ^ using cilium-agent built from commit c8598f8.
The behaviour I mentioned above is fixed in 1.13.3 and with the introduction of #24914. However, the configured port is opened on the node while cilium-agent still uses a random port, and that problem is still present in 1.13.3.
@YutaroHayakawa I have reproduced the issue and now have a more complete understanding of the situation. The Logs from my reproducer:
/assign
Is there an existing issue for this?
What happened?
When using node annotations to set the local-port, these values do not seem to be respected by Cilium. Looks similar to #23155.
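One way to check for the symptom described here is to pull `local_port` out of a GoBGP `type:STATE` log line and compare it with the annotated `local-port`. The Python sketch below does that for the transport portion of a STATE line quoted elsewhere in this thread. The helper names are made up for illustration, and the regular expression assumes the log shape seen in this thread. A mismatch is only a hint: the port in the log may simply be the ephemeral source port of an outbound connection.

```python
import re

# Matches the local_port field inside the transport:{...} section of a STATE line.
STATE_LOCAL_PORT = re.compile(r"transport:\{[^}]*\blocal_port:(\d+)")


def session_local_port(log_line):
    """Return the session's local port from a STATE log line, or None if absent."""
    match = STATE_LOCAL_PORT.search(log_line)
    return int(match.group(1)) if match else None


def port_matches(log_line, annotated_port):
    """True only if the log line reports a local port equal to the annotated one."""
    port = session_local_port(log_line)
    return port is not None and port == annotated_port


# Transport portion of the ESTABLISHED line quoted in the comments.
fragment = r'transport:{local_address:\"172.18.0.3\" local_port:36577 remote_port:179}'
print(session_local_port(fragment))   # 36577
print(port_matches(fragment, 42424))  # False
```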
Given these annotations:
In the Cilium logs a different port is selected by the VirtualRouter:
The router gives the following error when connecting on the port annotated on the node:
Cilium Version
cilium-cli: v0.13.2 compiled with go1.20.2 on linux/amd64
cilium image (default): v1.13.1
cilium image (stable): v1.13.1
cilium image (running): v1.13.1
Kernel Version
Linux steamroller2 5.15.0-69-generic #76-Ubuntu SMP Fri Mar 17 17:19:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.7+k3s1", GitCommit:"f7c20e237d0ad0eae83c1ce60d490da70dbddc0e", GitTreeState:"clean", BuildDate:"2023-03-10T22:16:07Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.7+k3s1", GitCommit:"f7c20e237d0ad0eae83c1ce60d490da70dbddc0e", GitTreeState:"clean", BuildDate:"2023-03-10T22:16:07Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
Sysdump
No response
Relevant log output
Anything else?
No response
Code of Conduct