Allow setting passive mode for BGP peers #1603

Open
danderson opened this issue Jan 15, 2018 · 12 comments

Comments

@danderson

In metallb/metallb#114, I explored how to make my k8s BGP load-balancer interoperate gracefully with Calico clusters that peer with external BGP routers. I've documented my findings at https://master--metallb.netlify.com/configuration/calico/ and metallb/metallb#114 (comment)

Current Behavior

In my setup, I'm trying to peer Calico with another BGP speaker running on localhost. The peer does not listen on any ports, so Calico should just wait for an incoming session. Currently, there is no way to tell Calico to treat a peer passively, so Calico always eagerly tries to connect to 127.0.0.1:179... which is itself. This causes repeated session establishment failures, and BIRD goes into error backoff. That makes it increasingly hard, and eventually impossible, for the real peer to connect: there is only a short window of a few seconds when the error backoff resets, before the failed connection attempts force BIRD back into backoff.

Expected Behavior

Calico should have a way to specify that a bgpPeer is passive, i.e. Calico should not try to connect to it, but instead just wait for a matching incoming connection.

BIRD supports this, with the passive keyword. It's just not plumbed into the bgpPeer object.

Context

I am trying to make Calico and MetalLB integrate nicely with each other, by setting up a BGP topology like the one I documented for Romana integration. Basically, I want Calico to peer with the outside world, but also with another node agent that pushes routes into Calico for redistribution.

Setting up BGP sessions to/from localhost is notoriously tricky, but with the right set of options, it's possible. Lack of passive mode is one problem I encountered with Calico.

Your Environment

  • Calico version: 2.6.3
  • Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes 1.9.1
  • Operating System and version: Debian testing
  • Link to your project (optional): https://github.com/google/metallb
@caseydavenport
Member

Exposing the passive keyword on the BGPPeer resource should be straightforward and seems sensible enough to me.

We'd need to:

@stevegaossou
Contributor

stevegaossou commented Mar 6, 2019

Hey @danderson, we're revisiting this issue and trying to reproduce the problem (i.e. error backoff due to repeated session establishment failures) using the latest versions of Calico and MetalLB with the minikube setup from the MetalLB tutorial.

The goal is to reproduce this problem first as a validation step before getting the "passive" mode added to bgp peer.

However, using the setup below, I wasn't able to reproduce the error backoff that leads to difficulty with real peer connection establishment.

Your Environment

  • Calico version: v3.5.2
  • Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes v1.13.3 using kubeadm
  • Operating System and version: minikube v0.34.1 on Darwin 18.2.0

Cluster Setup

  • single node cluster
  • test-bgp-router (w/ BIRD and Quagga)
  • calico-node
    • with peering to BIRD / Quagga
    • with peering to 127.0.0.1
  • metallb controller
  • metallb speaker
    • with peering to 127.0.0.1

The sequence for the setup was:

  • add test routers (BIRD/Quagga)
  • add calico-node
  • peer calico-node with the routers
  • metallb controller / speaker (no config yet)
  • peer calico-node with 127.0.0.1
  • peer metallb speaker with 127.0.0.1

Before configuring the metallb speaker to peer with 127.0.0.1, I had calico-node peer with 127.0.0.1 to simulate the problem of calico-node peering with itself, i.e. "repeated session establishment failures".

However, I couldn't seem to reproduce an error backoff in the calico-node. Connection retries were evident from the calico-node logs:

...
bird: BGP: Unexpected connect from unknown address 10.0.2.15 (port 10506)
bird: BGP: Unexpected connect from unknown address 10.0.2.15 (port 12421)
2019-03-06 08:31:15.638 [INFO][43] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}
bird: BGP: Unexpected connect from unknown address 10.0.2.15 (port 1489)
2019-03-06 08:31:17.096 [INFO][43] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}
bird: BGP: Unexpected connect from unknown address 10.0.2.15 (port 22676)
...

There was no indication in the logs, though, that BIRD had entered error backoff.

Afterwards, initiating a peering from the metallb speaker resulted in an established connection with calico-node without any issues.

> calicoctl node status
Calico process is running.

IPv4 BGP status
+--------------+---------------+-------+----------+-------------+
| PEER ADDRESS |   PEER TYPE   | STATE |  SINCE   |    INFO     |
+--------------+---------------+-------+----------+-------------+
| 10.96.0.100  | node specific | up    | 07:53:36 | Established |
| 10.96.0.101  | node specific | up    | 07:58:38 | Established |
| 127.0.0.1    | node specific | up    | 08:40:12 | Established |
+--------------+---------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

Let me know if I'm missing something important in reproducing this.

@stevegaossou
Contributor

Just a heads up to anyone reading. The most recent relevant discussion is here:
metallb/metallb#114 (comment)

@kfox1111

kfox1111 commented Mar 8, 2020

Any updates on this?

@demonsked

Really need that functionality.

@psavva

psavva commented Jun 5, 2020

I am also interested in this.
+1 from me

@sstubbs

sstubbs commented Sep 4, 2020

I would also like to see this issue resolved.

@caseydavenport
Member

Still looking for someone to work on this! Would love to review.

The first PR to add the new configuration option would look similar to this one: https://github.com/projectcalico/libcalico-go/pull/1262/files

@darkrift

Looking at #160, projectcalico/libcalico-go#886 and metallb/metallb#114 it seems there is some way to make this work.

Is this still an issue?

@caseydavenport
Member

@darkrift correct - we've implemented an integration with MetalLB that bypasses the need for explicitly setting passive mode on the BGP peers. So, this issue doesn't block MetalLB integration any more.

We don't yet have an option to set passive mode explicitly per-peer, but the use-case that this was meant to cover works now without it. I've left it open for now in case anyone wants to tackle adding this for another use-case, but the original MetalLB scenario is fixed 👍

@cyclinder
Contributor

@caseydavenport I am interested in this, but I need to know what else needs to be done here. Can you tell me? Thanks.

@caseydavenport
Member

@cyclinder as far as implementing a passive mode option goes, this PR is of a similar type and shows roughly what the change would entail: #5736

Summary is probably something like this:

  • Update BGPPeer object with a new field to allow setting passive mode (e.g., connectMode: Passive | Active, would need some agreement on the name but shouldn't be too controversial)
  • Plumb the new field through to confd (responsible for writing BIRD templates)
  • Update bird templates to read the new field on each peer, and configure passive on if told to do so.
  • Add a test or two
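
As a rough illustration of the first two bullets, the Go sketch below shows one way the new field could be typed, validated, and reduced to the boolean a BIRD template would consume. Everything here is an assumption for discussion: `ConnectMode`, `BGPPeerSpec`, and the helper methods are hypothetical names following the `connectMode: Passive | Active` suggestion, not the real Calico API or an agreed design.

```go
package main

import "fmt"

// ConnectMode is a hypothetical field type. The name mirrors the
// "connectMode: Passive | Active" suggestion and is not an agreed API.
type ConnectMode string

const (
	ConnectModeActive  ConnectMode = "Active"
	ConnectModePassive ConnectMode = "Passive"
)

// BGPPeerSpec is a trimmed-down stand-in for the real resource spec,
// holding only the fields this sketch needs.
type BGPPeerSpec struct {
	PeerIP      string
	ASNumber    uint32
	ConnectMode ConnectMode // empty is treated as "keep existing behaviour"
}

// validate rejects unknown connectMode values. An empty value is allowed so
// that existing BGPPeer objects stay valid after the field is introduced.
func (s BGPPeerSpec) validate() error {
	switch s.ConnectMode {
	case "", ConnectModeActive, ConnectModePassive:
		return nil
	default:
		return fmt.Errorf("invalid connectMode %q: must be %q or %q",
			s.ConnectMode, ConnectModeActive, ConnectModePassive)
	}
}

// passive reports whether the generated BIRD config for this peer should
// include the passive option. Only an explicit "Passive" turns it on.
func (s BGPPeerSpec) passive() bool {
	return s.ConnectMode == ConnectModePassive
}

func main() {
	specs := []BGPPeerSpec{
		{PeerIP: "127.0.0.1", ASNumber: 64512, ConnectMode: ConnectModePassive},
		{PeerIP: "10.96.0.100", ASNumber: 64512}, // field unset
		{PeerIP: "10.96.0.101", ASNumber: 64512, ConnectMode: "Sometimes"},
	}
	for _, s := range specs {
		if err := s.validate(); err != nil {
			fmt.Println(s.PeerIP, "rejected:", err)
			continue
		}
		fmt.Println(s.PeerIP, "passive:", s.passive())
	}
}
```

Treating the empty value as non-passive keeps the change backwards compatible, which is also a natural first case for the "test or two" in the last bullet.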
