Use unidirectional BGP connections in the node-to-node mesh #79

Merged
fasaxc merged 2 commits into projectcalico:master from fasaxc:passive-mesh on Oct 15, 2018

@fasaxc
Member

fasaxc commented Oct 12, 2018

Description

Works around a problem with BGP graceful restart:
if a node is gracefully restarting and one of its peers opens a connection
(because the peer's connection retry timer fires) before the restart has
completed, the graceful restart is aborted, which causes a route flap.

With this change, each BGP connection in the mesh is opened in one direction only.
When a node peers with another node whose IP address is (lexicographically)
lesser, it makes its side of the connection passive, so only the node with
the lesser address initiates the connection.
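
The rule above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the template code in this PR; `is_passive` and `mesh_roles` are hypothetical names, and the sketch assumes "lexicographically" means a plain string comparison of the addresses.

```python
from itertools import permutations


def is_passive(local_ip: str, peer_ip: str) -> bool:
    """Return True if the local node should wait for the peer to connect.

    Assumption: addresses are compared as plain strings (lexicographically),
    so the node whose peer has the lesser address takes the passive role.
    """
    return peer_ip < local_ip


def mesh_roles(node_ips):
    """Map every ordered (local, peer) pair in a full mesh to its role."""
    roles = {}
    for local_ip, peer_ip in permutations(node_ips, 2):
        roles[(local_ip, peer_ip)] = (
            "passive" if is_passive(local_ip, peer_ip) else "active"
        )
    return roles


roles = mesh_roles(["10.0.0.1", "10.0.0.2", "10.0.0.10"])
```

For distinct addresses, exactly one side of each peering comes out passive. Note that a string comparison is not numeric: here `"10.0.0.10"` sorts before `"10.0.0.2"`, which is fine because the rule only needs both ends to agree on the ordering.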

Since we're now relying on the connection timer, this change also tunes the
timer values, because the defaults are very high (minutes).
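
As a rough sketch of how the passive flag and the tuned timers might appear in a generated BIRD peer stanza, the hypothetical helper below renders one mesh peering. The option names (`passive`, `connect delay time`, `connect retry time`) are BIRD 1.x BGP options as I understand them and should be checked against the BIRD documentation; the timer values, the `Mesh_` naming, and `render_mesh_peer` itself are placeholders for illustration, not the values or code in this PR.

```python
def render_mesh_peer(local_ip, peer_ip, as_number,
                     connect_delay_s=2, connect_retry_s=5):
    """Render an illustrative BIRD-style stanza for one mesh peering.

    Assumptions: timer defaults are placeholder values, and the passive
    side is chosen by plain string comparison of the two addresses.
    """
    name = "Mesh_" + peer_ip.replace(".", "_")
    lines = [
        f"protocol bgp {name} {{",
        f"  local as {as_number};",
        f"  neighbor {peer_ip} as {as_number};",
        f"  source address {local_ip};",
        # Shortened timers so the active side reconnects quickly.
        f"  connect delay time {connect_delay_s};",
        f"  connect retry time {connect_retry_s};",
    ]
    if peer_ip < local_ip:
        # Peer has the lesser address: listen only, never dial out.
        lines.append("  passive on;")
    lines.append("}")
    return "\n".join(lines)


active_side = render_mesh_peer("10.0.0.1", "10.0.0.2", 64512)
passive_side = render_mesh_peer("10.0.0.2", "10.0.0.1", 64512)
```

Only `passive_side` contains the `passive on;` line; both stanzas carry the shortened timers, which is a simplification made for this sketch.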

Todos

  • Tests
  • Documentation
  • Release note

Release Note

To avoid BGP connection collisions, which can cause graceful restart to fail, Calico now marks one peer in each node-to-node mesh peering as passive. Passive nodes listen for inbound BGP connections but do not initiate outbound BGP connections.

@fasaxc fasaxc requested a review from caseydavenport Oct 12, 2018

fasaxc added some commits Oct 12, 2018

Use unidirectional BGP connections in the node-to-node mesh.

Tigera issue number: CNX-4867.

@fasaxc fasaxc force-pushed the fasaxc:passive-mesh branch from 968e129 to 21b6641 Oct 15, 2018

@neiljerram
Member

neiljerram left a comment

This looks good to me. I'd just note that we won't get the benefit of this if someone has configured explicit peerings (e.g. with selectors) between the Calico nodes, and it would be nice if we could have unidirectionality in that case too (when we know that both ends of a peering are Calico nodes).

@fasaxc fasaxc merged commit b6c3690 into projectcalico:master Oct 15, 2018

2 checks passed

license/cla Contributor License Agreement is signed.
semaphoreci The build passed on Semaphore.

@caseydavenport caseydavenport added this to the Calico v3.4.0 milestone Oct 15, 2018

@fasaxc fasaxc referenced this pull request Oct 22, 2018

Merged

Port the unidirectional mesh changes to bird6.cfg. #86

0 of 3 tasks complete