bgp: fix detection of new BGP sessions #6606
Fix a bug introduced in the refactoring from batfish#6251, where some routes would not be exchanged on BGP links that come up after the snapshot starts. This is fairly rare in datacenters, because BGP links in datacenters are usually eBGP single-hop.

When sending routes to new links, we need to use the main RIB from the previous round rather than the current RIB: depending on the iteration order of the current round, the current RIB may include routes that were not yet advertised in the previous round. This adds some memory overhead, but pointers to the routes are much smaller than the routes themselves, so this should be less than 10%.

Add a test.
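
To illustrate the iteration-order concern described above, here is a minimal toy sketch in Python. It is not Batfish's implementation: the `Node` class, its methods, and `routes_for_new_session` are hypothetical names invented for this example, and routes are modeled as plain prefix strings. It only shows why advertising from a snapshot taken at the start of the round gives the same result regardless of whether the node has already merged the current round's routes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy routing node (hypothetical; not a Batfish class)."""
    main_rib: set = field(default_factory=set)
    prev_main_rib: frozenset = frozenset()

    def start_round(self):
        # Snapshot the main RIB as it stood at the end of the previous round.
        # The snapshot holds references to existing route objects, not copies.
        self.prev_main_rib = frozenset(self.main_rib)

    def merge(self, routes):
        # Merge routes learned during the current round into the main RIB.
        self.main_rib |= set(routes)

    def initial_advertisement(self, use_snapshot=True):
        # Routes to send when a session comes up during the current round.
        if use_snapshot:
            return set(self.prev_main_rib)
        return set(self.main_rib)


def routes_for_new_session(merge_first, use_snapshot):
    """Simulate one round in which a new session appears.

    merge_first models iteration order: whether this node merges the
    current round's routes before or after the new session is served.
    """
    node = Node(main_rib={"10.0.0.0/24"})
    node.start_round()
    if merge_first:
        node.merge({"10.0.1.0/24"})
    sent = node.initial_advertisement(use_snapshot)
    if not merge_first:
        node.merge({"10.0.1.0/24"})
    return sent


# With the snapshot, both orders send the same set of routes.
print(routes_for_new_session(True, True) == routes_for_new_session(False, True))
# Without it, the result depends on iteration order.
print(routes_for_new_session(True, False) == routes_for_new_session(False, False))
```

In this toy model the snapshot costs one extra set of references per node, which is the kind of overhead the description refers to; the actual figure in Batfish depends on its own data structures.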
Codecov Report
@@ Coverage Diff @@
## master #6606 +/- ##
=========================================
Coverage 73.41% 73.42%
- Complexity 35826 35830 +4
=========================================
Files 2842 2842
Lines 144595 144615 +20
Branches 17498 17504 +6
=========================================
+ Hits 106157 106178 +21
+ Misses 30021 30011 -10
- Partials 8417 8426 +9
Reviewable status: 0 of 14 files reviewed, 1 unresolved discussion (waiting on @corinaminer and @dhalperi)
a discussion (no related file):
(haven't looked at the code yet, but I have some questions about the description)
> depending on iteration order of the current round the current RIB may include
> routes that were not yet advertised in the previous round
This sounds like we need to advertise the current RIB, but you say we need to advertise the previous round's main RIB. Can you clarify that for me?
Also, I'm curious what kind of variance there is in the "iteration order". Is it possible to fix that order, and would doing so allow us to avoid this overhead?
Reviewed 14 of 14 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @corinaminer and @dhalperi)
* make invariants more clear and enforced in only one place
* not track the previous entire RIB unless necessary

The main thing missing was to defer processing of external advertisements to the first round of BGP instead of doing it in the neutral zone between IGP and EGP computations.
Reviewable status: 6 of 8 files reviewed, 1 unresolved discussion (waiting on @anothermattbrown and @corinaminer)
a discussion (no related file):
Revised and added invariants per discussion.
Reviewed 2 of 8 files at r2.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @corinaminer)