Improve gratuitous ARP compatibility #172

Closed
danderson opened this Issue Feb 28, 2018 · 0 comments

danderson commented Feb 28, 2018

In #123, @mleklund pointed to https://ihrachyshka.com/2017/05/26/the-failure-part-4-summary/ , a blog series from OpenStack folks on debugging problems with their ARP-based failover implementation. They found a whole bunch of things, and some of them are relevant to our interests.

Short version: Linux kernels before ~4.11 have a whole set of different bugs that break gratuitous ARP in various ways. Kernels after that have been fixed, so the gratuitous ARP that MetalLB is doing right now should work perfectly, and update all ARP cache entries within 1-2s of failover.

To improve compatibility in the face of buggy kernels, we can do two things:

  • Broadcast gratuitous ARP requests in addition to responses, so that buggy clients with arp_accept=1 update correctly, and non-Linux kernels that selectively implement the RFC update as well.
  • Set the inter-frame broadcast delay to 1.1s, so that if packet 1 lands in the middle of the ARP cache's locktime, subsequent packets only arrive after the locktime expires.

This bug is to track those changes.