design: BGP FRR integration #832

Merged on Jul 21, 2021 (1 commit)

markdgray
Contributor

This design document discusses integration of FRR as an
alternative BGP agent implementation for MetalLB.

Signed-off-by: Mark Gray mark.d.gray@redhat.com

@champtar
Contributor

champtar commented Apr 7, 2021

Hi @markdgray

I went through the document quickly. For me, this integration is a great occasion to create a separate BGP controller instead of keeping the idea of a BGP speaker around:

  1. you don't need to run FRR/BGP on each node
  2. you don't need hostNetwork=true
  3. you don't need the added Linux capabilities

Having some BGP controllers running only on the control-plane would be great for security

My 2 cents

@markdgray
Contributor Author

markdgray commented Apr 16, 2021

I have added a section to discuss evaluation of alternative routing stacks and the motivation for selecting FRR. I also updated the "non-goals" appropriately.

Member

@johananl johananl left a comment

Hey there @markdgray.

First of all, let me say this is an excellent design proposal 👏 It's readable, well-rationalized and makes a lot of sense. I'd say we can use it as a template for future proposals.

Second, your overview of the MetalLB codebase looks great. I'd consider converting this part of the proposal to a developer-oriented document. Would love to get a PR for that if you're up for it.

Overall I like the proposal a lot. I do have some concerns and reservations which I've expressed as inline comments. My biggest concern is the lack of strategy around long-term maintenance of the BGP stacks. My experience tells me we will very quickly abandon the old stack because FRR has more features compared to the native stack. This could be a good thing, but I think we should be more explicit about our intentions here.

Happy to hear your thoughts, and would love to see more from you in this project 🙂

design/0001-frr.md
BGP. There is an experimental gRPC interface. This interface may need to be
productionized through the FRR community. When this has been satisfactorily
achieved, we can start [Story 3](#story-3). Until that time, in order to
mitigate this risk, we can wrap the FRR ‘vtysh’ command line interface in Go
Member

Does wrapping vtysh entail automating an interactive CLI using Go code? This sounds fragile to me. I'd opt for updating a config file if FRR supports live config reloading until we have e.g. gRPC support.

@markdgray
Contributor Author

markdgray commented May 4, 2021

I wasn't considering wrapping the interactive vtysh shell, but rather sending commands using vtysh, which should be less fragile, i.e. vtysh -c "command".

Using a config file is an option. There are some limitations with that that would be preferable to address if we were to go this route. (FRRouting/frr#2128)
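
To make the non-interactive approach concrete, here is a minimal Go sketch of wrapping `vtysh -c`. It relies on vtysh accepting several `-c` flags in one invocation. The helper names `vtyshArgs` and `runVtysh`, the ASN 64512 and the peer address 10.0.0.1 are assumptions made up for illustration; none of them come from the design document.

```go
package main

import (
	"fmt"
	"os/exec"
)

// vtyshArgs turns a list of FRR CLI commands into the argument list for a
// single non-interactive vtysh invocation: each command gets its own -c flag.
func vtyshArgs(commands []string) []string {
	args := make([]string, 0, 2*len(commands))
	for _, c := range commands {
		args = append(args, "-c", c)
	}
	return args
}

// runVtysh executes the commands and returns vtysh's combined output.
// It assumes the vtysh binary is on PATH and can reach the FRR daemons.
func runVtysh(commands []string) (string, error) {
	out, err := exec.Command("vtysh", vtyshArgs(commands)...).CombinedOutput()
	if err != nil {
		return string(out), fmt.Errorf("vtysh failed: %w", err)
	}
	return string(out), nil
}

func main() {
	// Hypothetical peer: the ASN and address below are made-up values.
	cmds := []string{
		"configure terminal",
		"router bgp 64512",
		"neighbor 10.0.0.1 remote-as 64512",
	}
	fmt.Println(vtyshArgs(cmds))
	// runVtysh(cmds) would apply them on a host where FRR is running.
}
```

Keeping the argument construction separate from the process execution lets the former be unit-tested without FRR installed.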

Contributor

@markdgray oh, good point. If the config file option is limited (people asking for the reload script to be smarter) and this is already a risk mitigation option, IMHO it sounds ok to go with the vtysh -c cmd option.

I don't have experience with either of these options, so while I think maybe cfg file seems simpler (no "state" to keep, probably simpler to reproduce getting to some broken state, etc.), I'm ok if you think vtysh is the way to go here.
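
For comparison, the config-file option could look roughly like the following sketch: the desired state is rendered into an FRR configuration fragment on every change, so no per-command state has to be tracked. The types `peer` and `bgpConfig`, the function `renderFRRConfig` and all addresses and ASNs are hypothetical. How the rendered file is then applied to a running FRR (for instance with the reload script mentioned above) is deliberately left out.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// peer and bgpConfig are illustrative types, not MetalLB's real ones.
type peer struct {
	Addr string
	ASN  uint32
}

type bgpConfig struct {
	LocalASN uint32
	Peers    []peer
}

// frrTemplate renders a minimal "router bgp" stanza with one neighbor
// statement per peer.
const frrTemplate = `router bgp {{.LocalASN}}
{{- range .Peers}}
 neighbor {{.Addr}} remote-as {{.ASN}}
{{- end}}
`

// renderFRRConfig produces the complete desired configuration as text.
func renderFRRConfig(cfg bgpConfig) (string, error) {
	tmpl, err := template.New("frr").Parse(frrTemplate)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, cfg); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderFRRConfig(bgpConfig{
		LocalASN: 64512,
		Peers:    []peer{{Addr: "10.0.0.1", ASN: 64512}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Because the whole file is regenerated each time, a broken state can be reproduced from the file alone, which is the simplicity argument made in this thread.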

Contributor

btw, @markdgray why did you mark this as resolved? I'm not sure I follow the reasoning here.

Contributor Author

Shall I update the document to state the preferred method (e.g. vtysh -c cmd)?

Contributor

LGTM. @johananl ?

@markdgray
Contributor Author

Hey there @markdgray.

First of all, let me say this is an excellent design proposal 👏 It's readable, well-rationalized and makes a lot of sense. I'd say we can use it as a template for future proposals.

Great!

Second, your overview of the MetalLB codebase looks great. I'd consider converting this part of the proposal to a developer-oriented document. Would love to get a PR for that if you're up for it.

I was originally planning something like that so I can refactor the document as you have suggested.

Overall I like the proposal a lot. I do have some concerns and reservations which I've expressed as inline comments. My biggest concern is the lack of strategy around long-term maintenance of the BGP stacks. My experience tells me we will very quickly abandon the old stack because FRR has more features compared to the native stack. This could be a good thing, but I think we should be more explicit about our intentions here.

I will look through these comments and address them in the next couple of days or maybe next week.

Happy to hear your thoughts, and would love to see more from you in this project 🙂

Thanks @johananl !

@champtar
Contributor

champtar commented May 4, 2021

I haven't taken the time to read the proposal in full, so sorry if the answer is already in the proposal (I just did a quick search).
One possible behavior change is that the current BGP code doesn't listen on port 179 on the host, making it possible to run alongside Calico in BGP mode (the default, if I remember correctly).
Will FRR listen on port 179?

@markdgray
Contributor Author

I haven't taken the time to read the proposal in full, so sorry if the answer is already in the proposal (I just did a quick search).
One possible behavior change is that the current BGP code doesn't listen on port 179 on the host, making it possible to run alongside Calico in BGP mode (the default, if I remember correctly).
Will FRR listen on port 179?

It can be configured (https://docs.frrouting.org/en/latest/bgp.html#starting-bgp). It looks like that is a configuration parameter in MetalLB so we would just need to implement that?
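
As a rough illustration of what "implementing that" could involve, the sketch below maps a listen-port setting onto bgpd's `--bgp_port` command-line option, which the linked FRR documentation describes (a value of 0 means bgpd does not listen at all). The helper name `bgpdPortArgs` is hypothetical, and how MetalLB would actually start bgpd is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// bgpdPortArgs returns the bgpd command-line arguments that select the BGP
// listen port. Per the FRR documentation, port 0 means "do not listen".
func bgpdPortArgs(port int) ([]string, error) {
	if port < 0 || port > 65535 {
		return nil, fmt.Errorf("invalid BGP port %d", port)
	}
	return []string{"--bgp_port", strconv.Itoa(port)}, nil
}

func main() {
	// Port 0 keeps bgpd from binding 179, so another BGP daemon on the host
	// (for example Calico's) can keep using it.
	args, err := bgpdPortArgs(0)
	if err != nil {
		panic(err)
	}
	fmt.Println("bgpd", args)
}
```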

@markdgray
Contributor Author

Second, your overview of the MetalLB codebase looks great. I'd consider converting this part of the proposal to a developer-oriented document. Would love to get a PR for that if you're up for it.

I added this part as a separate file. Will I split it out into a separate file?

@russellb
Collaborator

Overall I'm quite happy with this and am ready to give my +1. I'd like to resolve the remaining open question on whether FRR would run within the metallb speaker pod or not, and if it's separate, how we would secure that communication.

@markdgray force-pushed the feat/frr_design branch 2 times, most recently from e0ca7b7 to eecfe6a (May 21, 2021 09:01)
@uablrek
Contributor

uablrek commented Jun 10, 2021

At Ericsson we use the metallb controller but have developed our own speaker with the functions we need, for example:

  • IPv6
  • BFD
  • Static routes (with BFD)
  • Traffic separation (separate peers)

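At a meeting with RedHat we were asked to contribute our speaker configuration as input to this PR, perhaps even as a definition of done:

  • ECFE_Configuration_Examples.md (https://github.com/metallb/metallb/files/6631459/ECFE_Configuration_Examples.md)
  • ECFE_Configuration_Examples.pdf (https://github.com/metallb/metallb/files/6631462/ECFE_Configuration_Examples.pdf)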

@markdgray
Contributor Author

@rata Thanks for the comprehensive review. I had a few follow-up questions before I do another revision of this PR.

@rata
Contributor

rata commented Jun 16, 2021

@markdgray I think I answered them all, thanks again!

@uablrek Cool. Not sure how to read your message, though: do you mean you would prefer to contribute your speaker implementation instead of using FRR? Or are you saying that MetalLB going to FRR is beneficial and you'll be able to drop the custom speaker and use metallb upstream speaker implementation?

@russellb
Collaborator

@uablrek Cool. Not sure how to read your message, though: do you mean you would prefer to contribute your speaker implementation instead of using FRR? Or are you saying that MetalLB going to FRR is beneficial and you'll be able to drop the custom speaker and use metallb upstream speaker implementation?

Based on some offline conversation, I think this is an expression of support of the general direction we're taking. They have an alternative implementation of it, but are not able to open source it at this time (and maybe that could change in the future). In the meantime, their configuration examples offer some inspiration for further extensions we may want to make for additional features they find important in their own implementation. In other words, take it as input for some future target features in the implementation.

@markdgray
Contributor Author

Is there any consensus? I think everything has been addressed.

@rata
Contributor

rata commented Jul 6, 2021

I don't think I'll have time to have another look soon, but don't block the PR on my final ACK. It was also quite close to ready IMHO when I last reviewed it :)

Thanks again for the PR!

Member

@johananl johananl left a comment

Thanks a lot for the effort @markdgray.

Please don't be overwhelmed by the number of comments 🙂 These are mostly cosmetic nitpicks. I took the time to add them since I believe being lenient about details in one part of a codebase has a tendency to lower the quality bar for other areas (such as actual code), too. Please treat all the language/phrasing/capitalization-related comments as non-blockers.

Following is a summary of the only "big" concerns I have left (more info inline):

  • Configuring FRR imperatively using vtysh poses IMO a risk of running into unexpected state problems between MetalLB and FRR.
  • I'm worried about the emphasis on managing "external" agents in this proposal.
  • Are we good to "embed" FRR inside MetalLB in terms of OSS licenses?

@markdgray force-pushed the feat/frr_design branch 2 times, most recently from 14e8ea5 to 1e4fe71 (July 14, 2021 14:51)
Member

@johananl johananl left a comment

Great work @markdgray. This is an easy read now and makes a lot of sense to me.

The remaining comments are all language-related nitpicks so I'm already LGTM-ing the PR and will leave it up to you to decide whether to fix them.

This design document discusses integration of FRR as an
alternative BGP implementation for MetalLB.

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Co-authored-by: Johannes Liebermann <johanan.liebermann@gmail.com>
@markdgray
Contributor Author

Great work @markdgray. This is an easy read now and makes a lot of sense to me.

The remaining comments are all language-related nitpicks so I'm already LGTM-ing the PR and will leave it up to you to decide whether to fix them.

Thanks @johananl. All the changes seem reasonable. I have accepted your suggestions, squashed the commits and rebased against main. I think this is done now?

@johananl
Member

LGTM @markdgray.
@daxmc99 @rata I'll merge this later today so last chance to weigh in 🙂

@johananl johananl merged commit f9a3876 into metallb:main Jul 21, 2021
@fedepaol
Member

At Ericsson we use the metallb controller but have developed our own speaker with the functions we need, for example:

* IPv6

* BFD

* Static routes (with BFD)

* Traffic separation (separate peers)

At a meeting with RedHat we were asked to contribute with our speaker configuration as input to this PR, perhaps even as a definition of done;

* [ECFE_Configuration_Examples.md](https://github.com/metallb/metallb/files/6631459/ECFE_Configuration_Examples.md)

* [ECFE_Configuration_Examples.pdf](https://github.com/metallb/metallb/files/6631462/ECFE_Configuration_Examples.pdf)

@uablrek sorry for getting back on this so late.

What would be the use case for static BFD without BGP?

Your static-bfd-peers example states that the addressPool is still using BGP, but no BGP peers are configured. I assume this is to keep the example shorter, but it will still use BGP. What is the use case for adding a BFD peer which is not the same as the BGP peer?

Thanks!

@uablrek
Contributor

uablrek commented Jul 28, 2021

I have only passed on the config, so I might be wrong, but I think the metallb controller requires a "protocol" entry, so it is there but not used. Our setup uses the metallb controller with one modification: the yaml parsing is not "strict", so we can add things but must comply with the existing syntax.

@uablrek
Contributor

uablrek commented Jul 28, 2021

The remove-strict commit Nordix@13b6cca
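
The effect of dropping "strict" parsing can be illustrated with a small Go sketch. It uses the standard library's `encoding/json` as a stand-in for the YAML parser, since the idea is the same: a strict decoder rejects keys the target struct does not declare, while a lenient one ignores them. The `pool` type, the `decode` helper and the `bfd-peers` key are hypothetical and are not taken from MetalLB or from the linked commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// pool is an illustrative stand-in for an address-pool config entry.
type pool struct {
	Name     string `json:"name"`
	Protocol string `json:"protocol"`
}

// decode parses data into a pool. With strict set, keys that the struct
// does not declare cause an error; otherwise they are silently ignored.
func decode(data string, strict bool) (pool, error) {
	dec := json.NewDecoder(strings.NewReader(data))
	if strict {
		dec.DisallowUnknownFields()
	}
	var p pool
	err := dec.Decode(&p)
	return p, err
}

func main() {
	// "bfd-peers" is a made-up extra key that the struct does not know about.
	const doc = `{"name":"default","protocol":"bgp","bfd-peers":["192.0.2.1"]}`
	_, strictErr := decode(doc, true)
	p, lenientErr := decode(doc, false)
	fmt.Println("strict rejected the extra key:", strictErr != nil)
	fmt.Println("lenient error:", lenientErr, "protocol:", p.Protocol)
}
```

This is why the extended config must still "comply with the existing syntax": the known keys are parsed as before, and only the additional ones pass through unnoticed by the controller.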

@fedepaol
Member

I have only passed on the config, so I might be wrong, but I think the metallb controller requires a "protocol" entry, so it is there but not used. Our setup uses the metallb controller with one modification: the yaml parsing is not "strict", so we can add things but must comply with the existing syntax.

So, if I am getting your comment right, in the "static-bfd" context you are not using bgp at all (which kind of resonates with the "static-bfd" naming). In that case, how is the advertisement of the virtual ip performed?

@uablrek
Contributor

uablrek commented Jul 29, 2021

in the "static-bfd" context you are not using bgp at all

Right. The route is statically configured in the GW-router by the site owner. Sometimes the site owner does not allow routing protocols.

@fedepaol
Member

in the "static-bfd" context you are not using bgp at all

Right. The route is statically configured in the GW-router by the site owner. Sometimes the site owner does not allow routing protocols.

So how does the GW-router know which node(s) own the ip address associated with the lb service in this case? Or, to put it in other words, where does that static route point?
I ask because that ip can move from one (or more) nodes to others.

@uablrek
Contributor

uablrek commented Jul 29, 2021

own the ip address associated with the lb service

No node "owns" the lb-address. But if externalTrafficPolicy: local is used, only nodes where a server pod runs can be used. I don't know if there is some clever use of BFD to not respond on some nodes (similar to BGP). Let me ask around ...

@fedepaol
Member

own the ip address associated with the lb service

No node "owns" the lb-address. But if externalTrafficPolicy: local is used, only nodes where a server pod runs can be used. I don't know if there is some clever use of BFD to not respond on some nodes (similar to BGP). Let me ask around ...

On top of that, there's the bfd + node selector section where only a subset of nodes are enabled.

I'm still not sure I get how the routes are configured. Are those something like ecmp static rules using the nodes (all of them) as gateways for the vips or something like that?

@champtar
Contributor

BFD is per node, while BGP lets you advertise and retract multiple IPs, so I guess this static-bfd is only used for 1 IP?

@mandydydy

own the ip address associated with the lb service

No node "owns" the lb-address. But if externalTrafficPolicy: local is used, only nodes where a server pod runs can be used. I don't know if there is some clever use of BFD to not respond on some nodes (similar to BGP). Let me ask around ...

static-bfd doesn't care about the externalTrafficPolicy. We do not have any additional mechanism for externalTrafficPolicy:local. If users want to use externalTrafficPolicy:local with static-bfd, they need to configure the Gateway Router themselves to forward the traffic to the desired worker nodes.
static-bfd only takes down the static routes on the Gateway Routers. So if the BFD session of a worker node is down, the Gateway Routers won't forward traffic to that worker node.
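
The behavior described here can be modeled with a small Go sketch: the gateway holds one static route per worker node and keeps only those whose BFD session is up. This is a toy model of the gateway's view, not the Ericsson implementation, which is not public. The `node` type, the `activeNextHops` function and the 192.0.2.x addresses are all made up for illustration.

```go
package main

import "fmt"

// node is a worker node as seen from the gateway router: its address is the
// next hop of a static route, and BFDUp reports the state of its BFD session.
type node struct {
	Addr  string
	BFDUp bool
}

// activeNextHops returns the next hops the gateway would still use for the
// load-balancer address: only nodes whose BFD session is up.
func activeNextHops(nodes []node) []string {
	hops := []string{}
	for _, n := range nodes {
		if n.BFDUp {
			hops = append(hops, n.Addr)
		}
	}
	return hops
}

func main() {
	nodes := []node{
		{Addr: "192.0.2.11", BFDUp: true},
		{Addr: "192.0.2.12", BFDUp: false}, // session down: static route taken down
		{Addr: "192.0.2.13", BFDUp: true},
	}
	fmt.Println(activeNextHops(nodes))
}
```

Note that the model has no notion of which nodes run server pods, which matches the statement that static-bfd ignores externalTrafficPolicy.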
