Add BGP support to the Antrea Agent #5948
Comments
I see no need to restrict Pod IP advertisement to noEncap only; can we support encap mode too? For LoadBalancer IPs, we should be able to enable ECMP too.
I edited the issue to remove the reference to noEncap. It was left over from a previous draft I was working on.
Yes, that's a good point. I guess in this case all cluster Nodes advertise the LoadBalancer IP (or at least all cluster Nodes running at least one backend Pod for the Service) with the same "cost", with kube-proxy / AntreaProxy being responsible for the last traffic hop. This would be quite different from how
Use case 2 would be extremely beneficial for some use cases we have. We are really interested in this for Pod IPs / CIDRs and also for Egress. We don't use Antrea for LB, but that would be a good option to have as well.
Use case 2 is very interesting for our setup; however, that is limited to Service IPs. Could there be a feature switch/selector to enable/disable what you advertise and when? For instance, I may want to advertise Service IPs for some Namespaces (or all of them) but no Pod IPs, etc.
@ColonelBundy I definitely wanted to have the ability to disable advertising Pod CIDRs.
Pod CIDRs can be allocated from multiple non-overlapping IPPools (IPAM, noEncap). Routes should only be advertised when the Pod CIDR is allocated from an IPPool; otherwise, advertising them might not be required. We might want to include multiple-IPPool support for BGP.
@ColonelBundy Question: I understand that you want to advertise the Pod CIDR or Service IP to another AS. Do you want the routes advertised from another AS to be distributed and installed on K8s Nodes? |
For our use case we only want to advertise Service IPs. To put it simply, we're looking to avoid having to use MetalLB for L3 external IPs. Having the option to select which IPPool to advertise and to which peer would be a killer feature.
Got it. In that case, I think a client in another AS should be reachable via the default route of your cluster Nodes, so that reply packets for a connection that originates from another AS and is destined to a Service in the cluster can be forwarded back to where the connection originated. Is that your setup? @ColonelBundy
Yeah, that sounds good.
It would be a very powerful feature if we could solve use case 2 with something out of the box, as it would add great flexibility when using Egress and ServiceExternalIP. As mentioned, there are ways to solve it using static routes, etc. But maintaining static routes is troublesome when Nodes are decommissioned, new ones are provisioned, and interfaces are moved between Nodes. Using BGP would solve this and dynamically update the routes on demand. I wrote a post last year about using a DaemonSet to install and configure FRR to get past this. It's probably not the prettiest way, but it gave me what I wanted: https://blog.andreasm.io/2023/02/20/antrea-egress/.
@andreasm80 that's a nice blog post. Is it OK if I link to it from the https://antrea.io website?
Thanks @antoninbas. Yes, that is OK by me.
@ColonelBundy Hello, I saw the case that
Thanks
Pretty spot on, except we currently have no use case for advertising Pod CIDRs to an external peer. But that may change with time. Also, to clarify, a selector for which Pod CIDRs to advertise would be very handy.
Do you mean using a selector to select target K8s Nodes and advertise their Pod CIDRs?
More along the lines of which Pods. I might wish to advertise some Pods in some Namespaces to some peers.
Thanks for the suggestion. We will keep that in mind. Could you tell me why you would advertise Pods directly in some Namespaces instead of using Service IPs within those Namespaces? Has such a case been employed in a production environment?
We don't have such a case at the moment. And I do agree that advertising Service IPs should be the priority if you ever want to advertise individual Services. But then again, we don't have this specific use case as of this moment, so feel free to dismiss this idea if it's not within scope.
Describe the problem/challenge you have
Over the years we have had a few requests to add BGP speaker capabilities to the Antrea Agent. The purpose of this issue is to collect the use cases that we would like to cover with this capability.
Note that while it is possible to meet some of these use cases by deploying kube-router in "BGP mode" alongside Antrea, having this capability available OOTB means potentially a better integration with Antrea features, and doesn't require users to deploy yet another DaemonSet in their cluster.
I believe that there are 3 main use cases for BGP in K8s with Antrea:
1 & 3 are not very interesting IMO, because they just provide alternative implementations to what we already support, and there is no clear benefit. However, we can add some value with 2 for on-prem users who want to make K8s endpoints routable by their BGP fabric.
As a side note, Calico and kube-router support both 1 & 2, while Cilium has added support for 2.
Describe the solution you'd like
I believe that our support should focus on use case 2:
Each Antrea Agent should run a BGP speaker and advertise local IPs to a list of configured BGP peers. The AS number (ASN) for the Antrea Agent should be configurable (all Agents may use the same local ASN or not). The list of advertised local IPs should be configurable from this list:
- (Egress feature) Egress IPs - with this capability, it will be possible for routes to be automatically configured in the physical network for "return" Egress traffic
- (ServiceExternalIP feature) LoadBalancer Service IPs - on-prem users with a BGP fabric will be able to easily expose K8s Services to the rest of their network. At the moment, the ServiceExternalIP feature requires LoadBalancer IPs to be allocated from the Node network (or requires adding static routes to the physical network).
A note on Egress IP advertisement:
- EgressSeparateSubnet feature but L3 / BGP approach vs L2 approach?
Anything else you would like to add?
While the exact API is yet to be decided, BGP peering should ideally be configurable using CRD(s).
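To make the configuration surface described above a bit more concrete, here is a rough sketch in Python. It is illustrative only and is not a proposed API: the names `BGPConfig`, `BGPPeer`, `AdvertisementKind` and `routes_to_advertise` are all hypothetical, and the example addresses and ASNs are made up. It assumes each Agent knows which Egress IPs and LoadBalancer IPs are currently assigned to its Node, and that each one is advertised as a host route.

```python
from dataclasses import dataclass, field
from enum import Enum
from ipaddress import ip_network


class AdvertisementKind(Enum):
    """Hypothetical categories of local IPs an Agent may advertise."""
    EGRESS_IP = "EgressIP"
    LOAD_BALANCER_IP = "LoadBalancerIP"


@dataclass(frozen=True)
class BGPPeer:
    """Hypothetical peer definition: peer address and peer ASN."""
    address: str
    asn: int


@dataclass
class BGPConfig:
    """Hypothetical per-Agent configuration (not an actual Antrea type)."""
    local_asn: int
    peers: list
    advertise: set = field(default_factory=set)


def routes_to_advertise(config, local_ips):
    """Return the sorted host routes this Agent would announce to its peers.

    local_ips maps an AdvertisementKind to the IPs of that kind currently
    assigned to the local Node. Only the kinds enabled in config.advertise
    are included in the result.
    """
    routes = set()
    for kind in config.advertise:
        for ip in local_ips.get(kind, []):
            # A bare address becomes a /32 (IPv4) or /128 (IPv6) prefix.
            routes.add(str(ip_network(ip)))
    return sorted(routes)


if __name__ == "__main__":
    config = BGPConfig(
        local_asn=64512,
        peers=[BGPPeer(address="192.0.2.1", asn=64513)],
        advertise={AdvertisementKind.EGRESS_IP},
    )
    local_ips = {
        AdvertisementKind.EGRESS_IP: ["10.10.0.5"],
        AdvertisementKind.LOAD_BALANCER_IP: ["10.20.0.7"],
    }
    print(routes_to_advertise(config, local_ips))
```

With only Egress IP advertisement enabled, the LoadBalancer IP in the example is left out of the result. A real CRD would presumably also need selectors (for example, which Services or which peers), as requested in the comments.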
cc @jianjuns @tnqn