RTPROT_KERNEL Causing IPv4PodCIDR Route Discovery Failure in Quagga and FRR #31243

Open
1 task done
tkgeng opened this issue Mar 8, 2024 · 6 comments
Labels
area/bgp kind/cfp kind/community-report This was reported by a user in the Cilium community, eg via Slack. kind/feature This introduces new functionality. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. sig/kernel Requires upstream work in the Linux kernel.

Comments

@tkgeng

tkgeng commented Mar 8, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

We are using native-routing mode with FRR's OSPF to announce pod subnets. This setup worked well until we upgraded from Cilium 1.13 to 1.14. After the upgrade, pod network announcements failed because the pod subnet routes now carry the proto kernel field. These routes are visible with ip r on the system but absent from Zebra's table (they are not displayed by show ip route in vtysh).

I understand that proto kernel is meant to address issue #24288, but would it be helpful to introduce a configuration option to enable it selectively?

Of course, the direct cause, FRR's failure to detect proto kernel routes, is concerning in itself. I noticed that this has been raised before in FRRouting/frr#14731 but remains unanswered.
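To make the symptom easier to check, here is a minimal Go sketch that picks out the routes tagged proto kernel from `ip route`-style text. The sample lines, addresses, and the helper names `routeProto` and `kernelProtoRoutes` are made up for illustration; real output on a Cilium node will differ.

```go
package main

import (
	"fmt"
	"strings"
)

// routeProto returns the token following the "proto" keyword in one line of
// `ip route`-style output, or "" when the line carries no proto field.
func routeProto(line string) string {
	fields := strings.Fields(line)
	for i := 0; i+1 < len(fields); i++ {
		if fields[i] == "proto" {
			return fields[i+1]
		}
	}
	return ""
}

// kernelProtoRoutes returns the lines whose proto field is "kernel".
func kernelProtoRoutes(output string) []string {
	var matches []string
	for _, line := range strings.Split(output, "\n") {
		if routeProto(line) == "kernel" {
			matches = append(matches, strings.TrimSpace(line))
		}
	}
	return matches
}

func main() {
	// Hypothetical sample; addresses and device names are assumptions.
	sample := `default via 192.0.2.1 dev eth0 proto static
10.0.1.0/24 via 10.0.1.77 dev cilium_host proto kernel src 10.0.1.77
10.0.2.0/24 via 192.0.2.12 dev eth0`

	for _, r := range kernelProtoRoutes(sample) {
		fmt.Println(r)
	}
}
```

Comparing this list against what show ip route prints in vtysh should show which prefixes Zebra is skipping.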

Cilium Version

1.14+

@YutaroHayakawa
Member

YutaroHayakawa commented Mar 8, 2024

Yeah, FRR ignores proto kernel routes, and there's no way to disable that behavior. I recently learned that some people rely on the route installed by Cilium (the one that redirects traffic to the cilium_host device) to advertise the PodCIDR with routing protocols (I know of one more case, using BGP).

As far as I know, Cilium never intended the route to be used like that, so backward compatibility is not guaranteed. Also, this only works for certain routing types (for example, I think it doesn't work with endpointRoutes.enabled=true). If you wish to rely on it, you first need to build consensus to keep it as a stable interface.

@YutaroHayakawa YutaroHayakawa added sig/kernel Requires upstream work in the Linux kernel. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. kind/feature This introduces new functionality. kind/community-report This was reported by a user in the Cilium community, eg via Slack. area/bgp kind/cfp labels Mar 8, 2024
@oblazek
Contributor

oblazek commented Mar 8, 2024

Hey @tkgeng, we are using this patch locally :) because in some environments we run BIRD, which has a similar problem.

+++ b/pkg/datapath/loader/loader.go
@@ -18,7 +18,6 @@ import (
        "github.com/cilium/cilium/pkg/bpf"
        "github.com/cilium/cilium/pkg/byteorder"
        "github.com/cilium/cilium/pkg/datapath/link"
-       "github.com/cilium/cilium/pkg/datapath/linux/linux_defaults"
        "github.com/cilium/cilium/pkg/datapath/linux/route"
        "github.com/cilium/cilium/pkg/datapath/loader/metrics"
        datapath "github.com/cilium/cilium/pkg/datapath/types"
@@ -100,7 +99,6 @@ func upsertEndpointRoute(ep datapath.Endpoint, ip net.IPNet) error {
                Prefix: ip,
                Device: ep.InterfaceName(),
                Scope:  netlink.SCOPE_LINK,
-               Proto:  linux_defaults.RTProto,
        }

I believe we could figure something out so that this is configurable while keeping the current behavior as the default.
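One possible shape for such an option, as a rough sketch only: the `route` struct below is a simplified stand-in for Cilium's route type, `tagKernelProto` is a hypothetical setting that does not exist in Cilium, and the sketch assumes `linux_defaults.RTProto` corresponds to RTPROT_KERNEL, as the issue title suggests.

```go
package main

import "fmt"

// Route protocol values as defined in linux/rtnetlink.h.
const (
	rtProtUnspec = 0 // RTPROT_UNSPEC
	rtProtKernel = 2 // RTPROT_KERNEL
)

// route is a simplified stand-in for the struct touched by the patch;
// it keeps only the fields needed for this illustration.
type route struct {
	Prefix string
	Device string
	Proto  int
}

// endpointRoute builds the route for an endpoint. tagKernelProto is a
// hypothetical option: true mirrors the current behavior (Proto set),
// false mirrors the local patch (Proto left at its zero value).
func endpointRoute(prefix, device string, tagKernelProto bool) route {
	r := route{Prefix: prefix, Device: device}
	if tagKernelProto {
		r.Proto = rtProtKernel
	}
	return r
}

func main() {
	// Made-up prefix and interface name, for illustration only.
	current := endpointRoute("10.0.1.5/32", "lxc12345", true)
	patched := endpointRoute("10.0.1.5/32", "lxc12345", false)
	fmt.Println(current.Proto == rtProtKernel, patched.Proto == rtProtUnspec)
}
```

Defaulting the option to true would preserve what 1.14 does today, and only users who depend on FRR or BIRD picking up the routes would need to change it.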

@tkgeng
Author

tkgeng commented Mar 8, 2024

@oblazek Thank you very much. Seeing that you're already using it like this, I don't have to worry about any side effects. I will also give it a try.

@networkop
Contributor

Another option that doesn't require a patch would be to peer Cilium's BGP with FRR over a loopback address. Recent versions of FRR support allow-reserved-ranges, which should allow you to specify 127.0.0.1 as the peer address. Then, on the FRR side, you would just redistribute everything into OSPF and re-advertise it into the network.

@tkgeng
Author

tkgeng commented Mar 25, 2024

@networkop Good idea! It will be one of the options we consider next.

@robertvolkmann

Thanks @networkop, it worked with FRR 10 on my side.
