K8 Node Transport Interface Detection #6273

Open
rajnkamr opened this issue Apr 29, 2024 · 9 comments
Assignees
Labels
action/release-note Indicates a PR that should be included in release notes. kind/design Categorizes issue or PR as related to design.

Comments

@rajnkamr
Contributor

rajnkamr commented Apr 29, 2024

Describe what you are trying to solve

A Node can have multiple interfaces, and packets can be routed differently because of iptables rules. It is therefore important to identify the actual transport interface for Egress traffic on the Node.
Describe the solution you have in mind

Consider a Node with two network interfaces: eth0 connected to Network A and eth1 connected to Network B. Additionally, there's a custom iptables rule that sets a specific packet mark on packets destined for a particular IP address range.

Here's a simplified representation of the scenario:

Network A: 192.168.1.0/24
Network B: 10.0.0.0/24
Custom iptables rule: Mark packets destined for IP addresses in the range 10.0.0.0/24 with a specific packet mark.
Now, let's say we want to determine the outgoing interface for a packet destined for an IP address in Network B (e.g., 10.0.0.1) using ip route get. The expected outcome would be to see the outgoing interface as eth1, as that's the interface connected to Network B.

However, due to the custom iptables rule that marks packets destined for Network B, the actual routing decision might be influenced by this rule. If the rule alters the packet's marking before it reaches the routing table lookup stage, ip route get might not accurately reflect the actual outgoing interface.

In this case, ip route get reports eth1, which is the correct outgoing interface according to a plain routing table lookup. However, the mark set by the custom iptables rule, combined with a policy routing rule that matches that mark, could cause the packet to leave through a different interface such as eth0. The interface reported by ip route get would then not match the actual outgoing interface.
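
One way to narrow this gap is to pass the mark to the lookup explicitly: `ip route get` performs a routing lookup without sending a packet through netfilter, so a mark applied by iptables is only considered if it is supplied with the `mark` argument. The following is a minimal sketch under that assumption, not a proposed implementation. The function names, the sample output (including the `src` address), and the mark value `0x1` are all hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// parseOutgoingDevice returns the token that follows the "dev" keyword in the
// output of `ip route get`.
func parseOutgoingDevice(output string) (string, error) {
	fields := strings.Fields(output)
	for i := 0; i+1 < len(fields); i++ {
		if fields[i] == "dev" {
			return fields[i+1], nil
		}
	}
	return "", fmt.Errorf("no outgoing device found in %q", output)
}

// routeGetArgs builds the arguments for `ip route get`. A non-zero mark is
// appended so that policy routing rules matching that fwmark are evaluated.
func routeGetArgs(dst string, mark uint32) []string {
	args := []string{"route", "get", dst}
	if mark != 0 {
		args = append(args, "mark", fmt.Sprintf("0x%x", mark))
	}
	return args
}

// outgoingDevice runs the lookup on the local host. It assumes a Linux Node
// with the iproute2 `ip` binary on PATH.
func outgoingDevice(dst string, mark uint32) (string, error) {
	out, err := exec.Command("ip", routeGetArgs(dst, mark)...).Output()
	if err != nil {
		return "", err
	}
	return parseOutgoingDevice(string(out))
}

func main() {
	// Hypothetical output for the scenario above; real output varies by host.
	sample := "10.0.0.1 dev eth1 src 10.0.0.5 uid 0 \n    cache"
	dev, err := parseOutgoingDevice(sample)
	fmt.Println(dev, err)
	fmt.Println(strings.Join(routeGetArgs("10.0.0.1", 0x1), " "))
}
```

The caller still has to know which mark the iptables rule would apply; the lookup cannot discover that on its own.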

Describe how your solution impacts user flows

N/A
Describe the main design/architecture of your solution

Use a Go packet-processing library to inspect packets and determine the outgoing interface.

Alternative solutions that you considered

N/A
Test plan

N/A
Additional context

#6099 #5832

@rajnkamr rajnkamr added the kind/design Categorizes issue or PR as related to design. label Apr 29, 2024
@luolanzone
Contributor

luolanzone commented Apr 30, 2024

I think we always use the primary interface for Egress IP, so not sure if this is a real issue or not. @tnqn should have more insights on this.

@rajnkamr
Contributor Author

> I think we always use the primary interface for Egress IP, so not sure if this is a real issue or not. @tnqn should have more insights on this.

@luolanzone, we had a discussion in #6099 about identifying the actual transport interface on a Node, where we wanted to include the Egress Node IP in Traceflow when tracing packets to a destination. Currently there is no way to tell whether an interface on the Node is the management interface, the actual traffic interface, or both.

@antoninbas
Contributor

> I think we always use the primary interface for Egress IP, so not sure if this is a real issue or not. @tnqn should have more insights on this.

We don't know which interface it will be actually, users could configure whichever routing they want for external IPs.

That being said, we have to think carefully about whether we want to implement this feature. I suppose invoking ip route get is simple enough, but anything more complex does not seem worth it to me.

@Atish-iaf
Contributor

Atish-iaf commented May 9, 2024

There was a discussion about reporting the transport interface name for all observations, not only Egress observations. The transport interface name can be used whenever a packet is leaving the Node.
If we use ip route get, then I think we can provide the transport interface name only for the Egress-specific observation on the Egress Node, and not for any other observation, because:

  1. Inter-Node Pod-to-Pod
    On the source Node, the packet's dst IP is the dst Pod IP, and ip route get with this dst IP returns antrea-gw0. But
    the packet does not actually go through this interface; it goes through the tunnel interface (encap mode).

  2. Remote Egress
    On the source Node, the packet's dst IP is an external IP, and ip route get with this dst IP returns some Node
    interface (depending on routing on that Node) other than the tunnel interface. But the packet actually goes through the tunnel interface of the source Node to the remote Egress Node (encap mode).

So, the interface name obtained from ip route get in the above cases is not correct.

If we use ip route get for transport interface name, we can do it only for

  • egress observations on egress Node.
  • Pod-to-external (without Egress applied on Pod)
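
The cases above can be summarized as a small decision function. This is only a sketch of the reasoning in this comment: `ObservationKind` and its constants are hypothetical names invented for illustration and are not Antrea API types.

```go
package main

import "fmt"

// ObservationKind and the constants below are hypothetical names used only
// for this sketch; they do not exist in Antrea.
type ObservationKind int

const (
	InterNodePodToPod ObservationKind = iota
	RemoteEgressOnSourceNode
	EgressOnEgressNode
	PodToExternalWithoutEgress
)

// routeLookupIsReliable reports whether `ip route get <dst IP>` is expected to
// name the interface the packet actually leaves through, following the cases
// listed in the comment above.
func routeLookupIsReliable(kind ObservationKind) bool {
	switch kind {
	case EgressOnEgressNode, PodToExternalWithoutEgress:
		return true
	case InterNodePodToPod, RemoteEgressOnSourceNode:
		// The comment discusses encap mode, where the packet leaves through
		// the tunnel interface rather than the interface the lookup returns.
		return false
	default:
		return false
	}
}

func main() {
	names := map[ObservationKind]string{
		InterNodePodToPod:          "inter-Node Pod-to-Pod",
		RemoteEgressOnSourceNode:   "remote Egress (source Node)",
		EgressOnEgressNode:         "Egress on Egress Node",
		PodToExternalWithoutEgress: "Pod-to-external without Egress",
	}
	for kind := InterNodePodToPod; kind <= PodToExternalWithoutEgress; kind++ {
		fmt.Printf("%s: use ip route get = %v\n", names[kind], routeLookupIsReliable(kind))
	}
}
```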

@antoninbas
Contributor

antoninbas commented May 9, 2024

@tnqn originally I thought that there was no guarantee that Egress traffic would leave the Egress Node through the transport interface, but I am not so sure anymore. At least today, this is the only interface through which we advertise the IP, and it seems unlikely that using any other outgoing interface (e.g., if the default route uses a different outgoing interface) would be a valid scenario. Am I missing something? I imagine that this situation may be a bit different with BGP support though.

Edit:

> At least today, this is the only interface through which we advertise the IP

Actually this is only for the initial (gratuitous) advertisement, or if arp_ignore > 0.
With arp_ignore == 0 (default), we will reply to ARP requests on any interface. So maybe what I describe above is still a valid scenario, even in L2 mode.
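
To check which of the two cases applies on a given Node, the sysctl can be read from procfs. The sketch below assumes a Linux host and relies on the kernel's documented behavior of using the larger of the `conf/all` and `conf/<interface>` values; the interface name `eth0` and all function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

const ipv4ConfDir = "/proc/sys/net/ipv4/conf"

// parseSysctlInt parses the content of a sysctl file such as "1\n".
func parseSysctlInt(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

// readArpIgnore reads arp_ignore for "all" or for a named interface.
func readArpIgnore(iface string) (int, error) {
	data, err := os.ReadFile(filepath.Join(ipv4ConfDir, iface, "arp_ignore"))
	if err != nil {
		return 0, err
	}
	return parseSysctlInt(string(data))
}

// effectiveArpIgnore returns the larger of the two values, which is the one
// the kernel applies to ARP requests received on the interface.
func effectiveArpIgnore(all, perInterface int) int {
	if all > perInterface {
		return all
	}
	return perInterface
}

// repliesOnAnyInterface is true when ARP requests for any local IP are
// answered regardless of the interface they arrive on (arp_ignore == 0).
func repliesOnAnyInterface(effective int) bool {
	return effective == 0
}

func main() {
	all, err := readArpIgnore("all")
	if err != nil {
		fmt.Println("cannot read arp_ignore:", err)
		return
	}
	perIface, err := readArpIgnore("eth0") // assumed interface name
	if err != nil {
		fmt.Println("cannot read arp_ignore:", err)
		return
	}
	eff := effectiveArpIgnore(all, perIface)
	fmt.Printf("effective arp_ignore=%d, replies on any interface: %v\n",
		eff, repliesOnAnyInterface(eff))
}
```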

@tnqn
Member

tnqn commented May 13, 2024

> Actually this is only for the initial (gratuitous) advertisement, or if arp_ignore > 0.
> With arp_ignore == 0 (default), we will reply to ARP requests on any interface. So maybe what I describe above is still a valid scenario, even in L2 mode.

Right, there are two cases. In most cases, where arp_ignore=0, ARP works very well; in the other cases, users need to configure transportInterface to make it work. Besides, if a subnet has a VLAN, the outgoing interface can only be the transport interface.

> Custom iptables rule: Mark packets destined for IP addresses in the range 10.0.0.0/24 with a specific packet mark.
> Now, let's say we want to determine the outgoing interface for a packet destined for an IP address in Network B (e.g., 10.0.0.1) using ip route get. The expected outcome would be to see the outgoing interface as eth1, as that's the interface connected to Network B.
>
> However, due to the custom iptables rule that marks packets destined for Network B, the actual routing decision might be influenced by this rule. If the rule alters the packet's marking before it reaches the routing table lookup stage, ip route get might not accurately reflect the actual outgoing interface.

It doesn't need to be so complicated. In Antrea datapath case, we only use policy routing for VLAN subnet, in which case the transport interface is always the "transportInterface", otherwise the transport interface is determined by the default route tables.

@rajnkamr
Contributor Author

If we only use policy routing for VLAN subnets, then traffic destined for a VLAN subnet can be identified by the destination IP address range (subnet) associated with the VLAN. Once the policy routing rules are configured with the transport interface for VLAN subnet traffic, they take precedence over the default routing rules, so the transport interface can only be the configured one. Otherwise the default route is good enough and we can use it.
We also need to cover scenarios where multiple transport interfaces are configured, for example with multiple VLAN subnets.
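
As a rough sketch of that lookup, assuming we already have a mapping from each VLAN subnet to its configured transport interface (the `vlanRoute` type, the helper names, and the subnets and interface names below are all hypothetical), the selection could look like this:

```go
package main

import (
	"fmt"
	"net/netip"
)

// vlanRoute pairs a VLAN subnet with the transport interface configured for
// it. This type is hypothetical and exists only for this sketch.
type vlanRoute struct {
	subnet netip.Prefix
	iface  string
}

// newVLANRoute builds a vlanRoute and panics on an invalid CIDR string.
func newVLANRoute(cidr, iface string) vlanRoute {
	return vlanRoute{subnet: netip.MustParsePrefix(cidr), iface: iface}
}

// transportInterfaceFor returns the interface of the most specific VLAN
// subnet that contains ip, or defaultIface (the interface of the default
// route) when no VLAN subnet matches.
func transportInterfaceFor(ip string, vlanRoutes []vlanRoute, defaultIface string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", err
	}
	best, bestBits := defaultIface, -1
	for _, r := range vlanRoutes {
		if r.subnet.Contains(addr) && r.subnet.Bits() > bestBits {
			best, bestBits = r.iface, r.subnet.Bits()
		}
	}
	return best, nil
}

func main() {
	routes := []vlanRoute{
		newVLANRoute("10.0.0.0/24", "eth1"),
		newVLANRoute("10.0.1.0/24", "eth2"),
	}
	for _, ip := range []string{"10.0.0.1", "10.0.1.7", "192.168.1.20"} {
		iface, err := transportInterfaceFor(ip, routes, "eth0")
		fmt.Println(ip, iface, err)
	}
}
```

Choosing the longest matching prefix keeps the result well defined if configured subnets overlap.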

@antoninbas
Contributor

> It doesn't need to be so complicated. In Antrea datapath case, we only use policy routing for VLAN subnet, in which case the transport interface is always the "transportInterface", otherwise the transport interface is determined by the default route tables.

I think the concern is that users can use whichever custom routing policies they want on their Nodes for outgoing traffic, even though that's an unlikely scenario.

@rajnkamr
Contributor Author

> > It doesn't need to be so complicated. In Antrea datapath case, we only use policy routing for VLAN subnet, in which case the transport interface is always the "transportInterface", otherwise the transport interface is determined by the default route tables.
>
> I think the concern is that users can use whichever custom routing policies they want on their Nodes for outgoing traffic, even though that's an unlikely scenario.

For the first phase, we can go ahead without support for custom routing policies and document that limitation.

@rajnkamr rajnkamr added the action/release-note Indicates a PR that should be included in release notes. label May 14, 2024