Enhance Egress support in Traceflow #6125

Open
wants to merge 1 commit into
base: main

Atish-iaf
Contributor

@Atish-iaf Atish-iaf commented Mar 20, 2024

  • Add EgressNodeIP field in Traceflow observations.

  • Add EgressNode field in observations from Egress Node as well when Egress Node is different from source Node. Previously, EgressNode field was available only in observations from source Node.

For #6099
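
The two changes listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `Observation` struct is a trimmed-down stand-in for the Traceflow observation type, the meaning of the boolean argument (`isEgressNode`) and the component and action strings are assumptions, and the sample values in `main` are made up.

```go
package main

import "fmt"

// Observation is a hypothetical, trimmed-down stand-in for a Traceflow
// observation. Only the Egress-related fields discussed here are included.
type Observation struct {
	Component    string
	Action       string
	Egress       string // name of the applied Egress
	EgressIP     string
	EgressNode   string // name of the Node hosting the Egress IP
	EgressNodeIP string // field proposed by this PR
}

// getEgressObservation follows the call shape seen in the diff. The first
// parameter is assumed to mean "this observation is reported by the Egress
// Node"; the action strings below are illustrative assumptions.
func getEgressObservation(isEgressNode bool, egressIP, egressName, egressNodeName, egressNodeIP string) *Observation {
	ob := &Observation{
		Component: "Egress",
		Egress:    egressName,
		EgressIP:  egressIP,
		// Set on both the source Node and the Egress Node, per the PR description.
		EgressNode:   egressNodeName,
		EgressNodeIP: egressNodeIP,
	}
	if isEgressNode {
		ob.Action = "MarkedForSNAT"
	} else {
		ob.Action = "ForwardedToEgressNode"
	}
	return ob
}

func main() {
	// Sample values are made up for illustration.
	ob := getEgressObservation(true, "10.10.0.5", "egress-web", "node-2", "192.168.1.12")
	fmt.Printf("%+v\n", *ob)
}
```

With this shape, an observation reported by the Egress Node carries the Node name as well as the proposed Node IP, instead of only the source Node reporting the name.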

@Atish-iaf Atish-iaf added the area/ops/traceflow label Mar 20, 2024
@Atish-iaf Atish-iaf marked this pull request as ready for review March 20, 2024 07:01
@Atish-iaf Atish-iaf requested review from rajnkamr and tnqn March 20, 2024 07:01
@rajnkamr rajnkamr added this to the Antrea v2.0 release milestone Mar 21, 2024
- ob.EgressNode = egressNode
+ ob.EgressNode = egressNodeName
+ ob.EgressNodeIP = egressNodeIP
+ ob.SrcPodIP = srcPodIP
Member


If there is a new field named SrcPodIP, it doesn't make sense to only set it for Egress observation as the name is very generic.

Contributor Author


Yes, the new field name is generic and we can use it for other observations as well. I didn't do that here because this PR is specific to Egress observations.
When implementing SrcPodIP for other observations, we can also discuss related changes, such as adding DstPodIP.

Member


The partial support would look like a bug and confuse users who see it in one scenario but don't see it in another scenario when the field applies to both. We need to think about the whole when adding something generic.

Contributor Author


I can remove the SrcPodIP field from this PR and implement it in another one, together with a DstPodIP field and support for other observations. We can discuss that further in a separate issue if required.

Contributor Author


Hi @tnqn
I have removed the SrcPodIP field from this PR, and I plan to create another PR that adds SrcPodIP for all types of observations once this one is merged.
PTAL, thanks

@Atish-iaf Atish-iaf requested a review from tnqn March 27, 2024 16:21
@tnqn tnqn added the api-review label Apr 2, 2024
Member

@tnqn tnqn left a comment


cc @jianjuns and @antoninbas to check their opinions on this new field as well.

  }
- obEgress := getEgressObservation(true, egressIP, egressName, egressNode)
+ obEgress := getEgressObservation(true, egressIP, egressName, egressNodeName, c.nodeConfig.NodeIPv4Addr.IP.String())
Member


It will panic in an IPv6 cluster.

@antoninbas
Contributor

> cc @jianjuns and @antoninbas to check their opinions on this new field as well.

I don't really understand the value of this field. The Node is already identified by its name, which we include in the observation.
A Node also can have multiple IPs. Here the IP being used is the "management IP" used to connect to the K8s control plane, which I guess is fine, but it's not really relevant in the context of Egress?

@rajnkamr
Contributor

rajnkamr commented Apr 3, 2024

> > cc @jianjuns and @antoninbas to check their opinions on this new field as well.
>
> I don't really understand the value of this field. The Node is already identified by its name, which we include in the observation. A Node also can have multiple IPs. Here the IP being used is the "management IP" used to connect to the K8s control plane, which I guess is fine, but it's not really relevant in the context of Egress?

The Egress Node IP refers to the IP address of the Node through which the Egress traffic is routed. Unlike the Egress IP, which can be dynamically allocated from a pool, the Egress Node IP typically corresponds to the IP address of the specific Node that handles the Egress traffic. In this case, Traceflow will be run from the platform managing the cluster; see #6099 for more information.

@jianjuns
Contributor

jianjuns commented Apr 3, 2024

I still don't get why the Egress Node IP is useful in the Traceflow results. Is the Node name not enough for users to identify the Node?

@Atish-iaf
Contributor Author

Atish-iaf commented Apr 3, 2024

> I still did not get why egress Node IP is useful in the Traceflow results. Node name is not enough for user to identify the Node?

The Node name is enough for users to identify the Node.
However, egressIP can sometimes be equal to egressNodeIP (the static Egress case), and in other cases egressIP may differ from egressNodeIP. We cannot get this information from the Node name, but we can get it if both egressIP and egressNodeIP are visible in the Traceflow results. #6099 (comment)

@jianjuns
Contributor

jianjuns commented Apr 3, 2024

So you mean the intention is for users to know whether the Egress IP is the Node IP? Why is that useful?
We have the Egress name in the Traceflow results too, so users can already identify the applied Egress.

@antoninbas
Contributor

@rajnkamr

> the egress node IP refers to the IP address of the node through which the egress traffic is routed

The Node can have many IP addresses. As I pointed out, this is just one IP address reported by K8s. I would call this IP the management IP, but it can be different from the transport IP, etc. This IP address is never used by the Egress traffic at any point, which is why it doesn't really seem related to the Egress feature in any way?

@rajnkamr
Contributor

rajnkamr commented Apr 4, 2024

As we know, Egress IP addresses are used to ensure that traffic from Pods to external destinations has a consistent (preferably static) source IP.
However, many external devices and software use IP-based access control lists to restrict incoming traffic for security reasons. These access control lists outside the K8s cluster will block packets, which causes a connectivity issue, and in this case the only solution is to configure a static Egress IP.
Especially in the above case, the main motivation while debugging is to let the user know that the Egress IP and the Egress Node IP are different, which could help the user identify the issue and possibly switch back to a static Egress IP.

@tnqn
Member

tnqn commented Apr 4, 2024

> As we know Egress IP addresses are used to ensure that traffic from pods to external has a consistent source IP.(Preferably static)
> However, there are many external devices and software that use IP based access control lists to restrict incoming traffic for security reasons.

This is the motivation of the Egress feature, not why Egress Node IP needs to be in Traceflow result.

> These access control lists outside k8s cluster will block packets, which causes a connectivity issue and in this case only solution is to configure static egress ip.

This is not correct. Static egress IP is not the only solution, any type of egress IP can be the solution. The only difference between HA egress and static egress is how Egress IP is assigned to a Node, by Antrea or by users. In production we always recommend the former as it provides HA.

> Specially in above case while debugging, main motivation is to let user know that Egress IP and Egress Node IP is different which could help user identify the issue and may adopt back to use static egress ip.

I don't quite get what the explanation means. The point is, the Egress Node IP plays no role in the whole trace and the datapath of such a scenario, regardless of whether it's the same as the Egress IP or not. If users encounter an external connectivity issue and they trace the packet, they should check whether the Egress IP is whitelisted, and never need to know the Egress Node IP.

@rajnkamr
Contributor

rajnkamr commented Apr 5, 2024

> This is the motivation of the Egress feature, not why Egress Node IP needs to be in Traceflow result.

I explained it in the Egress context because we are running Traceflow and the actual packet will egress using the Egress Node IP.
The Egress Node IP is mainly helpful when the Egress Node IP and the Egress IP are different.
For example, in the Traceflow case, actual traffic exits from the Egress Node IP, so if a Traceflow to a destination via the Egress IP is problematic, letting the user know the Egress Node IP can be helpful.

> This is not correct. Static egress IP is not the only solution, any type of egress IP can be the solution. The only difference between HA egress and static egress is how Egress IP is assigned to a Node, by Antrea or by users. In production we always recommend the former as it provides HA.

The Egress IP can be assigned to a dummy interface, whereas the Egress Node IP will always be on an actual interface of the Node, so the information can be helpful in the above context. Also, in the HA case, where there could be multiple Nodes and multiple interfaces (transport and management), the Egress Node IP will be the IP of the interface where traffic is actually exiting.

> I don't quite get what the explanation means. The point is, Egress Node IP plays no role in the whole trace and the datapath of such scenario, regardless of whether it's the same as the Egress IP or not. If users encounter an external connectivity issue and they trace the packet, they should check whether the Egress IP is whitelisted, and never need to know the Egress Node IP.

When running Traceflow with an Egress IP while the Node interface is not reachable, the Traceflow to a destination could still succeed, but actual traffic will be blocked because the Egress Node IP interface is not reachable.

@antoninbas
Contributor

> Egress ip can be assigned to a dummy interface, wherein egress node ip will always be the actual interface of node so the information can be helpful in above context. Also in HA case wherein there could be multiple node interfaces(transport and management), in that case egress node ip will be the interface ip where traffic is actually existing.

I think you mean exiting and not existing?

This is not what Quan was referring to when he was talking about Egress HA. Egress HA is the ability to fail over an Egress IP to a different Node if the first Node fails. This requires using ExternalIPPools (as opposed to static Egress IPs).

But saying "egress node ip will be the interface ip where traffic is actually exiting" is not correct. The current implementation uses c.nodeConfig.NodeIPv4Addr. As I pointed out before, this is the "management" IP of the Node in the context of K8s. There is no guarantee that the Egress traffic will exit the Node through the interface to which this IP is assigned. This is determined by host routing on the Node. Based on what the destination IP is, traffic can exit the Node through multiple possible interfaces, and these interfaces will have different IPs (which are different from the Egress IP and potentially different from c.nodeConfig.NodeIPv4Addr). Antrea doesn't even know which interface it will be.
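
To illustrate the host-routing point: on a multi-homed machine, the local address the kernel selects depends on the destination. The sketch below is only an analogy for locally originated traffic; it does not model Egress SNAT or Antrea's datapath. `sourceIPFor` is a hypothetical helper, and the destinations in `main` are arbitrary examples.

```go
package main

import (
	"fmt"
	"net"
)

// sourceIPFor asks the kernel which local address it would use to reach dst.
// Connecting a UDP socket sends no packets; it only triggers a route lookup
// and binds the socket to the selected source address.
func sourceIPFor(dst string) (net.IP, error) {
	conn, err := net.Dial("udp", net.JoinHostPort(dst, "9"))
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	addr, ok := conn.LocalAddr().(*net.UDPAddr)
	if !ok {
		return nil, fmt.Errorf("unexpected local address type %T", conn.LocalAddr())
	}
	return addr.IP, nil
}

func main() {
	// Different destinations may resolve to different local addresses,
	// depending on the host routing table.
	for _, dst := range []string{"127.0.0.1", "192.0.2.1"} {
		ip, err := sourceIPFor(dst)
		if err != nil {
			fmt.Printf("%s: no route (%v)\n", dst, err)
			continue
		}
		fmt.Printf("%s: would use local address %s\n", dst, ip)
	}
}
```

On a Node with separate management and transport interfaces, the addresses printed for different destinations can differ, which is why a single Node IP cannot be assumed to be the exit address.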

It may be easier to discuss this in person at the next Antrea community meeting if there is confusion.

@rajnkamr
Contributor

rajnkamr commented Apr 8, 2024

@antoninbas,
In most actual deployments, it is highly likely that the Egress Node IP is the management IP of the cluster.

The Egress Node IP can be used as the management IP address of the cluster (not of the Node).
Although there are other places in the platform software where we can get the Node IP of each K8s Node, the Egress Node IP will always be the management IP of the cluster for the Node from which Egress traffic is exiting, and the management IP could be the only IP exposed externally for managing the cluster. It might make sense to keep the field in that context.
We might want to discuss this at the community meeting.
+@tschwaller

@rajnkamr rajnkamr added the action/release-note label Apr 16, 2024
- Add "EgressNodeIP" field in Traceflow observations.

- Add "EgressNode" field in observations from Egress Node as well when
  Egress Node is different from source Node. Previously, "EgressNode" field
  was available only in observations from source Node.

Fixes antrea-io#6099

Signed-off-by: Kumar Atish <kumar.atish@broadcom.com>
@luolanzone
Contributor

Synced with @tnqn: we may need another way to support this kind of feature, so let's move it to the next release.

@rajnkamr
Contributor

@luolanzone, the Egress Node IP is relevant with respect to the management IP of the cluster; however, we would like to see the actual traffic path in Traceflow.

@rajnkamr rajnkamr removed this from the Antrea v2.1 release milestone May 3, 2024
@rajnkamr rajnkamr removed the action/release-note and api-review labels May 7, 2024
Labels
area/ops/traceflow Issues or PRs related to the Traceflow feature

6 participants