Support Pod traffic control in Antrea #3324

Closed
5 tasks done
tnqn opened this issue Feb 16, 2022 · 9 comments
Labels
kind/design Categorizes issue or PR as related to design.

Comments

@tnqn
Member

tnqn commented Feb 16, 2022

Describe what you are trying to solve

Security and visibility services such as IDS and NSM need to receive the packets sent from/to Pods in order to analyze them. An issue was opened for this requirement: #3008. Pod traffic control would be useful for these services: Antrea could be configured to redirect or mirror specific Pods' traffic to a specific destination, from which the services can capture the traffic.

Describe the solution you have in mind

We propose to add a traffic control API using K8s CRD. The traffic control API accepts client requests and controls the container traffic with OpenFlow rules. The API is designed to be generic, providing a mechanism to specify the Pods whose traffic should be selected, the direction of the traffic, whether the traffic should be mirrored or redirected, and the network device port to redirect or mirror to.

type TrafficControl struct {
	metav1.TypeMeta `json:",inline"`
	// Standard metadata of the object.
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// Specification of the desired behavior of TrafficControl.
	Spec TrafficControlSpec `json:"spec"`
}

type TrafficControlSpec struct {
	// AppliedTo selects Pods to which the traffic control configuration will be applied.
	AppliedTo AppliedTo `json:"appliedTo"`

	// The direction of traffic that should be matched. It can be Ingress, Egress, or Both.
	Direction Direction `json:"direction,omitempty"`

	// The action that should be taken for the traffic. It can be Redirect or Mirror.
	Action Action `json:"action,omitempty"`

	// The destination that the traffic should be redirected or mirrored to.
	Destination TrafficDestination `json:"destination,omitempty"`
}

type TrafficDestination struct {
	// The name of the traffic destination, used to identify the port name in OVS.
	Name string
	// Port represents a port that is attached to the OVS bridge.
	// It can be an OVS internal port, a physical NIC, or a veth device.
	// +optional
	Port *PortTrafficDestination
	// VXLAN represents a VXLAN tunnel that is created on the Node.
	// +optional
	VXLAN *TunnelTrafficDestination
	// GENEVE represents a GENEVE tunnel that is created on the Node.
	// +optional
	GENEVE *TunnelTrafficDestination
	// GRE represents a GRE tunnel that is created on the Node.
	// +optional
	GRE *TunnelTrafficDestination
	// ERSPAN represents an ERSPAN tunnel that is created on the Node.
	// +optional
	ERSPAN *ERSPANTrafficDestination
}

// PortTrafficDestination represents a port that is attached to the OVS bridge.
// It can be an OVS internal port, a physical NIC, or a veth device.
type PortTrafficDestination struct {
	// Internal represents whether this is an OVS internal port.
	// Antrea will create the port if it's internal and missing. Otherwise the port must already exist.
	Internal bool
	// PeerName represents the name of the peer device from which the traffic
	// will be sent back to OVS. It should only be set for Redirect action.
	PeerName string
}

// TunnelTrafficDestination represents a tunnel that is created on the Node.
type TunnelTrafficDestination struct {
	// The remote IP of the tunnel.
	RemoteIP string
	// The ID of the tunnel.
	TunnelID int64
}

// ERSPANTrafficDestination represents an ERSPAN tunnel that is created on the Node.
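type ERSPANTrafficDestination struct {
	// The remote IP of the tunnel.
	RemoteIP string
	// The ID of the tunnel.
	TunnelID int64
	// The ERSPAN version, 1 or 2.
	Version int8
	// The ERSPAN index. Only applies to version 1.
	Index int32
	// The direction of the mirrored traffic. Only applies to version 2.
	Dir int8
	// The ERSPAN hardware ID. Only applies to version 2.
	HardwareID int8
}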
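
Since every destination kind in TrafficDestination is optional, a validation step would probably be needed somewhere. The following is only a minimal sketch of such a check, not part of the proposal: the helper validateDestination is hypothetical, the types are trimmed-down copies of the ones above, and it assumes that exactly one destination kind must be set and that tunnel kinds need a parsable remote IP.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// Trimmed-down copies of the proposed types, keeping only the fields the check needs.
type PortTrafficDestination struct {
	Internal bool
	PeerName string
}

type TunnelTrafficDestination struct {
	RemoteIP string
	TunnelID int64
}

type ERSPANTrafficDestination struct {
	RemoteIP string
	TunnelID int64
}

type TrafficDestination struct {
	Name   string
	Port   *PortTrafficDestination
	VXLAN  *TunnelTrafficDestination
	GENEVE *TunnelTrafficDestination
	GRE    *TunnelTrafficDestination
	ERSPAN *ERSPANTrafficDestination
}

// validateDestination is a hypothetical helper, not part of the proposal.
// Assumption: exactly one destination kind is set, and tunnel kinds carry a
// parsable remote IP.
func validateDestination(d TrafficDestination) error {
	kinds := 0
	var remoteIPs []string
	if d.Port != nil {
		kinds++
	}
	for _, t := range []*TunnelTrafficDestination{d.VXLAN, d.GENEVE, d.GRE} {
		if t != nil {
			kinds++
			remoteIPs = append(remoteIPs, t.RemoteIP)
		}
	}
	if d.ERSPAN != nil {
		kinds++
		remoteIPs = append(remoteIPs, d.ERSPAN.RemoteIP)
	}
	if kinds != 1 {
		return errors.New("exactly one destination kind must be set")
	}
	for _, ip := range remoteIPs {
		if net.ParseIP(ip) == nil {
			return fmt.Errorf("invalid remote IP %q", ip)
		}
	}
	return nil
}
```

With this sketch, a destination that sets only GRE with remote IP 10.10.0.2 passes, while one that sets both Port and VXLAN, or none at all, is rejected.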

As an example, the TrafficControl resource “mirror-web-app” shown below declares that all ingress traffic to Pods labeled “app=web” in all Namespaces should be mirrored to a remote collector running on 10.10.0.2 via a GRE tunnel:

apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: mirror-web-app
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: web
  direction: Ingress
  action: Mirror
  destination:
    name: gre0
    gre:
      remoteIP: 10.10.0.2

The Antrea Agent is responsible for realizing the traffic control request. It watches the TrafficControl resources from the K8s API server, and manages the container traffic with OpenFlow rules. Specifically, the agent executes the following steps for a TrafficControl resource:

  • Use label selectors to filter Pods running on this Node.
  • Translate the selected Pods to OVS ports, which will be used to filter traffic that should be mirrored or redirected.
  • Translate the target device to an OVS port, which will be used as the target port the traffic should be mirrored or redirected to.
  • Install OpenFlow rules calculated using the above arguments.
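
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the agent's implementation: the Pod and Rule types and the function names are hypothetical, Rule is an abstract stand-in for an OpenFlow rule, and the target device is assumed to be already translated to an OVS port number (step 3).

```go
package main

import "sort"

// Pod is a simplified, hypothetical view of what the agent knows about a Pod.
type Pod struct {
	Name   string
	Node   string
	Labels map[string]string
	OFPort uint32 // OpenFlow port number of the Pod's interface on the OVS bridge
}

// Rule is a hypothetical, abstract stand-in for an OpenFlow rule.
type Rule struct {
	PodPort    uint32
	Direction  string // "Ingress" or "Egress"
	Action     string // "Mirror" or "Redirect"
	TargetPort uint32
}

// matchLabels reports whether labels contain every key/value pair in selector.
func matchLabels(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// localPodPorts covers steps 1 and 2: it keeps the Pods on this Node that
// match the selector and returns their OVS ports in ascending order.
func localPodPorts(pods []Pod, node string, selector map[string]string) []uint32 {
	var ports []uint32
	for _, p := range pods {
		if p.Node == node && matchLabels(selector, p.Labels) {
			ports = append(ports, p.OFPort)
		}
	}
	sort.Slice(ports, func(i, j int) bool { return ports[i] < ports[j] })
	return ports
}

// buildRules covers step 4: one rule per Pod port and direction.
// "Both" expands to an Ingress rule and an Egress rule.
func buildRules(podPorts []uint32, direction, action string, targetPort uint32) []Rule {
	directions := []string{direction}
	if direction == "Both" {
		directions = []string{"Ingress", "Egress"}
	}
	var rules []Rule
	for _, port := range podPorts {
		for _, d := range directions {
			rules = append(rules, Rule{PodPort: port, Direction: d, Action: action, TargetPort: targetPort})
		}
	}
	return rules
}
```

Calling localPodPorts with the Node's own name and the selector from the resource, then passing the result to buildRules, yields the abstract rules that would still need to be translated into real flows.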

Describe how your solution impacts user flows

Describe the main design/architecture of your solution

Alternative solutions that you considered

Test plan

Additional context

Work breakdown:

@tnqn tnqn added the kind/design Categorizes issue or PR as related to design. label Feb 16, 2022
@wenqiq
Contributor

wenqiq commented Mar 17, 2022

A question about the traffic control API, why not use AppliedTo instead of PodSelector and NamespaceSelector in the TrafficControlSpec?

@tnqn
Member Author

tnqn commented Mar 17, 2022

> A question about the traffic control API, why not use AppliedTo instead of PodSelector and NamespaceSelector in the TrafficControlSpec?

It's a good idea. I hadn't thought about it much. Using the AppliedTo struct makes sense to me. Will update.

@wenqiq
Contributor

wenqiq commented Mar 21, 2022

If the AppliedTo field were used in the TrafficControl API, I think unifying the AppliedToGroup of NetworkPolicy/Egress/TrafficControl would improve appliedToGroup processing efficiency, because the same appliedToGroup would only need to be processed once.
I have opened a PR to implement the traffic control API, which may cover the first and second work breakdown items.
I hope it helps to solve this issue.
#3487

@tnqn
Member Author

tnqn commented Apr 8, 2022

> If the AppliedTo field were used in the TrafficControl API, I think unifying the AppliedToGroup of NetworkPolicy/Egress/TrafficControl would improve appliedToGroup processing efficiency, because the same appliedToGroup would only need to be processed once. I have opened a PR to implement the traffic control API, which may cover the first and second work breakdown items. I hope it helps to solve this issue. #3487

The group calculation is delegated to grouping.Interface, which doesn't care about the kind of the group. If a group created for Egress has the same label selector as a group created for NetworkPolicy, they will share the selector in grouping.Interface and get the same result, so there should be no duplicate workload.

There was a discussion about whether Egress should have its own Egress group or share the AppliedToGroup with NetworkPolicy. It was implemented the former way for two reasons: Egress may need a different way to calculate its span if using the SNAT IP as the tunnel destination does not work for some scenario, and the cache of AppliedToGroup is NetworkPolicy-specific, which may need a refactor before it can be shared with another controller. There wasn't a very good reason to unify NetworkPolicy and Egress into a single AppliedToGroup, and I think that's still the case, not to mention that there are more points to consider now: how to handle upgrade and version skew (just removing an API breaks it), how to handle cleanup logic when an AppliedToGroup could be created for NetworkPolicy or Egress, etc.

Lastly, refactoring something existing while implementing a new feature makes a PR really hard to review and its scope hard to manage. Even if we want to unify the groups in the future, it should be a separate PR focusing on that purpose.
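
To illustrate the selector sharing described above, here is a toy sketch. It is not Antrea's grouping.Interface: the Index type, its methods, and selectorKey are hypothetical names. The only idea it shows is that groups of different kinds with equal label selectors map to the same key, so the selector would need to be evaluated only once.

```go
package main

import (
	"sort"
	"strings"
)

// selectorKey normalizes matchLabels into a canonical string, so that equal
// selectors produce the same key regardless of map iteration order.
func selectorKey(matchLabels map[string]string) string {
	pairs := make([]string, 0, len(matchLabels))
	for k, v := range matchLabels {
		pairs = append(pairs, k+"="+v)
	}
	sort.Strings(pairs)
	return strings.Join(pairs, ",")
}

// groupRef identifies a group by its kind (e.g. "Egress") and name.
type groupRef struct {
	Kind string
	Name string
}

// Index is a toy stand-in for a grouping component (hypothetical).
type Index struct {
	groupsBySelector map[string][]groupRef
}

// NewIndex returns an empty Index.
func NewIndex() *Index {
	return &Index{groupsBySelector: map[string][]groupRef{}}
}

// AddGroup registers a group under its normalized selector.
func (i *Index) AddGroup(kind, name string, matchLabels map[string]string) {
	key := selectorKey(matchLabels)
	i.groupsBySelector[key] = append(i.groupsBySelector[key], groupRef{Kind: kind, Name: name})
}

// UniqueSelectors returns how many distinct selectors need to be evaluated.
func (i *Index) UniqueSelectors() int {
	return len(i.groupsBySelector)
}

// GroupsFor returns the groups that share the given selector.
func (i *Index) GroupsFor(matchLabels map[string]string) []groupRef {
	return i.groupsBySelector[selectorKey(matchLabels)]
}
```

Adding a NetworkPolicy group and an Egress group with the same labels leaves UniqueSelectors at one, which is the "no duplicate workload" property in miniature.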

For traffic control, I don't think antrea-controller needs to be involved in the way the workflow describes. There is no point in processing anything in a centralized way, because each antrea-agent should be responsible for the Pods on its own Node.

@antoninbas
Contributor

@tnqn I have the following related questions:

  • For the redirect case, can the same port be used for both traffic directions (i.e. peerName == name)? Or does the implementation require 2 different OVS ports?
  • Can tunnel destinations be used for the Redirect action? I assume that if the answer to my previous question is "No", then the answer is also "No" here.

@tnqn
Member Author

tnqn commented Apr 14, 2022

@antoninbas

  • For the redirect case, can the same port be used for both traffic directions (i.e. peerName == name)? Or does the implementation require 2 different OVS ports?

From Antrea's side, it doesn't really matter whether the ports are the same. Antrea will just make sure that traffic received from the peer port can be forwarded to its original destination without getting stuck in a loop (i.e. being redirected again). However, AFAIK, network firewalls such as Suricata and Snort require two different interfaces when working in inline mode. I tested Suricata, and it cannot start when the two interfaces are set to the same one.
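
The loop-avoidance idea above can be sketched as a small decision function. This is only an illustration with hypothetical names (Redirect, outputPort), it models ingress redirection only, and it assumes the pipeline can tell which port a packet came in on. It does not describe the actual flows.

```go
package main

// Redirect is a hypothetical description of one redirect configuration.
type Redirect struct {
	PodPort    uint32 // OVS port of the selected Pod
	TargetPort uint32 // port the traffic is redirected to
	ReturnPort uint32 // peer port from which the traffic re-enters OVS
}

// outputPort decides where a packet goes, given its input port and the port
// of its original destination. Packets that come back from the return port
// are never redirected again, which avoids the loop.
func outputPort(r Redirect, inPort, origDstPort uint32) uint32 {
	if inPort == r.ReturnPort {
		// Already inspected: forward to the original destination.
		return origDstPort
	}
	if origDstPort == r.PodPort {
		// Ingress traffic to the selected Pod: send it to the target first.
		return r.TargetPort
	}
	// Traffic that does not match the redirect is forwarded as usual.
	return origDstPort
}
```

In this sketch, a packet destined for the selected Pod goes to the target port on first sight, and goes to the Pod only after it re-enters from the return port.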

  • Can tunnel destinations be used for the Redirect action? I assume that if the answer to my previous question is "No", then the answer is also "No" here.

Technically yes. I tried starting a Suricata instance on another Node, creating two tunnels with different tunnel IDs between the K8s Node and the external Node, redirecting intra-Node Pod traffic to the external Node via one tunnel, and sending it back via the other tunnel. The firewall worked as expected.
I also discussed this with @jianjuns offline. We think peerDevice can be moved to the same level as Destination. Then the port struct itself doesn't have the "peer" concept, and we don't limit Redirect to a local port or a tunnel port. The API would look like the following (I haven't figured out a good name for the return device or for the struct "TrafficDestination"; please suggest names if you have good ones):

type TrafficControlSpec struct {
	// AppliedTo selects Pods to which the traffic control configuration will be applied.
	AppliedTo AppliedTo
	// The direction of traffic that should be matched. It can be Ingress, Egress, or Both.
	Direction Direction 
	// The action that should be taken for the traffic. It can be Redirect or Mirror.
	Action Action 
	// The destination that the traffic should be redirected or mirrored to.
	Destination TrafficDestination 
	// XXXX (name to be decided) represents the peer device from which the
	// traffic will be sent back to OVS. It should only be set for Redirect action.
	XXXX *TrafficDestination
}
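
A check for the rule stated in the comment above could look roughly like this. It is a sketch with hypothetical names: ReturnDevice stands in for the undecided XXXX field, and Device for the destination struct. The source only says the return device should be set for Redirect only; requiring it for Redirect is an assumption made here.

```go
package main

import "errors"

// Action mirrors the two actions named in the proposal.
type Action string

const (
	ActionMirror   Action = "Mirror"
	ActionRedirect Action = "Redirect"
)

// Device is a hypothetical placeholder for the destination struct.
type Device struct {
	Name string
}

// Spec keeps only the fields this check needs.
type Spec struct {
	Action       Action
	Destination  Device
	ReturnDevice *Device // stands in for the undecided XXXX field
}

// validateSpec is a hypothetical helper, not part of the proposal.
func validateSpec(s Spec) error {
	switch s.Action {
	case ActionRedirect:
		// Assumption: redirected traffic needs a device to come back from.
		if s.ReturnDevice == nil {
			return errors.New("Redirect requires a return device")
		}
	case ActionMirror:
		// Stated in the proposal: the return device is for Redirect only.
		if s.ReturnDevice != nil {
			return errors.New("return device must not be set for Mirror")
		}
	default:
		return errors.New("action must be Mirror or Redirect")
	}
	return nil
}
```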

@jianjuns
Contributor

TrafficControlPort (if all types of destination will have a port)?

@tnqn
Member Author

tnqn commented Apr 15, 2022

> TrafficControlPort (if all types of destination will have a port)?

Thanks @jianjuns for the suggestion. I tried calling it Port; however, we have ports of both "Port" type and "Tunnel" type, which may lead to a struct name like "PortTrafficControlPort". I used "Device" for the struct, and "PortDevice" and "TunnelDevice" for the specific types, in #3644. Could you check whether it makes sense to you?

tnqn added a commit to tnqn/antrea that referenced this issue Apr 15, 2022
TrafficControl is a feature which allows mirroring or redirecting the
traffic Pods send or receive. It enables users to monitor and analyze
Pod traffic, and to enforce custom network protections for Pods with
fine-grained control over network traffic.

This patch adds types and CRD for TrafficControl API.

Examples:

1. Mirror ingress traffic of Pods (app=web) to a VXLAN tunnel
```
apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: mirror-web-app
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: web
  direction: Ingress
  action: Mirror
  targetPort:
    name: vxlan0
    tunnel:
      type: VXLAN
      remoteIP: 1.1.1.1
```

2. Redirect ingress traffic of Pods (role=web) to OVS internal
port firewall0 and expect the traffic to re-enter OVS via another OVS
internal port firewall1 if it is not dropped.
```
apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: redirect
spec:
  appliedTo:
    podSelector:
      matchLabels:
        role: web
  direction: Ingress
  action: Redirect
  targetPort:
    name: firewall0
    local:
      internal: true
  returnPort:
    name: firewall1
    local:
      internal: true
```

For antrea-io#3324

Signed-off-by: Quan Tian <qtian@vmware.com>
tnqn added a commit to tnqn/antrea that referenced this issue Apr 18, 2022
TrafficControl is a feature which allows mirroring or redirecting the
traffic Pods send or receive. It enables users to monitor and analyze
Pod traffic, and to enforce custom network protections for Pods with
fine-grained control over network traffic.

This patch adds types and CRD for TrafficControl API.

Examples:

1. Mirror ingress traffic of Pods (app=web) to a VXLAN tunnel
```
apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: mirror-web-app
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: web
  direction: Ingress
  action: Mirror
  targetPort:
    name: vxlan0
    type: VXLAN
    tunnelConfig:
      remoteIP: 1.1.1.1
```

2. Redirect ingress traffic of Pods (role=web) to OVS internal
port firewall0 and expect the traffic to re-enter OVS via another OVS
internal port firewall1 if it is not dropped.
```
apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: redirect
spec:
  appliedTo:
    podSelector:
      matchLabels:
        role: web
  direction: Ingress
  action: Redirect
  targetPort:
    name: firewall0
    type: Internal
  returnPort:
    name: firewall1
    type: Internal
```

For antrea-io#3324

Signed-off-by: Quan Tian <qtian@vmware.com>
tnqn added a commit to tnqn/antrea that referenced this issue Apr 26, 2022
TrafficControl is a feature which allows mirroring or redirecting the
traffic Pods send or receive. It enables users to monitor and analyze
Pod traffic, and to enforce custom network protections for Pods with
fine-grained control over network traffic.

This patch adds types and CRD for TrafficControl API.

Examples:

1. Mirror ingress traffic of Pods (app=web) to a VXLAN tunnel
```
apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: mirror-web-app
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: web
  direction: Ingress
  action: Mirror
  targetPort:
    vxlan:
      remoteIP: 1.1.1.1
```

2. Redirect ingress traffic of Pods (role=web) to OVS internal
port firewall0 and expect the traffic to re-enter OVS via another OVS
internal port firewall1 if it is not dropped.
```
apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: redirect
spec:
  appliedTo:
    podSelector:
      matchLabels:
        role: web
  direction: Ingress
  action: Redirect
  targetPort:
    ovsInternal:
      name: firewall0
  returnPort:
    ovsInternal:
      name: firewall1
```

For antrea-io#3324

Signed-off-by: Quan Tian <qtian@vmware.com>
tnqn added a commit that referenced this issue Apr 27, 2022
TrafficControl is a feature which allows mirroring or redirecting the
traffic Pods send or receive. It enables users to monitor and analyze
Pod traffic, and to enforce custom network protections for Pods with
fine-grained control over network traffic.

This patch adds types and CRD for TrafficControl API.

Examples:

1. Mirror ingress traffic of Pods (app=web) to a VXLAN tunnel
```
apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: mirror-web-app
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: web
  direction: Ingress
  action: Mirror
  targetPort:
    vxlan:
      remoteIP: 1.1.1.1
```

2. Redirect ingress traffic of Pods (role=web) to OVS internal
port firewall0 and expect the traffic to re-enter OVS via another OVS
internal port firewall1 if it is not dropped.
```
apiVersion: crd.antrea.io/v1alpha2
kind: TrafficControl
metadata:
  name: redirect
spec:
  appliedTo:
    podSelector:
      matchLabels:
        role: web
  direction: Ingress
  action: Redirect
  targetPort:
    ovsInternal:
      name: firewall0
  returnPort:
    ovsInternal:
      name: firewall1
```

For #3324

Signed-off-by: Quan Tian <qtian@vmware.com>
@tnqn
Member Author

tnqn commented Jun 14, 2022

All patches have been merged, closing this issue.

@tnqn tnqn closed this as completed Jun 14, 2022