Support running Antrea in clusters created with Kind #14

antoninbas · 2019-11-04T23:27:42Z

Depends on #13.

This could be very convenient for CI and enable us to run e2e tests as part of a public CI service.

antoninbas · 2019-11-08T01:11:09Z

Tentative support in #32

antoninbas · 2019-11-16T01:36:29Z

With the current support, the Antrea components (agent, controller) can come up. However, there is no connectivity between Nodes. At this time, I believe that this is because using VXLAN tunnels in OVS userspace mode requires some special configuration: http://docs.openvswitch.org/en/latest/howto/userspace-tunneling/. I will work on this.
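The linked how-to describes attaching the physical interface to a separate userspace bridge that owns the tunnel endpoint address. A minimal sketch of that setup follows, assuming the bridge is called br-phy and the uplink is eth0; the function name, the `run` dry-run helper and the example address are illustrative and not taken from Antrea.

```shell
#!/bin/sh
# Sketch only: bridge name, uplink name and address are placeholder values.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# setup_br_phy UPLINK CIDR
# Creates a bridge backed by the userspace (netdev) datapath, attaches the
# uplink to it, and moves the uplink's IP address onto the bridge.
setup_br_phy() {
    run ovs-vsctl --may-exist add-br br-phy -- set Bridge br-phy datapath_type=netdev
    run ovs-vsctl --may-exist add-port br-phy "$1"
    run ip addr add "$2" dev br-phy
    run ip link set br-phy up
    # Flushing the uplink also drops routes tied to its addresses;
    # they would have to be re-created via br-phy.
    run ip addr flush dev "$1"
}
```

Running `DRY_RUN=1; setup_br_phy eth0 172.18.0.2/16` prints the five commands without touching the system, which is a safe way to review them before applying them inside a Node.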

antoninbas · 2019-11-22T02:13:29Z

It seems that using OVS in userspace mode also requires explicitly disabling TX offloading for each Pod's eth0 interface. Otherwise, all TCP traffic going through the gateway is dropped (and probably Pod-to-Pod traffic as well). I observed this while working on support for Kind: all the packets going through the gateway were dropped and Pods couldn't reach the K8s API server.

Found this link, which is not part of OVS documentation: https://arthurchiao.github.io/blog/ovs-deep-dive-5-datapath-tx-offloading/
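For manual experiments, the same setting can be changed from the command line with ethtool. This is a sketch under assumptions: the function names and the `run` dry-run helper are illustrative, and the second function assumes you already know the PID of a process running in the Pod's network namespace.

```shell
#!/bin/sh
# Sketch only: helper and function names are hypothetical.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# disable_tx_offload IFACE
# Turns off TX checksum offload on an interface in the current namespace.
disable_tx_offload() {
    run ethtool -K "$1" tx off
}

# disable_tx_offload_in_netns PID IFACE
# Same, but inside the network namespace of the process with the given PID
# (for example a container's main process).
disable_tx_offload_in_netns() {
    run nsenter -t "$1" -n ethtool -K "$2" tx off
}
```

After the change, `ethtool -k eth0` should report `tx-checksumming: off` for that interface.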

antoninbas · 2019-11-22T22:55:40Z

Found a good reference for the checksum issue: https://bugzilla.redhat.com/show_bug.cgi?id=1685616

The issue here is that OVS netdev datapath doesn't
support TX checksum offloading (this is not an easy task with arguable profit).
i.e. if a packet arrives with bad/no checksum it will be sent to the output port
with the same bad/no checksum. Everything works in case of kernel datapath because
the packet doesn't leave the kernel space. In case of netdev datapath some
information (like CHECKSUM_VALID skb flags) is lost while receiving via
socket in userspace and subsequently the kernel expects a valid checksum while
receiving the packet from userspace because TX offloading is not enabled.

This kind of issue is usually mitigated by disabling TX offloading on the
"right*" interfaces, or by setting iptables to fill the checksums like this:

iptables -A POSTROUTING -t mangle -p udp -m udp -j CHECKSUM --checksum-fill

Some related OpenStack bug: https://bugs.launchpad.net/neutron/+bug/1244589

Also, note that this happens only for virtual interfaces like veth/tap because
kernel always tries to delay checksum calculation/validation as much as possible.
Correct packets received from the wire will always have correct checksums.

The following changes were required:
* Disable TX HW checksum offload in containers. This is done in the Antrea CNI server when setting-up Pod networking, using an ioctl ethtool system call.
* Disable TX HW checksum offload in the Linux host for the veth interface of each Kind Node. This must be done by invoking an additional script (hack/kind_linux.sh) after creating the Kind cluster.
* Create a secondary br-phy bridge on each Node, as required by OVS userspace tunneling.

Refer to antrea-io#14 for the rationale for all the above bullet points.

A new test "provider" was added to the e2e test framework so that all the e2e tests can be run on Kind clusters. As part of this, some changes to the framework had to be performed. For example it is impractical to run SSH commands on Kind Nodes - as they do not have an SSH server - so instead we use "docker exec".

Fixes antrea-io#14
Fixes antrea-io#13
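The host-side step can be sketched as follows. This is not necessarily what hack/kind_linux.sh does; it is one common way to locate the host end of a container's veth pair, relying on the fact that a veth's `iflink` in sysfs holds the interface index of its peer. The function names and the `run` dry-run helper are hypothetical.

```shell
#!/bin/sh
# Sketch only: one possible approach, not the actual hack/kind_linux.sh.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# find_iface_by_index INDEX [SYSFS_NET_DIR]
# Prints the name of the interface whose ifindex equals INDEX.
# Returns 1 when no interface matches.
find_iface_by_index() {
    for f in "${2:-/sys/class/net}"/*/ifindex; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "$1" ]; then
            basename "$(dirname "$f")"
            return 0
        fi
    done
    return 1
}

# disable_node_tx_offload NODE
# NODE is the name of the Docker container backing a Kind Node.
disable_node_tx_offload() {
    # eth0's iflink inside the container is the ifindex of the host-side peer.
    peer_index=$(docker exec "$1" cat /sys/class/net/eth0/iflink) || return 1
    host_iface=$(find_iface_by_index "$peer_index") || return 1
    run ethtool -K "$host_iface" tx off
}
```

The optional second argument to `find_iface_by_index` exists only so the lookup can be exercised against a fake directory tree instead of the real /sys/class/net.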

The following changes were required:
* Disable TX HW checksum offload in containers. This is done in the Antrea CNI server when setting-up Pod networking, using an ioctl ethtool system call.
* Disable TX HW checksum offload in the Linux host for the veth interface of each Kind Node. This must be done by invoking an additional script (hack/kind_linux.sh) after creating the Kind cluster.
* Create a secondary br-phy bridge on each Node, as required by OVS userspace tunneling.
* Use a new version of start_ovs (start_ovs_netdev) which modifies the ovs-ctl script in-place to avoid loading the kernel module.

Refer to antrea-io#14 for the rationale for all the above bullet points.

A new test "provider" was added to the e2e test framework so that all the e2e tests can be run on Kind clusters. As part of this, some changes to the framework had to be performed. For example it is impractical to run SSH commands on Kind Nodes - as they do not have an SSH server - so instead we use "docker exec".

Fixes antrea-io#14
Fixes antrea-io#13
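The "docker exec" substitution for SSH mentioned above can be illustrated with a small wrapper. The actual e2e framework is not shown in this thread, so the function names and the `run` dry-run helper below are hypothetical; the sketch only assumes that each Kind Node is a Docker container named after the Node.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not the e2e framework's API.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# list_kind_nodes [CLUSTER]
# Lists the Node container names of a Kind cluster (default name: kind).
list_kind_nodes() {
    run kind get nodes --name "${1:-kind}"
}

# node_exec NODE COMMAND [ARGS...]
# Runs a command on a Kind Node through the Docker daemon, where a
# conventional provider would open an SSH session instead.
node_exec() {
    node=$1
    shift
    run docker exec "$node" "$@"
}
```

For example, `node_exec kind-worker ip addr show eth0` runs the command inside the container backing the kind-worker Node.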

This patch enables using netdev mode for OVS. This allows multiple docker containers hosting different OVS instances to function without the potential collisions that would occur while sharing the same kernel data path.

Note, netdev without DPDK is considered "unsupported" officially, and is something we only want to use for KIND deployments. Therefore the config option to enable it is hidden, using an environment variable that is not exposed in the ovn-kubernetes config.

Netdev mode does not support TX checksum offload, therefore it needs to be disabled on pod veth interfaces as well as veth interfaces attached from OVS to the host. See: antrea-io/antrea#14

Co-Authored-by: Andrew Sun <asun@redhat.com>
Signed-off-by: Tim Rozet <trozet@redhat.com>

trozet · 2020-03-20T13:46:07Z

@antoninbas Hi, I've been working on adding similar support into OVN. I wanted to ask specifically why multiple OVS in separate containers cannot utilize the same kernel path? If each OVS is in its own namespace with its own unique DPID, will there be conflicts in kernel path? Thanks.

williamtu · 2020-03-25T15:23:00Z

I think it doesn't work, but in reality I do see people running multiple ovs-vswitchd instances in multiple containers sharing one OVS kernel datapath without any problem. I guess it depends on the use case.

There is a 2015 talk about this that mentions a couple of issues:
https://www.openvswitch.org/support/ovscon2015/17/1555-benc.pdf

antoninbas · 2020-03-25T17:12:37Z

@trozet I believe you can make it work, but I also think that wasn't the best option for the Antrea case:

* The OVS bridge for each Kind Node needs to have a different name (and I believe so does the host gateway interface), which means that we have to tinker with some Antrea configuration files to make this happen.
* I don't know how easy it would be to make it work on macOS (or whether it's even possible with reasonable effort). I doubt that the OVS kernel module is available out-of-the-box in HyperKit.

And as William pointed out, there may be some other issues on top of that. Of course, using the userspace datapath also comes with its own issues :)

antoninbas self-assigned this Nov 4, 2019

salv-orlando mentioned this issue Nov 8, 2019

Use kustomize to generate Antrea manifests #32

Merged

antoninbas added the p0 label Nov 21, 2019

antoninbas mentioned this issue Nov 25, 2019

Fix Kind support (Linux hosts only) #137

Merged

antoninbas closed this as completed in #137 Nov 27, 2019

antoninbas mentioned this issue Nov 28, 2019

Support Kind on macOS hosts #155

Closed

bertpersyn mentioned this issue Jun 4, 2021

Antrea on k3d + k3s #2238

Closed

yzaccc mentioned this issue Jan 11, 2022

Antrea-proxy K8sbykeshed/k8s-service-validator#86

Merged
