Summary of networking issues found with NF testing #44

Closed · 4 tasks done
lixuna opened this issue Aug 14, 2018 · 5 comments

lixuna commented Aug 14, 2018


Write up on networking issues with NF testing, including but not limited to:

  • Network configuration requirements for Packet
  • Networking requirements for VPP/DPDK NFs, along with the benefits of those specific requirements
  • Create a Google Doc with the draft summary
  • Add the issues found to this ticket

lixuna added this to To do in CNF Edge Throughput via automation Sep 12, 2018
lixuna added this to the CNF Edge Throughput milestone Sep 12, 2018

taylor commented Sep 12, 2018

Network configuration requirements for Packet

For the initial box-by-box benchmark and comparison we are only interested in the performance of individual VNFs and CNFs, with a focus on data plane performance (throughput) and memory usage (resident set size, RSS).

For these tests the data plane network should be as simple as possible, which can be realized by attaching VFs (Virtual Functions) directly to the VNF or CNF being tested. The traffic generator (Pktgen) runs on a separate instance and can be connected via either PFs (Physical Functions) or VFs depending on the network configuration provided by Packet. Given that the current configuration runs both the data plane and the management / external networks through the same NIC, the connections will likely be based on VFs created from a single port / PF, as the other port will be handling the management and external networks.

Below is a small diagram showing how this implementation can be realized using two Packet instances. Note that the data plane network will need to be configured with a VLAN to act as an L2 connection between instances.

[Diagram: data plane implementation using two Packet instances]

The main requirement for this to work is that the necessary flags for SR-IOV are set in the BIOS (it should be possible to configure this via the Packet.net customer portal).
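As a rough illustration of the VF setup described above, here is a minimal sketch that creates VFs from a PF via the kernel sysfs interface and tags them for the data plane VLAN. The PF name, VF count, and VLAN ID are hypothetical placeholders, and depending on how Packet exposes the layer 2 network the VLAN tagging may instead be handled on the switch side rather than per VF.

```python
# Minimal sketch, not the actual provisioning used in the tests: create VFs
# from a PF through sysfs and tag them for the data plane VLAN.
# PF name, VF count, and VLAN ID below are hypothetical.
import subprocess
from pathlib import Path

PF = "eno1"       # hypothetical PF carrying the data plane
NUM_VFS = 2       # e.g. one VF for the VNF/CNF under test, one spare
VLAN_ID = 1000    # hypothetical VLAN acting as the L2 connection between instances

dev = Path(f"/sys/class/net/{PF}/device")
print("max VFs supported:", (dev / "sriov_totalvfs").read_text().strip())

# The kernel creates the VFs when a count is written to sriov_numvfs.
# (Write 0 first if VFs already exist; the value cannot be changed directly.)
(dev / "sriov_numvfs").write_text(str(NUM_VFS))

# Tag each VF with the data plane VLAN (iproute2, requires root).
for vf in range(NUM_VFS):
    subprocess.run(
        ["ip", "link", "set", "dev", PF, "vf", str(vf), "vlan", str(VLAN_ID)],
        check=True,
    )
```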

There are other configurations for the “System Under Test” instance that involve the use of VPP between the NIC and the VNFs and CNFs, but this only changes the software requirements and should not change the requirements for Packet.

lixuna moved this from To do to Done in CNF Edge Throughput Sep 12, 2018

taylor commented Sep 12, 2018

Network configuration requirements for fd.io

The requirements for fd.io are very similar to those for Packet. The biggest difference is seen in the connections between instances, as the fd.io CSIT testbeds have NICs dedicated to data plane traffic using point-to-point connections, which removes the need for configuring the data plane network.

The diagram below shows the configuration that has been used for benchmarks.

[Diagram: fd.io CSIT testbed configuration used for benchmarks]

By default the testbeds don’t fully support IOMMU. This can be fixed by enabling Intel VT for Directed I/O (VT-d) in the BIOS (listed under Chipset -> North Bridge -> IIO Configuration).
Details about the testbeds and network connections are available through CSIT.
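For reference, a minimal sketch (assuming a standard Linux testbed host) that checks whether VT-d/IOMMU is actually active after the BIOS change; depending on the distribution, intel_iommu=on may also need to be added to the kernel command line.

```python
# Minimal sketch: verify that the IOMMU (Intel VT-d) is active on the host.
# Uses only standard Linux kernel interfaces.
from pathlib import Path

cmdline = Path("/proc/cmdline").read_text()
print("intel_iommu=on on kernel cmdline:", "intel_iommu=on" in cmdline)

# With VT-d enabled in the BIOS and the kernel, PCI devices are grouped under
# /sys/kernel/iommu_groups; an empty (or missing) directory means the IOMMU
# is not active.
groups_dir = Path("/sys/kernel/iommu_groups")
groups = list(groups_dir.iterdir()) if groups_dir.exists() else []
print("IOMMU groups found:", len(groups))
```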


taylor commented Sep 12, 2018

Networking requirements for VPP/DPDK NFs

Focusing on the “System Under Test” instance, there are several ways that this can be configured to support multiple VNFs and CNFs. Examples of how this can be done are shown in the diagram below.

[Diagram: example network configurations for VNFs and CNFs on the “System Under Test” instance]

  • Starting at the top, the connections between the NIC (VF) and VPP (PMD) have a dependency on the hardware being used, more specifically the type of NIC. The PMD (Poll Mode Driver) varies depending on the NIC, but PMDs for both Intel and Mellanox (ConnectX-4) have been installed and should work (not yet tested with traffic on instances provided by Packet).
  • There is a high probability that PMDs will have to be compiled using DPDK before they can be used by VPP. For Mellanox in particular this involves a few extra steps.
  • Going to the VNF, it is possible to create a connection using a Vhost interface in VPP, which is passed to the VM where the guest will see it as a “Virtio PCI” device.
  • For the CNF there are two ways of creating network connections to VPP. Both rely on having resources shared between the host and the guest, where packets can be shared and accessed by both VPP and the CNF.
    • Vhost - Vhost: Similar to the method used for VNFs. The difference here is that the CNF will have to use the vhost interface directly, whereas the VNF has it mapped to a “Virtio PCI” device
    • Memif - Memif: A relatively new implementation that is quite similar to Vhost, but with the potential for higher performance.

Most of these connections have been partially tested on Packet hardware.
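As a small helper (a sketch, not part of the test framework), the snippet below lists Ethernet-class PCI devices with their vendor and currently bound kernel driver, which is a quick way to see whether the Intel or Mellanox PMD applies on a given instance. Note that Mellanox NICs normally stay bound to their kernel driver (the mlx PMD works on top of it), while Intel NICs are typically rebound to vfio-pci or igb_uio for DPDK.

```python
# Minimal sketch: list Ethernet-class PCI devices with vendor and bound driver.
# Vendor IDs: 0x8086 = Intel, 0x15b3 = Mellanox.
from pathlib import Path

VENDORS = {"0x8086": "Intel", "0x15b3": "Mellanox"}

for dev in sorted(Path("/sys/bus/pci/devices").iterdir()):
    pci_class = (dev / "class").read_text().strip()
    if not pci_class.startswith("0x02"):  # 0x02xxxx = network controller class
        continue
    vendor = (dev / "vendor").read_text().strip()
    driver = dev / "driver"
    bound = driver.resolve().name if driver.exists() else "none"
    print(f"{dev.name}  vendor={VENDORS.get(vendor, vendor)}  driver={bound}")
```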


taylor commented Sep 12, 2018

Issues seen during deployment and testing

Initial deployments were done on a single “all-in-one” instance, meaning both the traffic generator and the NF were running side by side. The data plane network was implemented using the default bridge implementations available in the frameworks used for virtualization: Vagrant (libvirt) for VMs and Docker for containers. Both of these work in a similar fashion, as can be seen in the diagram below.

[Diagram: all-in-one instance using the default host bridges for the data plane network]

While both of these deployments did work, the amount of traffic that can be handled by these host bridges is very limited, to the point where the VNFs/CNFs would only be utilizing a few percent of their available resources. Variations based on these configurations were also tested, e.g. using TCP tunnels between the traffic generator and NF, but the results were similar to what was observed using the host bridges.
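For reference, a minimal sketch of the Docker side of this setup, using hypothetical image and container names; the Vagrant/libvirt case follows the same pattern with a libvirt-managed bridge. This is a reconstruction under stated assumptions rather than the exact commands used in the tests.

```python
# Minimal sketch (hypothetical image/container names): both containers sit on
# the default Docker bridge (docker0), so all packets between the traffic
# generator and the CNF cross the host bridge -- the bottleneck described above.
import subprocess

def docker(*args):
    subprocess.run(["docker", *args], check=True)

docker("run", "-d", "--name", "trafficgen", "trafficgen-image:latest")
docker("run", "-d", "--name", "cnf-example", "cnf-image:latest")
```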

A different approach using an “all-in-one” instance was also tested, this one using VPP as the data plane network inside a single instance. The diagram below shows the configuration differences when testing either VNFs or CNFs.

[Diagram: all-in-one configurations using VPP as the data plane network, shown for VNF and CNF testing]

The traffic generator is deployed as a VNF in both scenarios, as it currently only supports attaching to PCI devices, which is done through the Vhost to “Virtio PCI” mapping that happens in the VM. This solution removes the bottleneck that was seen previously with host bridges.

However, this solution is also not ideal, as the traffic generator only supports a single queue per “Virtio PCI” interface, which limits the number of CPU cores that can be used to one per interface, or two in total with the configuration used. While the throughput is several times higher compared to the bridge configuration, it is still too low to fully utilize the resources available to the VNFs/CNFs.
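For context, virtio interfaces can expose multiple queues when the vhost backend is configured for it; the sketch below (interface name is a hypothetical example) shows one way to see what a given interface offers from inside the guest, before it is bound to DPDK. The single-queue constraint described above comes from the traffic generator itself, regardless of what the device offers.

```python
# Minimal sketch: count RX/TX queues exposed by an interface inside the guest.
# A single rx-0/tx-0 pair means only one core can poll that interface.
from pathlib import Path

IFACE = "eth1"  # hypothetical data plane interface inside the traffic generator VM

queues = Path(f"/sys/class/net/{IFACE}/queues")
rx = len(list(queues.glob("rx-*")))
tx = len(list(queues.glob("tx-*")))
print(f"{IFACE}: {rx} RX queue(s), {tx} TX queue(s)")
```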

List of issues with references

  • Throughput bottleneck using host bridge (and TCP tunnel) for data plane network

  • Throughput bottleneck using “All-in-one” instance with VPP

  • VPP doesn’t work with Mellanox ConnectX-4 (Workaround found)

  • SSH broken in newer versions of the Ubuntu box provided by Vagrant

References:

  1. https://community.mellanox.com/docs/DOC-2386
  2. https://community.mellanox.com/docs/DOC-2729
