Merge branch 'packetmill'
tbarbette committed Apr 21, 2021
2 parents 828d726 + 8d02cd9 commit e0a81f3
Showing 14 changed files with 226 additions and 57 deletions.
87 changes: 62 additions & 25 deletions README.md
@@ -1,38 +1,75 @@
[FastClick](https://www.fastclick.dev) ![CI](https://github.com/tbarbette/fastclick/workflows/C/C++%20CI/badge.svg)
=========
This is an extended version of the Click Modular Router featuring an
improved Netmap support and a new DPDK support. It was the result of
our ANCS paper available at http://hdl.handle.net/2268/181954, but received
multiple contributions and improvements since then.
# FastClick (PacketMill)

The [Wiki](https://github.com/tbarbette/fastclick/wiki) provides documentation about the elements and how to use some FastClick features
such as batching.
This repo is a modified FastClick that comes with some additional source-code optimization techniques to improve the performance of high-speed packet processing. Additionally, it implements different metadata management models, i.e., copying, overlaying, and X-Change.

Announcements
-------------
Be sure to watch the repository and check out the [GitHub Discussions](https://github.com/tbarbette/fastclick/discussions) to stay up to date!
For more information, please refer to PacketMill's [paper][packetmill-paper] and [repo][packetmill-repo].

Quick start for DPDK
--------------------

* Install DPDK's dependencies (`sudo apt install libelf-dev build-essential pkg-config zlib1g-dev libnuma-dev`)
* Install DPDK (http://core.dpdk.org/doc/quick-start/). Since 20.11 you have to use meson : `meson build && cd build && ninja && sudo ninja install`
* Build FastClick, with support for DPDK using the following command:
## Source-code Modifications

```
./configure --enable-dpdk --enable-intel-cpu --verbose --enable-select=poll CFLAGS="-O3" CXXFLAGS="-std=c++11 -O3" --disable-dynamic-linking --enable-poll --enable-bound-port-transfer --enable-local --enable-flow --disable-task-stats --disable-cpu-load
PacketMill performs multiple source-code optimizations that exploit the information available in a given NF configuration file. We have implemented these optimizations on top of `click-devirtualize`.

To use these optimizations, you need to build and install FastClick as follows:


```bash
git clone --branch packetmill git@github.com:tbarbette/fastclick.git
cd fastclick
./configure --disable-linuxmodule --enable-userlevel --enable-user-multithread --enable-etherswitch --disable-dynamic-linking --enable-local --enable-dpdk --enable-research --enable-flow --disable-task-stats --enable-cpu-load --prefix $(pwd)/build/ --enable-intel-cpu CXX="clang++ -fno-access-control" CC="clang" CXXFLAGS="-std=gnu++14 -O3" --disable-bound-port-transfer --enable-dpdk-pool --disable-dpdk-packet
make
sudo make uninstall
sudo make install
```
* Since DPDK uses Meson and pkg-config, to compile against a different or non-globally installed DPDK version, prepend `PKG_CONFIG_PATH=path/to/libdpdk.pc/../` to both configure and make.

*You will find more information in the [High-Speed I/O wiki page](https://github.com/tbarbette/fastclick/wiki/High-speed-I-O).*
You need to define `RTE_SDK` and `RTE_TARGET` before configuring FastClick.

**Note: PacketMill's [repo][packetmill-repo] offers an automated pipeline/workflow for building/installing PacketMill (FastClick + X-Change) and performing some experiments related to PacketMill's [paper][packetmill-paper].**

### Devirtualize Pass

FastClick "Light"
-----------------
FastClick, like Click, comes with a lot of features that you may not use. The following options will improve performance further:
This optimization pass removes virtual function calls from elements' source code based on the input configuration file. More specifically, it duplicates the source code, overrides some methods, and defines the types of the called pointers and the called functions. The following code snippet shows the source code of `FromDPDKDevice::output_push` generated by `click-devirtualize` after applying the pass.

```cpp
inline void
FromDPDKDevice_a_afd0::output_push(int i, Packet *p) const
{
if (i == 0) { ((Classifier_a_aFNT_a3_sc0 *)output(i).element())->Classifier_a_aFNT_a3_sc0::push(0, p); return; }
output(i).push(p);
}
```
./configure --enable-dpdk --enable-intel-cpu --verbose --enable-select=poll CFLAGS="-O3" CXXFLAGS="-std=c++11 -O3" --disable-dynamic-linking --enable-poll --enable-bound-port-transfer --enable-local --enable-flow --disable-task-stats --disable-cpu-load --enable-dpdk-packet --disable-clone --disable-dpdk-softqueue
make
The generated code calls the downstream element directly, whereas the normal FastClick code dispatches through `output(port).push_batch(batch);`. This pass was originally introduced by `click-devirtualize`. We have adopted it and modified it to work with FastClick. For more information, please check the click-devirtualize [paper][devirtualize-paper].
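To make the difference concrete, here is a minimal, self-contained sketch of the idea. `Packet`, `Element`, and `Counter` are hypothetical stand-ins written for this illustration, not the real Click classes, and the function names are invented; the actual code produced by `click-devirtualize` looks like the snippet above.

```cpp
#include <cassert>

// Hypothetical stand-ins for Click's Packet and Element (illustration only).
struct Packet { int hops = 0; };

struct Element {
    virtual ~Element() = default;
    virtual void push(int port, Packet *p) = 0;
};

struct Counter final : Element {
    int seen = 0;
    void push(int, Packet *p) override { ++seen; ++p->hops; }
};

// Generic path: the callee is only known at run time, so the call goes
// through the vtable.
inline void output_push_virtual(Element *next, Packet *p) {
    assert(next != nullptr);
    next->push(0, p);
}

// Devirtualized path: the configuration tells us the downstream element is
// a Counter, so we cast to the concrete type and use a qualified call,
// which bypasses virtual dispatch and lets the compiler inline the callee.
inline void output_push_devirtualized(Element *next, Packet *p) {
    assert(next != nullptr);
    static_cast<Counter *>(next)->Counter::push(0, p);
}
```

Both functions have the same observable effect; the second is only valid when the configuration guarantees the concrete type of the downstream element, which is why the pass needs the configuration file as input.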
To use this pass, run the following commands:
**Note that you need to compile & install FastClick before applying any pass.**
```bash
sudo bin/click-devirtualize CONFIG > package.uo
ar x package.uo config
cd userlevel
sudo ../bin/click-mkmindriver -V -C .. -p embed --ship -u ../package.uo
make embedclick MINDRIVER=embed STATIC=0
```

After building `embedclick`, you can run the new click binary with the new configuration file. For instance, you can run the following command:

```bash
sudo userlevel/embedclick --dpdk -l 0-35 -n 6 -w 0000:17:00.0 -v -- config
```

To see the changes to the source code, you can check `clickdv*.cc` and `clickdv*.hh` files in `userlevel/`.

### Replace Pass

This pass replaces variables with their values available in the configuration file. For instance, the following code snippet shows part of the `FromDPDKDevice::run_task` function. The replace pass substitutes the `_burst` variable with its value, `32`, which is specified in the input configuration file.

```cpp
// FastClick
unsigned n = rte_eth_rx_burst(_dev->port_id, iqueue, pkts, _burst);

// click-devirtualize + replace pass
unsigned n = rte_eth_rx_burst(_dev->port_id, iqueue, pkts, 32);
```
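The effect of this substitution can be sketched in isolation as follows. Everything here is hypothetical and written for illustration: `fake_rx_burst` is a stand-in for a receive function (it is not a DPDK call), and `Receiver` / `ReceiverReplaced` are invented names that mimic an element before and after the pass.

```cpp
#include <cassert>

// Hypothetical receive function: writes `burst` packet ids into `out`.
inline unsigned fake_rx_burst(int *out, unsigned burst) {
    assert(out != nullptr);
    for (unsigned i = 0; i < burst; ++i)
        out[i] = static_cast<int>(i);
    return burst;
}

// Before the pass: the burst size is a member that is loaded at run time.
struct Receiver {
    unsigned _burst;
    unsigned poll(int *out) const { return fake_rx_burst(out, _burst); }
};

// After the pass: the member access is replaced by the literal taken from
// the configuration, so the loop bound is a compile-time constant.
struct ReceiverReplaced {
    unsigned poll(int *out) const { return fake_rx_burst(out, 32); }
};
```

With a constant bound, the compiler is free to unroll or vectorize the loop and no longer needs to load `_burst` from the element on every call. Whether it actually does so depends on the compiler and flags.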
* Disabling task stats suppresses statistics tracking for advanced task scheduling with e.g. BalancedThreadSched. With DPDK, it's polling anyway... And as far as scheduling is concerned, RSS++ has a better solution.
* Disabling CPU load removes load tracking, i.e., accounting for a CPU percentage while using DPDK by counting cycles spent in empty runs vs. all runs. It is accessible with the "load" handler.
4 changes: 4 additions & 0 deletions click-buildtool.in
@@ -34,6 +34,10 @@ elem2=""
default_provisions="@provisions@"
driver_provisions="@DRIVERS@"

if [ -z "$CLICK_ELEM_RAND_MAX" ] ; then
CLICK_ELEM_RAND_MAX=0
fi

export LC_COLLATE=C
trap "exit 1" HUP

3 changes: 3 additions & 0 deletions config-userlevel.h.in
@@ -274,6 +274,9 @@
/* Define if a Click user-level driver manages Intel DPDK packet pools. */
#undef CLICK_PACKET_USE_DPDK

/* Define if a Click packet is inside DPDK packet. */
#undef CLICK_PACKET_INSIDE_DPDK

/* Define if Click should use Valgrind client requests. */
#undef HAVE_VALGRIND

3 changes: 3 additions & 0 deletions config.h.in
@@ -144,6 +144,9 @@
/* Define if the RE2 library is present. */
#undef HAVE_RE2

/* Define if elements have random alignment. */
#undef HAVE_RAND_ALIGN

/* Define if you want to use the stride scheduler. */
#undef HAVE_STRIDE_SCHED

19 changes: 19 additions & 0 deletions configure
@@ -725,6 +725,7 @@ minios_dir
xen_dir
INCLUDE_KSYMS
LINUXMODULE_FIXINCLUDES
DO_RAND_ALIGN
PTHREAD_LIBS
HAVE_BATCH
AR_CREATEFLAGS
@@ -820,6 +821,7 @@ enable_batch
enable_verbose_batch
enable_auto_batch
enable_netmap_pool
enable_rand_align
enable_select
enable_poll
enable_kqueue
@@ -1553,6 +1555,7 @@ Optional Features:
--enable-auto-batch=[list|jump|port]
make vanilla elements batch-compatible automatically
--enable-netmap-pool use netmap buffers instead of standard Click buffers
--enable-rand-align enable random alignment of element
--enable-select=[select|poll|kqueue]
set file descriptor wait mechanism
--disable-select do not use select()
@@ -8473,6 +8476,22 @@ else
have_pci=no
fi

# Check whether --enable-rand-align was given.
if test "${enable_rand_align+set}" = set; then :
enableval=$enable_rand_align; :
else
enable_rand_align=no
fi

if test "x$enable_rand_align" = "xyes"; then
$as_echo "#define HAVE_RAND_ALIGN 1" >>confdefs.h

DO_RAND_ALIGN=1

else
DO_RAND_ALIGN=0

fi

# Check whether --enable-select was given.
if test "${enable_select+set}" = set; then :
9 changes: 9 additions & 0 deletions configure.in
@@ -278,6 +278,15 @@ else
have_pci=no
fi

AC_ARG_ENABLE([rand-align],
[AS_HELP_STRING([ --enable-rand-align], [enable random alignment of element])],
[:], [enable_rand_align=no])
if test "x$enable_rand_align" = "xyes"; then
AC_DEFINE([HAVE_RAND_ALIGN])
AC_SUBST(DO_RAND_ALIGN,1)
else
AC_SUBST(DO_RAND_ALIGN,0)
fi

AC_ARG_ENABLE([select],
[AS_HELP_STRING([ --enable-select=[[select|poll|kqueue]]], [set file descriptor wait mechanism])
61 changes: 40 additions & 21 deletions elements/userlevel/fromdpdkdevice.cc
@@ -322,36 +322,55 @@ void FromDPDKDevice::clear_buffers() {
click_chatter("Cleared %d buffers for queue %d",tot,q);
}
}
#ifdef DPDK_USE_XCHG
extern "C" {
#include <mlx5_xchg.h>
}
#endif

bool FromDPDKDevice::run_task(Task *t)
{
struct rte_mbuf *pkts[_burst];
int ret = 0;
bool FromDPDKDevice::run_task(Task *t) {
struct rte_mbuf *pkts[_burst];
int ret = 0;

for (int iqueue = queue_for_thisthread_begin();
iqueue<=queue_for_thisthread_end(); iqueue++) {
int iqueue = queue_for_thisthread_begin();
{ //This version differs from multi by having support for one queue per thread only, which is extremly usual
#if HAVE_BATCH
PacketBatch* head = 0;
WritablePacket *last;
PacketBatch *head = 0;
WritablePacket *last;
#endif
unsigned n = rte_eth_rx_burst(_dev->port_id, iqueue, pkts, _burst);
for (unsigned i = 0; i < n; ++i) {
unsigned char* data = rte_pktmbuf_mtod(pkts[i], unsigned char *);
rte_prefetch0(data);

#ifdef DPDK_USE_XCHG
unsigned n = rte_mlx5_rx_burst_xchg(_dev->port_id, iqueue, (struct xchg**)pkts, _burst);
#else
unsigned n = rte_eth_rx_burst(_dev->port_id, iqueue, pkts, _burst);
#endif

for (unsigned i = 0; i < n; ++i) {
unsigned char *data = rte_pktmbuf_mtod(pkts[i], unsigned char *);
rte_prefetch0(data);
#if CLICK_PACKET_USE_DPDK
WritablePacket *p = static_cast<WritablePacket *>(Packet::make(pkts[i], _clear));
#elif HAVE_ZEROCOPY
WritablePacket *p = Packet::make(data,
rte_pktmbuf_data_len(pkts[i]),
DPDKDevice::free_pkt,
pkts[i],
rte_pktmbuf_headroom(pkts[i]),
rte_pktmbuf_tailroom(pkts[i]),
_clear
);

# if CLICK_PACKET_INSIDE_DPDK
WritablePacket *p =(WritablePacket*)( pkts[i] + 1);
new (p) WritablePacket();

p->initialize(_clear);
p->set_buffer((unsigned char*)(pkts[i]->buf_addr), DPDKDevice::MBUF_DATA_SIZE);
p->set_data(data);
p->set_data_length(rte_pktmbuf_data_len(pkts[i]));
p->set_buffer_destructor(DPDKDevice::free_pkt);

p->set_destructor_argument(pkts[i]);
# else
WritablePacket *p = Packet::make(
data, rte_pktmbuf_data_len(pkts[i]), DPDKDevice::free_pkt, pkts[i],
rte_pktmbuf_headroom(pkts[i]), rte_pktmbuf_tailroom(pkts[i]), _clear);
# endif
#else
WritablePacket *p = Packet::make(data,
(uint32_t)rte_pktmbuf_pkt_len(pkts[i]), _clear);
(uint32_t)rte_pktmbuf_pkt_len(pkts[i]));
rte_pktmbuf_free(pkts[i]);
data = p->data();
#endif
29 changes: 22 additions & 7 deletions elements/userlevel/todpdkdevice.cc
@@ -223,6 +223,13 @@ void ToDPDKDevice::run_timer(Timer *)
flush_internal_tx_queue(_iqueues.get());
}

# ifdef DPDK_USE_XCHG
extern "C" {
# include <mlx5_xchg.h>
}
# endif


/* Flush as much as possible packets from a given internal queue to the DPDK
* device. */
void ToDPDKDevice::flush_internal_tx_queue(DPDKDevice::TXInternalQueue &iqueue) {
@@ -245,8 +252,15 @@ void ToDPDKDevice::flush_internal_tx_queue(DPDKDevice::TXInternalQueue &iqueue)
// The sub_burst wraps around the ring
sub_burst = _internal_tx_queue_size - iqueue.index;
//Todo : if there is multiple queue assigned to this thread, send on all of them


# ifdef DPDK_USE_XCHG
r = rte_mlx5_tx_burst_xchg(_dev->port_id, queue_for_thisthread_begin(),(struct xchg**) &iqueue.pkts[iqueue.index], sub_burst);
# else
r = rte_eth_tx_burst(_dev->port_id, queue_for_thisthread_begin(), &iqueue.pkts[iqueue.index],
sub_burst);
sub_burst);
# endif

iqueue.nr_pending -= r;
iqueue.index += r;

@@ -302,12 +316,12 @@ void ToDPDKDevice::push(int, Packet *p)
// If we're in blocking mode, we loop until we can put p in the iqueue
} while (unlikely(_blocking && congestioned));

#if !CLICK_PACKET_USE_DPDK
# if !CLICK_PACKET_USE_DPDK && !CLICK_PACKET_INSIDE_DPDK
if (likely(is_fullpush()))
p->kill_nonatomic();
else
p->kill();
#endif
# endif
}

/**
@@ -327,7 +341,7 @@ void ToDPDKDevice::push_batch(int, PacketBatch *head)

//No recycling through click if we have DPDK-backed packets
bool congestioned;
# if !CLICK_PACKET_USE_DPDK
# if !CLICK_PACKET_USE_DPDK && !CLICK_PACKET_INSIDE_DPDK
BATCH_RECYCLE_START();
# endif
do {
@@ -344,7 +358,8 @@ void ToDPDKDevice::push_batch(int, PacketBatch *head)
abort();
}
next = p->next();
# if !CLICK_PACKET_USE_DPDK

# if !CLICK_PACKET_USE_DPDK && !CLICK_PACKET_INSIDE_DPDK
BATCH_RECYCLE_PACKET_CONTEXT(p);
# endif
p = next;
@@ -373,7 +388,7 @@ void ToDPDKDevice::push_batch(int, PacketBatch *head)
// If we're in blocking mode, we loop until we can put p in the iqueue
} while (unlikely(_blocking && congestioned));

# if !CLICK_PACKET_USE_DPDK
# if !CLICK_PACKET_USE_DPDK && !CLICK_PACKET_INSIDE_DPDK
//If non-blocking, drop all packets that could not be sent
while (p) {
next = p->next();
@@ -383,7 +398,7 @@ void ToDPDKDevice::push_batch(int, PacketBatch *head)
}
# endif

# if !CLICK_PACKET_USE_DPDK
# if !CLICK_PACKET_USE_DPDK && !CLICK_PACKET_INSIDE_DPDK
BATCH_RECYCLE_END();
# endif
}