Add new program 'firehose' #557

Merged
merged 2 commits into from Aug 4, 2015


@lukego
Member
lukego commented Jul 10, 2015

This is a program that @pavel-odintsov and I have been working on this week. It is a specially optimized program that picks packets up off the wire and passes them to a simple C callback function. We are exploring it as a possibly faster way to get packets into fastnetmon. Hopefully it is also the easiest way in the universe to quickly attach a C function to an arbitrarily large amount of traffic.

On Interlaken I have set up a 40G / 41 Mpps load test (96-byte SYN flood). For simple packet counting I see full line rate with one 2.4 GHz core, and I suspect there is a lot more capacity than this. For basic packet header parsing/analysis based on fastnetmon code I see 34 Mpps. It's really fast.
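To put those rates in perspective, here is a back-of-the-envelope helper for the per-packet cycle budget. This is plain arithmetic on the figures quoted above, not an additional measurement, and the function name is made up for illustration.

```c
/* Per-packet cycle budget: how many clock cycles one core can spend on
 * each packet if it has to keep up with a given packet rate.
 * (Hypothetical helper, not part of firehose.) */
double cycles_per_packet(double core_hz, double packets_per_second)
{
    return core_hz / packets_per_second;
}
```

With a 2.4 GHz core this works out to roughly 58.5 cycles per packet at 41 Mpps and roughly 70.6 cycles per packet at 34 Mpps.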

The trick is that we statically initialize all packet buffers and DMA descriptors so that new packets can be received with just a few instructions.
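As a rough illustration of that idea, here is a minimal sketch of a statically initialized receive ring. Everything in it is an assumption made for the example: the descriptor layout, the names `rx_desc`, `ring_init` and `ring_receive`, and the tiny ring size. It is not the code in this PR, and a real driver would store DMA (physical) addresses and let the NIC write the descriptors.

```c
#include <stdint.h>

/* Hypothetical 16-byte receive descriptor (layout assumed for illustration).
 * The NIC is expected to fill in `length` and set bit 0 of `status` once a
 * packet has been written to the buffer at `address`. */
struct rx_desc {
    uint64_t address;   /* packet buffer address (physical in a real driver) */
    uint16_t length;    /* packet length, written by the NIC */
    uint16_t checksum;
    uint8_t  status;    /* bit 0: descriptor done */
    uint8_t  errors;
    uint16_t vlan;
};

enum { RING_SIZE = 8, BUFFER_SIZE = 2048 };  /* ring size is a power of two */

static char buffers[RING_SIZE][BUFFER_SIZE];
static struct rx_desc ring[RING_SIZE];

/* One-time setup: descriptor i is bound to buffer i and never changes. */
void ring_init(void)
{
    for (int i = 0; i < RING_SIZE; i++) {
        ring[i].address = (uint64_t)(uintptr_t)buffers[i];
        ring[i].status = 0;
    }
}

/* Receive path: test the done bit, hand out the buffer in place, clear the
 * bit, advance the index. No allocation and no descriptor rewriting.
 * Returns the packet length, or -1 if no packet is waiting at *index. */
int ring_receive(int *index, char **data)
{
    struct rx_desc *d = &ring[*index];
    if (!(d->status & 1))
        return -1;
    *data = buffers[*index];
    d->status = 0;
    *index = (*index + 1) & (RING_SIZE - 1);
    return d->length;
}
```

Because each buffer stays attached to its descriptor, the per-packet work reduces to a flag test, a flag reset and an index increment, which is what keeps the receive path down to a handful of instructions.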

README:

  Usage: firehose [OPTION]... <callback.so>

  Firehose: Execute C callback functions directly on live traffic.

  The callback function is provided as a shared library (.so file) and
  is called with each packet received from one or more 10-Gigabit (Intel
  82599) network interfaces.

  Receive overhead is very low. firehose can process tens of millions of
  packets per second with one CPU core.

  Print the example (-e option) and header file (-H option) for
  instructions on how to write a callback library.

  Options:

    -H, --print-header   Print a copy of firehose.h.
                         You need this to compile your callback library.
    -e, --example        Print instructions for compiling an example library.
    -i PCI, --input PCI  Receive packets from port with PCI address.
    -t SECONDS, --time SECONDS

Example usage:

  Instructions for creating an example program for firehose.

  This example assumes that you have the 'firehose' executable in your
  path. If you are using the 'snabb' executable then use a syntax like
  'snabb firehose' or './snabb firehose' instead as appropriate.

  Step 1: Create firehose.h

    Create firehose.h with this command:
    $ firehose -H > firehose.h

  Step 2: Create firehose_example.c

    Create the file firehose_example.c with these contents:

      #include <stdio.h>
      #include "firehose.h" // generated by 'firehose -H'

      static int packets;
      void firehose_start() { printf("Starting\n"); }
      void firehose_stop()  { printf("Stopping after %d packets\n", packets); }
      void firehose_packet(const char *pci, char *data, int length) { packets++; }

  Step 3: Compile the callback library

    Compile the firehose_example.so callback shared library:
    $ gcc -O2 -fPIC -shared -o firehose_example.so firehose_example.c

  Step 4: Run the example on one or more 10G ports

    Identify the PCI addresses of the (Intel 82599) network ports that
    you want to test with and run firehose:

    $ sudo firehose -i 0000:01:00.0 -i 0000:01:00.1 \
                    -i 0000:02:00.0 -i 0000:02:00.1 \
                    -t 1 ./firehose_example.so
    Loading shared object: ./firehose_example.so
    Initializing NIC: 0000:01:00.0
    Initializing NIC: 0000:01:00.1
    Initializing NIC: 0000:02:00.0
    Initializing NIC: 0000:02:00.1
    Initializing callback library
    Starting
    Processing traffic...
    Stopping after 41136315 packets

    which shows 41 million packets being processed in one
    second. (This is not the performance limit: that was the total
    traffic being received on the links for this example.)
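The example callback above only counts packets. As a sketch of what a slightly richer callback might look like, the version below also counts TCP SYN packets, using the same three entry points as the README example. The parsing logic is an illustration written for this note, not the fastnetmon code: it assumes untagged Ethernet carrying IPv4 and ignores VLAN tags and IP fragments. In a real callback library you would also `#include "firehose.h"` as in the README; it is left out here so the sketch compiles on its own.

```c
#include <stdint.h>
#include <stdio.h>

static uint64_t packets;      /* all packets seen */
static uint64_t syn_packets;  /* TCP packets with SYN set and ACK clear */

void firehose_start(void) { packets = syn_packets = 0; }

void firehose_stop(void)
{
    printf("Stopping after %llu packets (%llu TCP SYN)\n",
           (unsigned long long)packets, (unsigned long long)syn_packets);
}

void firehose_packet(const char *pci, char *data, int length)
{
    const uint8_t *p = (const uint8_t *)data;
    (void)pci;  /* port address is not used in this sketch */
    packets++;

    /* Ethernet (14) + minimal IPv4 (20) + minimal TCP (20) headers. */
    if (length < 14 + 20 + 20) return;
    if (p[12] != 0x08 || p[13] != 0x00) return;      /* EtherType: IPv4 */
    if ((p[14] >> 4) != 4) return;                   /* IP version 4 */

    int ihl = (p[14] & 0x0f) * 4;                    /* IPv4 header length */
    if (ihl < 20 || length < 14 + ihl + 20) return;
    if (p[14 + 9] != 6) return;                      /* IP protocol: TCP */

    uint8_t flags = p[14 + ihl + 13];                /* TCP flags byte */
    if ((flags & 0x12) == 0x02)                      /* SYN set, ACK clear */
        syn_packets++;
}
```

It should build with the same gcc command line as in Step 3, with the file name adjusted.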
@lukego lukego Add new program 'firehose' 3100960
@lukego
Member
lukego commented Jul 10, 2015

The whole inner loop for the example program is only 12 instructions, with no branch except for the loop exit condition. Each iteration checks the next descriptor in the receive queue, passes the new packet to the callback (which bumps a counter), and resets the descriptor, continuing until no packets are left.

I like the way this design made it possible to inline all of that C code into one place, much like what LuaJIT does automatically with its tracing JIT :-).

Cool to browse the disassembly.

Function entry and loop setup:

00000000000007d0 <firehose_callback_v1>:
 7d0:   49 63 c0                movslq %r8d,%rax
 7d3:   48 c1 e0 04             shl    $0x4,%rax
 7d7:   48 8d 3c 02             lea    (%rdx,%rax,1),%rdi
 7db:   f6 47 0c 01             testb  $0x1,0xc(%rdi)
 7df:   74 41                   je     822 <firehose_callback_v1+0x52>
 7e1:   8b 05 5d 08 20 00       mov    0x20085d(%rip),%eax        # 201044 <packets>
 7e7:   83 e9 01                sub    $0x1,%ecx
 7ea:   44 8d 48 01             lea    0x1(%rax),%r9d
 7ee:   66 90                   xchg   %ax,%ax

Inner loop:

 7f0:   41 83 c0 01             add    $0x1,%r8d
 7f4:   41 21 c8                and    %ecx,%r8d
 7f7:   49 63 c0                movslq %r8d,%rax
 7fa:   4c 8b 14 c6             mov    (%rsi,%rax,8),%r10
 7fe:   48 c1 e0 04             shl    $0x4,%rax
 802:   41 0f 18 0a             prefetcht0 (%r10)
 806:   c6 47 0c 00             movb   $0x0,0xc(%rdi)
 80a:   48 8d 3c 02             lea    (%rdx,%rax,1),%rdi
 80e:   45 89 ca                mov    %r9d,%r10d
 811:   41 83 c1 01             add    $0x1,%r9d
 815:   f6 47 0c 01             testb  $0x1,0xc(%rdi)
 819:   75 d5                   jne    7f0 <firehose_callback_v1+0x20>

Function return:

 81b:   44 89 15 22 08 20 00    mov    %r10d,0x200822(%rip)        # 201044 <packets>
 822:   44 89 c0                mov    %r8d,%eax
 825:   c3                      retq   
 826:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 82d:   00 00 00 
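For readers who prefer C to assembly, here is a sketch of a loop with the same shape as the disassembly above. The argument order, the 16-byte descriptor with its status byte at offset 12, and all of the names (`rx_desc`, `on_packet`, `drain_ring`) are inferred from the register usage or made up for illustration; this is not the source of `firehose_callback_v1`. `__builtin_prefetch` is a GCC/Clang builtin.

```c
#include <stdint.h>

/* Hypothetical 16-byte receive descriptor; status byte at offset 12. */
struct rx_desc {
    uint64_t address;
    uint16_t length;
    uint16_t checksum;
    uint8_t  status;    /* bit 0: a packet is ready in this slot */
    uint8_t  errors;
    uint16_t vlan;
};

static int packet_count;

/* Stand-in for the user's per-packet callback. Being static inline, the
 * compiler can fold it into the loop below. */
static inline void on_packet(const char *pci, char *data, int length)
{
    (void)pci; (void)data; (void)length;
    packet_count++;
}

/* Process every ready descriptor starting at `index`; return the index of
 * the first descriptor that is not ready. `ring_size` must be a power of
 * two so that the wrap-around is a single AND. */
int drain_ring(const char *pci, char **buffers, struct rx_desc *ring,
               int ring_size, int index)
{
    while (ring[index].status & 1) {
        int next = (index + 1) & (ring_size - 1);
        __builtin_prefetch(buffers[next]);   /* warm the cache for the next packet */
        on_packet(pci, buffers[index], ring[index].length);
        ring[index].status = 0;              /* hand the slot back for reuse */
        index = next;
    }
    return index;
}
```

The caller keeps the returned index and passes it back in on the next poll, so the loop resumes where it left off.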
@lukego lukego added a commit to lukego/snabb that referenced this pull request Jul 10, 2015
@lukego lukego Merge PR #557 (firehose) into next 95c9eee
@lukego lukego added a commit to lukego/snabb that referenced this pull request Jul 26, 2015
@lukego lukego Merge PR #557 (firehose program) into next fe78615
@eugeneia eugeneia merged commit ca6d988 into snabbco:master Aug 4, 2015

@lukego lukego deleted the lukego:firehose branch Feb 24, 2016