Intel IXGBE 10 20G

Nicola Bonelli edited this page Jul 14, 2017 · 10 revisions

Description

This page reports the performance of PFQ on several Intel processors (Xeon and Core i7).

Processors

  • Intel(R) Xeon(R), CPU X5650, 6 cores @2.66GHz, 16GB RAM, NIC Intel 10G 82599.

  • Intel(R) Core(TM), CPU i7-2600, 4 cores @3.40GHz, 8GB RAM, NIC Intel 10G 82599.

  • Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz

Software configuration

  • To properly enable DCA, load ioatdma kernel module first.

  • The device driver used is the Intel ixgbe-3.23.2.1, compiled with the [pfq-omatic](https://github.com/pfq/PFQ/wiki/PFqOmatic) script.

  • The PFQ kernel module is configured and loaded with pfq-load, using the following config file:

# ~/.pfq.conf 

Config
{
    pfq_module   = "/opt/PFQ/kernel/pfq.ko",

    pfq_options  = [ "capture_incoming=0", "capt_batch_len=32", "xmit_batch_len=128", "skb_pool_size=256" ],

    exclude_core = [],

    irq_affinity = "round-robin",

    drivers =
    [
        Driver
        {
            drvmod  = "/opt/ixgbe/src/ixgbe.ko",
            drvopt  = [ "LRO=0,0", "DCA=1,1", "AtrSampleRate=0,0" ],

            devices =
            [
               Device
               {
                   devname  = "eth2",
                   devspeed = Just 10000,
                   flowctrl = No,
                   ethopt   = [("-G", "tx", 768),
                               ("-C", "tx-frames-irq", 1024),
                               ("-C", "rx-usecs", 50)]
               }
            ]
        }
    ]
}

Test Single Thread

  • traffic generated: 60-byte UDP packets with random IP addresses, at 14.8 Mpps
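
The 14.8 Mpps figure matches the theoretical 10 Gigabit Ethernet line rate for minimum-size frames. The sketch below reproduces that arithmetic. It assumes the 60 bytes exclude the 4-byte FCS added by the NIC, and that each frame also costs 8 bytes of preamble and 12 bytes of inter-frame gap on the wire. The function name `line_rate_pps` is illustrative and not part of PFQ.

```python
# Ethernet per-frame overhead on the wire, in bytes.
FCS = 4            # frame check sequence appended by the NIC
PREAMBLE_SFD = 8   # preamble plus start-of-frame delimiter
IFG = 12           # minimum inter-frame gap


def line_rate_pps(link_bps: float, frame_len: int) -> float:
    """Maximum packets per second for frames of frame_len bytes (FCS excluded)."""
    wire_bits = (frame_len + FCS + PREAMBLE_SFD + IFG) * 8
    return link_bps / wire_bits


rate = line_rate_pps(10e9, 60)
print(f"{rate / 1e6:.2f} Mpps")  # prints "14.88 Mpps"
```

Values reported as ~14.8 Mpps in the tables on this page can therefore be read as capture or transmission at, or very close to, line rate.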

Capture 10G

  • Xeon processor: pfq-counters -c 64 -t 0.5.eth2

  • i7 processor: pfq-counters -c 64 -t 0.3.eth2

| RSS | setup | Xeon E5-1660 | Xeon X5650 | i7-2600 |
|-----|--------------|------------|------------|-----------|
| 1 | pfq-load -q1 | 8.3 Mpps | 5.72 Mpps | 6.4 Mpps |
| 2 | pfq-load -q2 | ~14.8 Mpps | 10.6 Mpps | 12.1 Mpps |
| 3 | pfq-load -q3 | ~14.8 Mpps | ~14.8 Mpps | 14.6 Mpps |
| 4 | pfq-load -q4 | ~14.8 Mpps | ~14.8 Mpps | 14.4 Mpps |

Capture 20G

  • Xeon processor: pfq-counters -c 64 -t 0.7.eth2:eth4

Note: this test is performed with a single thread capturing traffic from two different boards. The IRQ affinities of the two NICs do not overlap.

| RSS | setup | Xeon E5-1660 |
|-----|--------------|-------------|
| 1 | pfq-load -q1 | 15.48 Mpps |
| 2 | pfq-load -q2 | ~26.97 Mpps |
| 3 | pfq-load -q3 | ~29.13 Mpps |
| 4 | pfq-load -q4 | ~29.13 Mpps |
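
To put these numbers in context, the sketch below expresses each result as a fraction of the theoretical maximum for two 10G ports. It assumes 14.88 Mpps per port (minimum-size frames). The page does not state the offered load for this test, so the percentages are relative to the theoretical ceiling, not to the traffic actually sent. The helper `fraction_of_line_rate` is illustrative only.

```python
# Assumption: theoretical 10GbE line rate for minimum-size frames, in Mpps.
LINE_RATE_10G_MPPS = 14.88

# Single-thread results from the Capture 20G table (Xeon E5-1660), in Mpps.
measured = {1: 15.48, 2: 26.97, 3: 29.13, 4: 29.13}


def fraction_of_line_rate(mpps: float, ports: int = 2) -> float:
    """Captured rate divided by the aggregate theoretical rate of all ports."""
    return mpps / (ports * LINE_RATE_10G_MPPS)


for queues, mpps in measured.items():
    print(f"-q{queues}: {fraction_of_line_rate(mpps):.1%} of 2x10G line rate")
```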

Traffic generation

  • Xeon processor: pfq-gen -l 60 -R -t 0.5.eth2 -k 1,2...

  • i7 processor: pfq-gen -l 60 -R -t 0.3.eth2 -k 1,2...

| TSS | setup | TX speed (Xeon) | TX speed (i7) |
|-----|------------|-----------------|---------------|
| 1 | -k 1 | 7.5 Mpps | 9.5 Mpps |
| 2 | -k 1,2 | 14.8 Mpps | 13.8 Mpps |
| 3 | -k 1,2,3 | 14.8 Mpps | 13.0 Mpps |
| 4 | -k 1,2,3,4 | 14.8 Mpps | - |

Test Multiple Threads

The traffic is balanced across 2 (or more) user-space threads with the PFQ/lang function steer_flow. Additional steering functions are described in the [PFQ/lang wiki](http://www.pfq.io/v6.x/lang/haskell/Network-PFQ-Lang-Default.html).
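
As a conceptual illustration of flow-based steering, the sketch below maps each packet to a thread by hashing its flow identifiers, so that all packets of one flow reach the same thread. This is not PFQ's implementation: the function `steer_flow_sketch`, the CRC32 hash, and the direction-independent key are all assumptions chosen for illustration.

```python
import random
import zlib
from collections import Counter


def steer_flow_sketch(src_ip, dst_ip, src_port, dst_port, n_threads):
    """Pick a thread index from a direction-independent flow key (hypothetical)."""
    # Sorting the two endpoints makes A->B and B->A produce the same key.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a[0]}:{a[1]}-{b[0]}:{b[1]}".encode()
    return zlib.crc32(key) % n_threads


# Random addresses and ports, similar in spirit to the generated test traffic.
random.seed(0)
counts = Counter(
    steer_flow_sketch(
        random.getrandbits(32), random.getrandbits(32),
        random.getrandbits(16), random.getrandbits(16),
        n_threads=2,
    )
    for _ in range(10_000)
)
print(counts)  # packets per thread index; roughly even for random flows
```

With random IP addresses, as in the tests on this page, a hash of this kind spreads packets roughly evenly, which is why throughput can scale with the number of threads.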

  • traffic generated: 60-byte UDP packets with random IP addresses, at 14.8 Mpps
  • capture tool: user/tool/pfq-counters

Command Line

  • Xeon processor: pfq-counters -f steer_flow -c 64 -t 0.5.eth2 -t 0.4

  • i7 processor: pfq-counters -f steer_flow -c 64 -t 0.3.eth2 -t 0.2

| RSS | setup | Xeon E5-1660 | Xeon X5650 | i7-2600 |
|-----|--------------|------------|------------|------------|
| 1 | pfq-load -q1 | 7.05 Mpps | 5 Mpps | 5.44 Mpps |
| 2 | pfq-load -q2 | 14.07 Mpps | 9.7 Mpps | 10.33 Mpps |
| 3 | pfq-load -q3 | ~14.8 Mpps | 14.1 Mpps | 14.6 Mpps |
| 4 | pfq-load -q4 | 14.8 Mpps | 14.6 Mpps | 14.5 Mpps |
