Intel IXGBE 10 20G

Nicola Bonelli edited this page Sep 28, 2015 · 8 revisions

Description

This page reports the performance of PFQ on different Intel processors: two Xeon models and a Core i7.

Processors

  • Intel(R) Xeon(R), CPU X5650, 6 cores @2.66GHz, 16GB RAM, NIC Intel 10G 82599.

  • Intel(R) Core(TM), CPU i7-2600, 4 cores @3.40GHz, 8GB RAM, NIC Intel 10G 82599.

  • Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz

Software configuration

  • To properly enable DCA, load the ioatdma kernel module first.

  • The device driver used is the Intel ixgbe-3.23.2.1, compiled with the pfq-omatic script.

  • The PFQ kernel module is configured and loaded with pfq-load, using the following config file:

# ~/.pfq.conf 

Config
{
    pfq_module   = "/opt/PFQ/kernel/pfq.ko",

    pfq_options  = [ "capture_incoming=0", "capt_batch_len=32", "xmit_batch_len=128", "skb_pool_size=256" ],

    exclude_core = [],

    irq_affinity = "round-robin",

    drivers =
    [
        Driver
        {
            drvmod  = "/opt/ixgbe/src/ixgbe.ko",
            drvopt  = [ "LRO=0,0", "DCA=1,1", "AtrSampleRate=0,0" ],

            devices =
            [
               Device
               {
                   devname  = "eth2",
                   devspeed = Just 10000,
                   flowctrl = No,
                   ethopt   = [("-G", "tx", 768),
                               ("-C", "tx-frames-irq", 1024),
                               ("-C", "rx-usecs", 50)]
               }
            ]
        }
    ]
}
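
The ioatdma requirement listed under Software configuration can be checked before running pfq-load. The following is a minimal sketch, not part of PFQ: the helper names `loaded_modules` and `missing_modules` are hypothetical. It relies only on the fact that each line of /proc/modules begins with a module name. It verifies presence, not load order.

```python
def loaded_modules(proc_modules_text: str) -> set:
    """Return the module names found in the text of /proc/modules."""
    return {line.split()[0]
            for line in proc_modules_text.splitlines()
            if line.strip()}


def missing_modules(proc_modules_text: str, required=("ioatdma",)) -> list:
    """Return the required modules that are not currently loaded."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in required if name not in loaded]


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        # Not on Linux, or /proc is unavailable: treat as nothing loaded.
        text = ""
    missing = missing_modules(text)
    if missing:
        print("not loaded:", ", ".join(missing))
    else:
        print("ioatdma is loaded")
```

If ioatdma is reported as not loaded, load it before the ixgbe driver so that the DCA=1,1 driver option can take effect.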

Test Single Thread

  • traffic generated: 60-byte UDP packets with random IP addresses, at 14.8 Mpps
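
The 14.8 Mpps figure corresponds to the 10 Gbit/s line rate for minimum-size Ethernet frames. The sketch below works through the arithmetic. It assumes that the 60 bytes exclude the 4-byte FCS, and that each frame also costs 8 bytes of preamble and 12 bytes of inter-frame gap on the wire. The function name `line_rate_pps` is illustrative only.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet framing).
FCS = 4            # frame check sequence appended by the NIC
PREAMBLE_SFD = 8   # preamble plus start-of-frame delimiter
IFG = 12           # minimum inter-frame gap


def line_rate_pps(link_bps: float, frame_bytes_no_fcs: int) -> float:
    """Maximum packets per second for a given link speed and frame size."""
    wire_bytes = frame_bytes_no_fcs + FCS + PREAMBLE_SFD + IFG
    return link_bps / (wire_bytes * 8)


if __name__ == "__main__":
    pps = line_rate_pps(10e9, 60)
    print(f"{pps / 1e6:.2f} Mpps")  # prints "14.88 Mpps"
```

Under these assumptions a result of roughly 14.8 Mpps in the tables means the link is saturated.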

Capture 10G

  • Xeon processor: pfq-counters -c 64 -t 0.5.eth2
  • i7 processor: pfq-counters -c 64 -t 0.3.eth2

    RSS  setup         Xeon E5-1660 v3  Xeon X5650  i7-2600
    1    pfq-load -q1  8.3 Mpps         5.72 Mpps   6.4 Mpps
    2    pfq-load -q2  ~14.8 Mpps       10.6 Mpps   12.1 Mpps
    3    pfq-load -q3  ~14.8 Mpps       ~14.8 Mpps  14.6 Mpps
    4    pfq-load -q4  ~14.8 Mpps       ~14.8 Mpps  14.4 Mpps

Capture 20G

  • Xeon processor: pfq-counters -c 64 -t 0.7.eth2:eth4

Note: this test is performed with a single thread capturing traffic from two different boards. The IRQ affinities of the two NICs do not overlap.

    RSS  setup         Xeon E5-1660 v3
    1    pfq-load -q1  15.48 Mpps
    2    pfq-load -q2  ~26.97 Mpps
    3    pfq-load -q3  ~29.13 Mpps
    4    pfq-load -q4  ~29.13 Mpps
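
To put the 20G figures in context, the sketch below compares them with the aggregate line rate of two 10G ports. It assumes 84 bytes on the wire per 60-byte packet (4-byte FCS, 8-byte preamble, 12-byte inter-frame gap). The measured values are copied from the table above, including the approximate ones.

```python
# Measured single-thread capture rates from the 20G table (Mpps),
# keyed by the number of RSS queues. Values marked "~" are approximate.
measured_mpps = {1: 15.48, 2: 26.97, 3: 29.13, 4: 29.13}

# Assumption: each 60-byte packet occupies 84 bytes on the wire.
WIRE_BITS = (60 + 4 + 8 + 12) * 8
PORTS = 2
line_rate_mpps = PORTS * 10e9 / WIRE_BITS / 1e6

# Fraction of the aggregate line rate reached with each queue count.
utilization = {q: rate / line_rate_mpps for q, rate in measured_mpps.items()}

for q, u in utilization.items():
    print(f"-q{q}: {measured_mpps[q]:.2f} Mpps, {u:.1%} of line rate")
```

With three or four RSS queues the single capture thread reaches close to 98% of the theoretical two-port rate under these assumptions.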

Traffic generation

  • Xeon processor: pfq-gen -l 60 -R -t 0.5.eth2 -k 1,2...
  • i7 processor: pfq-gen -l 60 -R -t 0.3.eth2 -k 1,2...

    TSS  setup       TX speed (Xeon)  TX speed (i7)
    1    -k 1        7.5 Mpps         9.5 Mpps
    2    -k 1,2      14.8 Mpps        13.8 Mpps
    3    -k 1,2,3    14.8 Mpps        13.0 Mpps
    4    -k 1,2,3,4  14.8 Mpps        -

Test Multiple Threads

The traffic is balanced across 2 (or more) user-space threads with the PFQ/lang function steer_flow. Additional steering functions are described in the PFQ/lang wiki.

  • traffic generated: 60-byte UDP packets with random IP addresses, at 14.8 Mpps
  • capture tool: user/tool/pfq-counters
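
The idea behind flow-based steering can be illustrated with a toy model: hash the flow identifier and use the result to pick a thread, so that all packets of one flow land on the same consumer. The sketch below is not PFQ's implementation. The function name `toy_steer`, the CRC32 hash, and the symmetric ordering of the endpoints are all assumptions made for illustration.

```python
import random
import zlib
from collections import Counter


def toy_steer(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
              n_threads: int) -> int:
    """Map a flow to a thread index in [0, n_threads)."""
    # Sort the endpoints so both directions of a flow hash identically.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a[0]}:{a[1]}|{b[0]}:{b[1]}".encode()
    return zlib.crc32(key) % n_threads


def random_ip(rng: random.Random) -> str:
    """Build a random dotted-quad address, mimicking the test traffic."""
    return ".".join(str(rng.randrange(256)) for _ in range(4))


if __name__ == "__main__":
    rng = random.Random(0)
    counts = Counter(
        toy_steer(random_ip(rng), random_ip(rng),
                  rng.randrange(65536), rng.randrange(65536), 2)
        for _ in range(10000)
    )
    # With random addresses the two threads receive a similar share.
    print(dict(counts))
```

Because the test traffic uses random IP addresses, a hash of this kind spreads packets across the threads roughly evenly, which is why throughput in this test scales with the number of queues until the link is saturated.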

Command Line

  • Xeon processor: pfq-counters -f steer_flow -c 64 -t 0.5.eth2 -t 0.4
  • i7 processor: pfq-counters -f steer_flow -c 64 -t 0.3.eth2 -t 0.2

    RSS  setup         Xeon E5-1660 v3  Xeon X5650  i7-2600
    1    pfq-load -q1  7.05 Mpps        5 Mpps      5.44 Mpps
    2    pfq-load -q2  14.07 Mpps       9.7 Mpps    10.33 Mpps
    3    pfq-load -q3  ~14.8 Mpps       14.1 Mpps   14.6 Mpps
    4    pfq-load -q4  14.8 Mpps        14.6 Mpps   14.5 Mpps