Skip to content

Commit

Permalink
Avoid dp_hash recirculation for balance-tcp bond selection mode
Browse files Browse the repository at this point in the history
Problem:
--------
In OVS-DPDK, flows with output over a bond interface of type “balance-tcp”
(using a hash on TCP/UDP 5-tuple) get translated by the ofproto layer into
"HASH" and "RECIRC" datapath actions. After recirculation, the packet is
forwarded to the bond member port based on 8-bits of the datapath hash
value computed through dp_hash. This causes performance degradation in the
following ways:

1. The recirculation of the packet implies another lookup of the packet’s
flow key in the exact match cache (EMC) and potentially Megaflow classifier
(DPCLS). This is the biggest cost factor.

2. The recirculated packets have a new “RSS” hash and compete with the
original packets for the scarce number of EMC slots. This implies more
EMC misses and potentially EMC thrashing causing costly DPCLS lookups.

3. The 256 extra megaflow entries per bond for dp_hash bond selection put
additional load on the revalidation threads.

Owing to this performance degradation, deployments stick to “balance-slb”
bond mode even though it does not do active-active load balancing for
VXLAN- and GRE-tunnelled traffic because all tunnel packet have the same
source MAC address.

Proposed optimization:
----------------------
This proposal introduces a new load-balancing output action instead of
recirculation.

Maintain one table per-bond (could just be an array of uint16's) and
program it the same way internal flows are created today for each possible
hash value(256 entries) from ofproto layer. Use this table to load-balance
flows as part of output action processing.

Currently xlate_normal() -> output_normal() -> bond_update_post_recirc_rules()
-> bond_may_recirc() and compose_output_action__() generate
“dp_hash(hash_l4(0))” and “recirc(<RecircID>)” actions. In this case the
RecircID identifies the bond. For the recirculated packets the ofproto layer
installs megaflow entries that match on RecircID and masked dp_hash and send
them to the corresponding output port.

Instead, we will now generate action as
    "lb_output(bond,<bond id>)"

This combines hash computation (only if needed, else re-use RSS hash) and
inline load-balancing over the bond. This action is used *only* for balance-tcp
bonds in OVS-DPDK datapath (the OVS kernel datapath remains unchanged).

Example:
--------
Current scheme:
---------------
With 1 IP-UDP flow:

flow-dump from pmd on cpu core: 2
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), packets:2828969, bytes:181054016, used:0.000s, actions:hash(hash_l4(0)),recirc(0x1)

recirc_id(0x1),dp_hash(0x113683bd/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:2828937, bytes:181051968, used:0.000s, actions:2

With 8 IP-UDP flows (with random UDP src port): (all hitting same DPCL):

flow-dump from pmd on cpu core: 2
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), packets:2674009, bytes:171136576, used:0.000s, actions:hash(hash_l4(0)),recirc(0x1)

recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:377395, bytes:24153280, used:0.000s, actions:2
recirc_id(0x1),dp_hash(0xb236c260/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:333486, bytes:21343104, used:0.000s, actions:1
recirc_id(0x1),dp_hash(0x7d89eb18/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:348461, bytes:22301504, used:0.000s, actions:1
recirc_id(0x1),dp_hash(0xa78d75df/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:633353, bytes:40534592, used:0.000s, actions:2
recirc_id(0x1),dp_hash(0xb58d846f/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:319901, bytes:20473664, used:0.001s, actions:2
recirc_id(0x1),dp_hash(0x24534406/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:334985, bytes:21439040, used:0.001s, actions:1
recirc_id(0x1),dp_hash(0x3cf32550/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:326404, bytes:20889856, used:0.001s, actions:1

New scheme:
-----------
We can do with a single flow entry (for any number of new flows):

in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), packets:2674009, bytes:171136576, used:0.000s, actions:lb_output(bond,1)

A new CLI has been added to dump datapath bond cache as given below.

“sudo ovs-appctl dpif-netdev/dp-bond-show [dp]”

root@ubuntu-190:performance_scripts # sudo ovs-appctl dpif-netdev/dp-bond-show
Bond cache:
        bond-id 1 :
                bucket 0 - slave 2
                bucket 1 - slave 1
                bucket 2 - slave 2
                bucket 3 - slave 1

Performance improvement:
------------------------
With a prototype of the proposed idea, the following perf improvement is seen
with Phy-VM-Phy UDP traffic, single flow. With multiple flows, the improvement
is even more enhanced (due to reduced number of flows).

1 VM:
*****
+--------------------------------------+
|                 mpps                 |
+--------------------------------------+
| Flows  master  with-opt.   %delta    |
+--------------------------------------+
| 1      4.47    5.73        28.21
| 10     4.17    5.35        28.45
| 400    3.52    5.39        53.01
| 1k     3.41    5.25        53.78
| 10k    2.53    4.57        80.44
| 100k   2.32    4.27        83.59
| 500k   2.32    4.27        83.59
+--------------------------------------+
mpps: million packets per second.
packet size 64 bytes.

Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>

CC: Jan Scheurich <jan.scheurich@ericsson.com>
CC: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
  • Loading branch information
vishaldeepajmera authored and ovsrobot committed Dec 17, 2019
1 parent 391b52f commit 8f03c2b
Show file tree
Hide file tree
Showing 18 changed files with 688 additions and 59 deletions.
1 change: 1 addition & 0 deletions datapath/linux/compat/include/linux/openvswitch.h
Expand Up @@ -994,6 +994,7 @@ enum ovs_action_attr {
#ifndef __KERNEL__
OVS_ACTION_ATTR_TUNNEL_PUSH, /* struct ovs_action_push_tnl*/
OVS_ACTION_ATTR_TUNNEL_POP, /* u32 port number. */
OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id */
#endif
__OVS_ACTION_ATTR_MAX, /* Nothing past this will be accepted
* from userspace. */
Expand Down

0 comments on commit 8f03c2b

Please sign in to comment.