Skip to content

Commit

Permalink
net/i40e: remove memory barrier from NEON Rx
Browse files Browse the repository at this point in the history
[ upstream commit 78b5059 ]

For x86, the descriptors needs to be loaded in order, so in between two
descriptors loading, there is a compiler barrier in place.[1]
For aarch64, a patch [2] is in place to survive with discontinuous DD
bits, the barriers can be removed to take full advantage of out-of-order
execution.

50% performance gain in the RFC2544 NDR test was measured on ThunderX2.
12.50% performance gain in the RFC2544 NDR test was measured on Ampere
eMAG80 platform.

[1] http://inbox.dpdk.org/users/039ED4275CED7440929022BC67E7061153D71548@
SHSMSX105.ccr.corp.intel.com/
[2] https://mails.dpdk.org/archives/stable/2017-October/003324.html

Fixes: ae0eb31 ("net/i40e: implement vector PMD for ARM")

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
  • Loading branch information
GavinHu-Arm authored and kevintraynor committed Nov 21, 2019
1 parent c78fa03 commit 8628a3a
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion drivers/net/i40e/i40e_rxtx_vec_neon.c
Expand Up @@ -285,7 +285,6 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
/* Read desc statuses backwards to avoid race condition */
/* A.1 load 4 pkts desc */
descs[3] = vld1q_u64((uint64_t *)(rxdp + 3));
rte_rmb();

/* B.2 copy 2 mbuf point into rx_pkts */
vst1q_u64((uint64_t *)&rx_pkts[pos], mbp1);
Expand Down

0 comments on commit 8628a3a

Please sign in to comment.