ice: Add support for XDP multi-buffer on Rx side
The ice driver needs a bit of rework on the Rx data path in order to
support multi-buffer XDP. The skb path currently works in such a way
that the Rx ring carries a pointer to the skb, so if the driver did not
manage to assemble a fragmented frame within the current NAPI instance,
it can restore the state on the next instance and keep looking for the
last fragment (i.e. the descriptor with the EOP bit set). What needs to
be achieved instead is that the xdp_buff is assembled that way (linear
part + frags) in the first place. The skb is then ready to go in case
of XDP_PASS or when no BPF program is present on the interface; if a
BPF program is there, it operates on the multi-buffer xdp_buff. At this
point the xdp_buff resides directly on the Rx ring, and given that the
skb will be built straight from the xdp_buff, there is no further need
to carry an skb pointer on the Rx ring.
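
An illustrative sketch (simplified, not the commit's exact code) of how
a non-EOP buffer gets glued onto the xdp_buff living on the Rx ring;
the frag state sits in the skb_shared_info placed in the buffer's
tailroom:

/* Append one Rx buffer as a frag to the ring's xdp_buff. Helper name
 * and the lack of a MAX_SKB_FRAGS overflow check are for brevity.
 */
static void ice_add_xdp_frag_sketch(struct xdp_buff *xdp, struct page *page,
				    unsigned int offset, unsigned int size)
{
	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);

	if (!xdp_buff_has_frags(xdp)) {
		sinfo->nr_frags = 0;
		sinfo->xdp_frags_size = 0;
		xdp_buff_set_frags_flag(xdp);
	}

	__skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, page, offset, size);
	sinfo->xdp_frags_size += size;
}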

Besides removing the skb pointer from the Rx ring, lots of members have
been moved around within ice_rx_ring. The first and foremost reason was
to place rx_buf and xdp_buff on the same cacheline: once we touch
rx_buf (a preceding step before touching xdp_buff), xdp_buff will
already be hot in cache. Second, xdp_rxq is used rather rarely and
occupies a separate cacheline, so it is better placed at the end of
ice_rx_ring.
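
A rough sketch of the layout intent (field set abridged and
hypothetical, not the full struct):

struct ice_rx_ring_layout_sketch {
	/* hot per-packet state, rx_buf and xdp on one cacheline */
	struct ice_rx_buf *rx_buf;	/* touched right before xdp */
	struct xdp_buff xdp;		/* built in place; no skb on the ring */
	u16 next_to_use;
	u16 next_to_clean;
	u16 first_desc;			/* start of the frame being assembled */
	/* ... */
	/* cold, registration-time state demoted to the end */
	struct xdp_rxq_info xdp_rxq;
} ____cacheline_internodealigned_in_smp;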

Another change that affects ice_rx_ring is the introduction of
ice_rx_ring::first_desc. Its purpose is twofold: first, to propagate
rx_buf->act to all parts of the current xdp_buff after running the XDP
program, so that ice_put_rx_buf(), which was moved out of the main Rx
processing loop, can take an appropriate action on each buffer; second,
to serve ice_construct_skb().
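
A sketch of the first use, with an illustrative helper name: once the
XDP verdict for an EOP frame is known, it is stamped on every buffer
that contributed to the frame, so the deferred ice_put_rx_buf() pass
knows what to do with each of them:

static void ice_set_rx_bufs_act_sketch(struct ice_rx_ring *rx_ring, u32 act)
{
	u32 idx = rx_ring->first_desc;

	/* walk all descriptors of the current frame, wrapping the ring */
	while (idx != rx_ring->next_to_clean) {
		rx_ring->rx_buf[idx].act = act;
		if (++idx == rx_ring->count)
			idx = 0;
	}
}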

ice_construct_skb() has a copybreak mechanism which has an explicit
impact on the xdp_buff->skb conversion in the new approach when the
legacy Rx flag is toggled. It works in such a way that the linear part
is up to 256 bytes long; if the frame is bigger than that, the
remaining bytes go as a frag to skb_shared_info.

This means that while memcpying frags from the xdp_buff to the newly
allocated skb, care needs to be taken when picking the destination frag
array entry. By the time ice_construct_skb() is called on a fragmented
frame, the current rx_buf points to the *last* fragment, but copybreak
needs to be done against the first one. That is where
ice_rx_ring::first_desc helps.
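
A condensed sketch of the copybreak path (error handling and the loop
that copies the remaining frags are omitted; names follow the driver
where known, the rest is illustrative):

struct sk_buff *ice_construct_skb_sketch(struct ice_rx_ring *rx_ring,
					 struct xdp_buff *xdp)
{
	unsigned int size = xdp->data_end - xdp->data;
	/* headers come from the buffer at first_desc, not the last one */
	struct ice_rx_buf *first = &rx_ring->rx_buf[rx_ring->first_desc];
	unsigned int headlen;
	struct sk_buff *skb;

	skb = napi_alloc_skb(&rx_ring->q_vector->napi, ICE_RX_HDR_SIZE);
	if (unlikely(!skb))
		return NULL;

	headlen = min_t(unsigned int, size, ICE_RX_HDR_SIZE); /* 256B copybreak */
	memcpy(__skb_put(skb, headlen), xdp->data,
	       ALIGN(headlen, sizeof(long)));

	/* leftover of the first buffer becomes frag 0 of the skb */
	if (size > headlen)
		skb_add_rx_frag(skb, 0, first->page,
				first->page_offset + headlen, size - headlen,
				ice_rx_pg_size(rx_ring) / 2);

	/* frags collected in xdp's shared info would then be memcpy'd
	 * into skb_shinfo(skb)->frags[] starting at entry 1
	 */
	return skb;
}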

When frame building spans across NAPI polls (the DD bit is not set on
the current descriptor and xdp->data is not NULL), the current Rx
buffer handling state can cause problems. The calls to ice_put_rx_buf()
were pulled out of the main Rx processing loop and scoped from the
cached ntc to the current ntc, and that function relies on rx_buf->act,
which is set within ice_run_xdp(). ice_run_xdp() is only called once
the EOP bit is found, so an Rx buffer could be put while its
rx_buf->act is still *uninitialized*. To address this, change the
scoping to rely on first_desc on both boundaries instead.
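
A sketch of the adjusted scoping (helper name illustrative): buffers
are put back only in the [cached first_desc, current first_desc) range,
i.e. only those belonging to frames whose EOP descriptor has been
processed, so buffers of a frame still in flight keep their state
across NAPI polls:

static void ice_put_rx_bufs_sketch(struct ice_rx_ring *rx_ring,
				   u32 cached_first)
{
	/* cached_first is first_desc as sampled before the Rx loop ran */
	while (cached_first != rx_ring->first_desc) {
		struct ice_rx_buf *rx_buf = &rx_ring->rx_buf[cached_first];

		ice_put_rx_buf(rx_ring, rx_buf);	/* consults rx_buf->act */
		if (++cached_first == rx_ring->count)
			cached_first = 0;
	}
}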

This also implies that cleaned_count, which is used as an input to
ice_alloc_rx_bufs() and tells how many new buffers should be refilled,
has to be adjusted. If it stayed as is, ntc could end up going past
ntu.

Therefore, remove cleaned_count altogether and feed the allocation
routine with the newly introduced ICE_RX_DESC_UNUSED() macro, an
equivalent of ICE_DESC_UNUSED() dedicated to the Rx side and based on
struct ice_rx_ring::first_desc instead of next_to_clean.
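
For reference, the existing Tx-side helper counts free descriptors
against next_to_clean; the Rx variant sketched here follows the
description above by substituting first_desc, so descriptors of a frame
still being assembled are not counted as reusable:

#define ICE_DESC_UNUSED(R)	\
	((((R)->next_to_clean > (R)->next_to_use) ? 0 : (R)->count) + \
	 (R)->next_to_clean - (R)->next_to_use - 1)

#define ICE_RX_DESC_UNUSED(R)	\
	((((R)->first_desc > (R)->next_to_use) ? 0 : (R)->count) + \
	 (R)->first_desc - (R)->next_to_use - 1)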

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Link: https://lore.kernel.org/bpf/20230131204506.219292-11-maciej.fijalkowski@intel.com
mfijalko authored and borkmann committed Feb 1, 2023
1 parent 8a11b33 commit 2fba7dc
Showing 6 changed files with 200 additions and 98 deletions.
17 changes: 11 additions & 6 deletions drivers/net/ethernet/intel/ice/ice_base.c
@@ -492,16 +492,18 @@ static int ice_setup_rx_ctx(struct ice_rx_ring *ring)
 int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 {
 	struct device *dev = ice_pf_to_dev(ring->vsi->back);
-	u16 num_bufs = ICE_DESC_UNUSED(ring);
+	u32 num_bufs = ICE_RX_DESC_UNUSED(ring);
 	int err;
 
 	ring->rx_buf_len = ring->vsi->rx_buf_len;
 
 	if (ring->vsi->type == ICE_VSI_PF) {
 		if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))
 			/* coverity[check_return] */
-			xdp_rxq_info_reg(&ring->xdp_rxq, ring->netdev,
-					 ring->q_index, ring->q_vector->napi.napi_id);
+			__xdp_rxq_info_reg(&ring->xdp_rxq, ring->netdev,
+					   ring->q_index,
+					   ring->q_vector->napi.napi_id,
+					   ring->vsi->rx_buf_len);
 
 		ring->xsk_pool = ice_xsk_pool(ring);
 		if (ring->xsk_pool) {
@@ -521,9 +523,11 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 	} else {
 		if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))
 			/* coverity[check_return] */
-			xdp_rxq_info_reg(&ring->xdp_rxq,
-					 ring->netdev,
-					 ring->q_index, ring->q_vector->napi.napi_id);
+			__xdp_rxq_info_reg(&ring->xdp_rxq,
+					   ring->netdev,
+					   ring->q_index,
+					   ring->q_vector->napi.napi_id,
+					   ring->vsi->rx_buf_len);
 
 		err = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
 						 MEM_TYPE_PAGE_SHARED,
@@ -534,6 +538,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 	}
 
 	xdp_init_buff(&ring->xdp, ice_rx_pg_size(ring) / 2, &ring->xdp_rxq);
+	ring->xdp.data = NULL;
 	err = ice_setup_rx_ctx(ring);
 	if (err) {
 		dev_err(dev, "ice_setup_rx_ctx failed for RxQ %d, err %d\n",
2 changes: 1 addition & 1 deletion drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -3092,7 +3092,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,

 		/* allocate Rx buffers */
 		err = ice_alloc_rx_bufs(&rx_rings[i],
-					ICE_DESC_UNUSED(&rx_rings[i]));
+					ICE_RX_DESC_UNUSED(&rx_rings[i]));
 rx_unwind:
 		if (err) {
 			while (i) {
13 changes: 9 additions & 4 deletions drivers/net/ethernet/intel/ice/ice_main.c
@@ -2888,9 +2888,12 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
 	bool if_running = netif_running(vsi->netdev);
 	int ret = 0, xdp_ring_err = 0;
 
-	if (frame_size > ice_max_xdp_frame_size(vsi)) {
-		NL_SET_ERR_MSG_MOD(extack, "MTU too large for loading XDP");
-		return -EOPNOTSUPP;
+	if (prog && !prog->aux->xdp_has_frags) {
+		if (frame_size > ice_max_xdp_frame_size(vsi)) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "MTU is too large for linear frames and XDP prog does not support frags");
+			return -EOPNOTSUPP;
+		}
 	}
 
 	/* need to stop netdev while setting up the program for Rx rings */
@@ -7354,6 +7357,7 @@ static int ice_change_mtu(struct net_device *netdev, int new_mtu)
 	struct ice_netdev_priv *np = netdev_priv(netdev);
 	struct ice_vsi *vsi = np->vsi;
 	struct ice_pf *pf = vsi->back;
+	struct bpf_prog *prog;
 	u8 count = 0;
 	int err = 0;

@@ -7362,7 +7366,8 @@ static int ice_change_mtu(struct net_device *netdev, int new_mtu)
 		return 0;
 	}
 
-	if (ice_is_xdp_ena_vsi(vsi)) {
+	prog = vsi->xdp_prog;
+	if (prog && !prog->aux->xdp_has_frags) {
 		int frame_size = ice_max_xdp_frame_size(vsi);
 
 		if (new_mtu + ICE_ETH_PKT_HDR_PAD > frame_size) {
(diffs for the remaining changed files not rendered)
