Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix dpdk runtime bug based on spdk/dpdk #1

Closed

Conversation

liu-chunmei
Copy link

@liu-chunmei liu-chunmei commented Dec 13, 2017

ceph async messenger has some run time error with this dpdk library, 1) need init mb->next= null when allocate a buffer other wise rte_mbuf_sanity_check will report error. 2) when check the size, can't calculate mbuf_data_room_size because async messenger dpdk will allocate this part later not at create mempool.

Signed-off-by: chunmei chunmei.liu@intel.com

@tchaikov
Copy link
Contributor

tchaikov commented Dec 13, 2017

@liu-chunmei i think the DPDK project does not accept PRs over github. you might want to contribute patches by sending them to the mailing list, see http://dpdk.org/dev .

@liu-chunmei
Copy link
Author

thanks, will do it.

@liu-chunmei liu-chunmei closed this Jan 2, 2018
ghost pushed a commit that referenced this pull request May 8, 2018
Below commit introduced pthread barrier for synchronization.
But two IPC threads block on the barrier, and never wake up.

  (gdb) bt
  #0  futex_wait (private=0, expected=0, futex_word=0x7fffffffcff4)
      at ../sysdeps/unix/sysv/linux/futex-internal.h:61
  #1  futex_wait_simple (private=0, expected=0, futex_word=0x7fffffffcff4)
      at ../sysdeps/nptl/futex-internal.h:135
  #2  __pthread_barrier_wait (barrier=0x7fffffffcff0) at pthread_barrier_wait.c:184
  #3  rte_thread_init (arg=0x7fffffffcfe0)
      at ../dpdk/lib/librte_eal/common/eal_common_thread.c:160
  #4  start_thread (arg=0x7ffff6ecf700) at pthread_create.c:333
  #5  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Through analysis, we find the barrier defined on the stack could be the
root cause. This patch will change to use heap memory as the barrier.

Fixes: d651ee4 ("eal: set affinity for control threads")

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
jimharris pushed a commit that referenced this pull request Aug 13, 2018
This patch reverts the testpmd CLI prompt routine modifications done
in order to support softnic.
The reason of doing so is due to testpmd abnormal exit observed on
several setups caused by the softnic modifications to this routine,
for example: When running testpmd with tap interface
(testpmd
 -n 4 --vdev=net_tap0,iface=tap0,remote=eth1 -- --burst=64
 --mbcache=512 -i --nb-cores=7 --rxq=2 --txq=2 --txd=512
 --rxd=512 --port-topology=chained --forward-mode=rxonly)
testpmd crashes seconds after presenting its prompt with the following
error:
  testpmd> PANIC in prompt():
  CLI poll error (-1)

  Thread 1 "testpmd" received signal SIGABRT, Aborted.
  0x00007ffff668e0d0 in raise () from /lib64/libc.so.6
  (gdb) bt
  #0  0x00007ffff668e0d0 in raise () from /lib64/libc.so.6
  #1  0x00007ffff668f6b1 in abort () from /lib64/libc.so.6
  #2  0x0000000000468027 in __rte_panic ()
  #3  0x00000000004876ed in prompt ()
  #4  0x000000000046dffc in main ()

When running testpmd with bare-metal device
(testpmd -n 4 --socket-mem=1024,1024 -w 04:00.0  --
 --burst=64 --mbcache=512 -i --nb-cores=7
 --rxq=64 --txq=4 --txd=16 --rxd=16)
and pressing CTRL+D right after testpmd prompt is presented then
the program crashes while presenting the same messages as above.

Needless to say that this behavior is not observed when using the
previous CLI prompt routine.

Fixes: 0ad778b ("app/testpmd: rework softnic forward mode")

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
darsto pushed a commit that referenced this pull request Mar 20, 2019
The NIC's interrupt source has some active handler when the
port removed. We should cancel the delay handler before removing
dev to prevent executing the delay handler.

Call Trace:
  #0  ixgbe_disable_intr (hw=0x0, hw=0x0)
      at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:852
  #1  ixgbe_dev_interrupt_delayed_handler (param=0xadb9c0
      <rte_eth_devices@@DPDK_2.2+33024>)
      at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:4386
  #2  0x00007f05782147af in eal_alarm_callback (arg=<optimized out>)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_alarm.c:90
  #3  0x00007f057821320a in eal_intr_process_interrupts (nfds=1,
      events=0x7f056cbf3e88) at /usr/src/debug/dpdk-18.11/lib/
      librte_eal/linuxapp/eal/eal_interrupts.c:838
  #4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=18)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_interrupts.c:885
  #5  eal_intr_thread_main (arg=<optimized out>)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_interrupts.c:965
  #6  0x00007f05708a0e45 in start_thread () from /usr/lib64/libpthread.so.0
  #7  0x00007f056eb4ab5d in clone () from /usr/lib64/libc.so.6

Fixes: 2866c5f ("ixgbe: support port hotplug")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
jimharris pushed a commit that referenced this pull request Dec 6, 2019
The compat.h header file provided macros for two purposes:
1. it provided the macros for marking functions as rte_experimental
2. it provided the macros for doing function versioning

Although these were in the same file, #1 is something that is for use by
public header files, which #2 is for internal use only. Therefore, we can
split these into two headers, keeping #1 in rte_compat.h and #2 in a new
file rte_function_versioning.h. For "make" builds, since internal objects
pick up the headers from the "include/" folder, we need to add the new
header to the installation list, but for "meson" builds it does not need to
be installed as it's not for public use.

The rework also serves to allow the use of the function versioning macros
to files that actually need them, so the use of experimental functions does
not need including of the versioning code.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Andrzej Ostruszka <amo@semihalf.com>
jimharris pushed a commit that referenced this pull request Dec 6, 2019
Previously rx/tx_queues were passed into eth_from_pcaps_common()
as ptrs and were checked for being NULL.

In commit da6ba28 ("net/pcap: use a struct to pass user options")
that changed to pass in a ptr to a pmd_devargs_all which contains
the rx/tx_queues.

The parameter checking was not updated as part of that commit and
coverity caught that there was still a check if rx/tx_queues were
NULL, apparently after they had been dereferenced.

In fact as they are a members of the devargs_all struct, they will
not be NULL so remove those checks.

1231        struct pmd_devargs *rx_queues = &devargs_all->rx_queues;
1232        struct pmd_devargs *tx_queues = &devargs_all->tx_queues;
1233        const unsigned int nb_rx_queues = rx_queues->num_of_queue;
    deref_ptr: Directly dereferencing pointer tx_queues.
1234        const unsigned int nb_tx_queues = tx_queues->num_of_queue;
1235        unsigned int i;
1236
1237        /* do some parameter checking */
    CID 345004: Dereference before null check (REVERSE_INULL)
    [select issue]
1238        if (rx_queues == NULL && nb_rx_queues > 0)
1239                return -1;
    CID 345029 (#1 of 1): Dereference before null check (REVERSE_INULL)
    check_after_deref: Null-checking tx_queues suggests that it may be
    null, but it has already been dereferenced on all paths leading to
    the check.
1240        if (tx_queues == NULL && nb_tx_queues > 0)
1241                return -1;

Coverity issue: 345029
Coverity issue: 345044
Fixes: da6ba28 ("net/pcap: use a struct to pass user options")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Cian Ferriter <cian.ferriter@intel.com>
jimharris pushed a commit that referenced this pull request Dec 6, 2019
Coverity complains that ctrl_flags is set to NULL at the start
of the function and it may not have been set before there is a
jump to fc_success and it is dereferenced.

Check for NULL before dereference.

312fc_success:
   CID 344983 (#1 of 1): Explicit null dereferenced
   (FORWARD_NULL)7. var_deref_op: Dereferencing null pointer ctrl_flags.
313        *ctrl_flags = rte_cpu_to_be_64(*ctrl_flags);

Coverity issue: 344983
Fixes: 6cc5409 ("crypto/octeontx: add supported sessions")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
jimharris pushed a commit that referenced this pull request Dec 6, 2019
Coverity is complaining about identical code regardless of which branch
of the if else is taken. Functionally it means an error will always be
returned if this if else is hit. Remove the else branch.

    CID 337928 (#1 of 1): Identical code for different branches
    (IDENTICAL_BRANCHES)identical_branches: The same code is executed
    regardless of whether n->level != IPN3KE_TM_NODE_LEVEL_COS ||
    n->n_children != 0U is true, because the 'then' and 'else' branches
    are identical. Should one of the branches be modified, or the entire
    'if' statement replaced?
1506  if (n->level != IPN3KE_TM_NODE_LEVEL_COS ||
1507          n->n_children != 0) {
1508          return -rte_tm_error_set(error,
1509                  EINVAL,
1510                  RTE_TM_ERROR_TYPE_UNSPECIFIED,
1511                  NULL,
1512                  rte_strerror(EINVAL));
    else_branch: The else branch, identical to the then branch.
1513  } else {
1514          return -rte_tm_error_set(error,
1515                  EINVAL,
1516                  RTE_TM_ERROR_TYPE_UNSPECIFIED,
1517                  NULL,
1518                  rte_strerror(EINVAL));
1519  }

Coverity issue: 337928
Fixes: c820468 ("net/ipn3ke: support TM")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
jimharris pushed a commit that referenced this pull request Dec 6, 2019
Coverity complains that this statement is not needed as the goto
label is on the next line anyway. Remove the if statement.

653        ret = ipn3ke_cfg_parse_i40e_pf_ethdev(afu_name, pf_name);
   CID 337930 (#1 of 1): Identical code for different branches
   (IDENTICAL_BRANCHES)identical_branches: The same code is executed
   when the condition ret is true or false, because the code in the
   if-then branch and after the if statement is identical. Should
   the if statement be removed?
654        if (ret)
655                goto end;
   implicit_else: The code from the above if-then branch is identical
   to the code after the if statement.
656end:

Coverity issue: 337930
Fixes: c01c748 ("net/ipn3ke: add new driver")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
spdk-bot pushed a commit that referenced this pull request Aug 13, 2020
Currently, there are coverity defects warning as below:

CID 349937 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension: port_number with
type uint16_t (16 bits, unsigned) is promoted in port_number << cur_pos
to type int (32 bits, signed), then sign-extended to type unsigned long
(64 bits, unsigned). If port_number << cur_pos is greater than
0x7FFFFFFF, the upper bits of the result will all be 1.

CID 349893 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension: vlan_tag with type
uint8_t (8 bits, unsigned) is promoted in vlan_tag << cur_pos to type
int (32 bits, signed), then sign-extended to type unsigned long (64
bits, unsigned). If vlan_tag << cur_pos is greater than 0x7FFFFFFF, the
upper bits of the result will all be 1.

This patch fixes them by replacing the data type of port_number and
vlan_tag with uint32_t in the inner static function named
hns3_fd_convert_meta_data of hns3 PMD driver.

Coverity issue: 349937, 349893
Fixes: fcba820 ("net/hns3: support flow director")
Cc: stable@dpdk.org

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
spdk-bot pushed a commit that referenced this pull request Aug 13, 2020
Currently, there is a coverity defect warning about hns3 PMD driver, the
detail information as blow:
CID 289969 (#1 of 1): Unchecked return value (CHECKED_RETURN)
1. check_return: Calling rte_mp_action_register without checking return
   value (as is done elsewhere 11 out of 13 times).

The problem is that missing checking the return value of calling the API
rte_mp_action_register during initialization. If registering an action
function for primary and secondary communication failed, the secondary
process can't work properly.

This patch fixes it by adding check return value of the API function
named rte_mp_action_register in the '.dev_init' implementation function
of hns3 PMD driver.

Coverity issue: 289969
Fixes: 23d4b61 ("net/hns3: support multiple process")
Cc: stable@dpdk.org

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
spdk-bot pushed a commit that referenced this pull request Feb 10, 2021
[ upstream commit c064f69 ]

Currently, we encounter segmentation fault when performing the following
test case:
1. Run testpmd application, config the flow filter rules then flush them
   repeatedly.
2. Inject FLR concurrently every 5 second.
The calltrace info:

This GDB was configured as "aarch64-linux-gnu".
Reading symbols from ./testpmd...(no debugging symbols found)...done.
[New LWP 322]
[New LWP 325]
[New LWP 324]
[New LWP 326]
[New LWP 323]
[New LWP 327]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/
libthread_db.so.1".
Core was generated by `/home/root/app/testpmd -w 0000:00:01.0 -w
0000:00:02.0 -w 0000:00:03.0 -l 0-3 -'.
Program terminated with signal SIGSEGV, Segmentation fault.
libc.so.6
[Current thread is 1 (Thread 0xffff8bb35110 (LWP 322))]
(gdb) bt
 #0  0x0000ffff8b936a90 in strlen () from /lib/aarch64-linux-gnu/
 libc.so.6
 #1  0x0000ffff8b905ccc in vfprintf () from /lib/aarch64-linux-gnu/
 libc.so.6
 #2  0x0000ffff8b993d04 in __printf_chk () from /lib/aarch64-linux-gnu/
 libc.so.6
 #3  0x0000000000754828 in port_flow_flush ()
 #4  0x0000000000870f3c in cmdline_parse ()

The root cause as follows:
In the '.flush' ops implementation function named hns3_flow_flush, By
the way the '.flush' ops is defined in the struct rte_flow_ops, if
failed to call hns3_clear_rss_filter, the out parameter error is not
set, and then the member variable name message in the struct error is
invalid(filled with 0x44444444 in port_flow_flush function of the
testpmd application), it leads to segmentation fault when format the
message.

We fixes it by filling error parameter when failure in calling static
function named hns3_clear_rss_filter in the the '.flush' ops
implementation function named hns3_flow_flush.

Fixes: c37ca66 ("net/hns3: support RSS")

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
spdk-bot pushed a commit that referenced this pull request Feb 10, 2021
[ upstream commit 5c471cb ]

Currently, there are coverity defects warning as below:

CID 349937 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension: port_number with
type uint16_t (16 bits, unsigned) is promoted in port_number << cur_pos
to type int (32 bits, signed), then sign-extended to type unsigned long
(64 bits, unsigned). If port_number << cur_pos is greater than
0x7FFFFFFF, the upper bits of the result will all be 1.

CID 349893 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension: vlan_tag with type
uint8_t (8 bits, unsigned) is promoted in vlan_tag << cur_pos to type
int (32 bits, signed), then sign-extended to type unsigned long (64
bits, unsigned). If vlan_tag << cur_pos is greater than 0x7FFFFFFF, the
upper bits of the result will all be 1.

This patch fixes them by replacing the data type of port_number and
vlan_tag with uint32_t in the inner static function named
hns3_fd_convert_meta_data of hns3 PMD driver.

Coverity issue: 349937, 349893
Fixes: fcba820 ("net/hns3: support flow director")

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
spdk-bot pushed a commit that referenced this pull request Feb 10, 2021
[ upstream commit 9570b1f ]

Currently, there is a coverity defect warning about hns3 PMD driver, the
detail information as blow:
CID 289969 (#1 of 1): Unchecked return value (CHECKED_RETURN)
1. check_return: Calling rte_mp_action_register without checking return
   value (as is done elsewhere 11 out of 13 times).

The problem is that missing checking the return value of calling the API
rte_mp_action_register during initialization. If registering an action
function for primary and secondary communication failed, the secondary
process can't work properly.

This patch fixes it by adding check return value of the API function
named rte_mp_action_register in the '.dev_init' implementation function
of hns3 PMD driver.

Coverity issue: 289969
Fixes: 23d4b61 ("net/hns3: support multiple process")

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
spdk-bot pushed a commit that referenced this pull request Feb 16, 2021
Up to this commit the regex application used one QP which was assigned a
number of jobs, each with a different segment of a file to parse.  This
commit adds support for multiple QPs assignments. All QPs will be
assigned the same number of jobs, with the same segments of file to
parse. It will enable comparing functionality with different numbers of
QPs. All queues are managed on one core with one thread. This commit
focuses on changing routines API to support multi QPs, mainly, QP scalar
variables are replaced by per-QP struct instance.  The enqueue/dequeue
operations are interleaved as follows:
 enqueue(QP #1)
 enqueue(QP #2)
 ...
 enqueue(QP #n)
 dequeue(QP #1)
 dequeue(QP #2)
 ...
 dequeue(QP #n)

A new parameter 'nb_qps' was added to configure the number of QPs:
 --nb_qps <num of qps>.
If not configured, nb_qps is set to 1 by default.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
spdk-bot pushed a commit that referenced this pull request Feb 16, 2021
Since the patch '1848b117' has initialized the variable 'key' in
'struct rte_flow_action_rss' with 'NULL', the PMD cannot get the
RSS key now. Details as bellow:

testpmd> flow create 0 ingress pattern eth / ipv4 / end actions
         rss types ipv4-other end key
         1234567890123456789012345678901234567890FFFFFFFFFFFF123
         4567890123456789012345678901234567890FFFFFFFFFFFF
         queues end / end
Flow rule #1 created
testpmd> show port 0 rss-hash key
RSS functions:
         all ipv4-other ip
RSS key:
         4439796BB54C5023B675EA5B124F9F30B8A2C03DDFDC4D02A08C9B3
         34AF64A4C05C6FA343958D8557D99583AE138C92E81150366

This patch sets offset and size of the 'key' variable as the first
parameter of the token 'key'. Later, the address of the RSS key will
be copied to 'key' variable.

Fixes: 1848b11 ("app/testpmd: fix RSS key for flow API RSS rule")
Cc: stable@dpdk.org

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Tested-by: Jun W Zhou <junx.w.zhou@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
spdk-bot pushed a commit that referenced this pull request May 25, 2021
CID 363716 (#1 of 1): Unchecked return value (CHECKED_RETURN)
check_return: Calling rte_pci_write_config without checking
return value (as is done elsewhere 46 out of 49 times).

Coverity issue: 363716
Fixes: be14720 ("net/bnxt: support FW reset")
Cc: stable@dpdk.org

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
spdk-bot pushed a commit that referenced this pull request Nov 29, 2021
Current testpmd implementation supports VXLAN only for tunnel offload.
Add GRE, NVGRE and GENEVE for tunnel offload flow matches.

For example:
testpmd> flow tunnel create 0 type vxlan
port 0: flow tunnel #1 type vxlan
testpmd> flow tunnel create 0 type nvgre
port 0: flow tunnel #2 type nvgre
testpmd> flow tunnel create 0 type gre
port 0: flow tunnel #3 type gre
testpmd> flow tunnel create 0 type geneve
port 0: flow tunnel #4 type geneve

Fixes: 1b9f274 ("app/testpmd: add commands for tunnel offload")
Cc: stable@dpdk.org

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gregory Etelson <getelson@nvidia.com>
spdk-bot pushed a commit that referenced this pull request Nov 29, 2021
Caught with ASan:
==9727==ERROR: AddressSanitizer: stack-buffer-overflow on address
  0x7f0daa2fc0d0 at pc 0x7f0daeefacb2 bp 0x7f0daa2fadd0 sp 0x7f0daa2fa578
READ of size 1 at 0x7f0daa2fc0d0 thread T1
    #0 0x7f0daeefacb1  (/lib64/libasan.so.5+0xbacb1)
    #1 0x115eba1 in dev_uev_parse ../lib/eal/linux/eal_dev.c:167
    #2 0x115f281 in dev_uev_handler ../lib/eal/linux/eal_dev.c:248
    #3 0x1169b91 in eal_intr_process_interrupts
  ../lib/eal/linux/eal_interrupts.c:1026
    #4 0x116a3a2 in eal_intr_handle_interrupts
  ../lib/eal/linux/eal_interrupts.c:1100
    #5 0x116a7f0 in eal_intr_thread_main
  ../lib/eal/linux/eal_interrupts.c:1172
    #6 0x112640a in ctrl_thread_init
  ../lib/eal/common/eal_common_thread.c:202
    #7 0x7f0dade27159 in start_thread (/lib64/libpthread.so.0+0x8159)
    #8 0x7f0dadb58f72 in clone (/lib64/libc.so.6+0xfcf72)

Address 0x7f0daa2fc0d0 is located in stack of thread T1 at offset 4192
  in frame
    #0 0x115f0c9 in dev_uev_handler ../lib/eal/linux/eal_dev.c:226

  This frame has 2 object(s):
    [32, 48) 'uevent'
    [96, 4192) 'buf' <== Memory access at offset 4192 overflows this
  variable
HINT: this may be a false positive if your program uses some custom
  stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Thread T1 created by T0 here:
    #0 0x7f0daee92ea3 in __interceptor_pthread_create
  (/lib64/libasan.so.5+0x52ea3)
    #1 0x1126542 in rte_ctrl_thread_create
  ../lib/eal/common/eal_common_thread.c:228
    #2 0x116a8b5 in rte_eal_intr_init
  ../lib/eal/linux/eal_interrupts.c:1200
    #3 0x1159dd1 in rte_eal_init ../lib/eal/linux/eal.c:1044
    #4 0x7a22f8 in main ../app/test-pmd/testpmd.c:4105
    #5 0x7f0dada7f802 in __libc_start_main (/lib64/libc.so.6+0x23802)

Bugzilla ID: 792
Fixes: 0d0f478 ("eal/linux: add uevent parse and process")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Yan Xia <yanx.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
spdk-bot pushed a commit that referenced this pull request May 4, 2022
[ upstream commit df58076 ]

Cast 1 to type uint64_t to avoid overflow.

CID 375812 (#1 of 1):
Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression 1 << 2 * i + 1
with type int (32 bits, signed) is evaluated using 32-bit arithmetic, and
then used in a context that expects an expression of type uint64_t
(64 bits, unsigned).

Coverity issue: 375812
Fixes: 5fec01c ("net/i40e: support Linux VF to configure IRQ link list")

Signed-off-by: Steve Yang <stevex.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
spdk-bot pushed a commit that referenced this pull request May 4, 2022
[ upstream commit d57f289 ]

The "kni_dev" is the private data of the "net_device" in kni, and allocated
with the "net_device" by calling "alloc_netdev()". The "net_device" is
freed by calling "free_netdev()" when kni release. The freed memory
includes the "kni_dev". So after "kni_dev" should not be accessed after
"net_device" is released.

Fixes: e77fec6 ("kni: fix possible mbuf leaks and speed up port release")

KASAN trace:

[   85.263717] ==========================================================
[   85.264418] BUG: KASAN: use-after-free in kni_net_release_fifo_phy+
		0x30/0x84 [rte_kni]
[   85.265139] Read of size 8 at addr ffff000260668d60 by task kni/341
[   85.265703]
[   85.265857] CPU: 0 PID: 341 Comm: kni Tainted: G     U     O
		5.15.0-rc4+ #1
[   85.266525] Hardware name: linux,dummy-virt (DT)
[   85.266968] Call trace:
[   85.267220]  dump_backtrace+0x0/0x2d0
[   85.267591]  show_stack+0x24/0x30
[   85.267924]  dump_stack_lvl+0x8c/0xb8
[   85.268294]  print_address_description.constprop.0+0x74/0x2b8
[   85.268855]  kasan_report+0x1e4/0x200
[   85.269224]  __asan_load8+0x98/0xd4
[   85.269577]  kni_net_release_fifo_phy+0x30/0x84 [rte_kni]
[   85.270116]  kni_dev_remove.isra.0+0x50/0x64 [rte_kni]
[   85.270630]  kni_ioctl_release+0x254/0x320 [rte_kni]
[   85.271136]  kni_ioctl+0x64/0xb0 [rte_kni]
[   85.271553]  __arm64_sys_ioctl+0xdc/0x120
[   85.271955]  invoke_syscall+0x68/0x1a0
[   85.272332]  el0_svc_common.constprop.0+0x90/0x200
[   85.272807]  do_el0_svc+0x94/0xa4
[   85.273144]  el0_svc+0x78/0x240
[   85.273463]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.273895]  el0t_64_sync+0x1a0/0x1a4
[   85.274264]
[   85.274427] Allocated by task 341:
[   85.274767]  kasan_save_stack+0x2c/0x60
[   85.275157]  __kasan_kmalloc+0x90/0xb4
[   85.275533]  __kmalloc_node+0x230/0x594
[   85.275917]  kvmalloc_node+0x8c/0x190
[   85.276286]  alloc_netdev_mqs+0x70/0x6b0
[   85.276678]  kni_ioctl_create+0x224/0xf40 [rte_kni]
[   85.277166]  kni_ioctl+0x9c/0xb0 [rte_kni]
[   85.277581]  __arm64_sys_ioctl+0xdc/0x120
[   85.277980]  invoke_syscall+0x68/0x1a0
[   85.278357]  el0_svc_common.constprop.0+0x90/0x200
[   85.278830]  do_el0_svc+0x94/0xa4
[   85.279172]  el0_svc+0x78/0x240
[   85.279491]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.279925]  el0t_64_sync+0x1a0/0x1a4
[   85.280292]
[   85.280454] Freed by task 341:
[   85.280763]  kasan_save_stack+0x2c/0x60
[   85.281147]  kasan_set_track+0x2c/0x40
[   85.281522]  kasan_set_free_info+0x2c/0x50
[   85.281930]  __kasan_slab_free+0xdc/0x140
[   85.282331]  slab_free_freelist_hook+0x90/0x250
[   85.282782]  kfree+0x128/0x580
[   85.283099]  kvfree+0x48/0x60
[   85.283402]  netdev_freemem+0x34/0x44
[   85.283770]  netdev_release+0x50/0x64
[   85.284138]  device_release+0xa0/0x120
[   85.284516]  kobject_put+0xf8/0x160
[   85.284867]  put_device+0x20/0x30
[   85.285204]  free_netdev+0x22c/0x310
[   85.285562]  kni_dev_remove.isra.0+0x48/0x64 [rte_kni]
[   85.286076]  kni_ioctl_release+0x254/0x320 [rte_kni]
[   85.286573]  kni_ioctl+0x64/0xb0 [rte_kni]
[   85.286992]  __arm64_sys_ioctl+0xdc/0x120
[   85.287392]  invoke_syscall+0x68/0x1a0
[   85.287769]  el0_svc_common.constprop.0+0x90/0x200
[   85.288243]  do_el0_svc+0x94/0xa4
[   85.288579]  el0_svc+0x78/0x240
[   85.288899]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.289332]  el0t_64_sync+0x1a0/0x1a4
[   85.289699]
[   85.289862] The buggy address belongs to the object at ffff000260668000
[   85.289862]  which belongs to the cache kmalloc-cg-8k of size 8192
[   85.291079] The buggy address is located 3424 bytes inside of
[   85.291079]  8192-byte region [ffff000260668000, ffff00026066a000)
[   85.292213] The buggy address belongs to the page:
[   85.292684] page:(____ptrval____) refcount:1 mapcount:0 mapping:
		0000000000000000 index:0x0 pfn:0x2a0668
[   85.293585] head:(____ptrval____) order:3 compound_mapcount:0
		compound_pincount:0
[   85.294305] flags: 0xbfff80000010200(slab|head|node=0|zone=2|
		lastcpupid=0x7fff)
[   85.295020] raw: 0bfff80000010200 0000000000000000 dead000000000122
		ffff0000c000d680
[   85.295767] raw: 0000000000000000 0000000080020002 00000001ffffffff
		0000000000000000
[   85.296512] page dumped because: kasan: bad access detected
[   85.297054]
[   85.297217] Memory state around the buggy address:
[   85.297688]  ffff000260668c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.298384]  ffff000260668c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.299088] >ffff000260668d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.299781]                                                        ^
[   85.300396]  ffff000260668d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.301092]  ffff000260668e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.301787] ===========================================================

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
spdk-bot pushed a commit that referenced this pull request May 4, 2022
[ upstream commit 0490d69 ]

This patch fixes the stack buffer overflow error reported
from AddressSanitizer.
Function send_packetsx4() tries to access out of bound data
from rte_mbuf and fill it into TX buffer even in the case
where no pending packets (len = 0).
Performance impact:- No

ASAN error report:-
==819==ERROR: AddressSanitizer: stack-buffer-overflow on address
0xffffe2c0dcf0 at pc 0x0000005e791c bp 0xffffe2c0d7e0 sp 0xffffe2c0d800
READ of size 8 at 0xffffe2c0dcf0 thread T0
 #0 0x5e7918 in send_packetsx4 ../examples/l3fwd/l3fwd_common.h:251
 #1 0x5e7918 in send_packets_multi ../examples/l3fwd/l3fwd_neon.h:226

Fixes: 96ff445 ("examples/l3fwd: reorganise and optimize LPM code path")

Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
spdk-bot pushed a commit that referenced this pull request May 4, 2022
The Kni kthreads seem to be re-scheduled at a granularity of roughly
1 millisecond right now, which seems to be insufficient for performing
tests involving a lot of control plane traffic.

Even if KNI_KTHREAD_RESCHEDULE_INTERVAL is set to 5 microseconds, it
seems that the existing code cannot reschedule at the desired granularily,
due to precision constraints of schedule_timeout_interruptible().

In our use case, we leverage the Linux Kernel for control plane, and
it is not uncommon to have 60K - 100K pps for some signaling protocols.

Since we are not in atomic context, the usleep_range() function seems to be
more appropriate for being able to introduce smaller controlled delays,
in the range of 5-10 microseconds. Upon reading the existing code, it would
seem that this was the original intent. Adding sub-millisecond delays,
seems unfeasible with a call to schedule_timeout_interruptible().

KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
schedule_timeout_interruptible(
        usecs_to_jiffies(KNI_KTHREAD_RESCHEDULE_INTERVAL));

Below, we attempted a brief comparison between the existing implementation,
which uses schedule_timeout_interruptible() and usleep_range().

We attempt to measure the CPU usage, and RTT between two Kni interfaces,
which are created on top of vmxnet3 adapters, connected by a vSwitch.

insmod rte_kni.ko kthread_mode=single carrier=on

schedule_timeout_interruptible(usecs_to_jiffies(5))
kni_single CPU Usage: 2-4 %
[root@localhost ~]# ping 1.1.1.2 -I eth1
PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 eth1: 56(84) bytes of data.
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.70 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.00 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.99 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.985 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.00 ms

usleep_range(5, 10)
kni_single CPU usage: 50%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.338 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.150 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.123 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.139 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.159 ms

usleep_range(20, 50)
kni_single CPU usage: 24%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.202 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.170 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.171 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.248 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.185 ms

usleep_range(50, 100)
kni_single CPU usage: 13%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.537 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.257 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.231 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.143 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.200 ms

usleep_range(100, 200)
kni_single CPU usage: 7%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.716 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.167 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.459 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.455 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.252 ms

usleep_range(1000, 1100)
kni_single CPU usage: 2%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.22 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.15 ms

Upon testing, usleep_range(1000, 1100) seems roughly equivalent in
latency and cpu usage to the variant with schedule_timeout_interruptible(),
while usleep_range(100, 200) seems to give a decent tradeoff between
latency and cpu usage, while allowing users to tweak the limits for
improved precision if they have such use cases.

Disabling RTE_KNI_PREEMPT_DEFAULT, interestingly seems to lead to a
softlockup on my kernel.

Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 1226 Comm: kni_single Tainted: G        W  O 3.10 #1
 <IRQ>  [<ffffffff814f84de>] dump_stack+0x19/0x1b
 [<ffffffff814f7891>] panic+0xcd/0x1e0
 [<ffffffff810993b0>] watchdog_timer_fn+0x160/0x160
 [<ffffffff810644b2>] __run_hrtimer.isra.4+0x42/0xd0
 [<ffffffff81064b57>] hrtimer_interrupt+0xe7/0x1f0
 [<ffffffff8102cd57>] smp_apic_timer_interrupt+0x67/0xa0
 [<ffffffff8150321d>] apic_timer_interrupt+0x6d/0x80

This patch also attempts to remove this option.

References:
[1] https://www.kernel.org/doc/Documentation/timers/timers-howto.txt

Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
Acked-by: Padraig Connolly <Padraig.J.Connolly@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
spdk-bot pushed a commit that referenced this pull request May 4, 2022
Cast 1 to type uint64_t to avoid overflow.

CID 375812 (#1 of 1):
Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression 1 << 2 * i + 1
with type int (32 bits, signed) is evaluated using 32-bit arithmetic, and
then used in a context that expects an expression of type uint64_t
(64 bits, unsigned).

Coverity issue: 375812
Fixes: 5fec01c ("net/i40e: support Linux VF to configure IRQ link list")
Cc: stable@dpdk.org

Signed-off-by: Steve Yang <stevex.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
spdk-bot pushed a commit that referenced this pull request May 4, 2022
The "kni_dev" is the private data of the "net_device" in kni, and allocated
with the "net_device" by calling "alloc_netdev()". The "net_device" is
freed by calling "free_netdev()" when kni release. The freed memory
includes the "kni_dev". So after "kni_dev" should not be accessed after
"net_device" is released.

Fixes: e77fec6 ("kni: fix possible mbuf leaks and speed up port release")
Cc: stable@dpdk.org

KASAN trace:

[   85.263717] ==========================================================
[   85.264418] BUG: KASAN: use-after-free in kni_net_release_fifo_phy+
		0x30/0x84 [rte_kni]
[   85.265139] Read of size 8 at addr ffff000260668d60 by task kni/341
[   85.265703]
[   85.265857] CPU: 0 PID: 341 Comm: kni Tainted: G     U     O
		5.15.0-rc4+ #1
[   85.266525] Hardware name: linux,dummy-virt (DT)
[   85.266968] Call trace:
[   85.267220]  dump_backtrace+0x0/0x2d0
[   85.267591]  show_stack+0x24/0x30
[   85.267924]  dump_stack_lvl+0x8c/0xb8
[   85.268294]  print_address_description.constprop.0+0x74/0x2b8
[   85.268855]  kasan_report+0x1e4/0x200
[   85.269224]  __asan_load8+0x98/0xd4
[   85.269577]  kni_net_release_fifo_phy+0x30/0x84 [rte_kni]
[   85.270116]  kni_dev_remove.isra.0+0x50/0x64 [rte_kni]
[   85.270630]  kni_ioctl_release+0x254/0x320 [rte_kni]
[   85.271136]  kni_ioctl+0x64/0xb0 [rte_kni]
[   85.271553]  __arm64_sys_ioctl+0xdc/0x120
[   85.271955]  invoke_syscall+0x68/0x1a0
[   85.272332]  el0_svc_common.constprop.0+0x90/0x200
[   85.272807]  do_el0_svc+0x94/0xa4
[   85.273144]  el0_svc+0x78/0x240
[   85.273463]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.273895]  el0t_64_sync+0x1a0/0x1a4
[   85.274264]
[   85.274427] Allocated by task 341:
[   85.274767]  kasan_save_stack+0x2c/0x60
[   85.275157]  __kasan_kmalloc+0x90/0xb4
[   85.275533]  __kmalloc_node+0x230/0x594
[   85.275917]  kvmalloc_node+0x8c/0x190
[   85.276286]  alloc_netdev_mqs+0x70/0x6b0
[   85.276678]  kni_ioctl_create+0x224/0xf40 [rte_kni]
[   85.277166]  kni_ioctl+0x9c/0xb0 [rte_kni]
[   85.277581]  __arm64_sys_ioctl+0xdc/0x120
[   85.277980]  invoke_syscall+0x68/0x1a0
[   85.278357]  el0_svc_common.constprop.0+0x90/0x200
[   85.278830]  do_el0_svc+0x94/0xa4
[   85.279172]  el0_svc+0x78/0x240
[   85.279491]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.279925]  el0t_64_sync+0x1a0/0x1a4
[   85.280292]
[   85.280454] Freed by task 341:
[   85.280763]  kasan_save_stack+0x2c/0x60
[   85.281147]  kasan_set_track+0x2c/0x40
[   85.281522]  kasan_set_free_info+0x2c/0x50
[   85.281930]  __kasan_slab_free+0xdc/0x140
[   85.282331]  slab_free_freelist_hook+0x90/0x250
[   85.282782]  kfree+0x128/0x580
[   85.283099]  kvfree+0x48/0x60
[   85.283402]  netdev_freemem+0x34/0x44
[   85.283770]  netdev_release+0x50/0x64
[   85.284138]  device_release+0xa0/0x120
[   85.284516]  kobject_put+0xf8/0x160
[   85.284867]  put_device+0x20/0x30
[   85.285204]  free_netdev+0x22c/0x310
[   85.285562]  kni_dev_remove.isra.0+0x48/0x64 [rte_kni]
[   85.286076]  kni_ioctl_release+0x254/0x320 [rte_kni]
[   85.286573]  kni_ioctl+0x64/0xb0 [rte_kni]
[   85.286992]  __arm64_sys_ioctl+0xdc/0x120
[   85.287392]  invoke_syscall+0x68/0x1a0
[   85.287769]  el0_svc_common.constprop.0+0x90/0x200
[   85.288243]  do_el0_svc+0x94/0xa4
[   85.288579]  el0_svc+0x78/0x240
[   85.288899]  el0t_64_sync_handler+0x1a8/0x1b0
[   85.289332]  el0t_64_sync+0x1a0/0x1a4
[   85.289699]
[   85.289862] The buggy address belongs to the object at ffff000260668000
[   85.289862]  which belongs to the cache kmalloc-cg-8k of size 8192
[   85.291079] The buggy address is located 3424 bytes inside of
[   85.291079]  8192-byte region [ffff000260668000, ffff00026066a000)
[   85.292213] The buggy address belongs to the page:
[   85.292684] page:(____ptrval____) refcount:1 mapcount:0 mapping:
		0000000000000000 index:0x0 pfn:0x2a0668
[   85.293585] head:(____ptrval____) order:3 compound_mapcount:0
		compound_pincount:0
[   85.294305] flags: 0xbfff80000010200(slab|head|node=0|zone=2|
		lastcpupid=0x7fff)
[   85.295020] raw: 0bfff80000010200 0000000000000000 dead000000000122
		ffff0000c000d680
[   85.295767] raw: 0000000000000000 0000000080020002 00000001ffffffff
		0000000000000000
[   85.296512] page dumped because: kasan: bad access detected
[   85.297054]
[   85.297217] Memory state around the buggy address:
[   85.297688]  ffff000260668c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.298384]  ffff000260668c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.299088] >ffff000260668d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.299781]                                                        ^
[   85.300396]  ffff000260668d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.301092]  ffff000260668e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb
		fb fb
[   85.301787] ===========================================================

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
spdk-bot pushed a commit that referenced this pull request May 4, 2022
This patch fixes the stack buffer overflow error reported
from AddressSanitizer.
Function send_packetsx4() tries to access out of bound data
from rte_mbuf and fill it into TX buffer even in the case
where no pending packets (len = 0).
Performance impact:- No

ASAN error report:-
==819==ERROR: AddressSanitizer: stack-buffer-overflow on address
0xffffe2c0dcf0 at pc 0x0000005e791c bp 0xffffe2c0d7e0 sp 0xffffe2c0d800
READ of size 8 at 0xffffe2c0dcf0 thread T0
 #0 0x5e7918 in send_packetsx4 ../examples/l3fwd/l3fwd_common.h:251
 #1 0x5e7918 in send_packets_multi ../examples/l3fwd/l3fwd_neon.h:226

Fixes: 96ff445 ("examples/l3fwd: reorganise and optimize LPM code path")
Cc: stable@dpdk.org

Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
spdk-bot pushed a commit that referenced this pull request Nov 28, 2022
If DPDK is built with thread sanitizer it reports a race
in setting of multiprocess file descriptor. The fix is to
use atomic operations when updating mp_fd.

Build:
$ meson -Db_sanitize=address build
$ ninja -C build

Simple example:
$ .build/app/dpdk-testpmd -l 1-3 --no-huge
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
testpmd: No probed ethernet devices
testpmd: create a new mbuf pool <mb_pool_0>: n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: Error - exiting with code: 1
  Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
==================
WARNING: ThreadSanitizer: data race (pid=87245)
  Write of size 4 at 0x558e04d8ff70 by main thread:
    #0 rte_mp_channel_cleanup <null> (dpdk-testpmd+0x1e7d30c)
    #1 rte_eal_cleanup <null> (dpdk-testpmd+0x1e85929)
    #2 rte_exit <null> (dpdk-testpmd+0x1e5bc0a)
    #3 mbuf_pool_create.cold <null> (dpdk-testpmd+0x274011)
    #4 main <null> (dpdk-testpmd+0x5cc15d)

  Previous read of size 4 at 0x558e04d8ff70 by thread T2:
    #0 mp_handle <null> (dpdk-testpmd+0x1e7c439)
    #1 ctrl_thread_init <null> (dpdk-testpmd+0x1e6ee1e)

  As if synchronized via sleep:
    #0 nanosleep libsanitizer/tsan/tsan_interceptors_posix.cpp:366
    #1 get_tsc_freq <null> (dpdk-testpmd+0x1e92ff9)
    #2 set_tsc_freq <null> (dpdk-testpmd+0x1e6f2fc)
    #3 rte_eal_timer_init <null> (dpdk-testpmd+0x1e931a4)
    #4 rte_eal_init.cold <null> (dpdk-testpmd+0x29e578)
    #5 main <null> (dpdk-testpmd+0x5cbc45)

  Location is global 'mp_fd' of size 4 at 0x558e04d8ff70 (dpdk-testpmd+0x000003122f70)

  Thread T2 'rte_mp_handle' (tid=87248, running) created by main thread at:
    #0 pthread_create libsanitizer/tsan/tsan_interceptors_posix.cpp:969
    #1 rte_ctrl_thread_create <null> (dpdk-testpmd+0x1e6efd0)
    #2 rte_mp_channel_init.cold <null> (dpdk-testpmd+0x29cb7c)
    #3 rte_eal_init <null> (dpdk-testpmd+0x1e8662e)
    #4 main <null> (dpdk-testpmd+0x5cbc45)

SUMMARY: ThreadSanitizer: data race (app/dpdk-testpmd+0x1e7d30c) in rte_mp_channel_cleanup
==================
ThreadSanitizer: reported 1 warnings

Fixes: bacaa27 ("eal: add channel for multi-process communication")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
spdk-bot pushed a commit that referenced this pull request Jan 24, 2023
Devices can end up without driver assigned after
probing. The cleanup function from patch below does
not free devices that do not have it assigned.
The devices reported to leak are not the ones that
are the object of the hotplug test.

This issue appears only when using shared objects
build, and on a specific set of CI machines.
More debugging needs to happen to fully understand
the issue here.

Fixes patch below:
(1cab1a4)bus: cleanup devices on shutdown

Errors from ASAN:
00:16:25.099  ==48971==ERROR: LeakSanitizer: detected memory leaks
00:16:25.099
00:16:25.099  Indirect leak of 11544 byte(s) in 37 object(s) allocated
from:
00:16:25.099      #0 0x7f4d00f1a6af in __interceptor_malloc
(/usr/lib64/libasan.so.8+0xba6af)
00:16:25.099      #1 0x7f4d0017c4a7 in pci_scan_one
../drivers/bus/pci/linux/pci.c:218
00:16:25.099      #2 0x7f4d0017ceb5 in rte_pci_scan
../drivers/bus/pci/linux/pci.c:471
00:16:25.099      #3 0x7f4d0002394c in rte_bus_scan
../lib/eal/common/eal_common_bus.c:56
00:16:25.099      #4 0x7f4d00053847 in rte_eal_init
../lib/eal/linux/eal.c:1065
00:16:25.099      #5 0x7f4d00256c01 in spdk_env_init
/var/jenkins/workspace/hw-nvme-hotplug/spdk/lib/env_dpdk/init.c:585
00:16:25.099      #6 0x40709c in main
/var/jenkins/workspace/hw-nvme-hotplug/spdk/examples/nvme/hotplug/hotplug.c:571
00:16:25.099      #7 0x7f4cff50150f in __libc_start_call_main
(/usr/lib64/libc.so.6+0x2750f)

Change-Id: I78252587a0930a15097ce16227a4935d34871b75
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/dpdk/+/16443
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk-bot pushed a commit that referenced this pull request May 29, 2023
The net/vhost pmd currently provides a -1 vid when disabling interrupt
after a virtio port got disconnected.

This can be caught when running with ASan.

First, start dpdk-l3fwd-power in interrupt mode with a net/vhost port.

$ ./build-clang/examples/dpdk-l3fwd-power -l0,1 --in-memory \
	-a 0000:00:00.0 \
	--vdev net_vhost0,iface=plop.sock,client=1\
	-- \
	-p 0x1 \
	--interrupt-only \
	--config '(0,0,1)' \
	--parse-ptype 0

Then start testpmd with virtio-user.

$ ./build-clang/app/dpdk-testpmd -l0,2 --single-file-segment --in-memory \
	-a 0000:00:00.0 \
	--vdev net_virtio_user0,path=plop.sock,server=1 \
	-- \
	-i

Finally stop testpmd.
ASan then splats in dpdk-l3fwd-power:

=================================================================
==3641005==ERROR: AddressSanitizer: global-buffer-overflow on address
	0x000005ed0778 at pc 0x000001270f81 bp 0x7fddbd2eee20
	sp 0x7fddbd2eee18
READ of size 8 at 0x000005ed0778 thread T2
    #0 0x1270f80 in get_device .../lib/vhost/vhost.h:801:27
    #1 0x1270f80 in rte_vhost_get_vhost_vring .../lib/vhost/vhost.c:951:8
    #2 0x3ac95cb in eth_rxq_intr_disable
	.../drivers/net/vhost/rte_eth_vhost.c:647:8
    #3 0x170e0bf in rte_eth_dev_rx_intr_disable
	.../lib/ethdev/rte_ethdev.c:5443:25
    #4 0xf72ba7 in turn_on_off_intr .../examples/l3fwd-power/main.c:881:4
    #5 0xf71045 in main_intr_loop .../examples/l3fwd-power/main.c:1061:6
    #6 0x17f9292 in eal_thread_loop
	.../lib/eal/common/eal_common_thread.c:210:9
    #7 0x18373f5 in eal_worker_thread_loop .../lib/eal/linux/eal.c:915:2
    #8 0x7fddc16ae12c in start_thread (/lib64/libc.so.6+0x8b12c)
	(BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
    #9 0x7fddc172fbbf in __GI___clone3 (/lib64/libc.so.6+0x10cbbf)
	(BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)

0x000005ed0778 is located 8 bytes to the left of global variable
	'vhost_devices' defined in '.../lib/vhost/vhost.c:24'
	(0x5ed0780) of size 8192
0x000005ed0778 is located 20 bytes to the right of global variable
	'vhost_config_log_level' defined in '.../lib/vhost/vhost.c:2174'
	(0x5ed0760) of size 4
SUMMARY: AddressSanitizer: global-buffer-overflow
	.../lib/vhost/vhost.h:801:27 in get_device
Shadow bytes around the buggy address:
  0x000080bd2090: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x000080bd20a0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x000080bd20b0: f9 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9
  0x000080bd20c0: 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x000080bd20d0: 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
=>0x000080bd20e0: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 04 f9 f9[f9]
  0x000080bd20f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080bd2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080bd2110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080bd2120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080bd2130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
Thread T2 created by T0 here:
    #0 0xe98996 in __interceptor_pthread_create
	(.examples/dpdk-l3fwd-power+0xe98996)
	(BuildId: d0b984a3b0287b9e0f301b73426fa921aeecca3a)
    #1 0x1836767 in eal_worker_thread_create .../lib/eal/linux/eal.c:952:6
    #2 0x1834b83 in rte_eal_init .../lib/eal/linux/eal.c:1257:9
    #3 0xf68902 in main .../examples/l3fwd-power/main.c:2496:8
    #4 0x7fddc164a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
	(BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)

==3641005==ABORTING

More generally, any application passing an incorrect vid would trigger
such an OOB access.

Fixes: 4796ad6 ("examples/vhost: import userspace vhost application")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
spdk-bot pushed a commit that referenced this pull request Jul 31, 2023
Segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000
processor which has 64 cores and 4 NUMA nodes.

From the ixgbe_recv_pkts_lro() function, we found that as long as the first
packet has the EOP bit set, and the length of this packet is less than or
equal to rxq->crc_len, the segmentation fault will definitely happen even
though on the other platforms. For example, if we made the first packet
which had the EOP bit set had a zero length by force, the segmentation
fault would happen on X86.

Because when processd the first packet the first_seg->next will be NULL, if
at the same time this packet has the EOP bit set and its length is less
than or equal to rxq->crc_len, the following loop will be executed:

    for (lp = first_seg; lp->next != rxm; lp = lp->next)
        ;

We know that the first_seg->next will be NULL under this condition. So the
expression of lp->next->next will cause the segmentation fault.

Normally, the length of the first packet with EOP bit set will be greater
than rxq->crc_len. However, the out-of-order execution of CPU may make the
read ordering of the status and the rest of the descriptor fields in this
function not be correct. The related codes are as following:

        rxdp = &rx_ring[rx_id];
 #1     staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

        if (!(staterr & IXGBE_RXDADV_STAT_DD))
            break;

 #2     rxd = *rxdp;

The sentence #2 may be executed before sentence #1. This action is likely
to make the ready packet zero length. If the packet is the first packet and
has the EOP bit set, the above segmentation fault will happen.

So, we should add a proper memory barrier to ensure the read ordering be
correct. We also did the same thing in the ixgbe_recv_pkts() function to
make the rxd data be valid even though we did not find segmentation fault
in this function.

Fixes: 8eecb32 ("ixgbe: add LRO support")
Cc: stable@dpdk.org

Signed-off-by: Min Zhou <zhoumin@loongson.cn>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Artemy-Mellanox referenced this pull request in Artemy-Mellanox/dpdk Nov 28, 2023
getline() may allocate a buffer even though it returns -1:
"""
If  *lineptr  is set to NULL before the call, then getline() will allocate
a buffer for storing the line.  This buffer should be freed by the user
program even if getline() failed.
"""

This leak has been observed on a RHEL8 system with two CX5 PF devices
(no VFs).

ASan reports:
==8899==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 120 byte(s) in 1 object(s) allocated from:
    #0 0x7fe58576aba8 in __interceptor_malloc
	(/lib64/libasan.so.5+0xefba8)
    #1 0x7fe583e866b2 in __getdelim (/lib64/libc.so.6+0x886b2)
    spdk#2 0x327bd23 in mlx5_sysfs_switch_info
	../drivers/net/mlx5/linux/mlx5_ethdev_os.c:1084
    spdk#3 0x3271f86 in mlx5_os_pci_probe_pf
	../drivers/net/mlx5/linux/mlx5_os.c:2282
    spdk#4 0x3273c83 in mlx5_os_pci_probe
	../drivers/net/mlx5/linux/mlx5_os.c:2497
    spdk#5 0x327475f in mlx5_os_net_probe
	../drivers/net/mlx5/linux/mlx5_os.c:2578
    #6 0xc6eac7 in drivers_probe
	../drivers/common/mlx5/mlx5_common.c:937
    #7 0xc6f150 in mlx5_common_dev_probe
	../drivers/common/mlx5/mlx5_common.c:1027
    #8 0xc8ef80 in mlx5_common_pci_probe
	../drivers/common/mlx5/mlx5_common_pci.c:168
    #9 0xc21b67 in rte_pci_probe_one_driver
	../drivers/bus/pci/pci_common.c:312
    #10 0xc2224c in pci_probe_all_drivers
	../drivers/bus/pci/pci_common.c:396
    #11 0xc222f4 in pci_probe ../drivers/bus/pci/pci_common.c:423
    #12 0xb71fff in rte_bus_probe ../lib/eal/common/eal_common_bus.c:78
    #13 0xbe6888 in rte_eal_init ../lib/eal/linux/eal.c:1300
    #14 0x5ec717 in main ../app/test-pmd/testpmd.c:4515
    #15 0x7fe583e38d84 in __libc_start_main (/lib64/libc.so.6+0x3ad84)

As far as why getline() errors, strace gives a hint:
8516  openat(AT_FDCWD, "/sys/class/net/enp130s0f0/phys_port_name",
	O_RDONLY) = 34
8516  fstat(34, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
8516  read(34, 0x621000098900, 4096)    = -1 EOPNOTSUPP (Operation
	not supported)

Fixes: f8a226e ("net/mlx5: fix sysfs port name translation")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
spdk-bot pushed a commit that referenced this pull request Apr 9, 2024
In function mlx5_dev_configure, dev->data->tx_queues is assigned
to priv->txqs. When a member is removed from a bond, the function
eth_dev_tx_queue_config is called to release dev->data->tx_queues.
However, function mlx5_dev_close will access priv->txqs again and
cause the use after free problem.

In function mlx5_dev_close, before free priv->txqs, we add a check
that dev->data->tx_queues is not NULL.

build/app/dpdk-testpmd -c7 -a 0000:08:00.2 --  -i --nb-cores=2
--total-num-mbufs=2048

testpmd> port stop 0
testpmd> create bonding device 4 0
testpmd> add bonding member 0 1
testpmd> remove bonding member 0 1
testpmd> quit

ASan reports:
==2571911==ERROR: AddressSanitizer: heap-use-after-free on address
0x000174529880 at pc 0x0000113c8440 bp 0xffffefae0ea0 sp 0xffffefae0eb0
READ of size 8 at 0x000174529880 thread T0
    #0 0x113c843c in mlx5_txq_release ../drivers/net/mlx5/mlx5_txq.c:
1203
    #1 0xffdb53c in mlx5_dev_close ../drivers/net/mlx5/mlx5.c:2286
    #2 0xe12dc0 in rte_eth_dev_close ../lib/ethdev/rte_ethdev.c:1877
    #3 0x6bac1c in close_port ../app/test-pmd/testpmd.c:3540
    #4 0x6bc320 in pmd_test_exit ../app/test-pmd/testpmd.c:3808
    #5 0x6c1a94 in main ../app/test-pmd/testpmd.c:4759
    #6 0xffff9328f038  (/usr/lib64/libc.so.6+0x2b038)
    #7 0xffff9328f110 in __libc_start_main (/usr/lib64/libc.so.6+
0x2b110)

Fixes: 6e78005 ("net/mlx5: add reference counter on DPDK Tx queues")
Cc: stable@dpdk.org

Reported-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Pengfei Sun <sunpengfei16@huawei.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants