Commits on Mar 16, 2021

  1. bpf: selftest: Add kfunc_call test

    This patch adds two kernel function bpf_kfunc_call_test[12]() for the
    selftest's test_run purpose.  They will be allowed for tc_cls prog.
    
    The selftest calling the kernel function bpf_kfunc_call_test[12]()
    is also added in this patch.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  2. bpf: selftest: bpf_cubic and bpf_dctcp calling kernel functions

    This patch removes the bpf implementation of tcp_slow_start()
    and tcp_cong_avoid_ai().  Instead, it directly uses the kernel
    implementation.
    
    It also replaces the bpf_cubic_undo_cwnd implementation by directly
    calling tcp_reno_undo_cwnd().  bpf_dctcp also directly calls
    tcp_reno_cong_avoid() instead.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  3. bpf: selftests: Rename bictcp to bpf_cubic

    As with the similar change in the kernel, this patch gives the proper
    name to the bpf cubic.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  4. libbpf: Support extern kernel function

    This patch is to make libbpf able to handle the following extern
    kernel function declaration and do the needed relocations before
    loading the bpf program to the kernel.
    
    extern int foo(struct sock *) __attribute__((section(".ksyms")));
    
    In the collect extern phase, the needed changes are made to
    bpf_object__collect_externs() and find_extern_btf_id() to collect
    functions.
    
    In the collect relo phase, it will record the kernel function
    call as RELO_EXTERN_FUNC.
    
    bpf_object__resolve_ksym_func_btf_id() is added to find the func
    btf_id of the running kernel.
    
    During actual relocation, it will patch the BPF_CALL instruction with
    src_reg = BPF_PSEUDO_KFUNC_CALL and insn->imm set to the running
    kernel func's btf_id.
    
    btf_fixup_datasec() is also changed because a datasec may
    only have func and its size will be 0.  The "!size" test
    is postponed till it is confirmed there are vars.
    It also takes this chance to remove the
    "if (... || (t->size && t->size != size)) { return -ENOENT; }" test
    because t->size is zero at that point.
    
    The required LLVM patch: https://reviews.llvm.org/D93563
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  5. libbpf: Record extern sym relocation first

    This patch records the extern sym relocs first before recording
    subprog relocs.  A later patch will add relocs for extern
    kernel function calls, which also use BPF_JMP | BPF_CALL.
    It will be easier to handle the extern symbols first in
    that later patch.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  6. libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR

    This patch renames RELO_EXTERN to RELO_EXTERN_VAR.
    It is to avoid the confusion with a later patch adding
    RELO_EXTERN_FUNC.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  7. libbpf: Refactor codes for finding btf id of a kernel symbol

    This patch refactors the code that finds a kernel btf_id by kind
    and symbol name into a new function, find_ksym_btf_id().
    
    It also adds a new helper __btf_kind_str() to return
    a string by the numeric kind value.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  8. libbpf: Refactor bpf_object__resolve_ksyms_btf_id

    This patch refactors most of the logic from
    bpf_object__resolve_ksyms_btf_id() into a new function
    bpf_object__resolve_ksym_var_btf_id().
    It is to get ready for a later patch adding
    bpf_object__resolve_ksym_func_btf_id() which resolves
    a kernel function to the running kernel btf_id.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  9. bpf: tcp: White list some tcp cong functions to be called by bpf-tcp-cc

    This patch white lists some tcp cong helper functions, tcp_slow_start()
    and tcp_cong_avoid_ai().  They are allowed to be directly called by
    the bpf-tcp-cc program.
    
    A few tcp cc implementation functions are also white listed.
    A potential use case is the bpf-tcp-cc implementation may only
    want to override a subset of a tcp_congestion_ops.  For others,
    the bpf-tcp-cc can directly call the kernel counterparts instead of
    re-implementing (or copy-and-pasting) them to the bpf program.
    
    They will only be available to the bpf-tcp-cc typed program.
    The white listed functions are not bound to a fixed ABI contract.
    When any of them changes, the bpf-tcp-cc program has to be changed,
    just as any in-tree/out-of-tree kernel tcp-cc implementation does.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  10. tcp: Rename bictcp function prefix to cubictcp

    The cubic functions in tcp_cubic.c are using the bictcp prefix as
    in tcp_bic.c.  This patch gives it the proper name cubictcp
    because the later patch will allow the bpf prog to directly
    call the cubictcp implementation.  Renaming them will avoid
    the name collision when trying to find the intended
    one to call during bpf prog load time.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  11. bpf: Support kernel function call in x86-32

    This patch adds kernel function call support to the x86-32 bpf jit.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  12. bpf: Support bpf program calling kernel function

    This patch adds support to the BPF verifier to allow a bpf program to
    call a kernel function directly.
    
    The use case included in this set is to allow bpf-tcp-cc to directly
    call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
    functions have already been used by some kernel tcp-cc implementations.
    
    This set will also allow the bpf-tcp-cc program to directly call the
    kernel tcp-cc implementation.  For example, a bpf_dctcp may only want to
    implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
    from the kernel tcp_dctcp.c instead of reimplementing (or
    copy-and-pasting) them.
    
    The tcp-cc kernel functions mentioned above will be white listed
    for the struct_ops bpf-tcp-cc programs to use in a later patch.
    The white listed functions are not bound to a fixed ABI contract.
    Those functions have already been used by the existing kernel tcp-cc.
    If any of them has changed, both in-tree and out-of-tree kernel tcp-cc
    implementations have to be changed.  The same goes for the struct_ops
    bpf-tcp-cc programs which have to be adjusted accordingly.
    
    This patch is to make the required changes in the bpf verifier.
    
    The first change is in btf.c: it adds a case in "do_btf_check_func_arg_match()".
    When the passed in "btf->kernel_btf == true", it means matching the
    verifier regs' states with a kernel function.  This will handle the
    PTR_TO_BTF_ID reg.  It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET,
    and PTR_TO_TCP_SOCK to their kernel btf_ids.
    
    In the later libbpf patch, the insn calling a kernel function will
    look like:
    
    insn->code == (BPF_JMP | BPF_CALL)
    insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */
    insn->imm == func_btf_id /* btf_id of the running kernel */
    
    [ For the future calling function-in-kernel-module support, an array
      of module btf_fds can be passed at the load time and insn->off
      can be used to index into this array. ]
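
As a rough illustration (not part of the patch), the call encoding described above can be sketched in userspace C. The struct and helper names below are hypothetical stand-ins; only the numeric opcode and pseudo-call values are meant to mirror the kernel's uapi header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch builds without kernel headers.
 * The values are intended to mirror include/uapi/linux/bpf.h. */
#define BPF_JMP               0x05
#define BPF_CALL              0x80
#define BPF_PSEUDO_CALL       1 /* bpf-to-bpf subprog call */
#define BPF_PSEUDO_KFUNC_CALL 2 /* call to a kernel function */

/* Hypothetical mirror of struct bpf_insn (same field order). */
struct insn_sketch {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

/* Build a call insn whose imm carries the kernel func's btf_id. */
static struct insn_sketch make_kfunc_call(int32_t func_btf_id)
{
	struct insn_sketch insn = {
		.code = BPF_JMP | BPF_CALL,
		.src_reg = BPF_PSEUDO_KFUNC_CALL,
		.imm = func_btf_id,
	};
	return insn;
}

/* The same opcode is shared by helper, subprog, and kernel function
 * calls; src_reg is what tells them apart. */
static bool is_kfunc_call(const struct insn_sketch *insn)
{
	return insn->code == (BPF_JMP | BPF_CALL) &&
	       insn->src_reg == BPF_PSEUDO_KFUNC_CALL;
}
```

Because the opcode is shared, a check on src_reg alone is not enough; the sketch tests the opcode and src_reg together.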
    
    At the early stage of verifier, the verifier will collect all kernel
    function calls into "struct bpf_kern_func_descriptor".  Those
    descriptors are stored in "prog->aux->kfunc_tab" and will
    be available to the JIT.  Since this "add" operation is similar
    to the current "add_subprog()" and looking for the same insn->code,
    they are done together in the new "add_subprog_and_kern_func()".
    
    In the "do_check()" stage, the new "check_kern_func_call()" is added
    to verify the kernel function call instruction:
    1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE.
       A new bpf_verifier_ops "check_kern_func_call" is added to do that.
       The bpf-tcp-cc struct_ops program will implement this function in
       a later patch.
    2. Call "btf_check_kern_func_args_match()" to ensure the regs can be
       used as the args of a kernel function.
    3. Mark the regs' type, subreg_def, and zext_dst.
    
    At the later do_misc_fixups() stage, the new fixup_kern_func_call()
    will replace the insn->imm with the function address (relative
    to __bpf_call_base).  If needed, the jit can find the btf_func_model
    by calling the new bpf_jit_find_kern_func_model(prog, insn->imm).
    With the imm set to the function address, "bpftool prog dump xlated"
    will be able to display the kernel function calls the same way as
    it displays other bpf helper calls.
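
The "address relative to __bpf_call_base" step can be pictured as computing a signed 32-bit displacement. The sketch below is an assumption-laden illustration: call_imm_from_addr() is a hypothetical helper, and the explicit range check is this sketch's own choice rather than something the patch description states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: store (func_addr - call_base) into *imm if the
 * displacement fits in the insn's 32-bit imm field. */
static bool call_imm_from_addr(uint64_t func_addr, uint64_t call_base,
			       int32_t *imm)
{
	/* Unsigned subtraction wraps; reinterpret as a signed distance. */
	int64_t delta = (int64_t)(func_addr - call_base);

	if (delta < INT32_MIN || delta > INT32_MAX)
		return false; /* would not fit in insn->imm */

	*imm = (int32_t)delta;
	return true;
}
```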
    
    A gpl_compatible program is required to call a kernel function.
    
    This feature currently requires JIT.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  13. bpf: Refactor btf_check_func_arg_match

    This patch refactors the core logic of "btf_check_func_arg_match()"
    into a new function "do_btf_check_func_arg_match()".
    "do_btf_check_func_arg_match()" will be reused later to check
    the kernel function call.
    
    The "if (!btf_type_is_ptr(t))" is checked first to improve the indentation
    which will be useful for a later patch.
    
    Some of the "btf_kind_str[]" usages are replaced with the shortcut
    "btf_type_str(t)".
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  14. bpf: btf: Support parsing extern func

    This patch makes the BTF verifier accept extern func. It is used for
    allowing a bpf program to call a limited set of kernel functions
    in a later patch.
    
    When writing a bpf prog, the extern kernel function needs
    to be declared under an ELF section (".ksyms"), which is
    the same as the current extern kernel variables.  That should
    keep its usage consistent without requiring users to remember another
    section name.
    
    For example, in a bpf_prog.c:
    
    extern int foo(struct sock *) __attribute__((section(".ksyms")));
    
    [24] FUNC_PROTO '(anon)' ret_type_id=15 vlen=1
    	'(anon)' type_id=18
    [25] FUNC 'foo' type_id=24 linkage=extern
    [ ... ]
    [33] DATASEC '.ksyms' size=0 vlen=1
    	type_id=25 offset=0 size=0
    
    LLVM will put the "func" type into the BTF datasec ".ksyms".
    The current "btf_datasec_check_meta()" assumes everything under
    it is a "var" and ensures it has non-zero size ("!vsi->size" test).
    The non-zero size check is not true for "func".  This patch postpones the
    "!vsi->size" test from "btf_datasec_check_meta()" to
    "btf_datasec_resolve()" which has all types collected to decide
    if a vsi is a "var" or a "func" and then enforce the "vsi->size"
    differently.
    
    If the datasec only has "func", its "t->size" could be zero.
    Thus, the current "!t->size" test is no longer valid.  The
    invalid "t->size" will still be caught by the later
    "last_vsi_end_off > t->size" check.   This patch also takes this
    chance to consolidate other "t->size" tests ("vsi->offset >= t->size",
    "vsi->size > t->size", and "t->size < sum") into the existing
    "last_vsi_end_off > t->size" test.
    
    LLVM will also emit each of those extern kernel functions as an extern
    linkage func in the BTF:
    
    [24] FUNC_PROTO '(anon)' ret_type_id=15 vlen=1
    	'(anon)' type_id=18
    [25] FUNC 'foo' type_id=24 linkage=extern
    
    This patch allows BTF_FUNC_EXTERN in btf_func_check_meta().
    Also, an extern kernel function declaration does not
    necessarily have arg names. Another change in btf_func_check() is
    to allow an extern function to have no arg names.
    
    The btf selftest is adjusted accordingly.  New tests are also added.
    
    The required LLVM patch: https://reviews.llvm.org/D93563
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021
  15. bpf: Simplify freeing logic in linfo and jited_linfo

    This patch simplifies the linfo freeing logic by combining
    "bpf_prog_free_jited_linfo()" and "bpf_prog_free_unused_jited_linfo()"
    into the new "bpf_prog_jit_attempt_done()".
    It is a prep work for the kernel function call support.  In a later
    patch, freeing the kernel function call descriptors will also
    be done in the "bpf_prog_jit_attempt_done()".
    
    "bpf_prog_free_linfo()" is removed since it is only called by
    "__bpf_prog_put_noref()".  The kvfree() are directly called
    instead.
    
    It also takes this chance to s/kcalloc/kvcalloc/ for the jited_linfo
    allocation.
    
    Signed-off-by: Martin KaFai Lau <kafai@fb.com>
    iamkafai authored and intel-lab-lkp committed Mar 16, 2021

Commits on Mar 15, 2021

  1. bpf: Add getter and setter for SO_REUSEPORT through bpf_{g,s}etsockopt

    Augment the current set of options that are accessible via
    bpf_{g,s}etsockopt to also support SO_REUSEPORT.
    
    Signed-off-by: Manu Bretelle <chantra@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20210310182305.1910312-1-chantra@fb.com
    chantra authored and borkmann committed Mar 15, 2021

Commits on Mar 10, 2021

  1. Merge branch 'libbpf/xsk cleanups'

    Björn Töpel says:
    
    ====================
    
    This series removes a header dependency from xsk.h, and moves
    libbpf_util.h into xsk.h.
    
    More details in each commit!
    
    Thank you,
    Björn
    ====================
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    anakryiko committed Mar 10, 2021
  2. libbpf: xsk: Move barriers from libbpf_util.h to xsk.h

    The only user of libbpf_util.h is xsk.h. Move the barriers to xsk.h,
    and remove libbpf_util.h. The barriers are used as an implementation
    detail, and should not be considered part of the stable API.
    
    Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20210310080929.641212-3-bjorn.topel@gmail.com
    Björn Töpel authored and anakryiko committed Mar 10, 2021
  3. libbpf: xsk: Remove linux/compiler.h header

    In commit 291471d ("libbpf, xsk: Add libbpf_smp_store_release
    libbpf_smp_load_acquire") linux/compiler.h was added as a dependency
    to xsk.h, which is the user-facing API. This makes it harder for
    userspace application to consume the library. Here the header
    inclusion is removed, and instead {READ,WRITE}_ONCE() is added
    explicitly.
    
    Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20210310080929.641212-2-bjorn.topel@gmail.com
    Björn Töpel authored and anakryiko committed Mar 10, 2021
  4. bpf: Fix warning comparing pointer to 0

    Fix the following coccicheck warning:
    
    ./tools/testing/selftests/bpf/progs/fentry_test.c:67:12-13: WARNING
    comparing pointer to 0.
    
    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/1615360714-30381-1-git-send-email-jiapeng.chong@linux.alibaba.com
    Jiapeng Chong authored and anakryiko committed Mar 10, 2021
  5. selftests/bpf: Fix warning comparing pointer to 0

    Fix the following coccicheck warning:
    
    ./tools/testing/selftests/bpf/progs/test_global_func10.c:17:12-13:
    WARNING comparing pointer to 0.
    
    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/1615357366-97612-1-git-send-email-jiapeng.chong@linux.alibaba.com
    Jiapeng Chong authored and anakryiko committed Mar 10, 2021
  6. Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

    Alexei Starovoitov says:
    
    ====================
    pull-request: bpf-next 2021-03-09
    
    The following pull-request contains BPF updates for your *net-next* tree.
    
    We've added 90 non-merge commits during the last 17 day(s) which contain
    a total of 114 files changed, 5158 insertions(+), 1288 deletions(-).
    
    The main changes are:
    
    1) Faster bpf_redirect_map(), from Björn.
    
    2) skmsg cleanup, from Cong.
    
    3) Support for floating point types in BTF, from Ilya.
    
    4) Documentation for sys_bpf commands, from Joe.
    
    5) Support for sk_lookup in bpf_prog_test_run, from Lorenz.
    
    6) Enable task local storage for tracing programs, from Song.
    
    7) bpf_for_each_map_elem() helper, from Yonghong.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Mar 10, 2021
  7. Merge git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net

    Pull networking fixes from David Miller:
    
     1) Fix transmissions in dynamic SMPS mode in ath9k, from Felix Fietkau.
    
     2) TX skb error handling fix in mt76 driver, also from Felix.
    
     3) Fix BPF_FETCH atomic in x86 JIT, from Brendan Jackman.
    
     4) Avoid double free of percpu pointers when freeing a cloned bpf prog.
        From Cong Wang.
    
     5) Use correct printf format for dma_addr_t in ath11k, from Geert
        Uytterhoeven.
    
     6) Fix resolve_btfids build with older toolchains, from Kun-Chuan
        Hsieh.
    
     7) Don't report truncated frames to mac80211 in mt76 driver, from
        Lorenzo Bianconi.
    
     8) Fix watchdog timeout on suspend/resume of stmmac, from Joakim Zhang.
    
     9) mscc ocelot needs NET_DEVLINK select in Kconfig, from Arnd Bergmann.
    
    10) Fix sign comparison bug in TCP_ZEROCOPY_RECEIVE getsockopt(), from
        Arjun Roy.
    
    11) Ignore routes with deleted nexthop object in mlxsw, from Ido
        Schimmel.
    
    12) Need to undo tcp early demux lookup sometimes in nf_nat, from
        Florian Westphal.
    
    13) Fix gro aggregation for udp encaps with zero csum, from Daniel
        Borkmann.
    
    14) Make sure to always use icmp*_ndo_send when necessary, from Jason A.
        Donenfeld.
    
    15) Fix TRSCER masks in sh_eth driver from Sergey Shtylyov.
    
    16) Prevent overly huge skb allocations in qrtr, from Pavel Skripkin.
    
    17) Prevent rx ring consumer index loss of sync in enetc, from Vladimir
        Oltean.
    
    18) Make sure textsearch control block is large enough, from Willem de
        Bruijn.
    
    19) Revert MAC changes to r8152 leading to instability, from Hayes Wang.
    
    20) Advance iov in 9p even for empty reads, from Jisheng Zhang.
    
    21) Double hook unregister in nftables, from Pablo Neira Ayuso.
    
    22) Fix memleak in ixgbe, from Dinghao Liu.
    
    23) Avoid dups in pkt scheduler class dumps, from Maximilian Heyne.
    
    24) Various mptcp fixes from Florian Westphal, Paolo Abeni, and Geliang
        Tang.
    
    25) Fix DOI refcount bugs in cipso, from Paul Moore.
    
    26) One too many irqsave in ibmvnic, from Junlin Yang.
    
    27) Fix infinite loop with MPLS gso segmenting via virtio_net, from
        Balazs Nemeth.
    
    * git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net: (164 commits)
      s390/qeth: fix notification for pending buffers during teardown
      s390/qeth: schedule TX NAPI on QAOB completion
      s390/qeth: improve completion of pending TX buffers
      s390/qeth: fix memory leak after failed TX Buffer allocation
      net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
      net: check if protocol extracted by virtio_net_hdr_set_proto is correct
      net: dsa: xrs700x: check if partner is same as port in hsr join
      net: lapbether: Remove netif_start_queue / netif_stop_queue
      atm: idt77252: fix null-ptr-dereference
      atm: uPD98402: fix incorrect allocation
      atm: fix a typo in the struct description
      net: qrtr: fix error return code of qrtr_sendmsg()
      mptcp: fix length of ADD_ADDR with port sub-option
      net: bonding: fix error return code of bond_neigh_init()
      net: enetc: allow hardware timestamping on TX queues with tc-etf enabled
      net: enetc: set MAC RX FIFO to recommended value
      net: davicom: Use platform_get_irq_optional()
      net: davicom: Fix regulator not turned off on driver removal
      net: davicom: Fix regulator not turned off on failed probe
      net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports
      ...
    torvalds committed Mar 10, 2021
  8. Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc

    Pull sparc fixes from David Miller:
     "Fix opcode filtering for exceptions, and clean up defconfig"
    
    * git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc:
      sparc: sparc64_defconfig: remove duplicate CONFIGs
      sparc64: Fix opcode filtering in handling of no fault loads
    torvalds committed Mar 10, 2021
  9. sparc: sparc64_defconfig: remove duplicate CONFIGs

    After my patch there is CONFIG_ATA defined twice.
    Remove the duplicate one.
    Same problem for CONFIG_HAPPYMEAL, except I added as builtin for boot
    test with NFS.
    
    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Fixes: a57cdeb ("sparc: sparc64_defconfig: add necessary configs for qemu")
    Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    montjoie authored and davem330 committed Mar 10, 2021
  10. sparc64: Fix opcode filtering in handling of no fault loads

    is_no_fault_exception() has two bugs which were discovered via random
    opcode testing with stress-ng. Both are caused by improper filtering
    of opcodes.
    
    The first bug can be triggered by a floating point store with a no-fault
    ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040.
    
    The code first tests op3[5] (0x1000000), which denotes a floating
    point instruction, and then tests op3[2] (0x200000), which denotes a
    store instruction. But these bits are not mutually exclusive, and the
    above mentioned opcode has both bits set. The intent is to filter out
    stores, so the test for stores must be done first in order to have
    any effect.
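
The ordering problem can be reproduced in a few lines of standalone C. This is only a sketch: the enum and function names are hypothetical, while the two bit masks and the example opcode are taken from the description above.

```c
#include <stdint.h>

#define OP3_FP_BIT    0x1000000u /* op3[5]: floating point instruction */
#define OP3_STORE_BIT 0x200000u  /* op3[2]: store instruction */

/* Hypothetical outcome labels for this sketch. */
enum nf_action { NF_REJECT_STORE, NF_FP_LOAD, NF_INT_LOAD };

/* Buggy order: the fp test runs first, so an fp store is treated as
 * an fp load and the store filter is never reached. */
static enum nf_action classify_buggy(uint32_t insn)
{
	if (insn & OP3_FP_BIT)
		return NF_FP_LOAD;
	if (insn & OP3_STORE_BIT)
		return NF_REJECT_STORE;
	return NF_INT_LOAD;
}

/* Fixed order: filter out stores before looking at the fp bit. */
static enum nf_action classify_fixed(uint32_t insn)
{
	if (insn & OP3_STORE_BIT)
		return NF_REJECT_STORE;
	if (insn & OP3_FP_BIT)
		return NF_FP_LOAD;
	return NF_INT_LOAD;
}
```

With opcode 0xC1A01040 ("sta %f0, [%g0] #ASI_PNF"), both bits are set, so only the fixed ordering rejects the instruction as a store.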
    
    The second bug can be triggered by a floating point load with one of
    the invalid ASI values 0x8e or 0x8f, which pass this check in
    is_no_fault_exception():
         if ((asi & 0xf2) == ASI_PNF)
    
    An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38",
    opcode CF95D1EF. ASI values greater than 0x8b (ASI_SNFL) are fatal
    in handle_ldf_stq(), and is_no_fault_exception() must not allow these
    invalid ASI values to make it that far.
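
A small standalone check shows why the 0xf2 mask lets 0x8e and 0x8f through. The helper names are hypothetical, and the stricter 0xf6 mask is an assumption chosen for illustration (it accepts only 0x82, 0x83, 0x8a and 0x8b); it is not quoted from the patch. The ASI extraction assumes the register-addressed form, where the immediate ASI sits in instruction bits 12:5.

```c
#include <stdbool.h>
#include <stdint.h>

#define ASI_PNF 0x82 /* primary, no fault */

/* Immediate ASI field of a register-addressed load/store alternate
 * instruction (bits 12:5). */
static uint8_t insn_asi(uint32_t insn)
{
	return (uint8_t)((insn >> 5) & 0xff);
}

/* The check quoted above: bits 2 and 3 are ignored, so 0x8e and 0x8f
 * collapse onto 0x82 and pass. */
static bool passes_original_check(uint8_t asi)
{
	return (asi & 0xf2) == ASI_PNF;
}

/* Illustrative stricter mask (assumption of this sketch): also
 * requires bit 2 to be clear, which rejects 0x8e and 0x8f. */
static bool passes_stricter_check(uint8_t asi)
{
	return (asi & 0xf6) == ASI_PNF;
}
```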
    
    In both of these cases, handle_ldf_stq() reacts by calling
    sun4v_data_access_exception() or spitfire_data_access_exception(),
    which call is_no_fault_exception() and result in an infinite
    recursion.
    
    Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
    Tested-by: Anatoly Pugachev <matorola@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Rob Gardner authored and davem330 committed Mar 10, 2021
  11. Merge branch 's390-qeth-fixes'

    Julian Wiedmann says:
    
    ====================
    s390/qeth: fixes 2021-03-09
    
    please apply the following patch series to netdev's net tree.
    
    This brings one fix for a memleak in an error path of the setup code.
    Also several fixes for dealing with pending TX buffers - two for old
    bugs in their completion handling, and one recent regression in a
    teardown path.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Mar 10, 2021
  12. s390/qeth: fix notification for pending buffers during teardown

    The cited commit reworked the state machine for pending TX buffers.
    In qeth_iqd_tx_complete() it turned PENDING into a transient state, and
    uses NEED_QAOB for buffers that get parked while waiting for their QAOB
    completion.
    
    But it failed to adjust the check in qeth_tx_complete_buf(). So if
    qeth_tx_complete_pending_bufs() is called during teardown to drain
    the parked TX buffers, we no longer raise a notification for af_iucv.
    
    Instead of updating the checked state, just move this code into
    qeth_tx_complete_pending_bufs() itself. This also gets rid of the
    special-case in the common TX completion path.
    
    Fixes: 8908f36 ("s390/qeth: fix af_iucv notification race")
    Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    julianwiedmann authored and davem330 committed Mar 10, 2021
  13. s390/qeth: schedule TX NAPI on QAOB completion

    When a QAOB notifies us that a pending TX buffer has been delivered, the
    actual TX completion processing by qeth_tx_complete_pending_bufs()
    is done within the context of a TX NAPI instance. We shouldn't rely on
    this instance being scheduled by some other TX event, but just do it
    ourselves.
    
    qeth_qdio_handle_aob() is called from qeth_poll(), i.e. our main NAPI
    instance. To avoid touching the TX queue's NAPI instance
    before/after it is (un-)registered, reorder the code in qeth_open()
    and qeth_stop() accordingly.
    
    Fixes: 0da9581 ("qeth: exploit asynchronous delivery of storage blocks")
    Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    julianwiedmann authored and davem330 committed Mar 10, 2021
  14. s390/qeth: improve completion of pending TX buffers

    The current design attaches a pending TX buffer to a custom
    single-linked list, which is anchored at the buffer's slot on the
    TX ring. The buffer is then checked for final completion whenever
    this slot is processed during a subsequent TX NAPI poll cycle.
    
    But if there's insufficient traffic on the ring, we might never make
    enough progress to get back to this ring slot and discover the pending
    buffer's final TX completion. This is a particular problem if the
    missing TX completion blocks the application from sending further traffic.
    
    So convert the custom single-linked list code to a per-queue list_head,
    and scan this list on every TX NAPI cycle.
    
    Fixes: 0da9581 ("qeth: exploit asynchronous delivery of storage blocks")
    Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    julianwiedmann authored and davem330 committed Mar 10, 2021
  15. s390/qeth: fix memory leak after failed TX Buffer allocation

    When qeth_alloc_qdio_queues() fails to allocate one of the buffers that
    back an Output Queue, the 'out_freeoutqbufs' path will free all
    previously allocated buffers for this queue. But it fails to free the
    half-finished queue struct itself.
    
    Move the buffer allocation into qeth_alloc_output_queue(), and deal with
    such errors internally.
    
    Fixes: 0da9581 ("qeth: exploit asynchronous delivery of storage blocks")
    Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
    Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    julianwiedmann authored and davem330 committed Mar 10, 2021
  16. Merge branch 'virtio_net-infinite-loop'

    Balazs Nemeth says:
    
    ====================
    net: prevent infinite loop caused by incorrect proto from virtio_net_hdr_set_proto
    
    These patches prevent an infinite loop for gso packets with a protocol
    from virtio net hdr that doesn't match the protocol in the packet.
    Note that packets coming from a device without
    header_ops->parse_protocol being implemented will not be caught by
    the check in virtio_net_hdr_to_skb, but the infinite loop will still
    be prevented by the check in the gso layer.
    
    Changes from v2 to v3:
      - Remove unused *eth.
      - Use MPLS_HLEN to also check if the MPLS header length is a multiple
        of four.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Mar 10, 2021
  17. net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0

    A packet with skb_inner_network_header(skb) == skb_network_header(skb)
    and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers
    from the packet. Subsequently, the call to skb_mac_gso_segment will
    again call mpls_gso_segment with the same packet leading to an infinite
    loop. In addition, ensure that the header length is a multiple of four,
    which should hold irrespective of the number of stacked labels.
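
The two conditions described above can be written as a single predicate. This is a minimal sketch, not the patch itself: mpls_hlen_ok() is a hypothetical name, and MPLS_HLEN is defined locally as 4 (one label stack entry) so the snippet builds on its own.

```c
#include <stdbool.h>

#define MPLS_HLEN 4 /* bytes per MPLS label stack entry */

/* A zero-length MPLS header would leave the packet unchanged and
 * re-enter the same segmentation path forever; a length that is not
 * a multiple of four cannot be a whole number of stacked labels. */
static bool mpls_hlen_ok(unsigned int mpls_hlen)
{
	return mpls_hlen != 0 && mpls_hlen % MPLS_HLEN == 0;
}
```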
    
    Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
    Acked-by: Willem de Bruijn <willemb@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    bn222 authored and davem330 committed Mar 10, 2021
  18. net: check if protocol extracted by virtio_net_hdr_set_proto is correct

    For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't
    set) based on the type in the virtio net hdr, but the skb could contain
    anything since it could come from packet_snd through a raw socket. If
    there is a mismatch between what virtio_net_hdr_set_proto sets and
    the actual protocol, then the skb could be handled incorrectly later
    on.
    
    An example where this poses an issue is with the subsequent call to
    skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set
    correctly. A specially crafted packet could fool
    skb_flow_dissect_flow_keys_basic, preventing EINVAL from being returned.
    
    Avoid blindly trusting the information provided by the virtio net header
    by checking that the protocol in the packet actually matches the
    protocol set by virtio_net_hdr_set_proto. Note that since the protocol
    is only checked if skb->dev implements header_ops->parse_protocol,
    packets from devices without the implementation are not checked at this
    stage.
    
    Fixes: 9274124 ("net: stricter validation of untrusted gso packets")
    Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
    Acked-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    bn222 authored and davem330 committed Mar 10, 2021
  19. Merge branch 'bpf-xdp-redirect'

    Björn Töpel says:
    
    ====================
    This two-patch series contains two optimizations for the
    bpf_redirect_map() helper and the xdp_do_redirect() function.
    
    The bpf_redirect_map() optimization is about avoiding the map lookup
    dispatching. Instead of having a switch-statement and selecting the
    correct lookup function, we let bpf_redirect_map() be a map operation,
    where each map has its own bpf_redirect_map() implementation. This way
    the run-time lookup is avoided.
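
The idea of turning the redirect into a map operation can be sketched with a plain function-pointer table. Everything below is a hypothetical userspace stand-in (type names, the devmap example, and the return codes); only the operation name map_redirect comes from the changelog in this cover letter.

```c
#include <stdint.h>

/* Hypothetical action codes for this sketch. */
enum { ACT_ABORTED = 0, ACT_REDIRECT = 4 };

struct map_sketch;

/* Each map type supplies its own redirect implementation. */
struct map_ops_sketch {
	long (*map_redirect)(struct map_sketch *map, uint32_t key,
			     uint64_t flags);
};

struct map_sketch {
	const struct map_ops_sketch *ops;
	uint32_t max_entries;
};

/* Example per-type implementation: accept keys that are in range. */
static long devmap_redirect_sketch(struct map_sketch *map, uint32_t key,
				   uint64_t flags)
{
	(void)flags;
	return key < map->max_entries ? ACT_REDIRECT : ACT_ABORTED;
}

static const struct map_ops_sketch devmap_ops_sketch = {
	.map_redirect = devmap_redirect_sketch,
};

/* No switch on the map type: the map already carries the right
 * implementation, so the call goes straight through its ops table. */
static long redirect_map_sketch(struct map_sketch *map, uint32_t key,
				uint64_t flags)
{
	return map->ops->map_redirect(map, key, flags);
}
```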
    
    The xdp_do_redirect() patch restructures the code, so that the map
    pointer indirection can be avoided.
    
    Performance-wise I got 4% improvement for XSKMAP
    (sample:xdpsock/rx-drop), and 8% (sample:xdp_redirect_map) on my
    machine.
    
    v5->v6:  Removed REDIR enum, and instead use map_id and map_type. (Daniel)
             Applied Daniel's fixups on patch 1. (Daniel)
    v4->v5:  Renamed map operation to map_redirect. (Daniel)
    v3->v4:  Made bpf_redirect_map() a map operation. (Daniel)
    v2->v3:  Fix build when CONFIG_NET is not set. (lkp)
    v1->v2:  Removed warning when CONFIG_BPF_SYSCALL was not set. (lkp)
             Cleaned up case-clause in xdp_do_generic_redirect_map(). (Toke)
             Re-added comment. (Toke)
    rfc->v1: Use map_id, and remove bpf_clear_redirect_map(). (Toke)
             Get rid of the macro and use __always_inline. (Jesper)
    ====================
    
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    borkmann committed Mar 10, 2021