Yonghong-Song/…

Commits on Nov 17, 2021

  1. selftests/bpf: add a selftest with __user tag

    Added a selftest where the argument is a pointer with the __user tag.
    Directly accessing its field without a helper results in a
    verification failure.
      $ ./test_progs -v -n 21/3
      ...
      Successfully loaded bpf_testmod.ko.
      test_btf_type_tag_user:PASS:btf_type_tag_user 0 nsec
      libbpf: load bpf program failed: Permission denied
      libbpf: -- BEGIN DUMP LOG ---
      libbpf:
      R1 type=ctx expected=fp
      ; int BPF_PROG(sub, struct bpf_testmod_btf_type_tag *arg)
      0: (79) r1 = *(u64 *)(r1 +0)
      func 'bpf_testmod_test_btf_type_tag_user' arg0 accesses user memory
      invalid bpf_context access off=0 size=8
      processed 1 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
      ...
      test_btf_type_tag_user:PASS:btf_type_tag_user 0 nsec
      #21/3 btf_tag/btf_type_tag_user:OK
      #21 btf_tag:OK
      Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
      Successfully unloaded bpf_testmod.ko.
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    yonghong-song authored and intel-lab-lkp committed Nov 17, 2021
  2. bpf: reject program if a __user tagged memory accessed in kernel way

    BPF verifier supports direct access, e.g., a->b. If "a" is a pointer
    to kernel memory, the bpf verifier allows the user to write C code
    like a->b and translates it to a proper kernel load. If "a" is a
    pointer to user memory, the bpf developer is expected to use the
    bpf_probe_read_user() helper to get the value of a->b. In the current
    mechanism, if "a" is a user pointer, the a->b access may trigger a
    page fault, and the verifier-generated code will simulate
    bpf_probe_read() and return 0 for a->b, which may not be the correct
    value.
    
    Now that BTF contains __user information, the verifier can check
    whether the pointer points to user memory. If it does, the verifier
    can reject the program and force users to use the
    bpf_probe_read_user() helper explicitly.
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    yonghong-song authored and intel-lab-lkp committed Nov 17, 2021
  3. compiler_types: define __user as __attribute__((btf_type_tag("user")))

    If pahole and the compiler support btf_type_tag attributes,
    we can define __user as __attribute__((btf_type_tag("user")))
    during the kernel build. This encodes __user information in BTF
    as BTF_KIND_TYPE_TAG, which helps the bpf verifier choose the
    proper memory dereference mechanism for user memory versus
    kernel memory.
    
    The encoded __user info is also useful for other tracing
    facilities: instead of requiring the user to specify the
    kernel/user address type, the kernel can detect it by itself
    with BTF.
    
    The following is an example with latest upstream clang
    (clang14, [1]) and latest pahole:
      [$ ~] cat test.c
      #define __tag1 __attribute__((btf_type_tag("tag1")))
      int foo(int __tag1 *arg) {
              return *arg;
      }
      [$ ~] clang -O2 -g -c test.c
      [$ ~] pahole -JV test.o
      ...
      [1] INT int size=4 nr_bits=32 encoding=SIGNED
      [2] TYPE_TAG tag1 type_id=1
      [3] PTR (anon) type_id=2
      [4] FUNC_PROTO (anon) return=1 args=(3 arg)
      [5] FUNC foo type_id=4
      [$ ~]
    
    You can see for the function argument "int __tag1 *arg",
    its type is described as
      PTR -> TYPE_TAG(tag1) -> INT
    
    The kernel can take advantage of this information
    for bpf verification or other use cases.
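
    As a rough illustration of how a consumer could use the
    PTR -> TYPE_TAG -> INT chain, here is a minimal user-space sketch. The
    toy_kind enum, toy_type struct and ptr_has_type_tag() function are
    hypothetical names invented for this example. They are not the kernel's
    or libbpf's BTF API, and the table only mirrors the three types from
    the pahole dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy subset of BTF kinds (hypothetical; not the real BTF_KIND_* values). */
enum toy_kind { TOY_INT, TOY_PTR, TOY_TYPE_TAG };

struct toy_type {
	enum toy_kind kind;
	const char *name;   /* type or tag name, NULL when anonymous */
	int type_id;        /* id of the referenced type, 0 when there is none */
};

/*
 * Return 1 if the type at 'id' is a pointer whose pointee carries the given
 * type tag, i.e. the chain looks like PTR -> TYPE_TAG(tag) -> ...
 * Ids are 1-based, as in the pahole output; types[0] is unused.
 */
int ptr_has_type_tag(const struct toy_type *types, size_t n, int id,
		     const char *tag)
{
	assert(types != NULL && tag != NULL);
	if (id <= 0 || (size_t)id >= n || types[id].kind != TOY_PTR)
		return 0;

	/* Several tags may be stacked, so walk every consecutive TYPE_TAG. */
	id = types[id].type_id;
	while (id > 0 && (size_t)id < n && types[id].kind == TOY_TYPE_TAG) {
		if (strcmp(types[id].name, tag) == 0)
			return 1;
		id = types[id].type_id;
	}
	return 0;
}

/* The types from the dump: [1] INT int, [2] TYPE_TAG tag1, [3] PTR. */
const struct toy_type example_types[] = {
	{ TOY_INT, NULL, 0 },         /* [0] unused */
	{ TOY_INT, "int", 0 },        /* [1] */
	{ TOY_TYPE_TAG, "tag1", 1 },  /* [2] */
	{ TOY_PTR, NULL, 2 },         /* [3] */
};
```

    With this table, ptr_has_type_tag(example_types, 4, 3, "tag1") reports a
    match, while asking for "user" on the same pointer does not.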
    
    Currently btf_type_tag is only supported in clang (>= clang14).
    gcc support is also proposed and under development ([2]).
    
      [1] https://reviews.llvm.org/D111199
      [2] https://www.spinics.net/lists/bpf/msg45773.html
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    yonghong-song authored and intel-lab-lkp committed Nov 17, 2021
  4. selftests/bpf: Mark variable as static

    Fix warnings from checkstyle.pl
    
    Signed-off-by: Yucong Sun <sunyucong@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211112192535.898352-4-fallentree@fb.com
    thefallentree authored and anakryiko committed Nov 17, 2021
  5. selftests/bpf: Variable naming fix

    Change log_fd to log_fp to reflect its type correctly.
    
    Signed-off-by: Yucong Sun <sunyucong@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211112192535.898352-3-fallentree@fb.com
    thefallentree authored and anakryiko committed Nov 17, 2021
  6. selftests/bpf: Move summary line after the error logs

    Makes it easier to find the summary line when there are a lot of logs
    to scroll back through.
    
    Signed-off-by: Yucong Sun <sunyucong@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211112192535.898352-2-fallentree@fb.com
    thefallentree authored and anakryiko committed Nov 17, 2021

Commits on Nov 16, 2021

  1. selftests/bpf: Add uprobe triggering overhead benchmarks

    Add benchmark to measure overhead of uprobes and uretprobes. Also have
    a baseline (no uprobe attached) benchmark.
    
    On my dev machine, the baseline benchmark can trigger about 130M
    user_target() invocations per second. When a uprobe is attached, this
    falls to roughly 730K per second. With a uretprobe, it drops to roughly
    510K per second:
    
      $ sudo ./bench trig-uprobe-base -a
      Summary: hits  131.289 ± 2.872M/s
    
      # UPROBE
      $ sudo ./bench -a trig-uprobe-without-nop
      Summary: hits    0.729 ± 0.007M/s
    
      $ sudo ./bench -a trig-uprobe-with-nop
      Summary: hits    1.798 ± 0.017M/s
    
      # URETPROBE
      $ sudo ./bench -a trig-uretprobe-without-nop
      Summary: hits    0.508 ± 0.012M/s
    
      $ sudo ./bench -a trig-uretprobe-with-nop
      Summary: hits    0.883 ± 0.008M/s
    
    So there is an almost 2.5x performance difference between probing a nop
    and a non-nop instruction for an entry uprobe, and a 1.7x difference for
    a uretprobe.
    
    This means that the overhead on a non-nop instruction is around
    1.4 microseconds for a uprobe and 2 microseconds for a uretprobe.
    
    For nop variants, uprobe and uretprobe overhead is down to 0.556 and
    1.13 microseconds, respectively.
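
    The per-call figures above follow directly from the benchmark
    throughput. A small sketch of that arithmetic, using the hypothetical
    helper names usec_per_hit() and speedup() (they are not part of the
    bench tool):

```c
#include <assert.h>

/* Convert a throughput in millions of hits per second to microseconds per hit. */
double usec_per_hit(double mhits_per_sec)
{
	assert(mhits_per_sec > 0.0);
	return 1.0 / mhits_per_sec;   /* 1 / (1e6 hits/s) is 1 microsecond */
}

/* How many times higher the 'fast' throughput is than the 'slow' one. */
double speedup(double fast_mhits, double slow_mhits)
{
	assert(slow_mhits > 0.0);
	return fast_mhits / slow_mhits;
}
```

    For example, usec_per_hit(0.729) is about 1.37 and usec_per_hit(0.508)
    is about 1.97, matching the non-nop numbers, while speedup(1.798, 0.729)
    is about 2.47. The baseline of 131.289M/s works out to under 0.01
    microseconds per call, so it barely changes these figures.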
    
    For comparison, just doing a very low-overhead syscall (with no BPF
    programs attached anywhere) gives:
    
      $ sudo ./bench trig-base -a
      Summary: hits    4.830 ± 0.036M/s
    
    So uprobes on a nop instruction are about 2.7x slower than a pure context switch.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211116013041.4072571-1-andrii@kernel.org
    anakryiko authored and borkmann committed Nov 16, 2021
  2. bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33

    In the current code, the actual max tail call count is 33 which is greater
    than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
    with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance.
    We can see the historical evolution in commit 04fd61a ("bpf: allow
    bpf programs to tail-call other bpf programs") and commit f9dabe0
    ("bpf: Undo off-by-one in interpreter tail call count limit"). To
    avoid changing existing behavior, the actual limit stays at 33, which
    is reasonable.
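
    To see why a constant of 32 can still allow 33 tail calls, here is a
    small simulation. It assumes the limit check has the shape "compare the
    counter, then increment it", which is a reading of the text above and
    not a quote of the interpreter or any JIT. Both function names are
    hypothetical.

```c
#include <assert.h>

/* Tail calls that succeed when the check is "count > limit", then count++. */
int tail_calls_allowed_gt(int limit)
{
	int count = 0, done = 0;

	assert(limit >= 0);
	for (;;) {
		if (count > limit)
			break;       /* limit reached, the tail call is refused */
		count++;
		done++;              /* the tail call goes through */
	}
	return done;
}

/* Tail calls that succeed when the check is "count >= limit", then count++. */
int tail_calls_allowed_ge(int limit)
{
	int count = 0, done = 0;

	assert(limit >= 0);
	for (;;) {
		if (count >= limit)
			break;
		count++;
		done++;
	}
	return done;
}
```

    Under these assumptions, tail_calls_allowed_gt(32) and
    tail_calls_allowed_ge(33) both return 33: the behavior is the same, but
    only the second form lets the constant state the real limit.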
    
    After commit 874be05 ("bpf, tests: Add tail call test suite"), we can
    see a failing test case.
    
    On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
     # echo 0 > /proc/sys/net/core/bpf_jit_enable
     # modprobe test_bpf
     # dmesg | grep -w FAIL
     Tail call error path, max count reached jited:0 ret 34 != 33 FAIL
    
    On some archs:
     # echo 1 > /proc/sys/net/core/bpf_jit_enable
     # modprobe test_bpf
     # dmesg | grep -w FAIL
     Tail call error path, max count reached jited:1 ret 34 != 33 FAIL
    
    Although the above failed testcase has been fixed in commit 18935a7
    ("bpf/tests: Fix error in tail call limit tests"), it would still be good
    to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
    more readable.
    
    The 32-bit x86 JIT was using a limit of 32, so fix the wrong comments and
    raise its limit to 33 tail calls to match the updated MAX_TAIL_CALL_CNT.
    For the mips64 JIT, use "ori" instead of "addiu" as suggested by Johan
    Almbladh. For the riscv JIT, use RV_REG_TCC directly to save one register
    move as suggested by Björn Töpel. The other implementations see no
    functional change: the current limit of 33 stays the same, the new value
    of MAX_TAIL_CALL_CNT reflects the actual max tail call count, and the
    related tail call test cases in the test_bpf module and selftests work
    for both the interpreter and the JIT.
    
    Here are the test results on x86_64:
    
     # uname -m
     x86_64
     # echo 0 > /proc/sys/net/core/bpf_jit_enable
     # modprobe test_bpf test_suite=test_tail_calls
     # dmesg | tail -1
     test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
     # rmmod test_bpf
     # echo 1 > /proc/sys/net/core/bpf_jit_enable
     # modprobe test_bpf test_suite=test_tail_calls
     # dmesg | tail -1
     test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
     # rmmod test_bpf
     # ./test_progs -t tailcalls
     #142 tailcalls:OK
     Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED
    
    Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
    Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Acked-by: Björn Töpel <bjorn@kernel.org>
    Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
    Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn
    Tiezhu Yang authored and borkmann committed Nov 16, 2021
  3. selftests/bpf: Configure dir paths via env in test_bpftool_synctypes.py

    Script test_bpftool_synctypes.py parses a number of files in the bpftool
    directory (or even elsewhere in the repo) to make sure that the lists of
    types or options in those different files are consistent. Instead of
    having fixed paths, let's make the directories configurable through
    environment variables. This should make it easier in the future to run
    the script in a different setup, for example on an out-of-tree bpftool
    mirror with a different layout.
    
    Signed-off-by: Quentin Monnet <quentin@isovalent.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115225844.33943-4-quentin@isovalent.com
    qmonnet authored and borkmann committed Nov 16, 2021
  4. bpftool: Update doc (use susbtitutions) and test_bpftool_synctypes.py

    test_bpftool_synctypes.py helps detect inconsistencies in bpftool
    between the different lists of types and options scattered in the
    sources, the documentation, and the bash completion. For options that
    apply to all bpftool commands, the script had a hardcoded list of
    values, and would use them to check whether the man pages are
    up-to-date. When writing the script, it felt acceptable to have this
    list in order to avoid opening and parsing bpftool's main.h every time,
    and because the list of global options in bpftool doesn't change
    often.
    
    However, this is prone to omissions, and we recently added a new
    -l|--legacy option which was described in common_options.rst, but not
    listed in the options summary of each manual page. The script did not
    complain, because it keeps comparing the hardcoded list to the (now)
    outdated list in the header file.
    
    To address the issue, this commit brings the following changes:
    
    - Options that are common to all bpftool commands (--json, --pretty, and
      --debug) are moved to a dedicated file, and used in the definition of
      a RST substitution. This substitution is used in the sources of all
      the man pages.
    
    - This list of common options is updated, with the addition of the new
      -l|--legacy option.
    
    - The script test_bpftool_synctypes.py is updated to compare:
        - Options specific to a command, found in C files, for the
          interactive help messages, with the same specific options from the
          relevant man page for that command.
        - Common options, checked just once: the list in main.h is
          compared with the new list in substitutions.rst.
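
    The comparison the script performs boils down to checking that every
    option in one list also appears in the other. The script itself is
    written in Python; the C sketch below only illustrates the idea. The
    function names and the two example arrays are hypothetical, with the
    option strings taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if 'opt' appears in 'list'. */
int has_option(const char *const *list, size_t n, const char *opt)
{
	assert(opt != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(list[i], opt) == 0)
			return 1;
	return 0;
}

/* Return the first entry of 'a' that is missing from 'b', or NULL if none. */
const char *first_missing(const char *const *a, size_t na,
			  const char *const *b, size_t nb)
{
	for (size_t i = 0; i < na; i++)
		if (!has_option(b, nb, a[i]))
			return a[i];
	return NULL;
}

/* Hypothetical inputs: options found in the sources vs. in the docs. */
const char *const source_opts[] = { "--json", "--pretty", "--debug", "--legacy" };
const char *const doc_opts[] = { "--json", "--pretty", "--debug" };
```

    Here first_missing(source_opts, 4, doc_opts, 3) returns "--legacy",
    the kind of omission the updated script is meant to catch, while the
    reverse comparison returns NULL.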
    
    Signed-off-by: Quentin Monnet <quentin@isovalent.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115225844.33943-3-quentin@isovalent.com
    qmonnet authored and borkmann committed Nov 16, 2021
  5. bpftool: Add SPDX tags to RST documentation files

    Most files in the kernel repository have an SPDX tag. The files that
    don't have such a tag (or another license boilerplate) tend to fall
    under the GPL-2.0 license. In the past, bpftool's Makefile (for example)
    has been marked as GPL-2.0 for that reason, when in fact all of bpftool
    is dual-licensed.
    
    To prevent a similar confusion from happening with the RST documentation
    files for bpftool, let's explicitly mark all files as dual-licensed.
    
    Signed-off-by: Quentin Monnet <quentin@isovalent.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115225844.33943-2-quentin@isovalent.com
    qmonnet authored and borkmann committed Nov 16, 2021
  6. selftests/bpf: Add a dedup selftest with equivalent structure types

    Without the previous libbpf patch, the following error occurs:
    
      $ ./test_progs -t btf
      ...
      do_test_dedup:FAIL:check btf_dedup failed errno:-22#13/205 btf/dedup: btf_type_tag #5, struct:FAIL
    
    The previous libbpf patch fixes the issue.
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115163943.3922547-1-yhs@fb.com
    yonghong-song authored and borkmann committed Nov 16, 2021
  7. libbpf: Fix a couple of missed btf_type_tag handling in btf.c

    Commit 2dc1e48 ("libbpf: Support BTF_KIND_TYPE_TAG") added the
    BTF_KIND_TYPE_TAG support. But to test vmlinux build with ...
    
      #define __user __attribute__((btf_type_tag("user")))
    
    ... I needed to sync libbpf repo and manually copy libbpf sources to
    pahole. To simplify the process, I used BTF_KIND_RESTRICT to simulate
    BTF_KIND_TYPE_TAG with the vmlinux build, as the "restrict" modifier is
    barely used in the kernel.
    
    But this approach missed one case in dedup with structures where
    BTF_KIND_RESTRICT is handled and BTF_KIND_TYPE_TAG is not handled in
    btf_dedup_is_equiv(), and this will result in a pahole dedup failure.
    This patch fixed this issue and a selftest is added in the subsequent
    patch to test this scenario.
    
    The other missed handling is in btf__resolve_size(). Currently the
    compiler always emits PTR->TYPE_TAG->..., so in practice we don't hit the
    missing BTF_KIND_TYPE_TAG handling with compiler-generated code. But
    let's add case BTF_KIND_TYPE_TAG to the switch statement to be future
    proof.
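
    A simplified sketch of what "skipping modifiers in a switch statement"
    means when resolving a type's size. The sz_kind enum, sz_type struct,
    the constants and resolve_size() are hypothetical stand-ins for this
    example, not libbpf's actual types or implementation. The fixed 8-byte
    pointer size is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Toy kinds (hypothetical; not the real BTF_KIND_* constants). */
enum sz_kind { SZ_INT, SZ_PTR, SZ_TYPEDEF, SZ_CONST, SZ_RESTRICT, SZ_TYPE_TAG };

struct sz_type {
	enum sz_kind kind;
	int size;      /* byte size, meaningful for SZ_INT only */
	int type_id;   /* referenced type for pointers and modifiers */
};

#define SZ_MAX_DEPTH 32   /* guards against reference loops */
#define SZ_PTR_SIZE  8    /* assumed pointer size for this sketch */

/* Resolve the byte size of type 'id', looking through modifier kinds. */
long resolve_size(const struct sz_type *types, size_t n, int id)
{
	assert(types != NULL);
	for (int depth = 0; depth < SZ_MAX_DEPTH; depth++) {
		if (id <= 0 || (size_t)id >= n)
			return -1;
		switch (types[id].kind) {
		case SZ_INT:
			return types[id].size;
		case SZ_PTR:
			return SZ_PTR_SIZE;
		case SZ_TYPEDEF:
		case SZ_CONST:
		case SZ_RESTRICT:
		case SZ_TYPE_TAG:   /* without this case a tagged type would fail */
			id = types[id].type_id;
			break;
		default:
			return -1;
		}
	}
	return -1;   /* too many levels, probably a loop */
}

const struct sz_type sz_example[] = {
	{ SZ_INT, 0, 0 },        /* [0] unused */
	{ SZ_INT, 4, 0 },        /* [1] int */
	{ SZ_TYPE_TAG, 0, 1 },   /* [2] tag -> int */
	{ SZ_CONST, 0, 2 },      /* [3] const -> tag -> int */
	{ SZ_PTR, 0, 2 },        /* [4] ptr -> tag -> int */
	{ SZ_TYPE_TAG, 0, 5 },   /* [5] malformed self-reference */
};
```

    In this table a type tag that is reached directly (ids 2 and 3)
    resolves to 4 bytes, the pointer resolves without ever visiting the
    tag, and the self-referencing entry is rejected by the depth limit.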
    
    Fixes: 2dc1e48 ("libbpf: Support BTF_KIND_TYPE_TAG")
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20211115163937.3922235-1-yhs@fb.com
    yonghong-song authored and borkmann committed Nov 16, 2021
  8. bpftool: Add current libbpf_strict mode to version output

    + bpftool --legacy --version
    bpftool v5.15.0
    features: libbfd, skeletons
    + bpftool --version
    bpftool v5.15.0
    features: libbfd, libbpf_strict, skeletons
    
    + bpftool --legacy --help
    Usage: bpftool [OPTIONS] OBJECT { COMMAND | help }
           bpftool batch file FILE
           bpftool version
    
           OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops | iter }
           OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy} |
                        {-V|--version} }
    + bpftool --help
    Usage: bpftool [OPTIONS] OBJECT { COMMAND | help }
           bpftool batch file FILE
           bpftool version
    
           OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops | iter }
           OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy} |
                        {-V|--version} }
    
    + bpftool --legacy
    Usage: bpftool [OPTIONS] OBJECT { COMMAND | help }
           bpftool batch file FILE
           bpftool version
    
           OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops | iter }
           OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy} |
                        {-V|--version} }
    + bpftool
    Usage: bpftool [OPTIONS] OBJECT { COMMAND | help }
           bpftool batch file FILE
           bpftool version
    
           OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops | iter }
           OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy} |
                        {-V|--version} }
    
    + bpftool --legacy version
    bpftool v5.15.0
    features: libbfd, skeletons
    + bpftool version
    bpftool v5.15.0
    features: libbfd, libbpf_strict, skeletons
    
    + bpftool --json --legacy version
    {"version":"5.15.0","features":{"libbfd":true,"libbpf_strict":false,"skeletons":true}}
    + bpftool --json version
    {"version":"5.15.0","features":{"libbfd":true,"libbpf_strict":true,"skeletons":true}}
    
    Suggested-by: Quentin Monnet <quentin@isovalent.com>
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Quentin Monnet <quentin@isovalent.com>
    Link: https://lore.kernel.org/bpf/20211116000448.2918854-1-sdf@google.com
    fomichev authored and borkmann committed Nov 16, 2021

Commits on Nov 15, 2021

  1. Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

    Daniel Borkmann says:
    
    ====================
    pull-request: bpf-next 2021-11-15
    
    We've added 72 non-merge commits during the last 13 day(s) which contain
    a total of 171 files changed, 2728 insertions(+), 1143 deletions(-).
    
    The main changes are:
    
    1) Add btf_type_tag attributes to bring kernel annotations like __user/__rcu to
       BTF such that BPF verifier will be able to detect misuse, from Yonghong Song.
    
    2) Big batch of libbpf improvements including various fixes, future proofing APIs,
       and adding a unified, OPTS-based bpf_prog_load() low-level API, from Andrii Nakryiko.
    
    3) Add ingress_ifindex to BPF_SK_LOOKUP program type for selectively applying the
       programmable socket lookup logic to packets from a given netdev, from Mark Pashmfouroush.
    
    4) Remove the 128M upper JIT limit for BPF programs on arm64 and add selftest to
       ensure exception handling still works, from Russell King and Alan Maguire.
    
    5) Add a new bpf_find_vma() helper for tracing to map an address to the backing
       file such as shared library, from Song Liu.
    
    6) Batch of various misc fixes to bpftool, fixing a memory leak in BPF program dump,
       updating documentation and bash-completion among others, from Quentin Monnet.
    
    7) Deprecate libbpf bpf_program__get_prog_info_linear() API and migrate its users as
       the API is heavily tailored around perf and is non-generic, from Dave Marchevsky.
    
    8) Enable libbpf's strict mode by default in bpftool and add a --legacy option as an
       opt-out for more relaxed BPF program requirements, from Stanislav Fomichev.
    
    9) Fix bpftool to use libbpf_get_error() to check for errors, from Hengqi Chen.
    
    * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits)
      bpftool: Use libbpf_get_error() to check error
      bpftool: Fix mixed indentation in documentation
      bpftool: Update the lists of names for maps and prog-attach types
      bpftool: Fix indent in option lists in the documentation
      bpftool: Remove inclusion of utilities.mak from Makefiles
      bpftool: Fix memory leak in prog_dump()
      selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning
      selftests/bpf: Fix an unused-but-set-variable compiler warning
      bpf: Introduce btf_tracing_ids
      bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs
      bpftool: Enable libbpf's strict mode by default
      docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support
      selftests/bpf: Clarify llvm dependency with btf_tag selftest
      selftests/bpf: Add a C test for btf_type_tag
      selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c
      selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication
      selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests
      selftests/bpf: Test libbpf API function btf__add_type_tag()
      bpftool: Support BTF_KIND_TYPE_TAG
      libbpf: Support BTF_KIND_TYPE_TAG
      ...
    ====================
    
    Link: https://lore.kernel.org/r/20211115162008.25916-1-daniel@iogearbox.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Nov 15, 2021
  2. Revert "Merge branch 'mctp-i2c-driver'"

    This reverts commit 71812af, reversing
    changes made to cc0be1a.
    
    Wolfram Sang says:
    
    Please revert. Besides the driver in net, it modifies the I2C core
    code. This has not been acked by the I2C maintainer (in this case me).
    So, please don't pull this in via the net tree. The question raised here
    (extending SMBus calls to 255 byte) is complicated because we need ABI
    backwards compatibility.
    
    Link: https://lore.kernel.org/all/YZJ9H4eM%2FM7OXVN0@shikoro/
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Nov 15, 2021
  3. Merge branch 'generic-phylink-validation'

    Russell King says:
    
    ====================
    introduce generic phylink validation
    
    The various validate method implementations we have in phylink users
    have been quite repetitive but also prone to bugs. These patches
    introduce a generic implementation which relies solely on the
    supported_interfaces bitmap introduced during last cycle, and in the
    first patch, a bit array of MAC capabilities.
    
    MAC drivers are free to continue to do their own thing if they have
    special requirements - such as mvneta and mvpp2 which do not support
    1000base-X without AN enabled. Most implementations currently in the
    kernel can be converted to call phylink_generic_validate() directly
    from the phylink MAC operations structure once they fill in the
    supported_interfaces and mac_capabilities members of phylink_config.
    
    This series introduces the generic implementation, and converts mvneta
    and mvpp2 to use it.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Nov 15, 2021
  4. net: mvpp2: use phylink_generic_validate()

    Convert mvpp2 to use phylink_generic_validate() for the bulk of its
    validate() implementation. This network adapter has a restriction
    that for 802.3z links, autonegotiation must be enabled.
    
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Nov 15, 2021
  5. net: mvneta: use phylink_generic_validate()

    Convert mvneta to use phylink_generic_validate() for the bulk of its
    validate() implementation. This network adapter has a restriction
    that for 802.3z links, autonegotiation must be enabled.
    
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Nov 15, 2021
  6. net: phylink: add generic validate implementation

    Add a generic validate() implementation using the supported_interfaces
    and a bitmask of MAC pause/speed/duplex capabilities. This allows us
    to entirely eliminate many driver private validate() implementations.
    
    We expose the underlying phylink_get_linkmodes() function so that
    drivers which have special needs can still benefit from conversion.
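
    The idea of validating against a bitmap of supported interfaces and a
    bitmask of MAC capabilities can be sketched in user space as follows.
    Every name and bit value below is hypothetical and chosen for
    illustration only. The real phylink structures, link mode bitmaps and
    the logic in phylink_generic_validate() are more involved.

```c
#include <assert.h>

/* Hypothetical interface bits (stand-in for the supported_interfaces bitmap). */
enum { IF_SGMII = 1u << 0, IF_1000BASEX = 1u << 1, IF_RGMII = 1u << 2 };

/* Hypothetical MAC capability bits (stand-in for mac_capabilities). */
enum { CAP_10FD = 1u << 0, CAP_100FD = 1u << 1, CAP_1000FD = 1u << 2 };

/* Hypothetical link mode bits. */
enum {
	MODE_10T_FULL   = 1u << 0,
	MODE_100T_FULL  = 1u << 1,
	MODE_1000T_FULL = 1u << 2,
	MODE_1000X_FULL = 1u << 3,
};

/* Translate MAC capabilities into the link modes they permit. */
unsigned int caps_to_modes(unsigned int caps)
{
	unsigned int modes = 0;

	if (caps & CAP_10FD)
		modes |= MODE_10T_FULL;
	if (caps & CAP_100FD)
		modes |= MODE_100T_FULL;
	if (caps & CAP_1000FD)
		modes |= MODE_1000T_FULL | MODE_1000X_FULL;
	return modes;
}

/* Keep only the requested modes the MAC can do; 0 if the interface is unsupported. */
unsigned int toy_validate(unsigned int supported_ifs, unsigned int caps,
			  unsigned int iface, unsigned int requested)
{
	/* 'iface' must name exactly one interface bit. */
	assert(iface != 0 && (iface & (iface - 1)) == 0);
	if (!(supported_ifs & iface))
		return 0;
	return requested & caps_to_modes(caps);
}
```

    A driver with special needs would, in this toy model, call
    caps_to_modes() itself and apply its own extra restrictions before
    masking the requested modes.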
    
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Nov 15, 2021
  7. net/wan/fsl_ucc_hdlc: fix sparse warnings

    CHECK   drivers/net/wan/fsl_ucc_hdlc.c
    drivers/net/wan/fsl_ucc_hdlc.c:309:57: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:309:57:    expected void [noderef] __iomem *
    drivers/net/wan/fsl_ucc_hdlc.c:309:57:    got restricted __be16 *
    drivers/net/wan/fsl_ucc_hdlc.c:311:46: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:311:46:    expected void [noderef] __iomem *
    drivers/net/wan/fsl_ucc_hdlc.c:311:46:    got restricted __be32 *
    drivers/net/wan/fsl_ucc_hdlc.c:320:57: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:320:57:    expected void [noderef] __iomem *
    drivers/net/wan/fsl_ucc_hdlc.c:320:57:    got restricted __be16 *
    drivers/net/wan/fsl_ucc_hdlc.c:322:46: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:322:46:    expected void [noderef] __iomem *
    drivers/net/wan/fsl_ucc_hdlc.c:322:46:    got restricted __be32 *
    drivers/net/wan/fsl_ucc_hdlc.c:372:29: warning: incorrect type in assignment (different base types)
    drivers/net/wan/fsl_ucc_hdlc.c:372:29:    expected unsigned short [usertype]
    drivers/net/wan/fsl_ucc_hdlc.c:372:29:    got restricted __be16 [usertype]
    drivers/net/wan/fsl_ucc_hdlc.c:379:36: warning: restricted __be16 degrades to integer
    drivers/net/wan/fsl_ucc_hdlc.c:402:12: warning: incorrect type in assignment (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:402:12:    expected struct qe_bd [noderef] __iomem *bd
    drivers/net/wan/fsl_ucc_hdlc.c:402:12:    got struct qe_bd *curtx_bd
    drivers/net/wan/fsl_ucc_hdlc.c:425:20: warning: incorrect type in assignment (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:425:20:    expected struct qe_bd [noderef] __iomem *[assigned] bd
    drivers/net/wan/fsl_ucc_hdlc.c:425:20:    got struct qe_bd *tx_bd_base
    drivers/net/wan/fsl_ucc_hdlc.c:427:16: error: incompatible types in comparison expression (different address spaces):
    drivers/net/wan/fsl_ucc_hdlc.c:427:16:    struct qe_bd [noderef] __iomem *
    drivers/net/wan/fsl_ucc_hdlc.c:427:16:    struct qe_bd *
    drivers/net/wan/fsl_ucc_hdlc.c:462:33: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:506:41: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:528:33: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:552:38: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:596:67: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:611:41: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:851:38: warning: incorrect type in initializer (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:854:40: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:855:40: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:858:39: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:861:37: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:866:38: warning: incorrect type in initializer (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:868:21: warning: incorrect type in argument 1 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:870:40: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:871:40: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:873:39: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:993:57: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:995:46: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:1004:57: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:1006:46: warning: incorrect type in argument 2 (different address spaces)
    drivers/net/wan/fsl_ucc_hdlc.c:412:35: warning: dereference of noderef expression
    drivers/net/wan/fsl_ucc_hdlc.c:412:35: warning: dereference of noderef expression
    drivers/net/wan/fsl_ucc_hdlc.c:724:29: warning: dereference of noderef expression
    drivers/net/wan/fsl_ucc_hdlc.c:815:21: warning: dereference of noderef expression
    drivers/net/wan/fsl_ucc_hdlc.c:1021:29: warning: dereference of noderef expression
    
    Most of the warnings are due to DMA memory being incorrectly handled as IO memory.
    Fix it by doing direct read/write and doing proper dma_rmb() / dma_wmb().
    
    Other problems are type mismatches or lack of use of IO accessors.
    
    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    Reported-by: kernel test robot <lkp@intel.com>
    Link: https://lkml.org/lkml/2021/11/12/647
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    chleroy authored and davem330 committed Nov 15, 2021
  8. net: fddi: use swap() to make code cleaner

    Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
    open-coding it.
    
    Signed-off-by: Yihao Han <hanyihao@vivo.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Yihao Han authored and davem330 committed Nov 15, 2021
  9. hinic: use ARRAY_SIZE instead of ARRAY_LEN

    ARRAY_SIZE, defined in <linux/kernel.h>, is safer than self-defined
    macros for getting the size of an array, such as the ARRAY_LEN used
    here, because ARRAY_SIZE uses __must_be_array(arr) to ensure that arr
    is really an array.
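
    The safety argument can be illustrated with a user-space approximation.
    The MUST_BE_ARRAY() macro below is a sketch built on GCC/Clang builtins,
    not the kernel's actual __must_be_array() definition, and the speeds
    array and count_speeds() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Evaluates to 0 for a real array. For a pointer, the type of 'a' matches
 * the type of '&(a)[0]', the array size becomes negative and the build fails.
 */
#define MUST_BE_ARRAY(a) \
	(0 * sizeof(char[1 - 2 * __builtin_types_compatible_p( \
		__typeof__(a), __typeof__(&(a)[0]))]))

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]) + MUST_BE_ARRAY(a))

const int speeds[] = { 10, 100, 1000, 10000 };

/* Number of entries in the table above. */
size_t count_speeds(void)
{
	size_t n = ARRAY_SIZE(speeds);

	assert(n > 0);
	return n;
}
```

    A naive sizeof(p) / sizeof(p[0]) compiles silently when p is a pointer
    and yields a meaningless count. With the check above, the same mistake
    becomes a compile error.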
    
    Reported-by: Alejandro Colomar <colomar.6.4.3@gmail.com>
    Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Guo Zhengkui authored and davem330 committed Nov 15, 2021
  10. net: usb: ax88179_178a: add TSO feature

    On low-efficiency embedded platforms, transmission performance is poor
    because each Bulk-out transfer carries a single packet.
    Adding the TSO feature improves transmission performance and reduces
    the number of interrupts caused by Bulk-out completion.
    
    For reference, see the module net: usb: aqc111.
    
    Signed-off-by: Jacky Chou <jackychou@asix.com.tw>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Jacky Chou authored and davem330 committed Nov 15, 2021
  11. Merge branch 'mctp-i2c-driver'

    Matt Johnston says:
    
    ====================
    MCTP I2C driver
    
    This patch series adds a netdev driver providing MCTP transport over
    I2C.
    
    It applies against net-next using recent MCTP changes there, though also
    has I2C core changes for review. I'll leave it to maintainers where it
    should be applied - please let me know if it needs to be submitted
    differently.
    
    The I2C patches were previously sent as RFC though the only feedback
    there was an ack to 255 bytes for aspeed.
    
    The dt-bindings patch went through review on the list.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Nov 15, 2021
  12. mctp i2c: MCTP I2C binding driver

    Provides MCTP network transport over an I2C bus, as specified in
    DMTF DSP0237. All messages between nodes are sent as SMBus Block Writes.
    
    Each I2C bus to be used for MCTP is flagged in devicetree by a
    'mctp-controller' property on the bus node. Each flagged bus gets a
    mctpi2cX net device created based on the bus number. A
    'mctp-i2c-controller' I2C client needs to be added under the adapter. In
    an I2C mux situation the mctp-i2c-controller node must be attached only
    to the root I2C bus. The I2C client will handle incoming I2C slave block
    write data for subordinate busses as well as its own bus.
    
    In configurations without devicetree a driver instance can be attached
    to a bus using the I2C slave new_device mechanism.
    
    The MCTP core will hold/release the MCTP I2C device while responses
    are pending (a 6 second timeout or once a socket is closed, response
    received etc). While held the MCTP I2C driver will lock the I2C bus so
    that the correct I2C mux remains selected while responses are received.
    
    (Ideally we would just lock the mux to keep the current bus selected for
    the response rather than a full I2C bus lock, but that isn't exposed in
    the I2C mux API)
    
    This driver requires I2C adapters that allow 255 byte transfers
    (SMBus 3.0) as the specification requires a minimum MTU of 68 bytes.
    
    Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
    Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    mkj authored and davem330 committed Nov 15, 2021
  13. dt-bindings: net: New binding mctp-i2c-controller

    Used to define a local endpoint to communicate with MCTP peripherals
    attached to an I2C bus. This I2C endpoint can communicate with remote
    MCTP devices on the I2C bus.
    
    In the example I2C topology below (matching the second yaml example) we
    have MCTP devices on busses i2c1 and i2c6. MCTP-supporting busses are
    indicated by the 'mctp-controller' DT property on an I2C bus node.
    
    A mctp-i2c-controller I2C client DT node is placed at the top of the
    mux topology, since only the root I2C adapter will support I2C slave
    functionality.
                                                   .-------.
                                                   |eeprom |
        .------------.     .------.               /'-------'
        | adapter    |     | mux  --@0,i2c5------'
        | i2c1       ----.*|      --@1,i2c6--.--.
        |............|    \'------'           \  \  .........
        | mctp-i2c-  |     \                   \  \ .mctpB  .
        | controller |      \                   \  '.0x30   .
        |            |       \  .........        \  '.......'
        | 0x50       |        \ .mctpA  .         \ .........
        '------------'         '.0x1d   .          '.mctpC  .
                                '.......'          '.0x31   .
                                                    '.......'
    (The mctpX boxes above are remote MCTP devices not included in the DT at
    present; they can be hotplugged/probed at runtime. A DT binding for
    specific fixed MCTP devices could be added later if required.)
    
    Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
    Reviewed-by: Rob Herring <robh@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    mkj authored and davem330 committed Nov 15, 2021
  14. i2c: npcm7xx: Allow 255 byte block SMBus transfers

    255 byte support has been tested on a npcm750 board
    
    Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
    Reviewed-by: Tali Perry <tali.perry1@gmail.com>
    Reviewed-by: Patrick Venture <venture@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    mkj authored and davem330 committed Nov 15, 2021
  15. i2c: aspeed: Allow 255 byte block transfers

    255 byte transfers have been tested on an AST2500 board
    
    Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
    Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    mkj authored and davem330 committed Nov 15, 2021
  16. i2c: dev: Handle 255 byte blocks for i2c ioctl

    I2C_SMBUS is limited to 32 bytes for compatibility with the
    32 byte i2c_smbus_data.block.
    
    I2C_RDWR allows larger transfers if sufficiently sized buffers are passed.
    
    Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    mkj authored and davem330 committed Nov 15, 2021
  17. i2c: core: Allow 255 byte transfers for SMBus 3.x

    SMBus 3.0 increased the maximum block transfer size from 32 bytes to
    255 bytes. We increase the size of struct i2c_smbus_data's block[]
    member.
    
    i2c_smbus_xfer() and i2c_smbus_xfer_emulated() now support 255 byte
    block operations; other block functions remain limited to 32 bytes for
    compatibility with existing callers.
    
    We allow adapters to indicate support for the larger size with
    I2C_FUNC_SMBUS_V3_BLOCK. Most emulated drivers should be able to use 255
    byte blocks by replacing I2C_SMBUS_BLOCK_MAX with I2C_SMBUS_V3_BLOCK_MAX
    though some will have hardware limitations that need testing.
    
    Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    mkj authored and davem330 committed Nov 15, 2021
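    The size selection described above can be sketched as follows. The block
    limits (32 and 255) come from the commit text; the flag value and the
    helper names are assumptions made for illustration and do not reflect the
    kernel's actual definitions.

```c
#include <stddef.h>

#define I2C_SMBUS_BLOCK_MAX     32   /* block limit before SMBus 3.0 */
#define I2C_SMBUS_V3_BLOCK_MAX  255  /* block limit from SMBus 3.0 on */

/* Placeholder bit for illustration only; NOT the kernel's real value
 * for I2C_FUNC_SMBUS_V3_BLOCK. */
#define EXAMPLE_FUNC_SMBUS_V3_BLOCK (1UL << 0)

/* Hypothetical helper: largest block an adapter with the given
 * functionality mask can transfer. */
size_t max_block_len(unsigned long functionality)
{
	if (functionality & EXAMPLE_FUNC_SMBUS_V3_BLOCK)
		return I2C_SMBUS_V3_BLOCK_MAX;
	return I2C_SMBUS_BLOCK_MAX;
}

/* Hypothetical helper: non-zero if a block of len bytes fits. */
int block_len_ok(unsigned long functionality, size_t len)
{
	return len <= max_block_len(functionality);
}
```

    With these definitions, a 68 byte block is rejected for an adapter that
    does not advertise the larger size and accepted for one that does.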
  18. net: bridge: Slightly optimize 'find_portno()'

    The 'inuse' bitmap is local to this function. So we can use the
    non-atomic '__set_bit()' to save a few cycles.
    
    While at it, also remove some useless {}.
    
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    tititiou36 authored and davem330 committed Nov 15, 2021
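    The reasoning above, that a bitmap visible to only one function needs no
    atomic bit operations, can be sketched in userspace as follows. All names
    here (MAX_PORTS, set_bit_nonatomic(), find_free_portno()) are
    hypothetical; this is not the bridge code itself, only the same pattern
    under simplified assumptions.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define MAX_PORTS     128 /* hypothetical limit for this sketch */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((MAX_PORTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Plain read-modify-write. Safe only because no other thread can
 * see the bitmap, so no atomic instruction is needed. */
static void set_bit_nonatomic(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

static int test_bit_local(unsigned int nr, const unsigned long *map)
{
	return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}

/* Return the lowest unused port number, or -1 if none is free. */
int find_free_portno(const unsigned int *used, size_t n_used)
{
	unsigned long inuse[BITMAP_WORDS]; /* local to this function */
	unsigned int nr;
	size_t i;

	memset(inuse, 0, sizeof(inuse));
	set_bit_nonatomic(0, inuse); /* port 0 is treated as reserved */

	for (i = 0; i < n_used; i++)
		if (used[i] < MAX_PORTS)
			set_bit_nonatomic(used[i], inuse);

	for (nr = 0; nr < MAX_PORTS; nr++)
		if (!test_bit_local(nr, inuse))
			return (int)nr;

	return -1;
}
```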
  19. net: sched: sch_netem: Refactor code in 4-state loss generator

    Fixed comments to match description with variable names and
    refactored code to match the convention as per [1].
    
    To match the convention mapping is done as follows:
    State 3 - LOST_IN_BURST_PERIOD
    State 4 - LOST_IN_GAP_PERIOD
    
    [1] S. Salsano, F. Ludovici, A. Ordine, "Definition of a general
    and intuitive loss model for packet networks and its implementation
    in the Netem module in the Linux kernel"
    
    Fixes: a6e2fe1 ("sch_netem: replace magic numbers with enumerate")
    Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
    Acked-by: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Harshit Mogalapalli authored and davem330 committed Nov 15, 2021
  20. net: dsa: vsc73xxx: Make vsc73xx_remove() return void

    vsc73xx_remove() returns zero unconditionally and no caller checks the
    returned value. So convert the function to return no value.
    
    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    ukleinek authored and davem330 committed Nov 15, 2021
  21. net: stmmac: enhance XDP ZC driver level switching performance

    The previous stmmac_xdp_set_prog() implementation uses stmmac_release()
    and stmmac_open(), which tear down the PHY device and trigger an
    undesirable autonegotiation that causes a delay whenever AF_XDP ZC is
    set up.
    
    This patch introduces two new functions, stmmac_xdp_release() and
    stmmac_xdp_open(), that tear down only the DMA descriptors, buffers,
    NAPI processing, and IRQs and then reestablish them.
    
    As a result of this enhancement, we get rid of the transient state
    introduced by the link auto-negotiation:
    
    $ ./xdpsock -i eth0 -t -z
    
     sock0@eth0:0 txonly xdp-drv
                       pps            pkts           1.00
    rx                 0              0
    tx                 634444         634560
    
     sock0@eth0:0 txonly xdp-drv
                       pps            pkts           1.00
    rx                 0              0
    tx                 632330         1267072
    
     sock0@eth0:0 txonly xdp-drv
                       pps            pkts           1.00
    rx                 0              0
    tx                 632438         1899584
    
     sock0@eth0:0 txonly xdp-drv
                       pps            pkts           1.00
    rx                 0              0
    tx                 632502         2532160
    
    Reported-by: Kurt Kanzenbach <kurt@linutronix.de>
    Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
    Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    elvinongbl authored and davem330 committed Nov 15, 2021