Wei-Wang/net-a…

Commits on Sep 29, 2021

  1. tcp: adjust rcv_ssthresh according to sk_reserved_mem

    When the user sets the SO_RESERVE_MEM socket option, in order to utilize the
    reserved memory when in memory pressure state, we adjust rcv_ssthresh
    according to the available reserved memory for the socket, instead of
    using 4 * advmss always.
    
    Signed-off-by: Wei Wang <weiwan@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    tracywwnj authored and intel-lab-lkp committed Sep 29, 2021
  2. tcp: adjust sndbuf according to sk_reserved_mem

    If the user sets the SO_RESERVE_MEM socket option, in order to fully utilize the
    reserved memory in memory pressure state on the tx path, we modify the
    logic in sk_stream_moderate_sndbuf() to set sk_sndbuf according to
    available reserved memory, instead of MIN_SOCK_SNDBUF, and adjust it
    when new data is acked.
    
    Signed-off-by: Wei Wang <weiwan@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    tracywwnj authored and intel-lab-lkp committed Sep 29, 2021
  3. net: add new socket option SO_RESERVE_MEM

    This socket option provides a mechanism for users to reserve a certain
    amount of memory for the socket to use. When this option is set, kernel
    charges the user specified amount of memory to memcg, as well as
    sk_forward_alloc. This amount of memory is not reclaimable and is
    available in sk_forward_alloc for this socket.
    With this socket option set, the networking stack spends fewer cycles
    doing forward alloc and reclaim, which should lead to better system
    performance, at the cost of an amount of pre-allocated and
    unreclaimable memory, even under memory pressure.
    
    Note:
    This socket option is only available when memory cgroup is enabled, and we
    require this reserved memory to be charged to the user's memcg. We hope
    this prevents misbehaving users from abusing this feature to reserve a
    large amount of memory on certain sockets and causing unfairness for others.
    
    Signed-off-by: Wei Wang <weiwan@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    tracywwnj authored and intel-lab-lkp committed Sep 29, 2021
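From user space, the option described above would be set with an ordinary setsockopt() call. The following is a minimal sketch and is not taken from the patch: the helper name reserve_socket_memory() is hypothetical, the fallback value 73 for SO_RESERVE_MEM is an assumption based on the generic UAPI socket header (some architectures number socket options differently), and the option value is assumed to be an int byte count.

```c
#include <errno.h>
#include <sys/socket.h>

/* Assumption: older libc headers may not define SO_RESERVE_MEM yet;
 * 73 is the value in the generic UAPI header and may differ per arch. */
#ifndef SO_RESERVE_MEM
#define SO_RESERVE_MEM 73
#endif

/* Hypothetical helper: ask the kernel to reserve `bytes` of memory for
 * socket `fd`. Returns 0 on success or a negative errno value. */
int reserve_socket_memory(int fd, int bytes)
{
	if (setsockopt(fd, SOL_SOCKET, SO_RESERVE_MEM, &bytes, sizeof(bytes)) < 0)
		return -errno;
	return 0;
}
```

A caller should expect this to fail on kernels without the option, or when the socket is not charged to a memory cgroup, and should then fall back to the default behavior.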
  4. net: bridge: mcast: Associate the seqcount with its protecting lock.

    The sequence count bridge_mcast_querier::seq is protected by
    net_bridge::multicast_lock but seqcount_init() does not associate the
    seqcount with the lock. This leads to a warning on PREEMPT_RT because
    preemption is still enabled.
    
    Let seqcount_init() associate the seqcount with the lock that protects the
    write section. Remove lockdep_assert_held_once() because lockdep already checks
    whether the associated lock is held.
    
    Fixes: 67b746f ("net: bridge: mcast: make sure querier port/address updates are consistent")
    Reported-by: Mike Galbraith <efault@gmx.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Tested-by: Mike Galbraith <efault@gmx.de>
    Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
    Link: https://lore.kernel.org/r/20210928141049.593833-1-bigeasy@linutronix.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Thomas Gleixner authored and Jakub Kicinski committed Sep 29, 2021
  5. net: mdio-ipq4019: Fix the error for an optional regs resource

    The second resource is optional and is only provided on the IPQ5018
    chipset. But the blamed commit ignores that, and if the resource is
    not there it just fails.
    
    The resource is used like this:
    	if (priv->eth_ldo_rdy) {
    		val = readl(priv->eth_ldo_rdy);
    		val |= BIT(0);
    		writel(val, priv->eth_ldo_rdy);
    		fsleep(IPQ_PHY_SET_DELAY_US);
    	}
    
    This patch reverts that so the second resource remains optional,
    because other SoCs have the same MDIO controller and don't need the
    second resource.
    
    Fixes: fa14d03 ("net: mdio-ipq4019: Make use of devm_platform_ioremap_resource()")
    Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Link: https://lore.kernel.org/r/20210928134849.2092-1-caihuoqing@baidu.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Cai Huoqing authored and Jakub Kicinski committed Sep 29, 2021

Commits on Sep 28, 2021

  1. Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

    Daniel Borkmann says:
    
    ====================
    pull-request: bpf 2021-09-28
    
    The following pull-request contains BPF updates for your *net* tree.
    
    We've added 10 non-merge commits during the last 14 day(s) which contain
    a total of 11 files changed, 139 insertions(+), 53 deletions(-).
    
    The main changes are:
    
    1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk.
    
    2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh.
    
    3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann.
    
    4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi.
    
    5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer.
    
    6) Fix return value handling for struct_ops BPF programs, from Hou Tao.
    
    7) Various fixes to BPF selftests, from Jiri Benc.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Sep 28, 2021
  2. net: hns3: fix hclge_dbg_dump_tm_pg() stack usage

    This function copies strings around between multiple buffers
    including a large on-stack array that causes a build warning
    on 32-bit systems:
    
    drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c: In function 'hclge_dbg_dump_tm_pg':
    drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:782:1: error: the frame size of 1424 bytes is larger than 1400 bytes [-Werror=frame-larger-than=]
    
    The function can probably be cleaned up a lot, to go back to
    printing directly into the output buffer, but dynamically allocating
    the structure is a simpler workaround for now.
    
    Fixes: 04d9613 ("net: hns3: refine function hclge_dbg_dump_tm_pri()")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    arndb authored and davem330 committed Sep 28, 2021
  3. net: mdio: mscc-miim: Fix the mdio controller

    According to the documentation the second resource is optional. But the
    blamed commit ignores that and if the resource is not there it just
    fails.
    
    This patch reverts that so the second resource remains optional,
    because other SoCs have the same MDIO controller and don't need the
    second resource.
    
    Fixes: 672a1c3 ("net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource()")
    Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
    Reviewed-by: Cai Huoqing <caihuoqing@baidu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    HoratiuVultur authored and davem330 committed Sep 28, 2021
  4. af_unix: Return errno instead of NULL in unix_create1().

    unix_create1() returns NULL on error, and the callers assume that it never
    fails for reasons other than out of memory.  So, the callers always return
    -ENOMEM when unix_create1() fails.
    
    However, it also returns NULL when the number of af_unix sockets exceeds
    twice the limit controlled by sysctl: fs.file-max.  In this case, the
    callers should return -ENFILE like alloc_empty_file().
    
    This patch changes unix_create1() to return the correct error value instead
    of NULL on error.
    
    Out of curiosity, the assumption has been wrong since 1999 due to this
    change introduced in 2.2.4 [0].
    
      diff -u --recursive --new-file v2.2.3/linux/net/unix/af_unix.c linux/net/unix/af_unix.c
      --- v2.2.3/linux/net/unix/af_unix.c	Tue Jan 19 11:32:53 1999
      +++ linux/net/unix/af_unix.c	Sun Mar 21 07:22:00 1999
      @@ -388,6 +413,9 @@
       {
       	struct sock *sk;
    
      +	if (atomic_read(&unix_nr_socks) >= 2*max_files)
      +		return NULL;
      +
       	MOD_INC_USE_COUNT;
       	sk = sk_alloc(PF_UNIX, GFP_KERNEL, 1);
       	if (!sk) {
    
    [0]: https://cdn.kernel.org/pub/linux/kernel/v2.2/patch-2.2.4.gz
    
    Fixes: 1da177e ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    q2ven authored and davem330 committed Sep 28, 2021
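The change above relies on the kernel idiom of encoding an errno value in the returned pointer instead of returning NULL. The following user-space sketch illustrates that idiom only; err_ptr_sketch(), is_err_sketch(), ptr_err_sketch() and the two functions built on them are hypothetical stand-ins, not the kernel's ERR_PTR() helpers or the real unix_create1().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer. */
void *err_ptr_sketch(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer lies in the top MAX_ERRNO_SKETCH addresses. */
int is_err_sketch(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Decode the errno value from an error pointer. */
long ptr_err_sketch(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

struct sock_sketch {
	int unused;
};

/* Hypothetical allocator with two distinct failure reasons, mirroring
 * the two cases named in the commit message. */
struct sock_sketch *create1_sketch(long nr_socks, long max_files, int simulate_oom)
{
	struct sock_sketch *sk;

	if (nr_socks >= 2 * max_files)
		return err_ptr_sketch(-ENFILE);	/* too many sockets */
	if (simulate_oom)
		return err_ptr_sketch(-ENOMEM);	/* pretend allocation failed */

	sk = calloc(1, sizeof(*sk));
	if (!sk)
		return err_ptr_sketch(-ENOMEM);
	return sk;
}

/* Caller: propagate the real reason instead of assuming -ENOMEM. */
int create_socket_sketch(long nr_socks, long max_files, int simulate_oom)
{
	struct sock_sketch *sk = create1_sketch(nr_socks, max_files, simulate_oom);

	if (is_err_sketch(sk))
		return (int)ptr_err_sketch(sk);
	free(sk);
	return 0;
}
```

With this shape the caller no longer has to guess why the allocation failed: it returns whatever error the callee encoded.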
  5. net: udp: annotate data race around udp_sk(sk)->corkflag

    up->corkflag field can be read or written without any lock.
    Annotate accesses to avoid possible syzbot/KCSAN reports.
    
    Fixes: 1da177e ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    neebe000 authored and davem330 committed Sep 28, 2021
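Annotating a lockless field, as described above, means every read and write goes through a marked access so that the compiler cannot tear, fuse or repeat it. The sketch below is a simplified user-space approximation: READ_ONCE_SKETCH() and WRITE_ONCE_SKETCH() are hypothetical stand-ins for the kernel macros, they rely on the GNU __typeof__ extension, and the struct is not the real udp_sock.

```c
/* Simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE().
 * Assumption: GCC or Clang, for the __typeof__ extension. */
#define READ_ONCE_SKETCH(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical socket state with a flag accessed without any lock. */
struct udp_sock_sketch {
	int corkflag;
};

/* Writer side: a single marked store. */
void set_cork_sketch(struct udp_sock_sketch *up, int on)
{
	WRITE_ONCE_SKETCH(up->corkflag, on ? 1 : 0);
}

/* Reader side: a single marked load. */
int is_corked_sketch(struct udp_sock_sketch *up)
{
	return READ_ONCE_SKETCH(up->corkflag);
}
```

The annotations do not add any ordering or mutual exclusion; they only document that the race is intentional and keep each access to a single load or store.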
  6. net: sun: SUNVNET_COMMON should depend on INET

    When CONFIG_INET is not set, there are failing references to IPv4
    functions, so make this driver depend on INET.
    
    Fixes these build errors:
    
    sparc64-linux-ld: drivers/net/ethernet/sun/sunvnet_common.o: in function `sunvnet_start_xmit_common':
    sunvnet_common.c:(.text+0x1a68): undefined reference to `__icmp_send'
    sparc64-linux-ld: drivers/net/ethernet/sun/sunvnet_common.o: in function `sunvnet_poll_common':
    sunvnet_common.c:(.text+0x358c): undefined reference to `ip_send_check'
    
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Jakub Kicinski <kuba@kernel.org>
    Cc: Aaron Young <aaron.young@oracle.com>
    Cc: Rashmi Narasimhan <rashmi.narasimhan@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    rddunlap authored and davem330 committed Sep 28, 2021
  7. ionic: fix gathering of debug stats

    Don't print stats for which we haven't reserved space as it can
    cause nasty memory bashing and related bad behaviors.
    
    Fixes: aa62099 ("ionic: pull per-q stats work out of queue loops")
    Signed-off-by: Shannon Nelson <snelson@pensando.io>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    emusln authored and davem330 committed Sep 28, 2021
  8. Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

    Tony Nguyen says:
    
    ====================
    Intel Wired LAN Driver Updates 2021-09-27
    
    This series contains updates to e100 driver only.
    
    Jake corrects under-allocation of the register buffer due to incorrect
    calculations and fixes a buffer overrun in the register dump.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Sep 28, 2021
  9. dmascc: add CONFIG_VIRT_TO_BUS dependency

    Many architectures don't define virt_to_bus() any more, as drivers
    should be using the dma-mapping interfaces where possible:
    
    In file included from drivers/net/hamradio/dmascc.c:27:
    drivers/net/hamradio/dmascc.c: In function 'tx_on':
    drivers/net/hamradio/dmascc.c:976:30: error: implicit declaration of function 'virt_to_bus'; did you mean 'virt_to_fix'? [-Werror=implicit-function-declaration]
      976 |                              virt_to_bus(priv->tx_buf[priv->tx_tail]) + n);
          |                              ^~~~~~~~~~~
    arch/arm/include/asm/dma.h:109:52: note: in definition of macro 'set_dma_addr'
      109 |         __set_dma_addr(chan, (void *)__bus_to_virt(addr))
          |                                                    ^~~~
    
    Add the Kconfig dependency to prevent this from being built on
    architectures without virt_to_bus().
    
    Fixes: bc1abb9 ("dmascc: use proper 'virt_to_bus()' rather than casting to 'int'")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    arndb authored and davem330 committed Sep 28, 2021
  10. net: ks8851: fix link error

    An object file cannot be built for both loadable module and built-in
    use at the same time:
    
    arm-linux-gnueabi-ld: drivers/net/ethernet/micrel/ks8851_common.o: in function `ks8851_probe_common':
    ks8851_common.c:(.text+0xf80): undefined reference to `__this_module'
    
    Change the ks8851_common code to be a standalone module instead,
    and use Makefile logic to ensure this is built-in if at least one
    of its two users is.
    
    Fixes: 797047f ("net: ks8851: Implement Parallel bus operations")
    Link: https://lore.kernel.org/netdev/20210125121937.3900988-1-arnd@kernel.org/
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Acked-by: Marek Vasut <marex@denx.de>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    arndb authored and davem330 committed Sep 28, 2021
  11. bpf, x86: Fix bpf mapping of atomic fetch implementation

    Fix the case where the dst register maps to %rax as otherwise this produces
    an incorrect mapping with the implementation in 981f94c ("bpf: Add
    bitwise atomic instructions") as %rax is clobbered given it's part of the
    cmpxchg as operand.
    
    The issue is similar to b29dd96 ("bpf, x86: Fix BPF_FETCH atomic and/or/
    xor with r0 as src") just that the case of dst register was missed.
    
    Before, dst=r0 (%rax) src=r2 (%rsi):
    
      [...]
      c5:   mov    %rax,%r10
      c8:   mov    0x0(%rax),%rax       <---+ (broken)
      cc:   mov    %rax,%r11                |
      cf:   and    %rsi,%r11                |
      d2:   lock cmpxchg %r11,0x0(%rax) <---+
      d8:   jne    0x00000000000000c8       |
      da:   mov    %rax,%rsi                |
      dd:   mov    %r10,%rax                |
      [...]                                 |
                                            |
    After, dst=r0 (%rax) src=r2 (%rsi):     |
                                            |
      [...]                                 |
      da:	mov    %rax,%r10                |
      dd:	mov    0x0(%r10),%rax       <---+ (fixed)
      e1:	mov    %rax,%r11                |
      e4:	and    %rsi,%r11                |
      e7:	lock cmpxchg %r11,0x0(%r10) <---+
      ed:	jne    0x00000000000000dd
      ef:	mov    %rax,%rsi
      f2:	mov    %r10,%rax
      [...]
    
    The remaining combinations were fine as-is though:
    
    After, dst=r9 (%r15) src=r0 (%rax):
    
      [...]
      dc:	mov    %rax,%r10
      df:	mov    0x0(%r15),%rax
      e3:	mov    %rax,%r11
      e6:	and    %r10,%r11
      e9:	lock cmpxchg %r11,0x0(%r15)
      ef:	jne    0x00000000000000df      _
      f1:	mov    %rax,%r10                | (unneeded, but
      f4:	mov    %r10,%rax               _|  not a problem)
      [...]
    
    After, dst=r9 (%r15) src=r4 (%rcx):
    
      [...]
      de:	mov    %rax,%r10
      e1:	mov    0x0(%r15),%rax
      e5:	mov    %rax,%r11
      e8:	and    %rcx,%r11
      eb:	lock cmpxchg %r11,0x0(%r15)
      f1:	jne    0x00000000000000e1
      f3:	mov    %rax,%rcx
      f6:	mov    %r10,%rax
      [...]
    
    The case of dst == src register is rejected by the verifier and
    therefore not supported, but x86 JIT also handles this case just
    fine.
    
    After, dst=r0 (%rax) src=r0 (%rax):
    
      [...]
      eb:	mov    %rax,%r10
      ee:	mov    0x0(%r10),%rax
      f2:	mov    %rax,%r11
      f5:	and    %r10,%r11
      f8:	lock cmpxchg %r11,0x0(%r10)
      fe:	jne    0x00000000000000ee
     100:	mov    %rax,%r10
     103:	mov    %r10,%rax
      [...]
    
    Fixes: 981f94c ("bpf: Add bitwise atomic instructions")
    Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
    Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
    Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Brendan Jackman <jackmanb@google.com>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    almbladh authored and borkmann committed Sep 28, 2021
  12. selftests, bpf: test_lwt_ip_encap: Really disable rp_filter

    It's not enough to set net.ipv4.conf.all.rp_filter=0; that does not override
    a greater rp_filter value on the individual interfaces. We also need to set
    net.ipv4.conf.default.rp_filter=0 before creating the interfaces. That way,
    they'll also get their own rp_filter value of zero.
    
    Fixes: 0fde56e ("selftests: bpf: add test_lwt_ip_encap selftest")
    Signed-off-by: Jiri Benc <jbenc@redhat.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/b1cdd9d469f09ea6e01e9c89a6071c79b7380f89.1632386362.git.jbenc@redhat.com
    Jiri Benc authored and borkmann committed Sep 28, 2021
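The reason the "all" setting alone is not enough is that, per the kernel's ip-sysctl documentation, source validation on an interface uses the maximum of the conf/all and per-interface rp_filter values. A minimal sketch of that rule, with a hypothetical function name:

```c
/* Hypothetical helper: the rp_filter value that takes effect on one
 * interface is the larger of the "all" and the per-interface setting. */
int effective_rp_filter_sketch(int all_value, int iface_value)
{
	return all_value > iface_value ? all_value : iface_value;
}
```

So an interface created while the default was 1 keeps filtering even after "all" is set to 0, which is why the test has to zero "default" before the interfaces exist.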
  13. selftests, bpf: Fix makefile dependencies on libbpf

    When building bpf selftest with make -j, I'm randomly getting build failures
    such as this one:
    
      In file included from progs/bpf_flow.c:19:
      [...]/tools/testing/selftests/bpf/tools/include/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found
      #include "bpf_helper_defs.h"
               ^~~~~~~~~~~~~~~~~~~
    
    The file that fails the build varies between runs but it's always in the
    progs/ subdir.
    
    The reason is a missing make dependency on libbpf for the .o files in
    progs/. There was a dependency before commit 3ac2e20 but that commit
    removed it to prevent unneeded rebuilds. However, that only works if libbpf
    has been built already; the 'wildcard' prerequisite does not trigger when
    there's no bpf_helper_defs.h generated yet.
    
    Keep the libbpf as an order-only prerequisite to satisfy both goals. It is
    always built before the progs/ objects but it does not trigger unnecessary
    rebuilds by itself.
    
    Fixes: 3ac2e20 ("selftests/bpf: BPF object files should depend only on libbpf headers")
    Signed-off-by: Jiri Benc <jbenc@redhat.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/ee84ab66436fba05a197f952af23c98d90eb6243.1632758415.git.jbenc@redhat.com
    Jiri Benc authored and borkmann committed Sep 28, 2021
  14. bpf, test, cgroup: Use sk_{alloc,free} for test cases

    BPF test infra has some hacks in place which kzalloc() a socket and perform
    minimum init via sock_net_set() and sock_init_data(). As a result, the sk's
    skcd->cgroup is NULL since it didn't go through proper initialization as it
    would have been the case from sk_alloc(). Rather than re-adding a NULL test
    in sock_cgroup_ptr() just for this, use sk_{alloc,free}() pair for the test
    socket. The latter also allows getting rid of the bpf_sk_storage_free() special
    case.
    
    Fixes: 8520e22 ("bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode")
    Fixes: b7a1848 ("bpf: add BPF_PROG_TEST_RUN support for flow dissector")
    Fixes: 2cb494a ("bpf: add tests for direct packet access from CGROUP_SKB")
    Reported-by: syzbot+664b58e9a40fbb2cec71@syzkaller.appspotmail.com
    Reported-by: syzbot+33f36d0754d4c5c0e102@syzkaller.appspotmail.com
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Tested-by: syzbot+664b58e9a40fbb2cec71@syzkaller.appspotmail.com
    Tested-by: syzbot+33f36d0754d4c5c0e102@syzkaller.appspotmail.com
    Link: https://lore.kernel.org/bpf/20210927123921.21535-2-daniel@iogearbox.net
    borkmann committed Sep 28, 2021
  15. bpf, cgroup: Assign cgroup in cgroup_sk_alloc when called from interrupt

    If cgroup_sk_alloc() is called from interrupt context, then just assign the
    root cgroup to skcd->cgroup. Prior to commit 8520e22 ("bpf, cgroups:
    Fix cgroup v2 fallback on v1/v2 mixed mode") we would just return, and later
    on in sock_cgroup_ptr(), we were NULL-testing the cgroup in fast-path, and
    iff indeed NULL returning the root cgroup (v ?: &cgrp_dfl_root.cgrp). Rather
    than re-adding the NULL-test to the fast-path we can just assign it once from
    cgroup_sk_alloc() given v1/v2 handling has been simplified. The migration from
    NULL test with returning &cgrp_dfl_root.cgrp to assigning &cgrp_dfl_root.cgrp
    directly does /not/ change behavior for callers of sock_cgroup_ptr().
    
    syzkaller was able to trigger a splat in the legacy netrom code base, where
    the RX handler in nr_rx_frame() calls nr_make_new() which calls sk_alloc()
    and therefore cgroup_sk_alloc() with in_interrupt() condition. This leaves
    skcd->cgroup NULL, which trips up cgroup_sk_free() given it expects
    a non-NULL object. There are a few other candidates aside from netrom which
    have a similar pattern where, in their accept-like implementation, they just
    call sk_alloc() and thus cgroup_sk_alloc() instead of sk_clone_lock() with the
    corresponding cgroup_sk_clone() which then inherits the cgroup from the parent
    socket. None of them are related to core protocols where BPF cgroup programs
    are running from. However, in the future, they should implement a similar
    inheritance mechanism.
    
    Additionally, with a !CONFIG_CGROUP_NET_PRIO and !CONFIG_CGROUP_NET_CLASSID
    configuration, the same issue was exposed also prior to 8520e22 due to
    commit e876ecc ("cgroup: memcg: net: do not associate sock with unrelated
    cgroup") which added the early in_interrupt() return back then.
    
    Fixes: 8520e22 ("bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode")
    Fixes: e876ecc ("cgroup: memcg: net: do not associate sock with unrelated cgroup")
    Reported-by: syzbot+df709157a4ecaf192b03@syzkaller.appspotmail.com
    Reported-by: syzbot+533f389d4026d86a2a95@syzkaller.appspotmail.com
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Tested-by: syzbot+df709157a4ecaf192b03@syzkaller.appspotmail.com
    Tested-by: syzbot+533f389d4026d86a2a95@syzkaller.appspotmail.com
    Acked-by: Tejun Heo <tj@kernel.org>
    Link: https://lore.kernel.org/bpf/20210927123921.21535-1-daniel@iogearbox.net
    borkmann committed Sep 28, 2021
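The idea of assigning the root cgroup once at allocation time, rather than NULL-testing on every lookup, can be sketched in user space as follows. All names here are hypothetical; the real code also takes references and runs under RCU, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

struct cgroup_sketch {
	const char *name;
};

/* Stand-in for the default root cgroup. */
struct cgroup_sketch root_cgroup_sketch = { "root" };

struct sock_sketch {
	struct cgroup_sketch *cgroup;
};

/* Allocation path: in interrupt context the current task is unrelated
 * to the socket, so fall back to the root cgroup right here. */
void sk_alloc_cgroup_sketch(struct sock_sketch *sk, bool in_irq,
			    struct cgroup_sketch *task_cgroup)
{
	sk->cgroup = in_irq ? &root_cgroup_sketch : task_cgroup;
}

/* Fast path: no NULL test needed, the pointer is always valid. */
struct cgroup_sketch *sock_cgroup_ptr_sketch(const struct sock_sketch *sk)
{
	return sk->cgroup;
}
```

Callers of the lookup see the same result as with the old NULL fallback, while the free path can rely on a non-NULL pointer.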
  16. libbpf: Fix segfault in static linker for objects without BTF

    When a BPF object is compiled without BTF info (without -g),
    trying to link such objects using bpftool causes a SIGSEGV due to
    btf__get_nr_types accessing obj->btf which is NULL. Fix this by
    checking for the NULL pointer and returning an error.
    
    Reproducer:
    $ cat a.bpf.c
    extern int foo(void);
    int bar(void) { return foo(); }
    $ cat b.bpf.c
    int foo(void) { return 0; }
    $ clang -O2 -target bpf -c a.bpf.c
    $ clang -O2 -target bpf -c b.bpf.c
    $ bpftool gen obj out a.bpf.o b.bpf.o
    Segmentation fault (core dumped)
    
    After fix:
    $ bpftool gen obj out a.bpf.o b.bpf.o
    libbpf: failed to find BTF info for object 'a.bpf.o'
    Error: failed to link 'a.bpf.o': Unknown error -22 (-22)
    
    Fixes: a463492 (libbpf: Add linker extern resolution support for functions and global variables)
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com
    kkdwivedi authored and borkmann committed Sep 28, 2021
  17. MAINTAINERS: Add btf headers to BPF

    BPF folks maintain these and they're not picked up by the current
    MAINTAINERS entries.
    
    Files caught by the added globs:
    
      include/linux/btf.h
      include/linux/btf_ids.h
      include/uapi/linux/btf.h
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20210924193557.3081469-1-davemarchevsky@fb.com
    davemarchevsky authored and borkmann committed Sep 28, 2021
  18. bpf: Exempt CAP_BPF from checks against bpf_jit_limit

    When introducing CAP_BPF, bpf_jit_charge_modmem() was not changed to treat
    programs with CAP_BPF as privileged for the purpose of JIT memory allocation.
    This means that a program without CAP_BPF can block a program with CAP_BPF
    from loading a program.
    
    Fix this by checking bpf_capable() in bpf_jit_charge_modmem().
    
    Fixes: 2c78ee8 ("bpf: Implement CAP_BPF")
    Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20210922111153.19843-1-lmb@cloudflare.com
    lmb authored and borkmann committed Sep 28, 2021
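The accounting rule described above can be sketched in user space as a counter with a limit that only applies to unprivileged callers. The names and the limit value are hypothetical, and the sketch is single-threaded, whereas the kernel uses atomic counters.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical global limit and running total, both in pages. */
long jit_limit_pages_sketch = 4;
long jit_current_pages_sketch;

/* Charge `pages` to the global total. Callers that are not capable are
 * refused once the limit is exceeded; capable callers are exempt. */
int charge_modmem_sketch(long pages, bool capable)
{
	jit_current_pages_sketch += pages;
	if (jit_current_pages_sketch > jit_limit_pages_sketch && !capable) {
		jit_current_pages_sketch -= pages;	/* undo the charge */
		return -EPERM;
	}
	return 0;
}
```

Here `capable` stands for the result of the privilege check; with the fix, a holder of CAP_BPF would pass true and could still load programs after unprivileged users have filled the limit.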

Commits on Sep 27, 2021

  1. e100: fix buffer overrun in e100_get_regs

    The e100_get_regs function is used to implement a simple register dump
    for the e100 device. The data is broken into a couple of MAC control
    registers, and then a series of PHY registers, followed by a memory dump
    buffer.
    
    The total length of the register dump is defined as (1 + E100_PHY_REGS)
    * sizeof(u32) + sizeof(nic->mem->dump_buf).
    
    The logic for filling in the PHY registers uses a convoluted inverted
    count for loop which counts from E100_PHY_REGS (0x1C) down to 0, and
    assigns the slots 1 + E100_PHY_REGS - i. The first loop iteration will
    fill in [1] and the final loop iteration will fill in [1 + 0x1C]. This
    is actually one more than the supposed number of PHY registers.
    
    The memory dump buffer is then filled into the space at
    [2 + E100_PHY_REGS] which will cause that memcpy to assign 4 bytes past
    the total size.
    
    The end result is that we overrun the total buffer size allocated by the
    kernel, which could lead to a panic or other issues due to memory
    corruption.
    
    It is difficult to determine the actual total number of registers
    here. The only 8255x datasheet I could find indicates there are 28 total
    MDI registers. However, we're reading 29 here, and reading them in
    reverse!
    
    In addition, the ethtool e100 register dump interface appears to read
    the first PHY register to determine if the device is in MDI or MDIx
    mode. This doesn't appear to be documented anywhere within the 8255x
    datasheet. I can only assume it must be in register 28 (the extra
    register we're reading here).
    
    Let's not change any of the intended meaning of what we copy here. Just
    extend the space by 4 bytes to account for the extra register and
    continue copying the data out in the same order.
    
    Change the E100_PHY_REGS value to be the correct total (29) so that the
    total register dump size is calculated properly. Fix the offset for
    where we copy the dump buffer so that it doesn't overrun the total size.
    
    Re-write the for loop to use counting up instead of the convoluted
    down-counting. Correct the mdio_read offset to use the 0-based register
    offsets, but maintain the bizarre reverse ordering so that we have the
    ABI expected by applications like ethtool. This requires an additional
    subtraction of 1. It seems a bit odd but it makes the flow of assignment
    into the register buffer easier to follow.
    
    Fixes: 1da177e ("Linux-2.6.12-rc2")
    Reported-by: Felicitas Hetzelt <felicitashetzelt@gmail.com>
    Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
    Tested-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    jacob-keller authored and anguy11 committed Sep 27, 2021
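The index arithmetic in the message above can be checked with a small user-space sketch. This is an illustration, not the driver code: DUMP_BUF_SIZE is a hypothetical size, mdio_read_sketch() is a stand-in that just encodes the register number, and slot 0 is left as a placeholder for the MAC control registers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OLD_PHY_REGS  0x1C	/* value before the fix, per the message */
#define NEW_PHY_REGS  0x1D	/* corrected total of 29 */
#define DUMP_BUF_SIZE 596	/* hypothetical dump buffer size in bytes */

/* Total dump length for a given PHY register count. */
size_t regs_len_sketch(size_t phy_regs)
{
	return (1 + phy_regs) * sizeof(uint32_t) + DUMP_BUF_SIZE;
}

/* Byte offset one past the end of the old memcpy, which started at
 * slot [2 + OLD_PHY_REGS]. */
size_t old_copy_end_sketch(void)
{
	return (2 + OLD_PHY_REGS) * sizeof(uint32_t) + DUMP_BUF_SIZE;
}

/* Stand-in for mdio_read(): returns a value that encodes the register. */
uint32_t mdio_read_sketch(int reg)
{
	return 0x1000u + (uint32_t)reg;
}

/* Fixed layout: slot 0, then 29 PHY registers in reverse order, then
 * the dump buffer. Returns the number of bytes written. */
size_t fill_regs_sketch(uint32_t *buff, const uint8_t *dump_buf)
{
	int i;

	buff[0] = 0;	/* placeholder for the MAC control registers */
	for (i = 0; i < NEW_PHY_REGS; i++)
		buff[1 + i] = mdio_read_sketch(NEW_PHY_REGS - 1 - i);
	memcpy(&buff[1 + NEW_PHY_REGS], dump_buf, DUMP_BUF_SIZE);
	return (1 + NEW_PHY_REGS) * sizeof(uint32_t) + DUMP_BUF_SIZE;
}
```

Whatever the dump buffer size, the old copy ends one 32-bit slot past the old total length, which is the 4-byte overrun; with the corrected count the last write ends exactly at the computed length.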
  2. e100: fix length calculation in e100_get_regs_len

    commit abf9b90 ("e100: cleanup unneeded math") tried to simplify
    e100_get_regs_len and remove a double 'divide and then multiply'
    calculation that the e100_get_regs_len function did.
    
    This change broke the size calculation entirely as it failed to account
    for the fact that the numbered registers are actually 4 bytes wide and
    not 1 byte. This resulted in a significant under-allocation of the
    register buffer used by e100_get_regs.
    
    Fix this by properly multiplying the register count by u32 first before
    adding the size of the dump buffer.
    
    Fixes: abf9b90 ("e100: cleanup unneeded math")
    Reported-by: Felicitas Hetzelt <felicitashetzelt@gmail.com>
    Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    jacob-keller authored and anguy11 committed Sep 27, 2021
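The unit error is easy to see side by side. The sketch below uses hypothetical function names and takes the register count and dump buffer size as parameters rather than assuming the driver's values.

```c
#include <stddef.h>
#include <stdint.h>

/* Broken variant: treats each 32-bit register as a single byte. */
size_t regs_len_broken_sketch(size_t nregs, size_t dump_bytes)
{
	return nregs + dump_bytes;
}

/* Fixed variant: scale the register count by sizeof(u32) first. */
size_t regs_len_fixed_sketch(size_t nregs, size_t dump_bytes)
{
	return nregs * sizeof(uint32_t) + dump_bytes;
}
```

For n registers the broken length is short by 3 * n bytes, so any code that fills the buffer with 4-byte registers writes past the end of the allocation.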
  3. net: phy: enhance GPY115 loopback disable function

    GPY115 needs a PHY reset when it comes out of loopback mode if the firmware
    version number (lower 8 bits) is equal to or below 0x76.
    
    Fixes: 7d901a1 ("net: phy: add Maxlinear GPY115/21x/24x driver")
    Signed-off-by: Xu Liang <lxu@maxlinear.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Xu Liang authored and davem330 committed Sep 27, 2021
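The condition in the message above reduces to a mask and a comparison. A minimal sketch, with hypothetical names for the mask, the threshold constant and the function:

```c
#include <stdbool.h>
#include <stdint.h>

#define FW_MINOR_MASK_SKETCH      0x00ffu	/* lower 8 bits of the version */
#define FW_MINOR_RESET_MAX_SKETCH 0x76u		/* threshold from the message */

/* True if a PHY with this firmware version needs a reset when it
 * leaves loopback mode. */
bool gpy115_needs_reset_sketch(uint16_t fw_version)
{
	return (fw_version & FW_MINOR_MASK_SKETCH) <= FW_MINOR_RESET_MAX_SKETCH;
}
```

Only the lower byte takes part in the comparison, so the upper bits of the version register do not affect the decision.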
  4. Merge tag 'mac80211-for-net-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

    Johannes Berg says:
    
    ====================
    Some fixes:
     * potential use-after-free in CCMP/GCMP RX processing
     * potential use-after-free in TX A-MSDU processing
     * revert to low data rates for no-ack as the commit
       broke other things
     * limit VHT MCS/NSS in radiotap injection
     * drop frames with invalid addresses in IBSS mode
     * check rhashtable_init() return value in mesh
     * fix potentially unaligned access in mesh
     * fix late beacon hrtimer handling in hwsim (syzbot)
     * fix documentation for PTK0 rekeying
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Sep 27, 2021
  5. Merge branch 'mv88e6xxx-mtu-fixes'

    Andrew Lunn says:
    
    ====================
    mv88e6xxx: MTU fixes
    
    These three patches fix MTU issues reported by 曹煜.
    
    There are two different ways of configuring the MTU in the hardware.
    The 6161 family is using the wrong method. Some Marvell switches
    enforce the MTU when the port is used for CPU/DSA, some don't.
    Because of the extra header, the MTU needs to be increased by this
    overhead.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Sep 27, 2021
  6. dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports

    Some members of the Marvell Ethernet switch family impose MTU restrictions
    on ports used for connecting to the CPU or another switch for DSA. If
    the MTU is set too low, tagged frames will be discarded. Ensure the
    worst case tagger overhead is included in setting the MTU for DSA and
    CPU ports.
    
    Fixes: 1baf0fa ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
    Reported-by: 曹煜 <cao88yu@gmail.com>
    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    lunn authored and davem330 committed Sep 27, 2021
  7. dsa: mv88e6xxx: Fix MTU definition

    The MTU passed to the DSA driver is the payload size, typically 1500.
    However, the switch uses the frame size when applying restrictions.
    Adjust the MTU with the size of the Ethernet header and the frame
    checksum. The VLAN header also needs to be included when the frame
    size is per port, but not when it is global.
    
    Fixes: 1baf0fa ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
    Reported-by: 曹煜 <cao88yu@gmail.com>
    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    lunn authored and davem330 committed Sep 27, 2021
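The payload-to-frame conversion described above is plain arithmetic. The sketch below uses the standard Ethernet sizes (14-byte header, 4-byte FCS, 4-byte VLAN tag); the function and macro names are hypothetical.

```c
#include <stdbool.h>

#define ETH_HLEN_SKETCH    14	/* Ethernet header */
#define ETH_FCS_LEN_SKETCH  4	/* frame checksum */
#define VLAN_HLEN_SKETCH    4	/* 802.1Q tag */

/* Convert a payload MTU into the frame size the switch compares
 * against. The VLAN tag is only added for per-port limits. */
int mtu_to_frame_size_sketch(int mtu, bool per_port_limit)
{
	int frame = mtu + ETH_HLEN_SKETCH + ETH_FCS_LEN_SKETCH;

	if (per_port_limit)
		frame += VLAN_HLEN_SKETCH;
	return frame;
}
```

For the typical MTU of 1500 this gives 1518 bytes for a global limit and 1522 bytes for a per-port limit.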
  8. dsa: mv88e6xxx: 6161: Use chip wide MAX MTU

    The datasheet suggests the 6161 uses a per port setting for jumbo
    frames. Testing has however shown this is not correct, it uses the old
    style chip wide MTU control. Change the ops in the 6161 structure to
    reflect this.
    
    Fixes: 1baf0fa ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
    Reported-by: 曹煜 <cao88yu@gmail.com>
    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    lunn authored and davem330 committed Sep 27, 2021
  9. net: mdiobus: Fix memory leak in __mdiobus_register

    If device_register() fails, we should call put_device() to
    decrement the reference count for cleanup. Otherwise, memory is
    leaked.
    
    BUG: memory leak
    unreferenced object 0xffff888114032e00 (size 256):
      comm "kworker/1:3", pid 2960, jiffies 4294943572 (age 15.920s)
      hex dump (first 32 bytes):
        00 00 00 00 00 00 00 00 08 2e 03 14 81 88 ff ff  ................
        08 2e 03 14 81 88 ff ff 90 76 65 82 ff ff ff ff  .........ve.....
      backtrace:
        [<ffffffff8265cfab>] kmalloc include/linux/slab.h:591 [inline]
        [<ffffffff8265cfab>] kzalloc include/linux/slab.h:721 [inline]
        [<ffffffff8265cfab>] device_private_init drivers/base/core.c:3203 [inline]
        [<ffffffff8265cfab>] device_add+0x89b/0xdf0 drivers/base/core.c:3253
        [<ffffffff828dd643>] __mdiobus_register+0xc3/0x450 drivers/net/phy/mdio_bus.c:537
        [<ffffffff828cb835>] __devm_mdiobus_register+0x75/0xf0 drivers/net/phy/mdio_devres.c:87
        [<ffffffff82b92a00>] ax88772_init_mdio drivers/net/usb/asix_devices.c:676 [inline]
        [<ffffffff82b92a00>] ax88772_bind+0x330/0x480 drivers/net/usb/asix_devices.c:786
        [<ffffffff82baa33f>] usbnet_probe+0x3ff/0xdf0 drivers/net/usb/usbnet.c:1745
        [<ffffffff82c36e17>] usb_probe_interface+0x177/0x370 drivers/usb/core/driver.c:396
        [<ffffffff82661d17>] call_driver_probe drivers/base/dd.c:517 [inline]
        [<ffffffff82661d17>] really_probe.part.0+0xe7/0x380 drivers/base/dd.c:596
        [<ffffffff826620bc>] really_probe drivers/base/dd.c:558 [inline]
        [<ffffffff826620bc>] __driver_probe_device+0x10c/0x1e0 drivers/base/dd.c:751
        [<ffffffff826621ba>] driver_probe_device+0x2a/0x120 drivers/base/dd.c:781
        [<ffffffff82662a26>] __device_attach_driver+0xf6/0x140 drivers/base/dd.c:898
        [<ffffffff8265eca7>] bus_for_each_drv+0xb7/0x100 drivers/base/bus.c:427
        [<ffffffff826625a2>] __device_attach+0x122/0x260 drivers/base/dd.c:969
        [<ffffffff82660916>] bus_probe_device+0xc6/0xe0 drivers/base/bus.c:487
        [<ffffffff8265cd0b>] device_add+0x5fb/0xdf0 drivers/base/core.c:3359
        [<ffffffff82c343b9>] usb_set_configuration+0x9d9/0xb90 drivers/usb/core/message.c:2170
        [<ffffffff82c4473c>] usb_generic_driver_probe+0x8c/0xc0 drivers/usb/core/generic.c:238
    
    BUG: memory leak
    unreferenced object 0xffff888116f06900 (size 32):
      comm "kworker/0:2", pid 2670, jiffies 4294944448 (age 7.160s)
      hex dump (first 32 bytes):
        75 73 62 2d 30 30 31 3a 30 30 33 00 00 00 00 00  usb-001:003.....
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
        [<ffffffff81484516>] kstrdup+0x36/0x70 mm/util.c:60
        [<ffffffff814845a3>] kstrdup_const+0x53/0x80 mm/util.c:83
        [<ffffffff82296ba2>] kvasprintf_const+0xc2/0x110 lib/kasprintf.c:48
        [<ffffffff82358d4b>] kobject_set_name_vargs+0x3b/0xe0 lib/kobject.c:289
        [<ffffffff826575f3>] dev_set_name+0x63/0x90 drivers/base/core.c:3147
        [<ffffffff828dd63b>] __mdiobus_register+0xbb/0x450 drivers/net/phy/mdio_bus.c:535
        [<ffffffff828cb835>] __devm_mdiobus_register+0x75/0xf0 drivers/net/phy/mdio_devres.c:87
        [<ffffffff82b92a00>] ax88772_init_mdio drivers/net/usb/asix_devices.c:676 [inline]
        [<ffffffff82b92a00>] ax88772_bind+0x330/0x480 drivers/net/usb/asix_devices.c:786
        [<ffffffff82baa33f>] usbnet_probe+0x3ff/0xdf0 drivers/net/usb/usbnet.c:1745
        [<ffffffff82c36e17>] usb_probe_interface+0x177/0x370 drivers/usb/core/driver.c:396
        [<ffffffff82661d17>] call_driver_probe drivers/base/dd.c:517 [inline]
        [<ffffffff82661d17>] really_probe.part.0+0xe7/0x380 drivers/base/dd.c:596
        [<ffffffff826620bc>] really_probe drivers/base/dd.c:558 [inline]
        [<ffffffff826620bc>] __driver_probe_device+0x10c/0x1e0 drivers/base/dd.c:751
        [<ffffffff826621ba>] driver_probe_device+0x2a/0x120 drivers/base/dd.c:781
        [<ffffffff82662a26>] __device_attach_driver+0xf6/0x140 drivers/base/dd.c:898
        [<ffffffff8265eca7>] bus_for_each_drv+0xb7/0x100 drivers/base/bus.c:427
        [<ffffffff826625a2>] __device_attach+0x122/0x260 drivers/base/dd.c:969
    
    Reported-by: syzbot+398e7dc692ddbbb4cfec@syzkaller.appspotmail.com
    Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Yanfei Xu authored and davem330 committed Sep 27, 2021
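The cleanup rule behind this fix can be illustrated with a user-space mock. Nothing below is kernel code: `struct mock_device`, `mock_device_register()`, `mock_put_device()` and `mock_bus_register()` are hypothetical stand-ins that only imitate the reference-counting behaviour described in the commit message.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a refcounted device that owns its name. */
struct mock_device {
    int refcount;
    char *name;
    int *released;  /* set to 1 by the release path, for inspection */
};

/* Drop one reference and free everything when the last one goes. */
static void mock_put_device(struct mock_device *dev)
{
    if (--dev->refcount == 0) {
        if (dev->released)
            *dev->released = 1;
        free(dev->name);
        free(dev);
    }
}

/* Registration takes the initial reference even when it fails. */
static int mock_device_register(struct mock_device *dev, int fail)
{
    dev->refcount = 1;
    return fail ? -1 : 0;
}

static int mock_bus_register(const char *name, int fail, int *released,
                             struct mock_device **out)
{
    struct mock_device *dev = calloc(1, sizeof(*dev));
    int err;

    if (!dev)
        return -1;
    dev->released = released;
    dev->name = malloc(strlen(name) + 1);
    if (!dev->name) {
        free(dev);
        return -1;
    }
    strcpy(dev->name, name);

    err = mock_device_register(dev, fail);
    if (err) {
        /* Without this put, the device and its name would leak. */
        mock_put_device(dev);
        return err;
    }
    *out = dev;
    return 0;
}
```

In this mock, the failure path releases both allocations through the put helper, which mirrors the two leaked objects (the device's private data and its name string) shown in the reports above.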
  10. Revert "ibmvnic: check failover_pending in login response"

    This reverts commit d437f5a.
    
    The code was duplicated by commit 273c29e944bd ("ibmvnic: check
    failover_pending in login response").
    
    Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Desnes A. Nunes do Rosario authored and davem330 committed Sep 27, 2021
  11. net: bgmac-platform: handle mac-address deferral

    This patch is a replication of Christian Lamparter's "net: bgmac-bcma:
    handle deferred probe error due to mac-address" patch for the
    bgmac-platform driver [1].
    
    As is the case with the bgmac-bcma driver, this change is to cover the
    scenario where the MAC address cannot yet be discovered due to reliance
    on an nvmem provider which is yet to be instantiated, resulting in a
    random address being assigned that has to be manually overridden.
    
    [1] https://lore.kernel.org/netdev/20210919115725.29064-1-chunkeey@gmail.com
    
    Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    clayface authored and davem330 committed Sep 27, 2021
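The deferral logic this commit describes can be sketched in user space as follows. This is an assumption-laden illustration rather than the patch itself: `probe_mac()` and `enum mac_source` are hypothetical, and `EPROBE_DEFER` is redefined locally because it is a kernel-internal error code that user-space headers do not provide.

```c
/* Kernel-internal error code, redefined so the sketch builds in user space. */
#define EPROBE_DEFER 517

/* Hypothetical record of where the MAC address ended up coming from. */
enum mac_source { MAC_FROM_PROVIDER, MAC_RANDOM };

/* lookup_rc is the result of a hypothetical MAC address lookup:
 * 0 on success, a negative error code otherwise. */
static int probe_mac(int lookup_rc, enum mac_source *src)
{
    /* Provider not instantiated yet: propagate the deferral so the
     * probe is retried later instead of settling on a random address. */
    if (lookup_rc == -EPROBE_DEFER)
        return lookup_rc;

    /* Any other failure still falls back to a random address. */
    *src = lookup_rc ? MAC_RANDOM : MAC_FROM_PROVIDER;
    return 0;
}
```

The point of the sketch is the ordering: the deferral check comes before the random-address fallback, so a provider that is merely late does not leave the device with an address that has to be overridden by hand.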
  12. net: hns: Fix spelling mistake "maped" -> "mapped"

    There is a spelling mistake in a dev_err error message. Fix it.
    
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Colin Ian King authored and davem330 committed Sep 27, 2021