
Commits on Feb 4, 2022

  1. ice: Support GTP-U and GTP-C offload in switchdev

    Add support for creating filters for GTP-U and GTP-C in switchdev mode. Add
    support for parsing GTP-specific options (QFI and PDU type) and TEID.
    
    By default, a filter for GTP-U will be added. To add a filter for GTP-C,
    specify enc_dst_port = 2123, e.g.:
    
    tc filter add dev $GTP0 ingress prio 1 flower enc_key_id 1337 \
    enc_dst_port 2123 action mirred egress redirect dev $VF1_PR
    
    Note: IPv6 offload is not supported yet.
    Note: GTP-U with no payload offload is not supported yet.
    
    Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
    Marcin Szycik authored and intel-lab-lkp committed Feb 4, 2022
  2. ice: Fix FV offset searching

    Checking only protocol ids while searching for the correct FVs can add
    an incorrect FV to the list: one that has the correct protocol id but
    an incorrect offset.
    
    Call ice_get_sw_fv_list with ice_prot_lkup_ext struct which contains all
    protocol ids with offsets.
    
    With this modification, allocating and collecting the list of protocol
    ids is no longer needed.
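
As a rough illustration of the difference (not the driver's code), a field
vector can be modelled as a list of (protocol id, offset) words. The function
names and sample values below are invented for this sketch.

```python
# Hypothetical model: each field vector (FV) is a list of (prot_id, offset) words.

def matches_ids_only(fv, wanted):
    """Old check: accept the FV if every wanted protocol id appears in it."""
    fv_ids = {prot_id for prot_id, _ in fv}
    return all(prot_id in fv_ids for prot_id, _ in wanted)


def matches_ids_and_offsets(fv, wanted):
    """New check: every wanted (prot_id, offset) pair must appear in the FV."""
    fv_words = set(fv)
    return all(word in fv_words for word in wanted)


# Made-up lookup: protocol 0x20 at offsets 0 and 2.
wanted = [(0x20, 0), (0x20, 2)]
good_fv = [(0x20, 0), (0x20, 2), (0x01, 0)]
bad_fv = [(0x20, 4), (0x20, 6), (0x01, 0)]  # right protocol id, wrong offsets
```

With these values the id-only check accepts `bad_fv`, while the pair-based
check rejects it and still accepts `good_fv`.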
    
    Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
    Michal Swiatkowski authored and intel-lab-lkp committed Feb 4, 2022
  3. gtp: Implement GTP echo response

    Adding GTP device through ip link creates the situation where
    there is no userspace daemon which would handle GTP messages
    (Echo Request for example). GTP-U instance which would not respond
    to echo requests would violate GTP specification.
    
    When a GTP packet arrives with the GTP_ECHO_REQ message type,
    GTP_ECHO_RSP is sent to the sender. The GTP_ECHO_RSP message
    should contain an information element with the GTPIE_RECOVERY tag and
    a restart counter value. For GTPv1 the restart counter is not used
    and should be equal to 0; for GTPv0 the restart counter contains
    the value provided from userspace (IFLA_GTP_RESTART_COUNT).
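
A userspace sketch of the GTPv1 case, to show the shape of the reply. It
assumes the usual GTPv1 layout (8-byte mandatory header, 4 optional bytes when
the S flag is set) and the message/IE numbers 1, 2 and 14; the function name is
made up here and this is not the kernel implementation.

```python
import struct

GTP_ECHO_REQ = 1     # message type: Echo Request
GTP_ECHO_RSP = 2     # message type: Echo Response
GTPIE_RECOVERY = 14  # Recovery information element tag


def build_gtp1_echo_response(request: bytes, restart_counter: int = 0) -> bytes:
    """Build a GTPv1 Echo Response for the given Echo Request (sketch)."""
    flags, msg_type, _length, _teid = struct.unpack_from("!BBHI", request, 0)
    if flags >> 5 != 1 or msg_type != GTP_ECHO_REQ:
        raise ValueError("not a GTPv1 Echo Request")
    # Echo the sequence number back when the request carries one (S flag).
    seq = struct.unpack_from("!H", request, 8)[0] if flags & 0x02 else 0
    optional = struct.pack("!HBB", seq, 0, 0)  # seq, N-PDU number, next ext hdr
    recovery_ie = struct.pack("!BB", GTPIE_RECOVERY, restart_counter)
    payload = optional + recovery_ie
    # 0x32: version 1, protocol type GTP, S flag set; TEID is 0 for echo.
    header = struct.pack("!BBHI", 0x32, GTP_ECHO_RSP, len(payload), 0)
    return header + payload
```

The length field counts everything after the mandatory 8-byte header, so the
reply built here is 14 bytes long with a length field of 6.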
    
    Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
    WojDrew authored and intel-lab-lkp committed Feb 4, 2022
  4. net/sched: Allow flower to match on GTP options

    Options are as follows: PDU_TYPE:QFI. They refer to
    fields of the PDU Session Protocol. PDU Session data
    is conveyed in the GTP-U Extension Header.
    
    GTP-U Extension Header is described in 3GPP TS 29.281.
    PDU Session Protocol is described in 3GPP TS 38.415.
    
    PDU_TYPE -  indicates the type of the PDU Session Information (4 bits)
    QFI      -  QoS Flow Identifier (6 bits)
    
      # ip link add gtp_dev type gtp role sgsn
      # tc qdisc add dev gtp_dev ingress
      # tc filter add dev gtp_dev protocol ip parent ffff: \
          flower \
            enc_key_id 11 \
            gtp_opts 1:8/ff:ff \
          action mirred egress redirect dev eth0
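
To make the option string in the example concrete, here is a small sketch of
how "1:8/ff:ff" could be split into values and masks and compared against a
packet. It assumes the fields are hexadecimal and that a missing mask means
"match all bits"; both helpers are hypothetical and are not part of tc or the
kernel.

```python
def parse_gtp_opts(text: str):
    """Parse 'PDU_TYPE:QFI[/PDU_TYPE_MASK:QFI_MASK]' (hex fields assumed)."""
    value_part, _, mask_part = text.partition("/")
    pdu_type, qfi = (int(field, 16) for field in value_part.split(":"))
    if mask_part:
        pdu_mask, qfi_mask = (int(field, 16) for field in mask_part.split(":"))
    else:
        pdu_mask, qfi_mask = 0xFF, 0xFF  # assumption: no mask means exact match
    return {"pdu_type": (pdu_type, pdu_mask), "qfi": (qfi, qfi_mask)}


def opts_match(rule, pdu_type: int, qfi: int) -> bool:
    """Masked comparison of a packet's PDU type and QFI against the rule."""
    want_pdu, pdu_mask = rule["pdu_type"]
    want_qfi, qfi_mask = rule["qfi"]
    return ((pdu_type & pdu_mask) == (want_pdu & pdu_mask)
            and (qfi & qfi_mask) == (want_qfi & qfi_mask))
```

For the filter above, `parse_gtp_opts("1:8/ff:ff")` yields PDU type 1 and
QFI 8, each with mask 0xff, so only packets carrying exactly those values match.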
    
    Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
    WojDrew authored and intel-lab-lkp committed Feb 4, 2022
  5. gtp: Add support for checking GTP device type

    Add a function that checks if a net device type is GTP.
    
    Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
    WojDrew authored and intel-lab-lkp committed Feb 4, 2022
  6. gtp: Allow to create GTP device without FDs

    Currently, when the user wants to create a GTP device, they have to
    provide file handles to the sockets created in userspace (IFLA_GTP_FD0,
    IFLA_GTP_FD1). This behaviour is not ideal, considering the option of
    adding support for GTP device creation through ip link. Ip link
    application is not a good place to create such sockets.
    
    This patch allows creating a GTP device without providing the
    IFLA_GTP_FD0 and IFLA_GTP_FD1 arguments. If the user does not
    provide file handles to the sockets, then GTP module takes care
    of creating UDP sockets by itself. Sockets are created with the
    commonly known UDP ports used for GTP protocol (GTP0_PORT and
    GTP1U_PORT). In this case we don't have to provide encap_destroy
    because no extra deinitialization is needed, everything is covered
    by udp_tunnel_sock_release.
    
    Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
    WojDrew authored and intel-lab-lkp committed Feb 4, 2022
  7. net: lan966x: use .mac_select_pcs() interface

    Convert lan966x to use the .mac_select_pcs() interface instead of
    phylink_set_pcs().
    
    Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
    Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Link: https://lore.kernel.org/r/20220202114949.833075-1-horatiu.vultur@microchip.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    HoratiuVultur authored and Jakub Kicinski committed Feb 4, 2022
  8. selftests: rtnetlink: Use more sensible tos values

    Using tos 0x1 with 'ip route get <IPv4 address> ...' doesn't test much
    of the tos option handling: 0x1 just sets an ECN bit, which is cleared
    by inet_rtm_getroute() before doing the fib lookup. Let's use 0x10
    instead, which is actually taken into account in the route lookup (and
    is less surprising for the reader).
    
    For consistency, use 0x10 for the IPv6 route lookup too (IPv6 currently
    doesn't clear ECN bits, but might do so in the future).
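
The effect on the IPv4 lookup can be sketched with plain bit masks. The two
low bits of the TOS byte are the ECN field; the value 0x1C used for the
routing mask is an assumption about the kernel's IPTOS_RT_MASK, and
`lookup_tos` is a made-up helper rather than a kernel function.

```python
ECN_MASK = 0x03       # two least significant bits of the TOS byte carry ECN
IPTOS_RT_MASK = 0x1C  # assumed value of the mask applied before the fib lookup


def lookup_tos(tos: int) -> int:
    """Return the part of a requested tos that survives the masking."""
    return tos & IPTOS_RT_MASK


for tos in (0x01, 0x10):
    print(f"tos {tos:#04x}: ecn bits {tos & ECN_MASK:#x}, "
          f"used for lookup {lookup_tos(tos):#04x}")
```

Under that assumption 0x1 collapses to 0 before the lookup, so it cannot
exercise any tos handling, while 0x10 passes through unchanged.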
    
    Signed-off-by: Guillaume Nault <gnault@redhat.com>
    Link: https://lore.kernel.org/r/d61119e68d01ba7ef3ba50c1345a5123a11de123.1643815297.git.gnault@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Guillaume Nault authored and Jakub Kicinski committed Feb 4, 2022
  9. selftests: fib offload: use sensible tos values

    Although both iproute2 and the kernel accept 1 and 2 as tos values for
    new routes, those are invalid. These values only set ECN bits, which
    are ignored during IPv4 fib lookups. Therefore, no packet can actually
    match such routes. This selftest therefore only succeeds because it
    doesn't verify that the new routes do actually work in practice (it
    just checks if the routes are offloaded or not).
    
    It makes more sense to use tos values that don't conflict with ECN.
    This way, the selftest won't be affected if we later decide to warn or
    even reject invalid tos configurations for new routes.
    
    Signed-off-by: Guillaume Nault <gnault@redhat.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Link: https://lore.kernel.org/r/5e43b343720360a1c0e4f5947d9e917b26f30fbf.1643826556.git.gnault@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Guillaume Nault authored and Jakub Kicinski committed Feb 4, 2022
  10. net: minor __dev_alloc_name() optimization

    __dev_alloc_name() allocates a private zeroed page,
    then sets bits in it while iterating through net devices.
    
    It can use __set_bit() to avoid unnecessary locked operations.
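
For readers unfamiliar with the function, the idea can be sketched in
userspace as follows: mark every unit number already in use in a private
bitmap, then pick the first clear bit. Because the bitmap is private to the
call, nothing else can touch it concurrently, which is why non-atomic bit
sets are enough. The function name, the use of a Python integer as the bitmap
and the single "%d" pattern are assumptions of this sketch.

```python
import re


def first_free_unit(pattern: str, existing_names, max_units: int = 4096 * 8) -> int:
    """Return the lowest i for which pattern % i is not an existing name."""
    prefix, suffix = pattern.split("%d")  # assumes exactly one "%d"
    name_re = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(suffix) + r"\Z")
    in_use = 0  # private bitmap: only this call reads or writes it
    for name in existing_names:
        m = name_re.match(name)
        if not m:
            continue
        unit = int(m.group(1))
        # Round-trip check so that e.g. "eth01" does not mark unit 1 as used.
        if unit < max_units and pattern % unit == name:
            in_use |= 1 << unit
    unit = 0
    while in_use & (1 << unit):  # find the first zero bit
        unit += 1
    return unit
```

For example, with devices eth0, eth1 and eth3 present, the pattern "eth%d"
resolves to unit 2.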
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20220203064609.3242863-1-eric.dumazet@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    neebe000 authored and Jakub Kicinski committed Feb 4, 2022
  11. Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

    No conflicts.
    
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Feb 4, 2022
  12. gcc-plugins/stackleak: Use noinstr in favor of notrace

    While the stackleak plugin was already using notrace, objtool is now a
    bit more picky.  Update the notrace uses to noinstr.  Silences the
    following objtool warnings when building with:
    
    CONFIG_DEBUG_ENTRY=y
    CONFIG_STACK_VALIDATION=y
    CONFIG_VMLINUX_VALIDATION=y
    CONFIG_GCC_PLUGIN_STACKLEAK=y
    
      vmlinux.o: warning: objtool: do_syscall_64()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
      vmlinux.o: warning: objtool: do_int80_syscall_32()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
      vmlinux.o: warning: objtool: exc_general_protection()+0x22: call to stackleak_track_stack() leaves .noinstr.text section
      vmlinux.o: warning: objtool: fixup_bad_iret()+0x20: call to stackleak_track_stack() leaves .noinstr.text section
      vmlinux.o: warning: objtool: do_machine_check()+0x27: call to stackleak_track_stack() leaves .noinstr.text section
      vmlinux.o: warning: objtool: .text+0x5346e: call to stackleak_erase() leaves .noinstr.text section
      vmlinux.o: warning: objtool: .entry.text+0x143: call to stackleak_erase() leaves .noinstr.text section
      vmlinux.o: warning: objtool: .entry.text+0x10eb: call to stackleak_erase() leaves .noinstr.text section
      vmlinux.o: warning: objtool: .entry.text+0x17f9: call to stackleak_erase() leaves .noinstr.text section
    
    Note that the plugin's addition of calls to stackleak_track_stack() from
    noinstr functions is expected to be safe, as it isn't runtime
    instrumentation and is self-contained.
    
    Cc: Alexander Popov <alex.popov@linux.com>
    Suggested-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    kees authored and torvalds committed Feb 4, 2022
  13. Merge tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/netdev/net
    
    Pull networking fixes from Jakub Kicinski:
     "Including fixes from bpf, netfilter, and ieee802154.
    
      Current release - regressions:
    
       - Partially revert "net/smc: Add netlink net namespace support", fix
         uABI breakage
    
       - netfilter:
          - nft_ct: fix use after free when attaching zone template
          - nft_byteorder: track register operations
    
      Previous releases - regressions:
    
       - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
    
       - phy: qca8081: fix speeds lower than 2.5Gb/s
    
       - sched: fix use-after-free in tc_new_tfilter()
    
      Previous releases - always broken:
    
       - tcp: fix mem under-charging with zerocopy sendmsg()
    
       - tcp: add missing tcp_skb_can_collapse() test in
         tcp_shift_skb_data()
    
       - neigh: do not trigger immediate probes on NUD_FAILED from
         neigh_managed_work, avoid a deadlock
    
       - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
         false-positives
    
       - netfilter: nft_reject_bridge: fix for missing reply from prerouting
    
       - smc: forward wakeup to smc socket waitqueue after fallback
    
       - ieee802154:
          - return meaningful error codes from the netlink helpers
          - mcr20a: fix lifs/sifs periods
          - at86rf230, ca8210: stop leaking skbs on error paths
    
       - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent
    
       - ax25: add refcount in ax25_dev to avoid UAF bugs
    
       - eth: mlx5e:
          - fix SFP module EEPROM query
          - fix broken SKB allocation in HW-GRO
          - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows
    
       - eth: amd-xgbe:
          - fix skb data length underflow
          - ensure reset of the tx_timer_active flag, avoid Tx timeouts
    
       - eth: stmmac: fix runtime pm use in stmmac_dvr_remove()
    
       - eth: e1000e: handshake with CSME starts from Alder Lake platforms"
    
    * tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
      ax25: fix reference count leaks of ax25_dev
      net: stmmac: ensure PTP time register reads are consistent
      net: ipa: request IPA register values be retained
      dt-bindings: net: qcom,ipa: add optional qcom,qmp property
      tools/resolve_btfids: Do not print any commands when building silently
      bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
      net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work
      tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
      net: sparx5: do not refer to skb after passing it on
      Partially revert "net/smc: Add netlink net namespace support"
      net/mlx5e: Avoid field-overflowing memcpy()
      net/mlx5e: Use struct_group() for memcpy() region
      net/mlx5e: Avoid implicit modify hdr for decap drop rule
      net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
      net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
      net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
      net/mlx5: E-Switch, Fix uninitialized variable modact
      net/mlx5e: Fix handling of wrong devices during bond netevent
      net/mlx5e: Fix broken SKB allocation in HW-GRO
      net/mlx5e: Fix wrong calculation of header index in HW_GRO
      ...
    torvalds committed Feb 4, 2022
  14. Merge tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux…

    …/kernel/git/pcmoore/selinux
    
    Pull selinux fix from Paul Moore:
     "One small SELinux patch to ensure that a policy structure field is
      properly reset after freeing so that we don't inadvertently do a
      double-free on certain error conditions"
    
    * tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
      selinux: fix double free of cond_list on error paths
    torvalds committed Feb 4, 2022
  15. Merge tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pu…

    …b/scm/linux/kernel/git/shuah/linux-kselftest
    
    Pull Kselftest fixes from Shuah Khan:
     "Important fixes to several tests and documentation clarification on
      running mainline kselftest on stable releases. A few notable fixes:
    
       - fix kselftest run hang due to child processes that haven't been
         terminated. The fix signals all child processes
    
       - fix false pass/fail results from vdso_test_abi, openat2, mincore
    
       - build failures when using -j (multiple jobs) option
    
       - exec test build failure due to incorrect build rule for a run-time
         created "pipe"
    
       - zram test fixes related to interaction with zram-generator, to make
         sure the zram test coordinates device deletion with zram-generator
    
       - zram test compression ratio calculation fix and skipping
         max_comp_streams.
    
       - increasing rtc test timeout
    
       - cpufreq test to write test results to stdout, which will be necessary
         on automated test systems"
    
    * tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
      kselftest: Fix vdso_test_abi return status
      selftests: skip mincore.check_file_mmap when fs lacks needed support
      selftests: openat2: Skip testcases that fail with EOPNOTSUPP
      selftests: openat2: Add missing dependency in Makefile
      selftests: openat2: Print also errno in failure messages
      selftests: futex: Use variable MAKE instead of make
      selftests/exec: Remove pipe from TEST_GEN_FILES
      selftests/zram: Adapt the situation that /dev/zram0 is being used
      selftests/zram01.sh: Fix compression ratio calculation
      selftests/zram: Skip max_comp_streams interface on newer kernel
      docs/kselftest: clarify running mainline tests on stables
      kselftest: signal all child processes
      selftests: cpufreq: Write test output to stdout as well
      selftests: rtc: Increase test timeout so that all tests run
    torvalds committed Feb 4, 2022

Commits on Feb 3, 2022

  1. ax25: fix reference count leaks of ax25_dev

    The previous commit d01ffb9 ("ax25: add refcount in ax25_dev
    to avoid UAF bugs") introduces refcount into ax25_dev, but there
    are reference leak paths in ax25_ctl_ioctl(), ax25_fwd_ioctl(),
    ax25_rt_add(), ax25_rt_del() and ax25_rt_opt().
    
    This patch uses ax25_dev_put() and adjusts the position of
    ax25_addr_ax25dev() to fix reference count leaks of ax25_dev.
    
    Fixes: d01ffb9 ("ax25: add refcount in ax25_dev to avoid UAF bugs")
    Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
    Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
    Link: https://lore.kernel.org/r/20220203150811.42256-1-duoming@zju.edu.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Duoming Zhou authored and Jakub Kicinski committed Feb 3, 2022
  2. net: stmmac: ensure PTP time register reads are consistent

    Even if protected from preemption and interrupts, a small time window
    remains in which the two register reads could return inconsistent values,
    each time the "seconds" register changes. This could lead to an error of
    about 1 second in the reported time.
    
    Add logic to ensure the "seconds" and "nanoseconds" values are consistent.
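
One common way to get a consistent pair, sketched here in userspace, is to
read the "seconds" register before and after the "nanoseconds" register and
retry if it changed in between. The callables `read_sec` and `read_nsec` stand
in for the register reads and are assumptions of this sketch; the driver's
actual code may differ.

```python
def read_time_consistent(read_sec, read_nsec):
    """Return (seconds, nanoseconds) taken within the same second."""
    sec_after = read_sec()
    while True:
        sec_before = sec_after
        nsec = read_nsec()
        sec_after = read_sec()
        # If "seconds" rolled over while reading "nanoseconds", try again.
        if sec_before == sec_after:
            return sec_after, nsec
```

If the seconds value changes from 5 to 6 between the two reads, the first
nanoseconds sample is discarded and the pair is read again, which avoids
reporting a time that is almost a full second off.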
    
    Fixes: 92ba688 ("stmmac: add the support for PTP hw clock driver")
    Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com>
    Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Link: https://lore.kernel.org/r/20220203160025.750632-1-yannick.vignon@oss.nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Yackou authored and Jakub Kicinski committed Feb 3, 2022
  3. Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

    Daniel Borkmann says:
    
    ====================
    pull-request: bpf 2022-02-03
    
    We've added 6 non-merge commits during the last 10 day(s) which contain
    a total of 7 files changed, 11 insertions(+), 236 deletions(-).
    
    The main changes are:
    
    1) Fix BPF ringbuf to allocate its area with VM_MAP instead of VM_ALLOC
       flag which otherwise trips over KASAN, from Hou Tao.
    
    2) Fix unresolved symbol warning in resolve_btfids due to LSM callback
       rename, from Alexei Starovoitov.
    
    3) Fix a possible race in inc_misses_counter() when IRQ would trigger
       during counter update, from He Fengqing.
    
    4) Fix tooling infra for cross-building with clang upon probing whether
       gcc provides the standard libraries, from Jean-Philippe Brucker.
    
    5) Fix silent mode build for resolve_btfids, from Nathan Chancellor.
    
    6) Drop unneeded and outdated lirc.h header copy from tooling infra as
       BPF does not require it anymore, from Sean Young.
    
    * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
      tools/resolve_btfids: Do not print any commands when building silently
      bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
      tools: Ignore errors from `which' when searching a GCC toolchain
      tools headers UAPI: remove stale lirc.h
      bpf: Fix possible race in inc_misses_counter
      bpf: Fix renaming task_getsecid_subj->current_getsecid_subj.
    ====================
    
    Link: https://lore.kernel.org/r/20220203155815.25689-1-daniel@iogearbox.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Feb 3, 2022
  4. printk: Fix incorrect __user type in proc_dointvec_minmax_sysadmin()

    The move of proc_dointvec_minmax_sysadmin() from kernel/sysctl.c to
    kernel/printk/sysctl.c introduced an incorrect __user attribute to the
    buffer argument.  I spotted this change in [1] as well as the kernel
    test robot.  Revert this change to please sparse:
    
      kernel/printk/sysctl.c:20:51: warning: incorrect type in argument 3 (different address spaces)
      kernel/printk/sysctl.c:20:51:    expected void *
      kernel/printk/sysctl.c:20:51:    got void [noderef] __user *buffer
    
    Fixes: faaa357 ("printk: move printk sysctl to printk/sysctl.c")
    Link: https://lore.kernel.org/r/20220104155024.48023-2-mic@digikod.net [1]
    Reported-by: kernel test robot <lkp@intel.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: John Ogness <john.ogness@linutronix.de>
    Cc: Luis Chamberlain <mcgrof@kernel.org>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Xiaoming Ni <nixiaoming@huawei.com>
    Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
    Link: https://lore.kernel.org/r/20220203145029.272640-1-mic@digikod.net
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    l0kod authored and torvalds committed Feb 3, 2022
  5. Revert "module, async: async_synchronize_full() on module init iff as…

    …ync is used"
    
    This reverts commit 774a122.
    
    We need to finish all async code before the module init sequence is
    done.  In the reverted commit the PF_USED_ASYNC flag was added to mark a
    thread that called async_schedule().  Then the PF_USED_ASYNC flag was
    used to determine whether or not async_synchronize_full() needs to be
    invoked.  This works when modprobe thread is calling async_schedule(),
    but it does not work if module dispatches init code to a worker thread
    which then calls async_schedule().
    
    For example, PCI driver probing is invoked from a worker thread based on
    a node where device is attached:
    
    	if (cpu < nr_cpu_ids)
    		error = work_on_cpu(cpu, local_pci_probe, &ddi);
    	else
    		error = local_pci_probe(&ddi);
    
    We end up in a situation where a worker thread gets the PF_USED_ASYNC
    flag set instead of the modprobe thread.  As a result,
    async_synchronize_full() is not invoked and modprobe completes without
    waiting for the async code to finish.
    
    The issue was discovered while loading the pm80xx driver:
    (scsi_mod.scan=async)
    
    modprobe pm80xx                      worker
    ...
      do_init_module()
      ...
        pci_call_probe()
          work_on_cpu(local_pci_probe)
                                         local_pci_probe()
                                           pm8001_pci_probe()
                                             scsi_scan_host()
                                               async_schedule()
                                               worker->flags |= PF_USED_ASYNC;
                                         ...
          < return from worker >
      ...
      if (current->flags & PF_USED_ASYNC) <--- false
      	async_synchronize_full();
    
    Commit 21c3c5d ("block: don't request module during elevator init")
    fixed the deadlock issue which the reverted commit 774a122
    ("module, async: async_synchronize_full() on module init iff async is
    used") tried to fix.
    
    Since commit 0fdff3e ("async, kmod: warn on synchronous
    request_module() from async workers") synchronous module loading from
    async is not allowed.
    
    Given that the original deadlock issue is fixed and it is no longer
    allowed to call synchronous request_module() from async we can remove
    PF_USED_ASYNC flag to make module init consistently invoke
    async_synchronize_full() unless async module probe is requested.
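
The failure mode can be reproduced in miniature with per-thread state. In
this sketch a thread-local variable stands in for the per-task
PF_USED_ASYNC flag and a list stands in for the sequence of events; all
function names are invented for illustration.

```python
import threading

_thread_flags = threading.local()  # stands in for per-task current->flags


def async_schedule(log):
    _thread_flags.used_async = True  # flag lands on whichever thread calls this
    log.append("async work scheduled")


def module_init_via_worker(log):
    # Like work_on_cpu(): the init code runs on a worker and the caller waits.
    worker = threading.Thread(target=async_schedule, args=(log,))
    worker.start()
    worker.join()


def load_module(init, log, check_flag: bool):
    _thread_flags.used_async = False  # cleared on the loading thread before init
    init(log)
    # check_flag=True models the reverted commit, False the unconditional sync.
    if not check_flag or getattr(_thread_flags, "used_async", False):
        log.append("async_synchronize_full")
    return log
```

With `check_flag=True` and `module_init_via_worker`, the flag is set on the
worker thread only, so the loading thread skips the synchronization step even
though async work was scheduled. With `check_flag=False` the step always runs.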
    
    Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
    Reviewed-by: Changyuan Lyu <changyuanl@google.com>
    Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
    Acked-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    ipylypiv authored and torvalds committed Feb 3, 2022
  6. Merge branch 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/k…

    …ernel/git/tj/cgroup
    
    Pull cgroup fixes from Tejun Heo:
    
     - Eric's fix for a long standing cgroup1 permission issue where it only
       checks for uid 0 instead of CAP which inadvertently allows
       unprivileged userns roots to modify release_agent userhelper
    
     - Fixes for the fallout from Waiman's recent cpuset work
    
    * 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
      cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning
      cgroup-v1: Require capabilities to set release_agent
      cpuset: Fix the bug that subpart_cpus updated wrongly in update_cpumask()
      cgroup/cpuset: Make child cpusets restrict parents on v1 hierarchy
    torvalds committed Feb 3, 2022
  7. Merge branch 'net-ipa-enable-register-retention'

    Alex Elder says:
    
    ====================
    net: ipa: enable register retention
    
    With runtime power management in place, we sometimes need to issue
    a command to enable retention of IPA register values before power
    collapse.  This requires a new Device Tree property, whose presence
    will also be used to signal that the command is required.
    ====================
    
    Link: https://lore.kernel.org/r/20220201150205.468403-1-elder@linaro.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Feb 3, 2022
  8. net: ipa: request IPA register values be retained

    In some cases, the IPA hardware needs to request the always-on
    subsystem (AOSS) to coordinate with the IPA microcontroller to
    retain IPA register values at power collapse.  This is done by
    issuing a QMP request to the AOSS microcontroller.  A similar
    request undoes that request.
    
    We must get and hold the "QMP" handle early, because we might get
    back EPROBE_DEFER for that.  But the actual request should be sent
    while we know the IPA clock is active, and when we know the
    microcontroller is operational.
    
    Fixes: 1aac309 ("net: ipa: use autosuspend")
    Signed-off-by: Alex Elder <elder@linaro.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    alexelder authored and Jakub Kicinski committed Feb 3, 2022
  9. dt-bindings: net: qcom,ipa: add optional qcom,qmp property

    For some systems, the IPA driver must make a request to ensure that
    its registers are retained across power collapse of the IPA hardware.
    On such systems, we'll use the existence of the "qcom,qmp" property
    as a signal that this request is required.
    
    Signed-off-by: Alex Elder <elder@linaro.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    alexelder authored and Jakub Kicinski committed Feb 3, 2022
  10. cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning

    It was found that a "suspicious RCU usage" lockdep warning was issued
    with the rcu_read_lock() call in update_sibling_cpumasks().  It is
    because the update_cpumasks_hier() function may sleep. So we have
    to release the RCU lock, call update_cpumasks_hier() and reacquire
    it afterward.
    
    Also add a percpu_rwsem_assert_held() in update_sibling_cpumasks()
    instead of stating that in the comment.
    
    Fixes: 4716909 ("cpuset: Track cpusets that use parent's effective_cpus")
    Signed-off-by: Waiman Long <longman@redhat.com>
    Tested-by: Phil Auld <pauld@redhat.com>
    Reviewed-by: Phil Auld <pauld@redhat.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Waiman Long authored and htejun committed Feb 3, 2022
  11. tools/resolve_btfids: Do not print any commands when building silently

    When building with 'make -s', there is some output from resolve_btfids:
    
    $ make -sj"$(nproc)" oldconfig prepare
      MKDIR     .../tools/bpf/resolve_btfids/libbpf/
      MKDIR     .../tools/bpf/resolve_btfids//libsubcmd
      LINK     resolve_btfids
    
    Silent mode means that no information should be emitted about what is
    currently being done. Use the $(silent) variable from Makefile.include
    to avoid defining the msg macro so that there is no information printed.
    
    Fixes: fbbb68d ("bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object")
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20220201212503.731732-1-nathan@kernel.org
    nathanchance authored and borkmann committed Feb 3, 2022
  12. Revert "mm/gup: small refactoring: simplify try_grab_page()"

    This reverts commit 54d516b
    
    That commit did a refactoring that effectively combined fast and slow
    gup paths (again).  And that was again incorrect, for two reasons:
    
     a) Fast gup and slow gup get reference counts on pages in different
        ways and with different goals: see Linus' writeup in commit
        cd1adf1 ("Revert "mm/gup: remove try_get_page(), call
        try_get_compound_head() directly""), and
    
     b) try_grab_compound_head() also has a specific check for
        "FOLL_LONGTERM && !is_pinned(page)", that assumes that the caller
        can fall back to slow gup. This resulted in new failures, as
        recently reported by Will McVicker [1].
    
    But (a) has problems too, even though they may not have been reported
    yet.  So just revert this.
    
    Link: https://lore.kernel.org/r/20220131203504.3458775-1-willmcvicker@google.com [1]
    Fixes: 54d516b ("mm/gup: small refactoring: simplify try_grab_page()")
    Reported-and-tested-by: Will McVicker <willmcvicker@google.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Minchan Kim <minchan@google.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: stable@vger.kernel.org # 5.15
    Signed-off-by: John Hubbard <jhubbard@nvidia.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    johnhubbard authored and torvalds committed Feb 3, 2022
  13. Merge tag 'mips-fixes-5.17_2' of git://git.kernel.org/pub/scm/linux/k…

    …ernel/git/mips/linux
    
    Pull MIPS fixes from Thomas Bogendoerfer:
    
     - fix missed change for PTR->PTR_WD conversion
    
     - kernel-doc fixes
    
    * tag 'mips-fixes-5.17_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
      MIPS: KVM: fix vz.c kernel-doc notation
      MIPS: octeon: Fix missed PTR->PTR_WD conversion
    torvalds committed Feb 3, 2022
  14. Merge branch 'dsa-mv88e6xxx-phylink_generic_validate'

    Russell King says:
    
    ====================
    net: dsa: mv88e6xxx: convert to phylink_generic_validate()
    
    The overall objective of this series is to convert the mv88e6xxx DSA
    driver to use phylink_generic_validate().
    
    Patch 1 adds a new helper mv88e6352_g2_scratch_port_has_serdes() which
    indicates whether an 88e6352 port has a serdes associated with it. This
    is necessary as ports 4 and 5 will normally be in automedia mode, where
    the CMODE field in the port status register will change e.g. between 15
    (internal PHY) and 9 (1000base-X) depending on whether the serdes has
    link.
    
    The existing code caches the cmode field and, depending on whether the
    serdes has link at probe time, determines whether we allow things such
    as the serdes statistics to be accessed. This means if the link isn't
    up at probe time, the serdes is essentially unavailable.
    
    Patch 1 addresses this by reading the pin configuration to find out
    whether the serdes is attached to port 4 or port 5.
    
    Patch 2 is a joint effort between myself and Marek Behún, adding the
    supported interfaces and MAC capabilities to all mv88e6xxx supported
    switch devices. This is slightly more restrictive than the original
    code as we didn't used to care too much about the interface mode, but
    with this we do - which is why we must know if there's a serdes
    associated now.
    
    Patch 3 switches mv88e6xxx to use the generic validation by removing
    the initialisation of the phylink_validate pointer in the dsa_ops
    struct.
    
    Patch 4 updates the statistics code to use the new helper in patch 1,
    so the serdes statistics are available even if the link was down at
    driver probe time.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Feb 3, 2022
  15. net: dsa: mv88e6xxx: improve 88e6352 serdes statistics detection

    The decision whether to report serdes statistics currently depends on
    the cached C_Mode value for the port, read at probe time or updated by
    configuration. However, port 4 can be in "automedia" mode when it is
    used as a serdes port, meaning it switches between the internal PHY and
    the serdes, changing the read-only C_Mode value depending on which
    first gains link. Consequently, the C_Mode value read at probe does not
    accurately reflect whether the port has the serdes associated with it.
    
    In "net: dsa: mv88e6xxx: add mv88e6352_g2_scratch_port_has_serdes()",
    we added a way to read the hardware configuration to determine which
    port has the serdes associated with it. Use this to determine which
    port reports the serdes statistics.
    
    Reviewed-by: Marek Behún <kabel@kernel.org>
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Feb 3, 2022
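
The difference between the two detection strategies can be sketched as below. This is a minimal model and not the driver code: `struct port_model`, `has_serdes_cached()` and `has_serdes_hwcfg()` are hypothetical names introduced for illustration, and the C_Mode values 15 (internal PHY) and 9 (1000base-X) are the ones quoted in the series cover letter.

```c
#include <stdbool.h>

/* C_Mode values as quoted in the series cover letter. */
enum { CMODE_1000BASE_X = 9, CMODE_PHY = 15 };

/* Hypothetical model of one automedia-capable port. */
struct port_model {
	int cached_cmode;     /* C_Mode value captured at probe time */
	bool serdes_attached; /* hardware (pin) configuration: serdes wired here */
};

/* Old approach: infer the serdes from the cached C_Mode value. */
static bool has_serdes_cached(const struct port_model *p)
{
	return p->cached_cmode == CMODE_1000BASE_X;
}

/* New approach: rely on the hardware configuration instead. */
static bool has_serdes_hwcfg(const struct port_model *p)
{
	return p->serdes_attached;
}
```

For a port whose serdes had no link at probe time, the cached value reads as internal PHY, so the first check reports no serdes while the second still finds it.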
  16. net: dsa: mv88e6xxx: convert to phylink_generic_validate()

    Now that the mv88e6xxx chip drivers are supplying the supported
    interfaces and MAC capabilities, switch the driver to use the generic
    phylink validation implementation by removing our own validation
    implementations. This causes DSA to call phylink_generic_validate()
    on our behalf.
    
    Reviewed-by: Marek Behún <kabel@kernel.org>
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Feb 3, 2022
  17. net: dsa: mv88e6xxx: populate supported_interfaces and mac_capabilities

    Populate the supported interfaces and MAC capabilities for the
    Marvell MV88E6xxx DSA switches in preparation for using these for the
    validation functionality.
    
    Patch co-authored by Marek.
    
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: Marek Behún <kabel@kernel.org> [ fixed 6341 and 6393x ]
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Feb 3, 2022
  18. net: dsa: mv88e6xxx: add mv88e6352_g2_scratch_port_has_serdes()

    Read the hardware configuration to determine which port is attached
    to the serdes.
    
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Russell King (Oracle) authored and davem330 committed Feb 3, 2022
  19. Merge branch 'dsa-mv88e6xxx-port-isolation'

    Tobias Waldekranz says:
    
    ====================
    net: dsa: mv88e6xxx: Improve standalone port isolation
    
    The ideal isolation between standalone ports satisfies two properties:
    1. Packets from one standalone port must not be forwarded to any other
       port.
    2. Packets from a standalone port must be sent to the CPU port.
    
    mv88e6xxx solves (1) by isolating standalone ports using the PVT. Up
    to this point though, (2) has not been guaranteed; as the ATU is still
    consulted, there is a chance that incoming packets never reach the CPU
    if their DA has previously been used as the SA of an earlier packet (see
    1/5 for more details). This is typically not a problem, except for one
    very useful setup in which switch ports are looped in order to run the
    bridge kselftests in tools/testing/selftests/net/forwarding. This
    series attempts to solve (2).
    
    Ideally, we could simply use the "ForceMap" bit of more modern chips
    (Agate and newer) to classify all incoming packets as MGMT. This is
    not available on older silicon that is still widely used (Opal Plus
    chips like the 6097 for example).
    
    Instead, this series takes a two-pronged approach:
    
    1/5: Always clear MapDA on standalone ports to make sure that no ATU
         entry can lead packets astray. This solves (2) for single-chip
         systems.
    
    2/5: Trivial prep work for 4/5.
    3/5: Trivial prep work for 4/5.
    
    4/5: On multi-chip systems though, this is not enough. On the incoming
         chip, the packet will be forced out towards the CPU thanks to
         1/5, but on any intermediate chips the ATU is still consulted. We
         override this behavior by marking the reserved standalone VID (0)
         as a policy VID and setting the DSA ports' VID policy to TRAP. This
         will cause the packet to be reclassified as MGMT on the first
         intermediate chip, after which it's a straight shot towards the
         CPU.
    
    Finally, we allow more tests to be run on mv88e6xxx:
    
    5/5: The bridge_vlan{,un}aware suites set an ageing_time of 10s on
         the bridge it creates, but mv88e6xxx has a minimum supported time
         of 15s. Allow this time to be overridden in forwarding.config.
    
    With this series in place, mv88e6xxx passes the following kselftest
    suites:
    
    - bridge_port_isolation.sh
    - bridge_sticky_fdb.sh
    - bridge_vlan_aware.sh
    - bridge_vlan_unaware.sh
    
    v1 -> v2:
      - Wording/spelling (Vladimir)
      - Use standard iterator in dsa_switch_upstream_port (Vladimir)
      - Limit enabling of VTU port policy to downstream DSA ports (Vladimir)
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Feb 3, 2022
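
Why property (2) can fail while (1) holds can be illustrated with a toy model. This is a heavily simplified sketch and not the mv88e6xxx implementation: `struct toy_switch`, `pvt_allows()` and `toy_forward()` are hypothetical names, the switch has a single CPU port, and the ATU is reduced to one possible hit for the packet's DA.

```c
#include <stdbool.h>

enum { PORT_CPU = 0, PORT_A = 1, PORT_B = 2, DROPPED = -1 };

/* Hypothetical state relevant to one packet arriving on a standalone port. */
struct toy_switch {
	bool map_da;   /* ingress port consults the ATU for the packet's DA */
	bool atu_hit;  /* DA was learned earlier as the SA of another packet */
	int atu_port;  /* port the learned ATU entry points at */
};

/* PVT rule for a standalone port: only the CPU port is reachable. */
static bool pvt_allows(int dst_port)
{
	return dst_port == PORT_CPU;
}

/* Return the port the packet is delivered to, or DROPPED. */
static int toy_forward(const struct toy_switch *sw)
{
	if (sw->map_da && sw->atu_hit) {
		/* The ATU picks the destination; the PVT may then block it. */
		return pvt_allows(sw->atu_port) ? sw->atu_port : DROPPED;
	}
	/* No ATU lookup: the packet is flooded, and the PVT limits it to the CPU. */
	return PORT_CPU;
}
```

In this model, a learned entry pointing at another standalone port causes the packet to be dropped rather than delivered to the CPU, whereas with `map_da` cleared the same packet always reaches the CPU port.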
  20. selftests: net: bridge: Parameterize ageing timeout

    Allow the ageing timeout that is set on bridges to be customized from
    forwarding.config. This allows the tests to be run on hardware which
    does not support a 10s timeout (e.g. mv88e6xxx).
    
    Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
    Reviewed-by: Petr Machata <petrm@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    wkz authored and davem330 committed Feb 3, 2022