Dai-Ngo/nfsd-I…

Commits on Dec 11, 2021

  1. nfsd: Initial implementation of NFSv4 Courteous Server

    Currently an NFSv4 client must maintain its lease by using at least
    one of the state tokens or, if nothing else, by issuing a RENEW (4.0) or
    a singleton SEQUENCE (4.1) at least once during each lease period. If the
    client fails to renew the lease, for any reason, the Linux server expunges
    the state tokens immediately upon detection of the "failure to renew the
    lease" condition and begins returning NFS4ERR_EXPIRED if the client should
    reconnect and attempt to use the (now) expired state.
    
    The default lease period for the Linux server is 90 seconds.  The typical
    client cuts that in half and will issue a lease renewing operation every
    45 seconds. The 90 second lease period is very short considering the
    potential for moderately long term network partitions.  A network partition
    refers to any loss of network connectivity between the NFS client and the
    NFS server, regardless of its root cause.  This includes NIC failures, NIC
    driver bugs, network misconfigurations & administrative errors, routers &
    switches crashing and/or having software updates applied, even down to
    cables being physically pulled.  In most cases, these network failures are
    transient, although the duration is unknown.
    
    A server which does not immediately expunge the state on lease expiration
    is known as a Courteous Server.  A Courteous Server continues to recognize
    previously generated state tokens as valid until conflict arises between
    the expired state and the requests from another client, or the server
    reboots.
    
    The initial implementation of the Courteous Server will do the following:
    
    . when the laundromat thread detects an expired client and if that client
    still has established states on the Linux server and there are no waiters
    for the client's locks then mark the client as a COURTESY_CLIENT and skip
    destroying the client and all its states, otherwise destroy the client as
    usual.
    
    . detects conflict of OPEN request with COURTESY_CLIENT, destroys the
    expired client and all its states, skips the delegation recall then allows
    the conflicting request to succeed.
    
    . detects conflict of LOCK/LOCKT, NLM LOCK and TEST, and local lock
    requests with COURTESY_CLIENT, destroys the expired client and all its
    states then allows the conflicting request to succeed.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Dai Ngo authored and intel-lab-lkp committed Dec 11, 2021
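The behaviour listed in the "Initial implementation of NFSv4 Courteous Server" message can be sketched as a small userspace model. This is a toy in Python, not the nfsd code: the `Client` class, its fields, and both function names are assumptions made for illustration. Only the 90 second lease, the "has state and no lock waiters" test, and the "destroy the courtesy client on conflict" rule come from the commit message.

```python
from dataclasses import dataclass
from typing import List

LEASE_SECONDS = 90  # default Linux server lease period, per the commit message


@dataclass
class Client:
    name: str
    last_renewal: float      # time of the last lease-renewing operation
    has_state: bool = True   # opens/locks still established on the server
    lock_waiters: int = 0    # requests blocked on this client's locks
    courtesy: bool = False   # stands in for the COURTESY_CLIENT mark


def laundromat_pass(clients: List[Client], now: float) -> List[Client]:
    """Return the clients that survive one laundromat scan."""
    survivors = []
    for c in clients:
        expired = now - c.last_renewal > LEASE_SECONDS
        if not expired:
            survivors.append(c)
        elif c.has_state and c.lock_waiters == 0:
            c.courtesy = True        # keep the client and its state for now
            survivors.append(c)
        # otherwise the expired client is destroyed as usual
    return survivors


def resolve_conflict(clients: List[Client], holder: Client) -> bool:
    """A conflicting OPEN/LOCK arrives; expunge the holder only if it is a courtesy client."""
    if holder.courtesy:
        clients.remove(holder)
        return True                  # the conflicting request may succeed
    return False                     # ordinary conflict handling applies
```

A client that stopped renewing but holds uncontended state stays in the list with `courtesy` set; the first conflicting request removes it.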
  2. fs/lock: add new callback, lm_expire_lock, to lock_manager_operations

    Add new callback, lm_expire_lock, to lock_manager_operations to allow
    the lock manager to take appropriate action to resolve the lock conflict
    if possible. The callback takes 2 arguments, file_lock of the blocker
    and a testonly flag:
    
    testonly = 1  check and return lock manager's private data if lock conflict
                  can be resolved, else return NULL.
    testonly = 0  resolve the conflict if possible, return true if conflict
                  was resolved, else return false.
    
    A lock manager, such as the NFSv4 courteous server, uses this callback to
    resolve a conflict by destroying the lock owner, or the NFSv4 courtesy client
    (a client that has expired but is allowed to maintain its state) that owns
    the lock.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Dai Ngo authored and intel-lab-lkp committed Dec 11, 2021
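The two modes of the lm_expire_lock callback described in the fs/lock commit can be illustrated with a toy Python analogue. Nothing here is the kernel interface: the `Owner` class, the `locks` table, and `expire_lock()` are hypothetical names, and Python's `None`/`True`/`False` stand in for the NULL/true/false returns in the message.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass
class Owner:
    name: str
    courtesy: bool = False   # expired client whose state the server kept


# Hypothetical lock table: lock id -> owning client.
locks: Dict[str, Owner] = {}


def expire_lock(lock_id: str, testonly: bool) -> Union[Optional[Owner], bool]:
    """Toy analogue of the testonly contract described in the commit message."""
    owner = locks.get(lock_id)
    resolvable = owner is not None and owner.courtesy
    if testonly:
        # Report only: hand back the lock manager's private data, or None.
        return owner if resolvable else None
    if resolvable:
        # Resolve: drop every lock held by the courtesy client.
        for key in [k for k, o in locks.items() if o is owner]:
            del locks[key]
        return True
    return False
```

Calling with `testonly=True` never changes the table, so a caller can probe first and resolve later.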

Commits on Dec 10, 2021

  1. Merge tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
    
    Pull ACPI fix from Rafael Wysocki:
     "Create the output directory for the ACPI tools during build if it has
      not been present before and prevent the compilation from failing in
      that case (Chen Yu)"
    
    * tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      ACPI: tools: Fix compilation when output directory is not present
    torvalds committed Dec 10, 2021
  2. Merge tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
    
    Pull power management fix from Rafael Wysocki:
     "Fix a kerneldoc comment that doesn't match the behavior of the function
      documented by it"
    
    * tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      PM: runtime: Fix pm_runtime_active() kerneldoc comment
    torvalds committed Dec 10, 2021
  3. Merge tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
    
    Pull hwmon fixes from Guenter Roeck:
    
     - In the pwm-fan driver, ensure that the internal pwm state matches the
       state assumed by the pwm code.
    
     - Avoid EREMOTEIO errors in sht4 driver
    
     - In the nct6775 driver, make it explicit that the register value
       passed to nct6775_asuswmi_read() is an 8-bit value
    
     - Avoid WARNing in dell-smm driver removal after failing to create
       /proc/i8k
    
     - Stop using a plain integer as NULL pointer in corsair-psu driver
    
    * tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
      hwmon: (pwm-fan) Ensure the fan going on in .probe()
      hwmon: (sht4x) Fix EREMOTEIO errors
      hwmon: (nct6775) mask out bank number in nct6775_wmi_read_value()
      hwmon: (dell-smm) Fix warning on /proc/i8k creation error
      hwmon: (corsair-psu) fix plain integer used as NULL pointer
    torvalds committed Dec 10, 2021
  4. Merge tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
    
    Pull tracing fixes from Steven Rostedt:
     "Tracing, ftrace and tracefs fixes:
    
       - Have tracefs honor the gid mount option
    
       - Have new files in tracefs inherit the parent ownership
    
       - Have direct_ops unregister when it has no more functions
    
       - Properly clean up the ops when unregistering multi direct ops
    
       - Add a sample module to test the multiple direct ops
    
       - Fix memory leak in error path of __create_synth_event()"
    
    * tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
      tracing: Fix possible memory leak in __create_synth_event() error path
      ftrace/samples: Add module to test multi direct modify interface
      ftrace: Add cleanup to unregister_ftrace_direct_multi
      ftrace: Use direct_ops hash in unregister_ftrace_direct
      tracefs: Set all files to the same group ownership as the mount option
      tracefs: Have new files inherit the ownership of their parent
    torvalds committed Dec 10, 2021
  5. Merge tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
    
    Pull aio poll fixes from Eric Biggers:
     "Fix three bugs in aio poll, and one issue with POLLFREE more broadly:
    
       - aio poll didn't handle POLLFREE, causing a use-after-free.
    
       - aio poll could block while the file is ready.
    
       - aio poll called eventfd_signal() when it isn't allowed.
    
       - POLLFREE didn't handle multiple exclusive waiters correctly.
    
      This has been tested with the libaio test suite, as well as with test
      programs I wrote that reproduce the first two bugs. I am sending this
      pull request myself as no one seems to be maintaining this code"
    
    * tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
      aio: Fix incorrect usage of eventfd_signal_allowed()
      aio: fix use-after-free due to missing POLLFREE handling
      aio: keep poll requests on waitqueue until completed
      signalfd: use wake_up_pollfree()
      binder: use wake_up_pollfree()
      wait: add wake_up_pollfree()
    torvalds committed Dec 10, 2021
  6. Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

    Pull kvm fixes from Paolo Bonzini:
     "More x86 fixes:
    
       - Logic bugs in CR0 writes and Hyper-V hypercalls
    
       - Don't use Enlightened MSR Bitmap for L3
    
       - Remove user-triggerable WARN
    
      Plus a few selftest fixes and a regression test for the
      user-triggerable WARN"
    
    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
      selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
      KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit
      KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode
      selftests: KVM: avoid failures due to reserved HyperTransport region
      KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req
      KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall
      KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
      KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
    torvalds committed Dec 10, 2021
  7. Merge tag 'pci-v5.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
    
    Pull PCI fixes from Bjorn Helgaas:
    
     - Revert emulation of Marvell Armada A3720 expansion ROM because it
       doesn't work as expected (Marek Behún)
    
     - Assert PERST# in Apple M1 driver to fix initialization when booting
       from bootloaders using PCIe, such as U-Boot (Marc Zyngier)
    
     - Describe PERST# as active low in Apple T8103 DT and update driver to
       match (Marc Zyngier)
    
    * tag 'pci-v5.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
      PCI: apple: Fix PERST# polarity
      arm64: dts: apple: t8103: Mark PCIe PERST# polarity active low in DT
      PCI: apple: Follow the PCIe specifications when resetting the port
      Revert "PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge"
    torvalds committed Dec 10, 2021
  8. Merge tag 'mmc-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
    
    Pull MMC host fixes from Ulf Hansson:
    
     - mtk-sd: Fix memory leak during tuning
    
     - renesas_sdhi: Initialize variable properly when tuning
    
    * tag 'mmc-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
      mmc: mediatek: free the ext_csd when mmc_get_ext_csd success
      mmc: renesas_sdhi: initialize variable properly when tuning
    torvalds committed Dec 10, 2021
  9. Merge tag 'libata-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
    
    Pull libata fixes from Damien Le Moal:
    
     - Fix a sparse warning in the ahci_ceva driver (me)
    
     - Disable the ASMedia 1092 non-functional device (Hannes)
    
    * tag 'libata-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
      libata: add horkage for ASMedia 1092
      ata: ahci_ceva: Fix id array access in ceva_ahci_read_id()
    torvalds committed Dec 10, 2021
  10. Merge tag 'sound-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
    
    Pull sound fixes from Takashi Iwai:
     "Another collection of small fixes. It's still not quite calm yet, but
      nothing looks scary.
    
      ALSA core got a few fixes for covering the issues detected by fuzzer
      and the 32bit compat problem of control API, while the rest are all
      device-specific small fixes, including the continued fixes for Tegra"
    
    * tag 'sound-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
      ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform
      ALSA: usb-audio: Reorder snd_djm_devices[] entries
      ALSA: hda/realtek: Fix quirk for TongFang PHxTxX1
      ALSA: ctl: Fix copy of updated id with element read/write
      ALSA: pcm: oss: Handle missing errors in snd_pcm_oss_change_params*()
      ALSA: pcm: oss: Limit the period size to 16MB
      ALSA: pcm: oss: Fix negative period/buffer sizes
      ASoC: codecs: wsa881x: fix return values from kcontrol put
      ASoC: codecs: wcd934x: return correct value from mixer put
      ASoC: codecs: wcd934x: handle channel mappping list correctly
      ASoC: qdsp6: q6routing: Fix return value from msm_routing_put_audio_mixer
      ASoC: SOF: Intel: Retry codec probing if it fails
      ASoC: amd: fix uninitialized variable in snd_acp6x_probe()
      ASoC: rockchip: i2s_tdm: Dup static DAI template
      ASoC: rt5682s: Fix crash due to out of scope stack vars
      ASoC: rt5682: Fix crash due to out of scope stack vars
      ASoC: tegra: Use normal system sleep for ADX
      ASoC: tegra: Use normal system sleep for AMX
      ASoC: tegra: Use normal system sleep for Mixer
      ASoC: tegra: Use normal system sleep for MVC
      ...
    torvalds committed Dec 10, 2021
  11. Merge tag 'drm-fixes-2021-12-10' of git://anongit.freedesktop.org/drm/drm
    
    Pull drm fixes from Dave Airlie:
     "Regular fixes, pretty small overall, couple of core fixes, two i915
      and two amdgpu, hopefully it stays this quiet.
    
      ttm:
       - fix ttm_bo_swapout
    
      syncobj:
       - fix fence find bug with signalled fences
    
      i915:
       - fix error pointer deref in gem execbuffer
       - fix for GT init with GuC/HuC on ICL
    
      amdgpu:
       - DPIA fix
       - eDP fix"
    
    * tag 'drm-fixes-2021-12-10' of git://anongit.freedesktop.org/drm/drm:
      drm/i915/gen11: Moving WAs to icl_gt_workarounds_init()
      drm/amd/display: prevent reading unitialized links
      drm/amd/display: Fix DPIA outbox timeout after S3/S4/reset
      drm/i915: Fix error pointer dereference in i915_gem_do_execbuffer()
      drm/syncobj: Deal with signalled fences in drm_syncobj_find_fence.
      drm/ttm: fix ttm_bo_swapout
    torvalds committed Dec 10, 2021
  12. selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O

    Add an x86 selftest to verify that KVM doesn't WARN or otherwise explode
    if userspace modifies RCX during a userspace exit to handle string I/O.
    This is a regression test for a user-triggerable WARN introduced by
    commit 3b27de2 ("KVM: x86: split the two parts of emulator_pio_in").
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20211025201311.1881846-3-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Dec 10, 2021
  13. KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit

    Replace a WARN with a comment to call out that userspace can modify RCX
    during an exit to userspace to handle string I/O.  KVM doesn't actually
    support changing the rep count during an exit, i.e. the scenario can be
    ignored, but the WARN needs to go as it's trivial to trigger from
    userspace.
    
    Cc: stable@vger.kernel.org
    Fixes: 3b27de2 ("KVM: x86: split the two parts of emulator_pio_in")
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20211025201311.1881846-2-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Dec 10, 2021
  14. KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode

    In the SDM:
    If the logical processor is in 64-bit mode or if CR4.PCIDE = 1, an
    attempt to clear CR0.PG causes a general-protection exception (#GP).
    Software should transition to compatibility mode and clear CR4.PCIDE
    before attempting to disable paging.
    
    Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
    Message-Id: <20211207095230.53437-1-jiangshanlai@gmail.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Lai Jiangshan authored and bonzini committed Dec 10, 2021
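The SDM rule quoted in the "Raise #GP when clearing CR0_PG in 64 bit mode" message reduces to a small predicate. The sketch below is a plain-Python restatement of that quoted sentence, not KVM's code; the function and parameter names are made up for illustration.

```python
def clearing_cr0_pg_faults(old_pg: bool, new_pg: bool,
                           in_64bit_mode: bool, cr4_pcide: bool) -> bool:
    """True if a CR0 write should raise #GP under the quoted SDM rule."""
    clearing_pg = old_pg and not new_pg
    # Paging may not be disabled from 64-bit mode or while CR4.PCIDE = 1.
    return clearing_pg and (in_64bit_mode or cr4_pcide)
```

Per the quoted text, software that wants to disable paging first moves to compatibility mode and clears CR4.PCIDE, at which point the predicate returns false.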
  15. selftests: KVM: avoid failures due to reserved HyperTransport region

    AMD processors define an address range that is reserved by HyperTransport
    and causes a failure if used for guest physical addresses.  Avoid
    selftests failures by reserving those guest physical addresses; the
    rules are:
    
    - On parts with <40 bits, it's fully hidden from software.
    
    - Before Fam17h, it was always 12G just below 1T, even if there was more
    RAM above this location.  In this case we just do not use any RAM above 1T.
    
    - On Fam17h and later, it is variable based on SME, and is either just
    below 2^48 (no encryption) or 2^43 (encryption).
    
    Fixes: ef4c9f4 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
    Cc: stable@vger.kernel.org
    Cc: David Matlack <dmatlack@google.com>
    Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Message-Id: <20210805105423.412878-1-pbonzini@redhat.com>
    Reviewed-by: Sean Christopherson <seanjc@google.com>
    Tested-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    bonzini committed Dec 10, 2021
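The three HyperTransport rules in the selftests commit can be turned into a rough calculation of the highest usable guest frame number. This is a sketch under stated assumptions, not the selftest's code: 4 KiB pages, a 12 GiB reserved area on Fam17h and later as well (the message gives the size only for older parts), and a hypothetical `reduced_pa_bits` parameter standing in for the address bits lost to SME.

```python
PAGE_SHIFT = 12                        # assumption: 4 KiB pages
HT_PAGES = (12 << 30) >> PAGE_SHIFT    # 12 GiB reserved area, in pages


def max_usable_gfn(pa_bits: int, family: int, reduced_pa_bits: int = 0) -> int:
    """Highest guest frame number below the reserved HyperTransport area."""
    max_gfn = (1 << (pa_bits - PAGE_SHIFT)) - 1
    if pa_bits < 40:
        return max_gfn                 # area is fully hidden from software
    if family < 0x17:
        # 12G just below 1T; nothing above 1T is used either.
        ht_gfn = (1 << (40 - PAGE_SHIFT)) - HT_PAGES
    else:
        # Just below the top of the (possibly SME-reduced) address space.
        ht_gfn = (1 << (pa_bits - reduced_pa_bits - PAGE_SHIFT)) - HT_PAGES
    return min(max_gfn, ht_gfn - 1)
```

With `pa_bits=48`, passing `reduced_pa_bits=0` places the area just below 2^48 and `reduced_pa_bits=5` places it just below 2^43, matching the two cases named in the message.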
  16. KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req

    Do not bail early if there are no bits set in the sparse banks for a
    non-sparse, a.k.a. "all CPUs", IPI request.  Per the Hyper-V spec, it is
    legal to have a variable length of '0', e.g. VP_SET's BankContents in
    this case, if the request can be serviced without the extra info.
    
      It is possible that for a given invocation of a hypercall that does
      accept variable sized input headers that all the header input fits
      entirely within the fixed size header. In such cases the variable sized
      input header is zero-sized and the corresponding bits in the hypercall
      input should be set to zero.
    
    Bailing early results in KVM failing to send IPIs to all CPUs as expected
    by the guest.
    
    Fixes: 214ff83 ("KVM: x86: hyperv: implement PV IPI send hypercalls")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Message-Id: <20211207220926.718794-2-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Dec 10, 2021
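The early-bail bug described in the "Ignore sparse banks size" commit can be shown with a simplified model of target selection. This is an illustration only: `ipi_targets()` and its `bail_early` switch are hypothetical, and the banks are assumed to be contiguous 64-bit masks (bank i covers VP indices 64*i to 64*i+63), which is simpler than the real VP_SET layout.

```python
from typing import List, Set


def ipi_targets(num_vcpus: int, all_cpus: bool, bank_contents: List[int],
                bail_early: bool = False) -> Set[int]:
    """Return the set of vCPU indices that would receive the IPI."""
    if bail_early and not bank_contents:
        return set()                      # pre-fix behaviour: give up too soon
    if all_cpus:
        return set(range(num_vcpus))      # banks are irrelevant for "all CPUs"
    targets = set()
    for bank, mask in enumerate(bank_contents):
        for bit in range(64):
            vp = 64 * bank + bit
            if (mask >> bit) & 1 and vp < num_vcpus:
                targets.add(vp)
    return targets
```

With `bail_early=True`, an "all CPUs" request that carries no banks reaches nobody, which is the failure the message describes.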
  17. KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall
    
    Prior to commit 0baedd7 ("KVM: x86: make Hyper-V PV TLB flush use
    tlb_flush_guest()"), kvm_hv_flush_tlb() was using 'KVM_REQ_TLB_FLUSH |
    KVM_REQUEST_NO_WAKEUP' when making a request to flush TLBs on other vCPUs
    and KVM_REQ_TLB_FLUSH is/was defined as:
    
     (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
    
    so KVM_REQUEST_WAIT was lost. Hyper-V TLFS, however, requires that
    "This call guarantees that by the time control returns back to the
    caller, the observable effects of all flushes on the specified virtual
    processors have occurred." and without KVM_REQUEST_WAIT there's a small
    chance that the vCPU making the TLB flush will resume running before
    all IPIs get delivered to other vCPUs and a stale mapping can get read
    there.
    
    Fix the issue by adding KVM_REQUEST_WAIT flag to KVM_REQ_TLB_FLUSH_GUEST:
    kvm_hv_flush_tlb() is the sole caller which uses it for
    kvm_make_all_cpus_request()/kvm_make_vcpus_request_mask() where
    KVM_REQUEST_WAIT makes a difference.
    
    Cc: stable@kernel.org
    Fixes: 0baedd7 ("KVM: x86: make Hyper-V PV TLB flush use tlb_flush_guest()")
    Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Message-Id: <20211209102937.584397-1-vkuznets@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    vittyvk authored and bonzini committed Dec 10, 2021
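How KVM_REQUEST_WAIT went missing in the Hyper-V TLB flush path is plain bit arithmetic, sketched here in Python. The bit positions (low 8 bits for the request number, bit 8 for NO_WAKEUP, bit 9 for WAIT) and the request number 35 are assumptions chosen for the example; only the flag combinations come from the commit message.

```python
KVM_REQUEST_MASK = 0xFF            # assumption: low bits hold the request number
KVM_REQUEST_NO_WAKEUP = 1 << 8     # assumption: flag bit position
KVM_REQUEST_WAIT = 1 << 9          # assumption: flag bit position

# As quoted in the message: (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
KVM_REQ_TLB_FLUSH = 0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP

GUEST_FLUSH_NR = 35                # hypothetical request number
TLB_FLUSH_GUEST_BEFORE = GUEST_FLUSH_NR | KVM_REQUEST_NO_WAKEUP
TLB_FLUSH_GUEST_AFTER = GUEST_FLUSH_NR | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP


def waits_for_delivery(request: int) -> bool:
    """True if the requester blocks until the IPIs have been delivered."""
    return bool(request & KVM_REQUEST_WAIT)
```

The old call site OR-ed in only NO_WAKEUP and relied on KVM_REQ_TLB_FLUSH to carry WAIT, so switching to a request defined without WAIT dropped the guarantee silently.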
  18. Merge tag 'amd-drm-fixes-5.16-2021-12-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
    
    amd-drm-fixes-5.16-2021-12-08:
    
    amdgpu:
    - DPIA fix
    - eDP fix
    
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    From: Alex Deucher <alexander.deucher@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20211209042824.6720-1-alexander.deucher@amd.com
    airlied committed Dec 10, 2021
  19. Merge tag 'drm-intel-fixes-2021-12-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
    
    A fix to an error pointer dereference in gem_execbuffer and
    a fix for GT initialization when GuC/HuC are used on ICL.
    
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    
    From: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/YbJVWYAd/jeERCYY@intel.com
    airlied committed Dec 10, 2021
  20. Merge tag 'drm-misc-fixes-2021-12-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
    
    A fix in syncobj to better handle already-signalled fences, and a fix for
    a ttm_bo_swapout eviction check.
    
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    
    From: Maxime Ripard <maxime@cerno.tech>
    Link: https://patchwork.freedesktop.org/patch/msgid/20211209124305.gxhid5zwf7m4oasn@houat
    airlied committed Dec 10, 2021

Commits on Dec 9, 2021

  1. Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
    
    Pull rdma fixes from Jason Gunthorpe:
     "Quite a few small bug fixes old and new, also Doug Ledford is retiring
      now, we thank him for his work. Details:
    
       - Use after free in rxe
    
       - mlx5 DM regression
    
       - hns bugs triggered by device reset
    
       - Two fixes for CONFIG_DEBUG_PREEMPT
    
       - Several longstanding corner case bugs in hfi1
    
       - Two irdma data path bugs in rare cases and some memory issues"
    
    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
      RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ
      RDMA/irdma: Report correct WC errors
      RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()'
      RDMA/irdma: Fix a user-after-free in add_pble_prm
      IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr
      IB/hfi1: Fix early init panic
      IB/hfi1: Insure use of smp_processor_id() is preempt disabled
      IB/hfi1: Correct guard on eager buffer deallocation
      RDMA/rtrs: Call {get,put}_cpu_ptr to silence a debug kernel warning
      RDMA/hns: Do not destroy QP resources in the hw resetting phase
      RDMA/hns: Do not halt commands during reset until later
      Remove Doug Ledford from MAINTAINERS
      RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow
      RDMA: Fix use-after-free in rxe_queue_cleanup
    torvalds committed Dec 9, 2021
  2. Merge tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
    
    Pull networking fixes from Jakub Kicinski:
     "Including fixes from bpf, can and netfilter.
    
      Current release - regressions:
    
       - bpf, sockmap: re-evaluate proto ops when psock is removed from
         sockmap
    
      Current release - new code bugs:
    
       - bpf: fix bpf_check_mod_kfunc_call for built-in modules
    
       - ice: fixes for TC classifier offloads
    
       - vrf: don't run conntrack on vrf with !dflt qdisc
    
      Previous releases - regressions:
    
       - bpf: fix the off-by-two error in range markings
    
       - seg6: fix the iif in the IPv6 socket control block
    
       - devlink: fix netns refcount leak in devlink_nl_cmd_reload()
    
       - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
    
       - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
    
      Previous releases - always broken:
    
       - ethtool: do not perform operations on net devices being
         unregistered
    
       - udp: use datalen to cap max gso segments
    
       - ice: fix races in stats collection
    
       - fec: only clear interrupt of handling queue in fec_enet_rx_queue()
    
       - m_can: pci: fix incorrect reference clock rate
    
       - m_can: disable and ignore ELO interrupt
    
       - mvpp2: fix XDP rx queues registering
    
      Misc:
    
       - treewide: add missing includes masked by cgroup -> bpf.h
         dependency"
    
    * tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
      net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
      net: wwan: iosm: fixes unable to send AT command during mbim tx
      net: wwan: iosm: fixes net interface nonfunctional after fw flash
      net: wwan: iosm: fixes unnecessary doorbell send
      net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering
      MAINTAINERS: s390/net: remove myself as maintainer
      net/sched: fq_pie: prevent dismantle issue
      net: mana: Fix memory leak in mana_hwc_create_wq
      seg6: fix the iif in the IPv6 socket control block
      nfp: Fix memory leak in nfp_cpp_area_cache_add()
      nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done
      nfc: fix segfault in nfc_genl_dump_devices_done
      udp: using datalen to cap max gso segments
      net: dsa: mv88e6xxx: error handling for serdes_power functions
      can: kvaser_usb: get CAN clock frequency from device
      can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter
      net: mvpp2: fix XDP rx queues registering
      vmxnet3: fix minimum vectors alloc issue
      net, neigh: clear whole pneigh_entry at alloc time
      net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
      ...
    torvalds committed Dec 9, 2021
  3. Merge tag 'mtd/fixes-for-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
    
    Pull mtd fixes from Miquel Raynal:
     "MTD fixes:
    
       - dataflash: Add device-tree SPI IDs to avoid new warnings
    
      Raw NAND fixes:
    
       - Fix nand_choose_best_timings() on unsupported interface
    
       - Fix nand_erase_op delay (wrong unit)
    
       - fsmc:
          - Fix timing computation
          - Take instruction delay into account
    
       - denali:
          - Add the dependency on HAS_IOMEM to silence robots"
    
    * tag 'mtd/fixes-for-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
      mtd: dataflash: Add device-tree SPI IDs
      mtd: rawnand: fsmc: Fix timing computation
      mtd: rawnand: fsmc: Take instruction delay into account
      mtd: rawnand: Fix nand_choose_best_timings() on unsupported interface
      mtd: rawnand: Fix nand_erase_op delay
      mtd: rawnand: denali: Add the dependency on HAS_IOMEM
    torvalds committed Dec 9, 2021
  4. Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
    
    Pull HID fixes from Jiri Kosina:
    
     - fixes for various drivers which assume that a HID device is on USB
       transport, but that might not necessarily be the case, as the device
       can be faked by uhid. (Greg, Benjamin Tissoires)
    
     - fix for spurious wakeups on certain Lenovo notebooks (Thomas
       Weißschuh)
    
     - a few other device-specific quirks
    
    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
      HID: Ignore battery for Elan touchscreen on Asus UX550VE
      HID: intel-ish-hid: ipc: only enable IRQ wakeup when requested
      HID: google: add eel USB id
      HID: add USB_HID dependancy to hid-prodikeys
      HID: add USB_HID dependancy to hid-chicony
      HID: bigbenff: prevent null pointer dereference
      HID: sony: fix error path in probe
      HID: add USB_HID dependancy on some USB HID drivers
      HID: check for valid USB device for many HID drivers
      HID: wacom: fix problems when device is not a valid USB device
      HID: add hid_is_usb() function to make it simpler for USB detection
      HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
    torvalds committed Dec 9, 2021
  5. aio: Fix incorrect usage of eventfd_signal_allowed()

    We should defer eventfd_signal() to the workqueue when
    eventfd_signal_allowed() returns false rather than when it returns
    true.
    
    Fixes: b542e38 ("eventfd: Make signal recursion protection a task bit")
    Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
    Link: https://lore.kernel.org/r/20210913111928.98-1-xieyongji@bytedance.com
    Reviewed-by: Eric Biggers <ebiggers@google.com>
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    YongjiXie authored and ebiggers committed Dec 9, 2021
  6. aio: fix use-after-free due to missing POLLFREE handling

    signalfd_poll() and binder_poll() are special in that they use a
    waitqueue whose lifetime is the current task, rather than the struct
    file as is normally the case.  This is okay for blocking polls, since a
    blocking poll occurs within one task; however, non-blocking polls
    require another solution.  This solution is for the queue to be cleared
    before it is freed, by sending a POLLFREE notification to all waiters.
    
    Unfortunately, only eventpoll handles POLLFREE.  A second type of
    non-blocking poll, aio poll, was added in kernel v4.18, and it doesn't
    handle POLLFREE.  This allows a use-after-free to occur if a signalfd or
    binder fd is polled with aio poll, and the waitqueue gets freed.
    
    Fix this by making aio poll handle POLLFREE.
    
    A patch by Ramji Jiyani <ramjiyani@google.com>
    (https://lore.kernel.org/r/20211027011834.2497484-1-ramjiyani@google.com)
    tried to do this by making aio_poll_wake() always complete the request
    inline if POLLFREE is seen.  However, that solution had two bugs.
    First, it introduced a deadlock, as it unconditionally locked the aio
    context while holding the waitqueue lock, which inverts the normal
    locking order.  Second, it didn't consider that POLLFREE notifications
    are missed while the request has been temporarily de-queued.
    
    The second problem was solved by my previous patch.  This patch then
    properly fixes the use-after-free by handling POLLFREE in a
    deadlock-free way.  It does this by taking advantage of the fact that
    freeing of the waitqueue is RCU-delayed, similar to what eventpoll does.
    
    Fixes: 2c14fa8 ("aio: implement IOCB_CMD_POLL")
    Cc: <stable@vger.kernel.org> # v4.18+
    Link: https://lore.kernel.org/r/20211209010455.42744-6-ebiggers@kernel.org
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    ebiggers committed Dec 9, 2021
  7. aio: keep poll requests on waitqueue until completed

    Currently, aio_poll_wake() will always remove the poll request from the
    waitqueue.  Then, if aio_poll_complete_work() sees that none of the
    polled events are ready and the request isn't cancelled, it re-adds the
    request to the waitqueue.  (This can easily happen when polling a file
    that doesn't pass an event mask when waking up its waitqueue.)
    
    This is fundamentally broken for two reasons:
    
      1. If a wakeup occurs between vfs_poll() and the request being
         re-added to the waitqueue, it will be missed because the request
         wasn't on the waitqueue at the time.  Therefore, IOCB_CMD_POLL
         might never complete even if the polled file is ready.
    
      2. When the request isn't on the waitqueue, there is no way to be
         notified that the waitqueue is being freed (which happens when its
         lifetime is shorter than the struct file's).  This is supposed to
         happen via the waitqueue entries being woken up with POLLFREE.
    
    Therefore, leave the requests on the waitqueue until they are actually
    completed (or cancelled).  To keep track of when aio_poll_complete_work
    needs to be scheduled, use new fields in struct poll_iocb.  Remove the
    'done' field which is now redundant.
    
    Note that this is consistent with how sys_poll() and eventpoll work;
    their wakeup functions do *not* remove the waitqueue entries.
    
    Fixes: 2c14fa8 ("aio: implement IOCB_CMD_POLL")
    Cc: <stable@vger.kernel.org> # v4.18+
    Link: https://lore.kernel.org/r/20211209010455.42744-5-ebiggers@kernel.org
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    ebiggers committed Dec 9, 2021
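
The bookkeeping described in the "keep poll requests on waitqueue" commit above can be sketched as a small single-threaded userspace model. This is not the kernel's code: the struct, its field names, and both functions are hypothetical stand-ins, assuming one flag for "work is scheduled" and one for "a wakeup arrived while work was already scheduled". The point it illustrates is that the request stays queued until it completes, so a wakeup that races with the work item is recorded rather than lost.

```c
#include <stdbool.h>

/* Hypothetical model of a poll request; not the kernel's struct poll_iocb. */
struct model_poll_req {
    bool on_waitqueue;       /* stays true until the request completes */
    bool work_scheduled;     /* completion work is pending or running */
    bool work_need_resched;  /* a wakeup arrived while work was scheduled */
    bool completed;
};

/* Wakeup path: never dequeues the request, only records that work is needed. */
static void model_wake(struct model_poll_req *r)
{
    if (!r->on_waitqueue || r->completed)
        return;
    if (r->work_scheduled)
        r->work_need_resched = true;   /* remember the racing wakeup */
    else
        r->work_scheduled = true;
}

/*
 * Work path: 'events_ready' stands in for what a vfs_poll() call would
 * report.  Returns true once the request has completed.
 */
static bool model_complete_work(struct model_poll_req *r, bool events_ready)
{
    if (events_ready) {
        r->on_waitqueue = false;       /* dequeue only on completion */
        r->work_scheduled = false;
        r->completed = true;
        return true;
    }
    if (r->work_need_resched)
        r->work_need_resched = false;  /* work stays scheduled and runs again */
    else
        r->work_scheduled = false;     /* wait for the next wakeup */
    return false;
}
```

In this model a second wakeup that lands before the work item finishes leaves work_scheduled set, so the request is re-checked instead of sleeping through the event.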
  8. signalfd: use wake_up_pollfree()

    wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up
    all exclusive waiters.  Yet, POLLFREE *must* wake up all waiters.  epoll
    and aio poll are fortunately not affected by this, but it's very
    fragile.  Thus, the new function wake_up_pollfree() has been introduced.
    
    Convert signalfd to use wake_up_pollfree().
    
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Fixes: d80e731 ("epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20211209010455.42744-4-ebiggers@kernel.org
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    ebiggers committed Dec 9, 2021
  9. binder: use wake_up_pollfree()

    wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up
    all exclusive waiters.  Yet, POLLFREE *must* wake up all waiters.  epoll
    and aio poll are fortunately not affected by this, but it's very
    fragile.  Thus, the new function wake_up_pollfree() has been introduced.
    
    Convert binder to use wake_up_pollfree().
    
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Fixes: f5cb779 ("ANDROID: binder: remove waitqueue when thread exits.")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20211209010455.42744-3-ebiggers@kernel.org
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    ebiggers committed Dec 9, 2021
  10. wait: add wake_up_pollfree()

    Several ->poll() implementations are special in that they use a
    waitqueue whose lifetime is the current task, rather than the struct
    file as is normally the case.  This is okay for blocking polls, since a
    blocking poll occurs within one task; however, non-blocking polls
    require another solution.  This solution is for the queue to be cleared
    before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'.
    
    However, that has a bug: wake_up_poll() calls __wake_up() with
    nr_exclusive=1.  Therefore, if there are multiple "exclusive" waiters,
    and the wakeup function for the first one returns a positive value, only
    that one will be called.  That's *not* what's needed for POLLFREE;
    POLLFREE is special in that it really needs to wake up everyone.
    
    Considering the three non-blocking poll systems:
    
    - io_uring poll doesn't handle POLLFREE at all, so it is broken anyway.
    
    - aio poll is unaffected, since it doesn't support exclusive waits.
      However, that's fragile, as someone could add this feature later.
    
    - epoll doesn't appear to be broken by this, since its wakeup function
      returns 0 when it sees POLLFREE.  But this is fragile.
    
    Although there is a workaround (see epoll), it's better to define a
    function which always sends POLLFREE to all waiters.  Add such a
    function.  Also make it verify that the queue really becomes empty after
    all waiters have been woken up.
    
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20211209010455.42744-2-ebiggers@kernel.org
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    ebiggers committed Dec 9, 2021
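
The nr_exclusive problem described in the wake_up_pollfree() commit above can be illustrated with a userspace model of the wakeup loop. This is a sketch, not the kernel's implementation: every name prefixed with model_ or MODEL_ is hypothetical, and the loop only assumes the behaviour stated in the commit message (the walk stops once nr_exclusive exclusive waiters report a successful wakeup, and the POLLFREE variant checks that the queue ends up empty).

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; these names are hypothetical, not kernel API. */
#define MODEL_WQ_EXCLUSIVE 0x1

struct model_waiter {
    unsigned int flags;
    int woken;     /* set once the wake callback has run */
    int wake_ret;  /* value the wake callback returns */
    int on_queue;  /* 1 while the waiter is still queued */
};

/* A POLLFREE-style wakeup is expected to dequeue the waiter as well. */
static int model_wake_fn(struct model_waiter *w)
{
    w->woken = 1;
    w->on_queue = 0;
    return w->wake_ret;
}

/*
 * Stop after nr_exclusive exclusive waiters return a positive value.
 * nr_exclusive == 0 never reaches zero by decrementing, so it means
 * "no limit".  Returns the number of callbacks that ran.
 */
static int model_wake_up(struct model_waiter *q, size_t n, int nr_exclusive)
{
    int called = 0;

    for (size_t i = 0; i < n; i++) {
        if (!q[i].on_queue)
            continue;
        int ret = model_wake_fn(&q[i]);
        called++;
        if (ret > 0 && (q[i].flags & MODEL_WQ_EXCLUSIVE) && !--nr_exclusive)
            break;
    }
    return called;
}

/* Wake everyone, then verify that the queue really became empty. */
static int model_wake_up_pollfree(struct model_waiter *q, size_t n)
{
    int called = model_wake_up(q, n, 0);

    for (size_t i = 0; i < n; i++)
        assert(!q[i].on_queue);
    return called;
}
```

With three exclusive waiters whose callbacks return a positive value, model_wake_up(q, 3, 1) runs only the first callback and leaves the other two queued, while model_wake_up_pollfree(q, 3) runs all three.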
  11. Merge tag 'netfs-fixes-20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

    Pull netfslib fixes from David Howells:
    
     - Fix a lockdep warning and potential deadlock. This takes the
       simple approach of offloading the write-to-cache done from within a
       network filesystem read to a worker thread to avoid taking the
       sb_writer lock from the cache backing filesystem whilst holding the
       mmap lock on an inode from the network filesystem.
    
       Jan Kara posits a scenario whereby this can cause deadlock[1], though
       it's quite complex and I think requires someone in userspace to
       actually do I/O on the cache files. Matthew Wilcox isn't so certain,
       though[2].
    
       An alternative way to fix this, suggested by Darrick Wong, might be
       to allow cachefiles to prevent userspace from performing I/O upon the
       file - something like an exclusive open - but that's beyond the scope
       of a fix here if we do want to make such a facility in the future.
    
     - In some of the error handling paths where netfs_ops->cleanup() is
       called, the arguments are transposed[3]. gcc doesn't complain because
       one of the parameters is void* and one of the values is void*.
    
    Link: https://lore.kernel.org/r/20210922110420.GA21576@quack2.suse.cz/ [1]
    Link: https://lore.kernel.org/r/Ya9eDiFCE2fO7K/S@casper.infradead.org/ [2]
    Link: https://lore.kernel.org/r/20211207031449.100510-1-jefflexu@linux.alibaba.com/ [3]
    
    * tag 'netfs-fixes-20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
      netfs: fix parameter of cleanup()
      netfs: Fix lockdep warning from taking sb_writers whilst holding mmap_lock
    torvalds committed Dec 9, 2021
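
The second netfs item above (transposed arguments that the compiler accepts) can be reproduced in a few lines of plain C. This is an illustration under assumptions, not the netfs code: the struct types, the callback, and the global used to observe the call are all hypothetical. It relies only on the C rule that void * converts implicitly to and from any object pointer type, so swapping a void * argument with a typed pointer argument produces no diagnostic.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real mapping and private-data types. */
struct model_mapping { int id; };
struct model_priv { int cookie; };

/* Records which pointer arrived in the 'mapping' slot. */
static const void *model_seen_mapping;

static void model_cleanup(struct model_mapping *mapping, void *priv)
{
    model_seen_mapping = mapping;
    (void)priv;
}

/* Correct order: typed mapping first, opaque private data second. */
static void model_call_correct(struct model_mapping *m, void *priv)
{
    model_cleanup(m, priv);
}

/*
 * Transposed order: still compiles cleanly as C, because 'priv' (void *)
 * converts to struct model_mapping * and 'm' converts to void *.
 */
static void model_call_transposed(struct model_mapping *m, void *priv)
{
    model_cleanup(priv, m);
}
```

After model_call_transposed(), model_seen_mapping holds the private-data pointer rather than the mapping, which is the kind of silent mix-up the fix addresses. Giving the private data a distinct typed pointer instead of void * would let the compiler catch the swap.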
  12. tracing: Fix possible memory leak in __create_synth_event() error path

    There are error paths in __create_synth_event(), after argv is
    allocated, that fail to free it. Add a jump to free it when necessary.
    
    Link: https://lkml.kernel.org/r/20211209024317.11783-1-linmq006@gmail.com
    
    Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
    [ Fixed up the patch and change log ]
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Yuuoniy authored and rostedt committed Dec 9, 2021
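
The fix pattern named in the tracing commit above, jumping to a label that frees the allocation, can be sketched in userspace C. This is a generic illustration and not the code of __create_synth_event(): the function, the label name, the allocation counter, and the fail_late parameter are hypothetical and exist only to make the leak observable.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocation counter so a leak can be observed. */
static int model_live_allocs;

static void *model_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        model_live_allocs++;
    return p;
}

static void model_free(void *p)
{
    if (p) {
        model_live_allocs--;
        free(p);
    }
}

/*
 * Copies 'fields' into a temporary buffer, then optionally fails.
 * Every exit taken after the allocation goes through err_free_arg,
 * so the buffer is released on both the error and success paths.
 */
static int model_create_event(const char *fields, int fail_late)
{
    int ret = 0;
    char *argv = model_alloc(strlen(fields) + 1);

    if (!argv)
        return -1;
    strcpy(argv, fields);

    if (fail_late) {
        ret = -22;              /* arbitrary error code for the model */
        goto err_free_arg;      /* a bare 'return ret' here would leak argv */
    }

err_free_arg:
    model_free(argv);
    return ret;
}
```

Calling model_create_event() with fail_late set returns the error and leaves model_live_allocs at zero, whereas replacing the goto with a direct return would leave it at one.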
  13. ftrace/samples: Add module to test multi direct modify interface

    Add the ftrace-direct-multi-modify.ko kernel module, which uses the
    modify_ftrace_direct_multi API. The core functionality is taken
    from the ftrace-direct-modify.ko kernel module and changed to fit
    the multi direct interface.
    
    The init function creates a kthread that periodically calls
    modify_ftrace_direct_multi to change the trampoline address
    for the direct ftrace_ops. The ftrace trace_pipe then shows
    traces from both trampolines.
    
    Link: https://lkml.kernel.org/r/20211206182032.87248-4-jolsa@kernel.org
    
    Cc: Ingo Molnar <mingo@redhat.com>
    Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
    Tested-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Jiri Olsa authored and rostedt committed Dec 9, 2021