Commits on Jan 11, 2022

  1. ceph: add getvxattr op

    Problem:
    Directory vxattrs like ceph.dir.pin* and ceph.dir.layout* may not be
    propagated to the client frequently enough to keep them up to date.
    This creates vxattr availability problems.
    
    Solution:
    Add a new getvxattr op to fetch ceph.dir.pin*, ceph.dir.layout* and
    ceph.file.layout* vxattrs.
    If the entire layout for a dir or a file is being set, then it is
    expected that the layout be set in standard JSON format. Individual
    field value retrieval is not wrapped in JSON. The JSON format also
    applies while setting the vxattr if the entire layout is being set in
    one go.
    As a temporary measure, setting a vxattr can also be done in the old
    format. The old format will be deprecated in the future.
    
    URL: https://tracker.ceph.com/issues/51062
    Signed-off-by: Milind Changire <mchangir@redhat.com>
    energon0 authored and intel-lab-lkp committed Jan 11, 2022
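
A minimal sketch of how a client-side tool might accept both value formats mentioned in the getvxattr commit above. The old format shown is the space-separated key=value form; the JSON key names are an assumption (reused from the old format), since the commit message does not list them, and `parse_layout` is a hypothetical helper.

```python
import json


def parse_layout(value: str) -> dict:
    """Parse a layout vxattr value in JSON or old key=value form (hypothetical helper)."""
    text = value.strip()
    if text.startswith("{"):
        # Whole-layout value wrapped in JSON.
        return json.loads(text)
    layout = {}
    for pair in text.split():
        key, _, raw = pair.partition("=")
        # Numeric fields become ints; names such as the pool stay strings.
        layout[key] = int(raw) if raw.isdigit() else raw
    return layout


old_style = "stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data"
# Assumed JSON rendering of the same layout; the real key names may differ.
json_style = json.dumps(parse_layout(old_style))
print(parse_layout(json_style))
```

On a mounted CephFS the raw value could be read with `os.getxattr(path, "ceph.dir.layout")` and decoded before being passed to such a helper.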

Commits on Dec 1, 2021

  1. ceph: fix up non-directory creation in SGID directories

    Ceph always inherits the SGID bit if it is set on the parent inode,
    while the generic inode_init_owner does not do this in a few cases where
    it can create a possible security problem (cf. [1]).
    
    Update ceph to strip the SGID bit just as inode_init_owner would.
    
    This bug was detected by the mapped mount testsuite in [3]. The
    testsuite tests all core VFS functionality and semantics with and
    without mapped mounts. That is to say it functions as a generic VFS
    testsuite in addition to a mapped mount testsuite. While working on
    mapped mount support for ceph, SGID inheritance was the only failing
    test for ceph after the port.
    
    The same bug was detected by the mapped mount testsuite in XFS in
    January 2021 (cf. [2]).
    
    [1]: commit 0fa3ecd ("Fix up non-directory creation in SGID directories")
    [2]: commit 01ea173 ("xfs: fix up non-directory creation in SGID directories")
    [3]: https://git.kernel.org/fs/xfs/xfstests-dev.git
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    brauner authored and idryomov committed Dec 1, 2021
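
The SGID rule that the commit above refers to can be modelled in user space roughly as follows. This is a sketch based on my reading of the generic helper's behaviour, not kernel code; `in_group` and `has_fsetid` are hypothetical stand-ins for the group-membership and CAP_FSETID checks.

```python
import stat


def init_mode(dir_mode: int, mode: int, in_group: bool, has_fsetid: bool) -> int:
    """Return the mode a new inode would get under a parent with dir_mode."""
    if dir_mode & stat.S_ISGID:
        sgid_exec = stat.S_ISGID | stat.S_IXGRP
        if stat.S_ISDIR(mode):
            # New directories always inherit SGID from the parent.
            mode |= stat.S_ISGID
        elif (mode & sgid_exec) == sgid_exec and not in_group and not has_fsetid:
            # Group-executable non-directories lose SGID for unprivileged
            # creators who are not members of the owning group.
            mode &= ~stat.S_ISGID
    return mode


parent = stat.S_IFDIR | 0o2775
print(oct(init_mode(parent, stat.S_IFREG | 0o2755, in_group=False, has_fsetid=False)))
```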
  2. ceph: initialize pathlen variable in reconnect_caps_cb

    The smatch static checker warned about an uninitialized symbol usage in
    this function, in the case where ceph_mdsc_build_path returns an error.
    
    It turns out that that case is harmless, but it just looks sketchy.
    Initialize the variable at declaration time, and remove the unneeded
    setting of it later.
    
    Fixes: a33f643 ("ceph: encode inodes' parent/d_name in cap reconnect message")
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Xiubo Li <xiubli@redhat.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    lxbsz authored and idryomov committed Dec 1, 2021
  3. ceph: initialize i_size variable in ceph_sync_read

    Newer compilers seem to determine that this variable being uninitialized
    isn't a problem, but older compilers (from the RHEL8 era) seem to choke
    on it and complain that it could be used uninitialized.
    
    Go ahead and initialize the variable at declaration time to silence
    potential compiler warnings.
    
    Fixes: c3d8e0b ("ceph: return the real size read when it hits EOF")
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Xiubo Li <xiubli@redhat.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    jtlayton authored and idryomov committed Dec 1, 2021
  4. ceph: fix duplicate increment of opened_inodes metric

    opened_inodes is incremented twice when the same inode is opened twice
    with O_RDONLY and O_WRONLY respectively.
    
    To reproduce, run this python script, then check the metrics:
    
    import os
    for _ in range(10000):
        fd_r = os.open('a', os.O_RDONLY)
        fd_w = os.open('a', os.O_WRONLY)
        os.close(fd_r)
        os.close(fd_w)
    
    Fixes: 1dd8d47 ("ceph: metrics for opened files, pinned caps and opened inodes")
    Signed-off-by: Hu Weiwen <sehuww@mail.scut.edu.cn>
    Reviewed-by: Xiubo Li <xiubli@redhat.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    huww98 authored and idryomov committed Dec 1, 2021
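
A toy model of the opened_inodes accounting described in the commit above. It is not the kernel code: the "buggy" branch is only one plausible way a per-mode check can double count, and `count_opened_inodes` is a hypothetical name.

```python
MODES = ("rd", "wr")


def count_opened_inodes(opens, fixed):
    """Replay opens of ONE inode and return the resulting opened_inodes value."""
    nr_by_mode = dict.fromkeys(MODES, 0)
    opened_inodes = 0
    for mode in opens:
        if fixed:
            # Fixed: the inode counts as opened if ANY mode already holds a ref.
            already_opened = any(nr_by_mode.values())
        else:
            # Buggy model: only the mode being opened is consulted.
            already_opened = nr_by_mode[mode] > 0
        nr_by_mode[mode] += 1
        if not already_opened:
            opened_inodes += 1
    return opened_inodes


print(count_opened_inodes(["rd", "wr"], fixed=False))
print(count_opened_inodes(["rd", "wr"], fixed=True))
```

With the same inode opened once read-only and once write-only, the buggy model counts it twice while the fixed model counts it once.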

Commits on Nov 28, 2021

  1. Linux 5.16-rc3

    torvalds committed Nov 28, 2021
  2. Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/gi…

    …t/mst/vhost
    
    Pull vhost,virtio,vdpa bugfixes from Michael Tsirkin:
     "Misc fixes all over the place.
    
      Revert of virtio used length validation series: the approach taken
      does not seem to work, breaking too many guests in the process. We'll
      need to do length validation using some other approach"
    
    [ This merge also ends up reverting commit f7a36b0 ("vsock/virtio:
      suppress used length validation"), which came in through the
      networking tree in the meantime, and was part of that whole used
      length validation series   - Linus ]
    
    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
      vdpa_sim: avoid putting an uninitialized iova_domain
      vhost-vdpa: clean irqs before reseting vdpa device
      virtio-blk: modify the value type of num in virtio_queue_rq()
      vhost/vsock: cleanup removing `len` variable
      vhost/vsock: fix incorrect used length reported to the guest
      Revert "virtio_ring: validate used buffer length"
      Revert "virtio-net: don't let virtio core to validate used length"
      Revert "virtio-blk: don't let virtio core to validate used length"
      Revert "virtio-scsi: don't let virtio core to validate used buffer length"
    torvalds committed Nov 28, 2021
  3. Merge tag 'x86-urgent-2021-11-28' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/tip/tip
    
    Pull x86 build fix from Thomas Gleixner:
     "A single fix for a missing __init annotation of prepare_command_line()"
    
    * tag 'x86-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      x86/boot: Mark prepare_command_line() __init
    torvalds committed Nov 28, 2021
  4. Merge tag 'sched-urgent-2021-11-28' of git://git.kernel.org/pub/scm/l…

    …inux/kernel/git/tip/tip
    
    Pull scheduler fix from Thomas Gleixner:
     "A single scheduler fix to ensure that there is no stale KASAN shadow
      state left on the idle task's stack when a CPU is brought up after it
      was brought down before"
    
    * tag 'sched-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      sched/scs: Reset task stack state in bringup_cpu()
    torvalds committed Nov 28, 2021
  5. Merge tag 'perf-urgent-2021-11-28' of git://git.kernel.org/pub/scm/li…

    …nux/kernel/git/tip/tip
    
    Pull perf fix from Thomas Gleixner:
     "A single fix for perf to prevent it from sending SIGTRAP to another
      task from a trace point event as it's not possible to deliver a
      synchronous signal to a different task from there"
    
    * tag 'perf-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      perf: Ignore sigtrap for tracepoints destined for other tasks
    torvalds committed Nov 28, 2021
  6. Merge tag 'locking-urgent-2021-11-28' of git://git.kernel.org/pub/scm…

    …/linux/kernel/git/tip/tip
    
    Pull locking fixes from Thomas Gleixner:
     "Two regression fixes for reader writer semaphores:
    
       - Plug a race in the lock handoff which is caused by inconsistency of
         the reader and writer path and can lead to corruption of the
         underlying counter.
    
       - down_read_trylock() is suboptimal when the lock is contended and
         multiple readers trylock concurrently. That's due to the initial
         value being read non-atomically which results in at least two
         compare exchange loops. Making the initial readout atomic reduces
         this significantly. With 40 readers by 11% in a benchmark which
         enforces contention on mmap_sem"
    
    * tag 'locking-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      locking/rwsem: Optimize down_read_trylock() under highly contended case
      locking/rwsem: Make handoff bit handling more consistent
    torvalds committed Nov 28, 2021
  7. Merge tag 'trace-v5.16-rc2-3' of git://git.kernel.org/pub/scm/linux/k…

    …ernel/git/rostedt/linux-trace
    
    Pull another tracing fix from Steven Rostedt:
     "Fix the fix of pid filtering
    
      The setting of the pid filtering flag tested the "trace only this pid"
      case twice, and ignored the "trace everything but this pid" case.
    
      The 5.15 kernel does things a little differently due to the new sparse
      pid mask introduced in 5.16, and as the bug was discovered running the
      5.15 kernel, and the first fix was initially done for that kernel,
      that fix handled both cases (only pid and all but pid), but the
      forward port to 5.16 created this bug"
    
    * tag 'trace-v5.16-rc2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
      tracing: Test the 'Do not trace this pid' case in create event
    torvalds committed Nov 28, 2021
  8. Merge tag 'iommu-fixes-v5.16-rc2' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/joro/iommu
    
    Pull iommu fixes from Joerg Roedel:
    
     - Intel VT-d fixes:
         - Remove unused PASID_DISABLED
         - Fix RCU locking
         - Fix for the unmap_pages call-back
    
     - Rockchip RK3568 address mask fix
    
     - AMD IOMMUv2 log message clarification
    
    * tag 'iommu-fixes-v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
      iommu/vt-d: Fix unmap_pages support
      iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock()
      iommu/rockchip: Fix PAGE_DESC_HI_MASKs for RK3568
      iommu/amd: Clarify AMD IOMMUv2 initialization messages
      iommu/vt-d: Remove unused PASID_DISABLED
    torvalds committed Nov 28, 2021

Commits on Nov 27, 2021

  1. Merge tag '5.16-rc2-ksmbd-fixes' of git://git.samba.org/ksmbd

    Pull ksmbd fixes from Steve French:
     "Five ksmbd server fixes, four of them for stable:
    
       - memleak fix
    
       - fix for default data stream on filesystems that don't support xattr
    
       - error logging fix
    
       - session setup fix
    
       - minor doc cleanup"
    
    * tag '5.16-rc2-ksmbd-fixes' of git://git.samba.org/ksmbd:
      ksmbd: fix memleak in get_file_stream_info()
      ksmbd: contain default data stream even if xattr is empty
      ksmbd: downgrade addition info error msg to debug in smb2_get_info_sec()
      docs: filesystem: cifs: ksmbd: Fix small layout issues
      ksmbd: Fix an error handling path in 'smb2_sess_setup()'
    torvalds committed Nov 27, 2021
  2. vmxnet3: Use generic Kconfig option for page size limit

    Use the architecture independent Kconfig option PAGE_SIZE_LESS_THAN_64KB
    to indicate that VMXNET3 requires a page size smaller than 64kB.
    
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    groeck authored and torvalds committed Nov 27, 2021
  3. fs: ntfs: Limit NTFS_RW to page sizes smaller than 64k

    NTFS_RW code allocates page size dependent arrays on the stack. This
    results in build failures if the page size is 64k or larger.
    
      fs/ntfs/aops.c: In function 'ntfs_write_mst_block':
      fs/ntfs/aops.c:1311:1: error:
    	the frame size of 2240 bytes is larger than 2048 bytes
    
    Since commit f22969a ("powerpc/64s: Default to 64K pages for 64 bit
    book3s") this affects ppc:allmodconfig builds, but other architectures
    supporting page sizes of 64k or larger are also affected.
    
    Increasing the maximum frame size for affected architectures just to
    silence this error does not really help.  The frame size would have to
    be set to a really large value for 256k pages.  Also, a large frame size
    could potentially result in stack overruns in this code and elsewhere
    and is therefore not desirable.  Make NTFS_RW dependent on page sizes
    smaller than 64k instead.
    
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Cc: Anton Altaparmakov <anton@tuxera.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    groeck authored and torvalds committed Nov 27, 2021
  4. arch: Add generic Kconfig option indicating page size smaller than 64k

    NTFS_RW and VMXNET3 require a page size smaller than 64kB.  Add generic
    Kconfig option for use outside architecture code to avoid architecture
    specific Kconfig options in that code.
    
    Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Cc: Anton Altaparmakov <anton@tuxera.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    groeck authored and torvalds committed Nov 27, 2021
  5. tracing: Test the 'Do not trace this pid' case in create event

    When creating a new event (via a module, kprobe, eprobe, etc), the
    descriptors that are created must add flags for pid filtering if an
    instance has pid filtering enabled, as the flags are used at the time the
    event is executed to know if pid filtering should be done or not.
    
    The "Only trace this pid" case was added, but a cut and paste error caused
    that case to be checked twice, instead of checking the "Trace all but this
    pid" case.
    
    Link: https://lore.kernel.org/all/202111280401.qC0z99JB-lkp@intel.com/
    
    Fixes: 6cb2065 ("tracing: Check pid filtering when creating events")
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Nov 27, 2021
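
The cut-and-paste error described in the commit above can be illustrated with a small model. The function and parameter names are hypothetical; `None` stands for "no filter configured".

```python
def needs_pid_filter_buggy(filtered_pids, filtered_no_pids):
    """Buggy variant: the 'only trace this pid' list is consulted twice."""
    pid_list = filtered_pids
    no_pid_list = filtered_pids  # cut-and-paste slip: should be filtered_no_pids
    return pid_list is not None or no_pid_list is not None


def needs_pid_filter_fixed(filtered_pids, filtered_no_pids):
    """Fixed variant: both the 'only' and the 'all but' lists are consulted."""
    return filtered_pids is not None or filtered_no_pids is not None


# An instance that only has a "trace all but this pid" filter configured.
print(needs_pid_filter_buggy(None, {42}), needs_pid_filter_fixed(None, {42}))
```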
  6. Merge tag 'xfs-5.16-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/x…

    …fs-linux
    
    Pull xfs fixes from Darrick Wong:
     "Fixes for a resource leak and a build robot complaint about totally
      dead code:
    
       - Fix buffer resource leak that could lead to livelock on corrupt fs.
    
       - Remove unused function xfs_inew_wait to shut up the build robots"
    
    * tag 'xfs-5.16-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
      xfs: remove xfs_inew_wait
      xfs: Fix the free logic of state in xfs_attr_node_hasname
    torvalds committed Nov 27, 2021
  7. Merge tag 'iomap-5.16-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs…

    …/xfs-linux
    
    Pull iomap fixes from Darrick Wong:
     "A single iomap bug fix and a cleanup for 5.16-rc2.
    
      The bug fix changes how iomap deals with reading from an inline data
      region -- whereas the current code (incorrectly) lets the iomap read
      iter try for more bytes after reading the inline region (which zeroes
      the rest of the page!) and hopes the next iteration terminates, we
      surveyed the inlinedata implementations and realized that all
      inlinedata implementations also require that the inlinedata region end
      at EOF, so we can simply terminate the read.
    
      The second patch documents these assumptions in the code so that
      they're not subtle implications anymore, and cleans up some of the
      grosser parts of that function.
    
      Summary:
    
       - Fix an accounting problem where unaligned inline data reads can run
         off the end of the read iomap iterator. iomap has historically
         required that inline data mappings only exist at the end of a file,
         though this wasn't documented anywhere.
    
       - Document iomap_read_inline_data and change its return type to be
         appropriate for the information that it's actually returning"
    
    * tag 'iomap-5.16-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
      iomap: iomap_read_inline_data cleanup
      iomap: Fix inline extent handling in iomap_readpage
    torvalds committed Nov 27, 2021
  8. Merge tag 'trace-v5.16-rc2-2' of git://git.kernel.org/pub/scm/linux/k…

    …ernel/git/rostedt/linux-trace
    
    Pull tracing fixes from Steven Rostedt:
     "Two fixes to event pid filtering:
    
       - Make sure newly created events reflect the current state of pid
         filtering
    
       - Take pid filtering into account when recording trigger events.
         (Also tidy up the if statement)"
    
    * tag 'trace-v5.16-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
      tracing: Fix pid filtering when triggers are attached
      tracing: Check pid filtering when creating events
    torvalds committed Nov 27, 2021
  9. Merge tag 'io_uring-5.16-2021-11-27' of git://git.kernel.dk/linux-block

    Pull more io_uring fixes from Jens Axboe:
     "The locking fixup that was applied earlier this rc has both a deadlock
      and IRQ safety issue, let's get that ironed out before -rc3. This
      contains:
    
       - Link traversal locking fix (Pavel)
    
       - Cancelation fix (Pavel)
    
       - Relocate cond_resched() for huge buffer chain freeing, avoiding a
         softlockup warning (Ye)
    
       - Fix timespec validation (Ye)"
    
    * tag 'io_uring-5.16-2021-11-27' of git://git.kernel.dk/linux-block:
      io_uring: Fix undefined-behaviour in io_issue_sqe
      io_uring: fix soft lockup when call __io_remove_buffers
      io_uring: fix link traversal locking
      io_uring: fail cancellation for EXITING tasks
    torvalds committed Nov 27, 2021
  10. Merge tag 'block-5.16-2021-11-27' of git://git.kernel.dk/linux-block

    Pull more block fixes from Jens Axboe:
     "Turns out that the flushing out of pending fixes before the
      Thanksgiving break didn't quite work out in terms of timing, so here's
      a followup set of fixes:
    
       - rq_qos_done() should be called regardless of whether or not we're
         the final put of the request, it's not related to the freeing of
         the state. This fixes an IO stall with wbt that a few users have
         reported, a regression in this release.
    
       - Only define zram_wb_devops if it's used, fixing a compilation
         warning for some compilers"
    
    * tag 'block-5.16-2021-11-27' of git://git.kernel.dk/linux-block:
      zram: only make zram_wb_devops for CONFIG_ZRAM_WRITEBACK
      block: call rq_qos_done() before ref check in batch completions
    torvalds committed Nov 27, 2021
  11. Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/g…

    …it/jejb/scsi
    
    Pull SCSI fixes from James Bottomley:
     "Twelve fixes, eleven in drivers (target, qla2xxx, scsi_debug, mpt3sas,
      ufs). The core fix is a minor correction to the previous state update
      fix for the iscsi daemons"
    
    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
      scsi: scsi_debug: Zero clear zones at reset write pointer
      scsi: core: sysfs: Fix setting device state to SDEV_RUNNING
      scsi: scsi_debug: Sanity check block descriptor length in resp_mode_select()
      scsi: target: configfs: Delete unnecessary checks for NULL
      scsi: target: core: Use RCU helpers for INQUIRY t10_alua_tg_pt_gp
      scsi: mpt3sas: Fix incorrect system timestamp
      scsi: mpt3sas: Fix system going into read-only mode
      scsi: mpt3sas: Fix kernel panic during drive powercycle test
      scsi: ufs: ufs-mediatek: Add put_device() after of_find_device_by_node()
      scsi: scsi_debug: Fix type in min_t to avoid stack OOB
      scsi: qla2xxx: edif: Fix off by one bug in qla_edif_app_getfcinfo()
      scsi: ufs: ufshpb: Fix warning in ufshpb_set_hpb_read_to_upiu()
    torvalds committed Nov 27, 2021
  12. Merge tag 'nfs-for-5.16-2' of git://git.linux-nfs.org/projects/trondm…

    …y/linux-nfs
    
    Pull NFS client fixes from Trond Myklebust:
     "Highlights include:
    
      Stable fixes:
    
       - NFSv42: Fix pagecache invalidation after COPY/CLONE
    
      Bugfixes:
    
       - NFSv42: Don't fail clone() just because the server failed to return
         post-op attributes
    
       - SUNRPC: use different lockdep keys for INET6 and LOCAL
    
       - NFSv4.1: handle NFS4ERR_NOSPC from CREATE_SESSION
    
       - SUNRPC: fix header include guard in trace header"
    
    * tag 'nfs-for-5.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
      SUNRPC: use different lock keys for INET6 and LOCAL
      sunrpc: fix header include guard in trace header
      NFSv4.1: handle NFS4ERR_NOSPC by CREATE_SESSION
      NFSv42: Fix pagecache invalidation after COPY/CLONE
      NFS: Add a tracepoint to show the results of nfs_set_cache_invalid()
      NFSv42: Don't fail clone() unless the OP_CLONE operation failed
    torvalds committed Nov 27, 2021
  13. Merge tag 'erofs-for-5.16-rc3-fixes' of git://git.kernel.org/pub/scm/…

    …linux/kernel/git/xiang/erofs
    
    Pull erofs fix from Gao Xiang:
     "Fix an ABBA deadlock introduced by XArray conversion"
    
    * tag 'erofs-for-5.16-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
      erofs: fix deadlock when shrink erofs slab
    torvalds committed Nov 27, 2021
  14. Merge tag 'powerpc-5.16-3' of git://git.kernel.org/pub/scm/linux/kern…

    …el/git/powerpc/linux
    
    Pull powerpc fixes from Michael Ellerman:
     "Fix KVM using a Power9 instruction on earlier CPUs, which could lead
      to the host SLB being incorrectly invalidated and a subsequent host
      crash.
    
      Fix kernel hardlockup on vmap stack overflow on 32-bit.
    
      Thanks to Christophe Leroy, Nicholas Piggin, and Fabiano Rosas"
    
    * tag 'powerpc-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/32: Fix hardlockup on vmap stack overflow
      KVM: PPC: Book3S HV: Prevent POWER7/8 TLB flush flushing SLB
    torvalds committed Nov 27, 2021
  15. Merge tag 'mips-fixes_5.16_2' of git://git.kernel.org/pub/scm/linux/k…

    …ernel/git/mips/linux
    
    Pull MIPS fixes from Thomas Bogendoerfer:
    
     - build fix for ZSTD enabled configs
    
     - fix for preempt warning
    
     - fix for loongson FTLB detection
    
     - fix for page table level selection
    
    * tag 'mips-fixes_5.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
      MIPS: use 3-level pgtable for 64KB page size on MIPS_VA_BITS_48
      MIPS: loongson64: fix FTLB configuration
      MIPS: Fix using smp_processor_id() in preemptible in show_cpuinfo()
      MIPS: boot/compressed/: add __ashldi3 to target for ZSTD compression
    torvalds committed Nov 27, 2021
  16. io_uring: Fix undefined-behaviour in io_issue_sqe

    We got the following issue:
    ================================================================================
    UBSAN: Undefined behaviour in ./include/linux/ktime.h:42:14
    signed integer overflow:
    -4966321760114568020 * 1000000000 cannot be represented in type 'long long int'
    CPU: 1 PID: 2186 Comm: syz-executor.2 Not tainted 4.19.90+ #12
    Hardware name: linux,dummy-virt (DT)
    Call trace:
     dump_backtrace+0x0/0x3f0 arch/arm64/kernel/time.c:78
     show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
     __dump_stack lib/dump_stack.c:77 [inline]
     dump_stack+0x170/0x1dc lib/dump_stack.c:118
     ubsan_epilogue+0x18/0xb4 lib/ubsan.c:161
     handle_overflow+0x188/0x1dc lib/ubsan.c:192
     __ubsan_handle_mul_overflow+0x34/0x44 lib/ubsan.c:213
     ktime_set include/linux/ktime.h:42 [inline]
     timespec64_to_ktime include/linux/ktime.h:78 [inline]
     io_timeout fs/io_uring.c:5153 [inline]
     io_issue_sqe+0x42c8/0x4550 fs/io_uring.c:5599
     __io_queue_sqe+0x1b0/0xbc0 fs/io_uring.c:5988
     io_queue_sqe+0x1ac/0x248 fs/io_uring.c:6067
     io_submit_sqe fs/io_uring.c:6137 [inline]
     io_submit_sqes+0xed8/0x1c88 fs/io_uring.c:6331
     __do_sys_io_uring_enter fs/io_uring.c:8170 [inline]
     __se_sys_io_uring_enter fs/io_uring.c:8129 [inline]
     __arm64_sys_io_uring_enter+0x490/0x980 fs/io_uring.c:8129
     invoke_syscall arch/arm64/kernel/syscall.c:53 [inline]
     el0_svc_common+0x374/0x570 arch/arm64/kernel/syscall.c:121
     el0_svc_handler+0x190/0x260 arch/arm64/kernel/syscall.c:190
     el0_svc+0x10/0x218 arch/arm64/kernel/entry.S:1017
    ================================================================================
    
    ktime_set() only checks 'secs' against the KTIME_SEC_MAX upper bound, so
    passing a negative value can lead to a signed integer overflow.
    To address this issue, we must also check whether 'secs' is negative.
    
    Signed-off-by: Ye Bin <yebin10@huawei.com>
    Link: https://lore.kernel.org/r/20211118015907.844807-1-yebin10@huawei.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Ye Bin authored and axboe committed Nov 27, 2021
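
The overflow in the UBSAN report above can be reproduced arithmetically. This sketch models signed 64-bit limits with Python integers; `ktime_set_overflows` and `timespec_accepted` are hypothetical helpers, and the guard mirrors only the negative-seconds check the commit message describes.

```python
NSEC_PER_SEC = 1_000_000_000
S64_MIN, S64_MAX = -(2**63), 2**63 - 1
KTIME_SEC_MAX = S64_MAX // NSEC_PER_SEC


def ktime_set_overflows(secs: int, nsecs: int) -> bool:
    """True if secs * NSEC_PER_SEC + nsecs does not fit in a signed 64-bit value."""
    if secs >= KTIME_SEC_MAX:
        # Large positive values are saturated instead of multiplied.
        return False
    return not (S64_MIN <= secs * NSEC_PER_SEC + nsecs <= S64_MAX)


def timespec_accepted(secs: int) -> bool:
    """Guard described in the commit: reject negative seconds up front."""
    return secs >= 0


bad_secs = -4966321760114568020  # value taken from the UBSAN report
print(ktime_set_overflows(bad_secs, 0), timespec_accepted(bad_secs))
```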
  17. io_uring: fix soft lockup when call __io_remove_buffers

    I got the following issue:
    [  567.094140] __io_remove_buffers: [1]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
    [  594.360799] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
    [  594.364987] Modules linked in:
    [  594.365405] irq event stamp: 604180238
    [  594.365906] hardirqs last  enabled at (604180237): [<ffffffff93fec9bd>] _raw_spin_unlock_irqrestore+0x2d/0x50
    [  594.367181] hardirqs last disabled at (604180238): [<ffffffff93fbbadb>] sysvec_apic_timer_interrupt+0xb/0xc0
    [  594.368420] softirqs last  enabled at (569080666): [<ffffffff94200654>] __do_softirq+0x654/0xa9e
    [  594.369551] softirqs last disabled at (569080575): [<ffffffff913e1d6a>] irq_exit_rcu+0x1ca/0x250
    [  594.370692] CPU: 2 PID: 108 Comm: kworker/u32:5 Tainted: G            L    5.15.0-next-20211112+ #88
    [  594.371891] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
    [  594.373604] Workqueue: events_unbound io_ring_exit_work
    [  594.374303] RIP: 0010:_raw_spin_unlock_irqrestore+0x33/0x50
    [  594.375037] Code: 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 55 f5 55 fd 48 89 ef e8 ed a7 56 fd 80 e7 02 74 06 e8 43 13 7b fd fb bf 01 00 00 00 <e8> f8 78 474
    [  594.377433] RSP: 0018:ffff888101587a70 EFLAGS: 00000202
    [  594.378120] RAX: 0000000024030f0d RBX: 0000000000000246 RCX: 1ffffffff2f09106
    [  594.379053] RDX: 0000000000000000 RSI: ffffffff9449f0e0 RDI: 0000000000000001
    [  594.379991] RBP: ffffffff9586cdc0 R08: 0000000000000001 R09: fffffbfff2effcab
    [  594.380923] R10: ffffffff977fe557 R11: fffffbfff2effcaa R12: ffff8881b8f3def0
    [  594.381858] R13: 0000000000000246 R14: ffff888153a8b070 R15: 0000000000000000
    [  594.382787] FS:  0000000000000000(0000) GS:ffff888399c00000(0000) knlGS:0000000000000000
    [  594.383851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  594.384602] CR2: 00007fcbe71d2000 CR3: 00000000b4216000 CR4: 00000000000006e0
    [  594.385540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  594.386474] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  594.387403] Call Trace:
    [  594.387738]  <TASK>
    [  594.388042]  find_and_remove_object+0x118/0x160
    [  594.389321]  delete_object_full+0xc/0x20
    [  594.389852]  kfree+0x193/0x470
    [  594.390275]  __io_remove_buffers.part.0+0xed/0x147
    [  594.390931]  io_ring_ctx_free+0x342/0x6a2
    [  594.392159]  io_ring_exit_work+0x41e/0x486
    [  594.396419]  process_one_work+0x906/0x15a0
    [  594.399185]  worker_thread+0x8b/0xd80
    [  594.400259]  kthread+0x3bf/0x4a0
    [  594.401847]  ret_from_fork+0x22/0x30
    [  594.402343]  </TASK>
    
    Message from syslogd@localhost at Nov 13 09:09:54 ...
    kernel:watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
    [  596.793660] __io_remove_buffers: [2099199]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
    
    We can reproduce this issue with the following syzkaller log:
    r0 = syz_io_uring_setup(0x401, &(0x7f0000000300), &(0x7f0000003000/0x2000)=nil, &(0x7f0000ff8000/0x4000)=nil, &(0x7f0000000280)=<r1=>0x0, &(0x7f0000000380)=<r2=>0x0)
    sendmsg$ETHTOOL_MSG_FEATURES_SET(0xffffffffffffffff, &(0x7f0000003080)={0x0, 0x0, &(0x7f0000003040)={&(0x7f0000000040)=ANY=[], 0x18}}, 0x0)
    syz_io_uring_submit(r1, r2, &(0x7f0000000240)=@IORING_OP_PROVIDE_BUFFERS={0x1f, 0x5, 0x0, 0x401, 0x1, 0x0, 0x100, 0x0, 0x1, {0xfffd}}, 0x0)
    io_uring_enter(r0, 0x3a2d, 0x0, 0x0, 0x0, 0x0)
    
    The reason for the above issue is that 'buf->list' has 2,100,000 nodes;
    freeing them occupies the CPU long enough to trigger a soft lockup.
    To solve this issue, we need to add a schedule point inside the while
    loop in '__io_remove_buffers'.
    After adding the schedule point, a regression run gives the following data:
    [  240.141864] __io_remove_buffers: [1]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
    [  268.408260] __io_remove_buffers: [1]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
    [  275.899234] __io_remove_buffers: [2099199]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
    [  296.741404] __io_remove_buffers: [1]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
    [  305.090059] __io_remove_buffers: [2099199]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
    [  325.415746] __io_remove_buffers: [1]start ctx=0xffff8881b92d1000 bgid=65533 buf=0xffff8881a17d8f00
    [  333.160318] __io_remove_buffers: [2099199]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
    ...
    
    Fixes: 8bab4c09f24e ("io_uring: allow conditional reschedule for intensive iterators")
    Signed-off-by: Ye Bin <yebin10@huawei.com>
    Link: https://lore.kernel.org/r/20211122024737.2198530-1-yebin10@huawei.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Ye Bin authored and axboe committed Nov 27, 2021
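
The idea of a schedule point inside a long freeing loop, as described in the soft lockup commit above, can be sketched in user space. This is only an analogy: `resched` is a hypothetical callback standing in for the kernel's reschedule point, and `remove_buffers` is not the kernel function.

```python
from collections import deque


def remove_buffers(buffers: deque, nbufs: int, resched) -> int:
    """Free up to nbufs buffers, offering to yield after each one that is freed."""
    freed = 0
    while buffers:
        buffers.popleft()  # stands in for unlinking and freeing one buffer
        freed += 1
        if freed == nbufs:
            return freed
        # Schedule point inside the loop, so a very long list cannot
        # monopolise the CPU between two yields.
        resched()
    return freed


yields = []
print(remove_buffers(deque(range(5)), 10, lambda: yields.append(1)), len(yields))
```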

Commits on Nov 26, 2021

  1. tracing: Fix pid filtering when triggers are attached

    If an event is filtered by pid and a trigger that requires processing of
    the event to happen is attached to the event, the discard portion does
    not take the pid filtering into account, and the event will then be
    recorded when it should not have been.
    
    Cc: stable@vger.kernel.org
    Fixes: 3fdaf80 ("tracing: Implement event pid filtering")
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    rostedt committed Nov 26, 2021
  2. iommu/vt-d: Fix unmap_pages support

    When supporting only the .map and .unmap callbacks of iommu_ops,
    the IOMMU driver can make assumptions about the size and alignment
    used for mappings based on the driver provided pgsize_bitmap.  VT-d
    previously used essentially PAGE_MASK for this bitmap as any power
    of two mapping was acceptably filled by native page sizes.
    
    However, with the .map_pages and .unmap_pages interface we're now
    getting page-size and count arguments.  If we simply combine these
    as (page-size * count) and make use of the previous map/unmap
    functions internally, any size and alignment assumptions are very
    different.
    
    As an example, a given vfio device assignment VM will often create
    a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff].  On a system that
    does not support IOMMU super pages, the unmap_pages interface will
    ask to unmap 1024 4KB pages at the base IOVA.  dma_pte_clear_level()
    will recurse down to level 2 of the page table where the first half
    of the pfn range exactly matches the entire pte level.  We clear the
    pte, increment the pfn by the level size, but (oops) the next pte is
    on a new page, so we exit the loop and pop back up a level.  When we
    then update the pfn based on that higher level, we seem to assume
    that the previous pfn value was at the start of the level.  In this
    case the level size is 256K pfns, which we add to the base pfn and
    get a result of 0x7fe00, which is clearly greater than 0x401ff,
    so we're done.  Meanwhile we never cleared the ptes for the remainder
    of the range.  When the VM remaps this range, we're overwriting valid
    ptes and the VT-d driver complains loudly, as reported by the user
    report linked below.
    
    The fix for this seems relatively simple: if each iteration of the
    loop in dma_pte_clear_level() is assumed to clear to the end of the
    level pte page, then our next pfn should be calculated from level_pfn
    rather than our working pfn.
    
    Fixes: 3f34f12 ("iommu/vt-d: Implement map/unmap_pages() iommu_ops callback")
    Reported-by: Ajay Garg <ajaygargnsit@gmail.com>
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    Tested-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargnsit@gmail.com/
    Link: https://lore.kernel.org/r/163659074748.1617923.12716161410774184024.stgit@omen
    Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
    Link: https://lore.kernel.org/r/20211126135556.397932-3-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    awilliam authored and joergroedel committed Nov 26, 2021
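
The pfn arithmetic from the unmap_pages commit above can be checked with a few lines. This is a sketch assuming 512 entries per page-table page (a 9-bit stride per level); `next_pfn_old` and `next_pfn_fixed` are hypothetical names for the two ways of advancing the pfn.

```python
LEVEL_STRIDE = 9  # assumed: 512 entries per page-table page


def level_size(level: int) -> int:
    """Number of 4KB pfns covered by one entry at the given level."""
    return 1 << ((level - 1) * LEVEL_STRIDE)


def level_mask(level: int) -> int:
    """Mask that rounds a pfn down to the start of its entry at this level."""
    return ~(level_size(level) - 1)


def next_pfn_old(pfn: int, level: int) -> int:
    # Old behaviour: advance from the working pfn.
    return pfn + level_size(level)


def next_pfn_fixed(pfn: int, level: int) -> int:
    # Fixed behaviour: advance from the start of the current entry (level_pfn).
    return (pfn & level_mask(level)) + level_size(level)


start_pfn, last_pfn = 0x3FE00, 0x401FF  # the 4MB mapping from the commit message
print(hex(next_pfn_old(start_pfn, 3)), hex(next_pfn_fixed(start_pfn, 3)))
```

At the 256K-pfn level the old calculation jumps past the end of the range, while the fixed one lands on 0x40000, which is still inside it, so the remaining ptes get visited.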
  3. iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock()

    If we return -EOPNOTSUPP, the RCU read lock remains held. This is spurious.
    Go through the end of the function instead. This way, the missing
    'rcu_read_unlock()' is called.
    
    Fixes: 7afd7f6 ("iommu/vt-d: Check FL and SL capability sanity in scalable mode")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Link: https://lore.kernel.org/r/40cc077ca5f543614eab2a10e84d29dd190273f6.1636217517.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
    Link: https://lore.kernel.org/r/20211126135556.397932-2-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    tititiou36 authored and joergroedel committed Nov 26, 2021
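
The lock imbalance described in the commit above follows a common pattern, sketched here with a counting stand-in for the read lock. `ReadLock`, `check_buggy` and `check_fixed` are hypothetical names, not the driver's code.

```python
import errno


class ReadLock:
    """Counting stand-in for a read-side lock; depth must return to 0."""

    def __init__(self):
        self.depth = 0

    def lock(self):
        self.depth += 1

    def unlock(self):
        self.depth -= 1


def check_buggy(lock: ReadLock, supported: bool) -> int:
    lock.lock()
    if not supported:
        return -errno.EOPNOTSUPP  # early return skips the unlock
    lock.unlock()
    return 0


def check_fixed(lock: ReadLock, supported: bool) -> int:
    ret = 0
    lock.lock()
    if not supported:
        ret = -errno.EOPNOTSUPP  # record the error and fall through
    lock.unlock()
    return ret


buggy_lock, fixed_lock = ReadLock(), ReadLock()
check_buggy(buggy_lock, False)
check_fixed(fixed_lock, False)
print(buggy_lock.depth, fixed_lock.depth)
```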
  4. iommu/rockchip: Fix PAGE_DESC_HI_MASKs for RK3568

    With the submission of iommu driver for RK3568 a subtle bug was
    introduced: PAGE_DESC_HI_MASK1 and PAGE_DESC_HI_MASK2 have to be
    the other way around - that leads to random errors, especially when
    addresses beyond 32 bit are used.
    
    Fix it.
    
    Fixes: c55356c ("iommu: rockchip: Add support for iommu v2")
    Signed-off-by: Alex Bee <knaerzche@gmail.com>
    Tested-by: Peter Geis <pgwipeout@gmail.com>
    Reviewed-by: Heiko Stuebner <heiko@sntech.de>
    Tested-by: Dan Johansen <strit@manjaro.org>
    Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
    Link: https://lore.kernel.org/r/20211124021325.858139-1-knaerzche@gmail.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    knaerzche authored and joergroedel committed Nov 26, 2021
  5. iommu/amd: Clarify AMD IOMMUv2 initialization messages

    The messages printed on the initialization of the AMD IOMMUv2 driver
    have caused some confusion in the past. Clarify the messages to reduce
    the confusion in the future.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Link: https://lore.kernel.org/r/20211123105507.7654-3-joro@8bytes.org
    joergroedel committed Nov 26, 2021