Damien-Le-Moal…

Commits on Mar 15, 2021

  1. zonefs: Fix O_APPEND async write handling

    zonefs updates the size of a sequential zone file inode only on
    completion of direct writes. When executing asynchronous append writes
    (with a file open with O_APPEND or using RWF_APPEND), the use of the
    current inode size in generic_write_checks() to set an iocb offset thus
    leads to unaligned write if an application issues an append write
    operation with another write already being executed.
    
    Fix this problem by introducing zonefs_write_checks() as a modified
    version of generic_write_checks() using the file inode wp_offset for an
    append write iocb offset. Also introduce zonefs_write_check_limits() to
    replace generic_write_check_limits() call. This zonefs special helper
    makes sure that the maximum file limit used is the maximum size of the
    file being accessed.
    
    Since zonefs_write_checks() already truncates the iov_iter, the calls
    to iov_iter_truncate() in zonefs_file_dio_write() and
    zonefs_file_buffered_write() are removed.
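
The race described above can be sketched with a small userspace model. This is only an illustration, not the zonefs code: `struct zone_file`, `submit_append()` and `complete_append()` are hypothetical names, and the model assumes the write pointer advances when a write is issued while the size advances only on completion.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a sequential zone file inode. */
struct zone_file {
    uint64_t i_size;    /* advanced only when a direct write completes */
    uint64_t wp_offset; /* write pointer, advanced when a write is issued */
};

/*
 * Issue an asynchronous append of len bytes and return the offset chosen
 * for it. use_wp == 0 models taking the offset from the inode size,
 * use_wp == 1 models taking it from the write pointer.
 */
static uint64_t submit_append(struct zone_file *zf, uint64_t len, int use_wp)
{
    uint64_t off = use_wp ? zf->wp_offset : zf->i_size;

    zf->wp_offset += len;
    return off;
}

/* Completion of an earlier append: only now does the size grow. */
static void complete_append(struct zone_file *zf, uint64_t len)
{
    zf->i_size += len;
}
```

With two appends in flight, the size-based choice hands out offset 0 twice, while the write-pointer-based choice hands out 0 and then 4096.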
    
    Fixes: 8dcc1a9 ("fs: New zonefs file system")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
    damien-lemoal authored and intel-lab-lkp committed Mar 15, 2021
  2. zonefs: prevent use of seq files as swap file

    The sequential write constraint of sequential zone files prevents their
    use as swap files. Only allow conventional zone files to be used as swap
    files.
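
A minimal sketch of such a check, assuming a hypothetical `enum zone_type` and helper name (the real zonefs identifiers are not shown in this message):

```c
#include <errno.h>

/* Hypothetical zone types for this sketch. */
enum zone_type { ZONE_CONVENTIONAL, ZONE_SEQUENTIAL };

/* Returns 0 if a file of this type may back swap, -EINVAL otherwise. */
static int swap_activate_check(enum zone_type type)
{
    /* Swap writes pages at arbitrary offsets; a sequential zone cannot accept that. */
    if (type != ZONE_CONVENTIONAL)
        return -EINVAL;
    return 0;
}
```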
    
    Fixes: 8dcc1a9 ("fs: New zonefs file system")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
    damien-lemoal authored and intel-lab-lkp committed Mar 15, 2021

Commits on Mar 14, 2021

  1. Linux 5.12-rc3

    torvalds committed Mar 14, 2021
  2. prctl: fix PR_SET_MM_AUXV kernel stack leak

    Doing a
    
    	prctl(PR_SET_MM, PR_SET_MM_AUXV, addr, 1);
    
    will copy 1 byte from userspace to (quite big) on-stack array
    and then stash everything to mm->saved_auxv.
    AT_NULL terminator will be inserted at the very end.
    
    /proc/*/auxv handler will find that AT_NULL terminator
    and copy original stack contents to userspace.
    
    This devious scheme requires CAP_SYS_RESOURCE.
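
The leak can be modelled in userspace as below. This is a sketch under stated assumptions, not the kernel code: `save_auxv()` is a hypothetical helper, and the "stack" buffer is passed in pre-filled so that its stale contents are well defined. Zeroing the buffer first is one possible remedy, shown here only for contrast.

```c
#include <stddef.h>
#include <string.h>

/*
 * Model of the copy. `stack` plays the role of the on-stack array.
 * The caller guarantees len <= size.
 * Returns the number of non-zero stale bytes that end up in `saved`.
 */
static size_t save_auxv(unsigned char *stack, unsigned char *saved, size_t size,
                        const unsigned char *user, size_t len, int zero_first)
{
    size_t i, stale = 0;

    if (zero_first)
        memset(stack, 0, size);
    memcpy(stack, user, len);   /* copy only what the caller supplied */
    memcpy(saved, stack, size); /* but stash the whole array */

    for (i = len; i < size; i++)
        if (saved[i] != 0)
            stale++;
    return stale;
}
```

With a 16-byte buffer holding old data and a 1-byte user copy, 15 stale bytes reach the saved copy unless the buffer is cleared first.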
    
    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Alexey Dobriyan authored and torvalds committed Mar 14, 2021
  3. Merge tag 'irq-urgent-2021-03-14' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/tip/tip
    
    Pull irq fixes from Thomas Gleixner:
     "A set of irqchip updates:
    
       - Make the GENERIC_IRQ_MULTI_HANDLER configuration correct
    
       - Add a missing DT compatible string for the Ingenic driver
    
       - Remove the pointless debugfs_file pointer from struct irqdomain"
    
    * tag 'irq-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      irqchip/ingenic: Add support for the JZ4760
      dt-bindings/irq: Add compatible string for the JZ4760B
      irqchip: Do not blindly select CONFIG_GENERIC_IRQ_MULTI_HANDLER
      ARM: ep93xx: Select GENERIC_IRQ_MULTI_HANDLER directly
      irqdomain: Remove debugfs_file from struct irq_domain
    torvalds committed Mar 14, 2021
  4. Merge tag 'timers-urgent-2021-03-14' of git://git.kernel.org/pub/scm/…

    …linux/kernel/git/tip/tip
    
    Pull timer fix from Thomas Gleixner:
     "A single fix for hrtimers to prevent an interrupt storm caused by
      the lack of reevaluation of the timers which expire in softirq context
      under certain circumstances, e.g. when the clock was set"
    
    * tag 'timers-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
    torvalds committed Mar 14, 2021
  5. Merge tag 'sched-urgent-2021-03-14' of git://git.kernel.org/pub/scm/l…

    …inux/kernel/git/tip/tip
    
    Pull scheduler fixes from Thomas Gleixner:
     "A set of scheduler updates:
    
       - Prevent a NULL pointer dereference in the migration_stop_cpu()
         mechanism
    
       - Prevent self concurrency of affine_move_task()
    
       - Small fixes and cleanups related to task migration/affinity setting
    
       - Ensure that sync_runqueues_membarrier_state() is invoked on the
         current CPU when it is in the cpu mask"
    
    * tag 'sched-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      sched/membarrier: fix missing local execution of ipi_sync_rq_state()
      sched: Simplify set_affinity_pending refcounts
      sched: Fix affine_move_task() self-concurrency
      sched: Optimize migration_cpu_stop()
      sched: Collate affine_move_task() stoppers
      sched: Simplify migration_cpu_stop()
      sched: Fix migration_cpu_stop() requeueing
    torvalds committed Mar 14, 2021
  6. Merge tag 'objtool-urgent-2021-03-14' of git://git.kernel.org/pub/scm…

    …/linux/kernel/git/tip/tip
    
    Pull objtool fix from Thomas Gleixner:
     "A single objtool fix to handle the PUSHF/POPF validation correctly for
      the paravirt changes which modified arch_local_irq_restore not to use
      popf"
    
    * tag 'objtool-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      objtool,x86: Fix uaccess PUSHF/POPF validation
    torvalds committed Mar 14, 2021
  7. Merge tag 'locking-urgent-2021-03-14' of git://git.kernel.org/pub/scm…

    …/linux/kernel/git/tip/tip
    
    Pull locking fixes from Thomas Gleixner:
     "A couple of locking fixes:
    
       - A fix for the static_call mechanism so it handles unaligned
         addresses correctly.
    
       - Make u64_stats_init() a macro so every instance gets a separate
         lockdep key.
    
       - Make seqcount_latch_init() a macro as well to preserve the static
         variable which is used for the lockdep key"
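
Why a macro gives each instance its own key can be illustrated with a generic C sketch. `struct class_key`, `struct stats`, `stats_init_fn()` and `stats_init_macro()` are hypothetical stand-ins, not the kernel's lockdep types or the actual u64_stats_init() definition.

```c
/* Hypothetical stand-ins: a lock class key and an object that records its key. */
struct class_key { char dummy; };
struct stats { const struct class_key *key; };

/* Function form: a single static key is shared by every caller. */
static inline void stats_init_fn(struct stats *s)
{
    static struct class_key key;

    s->key = &key;
}

/* Macro form: each expansion declares its own static key. */
#define stats_init_macro(s)                 \
    do {                                    \
        static struct class_key key_;       \
        (s)->key = &key_;                   \
    } while (0)
```

Two objects initialised through the function end up with the same key address, while two macro call sites get distinct keys.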
    
    * tag 'locking-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      seqlock,lockdep: Fix seqcount_latch_init()
      u64_stats,lockdep: Fix u64_stats_init() vs lockdep
      static_call: Fix the module key fixup
    torvalds committed Mar 14, 2021
  8. Merge tag 'perf_urgent_for_v5.12-rc3' of git://git.kernel.org/pub/scm…

    …/linux/kernel/git/tip/tip
    
    Pull perf fixes from Borislav Petkov:
    
     - Make sure PMU internal buffers are flushed for per-CPU events too and
       properly handle PID/TID for large PEBS.
    
     - Handle the case properly when there's no PMU and therefore return an
       empty list of perf MSRs for VMX to switch instead of reading random
       garbage from the stack.
    
    * tag 'perf_urgent_for_v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      x86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" case
      perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR
      perf/core: Flush PMU internal buffers for per-CPU events
    torvalds committed Mar 14, 2021
  9. Merge tag 'efi-urgent-for-v5.12-rc2' of git://git.kernel.org/pub/scm/…

    …linux/kernel/git/tip/tip
    
    Pull EFI fix from Ard Biesheuvel via Borislav Petkov:
     "Fix an oversight in the handling of EFI_RT_PROPERTIES_TABLE, which was
      added in v5.10, but failed to take the SetVirtualAddressMap() RT service
      into account"
    
    * tag 'efi-urgent-for-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      efi: stub: omit SetVirtualAddressMap() if marked unsupported in RT_PROP table
    torvalds committed Mar 14, 2021
  10. Merge tag 'x86_urgent_for_v5.12_rc3' of git://git.kernel.org/pub/scm/…

    …linux/kernel/git/tip/tip
    
    Pull x86 fixes from Borislav Petkov:
    
     - A couple of SEV-ES fixes and robustifications: verify usermode stack
       pointer in NMI is not coming from the syscall gap, correctly track
       IRQ states in the #VC handler and access user insn bytes atomically
       in same handler as latter cannot sleep.
    
     - Balance 32-bit fast syscall exit path to do the proper work on exit
       and thus not confuse audit and ptrace frameworks.
    
     - Two fixes for the ORC unwinder going "off the rails" into KASAN
       redzones and when ORC data is missing.
    
    * tag 'x86_urgent_for_v5.12_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      x86/sev-es: Use __copy_from_user_inatomic()
      x86/sev-es: Correctly track IRQ states in runtime #VC handler
      x86/sev-es: Check regs->sp is trusted before adjusting #VC IST stack
      x86/sev-es: Introduce ip_within_syscall_gap() helper
      x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls
      x86/unwind/orc: Silence warnings caused by missing ORC data
      x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
    torvalds committed Mar 14, 2021
  11. Merge tag 'powerpc-5.12-3' of git://git.kernel.org/pub/scm/linux/kern…

    …el/git/powerpc/linux
    
    Pull powerpc fixes from Michael Ellerman:
     "Some more powerpc fixes for 5.12:
    
       - Fix wrong instruction encoding for lis in ppc_function_entry(),
         which could potentially lead to missed kprobes.
    
       - Fix SET_FULL_REGS on 32-bit and 64e, which prevented ptrace of
         non-volatile GPRs immediately after exec.
    
       - Clean up a missed SRR specifier in the recent interrupt rework.
    
       - Don't treat unrecoverable_exception() as an interrupt handler, it's
         called from other handlers so shouldn't do the interrupt entry/exit
         accounting itself.
    
       - Fix build errors caused by missing declarations for
         [en/dis]able_kernel_vsx().
    
      Thanks to Christophe Leroy, Daniel Axtens, Geert Uytterhoeven, Jiri
      Olsa, Naveen N. Rao, and Nicholas Piggin"
    
    * tag 'powerpc-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/traps: unrecoverable_exception() is not an interrupt handler
      powerpc: Fix missing declaration of [en/dis]able_kernel_vsx()
      powerpc/64s/exception: Clean up a missed SRR specifier
      powerpc: Fix inverted SET_FULL_REGS bitop
      powerpc/64s: Use symbolic macros for function entry encoding
      powerpc/64s: Fix instruction encoding for lis in ppc_function_entry()
    torvalds committed Mar 14, 2021
  12. Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

    Pull KVM fixes from Paolo Bonzini:
     "More fixes for ARM and x86"
    
    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
      KVM: LAPIC: Advancing the timer expiration on guest initiated write
      KVM: x86/mmu: Skip !MMU-present SPTEs when removing SP in exclusive mode
      KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged
      kvm: x86: annotate RCU pointers
      KVM: arm64: Fix exclusive limit for IPA size
      KVM: arm64: Reject VM creation when the default IPA size is unsupported
      KVM: arm64: Ensure I-cache isolation between vcpus of a same VM
      KVM: arm64: Don't use cbz/adr with external symbols
      KVM: arm64: Fix range alignment when walking page tables
      KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
      KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config()
      KVM: arm64: Don't access PMSELR_EL0/PMUSERENR_EL0 when no PMU is available
      KVM: arm64: Turn kvm_arm_support_pmu_v3() into a static key
      KVM: arm64: Fix nVHE hyp panic host context restore
      KVM: arm64: Avoid corrupting vCPU context register in guest exit
      KVM: arm64: nvhe: Save the SPE context early
      kvm: x86: use NULL instead of using plain integer as pointer
      KVM: SVM: Connect 'npt' module param to KVM's internal 'npt_enabled'
      KVM: x86: Ensure deadline timer has truly expired before posting its IRQ
    torvalds committed Mar 14, 2021
  13. Merge branch 'akpm' (patches from Andrew)

    Merge misc fixes from Andrew Morton:
     "28 patches.
    
      Subsystems affected by this series: mm (memblock, pagealloc, hugetlb,
      highmem, kfence, oom-kill, madvise, kasan, userfaultfd, memcg, and
      zram), core-kernel, kconfig, fork, binfmt, MAINTAINERS, kbuild, and
      ia64"
    
    * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (28 commits)
      zram: fix broken page writeback
      zram: fix return value on writeback_store
      mm/memcg: set memcg when splitting page
      mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument
      ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign
      ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls
      mm/userfaultfd: fix memory corruption due to writeprotect
      kasan: fix KASAN_STACK dependency for HW_TAGS
      kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC
      mm/madvise: replace ptrace attach requirement for process_madvise
      include/linux/sched/mm.h: use rcu_dereference in in_vfork()
      kfence: fix reports if constant function prefixes exist
      kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations
      kfence: fix printk format for ptrdiff_t
      linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*
      MAINTAINERS: exclude uapi directories in API/ABI section
      binfmt_misc: fix possible deadlock in bm_register_write
      mm/highmem.c: fix zero_user_segments() with start > end
      hugetlb: do early cow when page pinned on src mm
      mm: use is_cow_mapping() across tree where proper
      ...
    torvalds committed Mar 14, 2021
  14. Merge tag 'irqchip-fixes-5.12-1' of git://git.kernel.org/pub/scm/linu…

    …x/kernel/git/maz/arm-platforms into irq/urgent
    
    Pull irqchip fixes from Marc Zyngier:
    
      - More compatible strings for the Ingenic irqchip (introducing the
        JZ4760B SoC)
      - Select GENERIC_IRQ_MULTI_HANDLER on the ARM ep93xx platform
      - Drop all GENERIC_IRQ_MULTI_HANDLER selections from the irqchip
        Kconfig, now relying on the architecture to get it right
      - Drop the debugfs_file field from struct irq_domain, now that
        debugfs can track things on its own
    Thomas Gleixner committed Mar 14, 2021

Commits on Mar 13, 2021

  1. Merge tag 'char-misc-5.12-rc3' of git://git.kernel.org/pub/scm/linux/…

    …kernel/git/gregkh/char-misc
    
    Pull char/misc driver fixes from Greg KH:
     "Here are some small misc/char driver fixes to resolve some reported
      problems:
    
       - habanalabs driver fixes
    
       - Acrn build fixes (reported many times)
    
       - pvpanic module table export fix
    
      All of these have been in linux-next for a while with no reported
      issues"
    
    * tag 'char-misc-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
      misc/pvpanic: Export module FDT device table
      misc: fastrpc: restrict user apps from sending kernel RPC messages
      virt: acrn: Correct type casting of argument of copy_from_user()
      virt: acrn: Use EPOLLIN instead of POLLIN
      virt: acrn: Use vfs_poll() instead of f_op->poll()
      virt: acrn: Make remove_cpu sysfs invisible with !CONFIG_HOTPLUG_CPU
      cpu/hotplug: Fix build error of using {add,remove}_cpu() with !CONFIG_SMP
      habanalabs: fix debugfs address translation
      habanalabs: Disable file operations after device is removed
      habanalabs: Call put_pid() when releasing control device
      drivers: habanalabs: remove unused dentry pointer for debugfs files
      habanalabs: mark hl_eq_inc_ptr() as static
    torvalds committed Mar 13, 2021
  2. Merge tag 'staging-5.12-rc3' of git://git.kernel.org/pub/scm/linux/ke…

    …rnel/git/gregkh/staging
    
    Pull staging driver fixes from Greg KH:
     "Here are some small staging driver fixes for reported problems. They
      include:
    
       - wfx header file cleanup patch reverted as it could cause problems
    
       - comedi driver endian fixes
    
       - buffer overflow problems for staging wifi drivers
    
       - build dependency issue for rtl8192e driver
    
      All have been in linux-next for a while with no reported problems"
    
    * tag 'staging-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (23 commits)
      Revert "staging: wfx: remove unused included header files"
      staging: rtl8188eu: prevent ->ssid overflow in rtw_wx_set_scan()
      staging: rtl8188eu: fix potential memory corruption in rtw_check_beacon_data()
      staging: rtl8192u: fix ->ssid overflow in r8192_wx_set_scan()
      staging: comedi: pcl726: Use 16-bit 0 for interrupt data
      staging: comedi: ni_65xx: Use 16-bit 0 for interrupt data
      staging: comedi: ni_6527: Use 16-bit 0 for interrupt data
      staging: comedi: comedi_parport: Use 16-bit 0 for interrupt data
      staging: comedi: amplc_pc236_common: Use 16-bit 0 for interrupt data
      staging: comedi: pcl818: Fix endian problem for AI command data
      staging: comedi: pcl711: Fix endian problem for AI command data
      staging: comedi: me4000: Fix endian problem for AI command data
      staging: comedi: dmm32at: Fix endian problem for AI command data
      staging: comedi: das800: Fix endian problem for AI command data
      staging: comedi: das6402: Fix endian problem for AI command data
      staging: comedi: adv_pci1710: Fix endian problem for AI command data
      staging: comedi: addi_apci_1500: Fix endian problem for command sample
      staging: comedi: addi_apci_1032: Fix endian problem for COS sample
      staging: ks7010: prevent buffer overflow in ks_wlan_set_scan()
      staging: rtl8712: Fix possible buffer overflow in r8712_sitesurvey_cmd
      ...
    torvalds committed Mar 13, 2021
  3. Merge tag 'tty-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/gregkh/tty
    
    Pull tty/serial fixes from Greg KH:
     "Here are some small tty and serial driver fixes to resolve some
      reported problems:
    
       - led tty trigger fixes based on review, acked by the led
         maintainer
    
       - revert a max310x serial driver patch as it was causing problems
    
       - revert a pty change as it was also causing problems
    
      All of these have been in linux-next for a while with no reported
      problems"
    
    * tag 'tty-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
      Revert "drivers:tty:pty: Fix a race causing data loss on close"
      Revert "serial: max310x: rework RX interrupt handling"
      leds: trigger/tty: Use led_set_brightness_sync() from workqueue
      leds: trigger: Fix error path to not unlock the unlocked mutex
    torvalds committed Mar 13, 2021
  4. Merge tag 'usb-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/gregkh/usb
    
    Pull USB fixes from Greg KH:
     "Here are a small number of USB fixes for 5.12-rc3 to resolve a bunch
      of reported issues:
    
       - usbip fixups for issues found by syzbot
    
       - xhci driver fixes and quirk additions
    
       - gadget driver fixes
    
       - dwc3 QCOM driver fix
    
       - usb-serial new ids and fixes
    
       - usblp fix for a long-time issue
    
       - cdc-acm quirk addition
    
       - other tiny fixes for reported problems
    
      All of these have been in linux-next for a while with no reported
      issues"
    
    * tag 'usb-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
      xhci: Fix repeated xhci wake after suspend due to uncleared internal wake state
      usb: xhci: Fix ASMedia ASM1042A and ASM3242 DMA addressing
      xhci: Improve detection of device initiated wake signal.
      usb: xhci: do not perform Soft Retry for some xHCI hosts
      usbip: fix vudc usbip_sockfd_store races leading to gpf
      usbip: fix vhci_hcd attach_store() races leading to gpf
      usbip: fix stub_dev usbip_sockfd_store() races leading to gpf
      usbip: fix vudc to check for stream socket
      usbip: fix vhci_hcd to check for stream socket
      usbip: fix stub_dev to check for stream socket
      usb: dwc3: qcom: Add missing DWC3 OF node refcount decrement
      USB: usblp: fix a hang in poll() if disconnected
      USB: gadget: udc: s3c2410_udc: fix return value check in s3c2410_udc_probe()
      usb: renesas_usbhs: Clear PIPECFG for re-enabling pipe with other EPNUM
      usb: dwc3: qcom: Honor wakeup enabled/disabled state
      usb: gadget: f_uac1: stop playback on function disable
      usb: gadget: f_uac2: always increase endpoint max_packet_size by one audio slot
      USB: gadget: u_ether: Fix a configfs return code
      usb: dwc3: qcom: add ACPI device id for sc8180x
      Goodix Fingerprint device is not a modem
      ...
    torvalds committed Mar 13, 2021
  5. Merge tag 'erofs-for-5.12-rc3' of git://git.kernel.org/pub/scm/linux/…

    …kernel/git/xiang/erofs
    
    Pull erofs fix from Gao Xiang:
     "Fix an urgent regression introduced by commit baa2c7c ("block:
      set .bi_max_vecs as actual allocated vector number"), which could
      cause unexpected hangs since Linux 5.12-rc1.
    
      Resolve it by avoiding using bio->bi_max_vecs completely"
    
    * tag 'erofs-for-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
      erofs: fix bio->bi_max_vecs behavior change
    torvalds committed Mar 13, 2021
  6. Merge tag 'kbuild-fixes-v5.12-2' of git://git.kernel.org/pub/scm/linu…

    …x/kernel/git/masahiroy/linux-kbuild
    
    Pull Kbuild fixes from Masahiro Yamada:
    
     - avoid 'make image_name' invoking syncconfig
    
     - fix a couple of bugs in scripts/dummy-tools
    
     - fix LLD_VENDOR and locale issues in scripts/ld-version.sh
    
     - rebuild GCC plugins when the compiler is upgraded
    
     - allow LTO to be enabled with KASAN_HW_TAGS
    
     - allow LTO to be enabled without LLVM=1
    
    * tag 'kbuild-fixes-v5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
      kbuild: fix ld-version.sh to not be affected by locale
      kbuild: remove meaningless parameter to $(call if_changed_rule,dtc)
      kbuild: remove LLVM=1 test from HAS_LTO_CLANG
      kbuild: remove unneeded -O option to dtc
      kbuild: dummy-tools: adjust to scripts/cc-version.sh
      kbuild: Allow LTO to be selected with KASAN_HW_TAGS
      kbuild: dummy-tools: support MPROFILE_KERNEL checks for ppc
      kbuild: rebuild GCC plugins when the compiler is upgraded
      kbuild: Fix ld-version.sh script if LLD was built with LLD_VENDOR
      kbuild: dummy-tools: fix inverted tests for gcc
      kbuild: add image_name to no-sync-config-targets
    torvalds committed Mar 13, 2021
  7. zram: fix broken page writeback

    commit 0d83596 ("zram: support page writeback") introduced two
    problems.  First, it overwrites writeback_store's return value with
    kstrtol's return value, which is zero on success, so the user sees zero
    returned from the write syscall even though the data was written.
    
    Second, it no longer increments the index in the loop, so only the
    first block index is ever written back.  The user therefore cannot write
    back all idle pages in the zram device and loses the memory saving.
    
    This patch fixes those issues.
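
The index problem can be sketched with a toy loop. `writeback_range()` is a hypothetical helper that only marks slots as visited; it is not the zram code.

```c
#include <stddef.h>

/*
 * Walk nr_pages slots starting at index and mark each one visited.
 * With advance == 0 the index is never incremented, as in the bug.
 * Returns how many distinct slots were visited.
 */
static size_t writeback_range(unsigned char *visited, size_t index,
                              size_t nr_pages, int advance)
{
    size_t distinct = 0;

    for (; nr_pages != 0; nr_pages--) {
        if (!visited[index]) {
            visited[index] = 1;
            distinct++;
        }
        if (advance)
            index++;
    }
    return distinct;
}
```

Without the increment, a request for four pages touches only the first slot.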
    
    Link: https://lkml.kernel.org/r/20210312173949.2197662-2-minchan@kernel.org
    Fixes: 0d83596 ("zram: support page writeback")
    Signed-off-by: Minchan Kim <minchan@kernel.org>
    Reported-by: Amos Bianchi <amosbianchi@google.com>
    Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: John Dias <joaodias@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    minchank authored and torvalds committed Mar 13, 2021
  8. zram: fix return value on writeback_store

    writeback_store's return value is overwritten by submit_bio_wait's return
    value.  Thus, writeback_store will return zero since there was no IO
    error.  As a result, the write syscall from userspace sees zero as its
    return value, which could make the process stall as it keeps retrying the
    write until it succeeds.
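
The pattern behind this bug can be sketched as follows. `fake_submit()`, `store_buggy()` and `store_fixed()` are hypothetical names for illustration, and keeping the I/O status in a separate variable is one possible shape of a fix, not necessarily the one applied here.

```c
#include <stddef.h>

/* Hypothetical I/O helper: 0 on success, negative errno-style value on failure. */
static int fake_submit(int fail)
{
    return fail ? -5 : 0;
}

/* Buggy shape: the I/O status overwrites the byte count to return. */
static long store_buggy(size_t len, int fail)
{
    long ret = (long)len;

    ret = fake_submit(fail); /* success turns the result into 0 */
    return ret;
}

/* Safer shape: keep the I/O status in its own variable. */
static long store_fixed(size_t len, int fail)
{
    long ret = (long)len;
    int err = fake_submit(fail);

    if (err)
        ret = err;
    return ret;
}
```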
    
    Link: https://lkml.kernel.org/r/20210312173949.2197662-1-minchan@kernel.org
    Fixes: 3b82a05("drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_store")
    Signed-off-by: Minchan Kim <minchan@kernel.org>
    Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: Colin Ian King <colin.king@canonical.com>
    Cc: John Dias <joaodias@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    minchank authored and torvalds committed Mar 13, 2021
  9. mm/memcg: set memcg when splitting page

    As described in the split_page() comment, for the non-compound high order
    page, the sub-pages must be freed individually.  If the memcg of the first
    page is valid, the tail pages cannot be uncharged when they are freed.
    
    For example, when alloc_pages_exact is used to allocate 1MB of contiguous
    physical memory, 2MB is charged (kmemcg is enabled and __GFP_ACCOUNT is
    set).  When make_alloc_exact frees the unused 1MB and free_pages_exact
    frees the applied 1MB, only 4KB (one page) is actually uncharged.
    
    Therefore, the memcg of the tail page needs to be set when splitting a
    page.
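
The accounting problem can be modelled with a toy page array. All names here (`struct page` with a single `memcg` field, `charge_block()`, `split_block()`, `free_pages_individually()`) are simplified assumptions for illustration and do not mirror the kernel's data structures.

```c
#include <stddef.h>

struct page { int memcg; }; /* 0 means "no memcg recorded" */

/* Charge a high-order allocation: only the first page records the memcg. */
static void charge_block(struct page *pages, size_t nr, int memcg, long *charged)
{
    size_t i;

    for (i = 0; i < nr; i++)
        pages[i].memcg = 0;
    pages[0].memcg = memcg;
    *charged += (long)nr;
}

/* Split into single pages; optionally copy the memcg to the tail pages. */
static void split_block(struct page *pages, size_t nr, int propagate)
{
    size_t i;

    if (!propagate)
        return;
    for (i = 1; i < nr; i++)
        pages[i].memcg = pages[0].memcg;
}

/* Free pages one by one; a page is uncharged only if it records a memcg. */
static void free_pages_individually(struct page *pages, size_t nr, long *charged)
{
    size_t i;

    for (i = 0; i < nr; i++) {
        if (pages[i].memcg) {
            (*charged)--;
            pages[i].memcg = 0;
        }
    }
}
```

For 512 pages (2MB of 4KB pages), freeing without propagation uncharges a single page and leaves 511 pages charged; with propagation the charge drops to zero.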
    
    Michel:
    
    There are at least two explicit users of __GFP_ACCOUNT with
    alloc_exact_pages added recently.  See 7efe8ef ("KVM: arm64:
    Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT") and c419621
    ("KVM: s390: Add memcg accounting to KVM allocations"), so this is not
    just a theoretical issue.
    
    Link: https://lkml.kernel.org/r/20210304074053.65527-3-zhouguanghui1@huawei.com
    Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: Zi Yan <ziy@nvidia.com>
    Reviewed-by: Shakeel Butt <shakeelb@google.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Hanjun Guo <guohanjun@huawei.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Rui Xiang <rui.xiang@huawei.com>
    Cc: Tianhong Ding <dingtianhong@huawei.com>
    Cc: Weilong Chen <chenweilong@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Zhou Guanghui authored and torvalds committed Mar 13, 2021
  10. mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and …

    …add nr_pages argument
    
    Rename mem_cgroup_split_huge_fixup to split_page_memcg and explicitly pass
    in page number argument.
    
    In this way, the interface name is more generic and can be used by
    potential users.  In addition, the complete memcg info (memcg and flag)
    needs to be set on the tail pages.
    
    Link: https://lkml.kernel.org/r/20210304074053.65527-2-zhouguanghui1@huawei.com
    Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: Zi Yan <ziy@nvidia.com>
    Reviewed-by: Shakeel Butt <shakeelb@google.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Hanjun Guo <guohanjun@huawei.com>
    Cc: Tianhong Ding <dingtianhong@huawei.com>
    Cc: Weilong Chen <chenweilong@huawei.com>
    Cc: Rui Xiang <rui.xiang@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Zhou Guanghui authored and torvalds committed Mar 13, 2021
  11. ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign

    In https://bugs.gentoo.org/769614 Dmitry noticed that
    `ptrace(PTRACE_GET_SYSCALL_INFO)` does not return error sign properly.
    
    The bug is a mismatch between the get and set error paths:
    
    static inline long syscall_get_error(struct task_struct *task,
                                         struct pt_regs *regs)
    {
            return regs->r10 == -1 ? regs->r8:0;
    }
    
    static inline long syscall_get_return_value(struct task_struct *task,
                                                struct pt_regs *regs)
    {
            return regs->r8;
    }
    
    static inline void syscall_set_return_value(struct task_struct *task,
                                                struct pt_regs *regs,
                                                int error, long val)
    {
            if (error) {
                    /* error < 0, but ia64 uses > 0 return value */
                    regs->r8 = -error;
                    regs->r10 = -1;
            } else {
                    regs->r8 = val;
                    regs->r10 = 0;
            }
    }
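
The mismatch in the functions quoted above can be reproduced in a standalone sketch. `struct regs` is a reduced stand-in for pt_regs, and `get_error_negated()` is only an assumed shape of a consistent getter, not a quotation of the applied patch.

```c
struct regs { long r8; long r10; };

/* Setter as quoted above: a negative errno is stored negated (positive). */
static void set_return_value(struct regs *regs, int error, long val)
{
    if (error) {
        regs->r8 = -error;
        regs->r10 = -1;
    } else {
        regs->r8 = val;
        regs->r10 = 0;
    }
}

/* Getter as quoted above: returns the stored positive value unchanged. */
static long get_error_as_quoted(const struct regs *regs)
{
    return regs->r10 == -1 ? regs->r8 : 0;
}

/* Getter that undoes the setter's negation (assumption for illustration). */
static long get_error_negated(const struct regs *regs)
{
    return regs->r10 == -1 ? -regs->r8 : 0;
}
```

Setting error -2 and reading it back yields 2 with the quoted getter and -2 with the negating one.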
    
    Tested on v5.10 on rx3600 machine (ia64 9040 CPU).
    
    Link: https://lkml.kernel.org/r/20210221002554.333076-2-slyfox@gentoo.org
    Link: https://bugs.gentoo.org/769614
    Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
    Reported-by: Dmitry V. Levin <ldv@altlinux.org>
    Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
    Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Sergei Trofimovich authored and torvalds committed Mar 13, 2021
  12. ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls

    In https://bugs.gentoo.org/769614 Dmitry noticed that
    `ptrace(PTRACE_GET_SYSCALL_INFO)` does not work for syscalls called via
    glibc's syscall() wrapper.
    
    ia64 has two ways to call syscalls from userspace: via `break` and via
    `eps` instructions.
    
    The difference is in stack layout:
    
    1. `eps` creates simple stack frame: no locals, in{0..7} == out{0..8}
    2. `break` uses userspace stack frame: may be locals (glibc provides
       one), in{0..7} == out{0..8}.
    
    Both work fine in syscall handling code itself.
    
    But `ptrace(PTRACE_GET_SYSCALL_INFO)` uses unwind mechanism to
    re-extract syscall arguments but it does not account for locals.
    
    The change always skips locals registers. It should not change `eps`
    path as kernel's handler already enforces locals=0 and fixes `break`.
    
    Tested on v5.10 on rx3600 machine (ia64 9040 CPU).
    
    Link: https://lkml.kernel.org/r/20210221002554.333076-1-slyfox@gentoo.org
    Link: https://bugs.gentoo.org/769614
    Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
    Reported-by: Dmitry V. Levin <ldv@altlinux.org>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Sergei Trofimovich authored and torvalds committed Mar 13, 2021
  13. mm/userfaultfd: fix memory corruption due to writeprotect

    Userfaultfd self-test fails occasionally, indicating a memory corruption.
    
    Analyzing this problem indicates that there is a real bug since mmap_lock
    is only taken for read in mwriteprotect_range() and defers flushes, and
    since there is insufficient consideration of concurrent deferred TLB
    flushes in wp_page_copy().  Although the PTE is flushed from the TLBs in
    wp_page_copy(), this flush takes place after the copy has already been
    performed, and therefore changes of the page are possible between the time
    of the copy and the time in which the PTE is flushed.
    
    To make matters worse, memory-unprotection using userfaultfd also poses a
    problem.  Although memory unprotection is logically a promotion of PTE
    permissions, and therefore should not require a TLB flush, the current
    userfaultfd code might actually cause a demotion of the architectural PTE
    permission: when userfaultfd_writeprotect() unprotects a memory region, it
    unintentionally *clears* the RW-bit if it was already set.  Note that
    unprotecting a PTE that is not write-protected is a valid use-case: the
    userfaultfd monitor might ask to unprotect a region that holds both
    write-protected and write-unprotected PTEs.
    
    The scenario that happens in selftests/vm/userfaultfd is as follows:
    
    cpu0				cpu1			cpu2
    ----				----			----
    							[ Writable PTE
    							  cached in TLB ]
    userfaultfd_writeprotect()
    [ write-*unprotect* ]
    mwriteprotect_range()
    mmap_read_lock()
    change_protection()
    
    change_protection_range()
    ...
    change_pte_range()
    [ *clear* “write”-bit ]
    [ defer TLB flushes ]
    				[ page-fault ]
    				...
    				wp_page_copy()
    				 cow_user_page()
    				  [ copy page ]
    							[ write to old
    							  page ]
    				...
    				 set_pte_at_notify()
    
    A similar scenario can happen:
    
    cpu0		cpu1		cpu2		cpu3
    ----		----		----		----
    						[ Writable PTE
    				  		  cached in TLB ]
    userfaultfd_writeprotect()
    [ write-protect ]
    [ deferred TLB flush ]
    		userfaultfd_writeprotect()
    		[ write-unprotect ]
    		[ deferred TLB flush]
    				[ page-fault ]
    				wp_page_copy()
    				 cow_user_page()
    				 [ copy page ]
    				 ...		[ write to page ]
    				set_pte_at_notify()
    
    This race exists since commit 292924b ("userfaultfd: wp: apply
    _PAGE_UFFD_WP bit").  Yet, as Yu Zhao pointed out, these races became apparent
    since commit 09854ba ("mm: do_wp_page() simplification") which made
    wp_page_copy() more likely to take place, specifically if page_count(page)
    > 1.
    
    To resolve the aforementioned races, check whether there are pending
    flushes on uffd-write-protected VMAs, and if there are, perform a flush
    before doing the COW.
    
    Further optimizations will follow to avoid unnecessary PTE
    write-protection and TLB flushes during uffd-write-unprotect.
    
    Link: https://lkml.kernel.org/r/20210304095423.3825684-1-namit@vmware.com
    Fixes: 09854ba ("mm: do_wp_page() simplification")
    Signed-off-by: Nadav Amit <namit@vmware.com>
    Suggested-by: Yu Zhao <yuzhao@google.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>
    Tested-by: Peter Xu <peterx@redhat.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Pavel Emelyanov <xemul@openvz.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: <stable@vger.kernel.org>	[5.9+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Nadav Amit authored and torvalds committed Mar 13, 2021
  14. kasan: fix KASAN_STACK dependency for HW_TAGS

    There's a runtime failure when running HW_TAGS-enabled kernel built with
    GCC on hardware that doesn't support MTE.  GCC-built kernels always have
    CONFIG_KASAN_STACK enabled, even though stack instrumentation isn't
    supported by HW_TAGS.  Having that config enabled causes KASAN to issue
    MTE-only instructions to unpoison kernel stacks, which causes the failure.
    
    Fix the issue by disallowing CONFIG_KASAN_STACK when HW_TAGS is used.
    
    (The commit that introduced CONFIG_KASAN_HW_TAGS specified proper
     dependency for CONFIG_KASAN_STACK_ENABLE but not for CONFIG_KASAN_STACK.)
    
    Link: https://lkml.kernel.org/r/59e75426241dbb5611277758c8d4d6f5f9298dac.1615215441.git.andreyknvl@google.com
    Fixes: 6a63a63 ("kasan: introduce CONFIG_KASAN_HW_TAGS")
    Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
    Reported-by: Catalin Marinas <catalin.marinas@arm.com>
    Cc: <stable@vger.kernel.org>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Marco Elver <elver@google.com>
    Cc: Peter Collingbourne <pcc@google.com>
    Cc: Evgenii Stepanov <eugenis@google.com>
    Cc: Branislav Rankov <Branislav.Rankov@arm.com>
    Cc: Kevin Brodsky <kevin.brodsky@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    xairy authored and torvalds committed Mar 13, 2021
  15. kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC

    Currently, kasan_free_nondeferred_pages()->kasan_free_pages() is called
    after debug_pagealloc_unmap_pages(). This causes a crash when
    debug_pagealloc is enabled, as HW_TAGS KASAN can't set tags on an
    unmapped page.
    
    This patch puts kasan_free_nondeferred_pages() before
    debug_pagealloc_unmap_pages() and arch_free_page(), which can also make
    the page unavailable.
    
    Link: https://lkml.kernel.org/r/24cd7db274090f0e5bc3adcdc7399243668e3171.1614987311.git.andreyknvl@google.com
    Fixes: 94ab5b6 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
    Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Marco Elver <elver@google.com>
    Cc: Peter Collingbourne <pcc@google.com>
    Cc: Evgenii Stepanov <eugenis@google.com>
    Cc: Branislav Rankov <Branislav.Rankov@arm.com>
    Cc: Kevin Brodsky <kevin.brodsky@arm.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    xairy authored and torvalds committed Mar 13, 2021
  16. mm/madvise: replace ptrace attach requirement for process_madvise

    process_madvise currently requires ptrace attach capability.
    PTRACE_MODE_ATTACH gives one process complete control over another
    process.  It effectively removes the security boundary between the two
    processes (in one direction).  Granting ptrace attach capability even to a
    system process is considered dangerous since it creates an attack surface.
    This severely limits the usage of this API.
    
    The operations process_madvise can perform do not affect the correctness
    of the operation of the target process; they only affect where the data is
    physically located (and therefore, how fast it can be accessed).  What we
    want is the ability for one process to influence another process in order
    to optimize performance across the entire system while leaving the
    security boundary intact.
    
    Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ and
    CAP_SYS_NICE.  PTRACE_MODE_READ prevents leaking ASLR metadata, and
    CAP_SYS_NICE gates influencing process performance.
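    
    The combined check can be modelled in userspace roughly as follows.
    This is only a sketch: `struct caller` and
    `process_madvise_may_access()` are hypothetical names, the two booleans
    stand in for the kernel's ptrace-read access check and capability
    check, and the error codes are an assumption about what each failed
    check reports.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of what the calling process is allowed to do. */
struct caller {
    bool ptrace_read_ok; /* would pass a PTRACE_MODE_READ access check */
    bool cap_sys_nice;   /* holds CAP_SYS_NICE */
};

/*
 * Both conditions must hold: read-level ptrace access to the target
 * and CAP_SYS_NICE. PTRACE_MODE_ATTACH is not required.
 * Returns 0 on success or a negative errno value (assumed codes).
 */
static int process_madvise_may_access(const struct caller *c)
{
    if (!c->ptrace_read_ok)
        return -EACCES;
    if (!c->cap_sys_nice)
        return -EPERM;
    return 0;
}
```

    A caller holding only one of the two permissions is still rejected,
    which keeps the security boundary between the processes intact.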
    
    Link: https://lkml.kernel.org/r/20210303185807.2160264-1-surenb@google.com
    Signed-off-by: Suren Baghdasaryan <surenb@google.com>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Acked-by: David Rientjes <rientjes@google.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jeff Vander Stoep <jeffv@google.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Tim Murray <timmurray@google.com>
    Cc: Florian Weimer <fweimer@redhat.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: James Morris <jmorris@namei.org>
    Cc: <stable@vger.kernel.org>	[5.10+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    surenbaghdasaryan authored and torvalds committed Mar 13, 2021
  17. include/linux/sched/mm.h: use rcu_dereference in in_vfork()

    Fix a sparse warning by using rcu_dereference().  Technically this is a
    bug and a sufficiently aggressive compiler could reload the `real_parent'
    pointer outside the protection of the rcu lock (and access freed memory),
    but I think it's pretty unlikely to happen.
    
    Link: https://lkml.kernel.org/r/20210221194207.1351703-1-willy@infradead.org
    Fixes: b18dc5f ("mm, oom: skip vforked tasks from being selected")
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Matthew Wilcox (Oracle) authored and torvalds committed Mar 13, 2021
  18. kfence: fix reports if constant function prefixes exist

    Some architectures prefix all functions with a constant string ('.' on
    ppc64).  Add ARCH_FUNC_PREFIX, which may optionally be defined in
    <asm/kfence.h>, so that get_stack_skipnr() can work properly.
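    
    A minimal userspace sketch of the idea, assuming the prefix is applied
    by string-literal concatenation at compile time. The helpers
    `starts_with()` and `is_kfence_frame()` are hypothetical, and the
    default of "." below only mimics ppc64 for illustration (an
    architecture without a prefix would use an empty string).

```c
#include <stdbool.h>
#include <string.h>

/* Assumption for this sketch: pretend we build for a '.'-prefix arch. */
#ifndef ARCH_FUNC_PREFIX
#define ARCH_FUNC_PREFIX "."
#endif

/* Return true if string s begins with prefix. */
static bool starts_with(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Decide whether a symbolized stack frame belongs to KFENCE itself,
 * so that a report could skip it. The architecture prefix is glued
 * onto the expected name via adjacent string literals.
 */
static bool is_kfence_frame(const char *symbol)
{
    return starts_with(symbol, ARCH_FUNC_PREFIX "kfence_");
}
```

    With the prefix defined as ".", the symbol ".kfence_alloc" matches
    while a bare "kfence_alloc" does not, which is why a matcher without
    the prefix fails on such architectures.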
    
    Link: https://lkml.kernel.org/r/f036c53d-7e81-763c-47f4-6024c6c5f058@csgroup.eu
    Link: https://lkml.kernel.org/r/20210304144000.1148590-1-elver@google.com
    Signed-off-by: Marco Elver <elver@google.com>
    Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Andrey Konovalov <andreyknvl@google.com>
    Cc: Jann Horn <jannh@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    melver authored and torvalds committed Mar 13, 2021
  19. kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations

    cache_alloc_debugcheck_after() performs checks on an object, including
    adjusting the returned pointer.  None of this should apply to KFENCE
    objects.  While for non-bulk allocations, the checks are skipped when we
    allocate via KFENCE, for bulk allocations cache_alloc_debugcheck_after()
    is called via cache_alloc_debugcheck_after_bulk().
    
    Fix it by skipping cache_alloc_debugcheck_after() for KFENCE objects.
    
    Link: https://lkml.kernel.org/r/20210304205256.2162309-1-elver@google.com
    Signed-off-by: Marco Elver <elver@google.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Andrey Konovalov <andreyknvl@google.com>
    Cc: Jann Horn <jannh@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    melver authored and torvalds committed Mar 13, 2021