Commits on Sep 13, 2021

  1. sched: Provide Kconfig support for default dynamic preempt mode

    Currently the boot defined preempt behaviour (aka dynamic preempt)
    selects full preemption by default when the "preempt=" boot parameter
    is omitted. However distros may rather want to default to either
    no preemption or voluntary preemption.
    
    To provide this flexibility, make dynamic preemption a visible
    Kconfig option and adapt the preemption behaviour selected by the user
    to either static or dynamic preemption.
    
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Frederic Weisbecker authored and intel-lab-lkp committed Sep 13, 2021

Commits on Sep 9, 2021

  1. Merge branch 'x86/urgent'

    Thomas Gleixner committed Sep 9, 2021
  2. Merge branch 'timers/urgent'

    Thomas Gleixner committed Sep 9, 2021
  3. Merge branch 'smp/urgent'

    Thomas Gleixner committed Sep 9, 2021
  4. Merge branch 'sched/urgent'

    Thomas Gleixner committed Sep 9, 2021
  5. Merge branch 'locking/urgent'

    Thomas Gleixner committed Sep 9, 2021
  6. sched: Prevent balance_push() on remote runqueues

    sched_setscheduler() and rt_mutex_setprio() invoke the run-queue balance
    callback after changing priorities or the scheduling class of a task. The
    run-queue for which the callback is invoked can be local or remote.
    
    That's not a problem for the regular rq::push_work, which is serialized with
    a busy flag in the run-queue struct. For the balance_push() work, however,
    which is only valid when invoked on the outgoing CPU, that's wrong. It not only
    triggers the debug warning, but also leaves the per-CPU variable push_work
    unprotected, which can result in double enqueues on the stop machine list.
    
    Remove the warning and validate that the function is invoked on the
    outgoing CPU.
    
    Fixes: ae79270 ("sched: Optimize finish_lock_switch()")
    Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/87zgt1hdw7.ffs@tglx
    Thomas Gleixner authored and Peter Zijlstra committed Sep 9, 2021
  7. sched/idle: Make the idle timer expire in hard interrupt context

    The intel powerclamp driver sets up a per-CPU worker with RT
    priority. The worker then invokes play_idle(), in which it remains in
    the idle poll loop until it is stopped by the timer it started earlier.
    
    That timer needs to expire in hard interrupt context on PREEMPT_RT.
    Otherwise the timer will expire in ksoftirqd as a SOFT timer but that task
    won't be scheduled on the CPU because its priority is lower than the
    priority of the worker which is in the idle loop.
    
    Always expire the idle timer in hard interrupt context.
    
    Reported-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/r/20210906113034.jgfxrjdvxnjqgtmc@linutronix.de
    Sebastian Andrzej Siewior authored and Thomas Gleixner committed Sep 9, 2021
  8. locking/rtmutex: Fix ww_mutex deadlock check

    Dan reported that rt_mutex_adjust_prio_chain() can be called with
    .orig_waiter == NULL; however, commit a055fcc ("locking/rtmutex: Return
    success on deadlock for ww_mutex waiters") unconditionally dereferences it.
    
    Since both call-sites that have .orig_waiter == NULL don't care for the
    return value, simply disable the deadlock squash by adding the NULL check.
    
    Notably, both callers use the deadlock condition as a termination condition
    for the iteration; once detected, it is sure that (de)boosting is done.
    Arguably step [3] would be a more natural termination point, but it's
    dubious whether adding a third deadlock detection state would improve the
    code.
    
    Fixes: a055fcc ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lore.kernel.org/r/YS9La56fHMiCCo75@hirez.programming.kicks-ass.net
    Peter Zijlstra authored and Thomas Gleixner committed Sep 9, 2021
  9. Merge branches 'akpm' and 'akpm-hotfixes' (patches from Andrew)

    Merge yet more updates and hotfixes from Andrew Morton:
     "Post-linux-next material, based upon latest upstream to catch the
      now-merged dependencies:
    
       - 10 patches.
    
         Subsystems affected by this patch series: mm (vmstat and migration)
         and compat.
    
      And a bunch of hotfixes, mostly cc:stable:
    
       - 8 patches.
    
         Subsystems affected by this patch series: mm (hmm, hugetlb, vmscan,
         pagealloc, pagemap, kmemleak, mempolicy, and memblock)"
    
    * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
      arch: remove compat_alloc_user_space
      compat: remove some compat entry points
      mm: simplify compat numa syscalls
      mm: simplify compat_sys_move_pages
      kexec: avoid compat_alloc_user_space
      kexec: move locking into do_kexec_load
      mm: migrate: change to use bool type for 'page_was_mapped'
      mm: migrate: fix the incorrect function name in comments
      mm: migrate: introduce a local variable to get the number of pages
      mm/vmstat: protect per cpu variables with preempt disable on RT
    
    * emailed hotfixes from Andrew Morton <akpm@linux-foundation.org>:
      nds32/setup: remove unused memblock_region variable in setup_memory()
      mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task
      mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp
      mmap_lock: change trace and locking order
      mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype
      mm,vmscan: fix divide by zero in get_scan_count
      mm/hugetlb: initialize hugetlb_usage in mm_init
      mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled
    torvalds committed Sep 9, 2021
  10. nds32/setup: remove unused memblock_region variable in setup_memory()

    kernel test robot reports unused variable warning:
    
       arch/nds32/kernel/setup.c:247:26: warning: Unused variable: region
       [unusedVariable]
        struct memblock_region *region;
                                ^
    
    Remove the unused variable.
    
    Link: https://lkml.kernel.org/r/20210712125218.28951-1-rppt@kernel.org
    Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
    Reported-by: kernel test robot <lkp@intel.com>
    Reviewed-by: Guenter Roeck <linux@roeck-us.net>
    Tested-by: Guenter Roeck <linux@roeck-us.net>
    Cc: Greentime Hu <green.hu@gmail.com>
    Cc: Nick Hu <nickhu@andestech.com>
    Cc: Vincent Chen <deanbo422@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    rppt authored and torvalds committed Sep 9, 2021
  11. mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task

    Servers hit the panic below:
    
      Kernel version:5.4.56
      BUG: unable to handle page fault for address: 0000000000002c48
      RIP: 0010:__next_zones_zonelist+0x1d/0x40
      Call Trace:
        __alloc_pages_nodemask+0x277/0x310
        alloc_page_interleave+0x13/0x70
        handle_mm_fault+0xf99/0x1390
        __do_page_fault+0x288/0x500
        do_page_fault+0x30/0x110
        page_fault+0x3e/0x50
    
    The reason for the panic is that MAX_NUMNODES is passed as the third
    parameter (preferred_nid) of __alloc_pages_nodemask().  The access to
    zonelist->zoneref->zone_idx in __next_zones_zonelist() then causes a panic.
    
    In offset_il_node(), first_node() returns nid from pol->v.nodes; after
    this, other threads may change pol->v.nodes before next_node() runs.  This
    race condition lets next_node() return MAX_NUMNODES.  So put pol->nodes in
    a local variable.
    
    The race condition is between offset_il_node and cpuset_change_task_nodemask:
    
      CPU0:                                     CPU1:
      alloc_pages_vma()
        interleave_nid(pol,)
          offset_il_node(pol,)
            first_node(pol->v.nodes)            cpuset_change_task_nodemask
                            //nodes==0xc          mpol_rebind_task
                                                    mpol_rebind_policy
                                                      mpol_rebind_nodemask(pol,nodes)
                            //nodes==0x3
            next_node(nid, pol->v.nodes)//return MAX_NUMNODES
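
The interleaving above can be reproduced in a small userspace model. The sketch below is Python rather than kernel C. The bitmask helpers, the Policy class, the concurrent_rebind hook and MAX_NUMNODES = 64 are all assumptions made for illustration, and the real offset_il_node() differs in detail (for example, it handles an empty mask separately).

```python
MAX_NUMNODES = 64  # assumed limit for this model only


def first_node(mask):
    """Index of the lowest set bit, or MAX_NUMNODES if the mask is empty."""
    return (mask & -mask).bit_length() - 1 if mask else MAX_NUMNODES


def next_node(nid, mask):
    """Index of the lowest set bit above nid, or MAX_NUMNODES if none."""
    higher = mask >> (nid + 1)
    return nid + 1 + first_node(higher) if higher else MAX_NUMNODES


class Policy:
    """Hypothetical stand-in for a mempolicy holding a node bitmask."""

    def __init__(self, nodes):
        self.nodes = nodes


def offset_il_node_racy(pol, n, concurrent_rebind=None):
    """Re-reads pol.nodes on every step, like the code before the fix."""
    target = n % bin(pol.nodes).count("1")
    nid = first_node(pol.nodes)
    if concurrent_rebind:
        concurrent_rebind(pol)  # models CPU1 rebinding the policy right here
    for _ in range(target):
        nid = next_node(nid, pol.nodes)
    return nid


def offset_il_node_snapshot(pol, n, concurrent_rebind=None):
    """Reads pol.nodes once into a local variable and only uses the copy."""
    nodes = pol.nodes
    target = n % bin(nodes).count("1")
    nid = first_node(nodes)
    if concurrent_rebind:
        concurrent_rebind(pol)  # the rebind no longer affects this walk
    for _ in range(target):
        nid = next_node(nid, nodes)
    return nid


def rebind_to_0x3(pol):
    """Models mpol_rebind_nodemask() switching the mask from 0xc to 0x3."""
    pol.nodes = 0x3
```

With the mask starting at 0xc and being rebound to 0x3 between the two lookups, the racy variant walks off the end of the new mask, while the snapshot variant keeps walking the mask it started with.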
    
    Link: https://lkml.kernel.org/r/20210906034658.48721-1-yanghui.def@bytedance.com
    Signed-off-by: yanghui <yanghui.def@bytedance.com>
    Reviewed-by: Muchun Song <songmuchun@bytedance.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    yanghui authored and torvalds committed Sep 9, 2021
  12. mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp

    In a memory pressure situation, I'm seeing the lockdep WARNING below.
    Actually, this is similar to a known false positive which is already
    addressed by commit 6dcde60 ("xfs: more lockdep whackamole with
    kmem_alloc*").
    
    This warning still persists because it's not from kmalloc() itself but
    from an allocation for the kmemleak object.  While kmalloc() itself suppresses
    the warning with __GFP_NOLOCKDEP, gfp_kmemleak_mask() is dropping the
    flag for the kmemleak's allocation.
    
    Allow __GFP_NOLOCKDEP to be passed to kmemleak's allocation, so that the
    warning for it is also suppressed.
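
The effect of the mask can be illustrated with a toy model. This is a sketch under assumptions: the flag bit values below are made up, the names drop the kernel's leading underscores, and the shape of the mask (keep a fixed set of caller flags, then force a few others on) is assumed rather than quoted from the kernel source.

```python
# Hypothetical bit values; the real ones are defined in the kernel's gfp headers.
GFP_KERNEL = 0x01
GFP_ATOMIC = 0x02
GFP_NORETRY = 0x04
GFP_NOMEMALLOC = 0x08
GFP_NOWARN = 0x10
GFP_NOLOCKDEP = 0x20

# Flags the model always forces on for kmemleak's own allocation.
FORCED = GFP_NORETRY | GFP_NOMEMALLOC | GFP_NOWARN


def kmemleak_mask_before(gfp):
    """Keeps only GFP_KERNEL/GFP_ATOMIC from the caller, so NOLOCKDEP is lost."""
    return (gfp & (GFP_KERNEL | GFP_ATOMIC)) | FORCED


def kmemleak_mask_after(gfp):
    """Also lets the caller's NOLOCKDEP bit through, if it was set."""
    return (gfp & (GFP_KERNEL | GFP_ATOMIC | GFP_NOLOCKDEP)) | FORCED
```

Note that the second variant only preserves the flag when the caller passed it; it does not add it to allocations that never asked for it.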
    
      ======================================================
      WARNING: possible circular locking dependency detected
      5.14.0-rc7-BTRFS-ZNS+ #37 Not tainted
      ------------------------------------------------------
      kswapd0/288 is trying to acquire lock:
      ffff88825ab45df0 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0x8a/0x250
    
      but task is already holding lock:
      ffffffff848cc1e0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
    
      which lock already depends on the new lock.
    
      the existing dependency chain (in reverse order) is:
    
      -> #1 (fs_reclaim){+.+.}-{0:0}:
             fs_reclaim_acquire+0x112/0x160
             kmem_cache_alloc+0x48/0x400
             create_object.isra.0+0x42/0xb10
             kmemleak_alloc+0x48/0x80
             __kmalloc+0x228/0x440
             kmem_alloc+0xd3/0x2b0
             kmem_alloc_large+0x5a/0x1c0
             xfs_attr_copy_value+0x112/0x190
             xfs_attr_shortform_getvalue+0x1fc/0x300
             xfs_attr_get_ilocked+0x125/0x170
             xfs_attr_get+0x329/0x450
             xfs_get_acl+0x18d/0x430
             get_acl.part.0+0xb6/0x1e0
             posix_acl_xattr_get+0x13a/0x230
             vfs_getxattr+0x21d/0x270
             getxattr+0x126/0x310
             __x64_sys_fgetxattr+0x1a6/0x2a0
             do_syscall_64+0x3b/0x90
             entry_SYSCALL_64_after_hwframe+0x44/0xae
    
      -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
             __lock_acquire+0x2c0f/0x5a00
             lock_acquire+0x1a1/0x4b0
             down_read_nested+0x50/0x90
             xfs_ilock+0x8a/0x250
             xfs_can_free_eofblocks+0x34f/0x570
             xfs_inactive+0x411/0x520
             xfs_fs_destroy_inode+0x2c8/0x710
             destroy_inode+0xc5/0x1a0
             evict+0x444/0x620
             dispose_list+0xfe/0x1c0
             prune_icache_sb+0xdc/0x160
             super_cache_scan+0x31e/0x510
             do_shrink_slab+0x337/0x8e0
             shrink_slab+0x362/0x5c0
             shrink_node+0x7a7/0x1a40
             balance_pgdat+0x64e/0xfe0
             kswapd+0x590/0xa80
             kthread+0x38c/0x460
             ret_from_fork+0x22/0x30
    
      other info that might help us debug this:
       Possible unsafe locking scenario:
             CPU0                    CPU1
             ----                    ----
        lock(fs_reclaim);
                                     lock(&xfs_nondir_ilock_class);
                                     lock(fs_reclaim);
        lock(&xfs_nondir_ilock_class);
    
       *** DEADLOCK ***
      3 locks held by kswapd0/288:
       #0: ffffffff848cc1e0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
       #1: ffffffff848a08d8 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x269/0x5c0
       #2: ffff8881a7a820e8 (&type->s_umount_key#60){++++}-{3:3}, at: super_cache_scan+0x5a/0x510
    
    Link: https://lkml.kernel.org/r/20210907055659.3182992-1-naohiro.aota@wdc.com
    Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Cc: "Darrick J . Wong" <djwong@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    naota authored and torvalds committed Sep 9, 2021
  13. mmap_lock: change trace and locking order

    Print to the trace log before releasing the lock to avoid racing with
    other trace log printers of the same lock type.
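
The ordering change can be pictured with a minimal userspace sketch. It is Python, not the kernel's tracing code; trace_log, emit() and both function names are hypothetical, and a threading.Lock stands in for the mmap lock.

```python
import threading

trace_log = []  # stands in for the kernel trace buffer


def emit(event, lock):
    """Record the event together with whether the lock was still held."""
    trace_log.append((event, lock.locked()))


def release_then_trace(lock):
    """Old order: once the lock is dropped, another holder may log first."""
    lock.release()
    emit("released", lock)


def trace_then_release(lock):
    """New order: the event is logged while the lock is still held."""
    emit("released", lock)
    lock.release()
```

In the second variant the log entry is written inside the critical section, so entries for the same lock cannot be reordered by a thread that acquires it immediately afterwards.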
    
    Link: https://lkml.kernel.org/r/20210903022041.1843024-1-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Michel Lespinasse <walken.cr@gmail.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    howlett authored and torvalds committed Sep 9, 2021
  14. mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype

    If a page is not prepared for freeing as an unref page, its pcp
    migratetype is unset.  We then get garbage from get_pcppage_migratetype()
    and might list_del(&page->lru) again after the page has already been deleted
    from the list, leading to complaints about data corruption.
    
    Link: https://lkml.kernel.org/r/20210902115447.57050-1-linmiaohe@huawei.com
    Fixes: df1acc8 ("mm/page_alloc: avoid conflating IRQs disabled with zone->lock")
    Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    MiaoheLin authored and torvalds committed Sep 9, 2021
  15. mm,vmscan: fix divide by zero in get_scan_count

    Commit f56ce41 ("mm: memcontrol: fix occasional OOMs due to
    proportional memory.low reclaim") introduced a divide by zero corner
    case when oomd is being used in combination with cgroup memory.low
    protection.
    
    When oomd decides to kill a cgroup, it will force the cgroup memory to
    be reclaimed after killing the tasks, by writing to the memory.max file
    for that cgroup, forcing the remaining page cache and reclaimable slab
    to be reclaimed down to zero.
    
    Previously, on cgroups with some memory.low protection that would result
    in the memory being reclaimed down to the memory.low limit, or likely
    not at all, having the page cache reclaimed asynchronously later.
    
    With f56ce41 the oomd write to memory.max tries to reclaim all the
    way down to zero, which may race with another reclaimer, to the point of
    ending up with the divide by zero below.
    
    This patch implements the obvious fix.
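
The message does not spell the fix out. As a rough sketch, assume the scan target is scaled by protection / cgroup_size and that the fix is to keep the divisor non-zero by adding one to it. Both the formula and the guard are assumptions for illustration, and the function and parameter names below are hypothetical.

```python
def scan_target_unguarded(lruvec_size, protection, cgroup_size):
    """Proportional scan target; divides by zero when both sizes are 0."""
    cgroup_size = max(cgroup_size, protection)
    return lruvec_size - lruvec_size * protection // cgroup_size


def scan_target_guarded(lruvec_size, protection, cgroup_size):
    """Same formula with the divisor bumped by one so it can never be 0."""
    cgroup_size = max(cgroup_size, protection)
    return lruvec_size - lruvec_size * protection // (cgroup_size + 1)
```

If a racing reclaimer has already emptied the cgroup and no protection applies, the unguarded variant raises ZeroDivisionError in this model, while the guarded one returns the full lruvec size. The guard shifts the result slightly for non-empty cgroups, which is harmless for a heuristic scan target.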
    
    Link: https://lkml.kernel.org/r/20210826220149.058089c6@imladris.surriel.com
    Fixes: f56ce41 ("mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim")
    Signed-off-by: Rik van Riel <riel@surriel.com>
    Acked-by: Roman Gushchin <guro@fb.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Chris Down <chris@chrisdown.name>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    rikvanriel authored and torvalds committed Sep 9, 2021
  16. mm/hugetlb: initialize hugetlb_usage in mm_init

    After fork, the child process will get incorrect (2x) hugetlb_usage.  If
    a process uses 5 2MB hugetlb pages in an anonymous mapping,
    
    	HugetlbPages:	   10240 kB
    
    and then forks, the child will show,
    
    	HugetlbPages:	   20480 kB
    
    The amount is doubled because hugetlb_usage is copied from the parent
    and then increased again when the page tables are copied from parent
    to child.  The child ends up with 2x the actual usage.
    
    Fix this by adding hugetlb_count_init in mm_init.
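
The double counting can be reproduced with a toy model of fork. This is a Python sketch, not kernel code: the MM class, dup_mm() and its reset_counter switch are hypothetical, and the only figures taken from the description above are the 2 MB page size and the five mapped pages.

```python
HPAGE_KB = 2048  # one 2 MB hugetlb page, in kB


class MM:
    """Hypothetical stand-in for mm_struct with a hugetlb usage counter."""

    def __init__(self, hugetlb_pages=0):
        self.hugetlb_usage = hugetlb_pages     # what /proc/PID/status reports
        self.mapped_hugepages = hugetlb_pages  # what the page tables really map


def dup_mm(parent, reset_counter):
    """Model of fork: copy the struct, optionally reset, then copy page tables."""
    child = MM()
    child.hugetlb_usage = parent.hugetlb_usage  # struct copied from the parent
    if reset_counter:
        child.hugetlb_usage = 0  # models hugetlb_count_init() in mm_init()
    for _ in range(parent.mapped_hugepages):
        child.mapped_hugepages += 1
        child.hugetlb_usage += 1  # each copied mapping is counted again
    return child


def hugetlb_kb(mm):
    """Value that would be shown in the HugetlbPages field, in kB."""
    return mm.hugetlb_usage * HPAGE_KB
```

With a parent mapping five pages, the model reports 10240 kB for the parent, 20480 kB for a child created without the reset, and 10240 kB for a child created with it.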
    
    Link: https://lkml.kernel.org/r/20210826071742.877-1-liuzixian4@huawei.com
    Fixes: 5d317b2 ("mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status")
    Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
    Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Liu Zixian authored and torvalds committed Sep 9, 2021
  17. mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled

    Previously, we noticed that one rpma example had been failing [1] since commit
    36f30e4 ("IB/core: Improve ODP to use hmm_range_fault()"), where it
    uses the ODP feature to do RDMA WRITE between fsdax files.
    
    After digging into the code, we found that hmm_vma_handle_pte() still
    returns EFAULT even though all of its requested flags have been
    fulfilled.  That's because a DAX page is marked as (_PAGE_SPECIAL |
    PAGE_DEVMAP) by pte_mkdevmap().
    
    Link: pmem/rpma#1142 [1]
    Link: https://lkml.kernel.org/r/20210830094232.203029-1-lizhijian@cn.fujitsu.com
    Fixes: 4055062 ("mm/hmm: add missing call to hmm_pte_need_fault in HMM_PFN_SPECIAL handling")
    Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    zhijianli88 authored and torvalds committed Sep 9, 2021

Commits on Sep 8, 2021

  1. Merge tag 'tag-chrome-platform-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux

    Pull chrome platform updates from Benson Leung:
     "cros_ec_typec:
    
       - make the cros_ec_typec driver use the pre-existing
         cros_ec_check_features() function
    
      sensorhub:
    
       - add trace events for sample
    
      misc:
    
       - cros_ec_proto - re-send commands in the event of a timeout (for the
         FPMCU)
    
       - fix warnings in cros_ec_trace related to format output"
    
    * tag 'tag-chrome-platform-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
      platform/chrome: cros_ec_trace: Fix format warnings
      platform/chrome: cros_ec_typec: Use existing feature check
      platform/chrome: cros_ec_proto: Send command again when timeout occurs
      platform/chrome: sensorhub: Add trace events for sample
    torvalds committed Sep 8, 2021
  2. Merge tag 'pm-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

    Pull more power management updates from Rafael Wysocki:
     "These are mostly ARM cpufreq driver updates, including one new
      MediaTek driver that has just passed all of the reviews, with the
      addition of a revert of a recent intel_pstate commit, some core
      cpufreq changes and a DT-related update of the operating performance
      points (OPP) support code.
    
      Specifics:
    
       - Add new cpufreq driver for the MediaTek MT6779 platform called
         mediatek-hw along with corresponding DT bindings (Hector.Yuan).
    
       - Add DCVS interrupt support to the qcom-cpufreq-hw driver (Thara
         Gopinath).
    
       - Make the qcom-cpufreq-hw driver set the dvfs_possible_from_any_cpu
         policy flag (Taniya Das).
    
       - Blocklist more Qualcomm platforms in cpufreq-dt-platdev (Bjorn
         Andersson).
    
       - Make the vexpress cpufreq driver set the CPUFREQ_IS_COOLING_DEV
         flag (Viresh Kumar).
    
       - Add new cpufreq driver callback to allow drivers to register with
         the Energy Model in a consistent way and make several drivers use
         it (Viresh Kumar).
    
       - Change the remaining users of the .ready() cpufreq driver callback
         to move the code from it elsewhere and drop it from the cpufreq
         core (Viresh Kumar).
    
       - Revert recent intel_pstate change adding HWP guaranteed performance
         change notification support to it that led to problems, because the
         notification in question is triggered prematurely on some systems
         (Rafael Wysocki).
    
       - Convert the OPP DT bindings to DT schema and clean them up while at
         it (Rob Herring)"
    
    * tag 'pm-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
      Revert "cpufreq: intel_pstate: Process HWP Guaranteed change notification"
      cpufreq: mediatek-hw: Add support for CPUFREQ HW
      cpufreq: Add of_perf_domain_get_sharing_cpumask
      dt-bindings: cpufreq: add bindings for MediaTek cpufreq HW
      cpufreq: Remove ready() callback
      cpufreq: sh: Remove sh_cpufreq_cpu_ready()
      cpufreq: acpi: Remove acpi_cpufreq_cpu_ready()
      cpufreq: qcom-hw: Set dvfs_possible_from_any_cpu cpufreq driver flag
      cpufreq: blocklist more Qualcomm platforms in cpufreq-dt-platdev
      cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support
      cpufreq: scmi: Use .register_em() to register with energy model
      cpufreq: vexpress: Use .register_em() to register with energy model
      cpufreq: scpi: Use .register_em() to register with energy model
      dt-bindings: opp: Convert to DT schema
      dt-bindings: Clean-up OPP binding node names in examples
      ARM: dts: omap: Drop references to opp.txt
      cpufreq: qcom-cpufreq-hw: Use .register_em() to register with energy model
      cpufreq: omap: Use .register_em() to register with energy model
      cpufreq: mediatek: Use .register_em() to register with energy model
      cpufreq: imx6q: Use .register_em() to register with energy model
      ...
    torvalds committed Sep 8, 2021
  3. Merge tag 'acpi-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

    Pull more ACPI updates from Rafael Wysocki:
     "These add ACPI support to the PCI VMD driver, improve suspend-to-idle
      support for AMD platforms and update documentation.
    
      Specifics:
    
       - Add ACPI support to the PCI VMD driver (Rafael Wysocki)
    
       - Rearrange suspend-to-idle support code to reflect the platform
         firmware expectations on some AMD platforms (Mario Limonciello)
    
       - Make SSDT overlays documentation follow the code documented by it
         more closely (Andy Shevchenko)"
    
    * tag 'acpi-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      ACPI: PM: s2idle: Run both AMD and Microsoft methods if both are supported
      Documentation: ACPI: Align the SSDT overlays file with the code
      PCI: VMD: ACPI: Make ACPI companion lookup work for VMD bus
    torvalds committed Sep 8, 2021
  4. Merge tag 'docs-5.15-2' of git://git.lwn.net/linux

    Pull more documentation updates from Jonathan Corbet:
     "Another collection of documentation patches, mostly fixes but also
      includes another set of traditional Chinese translations"
    
    * tag 'docs-5.15-2' of git://git.lwn.net/linux:
      docs: pdfdocs: Fix typo in CJK-language specific font settings
      docs: kernel-hacking: Remove inappropriate text
      docs/zh_TW: add translations for zh_TW/filesystems
      docs/zh_TW: add translations for zh_TW/cpu-freq
      docs/zh_TW: add translations for zh_TW/arm64
      docs/zh_CN: Modify the translator tag and fix the wrong word
      Documentation/features/vm: correct huge-vmap APIs
      Documentation: block: blk-mq: Fix small typo in multi-queue docs
      Documentation: in_irq() cleanup
      Documentation: arm: marvell: Add 88F6825 model into list
      Documentation/process/maintainer-pgp-guide: Replace broken link to PGP path finder
      Documentation: locking: fix references
      Documentation: Update details of The Linux Kernel Module Programming Guide
      docs: x86: Remove obsolete information about x86_64 vmalloc() faulting
      Documentation/process/applying-patches: Activate linux-next man hyperlink
    torvalds committed Sep 8, 2021
  5. Merge tag 'modules-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

    Pull module updates from Jessica Yu:
     "The only main change I have for this round of updates is the modules
      MAINTAINERS update.
    
      As I find myself with less time to devote to upstream these days, Luis
      has kindly agreed to help maintain the module loader, to eventually
      transition to being the primary maintainer. Since Luis is already very
      involved upstream with experience maintaining various areas of the
      kernel including the kmod usermode helper, I think he is a great fit
      for this area of the kernel.
    
      Summary:
    
       - Add Luis Chamberlain as modules maintainer
    
       - Fix for .ctors sections in module linker script"
    
    * tag 'modules-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
      MAINTAINERS: Add Luis Chamberlain as modules maintainer
      module: combine constructors in module linker script
    torvalds committed Sep 8, 2021
  6. Merge tag 'microblaze-v5.15' of git://git.monstr.eu/linux-2.6-microblaze

    Pull microblaze update from Michal Simek:
    
     - Kbuild clean up
    
    * tag 'microblaze-v5.15' of git://git.monstr.eu/linux-2.6-microblaze:
      microblaze: move core-y in arch/microblaze/Makefile to arch/microblaze/Kbuild
    torvalds committed Sep 8, 2021
  7. Merge tag 'nfsd-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

    Pull nfsd fixes from Chuck Lever:
    
     - Restore performance on memory-starved servers
    
    * tag 'nfsd-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
      SUNRPC: improve error response to over-size gss credential
      SUNRPC: don't pause on incomplete allocation
    torvalds committed Sep 8, 2021
  8. Merge tag 'ceph-for-5.15-rc1' of git://github.com/ceph/ceph-client

    Pull ceph updates from Ilya Dryomov:
    
     - a set of patches to address fsync stalls caused by depending on
       periodic rather than triggered MDS journal flushes in some cases
       (Xiubo Li)
    
     - a fix for mtime effectively not getting updated in case of competing
       writers (Jeff Layton)
    
     - a couple of fixes for inode reference leaks and various WARNs after
       "umount -f" (Xiubo Li)
    
     - a new ceph.auth_mds extended attribute (Jeff Layton)
    
     - a smattering of fixups and cleanups from Jeff, Xiubo and Colin.
    
    * tag 'ceph-for-5.15-rc1' of git://github.com/ceph/ceph-client:
      ceph: fix dereference of null pointer cf
      ceph: drop the mdsc_get_session/put_session dout messages
      ceph: lockdep annotations for try_nonblocking_invalidate
      ceph: don't WARN if we're forcibly removing the session caps
      ceph: don't WARN if we're force umounting
      ceph: remove the capsnaps when removing caps
      ceph: request Fw caps before updating the mtime in ceph_write_iter
      ceph: reconnect to the export targets on new mdsmaps
      ceph: print more information when we can't find snaprealm
      ceph: add ceph_change_snap_realm() helper
      ceph: remove redundant initializations from mdsc and session
      ceph: cancel delayed work instead of flushing on mdsc teardown
      ceph: add a new vxattr to return auth mds for an inode
      ceph: remove some defunct forward declarations
      ceph: flush the mdlog before waiting on unsafe reqs
      ceph: flush mdlog before umounting
      ceph: make iterate_sessions a global symbol
      ceph: make ceph_create_session_msg a global symbol
      ceph: fix comment about short copies in ceph_write_end
      ceph: fix memory leak on decode error in ceph_handle_caps
    torvalds committed Sep 8, 2021
  9. Merge tag '9p-for-5.15-rc1' of git://github.com/martinetd/linux

    Pull 9p updates from Dominique Martinet:
     "A couple of harmless fixes, increase max tcp msize (64KB -> 1MB), and
      increase default msize (8KB -> 128KB)
    
      The default increase has been discussed with Christian for the qemu
      side of things but makes sense for all supported transports"
    
    * tag '9p-for-5.15-rc1' of git://github.com/martinetd/linux:
      net/9p: increase default msize to 128k
      net/9p: use macro to define default msize
      net/9p: increase tcp max msize to 1MB
      9p/xen: Fix end of loop tests for list_for_each_entry
      9p/trans_virtio: Remove sysfs file on probe failure
    torvalds committed Sep 8, 2021
  10. arch: remove compat_alloc_user_space

    All users of compat_alloc_user_space() and copy_in_user() have been
    removed from the kernel; only a few functions in sparc remain, and these
    can be changed to call arch_copy_in_user() instead.
    
    Link: https://lkml.kernel.org/r/20210727144859.4150043-7-arnd@kernel.org
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Eric Biederman <ebiederm@xmission.com>
    Cc: Feng Tang <feng.tang@intel.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Helge Deller <deller@gmx.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    arndb authored and torvalds committed Sep 8, 2021
  11. compat: remove some compat entry points

    These are all handled correctly when calling the native system call entry
    point, so remove the special cases.
    
    Link: https://lkml.kernel.org/r/20210727144859.4150043-6-arnd@kernel.org
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Eric Biederman <ebiederm@xmission.com>
    Cc: Feng Tang <feng.tang@intel.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Helge Deller <deller@gmx.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    arndb authored and torvalds committed Sep 8, 2021
  12. mm: simplify compat numa syscalls

    The compat implementations for mbind, get_mempolicy, set_mempolicy and
    migrate_pages are just there to handle the subtly different layout of
    bitmaps on 32-bit hosts.
    
    The compat implementation however lacks some of the checks that are
    present in the native one, in particular for checking that the extra bits
    are all zero when user space has a larger mask size than the kernel.
    Worse, those extra bits do not get cleared when copying in or out of the
    kernel, which can lead to incorrect data as well.
    
    Unify the implementation to handle the compat bitmap layout directly in
    the get_nodes() and copy_nodes_to_user() helpers.  Splitting out the
    get_bitmap() helper from get_nodes() also helps readability of the native
    case.
    
    On x86, two additional problems are addressed by this: compat tasks can
    pass a bitmap at the end of a mapping, causing a fault when reading across
    the page boundary for a 64-bit word.  x32 tasks might also run into
    problems with get_mempolicy corrupting data when an odd number of 32-bit
    words gets passed.
    
    On parisc the migrate_pages() system call apparently had the wrong calling
    convention, as big-endian architectures expect the words inside of a
    bitmap to be swapped.  This is not a problem though since parisc has no
    NUMA support.
    
    [arnd@arndb.de: fix mempolicy crash]
      Link: https://lkml.kernel.org/r/20210730143417.3700653-1-arnd@kernel.org
      Link: https://lore.kernel.org/lkml/YQPLG20V3dmOfq3a@osiris/
    
    Link: https://lkml.kernel.org/r/20210727144859.4150043-5-arnd@kernel.org
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Eric Biederman <ebiederm@xmission.com>
    Cc: Feng Tang <feng.tang@intel.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Helge Deller <deller@gmx.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    arndb authored and torvalds committed Sep 8, 2021
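
The bitmap layout issue described in the commit above can be illustrated with a small user-space sketch. It is not the kernel's implementation: `compat_words_to_bitmap()` and `extra_bits_clear()` are hypothetical names, and fixed-width types stand in for the kernel's `unsigned long` and `compat_ulong_t`. The sketch assembles 64-bit bitmap words from 32-bit words, never reads a high half that an odd-length array does not contain, clears bits beyond the requested size, and checks that bits above a smaller kernel limit are zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a bitmap of 64-bit words from an array of
 * 32-bit words holding nbits bits. Word 2*i supplies the low half of
 * dst[i] and word 2*i+1 (if it exists) supplies the high half.
 */
static void compat_words_to_bitmap(uint64_t *dst, const uint32_t *src,
                                   size_t nbits)
{
    size_t nwords32 = (nbits + 31) / 32;
    size_t nwords64 = (nbits + 63) / 64;

    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < nwords64; i++) {
        uint64_t lo = src[2 * i];
        /* With an odd number of 32-bit words there is no high half to read. */
        uint64_t hi = (2 * i + 1 < nwords32) ? src[2 * i + 1] : 0;

        dst[i] = lo | (hi << 32);
    }

    /* Clear any bits at or above nbits in the last word. */
    if (nbits % 64)
        dst[nwords64 - 1] &= (UINT64_C(1) << (nbits % 64)) - 1;
}

/*
 * Hypothetical check: return 1 if every bit in [kernel_bits, nbits) is
 * zero, i.e. the caller's larger mask does not name anything the
 * "kernel" side cannot represent; return 0 otherwise.
 */
static int extra_bits_clear(const uint64_t *bitmap, size_t nbits,
                            size_t kernel_bits)
{
    for (size_t bit = kernel_bits; bit < nbits; bit++)
        if (bitmap[bit / 64] & (UINT64_C(1) << (bit % 64)))
            return 0;
    return 1;
}
```

Combining the words by value, rather than reinterpreting the 32-bit array as 64-bit words, gives the same result on little-endian and big-endian machines. That is why the sketch does not need the word swap the commit message mentions for big-endian architectures.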
  13. mm: simplify compat_sys_move_pages

    The compat move_pages() implementation uses compat_alloc_user_space() for
    converting the pointer array.  Moving the compat handling into the
    function itself is a bit simpler and lets us avoid the
    compat_alloc_user_space() call.
    
    Link: https://lkml.kernel.org/r/20210727144859.4150043-4-arnd@kernel.org
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Eric Biederman <ebiederm@xmission.com>
    Cc: Feng Tang <feng.tang@intel.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Helge Deller <deller@gmx.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    arndb authored and torvalds committed Sep 8, 2021
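
As a rough illustration of handling the compat case inside the function itself, the sketch below reads entry `i` of a page-address array that holds either 32-bit or 64-bit values, so no widened copy of the array is needed. `fetch_page_addr()` and its `compat` flag are hypothetical; the kernel would read from user memory and decide based on the calling task rather than on a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for fetching the i-th page address from the
 * caller's array. A compat caller passes 32-bit entries, which are
 * zero-extended; a native caller passes 64-bit entries.
 */
static uint64_t fetch_page_addr(const void *pages, size_t i, bool compat)
{
    assert(pages != NULL);

    if (compat) {
        const uint32_t *pages32 = pages;

        return (uint64_t)pages32[i];
    }

    const uint64_t *pages64 = pages;

    return pages64[i];
}
```

Because the width is chosen per element at the point of use, the rest of the loop can be shared by both kinds of caller.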
  14. kexec: avoid compat_alloc_user_space

    kimage_alloc_init() expects a __user pointer, so compat_sys_kexec_load()
    uses compat_alloc_user_space() to convert the layout and put it back onto
    the user space caller stack.
    
    Moving the user space access into the syscall handler directly actually
    makes the code simpler, as the conversion for compat mode can now be done
    on kernel memory.
    
    Link: https://lkml.kernel.org/r/20210727144859.4150043-3-arnd@kernel.org
    Link: https://lore.kernel.org/lkml/YPbtsU4GX6PL7%2F42@infradead.org/
    Link: https://lore.kernel.org/lkml/m1y2cbzmnw.fsf@fess.ebiederm.org/
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Co-developed-by: Eric Biederman <ebiederm@xmission.com>
    Co-developed-by: Christoph Hellwig <hch@infradead.org>
    Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Feng Tang <feng.tang@intel.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Helge Deller <deller@gmx.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    arndb authored and torvalds committed Sep 8, 2021
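
A minimal sketch of what "conversion on kernel memory" can look like: once the 32-bit segment descriptors have been copied in, they are widened into native-sized ones in an ordinary buffer instead of being written back to the caller's stack. The struct and function names here are hypothetical, and the field names are only assumed to resemble the kernel's segment descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit layout, as a compat caller would pass it. */
struct seg32 {
    uint32_t buf;    /* user pointer, 32 bits wide */
    uint32_t bufsz;
    uint32_t mem;
    uint32_t memsz;
};

/* Hypothetical native layout with 64-bit fields. */
struct seg64 {
    uint64_t buf;
    uint64_t bufsz;
    uint64_t mem;
    uint64_t memsz;
};

/* Widen n descriptors from in[] to out[]; both arrays are plain memory. */
static void widen_segments(struct seg64 *out, const struct seg32 *in, size_t n)
{
    assert(out != NULL && in != NULL);

    for (size_t i = 0; i < n; i++) {
        out[i].buf   = in[i].buf;    /* zero-extends */
        out[i].bufsz = in[i].bufsz;
        out[i].mem   = in[i].mem;
        out[i].memsz = in[i].memsz;
    }
}
```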
  15. kexec: move locking into do_kexec_load

    Patch series "compat: remove compat_alloc_user_space", v5.
    
    Going through compat_alloc_user_space() to convert indirect system call
    arguments tends to add complexity compared to handling the native and
    compat logic in the same code.
    
    This patch (of 6):
    
    The locking is the same between the native and compat version of
    sys_kexec_load(), so it can be done in the common implementation to reduce
    duplication.
    
    Link: https://lkml.kernel.org/r/20210727144859.4150043-1-arnd@kernel.org
    Link: https://lkml.kernel.org/r/20210727144859.4150043-2-arnd@kernel.org
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Co-developed-by: Eric Biederman <ebiederm@xmission.com>
    Co-developed-by: Christoph Hellwig <hch@infradead.org>
    Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Helge Deller <deller@gmx.de>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Feng Tang <feng.tang@intel.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    arndb authored and torvalds committed Sep 8, 2021
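
The deduplication described above can be sketched in user space as follows. This is an illustration only: `do_load()`, `native_load()` and `compat_load()` are hypothetical names, a C11 `atomic_flag` stands in for the kernel mutex, and returning `-EBUSY` when the lock is already held is an assumption about the try-lock behaviour.

```c
#include <errno.h>
#include <stdatomic.h>

static atomic_flag load_lock = ATOMIC_FLAG_INIT;
static unsigned long segments_loaded;

/*
 * Common implementation: take the lock once here, so neither entry
 * point needs its own copy of the locking code.
 */
static int do_load(unsigned long nr_segments)
{
    /* Try-lock: fail immediately if another load is in progress. */
    if (atomic_flag_test_and_set(&load_lock))
        return -EBUSY;

    segments_loaded += nr_segments;    /* stand-in for the real work */

    atomic_flag_clear(&load_lock);
    return 0;
}

/* Native entry point: no locking of its own. */
static int native_load(unsigned long nr_segments)
{
    return do_load(nr_segments);
}

/* Compat entry point: widens its argument, then shares the same path. */
static int compat_load(unsigned int nr_segments)
{
    return do_load(nr_segments);
}
```

With the lock taken in one place, a later change to the locking rules only has to touch `do_load()`.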
  16. mm: migrate: change to use bool type for 'page_was_mapped'

    Change the 'page_was_mapped' variable to bool type to make it more
    readable.
    
    Link: https://lkml.kernel.org/r/ce1279df18d2c163998c403e0b5ec6d3f6f90f7a.1629447552.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: Yang Shi <shy828301@gmail.com>
    Cc: Alistair Popple <apopple@nvidia.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Baolin Wang authored and torvalds committed Sep 8, 2021
  17. mm: migrate: fix the incorrect function name in comments

    Since commit a98a2f0 ("mm/rmap: split migration into its own
    function"), establishing the migration ptes has been split into a
    separate try_to_migrate() function, so update the related comments.
    
    Link: https://lkml.kernel.org/r/5b824bad6183259c916ae6cf42f81d14c6118b06.1629447552.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: Yang Shi <shy828301@gmail.com>
    Reviewed-by: Alistair Popple <apopple@nvidia.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Baolin Wang authored and torvalds committed Sep 8, 2021