Commits on Jan 24, 2022

  1. SUNRPC: lock against ->sock changing during sysfs read

    ->sock can be set to NULL asynchronously unless ->recv_mutex is held.
    So it is important to hold that mutex.  Otherwise a sysfs read can
    trigger an oops.
    Commit 17f09d3 ("SUNRPC: Check if the xprt is connected before
    handling sysfs reads") appears to attempt to fix this problem, but it
    only narrows the race window.
    
    Fixes: 17f09d3 ("SUNRPC: Check if the xprt is connected before handling sysfs reads")
    Fixes: a848248 ("SUNRPC query transport's source port")
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
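    A minimal, dependency-free sketch of the locking rule this commit enforces. The types and names here (my_xprt, my_sock, a flag standing in for the real recv_mutex) are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

struct my_sock { int srcport; };

struct my_xprt {
    int recv_mutex_held;      /* stands in for mutex_lock(&xprt->recv_mutex) */
    struct my_sock *sock;     /* may be set to NULL asynchronously */
};

static void recv_lock(struct my_xprt *x)   { assert(!x->recv_mutex_held); x->recv_mutex_held = 1; }
static void recv_unlock(struct my_xprt *x) { x->recv_mutex_held = 0; }

/* The fixed sysfs read: dereference ->sock only under recv_mutex.
 * This is safe because the code that clears ->sock also holds it,
 * so ->sock cannot vanish between the NULL check and the use. */
int sysfs_read_srcport(struct my_xprt *x)
{
    int port = -1;

    recv_lock(x);
    if (x->sock)
        port = x->sock->srcport;
    recv_unlock(x);
    return port;
}
```

    The earlier "check if connected first" approach only shrank the window between check and use; holding the mutex across both closes it.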
  2. NFS: swap-out must always use STABLE writes.

    The commit handling code is not safe against memory-pressure deadlocks
    when writing to swap.  In particular, nfs_commitdata_alloc() blocks
    indefinitely waiting for memory, and this can consume all available
    workqueue threads.
    
    Swap-out most likely uses STABLE writes anyway, as COND_STABLE indicates
    that a stable write should be used if the write fits in a single
    request, and it normally does.  However if we ever swap with a small
    wsize, or gather unusually large numbers of pages for a single write,
    this might change.
    
    For safety, make it explicit in the code that direct writes used for swap
    must always use FLUSH_COND_STABLE.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  3. NFSv4: keep state manager thread active if swap is enabled

    If we are swapping over NFSv4, we may not be able to allocate memory to
    start the state-manager thread at the time when we need it.
    So keep it always running when swap is enabled, and just signal it to
    start.
    
    This requires updating and testing the cl_swapper count on the root
    rpc_clnt after following all ->cl_parent links.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
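    A sketch of the "follow all ->cl_parent links" step described above. The struct is a simplified stand-in for rpc_clnt, with only the fields this commit cares about and none of the real locking:

```c
#include <assert.h>
#include <stddef.h>

struct rpc_clnt {
    struct rpc_clnt *cl_parent;   /* NULL at the root */
    int cl_swapper;               /* counted only on the root client */
};

/* Walk ->cl_parent links up to the root rpc_clnt. */
static struct rpc_clnt *rpc_clnt_root(struct rpc_clnt *clnt)
{
    while (clnt->cl_parent)
        clnt = clnt->cl_parent;
    return clnt;
}

/* Enable swap on a (possibly child) client: bump the count on the root
 * and report whether this was the first swapfile, i.e. whether the
 * always-running state-manager thread should now be started. */
int swapper_enable(struct rpc_clnt *clnt)
{
    struct rpc_clnt *root = rpc_clnt_root(clnt);

    return ++root->cl_swapper == 1;
}
```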
  4. SUNRPC: improve 'swap' handling: scheduling and PF_MEMALLOC

    rpc tasks can be marked as RPC_TASK_SWAPPER.  This causes GFP_MEMALLOC
    to be used for some allocations.  This is needed in some cases, but not
    in all cases where it is currently provided, and it is missing from
    some cases where it is needed.
    
    Currently *all* tasks associated with a rpc_client on which swap is
    enabled get the flag and hence some GFP_MEMALLOC support.
    
    GFP_MEMALLOC is provided for ->buf_alloc() but only swap-writes need it.
    However xdr_alloc_bvec does not get GFP_MEMALLOC - though it often does
    need it.
    
    xdr_alloc_bvec is called while the XPRT_LOCK is held.  If this blocks,
    then it blocks all other queued tasks.  So this allocation needs
    GFP_MEMALLOC for *all* requests, not just writes, when the xprt is used
    for any swap writes.
    
    Similarly, if the transport is not connected, that will block all
    requests including swap writes, so memory allocations should get
    GFP_MEMALLOC if swap writes are possible.
    
    So with this patch:
     1/ we ONLY set RPC_TASK_SWAPPER for swap writes.
     2/ __rpc_execute() sets PF_MEMALLOC while handling any task
        with RPC_TASK_SWAPPER set, or when handling any task that
        holds the XPRT_LOCKED lock on an xprt used for swap.
        This removes the need for the RPC_IS_SWAPPER() test
        in ->buf_alloc handlers.
     3/ xprt_prepare_transmit() sets PF_MEMALLOC after locking
        any task to a swapper xprt.  __rpc_execute() will clear it.
     4/ PF_MEMALLOC is set for all the connect workers.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
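    Point 2 above is a save/set/restore pattern on the task's flags. A simplified sketch; the PF_MEMALLOC value and the fake_task struct are stand-ins for the kernel's current->flags handling:

```c
#include <assert.h>

#define PF_MEMALLOC 0x0800u

struct fake_task { unsigned int flags; };  /* stands in for current->flags */

/* Grant PF_MEMALLOC for the duration of one task step, returning the
 * previous state so nested callers restore correctly. */
static unsigned int memalloc_save(struct fake_task *t)
{
    unsigned int old = t->flags & PF_MEMALLOC;

    t->flags |= PF_MEMALLOC;
    return old;
}

/* Undo memalloc_save(): restore exactly the prior PF_MEMALLOC state. */
static void memalloc_restore(struct fake_task *t, unsigned int old)
{
    t->flags = (t->flags & ~PF_MEMALLOC) | old;
}
```

    In the commit, __rpc_execute() wraps each step of a swapper task (or of a task holding XPRT_LOCKED on a swap xprt) in this pair, so ->buf_alloc handlers no longer need their own RPC_IS_SWAPPER() checks.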
  5. NFS: discard NFS_RPC_SWAPFLAGS and RPC_TASK_ROOTCREDS

    NFS_RPC_SWAPFLAGS is only used for READ requests.
    It sets RPC_TASK_SWAPPER which gives some memory-allocation priority to
    requests.  This is not needed for swap READ - though it is for writes
    where it is set via a different mechanism.
    
    RPC_TASK_ROOTCREDS causes the 'machine' credential to be used.
    This is not needed as the root credential is saved when the swap file is
    opened, and this is used for all IO.
    
    So NFS_RPC_SWAPFLAGS isn't needed, and as it is the only user of
    RPC_TASK_ROOTCREDS, that isn't needed either.
    
    Remove both.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  6. SUNRPC: remove scheduling boost for "SWAPPER" tasks.

    Currently, tasks marked as "swapper" tasks get put to the front of
    non-priority rpc_queues, and are sorted earlier than non-swapper tasks on
    the transport's ->xmit_queue.
    
    This is pointless as currently *all* tasks for a mount that has swap
    enabled on *any* file are marked as "swapper" tasks.  So the net result
    is that the non-priority rpc_queues are reverse-ordered (LIFO).
    
    This scheduling boost is not necessary to avoid deadlocks, and hurts
    fairness, so remove it.  If there were a need to expedite some requests,
    the tk_priority mechanism is a more appropriate tool.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  7. SUNRPC/xprt: async tasks mustn't block waiting for memory

    When memory is short, new worker threads cannot be created and we depend
    on the minimum one rpciod thread to be able to handle everything.  So it
    must not block waiting for memory.
    
    xprt_dynamic_alloc_slot can block indefinitely.  This can tie up all
    workqueue threads and NFS can deadlock.  So when called from a
    workqueue, set __GFP_NORETRY.
    
    The rdma alloc_slot already does not block.  However it sets the error
    to -EAGAIN suggesting this will trigger a sleep.  It does not.  As we
    can see in call_reserveresult(), only -ENOMEM causes a sleep.  -EAGAIN
    causes immediate retry.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
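    The allocation-flag choice this commit describes can be sketched as follows; the flag values are simplified stand-ins, not the real gfp.h definitions:

```c
#include <assert.h>

#define GFP_KERNEL    0x01u
#define __GFP_NORETRY 0x10u

/* Choose gfp flags for an xprt_dynamic_alloc_slot()-style allocation.
 * From a workqueue we must fail fast instead of blocking indefinitely,
 * because blocking could tie up the last rpciod thread and deadlock. */
unsigned int slot_alloc_gfp(int called_from_workqueue)
{
    unsigned int gfp = GFP_KERNEL;

    if (called_from_workqueue)
        gfp |= __GFP_NORETRY;   /* let the allocation fail rather than wait */
    return gfp;
}
```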
  8. SUNRPC/auth: async tasks mustn't block waiting for memory

    When memory is short, new worker threads cannot be created and we depend
    on the minimum one rpciod thread to be able to handle everything.  So it
    must not block waiting for memory.
    
    mempools are particularly a problem as memory can only be released back
    to the mempool by an async rpc task running.  If all available workqueue
    threads are waiting on the mempool, no thread is available to return
    anything.
    
    lookup_cred() can block on a mempool or kmalloc - and this can cause
    deadlocks.  So add a new RPCAUTH_LOOKUP flag for async lookups and don't
    block on memory.  If the -ENOMEM gets back to call_refreshresult(), wait
    a short while and try again.  HZ>>4 is chosen as it is used elsewhere
    for -ENOMEM retries.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
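    The retry behaviour described above can be sketched like this; rpc_task_state, the HZ value, and handle_refresh_result() are illustrative stand-ins for call_refreshresult() and friends:

```c
#include <assert.h>
#include <errno.h>

#define HZ 250   /* illustrative tick rate */

struct rpc_task_state { int delay_ticks; int done; };

/* On -ENOMEM from an async credential lookup, schedule a short delay
 * and retry, instead of blocking inside the allocator or a mempool.
 * Returns 1 when the caller should retry after ->delay_ticks. */
int handle_refresh_result(struct rpc_task_state *t, int status)
{
    if (status == -ENOMEM) {
        t->delay_ticks = HZ >> 4;   /* the short back-off used elsewhere */
        return 1;
    }
    t->done = (status == 0);
    return 0;
}
```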
  9. SUNRPC/call_alloc: async tasks mustn't block waiting for memory

    When memory is short, new worker threads cannot be created and we depend
    on the minimum one rpciod thread to be able to handle everything.
    So it must not block waiting for memory.
    
    mempools are particularly a problem as memory can only be released back
    to the mempool by an async rpc task running.  If all available
    workqueue threads are waiting on the mempool, no thread is available to
    return anything.
    
    rpc_malloc() can block, and this might cause deadlocks.
    So check RPC_IS_ASYNC(), rather than RPC_IS_SWAPPER() to determine if
    blocking is acceptable.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  10. NFS: swap IO handling is slightly different for O_DIRECT IO

    1/ Taking the i_rwsem for swap IO triggers lockdep warnings regarding
       possible deadlocks with "fs_reclaim".  These deadlocks could, I believe,
       eventuate if a buffered read on the swapfile was attempted.
    
       We don't need coherence with the page cache for a swap file, and
       buffered writes are forbidden anyway.  There is no other need for
       i_rwsem during direct IO.  So never take it for swap_rw().
    
    2/ generic_write_checks() explicitly forbids writes to swap, and
       performs checks that are not needed for swap.  So bypass it
       for swap_rw().
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  11. NFS: rename nfs_direct_IO and use as ->swap_rw

    nfs_direct_IO() exists to support swap IO, but hasn't worked for a
    while.  We now need a ->swap_rw function which behaves slightly
    differently, returning zero for success rather than a byte count.
    
    So modify nfs_direct_IO accordingly, rename it, and use it as the
    ->swap_rw function.
    
    Note: it still won't work - that will be fixed in later patches.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  12. NFS: remove IS_SWAPFILE hack

    This code is pointless as IS_SWAPFILE is always defined.
    So remove it.
    
    Suggested-by: Mark Hemment <markhemm@googlemail.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  13. VFS: Add FMODE_CAN_ODIRECT file flag

    Currently various places test if direct IO is possible on a file by
    checking for the existence of the direct_IO address space operation.
    This is a poor choice, as the direct_IO operation may not be used - it is
    only used if the generic_file_*_iter functions are called for direct IO
    and some filesystems - particularly NFS - don't do this.
    
    Instead, introduce a new f_mode flag: FMODE_CAN_ODIRECT and change the
    various places to check this (avoiding pointer dereferences).
    do_dentry_open() will set this flag if ->direct_IO is present, so
    filesystems do not need to be changed.
    
    NFS *is* changed, to set the flag explicitly and discard the direct_IO
    entry in the address_space_operations for files.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
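    The check this commit introduces is a plain flag test. A sketch with stand-in values (the real FMODE_CAN_ODIRECT value and struct file differ):

```c
#include <assert.h>

#define FMODE_CAN_ODIRECT 0x400000u   /* stand-in value */

struct fake_file { unsigned int f_mode; };

/* Old style: chase f_mapping->a_ops->direct_IO pointers at each call
 * site.  New style: one flag test, with the flag set once at open time
 * (in do_dentry_open(), or explicitly by filesystems like NFS). */
int can_odirect(const struct fake_file *f)
{
    return (f->f_mode & FMODE_CAN_ODIRECT) != 0;
}
```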
  14. MM: submit multipage write for SWP_FS_OPS swap-space

    swap_writepage() is given one page at a time, but may be called repeatedly
    in succession.
    For block-device swapspace, the blk_plug functionality allows the
    multiple pages to be combined together at lower layers.
    That cannot be used for SWP_FS_OPS as blk_plug may not exist - it is
    only active when CONFIG_BLOCK=y.  Consequently all swap writes over
    NFS are single page writes.
    
    With this patch we pass a pointer-to-pointer via the wbc so that
    swap_writepage() can store state between calls - much like the pointer
    passed explicitly to swap_readpage.  After calling swap_writepage() some
    number of times, the state will be passed to swap_write_unplug() which
    can submit the combined request.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
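    The plugging pattern described above, reduced to its shape: repeated per-page calls accumulate into caller-held state, and a final unplug submits one combined request. All names here are illustrative stand-ins, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

struct swap_plug { int npages; int submissions; };

/* Stand-in for swap_writepage() with the new pointer-to-pointer state:
 * the first call initialises the plug, later calls just add pages. */
void plug_one_page(struct swap_plug **plugp, struct swap_plug *storage)
{
    if (!*plugp) {
        storage->npages = 0;
        storage->submissions = 0;
        *plugp = storage;
    }
    (*plugp)->npages++;
}

/* Stand-in for swap_write_unplug(): issue one combined request for all
 * accumulated pages, returning how many were covered. */
int unplug(struct swap_plug *plug)
{
    if (!plug || plug->npages == 0)
        return 0;
    plug->submissions++;
    return plug->npages;
}
```

    The same shape covers the read side in the next commit, where the state pointer replaces the blk_plug that SWP_FS_OPS cannot rely on.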
  15. MM: submit multipage reads for SWP_FS_OPS swap-space

    swap_readpage() is given one page at a time, but may be called repeatedly
    in succession.
    For block-device swapspace, the blk_plug functionality allows the
    multiple pages to be combined together at lower layers.
    That cannot be used for SWP_FS_OPS as blk_plug may not exist - it is
    only active when CONFIG_BLOCK=y.  Consequently all swap reads over NFS
    are single page reads.
    
    With this patch we pass in a pointer-to-pointer through which
    swap_readpage() can store state between calls - much like the effect
    of blk_plug.  After
    calling swap_readpage() some number of times, the state will be passed
    to swap_read_unplug() which can submit the combined request.
    
    Some callers currently call blk_finish_plug() *before* the final call to
    swap_readpage(), so the last page cannot be included.  This patch moves
    blk_finish_plug() to after the last call, and calls swap_read_unplug()
    there too.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  16. DOC: update documentation for swap_activate and swap_rw

    This documentation for ->swap_activate() has been out-of-date for a long
    time.  This patch updates it to match recent changes, and adds
    documentation for the associated ->swap_rw().
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  17. MM: perform async writes to SWP_FS_OPS swap-space using ->swap_rw

    This patch switches swap-out to SWP_FS_OPS swap-spaces to use ->swap_rw
    and makes the writes asynchronous, like they are for other swap spaces.
    
    To make it async we need to allocate the kiocb struct from a mempool.
    This may block, but won't block as long as waiting for the write to
    complete.  At most it will wait for some previous swap IO to complete.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  18. MM: introduce ->swap_rw and use it for reads from SWP_FS_OPS swap-space

    swap currently uses ->readpage to read swap pages.  This can only
    request one page at a time from the filesystem, which is not the most
    efficient approach.
    
    swap uses ->direct_IO for writes; while this is adequate, it is an
    inappropriate overloading.  ->direct_IO may need to allocate space for
    holes or handle other details that are not relevant for swap.
    
    So this patch introduces a new address_space operation: ->swap_rw.
    In this patch it is used for reads, and a subsequent patch will switch
    writes to use it.
    
    No filesystem yet supports ->swap_rw, but that is not a problem because
    no filesystem actually works with filesystem-based swap.
    Only two filesystems set SWP_FS_OPS:
    - cifs sets the flag, but ->direct_IO always fails so swap cannot work.
    - nfs sets the flag, but ->direct_IO calls generic_write_checks()
      which has failed on swap files for several releases.
    
    To ensure that a NULL ->swap_rw isn't called, ->swap_activate() for
    both NFS and cifs is changed to fail if ->swap_rw is not set.  This can be
    removed if/when the function is added.
    
    Future patches will restore swap-over-NFS functionality.
    
    To submit an async read with ->swap_rw() we need to allocate a structure
    to hold the kiocb and other details.  swap_readpage() cannot handle
    transient failure, so we create a mempool to provide the structures.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  19. MM: reclaim mustn't enter FS for SWP_FS_OPS swap-space

    If swap-out is using filesystem operations (SWP_FS_OPS), then it is not
    safe to enter the FS for reclaim.
    So only down-grade the requirement for swap pages to __GFP_IO after
    checking that SWP_FS_OPS are not being used.
    
    This makes the calculation of "may_enter_fs" slightly more complex, so
    move it into a separate function.  With that done, there is little value
    in maintaining the bool variable any more, so replace the
    may_enter_fs variable with a may_enter_fs() function.  This removes any
    risk of the variable becoming out-of-date.
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
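    The resulting helper can be sketched as below; the flag values and page struct are simplified stand-ins for the kernel's:

```c
#include <assert.h>

#define __GFP_IO   0x01u
#define __GFP_FS   0x02u
#define SWP_FS_OPS 0x10u

struct fake_page { int in_swapcache; unsigned int swap_flags; };

/* May reclaim of this page enter the filesystem?  Ordinary pages need
 * __GFP_FS.  Swapcache pages normally only need __GFP_IO - but not when
 * the swap space itself uses filesystem operations (SWP_FS_OPS), since
 * writing them out then re-enters the FS. */
int may_enter_fs(const struct fake_page *page, unsigned int gfp_mask)
{
    if (gfp_mask & __GFP_FS)
        return 1;
    if (!page->in_swapcache || !(gfp_mask & __GFP_IO))
        return 0;
    return !(page->swap_flags & SWP_FS_OPS);
}
```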
  20. MM: move responsibility for setting SWP_FS_OPS to ->swap_activate

    If a filesystem wishes to handle all swap IO itself (via ->direct_IO),
    rather than just providing devices addresses for submit_bio(),
    SWP_FS_OPS must be set.
    Currently the protocol for setting this is to have ->swap_activate
    return zero.  In that case SWP_FS_OPS is set, and add_swap_extent()
    is called for the entire file.
    
    This is a little clumsy as different return values for ->swap_activate
    have quite different meanings, and it makes it hard to search for which
    filesystems require SWP_FS_OPS to be set.
    
    So remove the special meaning of a zero return, and require the
    filesystem to set SWP_FS_OPS if it so desires, and to always call
    add_swap_extent() as required.
    
    Currently only NFS and CIFS return zero for ->swap_activate().
    
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
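    Under the new protocol, a filesystem that wants to drive swap IO itself does both steps explicitly in its ->swap_activate. A stand-in sketch; the names and the whole-file single-extent mapping are illustrative:

```c
#include <assert.h>

#define SWP_FS_OPS 0x10u

struct fake_swap_info { unsigned int flags; int nr_extents; };

/* Stand-in for add_swap_extent(): map the whole file as one extent and
 * return the number of pages covered. */
static int add_extent(struct fake_swap_info *sis, int pages)
{
    sis->nr_extents++;
    return pages;
}

/* What an NFS-like ->swap_activate does now: opt in to SWP_FS_OPS
 * explicitly, then map the file itself - no special zero return. */
int fs_swap_activate(struct fake_swap_info *sis, int pages)
{
    sis->flags |= SWP_FS_OPS;
    return add_extent(sis, pages);
}
```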
  21. MM: drop swap_set_page_dirty

    Pages that are written to swap are owned by the MM subsystem - not any
    filesystem.
    
    When such a page is passed to a filesystem to be written out to a
    swap-file, the filesystem handles the data, but the page itself does not
    belong to the filesystem.  So calling the filesystem's set_page_dirty
    address_space operation makes no sense.  This is for pages in the given
    address space, and a page to be written to swap does not exist in the
    given address space.
    
    So drop swap_set_page_dirty() which calls the address-space's
    set_page_dirty, and always use __set_page_dirty_no_writeback, which is
    appropriate for pages being swapped out.
    
    Fixes-no-auto-backport: 62c230b ("mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages")
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  22. MM: extend block-plugging to cover all swap reads with read-ahead

    Code that does swap read-ahead uses blk_start_plug() and
    blk_finish_plug() to allow lower levels to combine multiple read-ahead
    pages into a single request, but calls blk_finish_plug() *before*
    submitting the original (non-ahead) read request.
    This missed an opportunity to combine read requests.
    
    This patch moves the blk_finish_plug to *after* all the reads.
    This will likely combine the primary read with some of the "ahead"
    reads, and that may slightly increase the latency of that read, but it
    should more than make up for this by making more efficient use of the
    storage path.
    
    The patch mostly makes the code look more consistent.  Performance
    change is unlikely to be noticeable.
    
    Fixes-no-auto-backport: 3fb5c29 ("swap: allow swap readahead to be merged")
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022
  23. MM: create new mm/swap.h header file.

    Many functions declared in include/linux/swap.h are only used within mm/.
    
    Create a new "mm/swap.h" and move some of these declarations there.
    Remove the redundant 'extern' from the function declarations.
    
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: NeilBrown <neilb@suse.de>
    neilbrown authored and intel-lab-lkp committed Jan 24, 2022

Commits on Jan 23, 2022

  1. Merge tag 'powerpc-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
    
    Pull powerpc fixes from Michael Ellerman:
    
     - A series of bpf fixes, including an oops fix and some codegen fixes.
    
     - Fix a regression in syscall_get_arch() for compat processes.
    
     - Fix boot failure on some 32-bit systems with KASAN enabled.
    
     - A couple of other build/minor fixes.
    
    Thanks to Athira Rajeev, Christophe Leroy, Dmitry V. Levin, Jiri Olsa,
    Johan Almbladh, Maxime Bizon, Naveen N. Rao, and Nicholas Piggin.
    
    * tag 'powerpc-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/64s: Mask SRR0 before checking against the masked NIP
      powerpc/perf: Only define power_pmu_wants_prompt_pmi() for CONFIG_PPC64
      powerpc/32s: Fix kasan_init_region() for KASAN
      powerpc/time: Fix build failure due to do_hard_irq_enable() on PPC32
      powerpc/audit: Fix syscall_get_arch()
      powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06
      tools/bpf: Rename 'struct event' to avoid naming conflict
      powerpc/bpf: Update ldimm64 instructions during extra pass
      powerpc32/bpf: Fix codegen for bpf-to-bpf calls
      bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
    torvalds committed Jan 23, 2022
  2. Merge tag 'irq_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull irq fix from Borislav Petkov:
     "A single use-after-free fix in the PCI MSI irq domain allocation path"
    
    * tag 'irq_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      PCI/MSI: Prevent UAF in error path
    torvalds committed Jan 23, 2022
  3. Merge tag 'sched_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull scheduler fixes from Borislav Petkov:
     "A bunch of fixes: forced idle time accounting, utilization values
      propagation in the sched hierarchies and other minor cleanups and
      improvements"
    
    * tag 'sched_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      kernel/sched: Remove dl_boosted flag comment
      sched: Avoid double preemption in __cond_resched_*lock*()
      sched/fair: Fix all kernel-doc warnings
      sched/core: Accounting forceidle time for all tasks except idle task
      sched/pelt: Relax the sync of load_sum with load_avg
      sched/pelt: Relax the sync of runnable_sum with runnable_avg
      sched/pelt: Continue to relax the sync of util_sum with util_avg
      sched/pelt: Relax the sync of util_sum with util_avg
      psi: Fix uaf issue when psi trigger is destroyed while being polled
    torvalds committed Jan 23, 2022
  4. Merge tag 'perf_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull perf fixes from Borislav Petkov:
    
     - Add support for accessing the general purpose counters on Alder Lake
       via MMIO
    
     - Add new LBR format v7 support which is v5 modulo TSX
    
     - Fix counter enumeration on Alder Lake hybrids
    
     - Overhaul how context time updates are done and get rid of
       perf_event::shadow_ctx_time.
    
     - The usual amount of fixes: event mask correction, supported event
       types reporting, etc.
    
    * tag 'perf_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      x86/perf: Avoid warning for Arch LBR without XSAVE
      perf/x86/intel/uncore: Add IMC uncore support for ADL
      perf/x86/intel/lbr: Add static_branch for LBR INFO flags
      perf/x86/intel/lbr: Support LBR format V7
      perf/x86/rapl: fix AMD event handling
      perf/x86/intel/uncore: Fix CAS_COUNT_WRITE issue for ICX
      perf/x86/intel: Add a quirk for the calculation of the number of counters on Alder Lake
      perf: Fix perf_event_read_local() time
    torvalds committed Jan 23, 2022
  5. Linux 5.17-rc1

    torvalds committed Jan 23, 2022
  6. Merge tag 'perf-tools-for-v5.17-2022-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
    
    Pull more perf tools updates from Arnaldo Carvalho de Melo:
    
     - Fix printing 'phys_addr' in 'perf script'.
    
     - Fix failure to add events with 'perf probe' in ppc64 due to not
       removing leading dot (ppc64 ABIv1).
    
     - Fix cpu_map__item() python binding building.
    
     - Support event alias in form foo-bar-baz, add pmu-events and
       parse-event tests for it.
    
     - No need to setup affinities when starting a workload or attaching to
       a pid.
    
     - Use path__join() to compose a path instead of ad-hoc snprintf()
       equivalent.
    
     - Override attr->sample_period for non-libpfm4 events.
    
     - Use libperf cpumap APIs instead of accessing the internal state
       directly.
    
     - Sync x86 arch prctl headers and files changed by the new
       set_mempolicy_home_node syscall with the kernel sources.
    
     - Remove duplicate include in cpumap.h.
    
     - Remove redundant err variable.
    
    * tag 'perf-tools-for-v5.17-2022-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
      perf tools: Remove redundant err variable
      perf test: Add parse-events test for aliases with hyphens
      perf test: Add pmu-events test for aliases with hyphens
      perf parse-events: Support event alias in form foo-bar-baz
      perf evsel: Override attr->sample_period for non-libpfm4 events
      perf cpumap: Remove duplicate include in cpumap.h
      perf cpumap: Migrate to libperf cpumap api
      perf python: Fix cpu_map__item() building
      perf script: Fix printing 'phys_addr' failure issue
      tools headers UAPI: Sync files changed by new set_mempolicy_home_node syscall
      tools headers UAPI: Sync x86 arch prctl headers with the kernel sources
      perf machine: Use path__join() to compose a path instead of snprintf(dir, '/', filename)
      perf evlist: No need to setup affinities when disabling events for pid targets
      perf evlist: No need to setup affinities when enabling events for pid targets
      perf stat: No need to setup affinities when starting a workload
      perf affinity: Allow passing a NULL arg to affinity__cleanup()
      perf probe: Fix ppc64 'perf probe add events failed' case
    torvalds committed Jan 23, 2022
  7. Merge tag 'trace-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
    
    Pull ftrace fix from Steven Rostedt:
     "Fix s390 breakage from sorting mcount tables.
    
      The latest merge of the tracing tree sorts the mcount table at build
      time. But s390 appears to do things differently (like always) and
      replaces the sorted table back to the original unsorted one. As the
      ftrace algorithm depends on it being sorted, bad things happen when it
      is not, and s390 experienced those bad things.
    
      Add a new config to tell the boot if the mcount table is sorted or
      not, and allow s390 to opt out of it"
    
    * tag 'trace-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
      ftrace: Fix assuming build time sort works for s390
    torvalds committed Jan 23, 2022
  8. ftrace: Fix assuming build time sort works for s390

    To speed up the boot process, as mcount_loc needs to be sorted for ftrace
    to work properly, sorting it at build time is more efficient than boot up
    and can save milliseconds of time. Unfortunately, this change broke s390
    as it will modify the mcount_loc location after the sorting takes place
    and will put back the unsorted locations. Since the sorting is skipped at
    boot up if it is believed that it was sorted at build time, ftrace can crash
    as its algorithms are dependent on the list being sorted.
    
    Add a new config BUILDTIME_MCOUNT_SORT that is set when
    BUILDTIME_TABLE_SORT is set, but not if S390 is set. Use this config to determine
    if sorting should take place at boot up.
    
    Link: https://lore.kernel.org/all/yt9dee51ctfn.fsf@linux.ibm.com/
    
    Fixes: 72b3942 ("scripts: ftrace - move the sort-processing in ftrace_init")
    Reported-by: Sven Schnelle <svens@linux.ibm.com>
    Tested-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    rostedt committed Jan 23, 2022
  9. Merge tag 'kbuild-fixes-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
    
    Pull Kbuild fixes from Masahiro Yamada:
    
     - Bring include/uapi/linux/nfc.h into the UAPI compile-test coverage
    
     - Revert the workaround of CONFIG_CC_IMPLICIT_FALLTHROUGH
    
     - Fix build errors in certs/Makefile
    
    * tag 'kbuild-fixes-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
      certs: Fix build error when CONFIG_MODULE_SIG_KEY is empty
      certs: Fix build error when CONFIG_MODULE_SIG_KEY is PKCS#11 URI
      Revert "Makefile: Do not quote value for CONFIG_CC_IMPLICIT_FALLTHROUGH"
      usr/include/Makefile: add linux/nfc.h to the compile-test coverage
    torvalds committed Jan 23, 2022
  10. Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux

    Pull bitmap updates from Yury Norov:
    
     - introduce for_each_set_bitrange()
    
     - use find_first_*_bit() instead of find_next_*_bit() where possible
    
     - unify for_each_bit() macros
    
    * tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
      vsprintf: rework bitmap_list_string
      lib: bitmap: add performance test for bitmap_print_to_pagebuf
      bitmap: unify find_bit operations
      mm/percpu: micro-optimize pcpu_is_populated()
      Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
      find: micro-optimize for_each_{set,clear}_bit()
      include/linux: move for_each_bit() macros from bitops.h to find.h
      cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
      tools: sync tools/bitmap with mother linux
      all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
      cpumask: use find_first_and_bit()
      lib: add find_first_and_bit()
      arch: remove GENERIC_FIND_FIRST_BIT entirely
      include: move find.h from asm_generic to linux
      bitops: move find_bit_*_le functions from le.h to find.h
      bitops: protect find_first_{,zero}_bit properly
    torvalds committed Jan 23, 2022

Commits on Jan 22, 2022

  1. perf tools: Remove redundant err variable

    Return the value from perf_event__process_tracing_data() directly
    instead of storing it in a redundant local variable.
    
    Reported-by: Zeal Robot <zealci@zte.com.cn>
    Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: http://lore.kernel.org/lkml/20220112080109.666800-1-chi.minghao@zte.com.cn
    Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Minghao Chi authored and Arnaldo Carvalho de Melo committed Jan 22, 2022
  2. perf test: Add parse-events test for aliases with hyphens

    Add a test which allows us to test parsing an event alias with hyphens.
    
    Since these events typically do not exist on most host systems, add the
    alias to the fake pmu.
    
    Function perf_pmu__test_parse_init() has terms added to match known test
    aliases.
    
    Signed-off-by: John Garry <john.garry@huawei.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Kajol Jain <kjain@linux.ibm.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Qi Liu <liuqi115@huawei.com>
    Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
    Cc: linuxarm@huawei.com
    Link: https://lore.kernel.org/r/1642432215-234089-4-git-send-email-john.garry@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    johnpgarry authored and Arnaldo Carvalho de Melo committed Jan 22, 2022