Waiman-Long/cg…

Commits on Aug 14, 2021

  1. kselftest/cgroup: Add cpuset v2 partition root state test

    Add a test script test_cpuset_prs.sh with a helper program wait_inotify
    for exercising the cpuset v2 partition root state code.
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Waiman Long authored and intel-lab-lkp committed Aug 14, 2021
  2. cgroup/cpuset: Update description of cpuset.cpus.partition in cgroup-…

    …v2.rst
    
    Update Documentation/admin-guide/cgroup-v2.rst on the newly introduced
    "isolated" cpuset partition type as well as the ability to create
    non-top cpuset partitions with no CPUs allocated to them.
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Waiman Long authored and intel-lab-lkp committed Aug 14, 2021
  3. cgroup/cpuset: Allow non-top parent partition to distribute out all CPUs

    Currently, a parent partition cannot distribute all its CPUs to child
    partitions with no CPUs left. However in some use cases, a management
    application may want to create a parent partition as a management unit
    with no task associated with it and has all its CPUs distributed to
    various child partitions dynamically according to their needs. Leaving
    a cpu in the parent partition in such a case is now a waste.
    
    To accommodate such use cases, a parent partition can now have all its
    CPUs distributed to its child partitions with 0 effective cpu left as
    long as it is not the top cpuset and it has no task at the time the
    child partition is being created. A terminal partition with no child
    partition underlying it, however, cannot have 0 effective CPUs, as that
    would make the partition invalid.
    
    Once an empty parent partition is formed, no new task can be moved
    into it.
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Waiman Long authored and intel-lab-lkp committed Aug 14, 2021
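
The rule described in the commit above can be sketched as a small predicate. This is only an illustrative model, not the kernel implementation: the `Partition` class, its field names, and both helper functions are hypothetical and exist only to restate the conditions from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """Hypothetical, simplified model of a cpuset partition root."""
    is_top: bool               # True only for the top cpuset
    nr_tasks: int              # tasks attached directly to this cpuset
    nr_child_partitions: int   # child partition roots beneath it
    effective_cpus: frozenset  # CPUs left after distribution to children


def may_have_no_effective_cpus(p: Partition) -> bool:
    """Return True if p may end up with zero effective CPUs."""
    if p.is_top:
        return False   # the top cpuset is excluded from this relaxation
    if p.nr_tasks > 0:
        return False   # the parent must have no task of its own
    if p.nr_child_partitions == 0:
        return False   # a terminal partition with no CPUs is invalid
    return True


def can_attach_task(p: Partition) -> bool:
    """Once an empty parent partition is formed, no new task may join it."""
    return len(p.effective_cpus) > 0
```

For example, `may_have_no_effective_cpus(Partition(False, 0, 2, frozenset()))` is true, while the same check on a partition with tasks or without child partitions is false.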
  4. cgroup/cpuset: Add a new isolated cpus.partition type

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=TBD
    
    commit 994fb794cb252edd124a46ca0994e37a4726a100
    Author: Waiman Long <longman@redhat.com>
    Date:   Sat, 19 Jun 2021 13:28:19 -0400
    
        cgroup/cpuset: Add a new isolated cpus.partition type
    
        Cpuset v1 uses the sched_load_balance control file to determine if load
        balancing should be enabled.  Cpuset v2 gets rid of sched_load_balance
        as its use may require disabling load balancing at cgroup root.
    
        For workloads that require very low latency like DPDK, the latency
        jitters caused by periodic load balancing may exceed the desired
        latency limit.
    
        When cpuset v2 is in use, the only way to avoid this latency cost is to
        use the "isolcpus=" kernel boot option to isolate a set of CPUs. After
        the kernel boot, however, there is no way to add or remove CPUs from
        this isolated set. For workloads that are more dynamic in nature, that
        means users have to provision enough CPUs for the worst case situation
        resulting in excess idle CPUs.
    
        To address this issue for cpuset v2, a new cpuset.cpus.partition type
        "isolated" is added which allows the creation of a cpuset partition
        without load balancing. This will allow system administrators to
        dynamically adjust the size of isolated partition to the current need
        of the workload without rebooting the system.
    
        Signed-off-by: Waiman Long <longman@redhat.com>
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Waiman Long authored and intel-lab-lkp committed Aug 14, 2021
  5. cgroup/cpuset: Show invalid partition reason string

    There are a number of different reasons which can cause a partition to
    become invalid. A user seeing an invalid partition may not know exactly
    why. To help users better understand the underlying reason, the
    cpuset.cpus.partition control file, when read, will now report the
    reason why a partition became invalid. When a partition does become
    invalid, reading the control file will show "root invalid (<reason>)"
    where <reason> is a string that describes why the partition is invalid.
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Waiman Long authored and intel-lab-lkp committed Aug 14, 2021
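
A user-space tool could split the state from the reason along the lines below. This is a minimal sketch that assumes only the `root invalid (<reason>)` shape quoted in the commit above; the function and type names are hypothetical, and the reason text in the usage line is a placeholder rather than a string the kernel is known to emit.

```python
import re
from typing import NamedTuple, Optional


class PartitionState(NamedTuple):
    state: str             # e.g. "member", "root", "root invalid"
    reason: Optional[str]  # text inside the parentheses, if any


# State text, optionally followed by "(reason)".
_PATTERN = re.compile(r"^(?P<state>[^()]+?)(?:\s*\((?P<reason>.*)\))?$")


def parse_partition_state(text: str) -> PartitionState:
    """Parse the content read from cpuset.cpus.partition."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized partition state: {text!r}")
    return PartitionState(match.group("state"), match.group("reason"))
```

Calling `parse_partition_state("root invalid (placeholder reason)\n")` yields the state `root invalid` and the reason `placeholder reason`; a plain `root` yields no reason.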
  6. cgroup/cpuset: Properly transition to invalid partition

    For cpuset partition, the special state of PRS_ERROR (invalid partition
    root) was originally designed to handle hotplug events.  In this state,
    CPUs allocated to the partition root are released back to the parent
    but the cpuset exclusive flags remain unchanged.
    
    Changing a cpuset into a partition root is strictly controlled. The
    following constraints must be satisfied in order to make the transition
    possible:
    
     - The "cpuset.cpus" is not empty and the list of CPUs is exclusive,
       i.e. they are not shared by any of its siblings.
     - The parent cgroup is a partition root.
     - The "cpuset.cpus" is a subset of the parent's "cpuset.cpus.effective".
     - There are no child cgroups with cpuset enabled.
    
    Changing a partition root back to a member is always allowed, though care
    must be taken to make sure that this change won't break child cpusets,
    if present.
    
    Since partition root sets the CPU_EXCLUSIVE flag, cpuset.cpus changes
    that break the cpu exclusivity rule will not be allowed. However,
    other changes to cpuset.cpus on a partition root may still cause it to
    become invalid. So users must always check the partition root state of
    "cpuset.cpus.partition" after making changes to cpuset.cpus to make sure
    that the partition root is still valid.
    
    For a partition root tree with parent and child partition roots, there
    are two cases where the child partitions can become invalid. Firstly,
    changing partition state to "member" will force the child partitions
    to become invalid.
    
    Secondly, if some cpus are taken away from the parent partition root
    so that its cpuset.cpus.effective becomes empty, it will try to pull
    cpus away from the child partitions and force them to become invalid
    which may allow the parent partition to remain valid.
    
    This patch makes sure that partitions are properly changed to invalid
    when some of the valid partition constraints are violated.
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Waiman Long authored and intel-lab-lkp committed Aug 14, 2021
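
The four constraints listed in the commit above can be restated as a checker. The sketch below is a plain-Python model under stated assumptions: the `Cpuset` class and its fields are hypothetical stand-ins for kernel state, and the returned messages are illustrative, not kernel strings.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Cpuset:
    """Hypothetical model of the fields the constraints refer to."""
    cpus: Set[int]                 # "cpuset.cpus"
    effective_cpus: Set[int]       # "cpuset.cpus.effective"
    is_partition_root: bool = False
    children_with_cpuset: int = 0  # child cgroups with cpuset enabled


def partition_root_violations(cs: Cpuset, parent: Cpuset,
                              siblings: List[Cpuset]) -> List[str]:
    """Return the constraints that block cs from becoming a partition root."""
    problems = []
    if not cs.cpus:
        problems.append("cpuset.cpus is empty")
    if any(cs.cpus & sib.cpus for sib in siblings):
        problems.append("cpuset.cpus is shared with a sibling")
    if not parent.is_partition_root:
        problems.append("parent is not a partition root")
    if not cs.cpus <= parent.effective_cpus:
        problems.append("cpuset.cpus is not a subset of the parent's "
                        "cpuset.cpus.effective")
    if cs.children_with_cpuset:
        problems.append("child cgroups have cpuset enabled")
    return problems
```

An empty result means all four constraints hold in this model; the kernel may apply further checks that are not captured here.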

Commits on Aug 12, 2021

  1. cgroup/cpuset: Enable memory migration for cpuset v2

    When a user changes cpuset.cpus, each task in a v2 cpuset will be moved
    to one of the new cpus if it is not there already. For memory, however,
    they won't be migrated to the new nodes when cpuset.mems changes. This is
    an inconsistency in behavior.
    
    In cpuset v1, there is a memory_migrate control file to enable such
    behavior by setting the CS_MEMORY_MIGRATE flag. Make it the default
    for cpuset v2 so that we have a consistent set of behavior for both
    cpus and memory.
    
    There is certainly a cost to make memory migration the default, but it
    is a one time cost that shouldn't really matter as long as cpuset.mems
    isn't changed frequently.  Update the cgroup-v2.rst file to document the
    new behavior and recommend against changing cpuset.mems frequently.
    
    Since there won't be any concurrent access to the newly allocated cpuset
    structure in cpuset_css_alloc(), we can use the cheaper non-atomic
    __set_bit() instead of the more expensive atomic set_bit().
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Waiman Long authored and htejun committed Aug 12, 2021

Commits on Aug 11, 2021

  1. cgroup/cpuset: Enable event notification when partition state changes

    A valid cpuset partition can become invalid if all its CPUs are offlined
    or somehow removed. This can happen through external events without
    "cpuset.cpus.partition" being touched at all.
    
    Users that rely on the property of a partition being present do not
    currently have a simple way to be notified of such an event other than
    constant periodic polling, which is both inefficient and cumbersome.
    
    To make life easier for those users, event notification is now enabled
    for "cpuset.cpus.partition" whenever its state changes.
    
    Suggested-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Waiman Long authored and htejun committed Aug 11, 2021
  2. cgroup: cgroup-v1: clean up kernel-doc notation

    Fix kernel-doc warnings found in cgroup-v1.c:
    
    kernel/cgroup/cgroup-v1.c:55: warning: No description found for return value of 'cgroup_attach_task_all'
    kernel/cgroup/cgroup-v1.c:94: warning: expecting prototype for cgroup_trasnsfer_tasks(). Prototype was for cgroup_transfer_tasks() instead
    cgroup-v1.c:96: warning: No description found for return value of 'cgroup_transfer_tasks'
    kernel/cgroup/cgroup-v1.c:687: warning: No description found for return value of 'cgroupstats_build'
    
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Zefan Li <lizefan.x@bytedance.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: cgroups@vger.kernel.org
    Signed-off-by: Tejun Heo <tj@kernel.org>
    rddunlap authored and htejun committed Aug 11, 2021

Commits on Aug 9, 2021

  1. cgroup: Replace deprecated CPU-hotplug functions.

    The functions get_online_cpus() and put_online_cpus() have been
    deprecated during the CPU hotplug rework. They map directly to
    cpus_read_lock() and cpus_read_unlock().
    
    Replace deprecated CPU-hotplug functions with the official version.
    The behavior remains unchanged.
    
    Cc: Zefan Li <lizefan.x@bytedance.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: cgroups@vger.kernel.org
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Sebastian Andrzej Siewior authored and htejun committed Aug 9, 2021
  2. cgroup/cpuset: Fix violation of cpuset locking rule

    The cpuset fields that manage partition root state do not strictly
    follow the cpuset locking rule that updates to a cpuset have to be done
    with both the callback_lock and cpuset_mutex held. This is now fixed
    by making sure that the locking rule is upheld.
    
    Fixes: 3881b86 ("cpuset: Add an error state to cpuset.sched.partition")
    Fixes: 4b842da ("cpuset: Make CPU hotplug work with partition")
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Waiman Long authored and htejun committed Aug 9, 2021

Commits on Jul 27, 2021

  1. cgroup: rstat: fix A-A deadlock on 32bit around u64_stats_sync

    0fa294f ("cgroup: Replace cgroup_rstat_mutex with a spinlock") added
    cgroup_rstat_flush_irqsafe() allowing flushing to happen from the irq
    context. However, rstat paths use u64_stats_sync to synchronize access to
    64bit stat counters on 32bit machines. u64_stats_sync is implemented using
    seq_lock and trying to read from an irq context can lead to A-A deadlock if
    the irq happens to interrupt the stat update.
    
    Fix it by using the irqsafe variants - u64_stats_update_begin_irqsave() and
    u64_stats_update_end_irqrestore() - in the update paths. Note that none of
    this matters on 64bit machines. All these are just for 32bit SMP setups.
    
    Note that although the interface was introduced way back, its first and currently
    only use was recently added by 2d146aa ("mm: memcontrol: switch to
    rstat"). Stable tagging targets this commit.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Rik van Riel <riel@surriel.com>
    Fixes: 2d146aa ("mm: memcontrol: switch to rstat")
    Cc: stable@vger.kernel.org # v5.13+
    htejun committed Jul 27, 2021
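
The A-A deadlock described in the commit above can be modelled with a toy sequence counter. This is a single-threaded simulation, not kernel code: the `SeqCount` class, the `update` helper, and the bounded spin count are all assumptions made so that the "spins forever" case can be observed without hanging.

```python
class SeqCount:
    """Toy sequence counter: odd means a writer is mid-update."""

    def __init__(self):
        self.sequence = 0
        self.value = 0

    def write_begin(self):
        self.sequence += 1  # now odd: update in progress

    def write_end(self):
        self.sequence += 1  # now even: update complete

    def try_read(self, max_spins=1000):
        """Return the value, or None where a real reader would spin forever."""
        for _ in range(max_spins):
            start = self.sequence
            if start % 2:           # writer in progress, retry
                continue
            value = self.value
            if self.sequence == start:
                return value
        return None


def update(counter, new_value, irq_handler=None):
    """Writer path; irq_handler simulates an interrupt landing mid-update."""
    counter.write_begin()
    if irq_handler is not None:
        irq_handler()  # the interrupted writer cannot finish until this returns
    counter.value = new_value
    counter.write_end()
```

When the simulated interrupt handler reads the counter in the middle of `update`, `try_read` gives up and returns `None`, which stands in for the deadlock. Keeping interrupts off during the update, as the irqsave variants do, corresponds to calling `update` with no handler, after which a read succeeds.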

Commits on Jul 26, 2021

  1. cgroup/cpuset: Fix a partition bug with hotplug

    In cpuset_hotplug_workfn(), the detection of whether the cpu list
    has been changed is done by comparing the effective cpus of the top
    cpuset with the cpu_active_mask. However, in the rare case where all
    the CPUs in subparts_cpus, and only those, are offlined, the detection
    fails and the partition states are not updated correctly. Fix it by forcing
    the cpus_updated flag to true in this particular case.
    
    Fixes: 4b842da ("cpuset: Make CPU hotplug work with partition")
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Waiman Long authored and htejun committed Jul 26, 2021
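
The missed detection described in the commit above can be reproduced with plain sets. This is a sketch under assumptions: the function name and the `apply_fix` switch are hypothetical, and the model assumes the top cpuset's effective CPUs exclude those handed to sub-partitions.

```python
def cpus_updated(top_effective, subparts_cpus, active, apply_fix=True):
    """Model of the 'did the CPU list change?' check in the hotplug path.

    top_effective: effective CPUs of the top cpuset before the event
    subparts_cpus: CPUs currently given to sub-partitions
    active:        CPUs active after the hotplug event
    """
    updated = top_effective != active
    # Forced update from the commit message: with sub-partitions present,
    # an unchanged comparison cannot be trusted.
    if apply_fix and not updated and subparts_cpus:
        updated = True
    return updated


# CPUs 2-3 belong to a sub-partition and are then taken offline.
top_effective = {0, 1}
subparts = {2, 3}
active_after = {0, 1}
```

With these values the plain comparison sees `{0, 1} == {0, 1}` and reports no change, so `cpus_updated(top_effective, subparts, active_after, apply_fix=False)` is false, while the forced variant reports an update.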
  2. cgroup/cpuset: Miscellaneous code cleanup

    Use more descriptive variable names for update_prstate(), remove
    unnecessary code and fix some typos. There is no functional change.
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Waiman Long authored and htejun committed Jul 26, 2021

Commits on Jul 21, 2021

  1. cgroup1: fix leaked context root causing sporadic NULL deref in LTP

    Richard reported sporadic (roughly one in 10 or so) null dereferences and
    other strange behaviour for a set of automated LTP tests.  Things like:
    
       BUG: kernel NULL pointer dereference, address: 0000000000000008
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] PREEMPT SMP PTI
       CPU: 0 PID: 1516 Comm: umount Not tainted 5.10.0-yocto-standard #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
       RIP: 0010:kernfs_sop_show_path+0x1b/0x60
    
    ...or these others:
    
       RIP: 0010:do_mkdirat+0x6a/0xf0
       RIP: 0010:d_alloc_parallel+0x98/0x510
       RIP: 0010:do_readlinkat+0x86/0x120
    
    There were other less common instances of some kind of a general scribble
    but the common theme was mount and cgroup and a dubious dentry triggering
    the NULL dereference.  I was only able to reproduce it under qemu by
    replicating Richard's setup as closely as possible - I never did get it
    to happen on bare metal, even while keeping everything else the same.
    
    In commit 71d883c ("cgroup_do_mount(): massage calling conventions")
    we see this as a part of the overall change:
    
       --------------
               struct cgroup_subsys *ss;
       -       struct dentry *dentry;
    
       [...]
    
       -       dentry = cgroup_do_mount(&cgroup_fs_type, fc->sb_flags, root,
       -                                CGROUP_SUPER_MAGIC, ns);
    
       [...]
    
       -       if (percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
       -               struct super_block *sb = dentry->d_sb;
       -               dput(dentry);
       +       ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
       +       if (!ret && percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
       +               struct super_block *sb = fc->root->d_sb;
       +               dput(fc->root);
                       deactivate_locked_super(sb);
                       msleep(10);
                       return restart_syscall();
               }
       --------------
    
    In changing from the local "*dentry" variable to using fc->root, we now
    export/leave that dentry pointer in the file context after doing the dput()
    in the unlikely "is_dying" case.   With LTP doing a crazy amount of back to
    back mount/unmount [testcases/bin/cgroup_regression_5_1.sh] the unlikely
    becomes slightly likely and then bad things happen.
    
    A fix would be to not leave the stale reference in fc->root as follows:
    
       --------------
                      dput(fc->root);
      +               fc->root = NULL;
                      deactivate_locked_super(sb);
       --------------
    
    ...but then we are just open-coding a duplicate of fc_drop_locked() so we
    simply use that instead.
    
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Zefan Li <lizefan.x@bytedance.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: stable@vger.kernel.org      # v5.1+
    Reported-by: Richard Purdie <richard.purdie@linuxfoundation.org>
    Fixes: 71d883c ("cgroup_do_mount(): massage calling conventions")
    Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    paulgortmaker authored and htejun committed Jul 21, 2021

Commits on Jul 20, 2021

  1. seq_file: disallow extremely large seq buffer allocations

    There is no reasonable need for a buffer larger than this, and it avoids
    int overflow pitfalls.
    
    Fixes: 058504e ("fs/seq_file: fallback to vmalloc allocation")
    Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
    Reported-by: Qualys Security Advisory <qsa@qualys.com>
    Signed-off-by: Eric Sandeen <sandeen@redhat.com>
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Eric Sandeen authored and torvalds committed Jul 20, 2021

Commits on Jul 18, 2021

  1. Linux 5.14-rc2

    torvalds committed Jul 18, 2021
  2. Merge tag 'perf-tools-fixes-for-v5.14-2021-07-18' of git://git.kernel…

    ….org/pub/scm/linux/kernel/git/acme/linux
    
    Pull perf tools fixes from Arnaldo Carvalho de Melo:
    
     - Skip invalid hybrid PMU on hybrid systems when the atom (little) CPUs
       are offlined.
    
     - Fix 'perf test' problems related to the recently added hybrid
       (BIG/little) code.
    
     - Split ARM's coresight (hw tracing) decode by aux records to avoid
       fatal decoding errors.
    
     - Fix add event failure in 'perf probe' when running 32-bit perf in a
       64-bit kernel.
    
     - Fix 'perf sched record' failure when CONFIG_SCHEDSTATS is not set.
    
     - Fix memory and refcount leaks detected by ASAn when running 'perf
       test', should be clean of warnings now.
    
     - Remove broken definition of __LITTLE_ENDIAN from tools'
       linux/kconfig.h, which was breaking the build in some systems.
    
     - Cast PTHREAD_STACK_MIN to int as it may turn into 'long
       sysconf(__SC_THREAD_STACK_MIN_VALUE)', breaking the build in some
       systems.
    
     - Fix libperf build error with LIBPFM4=1.
    
     - Sync UAPI files changed by the memfd_secret new syscall.
    
    * tag 'perf-tools-fixes-for-v5.14-2021-07-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (35 commits)
      perf sched: Fix record failure when CONFIG_SCHEDSTATS is not set
      perf probe: Fix add event failure when running 32-bit perf in a 64-bit kernel
      perf data: Close all files in close_dir()
      perf probe-file: Delete namelist in del_events() on the error path
      perf test bpf: Free obj_buf
      perf trace: Free strings in trace__parse_events_option()
      perf trace: Free syscall tp fields in evsel->priv
      perf trace: Free syscall->arg_fmt
      perf trace: Free malloc'd trace fields on exit
      perf lzma: Close lzma stream on exit
      perf script: Fix memory 'threads' and 'cpus' leaks on exit
      perf script: Release zstd data
      perf session: Cleanup trace_event
      perf inject: Close inject.output on exit
      perf report: Free generated help strings for sort option
      perf env: Fix memory leak of cpu_pmu_caps
      perf test maps__merge_in: Fix memory leak of maps
      perf dso: Fix memory leak in dso__new_map()
      perf test event_update: Fix memory leak of unit
      perf test event_update: Fix memory leak of evlist
      ...
    torvalds committed Jul 18, 2021
  3. Merge tag 'xfs-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/x…

    …fs-linux
    
    Pull xfs fixes from Darrick Wong:
     "A few fixes for issues in the new online shrink code, additional
      corrections for my recent bug-hunt w.r.t. extent size hints on
      realtime, and improved input checking of the GROWFSRT ioctl.
    
      IOW, the usual 'I somehow got bored during the merge window and
      resumed auditing the farther reaches of xfs':
    
       - Fix shrink eligibility checking when sparse inode clusters enabled
    
       - Reset '..' directory entries when unlinking directories to prevent
         verifier errors if fs is shrunk later
    
       - Don't report unusable extent size hints to FSGETXATTR
    
       - Don't warn when extent size hints are unusable because the sysadmin
         configured them that way
    
       - Fix insufficient parameter validation in GROWFSRT ioctl
    
       - Fix integer overflow when adding rt volumes to filesystem"
    
    * tag 'xfs-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
      xfs: detect misaligned rtinherit directory extent size hints
      xfs: fix an integer overflow error in xfs_growfs_rt
      xfs: improve FSGROWFSRT precondition checking
      xfs: don't expose misaligned extszinherit hints to userspace
      xfs: correct the narrative around misaligned rtinherit/extszinherit dirs
      xfs: reset child dir '..' entry when unlinking child
      xfs: check for sparse inode clusters that cross new EOAG when shrinking
    torvalds committed Jul 18, 2021
  4. Merge tag 'iomap-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs…

    …/xfs-linux
    
    Pull iomap fixes from Darrick Wong:
     "A handful of bugfixes for the iomap code.
    
      There's nothing especially exciting here, just fixes for UBSAN (not
      KASAN as I erroneously wrote in the tag message) warnings about
      undefined behavior in the SEEK_DATA/SEEK_HOLE code, and some
      reshuffling of per-page block state info to fix some problems with
      gfs2.
    
       - Fix KASAN warnings due to integer overflow in SEEK_DATA/SEEK_HOLE
    
       - Fix assertion errors when using inlinedata files on gfs2"
    
    * tag 'iomap-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
      iomap: Don't create iomap_page objects in iomap_page_mkwrite_actor
      iomap: Don't create iomap_page objects for inline files
      iomap: Permit pages without an iop to enter writeback
      iomap: remove the length variable in iomap_seek_hole
      iomap: remove the length variable in iomap_seek_data
    torvalds committed Jul 18, 2021
  5. Merge tag 'kbuild-fixes-v5.14' of git://git.kernel.org/pub/scm/linux/…

    …kernel/git/masahiroy/linux-kbuild
    
    Pull Kbuild fixes from Masahiro Yamada:
    
     - Restore the original behavior of scripts/setlocalversion when
       LOCALVERSION is set to empty.
    
     - Show Kconfig prompts even for 'make -s'
    
     - Fix the combination of CONFIG_LTO_CLANG=y and CONFIG_MODVERSIONS=y
       for older GNU Make versions
    
    * tag 'kbuild-fixes-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
      Documentation: Fix intiramfs script name
      Kbuild: lto: fix module versionings mismatch in GNU make 3.X
      kbuild: do not suppress Kconfig prompts for silent build
      scripts/setlocalversion: fix a bug when LOCALVERSION is empty
    torvalds committed Jul 18, 2021
  6. Documentation: Fix intiramfs script name

    Documentation was not changed when renaming the script in commit
    80e715a ("initramfs: rename gen_initramfs_list.sh to
    gen_initramfs.sh"). Fixing this.
    
    Basically does:
    
     $ sed -i -e s/gen_initramfs_list.sh/gen_initramfs.sh/g $(git grep -l gen_initramfs_list.sh)
    
    Fixes: 80e715a ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh")
    Signed-off-by: Robert Richter <rrichter@amd.com>
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Robert Richter authored and masahir0y committed Jul 18, 2021
  7. Kbuild: lto: fix module versionings mismatch in GNU make 3.X

    When building modules (CONFIG_...=m), I found that some module versions
    are incorrect and set to 0.
    This shows up in the build log of the first clean build as
    
    WARNING: EXPORT symbol "XXXX" [drivers/XXX/XXX.ko] version generation failed,
    symbol will not be versioned.
    
    But in the second (incremental) build, the WARNING disappears and the
    module version becomes a valid CRC. As a result, anyone who wants to
    change modules without updating the kernel image can't insert their modules.
    
    The problematic code is
    +	$(foreach n, $(filter-out FORCE,$^),				\
    +		$(if $(wildcard $(n).symversions),			\
    +			; cat $(n).symversions >> $@.symversions))
    
    For example:
      rm -f fs/notify/built-in.a.symversions    ; rm -f fs/notify/built-in.a; \
    llvm-ar cDPrST fs/notify/built-in.a fs/notify/fsnotify.o \
    fs/notify/notification.o fs/notify/group.o ...
    
    `foreach n` produces no `cat` of $(n).symversions because
    `if $(wildcard $(n).symversions)` returns nothing, even though the
    files do exist by the time this line is executed.
    
    -rw-r--r-- 1 root root 168580 Jun 13 19:10 fs/notify/fsnotify.o
    -rw-r--r-- 1 root root    111 Jun 13 19:10 fs/notify/fsnotify.o.symversions
    
    The reason is that the $(n).symversions files are generated at runtime,
    but the Makefile wildcard function expands and checks whether the files
    exist while the Makefile is being parsed.
    
    Thus fix this by using the `test` shell command to check for file
    existence at runtime.
    
    Rebase from both:
    1. [https://lore.kernel.org/lkml/20210616080252.32046-1-lecopzer.chen@mediatek.com/]
    2. [https://lore.kernel.org/lkml/20210702032943.7865-1-lecopzer.chen@mediatek.com/]
    
    Fixes: 38e8918 ("kbuild: lto: fix module versioning")
    Co-developed-by: Sami Tolvanen <samitolvanen@google.com>
    Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Lecopzer Chen authored and masahir0y committed Jul 18, 2021
  8. kbuild: do not suppress Kconfig prompts for silent build

    When a new CONFIG option is available, Kbuild shows a prompt to get
    the user input.
    
      $ make
      [ snip ]
      Core Scheduling for SMT (SCHED_CORE) [N/y/?] (NEW)
    
    This is the only interactive place in the build process.
    
    Commit 174a1dc ("kbuild: sink stdout from cmd for silent build")
    suppressed Kconfig prompts as well because syncconfig is invoked by
    the 'cmd' macro. You cannot notice the fact that Kconfig is waiting
    for the user input.
    
    Use 'kecho' to show the equivalent short log without suppressing stdout
    from sub-make.
    
    Fixes: 174a1dc ("kbuild: sink stdout from cmd for silent build")
    Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    masahir0y committed Jul 18, 2021
  9. scripts/setlocalversion: fix a bug when LOCALVERSION is empty

    The commit 042da42 ("scripts/setlocalversion: simplify the short
    version part") reduces indentation. Unfortunately, it also changes behavior
    in a subtle way - if the user has empty "LOCALVERSION" variable, the plus
    sign is appended to the kernel version. It wasn't appended before.
    
    This patch reverts to the old behavior - we append the plus sign only if
    the LOCALVERSION variable is not set.
    
    Fixes: 042da42 ("scripts/setlocalversion: simplify the short version part")
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Mikulas Patocka authored and masahir0y committed Jul 18, 2021
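
The difference between an unset and an empty LOCALVERSION, which is the heart of the commit above, can be shown in a few lines. The real script is shell; this Python model is only a sketch, both function names are hypothetical, and any other condition the script applies before appending the plus sign is ignored.

```python
def plus_suffix_regressed(env):
    """Regressed behavior: an empty LOCALVERSION is treated like an unset one."""
    return "+" if not env.get("LOCALVERSION") else ""


def plus_suffix_fixed(env):
    """Restored behavior: append '+' only when LOCALVERSION is not set at all."""
    return "+" if "LOCALVERSION" not in env else ""
```

With `env = {"LOCALVERSION": ""}` the regressed check appends the plus sign and the restored check does not; with an empty `env` both append it.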
  10. perf sched: Fix record failure when CONFIG_SCHEDSTATS is not set

    The tracepoints trace_sched_stat_{wait, sleep, iowait} are not exposed to the
    user if CONFIG_SCHEDSTATS is not set, but "perf sched record" still records
    the three events. As a result, the command fails.
    
    Before:
    
      # perf sched record sleep 1
      event syntax error: 'sched:sched_stat_wait'
                           \___ unknown tracepoint
    
      Error:  File /sys/kernel/tracing/events/sched/sched_stat_wait not found.
      Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
    
      Run 'perf list' for a list of valid events
    
       Usage: perf record [<options>] [<command>]
          or: perf record [<options>] -- <command> [<options>]
    
          -e, --event <event>   event selector. use 'perf list' to list available events
    
    Solution:
      Check whether schedstat tracepoints are exposed. If not, these events are not recorded.
    
    After:
      # perf sched record sleep 1
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.163 MB perf.data (1091 samples) ]
      # perf sched report
      run measurement overhead: 4736 nsecs
      sleep measurement overhead: 9059979 nsecs
      the run test took 999854 nsecs
      the sleep test took 8945271 nsecs
      nr_run_events:        716
      nr_sleep_events:      785
      nr_wakeup_events:     0
      ...
      ------------------------------------------------------------
    
    Fixes: 2a09b5d ("sched/fair: do not expose some tracepoints to user if CONFIG_SCHEDSTATS is not set")
    Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Cc: Yafang Shao <laoar.shao@gmail.com>
    Link: http://lore.kernel.org/lkml/20210713112358.194693-1-yangjihong1@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Yang Jihong authored and Arnaldo Carvalho de Melo committed Jul 18, 2021
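
The idea of the fix in the commit above, recording only tracepoints that the kernel actually exposes, can be sketched as a filter over the tracing directory named in the error message. This is an illustration and not the perf implementation: the function name and the `events_dir` parameter are hypothetical.

```python
import os


def available_sched_events(candidates,
                           events_dir="/sys/kernel/tracing/events/sched"):
    """Keep only the tracepoint names that exist under events_dir."""
    return [name for name in candidates
            if os.path.exists(os.path.join(events_dir, name))]


# The three schedstat tracepoints named in the commit message.
SCHEDSTAT_EVENTS = ["sched_stat_wait", "sched_stat_sleep", "sched_stat_iowait"]
```

On a kernel built without CONFIG_SCHEDSTATS, `available_sched_events(SCHEDSTAT_EVENTS)` would be expected to return an empty list, so none of the three events would be requested.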
  11. perf probe: Fix add event failure when running 32-bit perf in a 64-bi…

    …t kernel
    
    The "address" member of "struct probe_trace_point" uses the long data type.
    If the kernel is 64-bit and the perf program is 32-bit, the size of the
    "address" variable is 32 bits.
    
    As a result, the upper 32 bits of an address read from the kernel are
    truncated, and an error occurs during address comparison in
    kprobe_warn_out_range().
    
    Before:
    
      # perf probe -a schedule
      schedule is out of .text, skip it.
        Error: Failed to add events.
    
    Solution:
      Change the data type of the "address" variable to u64 and update the
      corresponding address printing and value assignment.
    
    After:
    
      # perf.new.new probe -a schedule
      Added new event:
        probe:schedule       (on schedule)
    
      You can now use it in all perf tools, such as:
    
              perf record -e probe:schedule -aR sleep 1
    
      # perf probe -l
        probe:schedule       (on schedule@kernel/sched/core.c)
      # perf record -e probe:schedule -aR sleep 1
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.156 MB perf.data (1366 samples) ]
      # perf report --stdio
      # To display the perf.data header info, please use --header/--header-only options.
      #
      #
      # Total Lost Samples: 0
      #
      # Samples: 1K of event 'probe:schedule'
      # Event count (approx.): 1366
      #
      # Overhead  Command          Shared Object      Symbol
      # ........  ...............  .................  ............
      #
           6.22%  migration/0      [kernel.kallsyms]  [k] schedule
           6.22%  migration/1      [kernel.kallsyms]  [k] schedule
           6.22%  migration/2      [kernel.kallsyms]  [k] schedule
           6.22%  migration/3      [kernel.kallsyms]  [k] schedule
           6.15%  migration/10     [kernel.kallsyms]  [k] schedule
           6.15%  migration/11     [kernel.kallsyms]  [k] schedule
           6.15%  migration/12     [kernel.kallsyms]  [k] schedule
           6.15%  migration/13     [kernel.kallsyms]  [k] schedule
           6.15%  migration/14     [kernel.kallsyms]  [k] schedule
           6.15%  migration/15     [kernel.kallsyms]  [k] schedule
           6.15%  migration/4      [kernel.kallsyms]  [k] schedule
           6.15%  migration/5      [kernel.kallsyms]  [k] schedule
           6.15%  migration/6      [kernel.kallsyms]  [k] schedule
           6.15%  migration/7      [kernel.kallsyms]  [k] schedule
           6.15%  migration/8      [kernel.kallsyms]  [k] schedule
           6.15%  migration/9      [kernel.kallsyms]  [k] schedule
           0.22%  rcu_sched        [kernel.kallsyms]  [k] schedule
      ...
      #
      # (Cannot load tips.txt file, please install perf!)
      #
    
    Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
    Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Frank Ch. Eigler <fche@redhat.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jianlin Lv <jianlin.lv@arm.com>
    Cc: Jin Yao <yao.jin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Li Huafei <lihuafei1@huawei.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
    Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    Link: http://lore.kernel.org/lkml/20210715063723.11926-1-yangjihong1@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Yang Jihong authored and Arnaldo Carvalho de Melo committed Jul 18, 2021
  12. perf data: Close all files in close_dir()

    When using 'perf report' in directory mode, the first file is not closed
    on exit, causing a memory leak.
    
    The problem is caused by the iterating variable never reaching 0.
    
    Fixes: 1455206 ("perf data: Add perf_data__(create_dir|close_dir) functions")
    Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
    Acked-by: Namhyung Kim <namhyung@kernel.org>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Zhen Lei <thunder.leizhen@huawei.com>
    Link: http://lore.kernel.org/lkml/20210716141122.858082-1-rickyman7@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Manciukic authored and Arnaldo Carvalho de Melo committed Jul 18, 2021
  13. perf probe-file: Delete namelist in del_events() on the error path

    ASan reports some memory leaks when running:
    
      # perf test "42: BPF filter"
    
    This second leak is caused by a strlist not being deallocated on error
    inside probe_file__del_events.
    
    This patch adds a goto label before the deallocation and makes the error
    path jump to it.
    
    Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
    Fixes: e7895e4 ("perf probe: Split del_perf_probe_events()")
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lore.kernel.org/lkml/174963c587ae77fa108af794669998e4ae558338.1626343282.git.rickyman7@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Manciukic authored and Arnaldo Carvalho de Melo committed Jul 18, 2021

Commits on Jul 17, 2021

  1. Merge tag 'soc-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/ke…

    …rnel/git/soc/soc
    
    Pull ARM SoC fixes from Arnd Bergmann:
     "Here are the patches for this week that came as the fallout of the
      merge window:
    
       - Two fixes for the NVidia memory controller driver
    
       - multiple defconfig files get patched to turn CONFIG_FB back on
         after that is no longer selected by CONFIG_DRM
    
       - ffa, scmi and scpi firmware driver fixes, mostly addressing
         compiler and documentation warnings
    
       - Platform specific fixes for device tree files on ASpeed, Renesas
         and NVidia SoC, mostly for recent regressions.
    
       - A workaround for a regression on the USB PHY with devlink when the
         usb-nop-xceiv driver is not available until the rootfs is mounted.
    
       - Device tree compiler warnings in Arm Versatile-AB"
    
    * tag 'soc-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits)
      ARM: dts: versatile: Fix up interrupt controller node names
      ARM: multi_v7_defconfig: Make NOP_USB_XCEIV driver built-in
      ARM: configs: Update u8500_defconfig
      ARM: configs: Update Vexpress defconfig
      ARM: configs: Update Versatile defconfig
      ARM: configs: Update RealView defconfig
      ARM: configs: Update Integrator defconfig
      arm: Typo s/PCI_IXP4XX_LEGACY/IXP4XX_PCI_LEGACY/
      firmware: arm_scmi: Fix range check for the maximum number of pending messages
      firmware: arm_scmi: Avoid padding in sensor message structure
      firmware: arm_scmi: Fix kernel doc warnings about return values
      firmware: arm_scpi: Fix kernel doc warnings
      firmware: arm_scmi: Fix kernel doc warnings
      ARM: shmobile: defconfig: Restore graphical consoles
      firmware: arm_ffa: Fix a possible ffa_linux_errmap buffer overflow
      firmware: arm_ffa: Fix the comment style
      firmware: arm_ffa: Simplify probe function
      firmware: arm_ffa: Ensure drivers provide a probe function
      firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
      firmware: arm_scmi: Ensure drivers provide a probe function
      ...
    torvalds committed Jul 17, 2021
  2. Revert "mm/slub: use stackdepot to save stack trace in objects"

    This reverts commit 7886914.
    
    It's not clear why, but it causes unexplained problems in entirely
    unrelated xfs code.  The most likely explanation is some slab
    corruption, possibly triggered due to CONFIG_SLUB_DEBUG_ON.  See [1].
    
    It ends up having a few other problems too, like build errors on
    arch/arc [2], and Geert reporting it using much more memory on m68k [3] (it
    probably does so elsewhere too, but it is probably just more noticeable
    on m68k).
    
    The architecture issues (both build and memory use) are likely just
    because this change effectively force-enabled STACKDEPOT (along with a
    very bad default value for the stackdepot hash size).  But together with
    the xfs issue, this all smells like "this commit was not ready" to me.
    
    Link: https://lore.kernel.org/linux-xfs/YPE3l82acwgI2OiV@infradead.org/ [1]
    Link: https://lore.kernel.org/lkml/202107150600.LkGNb4Vb-lkp@intel.com/ [2]
    Link: https://lore.kernel.org/lkml/CAMuHMdW=eoVzM1Re5FVoEN87nKfiLmM2+Ah7eNu2KXEhCvbZyA@mail.gmail.com/ [3]
    Reported-by: Christoph Hellwig <hch@infradead.org>
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Randy Dunlap <rdunlap@infradead.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    torvalds committed Jul 17, 2021
  3. Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/g…

    …it/jejb/scsi
    
    Pull SCSI fixes from James Bottomley:
     "One core fix for an oops which can occur if the error handling thread
      fails to start for some reason and the driver is removed.
    
      The other fixes are all minor ones in drivers"
    
    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
      scsi: ufs: core: Add missing host_lock in ufshcd_vops_setup_xfer_req()
      scsi: mpi3mr: Fix W=1 compilation warnings
      scsi: pm8001: Clean up kernel-doc and comments
      scsi: zfcp: Report port fc_security as unknown early during remote cable pull
      scsi: core: Fix bad pointer dereference when ehandler kthread is invalid
      scsi: fas216: Fix a build error
      scsi: core: Fix the documentation of the scsi_execute() time parameter
    torvalds committed Jul 17, 2021
  4. Merge tag '5.14-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

    Pull cifs fixes from Steve French:
     "Eight cifs/smb3 fixes, including three for stable.
    
      Three are DFS related fixes, and two to fix problems pointed out by
      static checkers"
    
    * tag '5.14-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
      cifs: do not share tcp sessions of dfs connections
      SMB3.1.1: fix mount failure to some servers when compression enabled
      cifs: added WARN_ON for all the count decrements
      cifs: fix missing null session check in mount
      cifs: handle reconnect of tcon when there is no cached dfs referral
      cifs: fix the out of range assignment to bit fields in parse_server_interfaces
      cifs: Do not use the original cruid when following DFS links for multiuser mounts
      cifs: use the expiry output of dns_query to schedule next resolution
    torvalds committed Jul 17, 2021