Commits on Jan 24, 2022

  1. perf: Capture branch privilege information

    Platforms like arm64 can capture privilege level information for all the
    branch records. Hence this adds a new field to struct perf_branch_entry to
    record the privilege level information, which can be requested through a
    new event.attr.branch_sample_type flag, PERF_SAMPLE_BRANCH_PRIV_SAVE. While
    here, update the BRBE driver as required.
    
    Cc: Will Deacon <will@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-perf-users@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  2. perf: Expand perf_branch_entry.type

    Currently perf_branch_entry.type is a 4-bit field, just enough to
    accommodate 16 generic branch types. This is insufficient for platforms
    like arm64, which have many more branch types. Let's expand this field to
    6 bits, which can hold 64 generic branch types. This also adds more
    generic branch types and updates the BRBE driver as required.
    
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-perf-users@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  3. perf: Add more generic branch types

    This expands the generic branch type classification by adding some more
    entries that can still be represented with the existing 4-bit 'type'
    field. While here, this also updates the x86 implementation with these
    new branch types.
    
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-perf-users@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  4. arm64/perf: Enable branch stack sampling

    Now that all the required pieces are in place, enable perf branch stack
    sampling support on the arm64 platform by removing the gate which blocks
    it in armpmu_event_init().
    
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  5. arm64/perf: Add BRBE driver

    This adds a BRBE driver which implements all the required helper functions
    for struct arm_pmu. The following functions, defined by this driver,
    configure, enable, capture, reset and disable the BRBE buffer HW as and
    when requested via the perf branch stack sampling framework.
    
    - arm64_pmu_brbe_filter()
    - arm64_pmu_brbe_enable()
    - arm64_pmu_brbe_disable()
    - arm64_pmu_brbe_read()
    - arm64_pmu_brbe_probe()
    - arm64_pmu_brbe_reset()
    - arm64_pmu_brbe_supported()
    
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-perf-users@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  6. arm64/perf: Drive BRBE from perf event states

    Branch stack sampling rides along with the normal perf event, and all the
    branch records get captured during the PMU interrupt. This changes perf
    event handling on the arm64 platform to accommodate the BRBE operations
    required to enable branch stack sampling support.
    
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: linux-perf-users@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  7. arm64/perf: Detect support for BRBE

    CPU-specific BRBE entries, cycle count and format support get detected
    during PMU init. This information is saved in the per-cpu struct
    pmu_hw_events, which later helps in operating BRBE during a perf event
    context.
    
    Cc: Will Deacon <will@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  8. arm64/perf: Update struct pmu_hw_events for BRBE

    BRBE-related contexts and data for a single perf event instance will be
    tracked in struct pmu_hw_events. Hence update the structure to accommodate
    the required BRBE details.
    
    Cc: Will Deacon <will@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  9. arm64/perf: Update struct arm_pmu for BRBE

    This updates struct arm_pmu to include all the required helpers that will
    drive BRBE functionality for a given PMU implementation:
    
    - brbe_filter	: Convert perf event filters into BRBE HW filters
    - brbe_probe	: Probe BRBE HW and capture its attributes
    - brbe_enable	: Enable BRBE HW with a given config
    - brbe_disable	: Disable BRBE HW
    - brbe_read	: Read BRBE buffer for captured branch records
    - brbe_reset	: Reset BRBE buffer
    - brbe_supported: Whether BRBE is supported or not
    
    A BRBE driver implementation needs to provide these functionalities.
    
    Cc: Will Deacon <will@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-perf-users@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  10. arm64/perf: Add register definitions for BRBE

    This adds BRBE-related register definitions and the various associated
    field macros therein. These will be used subsequently by the BRBE driver
    which is being added later on.
    
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: Marc Zyngier <maz@kernel.org>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022
  11. perf: Consolidate branch sample filter helpers

    Besides branch type filtering requests, 'event.attr.branch_sample_type'
    also contains various flags indicating which additional information should
    be captured along with the base branch record. These flags help configure
    the underlying hardware and capture the branch records appropriately when
    required, e.g. after a PMU interrupt. But first, this moves the existing
    helper perf_sample_save_hw_index() into the header before adding more
    helpers for other branch sample filter flags.
    
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
    Cc: linux-perf-users@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Anshuman Khandual authored and intel-lab-lkp committed Jan 24, 2022

Commits on Jan 20, 2022

  1. arm64: mm: apply __ro_after_init to memory_limit

    This variable is only set during initialization, so mark with
    __ro_after_init.
    
    Signed-off-by: Peng Fan <peng.fan@nxp.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Acked-by: Ard Biesheuvel <ardb@kernel.org>
    Link: https://lore.kernel.org/r/20211215064559.2843555-1-peng.fan@oss.nxp.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    MrVan authored and ctmarinas committed Jan 20, 2022
  2. arm64: atomics: lse: Dereference matching size

    When building with -Warray-bounds, the following warning is generated:
    
    In file included from ./arch/arm64/include/asm/lse.h:16,
                     from ./arch/arm64/include/asm/cmpxchg.h:14,
                     from ./arch/arm64/include/asm/atomic.h:16,
                     from ./include/linux/atomic.h:7,
                     from ./include/asm-generic/bitops/atomic.h:5,
                     from ./arch/arm64/include/asm/bitops.h:25,
                     from ./include/linux/bitops.h:33,
                     from ./include/linux/kernel.h:22,
                     from kernel/printk/printk.c:22:
    ./arch/arm64/include/asm/atomic_lse.h:247:9: warning: array subscript 'long unsigned int[0]' is partly outside array bounds of 'atomic_t[1]' [-Warray-bounds]
      247 |         asm volatile(                                                   \
          |         ^~~
    ./arch/arm64/include/asm/atomic_lse.h:266:1: note: in expansion of macro '__CMPXCHG_CASE'
      266 | __CMPXCHG_CASE(w,  , acq_, 32,  a, "memory")
          | ^~~~~~~~~~~~~~
    kernel/printk/printk.c:3606:17: note: while referencing 'printk_cpulock_owner'
     3606 | static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
          |                 ^~~~~~~~~~~~~~~~~~~~
    
    This is due to the compiler seeing an unsigned long * cast against
    something (atomic_t) that is int sized. Replace the cast with the
    matching size cast. This results in no change in binary output.
    
    Note that __ll_sc__cmpxchg_case_##name##sz already uses the same
    constraint:
    
    	[v] "+Q" (*(u##sz *)ptr
    
    Which is why only the LSE form needs updating and not the
    LL/SC form, so this change is unlikely to be problematic.
    
    Cc: Will Deacon <will@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Cc: linux-arm-kernel@lists.infradead.org
    Acked-by: Ard Biesheuvel <ardb@kernel.org>
    Acked-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Link: https://lore.kernel.org/r/20220112202259.3950286-1-keescook@chromium.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    kees authored and ctmarinas committed Jan 20, 2022
  3. asm-generic: Add missing brackets for io_stop_wc macro

    After using io_stop_wc(), drivers report the following compile error when
    built on x86:
    
      drivers/net/ethernet/hisilicon/hns3/hns3_enet.c: In function ‘hns3_tx_push_bd’:
      drivers/net/ethernet/hisilicon/hns3/hns3_enet.c:2058:12: error: expected ‘;’ before ‘(’ token
        io_stop_wc();
                  ^
    This is because I missed adding the brackets after the io_stop_wc macro.
    So let's add the missing brackets.
    
    Fixes: d5624bb ("asm-generic: introduce io_stop_wc() and add implementation for ARM64")
    Reported-by: Guangbin Huang <huangguangbin2@huawei.com>
    Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
    Link: https://lore.kernel.org/r/20220114105857.126300-1-wangxiongfeng2@huawei.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    fenghusthu authored and ctmarinas committed Jan 20, 2022

Commits on Jan 5, 2022

  1. Merge branches 'for-next/misc', 'for-next/cache-ops-dzp', 'for-next/stacktrace', 'for-next/xor-neon', 'for-next/kasan', 'for-next/armv8_7-fp', 'for-next/atomics', 'for-next/bti', 'for-next/sve', 'for-next/kselftest' and 'for-next/kcsan', remote-tracking branch 'arm64/for-next/perf' into for-next/core
    
    * arm64/for-next/perf: (32 commits)
      arm64: perf: Don't register user access sysctl handler multiple times
      drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check
      perf/smmuv3: Fix unused variable warning when CONFIG_OF=n
      arm64: perf: Support new DT compatibles
      arm64: perf: Simplify registration boilerplate
      arm64: perf: Support Denver and Carmel PMUs
      drivers/perf: hisi: Add driver for HiSilicon PCIe PMU
      docs: perf: Add description for HiSilicon PCIe PMU driver
      dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings
      drivers: perf: Add LLC-TAD perf counter support
      perf/smmuv3: Synthesize IIDR from CoreSight ID registers
      perf/smmuv3: Add devicetree support
      dt-bindings: Add Arm SMMUv3 PMCG binding
      perf/arm-cmn: Add debugfs topology info
      perf/arm-cmn: Add CI-700 Support
      dt-bindings: perf: arm-cmn: Add CI-700
      perf/arm-cmn: Support new IP features
      perf/arm-cmn: Demarcate CMN-600 specifics
      perf/arm-cmn: Move group validation data off-stack
      perf/arm-cmn: Optimise DTC counter accesses
      ...
    
    * for-next/misc:
      : Miscellaneous patches
      arm64: Use correct method to calculate nomap region boundaries
      arm64: Drop outdated links in comments
      arm64: errata: Fix exec handling in erratum 1418040 workaround
      arm64: Unhash early pointer print plus improve comment
      asm-generic: introduce io_stop_wc() and add implementation for ARM64
      arm64: remove __dma_*_area() aliases
      docs/arm64: delete a space from tagged-address-abi
      arm64/fp: Add comments documenting the usage of state restore functions
      arm64: mm: Use asid feature macro for cheanup
      arm64: mm: Rename asid2idx() to ctxid2asid()
      arm64: kexec: reduce calls to page_address()
      arm64: extable: remove unused ex_handler_t definition
      arm64: entry: Use SDEI event constants
      arm64: Simplify checking for populated DT
      arm64/kvm: Fix bitrotted comment for SVE handling in handle_exit.c
    
    * for-next/cache-ops-dzp:
      : Avoid DC instructions when DCZID_EL0.DZP == 1
      arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1
      arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
    
    * for-next/stacktrace:
      : Unify the arm64 unwind code
      arm64: Make some stacktrace functions private
      arm64: Make dump_backtrace() use arch_stack_walk()
      arm64: Make profile_pc() use arch_stack_walk()
      arm64: Make return_address() use arch_stack_walk()
      arm64: Make __get_wchan() use arch_stack_walk()
      arm64: Make perf_callchain_kernel() use arch_stack_walk()
      arm64: Mark __switch_to() as __sched
      arm64: Add comment for stack_info::kr_cur
      arch: Make ARCH_STACKWALK independent of STACKTRACE
    
    * for-next/xor-neon:
      : Use SHA3 instructions to speed up XOR
      arm64/xor: use EOR3 instructions when available
    
    * for-next/kasan:
      : Log potential KASAN shadow aliases
      arm64: mm: log potential KASAN shadow alias
      arm64: mm: use die_kernel_fault() in do_mem_abort()
    
    * for-next/armv8_7-fp:
      : Add HWCAPS for ARMv8.7 FEAT_AFP and FEAT_RPRES
      arm64: cpufeature: add HWCAP for FEAT_RPRES
      arm64: add ID_AA64ISAR2_EL1 sys register
      arm64: cpufeature: add HWCAP for FEAT_AFP
    
    * for-next/atomics:
      : arm64 atomics clean-ups and codegen improvements
      arm64: atomics: lse: define RETURN ops in terms of FETCH ops
      arm64: atomics: lse: improve constraints for simple ops
      arm64: atomics: lse: define ANDs in terms of ANDNOTs
      arm64: atomics lse: define SUBs in terms of ADDs
      arm64: atomics: format whitespace consistently
    
    * for-next/bti:
      : BTI clean-ups
      arm64: Ensure that the 'bti' macro is defined where linkage.h is included
      arm64: Use BTI C directly and unconditionally
      arm64: Unconditionally override SYM_FUNC macros
      arm64: Add macro version of the BTI instruction
      arm64: ftrace: add missing BTIs
      arm64: kexec: use __pa_symbol(empty_zero_page)
      arm64: update PAC description for kernel
    
    * for-next/sve:
      : SVE code clean-ups and refactoring in preparation for the Scalable Matrix Extension
      arm64/sve: Minor clarification of ABI documentation
      arm64/sve: Generalise vector length configuration prctl() for SME
      arm64/sve: Make sysctl interface for SVE reusable by SME
    
    * for-next/kselftest:
      : arm64 kselftest additions
      kselftest/arm64: Add pidbench for floating point syscall cases
      kselftest/arm64: Add a test program to exercise the syscall ABI
      kselftest/arm64: Allow signal tests to trigger from a function
      kselftest/arm64: Parameterise ptrace vector length information
    
    * for-next/kcsan:
      : Enable KCSAN for arm64
      arm64: Enable KCSAN
    ctmarinas committed Jan 5, 2022
  2. arm64: Use correct method to calculate nomap region boundaries

    Nomap regions are treated as "reserved". When region boundaries are not
    page aligned, we usually grow the "reserved" regions rather than shrink
    them. So we should use memblock_region_reserved_base_pfn()/
    memblock_region_reserved_end_pfn() instead of
    memblock_region_memory_base_pfn()/memblock_region_memory_end_pfn() to
    calculate the boundaries.
    
    Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    Link: https://lore.kernel.org/r/20211022070646.41923-1-chenhuacai@loongson.cn
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    chenhuacai authored and ctmarinas committed Jan 5, 2022
  3. arm64: Drop outdated links in comments

    As started by commit 05a5f51 ("Documentation: Replace lkml.org links
    with lore"), an effort was made to replace lkml.org links with lore to
    better use a single source that's more likely to stay available long-term.
    However, it seems these links don't offer much value here, so just
    remove them entirely.
    
    Cc: Joe Perches <joe@perches.com>
    Suggested-by: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/lkml/20210211100213.GA29813@willie-the-truck/
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Link: https://lore.kernel.org/r/20211215191835.1420010-1-keescook@chromium.org
    [catalin.marinas@arm.com: removed the arch/arm changes]
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    kees authored and ctmarinas committed Jan 5, 2022

Commits on Jan 4, 2022

  1. arm64: perf: Don't register user access sysctl handler multiple times

    Commit e201260 ("arm64: perf: Add userspace counter access disable
    switch") introduced a new 'perf_user_access' sysctl file to enable and
    disable direct userspace access to the PMU counters. Sadly, Geert
    reports that on his big.LITTLE SoC ('Renesas Salvator-XS w/ R-Car H3'),
    the file is created for each PMU type probed, resulting in a splat
    during boot:
    
      | hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
      | sysctl duplicate entry: /kernel//perf_user_access
      | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc3-arm64-renesas-00003-ge2012600810c #1420
      | Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
      | Call trace:
      |  dump_backtrace+0x0/0x190
      |  show_stack+0x14/0x20
      |  dump_stack_lvl+0x88/0xb0
      |  dump_stack+0x14/0x2c
      |  __register_sysctl_table+0x384/0x818
      |  register_sysctl+0x20/0x28
      |  armv8_pmu_init.constprop.0+0x118/0x150
      |  armv8_a57_pmu_init+0x1c/0x28
      |  arm_pmu_device_probe+0x1b4/0x558
      |  armv8_pmu_device_probe+0x18/0x20
      |  platform_probe+0x64/0xd0
      |  hw perfevents: enabled with armv8_cortex_a57 PMU driver, 7 counters available
    
    Introduce a state variable to track creation of the sysctl file and
    ensure that it is only created once.
    
    Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Fixes: e201260 ("arm64: perf: Add userspace counter access disable switch")
    Link: https://lore.kernel.org/r/CAMuHMdVcDxR9sGzc5pcnORiotonERBgc6dsXZXMd6wTvLGA9iw@mail.gmail.com
    Signed-off-by: Will Deacon <will@kernel.org>
    willdeacon committed Jan 4, 2022
  2. drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check

    The devm_ioremap() function does not return error pointers.  It returns
    NULL.
    
    Fixes: 036a758 ("drivers: perf: Add LLC-TAD perf counter support")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Link: https://lore.kernel.org/r/20211217145907.GA16611@kili
    Signed-off-by: Will Deacon <will@kernel.org>
    error27 authored and willdeacon committed Jan 4, 2022
  3. perf/smmuv3: Fix unused variable warning when CONFIG_OF=n

    The kbuild robot reports that building the SMMUv3 PMU driver with
    CONFIG_OF=n results in a warning for W=1 builds:
    
    >> drivers/perf/arm_smmuv3_pmu.c:889:34: warning: unused variable 'smmu_pmu_of_match' [-Wunused-const-variable]
       static const struct of_device_id smmu_pmu_of_match[] = {
                                        ^
    
    Guard the match table with #ifdef CONFIG_OF.
    
    Link: https://lore.kernel.org/r/202201041700.01KZEzhb-lkp@intel.com
    Fixes: 3f7be43 ("perf/smmuv3: Add devicetree support")
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Will Deacon <will@kernel.org>
    willdeacon committed Jan 4, 2022

Commits on Dec 22, 2021

  1. arm64: errata: Fix exec handling in erratum 1418040 workaround

    The erratum 1418040 workaround enables CNTVCT_EL1 access trapping in EL0
    when executing compat threads. The workaround is applied when switching
    between tasks, but the need for the workaround could also change at an
    exec(), when a non-compat task execs a compat binary or vice versa. Apply
    the workaround in arch_setup_new_exec().
    
    This leaves a small window of time between SET_PERSONALITY and
    arch_setup_new_exec where preemption could occur and confuse the old
    workaround logic that compares TIF_32BIT between prev and next. Instead, we
    can just read cntkctl to make sure it's in the state that the next task
    needs. I measured cntkctl read time to be about the same as a mov from a
    general-purpose register on N1. Update the workaround logic to examine the
    current value of cntkctl instead of the previous task's compat state.
    
    Fixes: d49f7d7 ("arm64: Move handling of erratum 1418040 into C code")
    Cc: <stable@vger.kernel.org> # 5.9.x
    Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
    Reviewed-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    D Scott Phillips authored and ctmarinas committed Dec 22, 2021
  2. arm64: Unhash early pointer print plus improve comment

    When facing a really early issue in DT parsing, we currently have a
    message that shows both the physical and virtual address of the FDT. The
    printk pointer modifier for the virtual address shows a hashed address
    there unless the user provides the "no_hash_pointers" parameter on the
    command line. The situation in which this message shows up is rather
    serious though: the boot process is broken and nothing can be done (even
    an oops is too much at this early stage), so we have this message as a
    last resort to help debug bootloader issues, for example. Hence, change
    that to "%px" in order to make debugging easy; there is not much
    information-leak risk in such an early boot failure.
    
    Also, improve the commenting on that function a bit, given that if the
    kernel fails there, it just hangs forever in a cpu_relax() loop. The
    reason we cannot BUG/panic is that it is too early to do so; thanks to
    Mark Brown for pointing that out on IRC and thanks to Robin Murphy for
    the good pointer hash discussion on the mailing list.
    
    Cc: Mark Brown <broonie@kernel.org>
    Cc: Robin Murphy <robin.murphy@arm.com>
    Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
    Reviewed-by: Robin Murphy <robin.murphy@arm.com>
    Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Link: https://lore.kernel.org/r/20211221155230.1532850-1-gpiccoli@igalia.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Guilherme G. Piccoli authored and ctmarinas committed Dec 22, 2021
  3. asm-generic: introduce io_stop_wc() and add implementation for ARM64

    For memory accesses with write-combining attributes (e.g. those returned
    by ioremap_wc()), the CPU may wait for prior accesses to be merged with
    subsequent ones. But in some situations, such waiting hurts performance.
    
    We introduce io_stop_wc() to prevent the merging of write-combining
    memory accesses before this macro with those after it.
    
    We add an implementation for ARM64 using the DGH instruction and provide
    a NOP implementation for other architectures.
    
    Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
    Suggested-by: Will Deacon <will@kernel.org>
    Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
    Acked-by: Arnd Bergmann <arnd@arndb.de>
    Link: https://lore.kernel.org/r/20211221035556.60346-1-wangxiongfeng2@huawei.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    fenghusthu authored and ctmarinas committed Dec 22, 2021

Commits on Dec 17, 2021

  1. arm64: Ensure that the 'bti' macro is defined where linkage.h is included

    Not all .S files include asm/assembler.h; however, the SYM_FUNC_*
    definitions invoke the 'bti' macro. Include asm/assembler.h in
    asm/linkage.h.
    
    Fixes: 9be34be ("arm64: Add macro version of the BTI instruction")
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    ctmarinas committed Dec 17, 2021

Commits on Dec 15, 2021

  1. arm64: remove __dma_*_area() aliases

    The __dma_inv_area() and __dma_clean_area() aliases make cache.S harder
    to navigate, but don't gain us anything in practice.
    
    For clarity, let's remove them along with their redundant comments. The
    only users are __dma_map_area() and __dma_unmap_area(), which need to be
    position independent, and can call __pi_dcache_inval_poc() and
    __pi_dcache_clean_poc() directly.
    
    There should be no functional change as a result of this patch.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: Ard Biesheuvel <ardb@kernel.org>
    Cc: Fuad Tabba <tabba@google.com>
    Cc: Marc Zyngier <maz@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Acked-by: Mark Brown <broonie@kernel.org>
    Acked-by: Ard Biesheuvel <ardb@kernel.org>
    Link: https://lore.kernel.org/r/20211206124715.4101571-4-mark.rutland@arm.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Mark Rutland authored and ctmarinas committed Dec 15, 2021

Commits on Dec 14, 2021

  1. docs/arm64: delete a space from tagged-address-abi

    Commit e71e2ac ("userfaultfd: do not untag user pointers") introduced a
    warning:
    
    linux/Documentation/arm64/tagged-address-abi.rst:52: WARNING: Unexpected indentation.
    
    Let's fix it.
    
    Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
    Link: https://lore.kernel.org/r/20211209091922.560979-1-siyanteng@loongson.cn
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    sterling-teng authored and ctmarinas committed Dec 14, 2021
  2. arm64: Enable KCSAN

    This patch enables KCSAN for arm64, with updates to the build rules so
    that KCSAN is not used for several incompatible compilation units.
    
    Recent GCC versions (at least GCC 10) made outline-atomics the default
    option (unlike Clang), which causes linker errors for
    kernel/kcsan/core.o. Disable the out-of-line atomics with
    -mno-outline-atomics to fix the linker errors.
    
    Meanwhile, as Mark said[1], some latent issues need to be fixed which
    aren't just KCSAN problems, so we make KCSAN depend on EXPERT for now.
    
    Tested the selftests and kcsan_test (built with GCC 11 and Clang 13),
    and all passed.
    
    [1] https://lkml.kernel.org/r/YadiUPpJ0gADbiHQ@FVFF77S0Q05N
    
    Acked-by: Marco Elver <elver@google.com> # kernel/kcsan
    Tested-by: Joey Gouly <joey.gouly@arm.com>
    Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
    Link: https://lore.kernel.org/r/20211211131734.126874-1-wangkefeng.wang@huawei.com
    [catalin.marinas@arm.com: added comment to justify EXPERT]
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Kefeng Wang authored and ctmarinas committed Dec 14, 2021
  3. kselftest/arm64: Add pidbench for floating point syscall cases

    Since it's likely to be useful for performance work with SVE, let's have
    a pidbench that gives us some numbers for consideration. To ensure that
    we test exactly the scenario we want, this is written in assembly - if
    system libraries used SVE, that would stop us exercising the case where
    the process has never used SVE.
    
    We exercise three cases:
    
     - Never having used SVE.
     - Having used SVE once.
     - Using SVE after each syscall.
    
    by spinning on getpid() for a fixed number of iterations, with the time
    measured using CNTVCT_EL0 and reported on the console. This is obviously
    a totally unrealistic benchmark which will show the extremes of any
    performance variation, but equally, given the potential gotchas with the
    use of FP instructions by system libraries, it's good to have some
    concrete code shared to make it easier to compare notes on results.
    
    Testing over multiple SVE vector lengths currently needs to be done with
    vlset; the test could be extended to iterate over all of them if desired.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211202165107.1075259-1-broonie@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    broonie authored and ctmarinas committed Dec 14, 2021
  4. arm64/fp: Add comments documenting the usage of state restore functions

    Add comments to help people figure out when fpsimd_bind_state_to_cpu() and
    fpsimd_update_current_state() are used.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211207163250.1373542-1-broonie@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    broonie authored and ctmarinas committed Dec 14, 2021
  5. kselftest/arm64: Add a test program to exercise the syscall ABI

    Currently we don't have any coverage of the syscall ABI, so let's add a
    very dumb test program which sets up register patterns, does a syscall and
    then checks that the register state after the syscall matches what we
    expect. The program is written in an extremely simplistic fashion with the
    goal of making it easy to verify that it's doing what it thinks it's
    doing; it is not a model of how one should write actual code.
    
    Currently we validate the general purpose, FPSIMD and SVE registers. There
    are other things that could be covered, like FPCR and the flags registers;
    these can be covered incrementally - my main focus at the minute is
    covering the ABI for the SVE registers.
    
    The program repeats the tests for all possible SVE vector lengths in case
    some vector length specific optimisation causes issues, as well as testing
    FPSIMD only. It tries two syscalls, getpid() and sched_yield(), in an
    effort to cover both immediate return to userspace and scheduling another
    task though there are no guarantees which cases will be hit.
    
    A new test directory "abi" is added to hold the test; it doesn't seem to
    fit well into any of the existing directories.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211210184133.320748-7-broonie@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    broonie authored and ctmarinas committed Dec 14, 2021
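
    The per-vector-length iteration the test performs can be sketched in C.
    PR_SVE_SET_VL and PR_SVE_VL_LEN_MASK are the real prctl() interface, but
    the loop bounds and reporting are illustrative; on hardware without SVE
    every set attempt simply fails and is skipped:

    ```c
    #include <stdio.h>
    #include <sys/prctl.h>

    #ifndef PR_SVE_SET_VL
    #define PR_SVE_SET_VL		50	/* set task SVE vector length */
    #endif
    #ifndef PR_SVE_VL_LEN_MASK
    #define PR_SVE_VL_LEN_MASK	0xffff
    #endif

    int main(void)
    {
    	int tested = 0;

    	/* SVE vector lengths are multiples of 16 bytes, up to 256. */
    	for (int vl = 16; vl <= 256; vl += 16) {
    		int ret = prctl(PR_SVE_SET_VL, vl);

    		if (ret < 0)
    			continue;	/* VL unsupported, or no SVE at all */

    		printf("testing VL %d\n", ret & PR_SVE_VL_LEN_MASK);
    		tested++;
    		/* ...set register patterns, syscall, verify... */
    	}

    	printf("vector lengths exercised: %d\n", tested);
    	return 0;
    }
    ```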
  6. kselftest/arm64: Allow signal tests to trigger from a function

    Currently we have the facility to specify custom code to trigger a signal,
    but none of the tests use it, and for some reason the framework requires
    us to also specify a signal to send as a trigger in order to make use of a
    custom trigger. This doesn't seem to make much sense; instead, allow the
    use of a custom trigger function without specifying a signal to inject.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211210184133.320748-6-broonie@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    broonie authored and ctmarinas committed Dec 14, 2021
  7. kselftest/arm64: Parameterise ptrace vector length information

    SME introduces a new mode called streaming mode in which the SVE registers
    have a different vector length. Since the ptrace interface for this is
    based on the existing SVE interface, prepare for supporting it by moving
    the regset-specific configuration into a struct and passing that around,
    allowing these tests to be reused for streaming mode. As we will also have
    to verify the interoperation of the SVE and streaming SVE regsets, don't
    just iterate over an array.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211210184133.320748-5-broonie@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    broonie authored and ctmarinas committed Dec 14, 2021
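
    The refactor described here - hoisting per-regset details into a struct
    that gets passed around - can be illustrated with a hedged C sketch. The
    field names and the second prctl value are assumptions for illustration;
    the real structure lives in the kselftest source:

    ```c
    #include <stdio.h>

    #define NT_ARM_SVE	0x405
    #define NT_ARM_SSVE	0x40b	/* streaming-mode SVE regset */

    /* Hypothetical per-regset configuration, so the same test body can
     * run against both the SVE and streaming-mode SVE ptrace regsets. */
    struct vec_type {
    	const char *name;
    	unsigned int regset;	/* NT_* value passed to ptrace() */
    	unsigned int prctl_set;	/* prctl() used to set the VL */
    };

    static const struct vec_type vec_types[] = {
    	{ .name = "SVE",           .regset = NT_ARM_SVE,  .prctl_set = 50 },
    	{ .name = "Streaming SVE", .regset = NT_ARM_SSVE, .prctl_set = 63 },
    };

    static void run_tests(const struct vec_type *type)
    {
    	/* The test body consults only 'type', never globals, so
    	 * interoperation tests can mix types explicitly rather than
    	 * just iterating over an array. */
    	printf("running %s tests against regset %#x\n",
    	       type->name, type->regset);
    }

    int main(void)
    {
    	run_tests(&vec_types[0]);
    	run_tests(&vec_types[1]);
    	return 0;
    }
    ```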
  8. arm64/sve: Minor clarification of ABI documentation

    As suggested by Luis for the SME version of this, explicitly say that the
    vector length should be extracted from the return value of a set-vector-
    length prctl() with a bitwise AND rather than just any old and.
    
    Suggested-by: Luis Machado <Luis.Machado@arm.com>
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211210184133.320748-4-broonie@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    broonie authored and ctmarinas committed Dec 14, 2021
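
    The point being documented is that prctl(PR_SVE_SET_VL, ...) returns the
    vector length packed together with flag bits, so callers must mask with
    PR_SVE_VL_LEN_MASK. A minimal sketch using a synthetic return value, so it
    runs on any host without SVE hardware:

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* From <linux/prctl.h>. */
    #define PR_SVE_VL_LEN_MASK	0xffff
    #define PR_SVE_VL_INHERIT	(1 << 17)

    int main(void)
    {
    	/* Synthetic prctl() return: the inherit flag plus a 32-byte VL. */
    	int ret = PR_SVE_VL_INHERIT | 32;

    	/* Extract the vector length with a bitwise AND of the mask... */
    	int vl = ret & PR_SVE_VL_LEN_MASK;
    	assert(vl == 32);

    	/* ...because the raw return value is not the VL itself. */
    	assert(ret != 32);

    	printf("vl = %d\n", vl);
    	return 0;
    }
    ```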
  9. arm64/sve: Generalise vector length configuration prctl() for SME

    In preparation for adding SME support, update the bulk of the
    implementation for the vector length configuration prctl() calls to be
    independent of vector type.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211210184133.320748-3-broonie@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    broonie authored and ctmarinas committed Dec 14, 2021
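
    The shape of the refactor - one implementation keyed by a vector type
    rather than SVE-specific code - can be sketched as follows. The enum and
    helper names here are invented for illustration, not the kernel's actual
    identifiers:

    ```c
    #include <stdio.h>

    /* Hypothetical vector types the common code is keyed by. */
    enum vec_type { VEC_SVE, VEC_SME, VEC_MAX };

    /* Per-type state that was previously hardcoded for SVE. */
    static struct {
    	const char *name;
    	int min_vl, max_vl;
    	int current_vl;
    } vec_state[VEC_MAX] = {
    	[VEC_SVE] = { "SVE", 16, 256, 16 },
    	[VEC_SME] = { "SME", 16, 256, 16 },
    };

    /* One prctl() backend shared by both vector types: clamp the
     * requested VL to the supported range and record it. */
    static int vec_set_vector_length(enum vec_type type, int vl)
    {
    	if (vl < vec_state[type].min_vl)
    		vl = vec_state[type].min_vl;
    	if (vl > vec_state[type].max_vl)
    		vl = vec_state[type].max_vl;
    	vec_state[type].current_vl = vl;
    	return vl;
    }

    int main(void)
    {
    	printf("%s VL -> %d\n", vec_state[VEC_SVE].name,
    	       vec_set_vector_length(VEC_SVE, 1024));
    	printf("%s VL -> %d\n", vec_state[VEC_SME].name,
    	       vec_set_vector_length(VEC_SME, 8));
    	return 0;
    }
    ```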
  10. arm64/sve: Make sysctl interface for SVE reusable by SME

    The vector length configuration for SME is very similar to that for SVE,
    so in order to allow reuse, refactor the SVE configuration so that it
    takes the vector type from the struct ctl_table. Since there's no
    dedicated space for this, we repurpose the extra1 field to store the
    vector type; it is otherwise unused for integer sysctls.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211210184133.320748-2-broonie@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    broonie authored and ctmarinas committed Dec 14, 2021
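
    The extra1 trick can be illustrated outside the kernel with a hedged C
    sketch: a table entry carries a pointer-sized spare field, and the shared
    handler recovers the vector type from it. The struct and function names
    here are simplified stand-ins for the kernel's ctl_table machinery:

    ```c
    #include <stdio.h>

    enum vec_type { VEC_SVE, VEC_SME };

    /* Simplified stand-in for the kernel's struct ctl_table: extra1
     * is otherwise unused for integer sysctls, so it can carry the
     * vector type. */
    struct ctl_entry {
    	const char *procname;
    	void *extra1;
    };

    /* One handler shared by both sysctls; the vector type comes from
     * extra1 rather than being baked into the function. */
    static void vec_proc_do_default_vl(const struct ctl_entry *table)
    {
    	enum vec_type type = (enum vec_type)(long)table->extra1;

    	printf("%s configures vector type %d\n", table->procname, type);
    }

    int main(void)
    {
    	struct ctl_entry sve = { "sve_default_vector_length",
    				 (void *)(long)VEC_SVE };
    	struct ctl_entry sme = { "sme_default_vector_length",
    				 (void *)(long)VEC_SME };

    	vec_proc_do_default_vl(&sve);
    	vec_proc_do_default_vl(&sme);
    	return 0;
    }
    ```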