
Commits on Nov 12, 2021

  1. arch: arm64: have memblocks out of kernel text use section map

    By comparing swapper_pg_dir between k54 and previous versions, we find
    that linear mappings whose addresses lie outside the kernel text section
    use the smallest (pte-level) mappings. This appears to arise from
    rodata_full, which sets NO_CONT_MAPPINGS for all memblocks.
    
    Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
    Zhaoyang Huang authored and intel-lab-lkp committed Nov 12, 2021

Commits on Nov 8, 2021

  1. arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions

    gcc warns about undefined behavior in the vmalloc code when building
    with CONFIG_ARM64_PA_BITS_52, because the 'idx++' in the argument to
    __phys_to_pte_val() is evaluated twice:
    
    mm/vmalloc.c: In function 'vmap_pfn_apply':
    mm/vmalloc.c:2800:58: error: operation on 'data->idx' may be undefined [-Werror=sequence-point]
     2800 |         *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
          |                                                 ~~~~~~~~~^~
    arch/arm64/include/asm/pgtable-types.h:25:37: note: in definition of macro '__pte'
       25 | #define __pte(x)        ((pte_t) { (x) } )
          |                                     ^
    arch/arm64/include/asm/pgtable.h:80:15: note: in expansion of macro '__phys_to_pte_val'
       80 |         __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
          |               ^~~~~~~~~~~~~~~~~
    mm/vmalloc.c:2800:30: note: in expansion of macro 'pfn_pte'
     2800 |         *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
          |                              ^~~~~~~
    
    I have no idea why this never showed up earlier, but the safest
    workaround appears to be changing those macros into inline functions
    so the arguments get evaluated only once.
    
    Cc: Matthew Wilcox <willy@infradead.org>
    Fixes: 75387b9 ("arm64: handle 52-bit physical addresses in page table entries")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Link: https://lore.kernel.org/r/20211105075414.2553155-1-arnd@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    arndb authored and willdeacon committed Nov 8, 2021
  2. arm64: Track no early_pgtable_alloc() for kmemleak

    After switching the page size from 64KB to 4KB on several arm64 servers
    here, kmemleak started to run out of its early memory pool due to the
    huge number of early_pgtable_alloc() calls:
    
      kmemleak_alloc_phys()
      memblock_alloc_range_nid()
      memblock_phys_alloc_range()
      early_pgtable_alloc()
      init_pmd()
      alloc_init_pud()
      __create_pgd_mapping()
      __map_memblock()
      paging_init()
      setup_arch()
      start_kernel()
    
    Increasing the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times
    won't be enough for a server with 200GB+ memory. There isn't much
    interest in checking memory leaks for those early page tables, and
    those early memory mappings should not reference other memory. Hence,
    there are no kmemleak false positives, and we can safely skip tracking
    those early allocations in kmemleak, as we did in commit fed84c7
    ("mm/memblock.c: skip kmemleak for kasan_init()"), without needing to
    introduce complications such as automatically scaling the value
    depending on the runtime memory size. After this patch, the default
    value of DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again.
    
    Signed-off-by: Qian Cai <quic_qiancai@quicinc.com>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
    Link: https://lore.kernel.org/r/20211105150509.7826-1-quic_qiancai@quicinc.com
    Signed-off-by: Will Deacon <will@kernel.org>
    qcsde authored and willdeacon committed Nov 8, 2021
  3. arm64: mte: change PR_MTE_TCF_NONE back into an unsigned long

    This constant was previously an unsigned long, but was changed
    into an int in commit 433c38f ("arm64: mte: change ASYNC and
    SYNC TCF settings into bitfields"). This ended up causing spurious
    unsigned-signed comparison warnings in expressions such as:
    
    (x & PR_MTE_TCF_MASK) != PR_MTE_TCF_NONE
    
    Therefore, change it back into an unsigned long to silence these
    warnings.
    
    Link: https://linux-review.googlesource.com/id/I07a72310db30227a5b7d789d0b817d78b657c639
    Signed-off-by: Peter Collingbourne <pcc@google.com>
    Link: https://lore.kernel.org/r/20211105230829.2254790-1-pcc@google.com
    Signed-off-by: Will Deacon <will@kernel.org>
    pcc authored and willdeacon committed Nov 8, 2021
  4. arm64: vdso: remove -nostdlib compiler flag

    The -nostdlib option requests the compiler to not use the standard
    system startup files or libraries when linking. It is effective only
    when $(CC) is used as a linker driver.
    
    Since commit 691efbe ("arm64: vdso: use $(LD) instead of $(CC)
    to link VDSO"), $(LD) is directly used, hence -nostdlib is unneeded.
    
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Link: https://lore.kernel.org/r/20211107161802.323125-1-masahiroy@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    masahir0y authored and willdeacon committed Nov 8, 2021
  5. arm64: arm64_ftr_reg->name may not be a human-readable string

    The id argument of ARM64_FTR_REG_OVERRIDE() is used for two purposes:
    one as the system register encoding (used for the sys_id field of
    __ftr_reg_entry), and the other as the register name (stringified
    and used for the name field of arm64_ftr_reg), which is debug
    information. The id argument is supposed to be a macro that
    indicates an encoding of the register (e.g. SYS_ID_AA64PFR0_EL1, etc.).
    
    ARM64_FTR_REG(), which also has the same id argument,
    uses ARM64_FTR_REG_OVERRIDE() and passes the id to the macro.
    Since the id argument is completely macro-expanded before it is
    substituted into the macro body of ARM64_FTR_REG_OVERRIDE(),
    the stringified id in the body of ARM64_FTR_REG_OVERRIDE() is not
    a human-readable register name, but a string of numeric bitwise
    operations.
    
    Fix this so that human-readable register names are available as
    debug information.
    
    Fixes: 8f266a5 ("arm64: cpufeature: Add global feature override facility")
    Signed-off-by: Reiji Watanabe <reijiw@google.com>
    Reviewed-by: Oliver Upton <oupton@google.com>
    Acked-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20211101045421.2215822-1-reijiw@google.com
    Signed-off-by: Will Deacon <will@kernel.org>
    reijiw-kvm authored and willdeacon committed Nov 8, 2021

Commits on Oct 29, 2021

  1. Merge branch 'for-next/fixes' into for-next/core

    Merge for-next/fixes to resolve conflicts in arm64_hugetlb_cma_reserve().
    
    * for-next/fixes:
      acpi/arm64: fix next_platform_timer() section mismatch error
      arm64/hugetlb: fix CMA gigantic page order for non-4K PAGE_SIZE
    willdeacon committed Oct 29, 2021
  2. Merge branch 'for-next/vdso' into for-next/core

    * for-next/vdso:
      arm64: vdso32: require CROSS_COMPILE_COMPAT for gcc+bfd
      arm64: vdso32: suppress error message for 'make mrproper'
      arm64: vdso32: drop test for -march=armv8-a
      arm64: vdso32: drop the test for dmb ishld
    willdeacon committed Oct 29, 2021
  3. Merge branch 'for-next/trbe-errata' into for-next/core

    * for-next/trbe-errata:
      arm64: errata: Add detection for TRBE write to out-of-range
      arm64: errata: Add workaround for TSB flush failures
      arm64: errata: Add detection for TRBE overwrite in FILL mode
      arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
    willdeacon committed Oct 29, 2021
  4. Merge branch 'for-next/sve' into for-next/core

    * for-next/sve:
      arm64/sve: Fix warnings when SVE is disabled
      arm64/sve: Add stub for sve_max_virtualisable_vl()
      arm64/sve: Track vector lengths for tasks in an array
      arm64/sve: Explicitly load vector length when restoring SVE state
      arm64/sve: Put system wide vector length information into structs
      arm64/sve: Use accessor functions for vector lengths in thread_struct
      arm64/sve: Rename find_supported_vector_length()
      arm64/sve: Make access to FFR optional
      arm64/sve: Make sve_state_size() static
      arm64/sve: Remove sve_load_from_fpsimd_state()
      arm64/fp: Reindent fpsimd_save()
    willdeacon committed Oct 29, 2021
  5. Merge branch 'for-next/scs' into for-next/core

    * for-next/scs:
      scs: Release kasan vmalloc poison in scs_free process
    willdeacon committed Oct 29, 2021
  6. Merge branch 'for-next/pfn-valid' into for-next/core

    * for-next/pfn-valid:
      arm64/mm: drop HAVE_ARCH_PFN_VALID
      dma-mapping: remove bogus test for pfn_valid from dma_map_resource
    willdeacon committed Oct 29, 2021
  7. Merge branch 'for-next/perf' into for-next/core

    * for-next/perf:
      drivers/perf: Improve build test coverage
      drivers/perf: thunderx2_pmu: Change data in size tx2_uncore_event_update()
      drivers/perf: hisi: Fix PA PMU counter offset
    willdeacon committed Oct 29, 2021
  8. Merge branch 'for-next/mte' into for-next/core

    * for-next/mte:
      kasan: Extend KASAN mode kernel parameter
      arm64: mte: Add asymmetric mode support
      arm64: mte: CPU feature detection for Asymm MTE
      arm64: mte: Bitfield definitions for Asymm MTE
      kasan: Remove duplicate of kasan_flag_async
      arm64: kasan: mte: move GCR_EL1 switch to task switch when KASAN disabled
    willdeacon committed Oct 29, 2021
  9. Merge branch 'for-next/mm' into for-next/core

    * for-next/mm:
      arm64: mm: update max_pfn after memory hotplug
      arm64/mm: Add pud_sect_supported()
      arm64: mm: Drop pointless call to set_max_mapnr()
    willdeacon committed Oct 29, 2021
  10. Merge branch 'for-next/misc' into for-next/core

    * for-next/misc:
      arm64: Select POSIX_CPU_TIMERS_TASK_WORK
      arm64: Document boot requirements for FEAT_SME_FA64
      arm64: ftrace: use function_nocfi for _mcount as well
      arm64: asm: setup.h: export common variables
      arm64/traps: Avoid unnecessary kernel/user pointer conversion
    willdeacon committed Oct 29, 2021
  11. Merge branch 'for-next/kselftest' into for-next/core

    * for-next/kselftest:
      selftests: arm64: Factor out utility functions for assembly FP tests
      selftests: arm64: Add coverage of ptrace flags for SVE VL inheritance
      selftests: arm64: Verify that all possible vector lengths are handled
      selftests: arm64: Fix and enable test for setting current VL in vec-syscfg
      selftests: arm64: Remove bogus error check on writing to files
      selftests: arm64: Fix printf() format mismatch in vec-syscfg
      selftests: arm64: Move FPSIMD in SVE ptrace test into a function
      selftests: arm64: More comprehensively test the SVE ptrace interface
      selftests: arm64: Verify interoperation of SVE and FPSIMD register sets
      selftests: arm64: Clarify output when verifying SVE register set
      selftests: arm64: Document what the SVE ptrace test is doing
      selftests: arm64: Remove extraneous register setting code
      selftests: arm64: Don't log child creation as a test in SVE ptrace test
      selftests: arm64: Use a define for the number of SVE ptrace tests to be run
    willdeacon committed Oct 29, 2021
  12. Merge branch 'for-next/kexec' into for-next/core

    * for-next/kexec:
      arm64: trans_pgd: remove trans_pgd_map_page()
      arm64: kexec: remove cpu-reset.h
      arm64: kexec: remove the pre-kexec PoC maintenance
      arm64: kexec: keep MMU enabled during kexec relocation
      arm64: kexec: install a copy of the linear-map
      arm64: kexec: use ld script for relocation function
      arm64: kexec: relocate in EL1 mode
      arm64: kexec: configure EL2 vectors for kexec
      arm64: kexec: pass kimage as the only argument to relocation function
      arm64: kexec: Use dcache ops macros instead of open-coding
      arm64: kexec: skip relocation code for inplace kexec
      arm64: kexec: flush image and lists during kexec load time
      arm64: hibernate: abstract ttbr0 setup function
      arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors
      arm64: kernel: add helper for booted at EL2 and not VHE
    willdeacon committed Oct 29, 2021
  13. Merge branch 'for-next/extable' into for-next/core

    * for-next/extable:
      arm64: vmlinux.lds.S: remove `.fixup` section
      arm64: extable: add load_unaligned_zeropad() handler
      arm64: extable: add a dedicated uaccess handler
      arm64: extable: add `type` and `data` fields
      arm64: extable: use `ex` for `exception_table_entry`
      arm64: extable: make fixup_exception() return bool
      arm64: extable: consolidate definitions
      arm64: gpr-num: support W registers
      arm64: factor out GPR numbering helpers
      arm64: kvm: use kvm_exception_table_entry
      arm64: lib: __arch_copy_to_user(): fold fixups into body
      arm64: lib: __arch_copy_from_user(): fold fixups into body
      arm64: lib: __arch_clear_user(): fold fixups into body
    willdeacon committed Oct 29, 2021
  14. Merge branch 'for-next/8.6-timers' into for-next/core

    * for-next/8.6-timers:
      arm64: Add HWCAP for self-synchronising virtual counter
      arm64: Add handling of CNTVCTSS traps
      arm64: Add CNT{P,V}CTSS_EL0 alternatives to cnt{p,v}ct_el0
      arm64: Add a capability for FEAT_ECV
      clocksource/drivers/arch_arm_timer: Move workaround synchronisation around
      clocksource/drivers/arm_arch_timer: Fix masking for high freq counters
      clocksource/drivers/arm_arch_timer: Drop unnecessary ISB on CVAL programming
      clocksource/drivers/arm_arch_timer: Remove any trace of the TVAL programming interface
      clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations
      clocksource/drivers/arm_arch_timer: Advertise 56bit timer to the core code
      clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL
      clocksource/drivers/arm_arch_timer: Fix MMIO base address vs callback ordering issue
      clocksource/drivers/arm_arch_timer: Drop _tval from erratum function names
      clocksource/drivers/arm_arch_timer: Move system register timer programming over to CVAL
      clocksource/drivers/arm_arch_timer: Extend write side of timer register accessors to u64
      clocksource/drivers/arm_arch_timer: Drop CNT*_TVAL read accessors
      clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
    willdeacon committed Oct 29, 2021

Commits on Oct 28, 2021

  1. arm64: Select POSIX_CPU_TIMERS_TASK_WORK

    With 6caa581 ("KVM: arm64: Use generic KVM xfer to guest work
    function") all arm64 exit paths are properly equipped to handle the
    POSIX timers' task work.
    
    Deferring timer callbacks to thread context not only limits the amount
    of time spent in hard interrupt context, but is also a safer
    implementation[1], and will allow PREEMPT_RT setups to use KVM[2].
    
    So let's enable POSIX_CPU_TIMERS_TASK_WORK on arm64.
    
    [1] https://lore.kernel.org/all/20200716201923.228696399@linutronix.de/
    [2] https://lore.kernel.org/linux-rt-users/87v92bdnlx.ffs@tglx/
    
    Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
    Acked-by: Mark Rutland <mark.rutland@arm.com>
    Acked-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20211018144713.873464-1-nsaenzju@redhat.com
    Signed-off-by: Will Deacon <will@kernel.org>
    vianpl authored and willdeacon committed Oct 28, 2021
  2. arm64: Document boot requirements for FEAT_SME_FA64

    The EAC1 release of the SME specification adds the FA64 feature which
    requires enablement at higher ELs before lower ELs can use it. Document
    what we require from higher ELs in our boot requirements.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211026111802.12853-1-broonie@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    broonie authored and willdeacon committed Oct 28, 2021

Commits on Oct 26, 2021

  1. arm64/sve: Fix warnings when SVE is disabled

    In configurations where SVE is disabled we define but never reference
    the functions for retrieving the default vector length, causing
    warnings. Fix this by moving the ifdef up and marking get_default_vl()
    inline, since it is referenced from code guarded by an IS_ENABLED()
    check; do the same for the other accessors for consistency.
    
    Reported-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211022141635.2360415-3-broonie@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    broonie authored and willdeacon committed Oct 26, 2021
  2. arm64/sve: Add stub for sve_max_virtualisable_vl()

    Fixes build problems for configurations with KVM enabled but SVE disabled.
    
    Reported-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211022141635.2360415-2-broonie@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    broonie authored and willdeacon committed Oct 26, 2021

Commits on Oct 21, 2021

  1. arm64: errata: Add detection for TRBE write to out-of-range

    Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where
    the TRBE, under some circumstances, might write up to 64 bytes to an
    address after the Limit programmed by TRBLIMITR_EL1.LIMIT.
    This might:
      - Corrupt a page in the ring buffer, which may corrupt trace from a
        previous session, consumed by userspace.
      - Hit the guard page at the end of the vmalloc area and raise a fault.
    
    To keep the handling simple, we always leave out the last page of the
    range that the TRBE is allowed to write. This is achieved by ensuring
    that we always have more than a PAGE's worth of space in the range
    while calculating the LIMIT for the TRBE; the LIMIT pointer can then
    be adjusted to leave that page out of the TRBE range while enabling it
    (TRBLIMITR.LIMIT -= PAGE_SIZE). This makes sure that the TRBE will
    only write to an area within its allowed limit (i.e., [head, head +
    size]) and we do not have to handle address faults within the driver.
    
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
    Cc: Mike Leach <mike.leach@linaro.org>
    Cc: Leo Yan <leo.yan@linaro.org>
    Cc: Will Deacon <will@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
    Link: https://lore.kernel.org/r/20211019163153.3692640-5-suzuki.poulose@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Suzuki K Poulose authored and willdeacon committed Oct 21, 2021
  2. arm64: errata: Add workaround for TSB flush failures

    Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffer from
    errata whereby a TSB (trace synchronization barrier) fails to flush
    the trace data completely when executed from a trace-prohibited
    region. In Linux we always execute it after we have moved the PE to a
    trace-prohibited region, so we can apply the workaround every time a
    TSB is executed.
    
    The workaround is to issue two TSBs consecutively.
    
    NOTE: This erratum is defined as a LOCAL_CPU_ERRATUM, implying that a
    late CPU could be blocked from booting if it is the first CPU that
    requires the workaround. This is because we do not allow setting
    cpu_hwcaps after the SMP boot. The other alternative is to use
    "this_cpu_has_cap()" instead of the faster system-wide check, which
    may be a bit of an overhead, given that we may have to do this in the
    nVHE KVM host before a guest entry.
    
    Cc: Will Deacon <will@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
    Cc: Mike Leach <mike.leach@linaro.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Marc Zyngier <maz@kernel.org>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
    Link: https://lore.kernel.org/r/20211019163153.3692640-4-suzuki.poulose@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Suzuki K Poulose authored and willdeacon committed Oct 21, 2021
  3. arm64: errata: Add detection for TRBE overwrite in FILL mode

    Arm Neoverse-N2 and Cortex-A710 cores are affected by a CPU erratum
    where the TRBE will overwrite the trace buffer in FILL mode. The TRBE
    doesn't stop (as expected in FILL mode) when it reaches the limit;
    instead it wraps to the base and continues writing up to 3 cache
    lines, overwriting any trace that was written previously.
    
    Add the Neoverse-N2 erratum (#2139208) and the Cortex-A710 erratum
    (#2119858) to the detection logic.
    
    This will be used by the TRBE driver in later patches to work around
    the issue. The detection has been kept in the core arm64 errata
    framework list to make sure:
      - We don't duplicate the framework in the TRBE driver
      - The errata detection is advertised like the rest of the CPU
        errata.
    
    Note that the Kconfig entries are not fully active until the TRBE
    driver implements the workaround.
    
    Cc: Will Deacon <will@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
    Cc: Mike Leach <mike.leach@linaro.org>
    cc: Leo Yan <leo.yan@linaro.org>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
    Link: https://lore.kernel.org/r/20211019163153.3692640-3-suzuki.poulose@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Suzuki K Poulose authored and willdeacon committed Oct 21, 2021
  4. arm64: Add Neoverse-N2, Cortex-A710 CPU part definition

    Add the CPU Partnumbers for the new Arm designs.
    
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
    Link: https://lore.kernel.org/r/20211019163153.3692640-2-suzuki.poulose@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Suzuki K Poulose authored and willdeacon committed Oct 21, 2021
  5. selftests: arm64: Factor out utility functions for assembly FP tests

    The various floating point test programs written in assembly have a bunch
    of helper functions and macros which are cut'n'pasted between them. Factor
    them out into a separate source file which is linked into all of them.
    
    We don't include memcmp() since it isn't as generic as it should be and
    directly branches to report an error in the programs.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20211019181851.3341232-1-broonie@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    broonie authored and willdeacon committed Oct 21, 2021
  6. arm64: vmlinux.lds.S: remove .fixup section

    We no longer place anything into a `.fixup` section, so we no longer
    need to place those sections into the `.text` section in the main kernel
    Image.
    
    Remove the use of `.fixup`.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: James Morse <james.morse@arm.com>
    Cc: Robin Murphy <robin.murphy@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20211019160219.5202-14-mark.rutland@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Mark Rutland authored and willdeacon committed Oct 21, 2021
  7. arm64: extable: add load_unaligned_zeropad() handler

    For inline assembly, we place exception fixups out-of-line in the
    `.fixup` section such that these are out of the way of the fast path.
    This has a few drawbacks:
    
    * Since the fixup code is anonymous, backtraces will symbolize fixups as
      offsets from the nearest prior symbol, currently
      `__entry_tramp_text_end`. This is confusing, and painful to debug
      without access to the relevant vmlinux.
    
    * Since the exception handler adjusts the PC to execute the fixup, and
      the fixup uses a direct branch back into the function it fixes,
      backtraces of fixups miss the original function. This is confusing,
      and violates requirements for RELIABLE_STACKTRACE (and therefore
      LIVEPATCH).
    
    * Inline assembly and associated fixups are generated from templates,
      and we have many copies of logically identical fixups which only
      differ in which specific registers are written to and which address is
      branched to at the end of the fixup. This is potentially wasteful of
      I-cache resources, and makes it hard to add additional logic to fixups
      without significant bloat.
    
    * In the case of load_unaligned_zeropad(), the logic in the fixup
      requires a temporary register that we must allocate even in the
      fast-path where it will not be used.
    
    This patch addresses all four concerns for load_unaligned_zeropad()
    fixups by adding a dedicated exception handler which performs the
    fixup logic in exception context and then returns to just after the
    faulting instruction. For the moment, the fixup logic is identical to
    the old assembly fixup logic, but in future we could enhance this by
    taking the ESR and FAR into account to constrain the faults we try to
    fix up, or to specialize fixups for MTE tag check faults.
    
    Other than backtracing, there should be no functional change as a result
    of this patch.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: James Morse <james.morse@arm.com>
    Cc: Robin Murphy <robin.murphy@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20211019160219.5202-13-mark.rutland@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Mark Rutland authored and willdeacon committed Oct 21, 2021
  8. arm64: extable: add a dedicated uaccess handler

    For inline assembly, we place exception fixups out-of-line in the
    `.fixup` section such that these are out of the way of the fast path.
    This has a few drawbacks:
    
    * Since the fixup code is anonymous, backtraces will symbolize fixups as
      offsets from the nearest prior symbol, currently
      `__entry_tramp_text_end`. This is confusing, and painful to debug
      without access to the relevant vmlinux.
    
    * Since the exception handler adjusts the PC to execute the fixup, and
      the fixup uses a direct branch back into the function it fixes,
      backtraces of fixups miss the original function. This is confusing,
      and violates requirements for RELIABLE_STACKTRACE (and therefore
      LIVEPATCH).
    
    * Inline assembly and associated fixups are generated from templates,
      and we have many copies of logically identical fixups which only
      differ in which specific registers are written to and which address is
      branched to at the end of the fixup. This is potentially wasteful of
      I-cache resources, and makes it hard to add additional logic to fixups
      without significant bloat.
    
    This patch addresses all three concerns for inline uaccess fixups by
    adding a dedicated exception handler which updates registers in
    exception context and then returns back into the function which
    faulted, removing the need for fixups specialized to each faulting
    instruction.
    
    Other than backtracing, there should be no functional change as a result
    of this patch.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: James Morse <james.morse@arm.com>
    Cc: Robin Murphy <robin.murphy@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20211019160219.5202-12-mark.rutland@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Mark Rutland authored and willdeacon committed Oct 21, 2021
  9. arm64: extable: add type and data fields

    Subsequent patches will add specialized handlers for fixups, in addition
    to the simple PC fixup and BPF handlers we have today. In preparation,
    this patch adds a new `type` field to struct exception_table_entry, and
    uses this to distinguish the fixup and BPF cases. A `data` field is also
    added so that subsequent patches can associate data specific to each
    exception site (e.g. register numbers).
    
    Handlers are named ex_handler_*() for consistency, following the
    example of x86. At the same time, get_ex_fixup() is split out into a
    helper so that it can be used by other ex_handler_*() functions in
    subsequent patches.
    
    This patch will increase the size of the exception tables, which will be
    remedied by subsequent patches removing redundant fixup code. There
    should be no functional change as a result of this patch.
    
    Since each entry is now 12 bytes in size, we must reduce the alignment
    of each entry from `.align 3` (i.e. 8 bytes) to `.align 2` (i.e. 4
    bytes), which is the natural alignment of the `insn` and `fixup`
    fields. The current 8-byte alignment is a holdover from when the
    `insn` and `fixup` fields were 8 bytes each, and while not harmful it
    has not been necessary since commit:
    
      6c94f27 ("arm64: switch to relative exception tables")
    
    Similarly, RO_EXCEPTION_TABLE_ALIGN is dropped to 4 bytes.
    
    Concurrently with this patch, x86's exception table entry format is
    being updated (similarly to a 12-byte format, with 32 bits of absolute
    data). Once both have been merged it should be possible to unify the
    sorttable logic for the two.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andrii Nakryiko <andrii@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morse <james.morse@arm.com>
    Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
    Cc: Robin Murphy <robin.murphy@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20211019160219.5202-11-mark.rutland@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Mark Rutland authored and willdeacon committed Oct 21, 2021
  10. arm64: extable: use ex for exception_table_entry

    Subsequent patches will extend `struct exception_table_entry` with more
    fields, and the distinction between the entry and its `fixup` field will
    become more important.
    
    For clarity, let's consistently use `ex` to refer to an entire entry.
    In subsequent patches we'll use `fixup` to refer to the fixup field
    specifically. This matches the naming convention used today in
    arch/arm64/net/bpf_jit_comp.c.
    
    There should be no functional change as a result of this patch.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Acked-by: Robin Murphy <robin.murphy@arm.com>
    Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: James Morse <james.morse@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20211019160219.5202-10-mark.rutland@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Mark Rutland authored and willdeacon committed Oct 21, 2021
  11. arm64: extable: make fixup_exception() return bool

    The return values of fixup_exception() and arm64_bpf_fixup_exception()
    represent a boolean condition rather than an error code, so for clarity
    it would be better to return `bool` rather than `int`.
    
    This patch adjusts the code accordingly. While we're modifying the
    prototype, we also remove the unnecessary `extern` keyword, so that this
    won't look out of place when we make subsequent additions to the header.
    
    There should be no functional change as a result of this patch.
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andrii Nakryiko <andrii@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: James Morse <james.morse@arm.com>
    Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
    Cc: Robin Murphy <robin.murphy@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20211019160219.5202-9-mark.rutland@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Mark Rutland authored and willdeacon committed Oct 21, 2021