Mark-Brown/arm…

Commits on Feb 7, 2022

  1. kselftest/arm64: Add SME support to syscall ABI test

    For every possible combination of SVE and SME vector length verify that for
    each possible value of SVCR after a syscall we leave streaming mode and ZA
    is preserved. We don't need to take account of any streaming/non streaming
    SVE vector length changes in the assembler code since the store instructions
    will handle the vector length for us. We log if the system supports FA64 and
    only try to set FFR in streaming mode if it does.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  2. kselftest/arm64: Add coverage for the ZA ptrace interface

    Add some basic coverage for the ZA ptrace interface, including walking
    through all the vector lengths supported in the system.  Unlike SVE,
    doing syscalls does not discard the ZA state, so when we set data in ZA
    we run the child process briefly, having it add one to each byte in ZA
    in order to validate that both the vector size and data are being read
    and written as expected when the process runs.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  3. kselftest/arm64: Add streaming SVE to SVE ptrace tests

    In order to allow ptrace of streaming mode SVE registers we have added a
    new regset for streaming mode which in isolation offers the same ABI as
    regular SVE with a different vector type. Add this to the array of regsets
    we handle, together with additional tests for the interoperation of the
    two regsets.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  4. kselftest/arm64: signal: Add SME signal handling tests

    Add test cases for the SME signal handling ABI patterned off the SVE tests.
    Due to the small size of the tests and the differences in ABI (especially
    around needing to account for both streaming SVE and ZA) there is some code
    duplication here.
    
    We currently cover:
     - Reporting of the vector length.
     - Lack of support for changing vector length.
     - Presence and size of register state for streaming SVE and ZA.
    
    As with the SVE tests we do not yet have any validation of register
    contents.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  5. kselftest/arm64: Add stress test for SME ZA context switching

    Add a stress test for context switching of the ZA register state based on
    the similar tests Dave Martin wrote for FPSIMD and SVE registers. The test
    loops indefinitely writing a data pattern to ZA then reading it back and
    verifying that it's what was expected.
    
    Unlike the other tests we manually assemble the SME instructions since at
    present no released toolchain has SME support integrated.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
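The write/read-back/verify loop described for the ZA stress test can be sketched in portable C. This is only an illustration of the checking logic: a plain byte buffer stands in for ZA, and the pattern function and the names `za_fill` and `za_check` are hypothetical, not taken from the selftest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical data pattern: mixes a generation counter with the offset. */
static uint8_t za_pattern_byte(uint32_t generation, size_t offset)
{
    return (uint8_t)(generation * 31u + offset);
}

/* Write the pattern for one generation into the stand-in ZA buffer. */
void za_fill(uint8_t *buf, size_t len, uint32_t generation)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = za_pattern_byte(generation, i);
}

/* Return the offset of the first mismatching byte, or -1 if all match. */
long za_check(const uint8_t *buf, size_t len, uint32_t generation)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != za_pattern_byte(generation, i))
            return (long)i;
    return -1;
}
```

A stress loop would call `za_fill()` and `za_check()` repeatedly with an incrementing generation, so stale data left over from another task's context shows up as a mismatch.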
  6. kselftest/arm64: signal: Handle ZA signal context in core code

    As part of the generic code for signal handling test cases we parse all
    signal frames to make sure they have at least the basic form we expect
    and that there are no unexpected frames present in the signal context.
    Add coverage of the ZA signal frame to this code.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
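The frame walk described above can be sketched as follows. The sketch assumes the arm64 convention of records that start with a magic/size header and end with a zero terminator. The magic values are copied from memory of the uapi headers and should be checked against `asm/sigcontext.h`. The function name, the 16-byte size rule and the small set of accepted records are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values; verify against asm/sigcontext.h before relying on them. */
#define FPSIMD_MAGIC 0x46508001u
#define SVE_MAGIC    0x53564501u
#define ZA_MAGIC     0x54366345u

/* Local stand-in for the common record header. */
struct ctx_head {
    uint32_t magic;
    uint32_t size;
};

/*
 * Walk the records in buf. Returns the number of records before the
 * terminator, or -1 if a record is malformed, unknown or duplicated.
 */
int walk_records(const uint8_t *buf, size_t len, int *za_seen)
{
    size_t off = 0;
    int count = 0;

    *za_seen = 0;
    while (len - off >= sizeof(struct ctx_head)) {
        struct ctx_head head;

        memcpy(&head, buf + off, sizeof(head));
        if (head.magic == 0 && head.size == 0)
            return count;               /* terminator record */
        if (head.size < sizeof(head) || head.size % 16 != 0 ||
            head.size > len - off)
            return -1;                  /* malformed size */

        switch (head.magic) {
        case ZA_MAGIC:
            if (*za_seen)
                return -1;              /* at most one ZA record */
            *za_seen = 1;
            break;
        case FPSIMD_MAGIC:
        case SVE_MAGIC:
            break;
        default:
            return -1;                  /* unexpected record */
        }
        off += head.size;
        count++;
    }
    return -1;                          /* ran out of space, no terminator */
}
```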
  7. kselftest/arm64: sme: Provide streaming mode SVE stress test

    One of the features of SME is the addition of streaming mode, in which we
    have access to a set of streaming mode SVE registers at the SME vector
    length. Since these are accessed using the SVE instructions let's reuse
    the existing SVE stress test for testing with a compile time option for
    controlling the few small differences needed:
    
     - Enter streaming mode immediately on starting the program.
     - In streaming mode FFR is removed so skip reading and writing FFR.
    
    In order to avoid requiring a cutting edge toolchain with SME support
    use the op/CR form for specifying SVCR.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  8. kselftest/arm64: Extend vector configuration API tests to cover SME

    Provide RDVL helpers for SME and extend the main vector configuration tests
    to cover SME.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  9. kselftest/arm64: Add tests for TPIDR2

    The Scalable Matrix Extension adds a new system register TPIDR2 intended to
    be used by libc for its own thread specific use, add some kselftests which
    exercise the ABI for it.
    
    Since this test should, with some adjustment, work for TPIDR and any other
    similar registers added in future, add tests for it in a separate
    directory rather than placing it with the other floating point tests.
    Nothing existing looked suitable so I created a new test directory
    called "abi".
    
    Since this feature is intended to be used by libc the test is built as
    freestanding code using nolibc so we don't end up with the test program
    and libc both trying to manage the register simultaneously and
    disrupting each other. Because the test is written using nolibc, rather
    than using hwcaps to identify if SME is available in the system we check
    for the default SME vector length configuration in proc. Adding hwcap
    support to nolibc seems like disproportionate effort and didn't feel
    entirely idiomatic for what nolibc is trying to do.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  10. kselftest/arm64: sme: Add SME support to vlset

    The Scalable Matrix Extension (SME) introduces additional register state
    with configurable vector lengths, similar to SVE but configured separately.
    Extend vlset to support configuring this state with a --sme or -s command
    line option.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
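A minimal sketch of how a `--sme`/`-s` switch could select which vector length prctl() to use. Only the option names come from the commit message. The helper name is hypothetical, the real vlset accepts further options that are not modelled here, and the fallback `#define` values are the ones believed to be in `linux/prctl.h`.

```c
#include <getopt.h>
#include <stddef.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed from linux/prctl.h. */
#ifndef PR_SVE_SET_VL
#define PR_SVE_SET_VL 50
#endif
#ifndef PR_SME_SET_VL
#define PR_SME_SET_VL 63
#endif

/*
 * Return the prctl() option to use for setting the vector length:
 * PR_SME_SET_VL if --sme or -s was given, PR_SVE_SET_VL otherwise,
 * or -1 for an unrecognised option.
 */
int select_set_vl_option(int argc, char *argv[])
{
    static const struct option options[] = {
        { "sme", no_argument, NULL, 's' },
        { NULL, 0, NULL, 0 },
    };
    int option = PR_SVE_SET_VL;
    int c;

    optind = 1; /* restart scanning so the helper can be called again */
    while ((c = getopt_long(argc, argv, "s", options, NULL)) != -1) {
        if (c == 's')
            option = PR_SME_SET_VL;
        else
            return -1;
    }
    return option;
}
```

The caller would then pass the returned option and the requested length to prctl(), leaving the rest of the tool shared between SVE and SME.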
  11. kselftest/arm64: Add manual encodings for SME instructions

    As with the kernel, manually encode some of the SME instructions so that
    building the tests does not impose ambitious toolchain requirements.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  12. arm64/sme: Provide Kconfig for SME

    Now that baseline support for the Scalable Matrix Extension (SME) is present
    introduce the Kconfig option allowing it to be built. While the feature
    registers don't impose a strong requirement for a system with SME to
    support SVE at runtime the support for streaming mode SVE is mostly
    shared with normal SVE so depend on SVE.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  13. KVM: arm64: Handle SME host state when running guests

    While we don't currently support SME in guests we do currently support it
    for the host system so we need to take care of SME's impact, including
    the floating point register state, when running guests. Similarly to SVE
    we need to manage the traps in CPACR_EL1, what is new is the handling of
    streaming mode and ZA.
    
    Normally we defer any handling of the floating point register state until
    the guest first uses it however if the system is in streaming mode FPSIMD
    and SVE operations may generate SME traps which we would need to distinguish
    from actual attempts by the guest to use SME. Rather than do this for the
    time being if we are in streaming mode when entering the guest we force
    the floating point state to be saved immediately and exit streaming mode,
    meaning that the guest won't generate SME traps for supported operations.
    
    We could handle ZA in the access trap similarly to the FPSIMD/SVE state
    without the disruption caused by streaming mode but for simplicity
    handle it the same way as streaming mode for now.
    
    This will be revisited when we support SME for guests (hopefully before SME
    hardware becomes available), for now it will only incur additional cost on
    systems with SME and even there only if streaming mode or ZA are enabled.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  14. KVM: arm64: Trap SME usage in guest

    SME defines two new traps which need to be enabled for guests to ensure
    that they can't use SME, one for the main SME operations which mirrors the
    traps for SVE and another for access to TPIDR2 in SCTLR_EL2.
    
    For VHE manage SMEN along with ZEN in activate_traps() and the FP state
    management callbacks, along with SCTLR_EL2.EnTPIDR2.  There is no
    existing dynamic management of SCTLR_EL2.
    
    For nVHE manage TSM in activate_traps() along with the fine grained
    traps for TPIDR2 and SMPRI.  There is no existing dynamic management of
    fine grained traps.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  15. KVM: arm64: Hide SME system registers from guests

    For the time being we do not support use of SME by KVM guests, support for
    this will be enabled in future. In order to prevent any side effects or
    side channels via the new system registers, including the EL0 read/write
    register TPIDR2, explicitly undefine all the system registers added by
    SME and mask out the SME bitfield in SYS_ID_AA64PFR1.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  16. arm64/sme: Save and restore streaming mode over EFI runtime calls

    When saving and restoring the floating point state over an EFI runtime
    call ensure that we handle streaming mode, only handling FFR if we are not
    in streaming mode and ensuring that we are in normal mode over the call
    into runtime services.
    
    We currently assume that ZA will not be modified by runtime services, the
    specification is not yet finalised so this may need updating if that
    changes.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  17. arm64/sme: Disable streaming mode and ZA when flushing CPU state

    Both streaming mode and ZA may increase power consumption when they are
    enabled and streaming mode makes many FPSIMD and SVE instructions undefined
    which will cause problems for any kernel mode floating point so disable
    both when we flush the CPU state. This covers both kernel_neon_begin() and
    idle and after flushing the state a reload is always required anyway.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  18. arm64/sme: Add ptrace support for ZA

    The ZA array can be read and written with the NT_ARM_ZA regset.  Similarly to
    our interface for the SVE vector registers the regset consists of a
    header with information on the current vector length followed by an
    optional register data payload, represented as for signals as a series
    of horizontal vectors from 0 to VL/8 in the endianness independent
    format used for vectors.
    
    On get if ZA is enabled then register data will be provided, otherwise
    it will be omitted.  On set if register data is provided then ZA is
    enabled and initialized using the provided data, otherwise it is
    disabled.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
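The header-plus-optional-payload layout can be made concrete with a size calculation. This is a sketch under two assumptions: the header is 16 bytes with the same field layout as the SVE one, and the vector length is expressed in bytes, so ZA is a square of `vl` rows of `vl` bytes. The struct and helper names are local stand-ins, not the uapi definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the regset header; field layout is an assumption. */
struct za_header_sketch {
    uint32_t size;      /* total meaningful regset content in bytes */
    uint32_t max_size;  /* maximum possible size for this thread */
    uint16_t vl;        /* current vector length in bytes */
    uint16_t max_vl;    /* maximum possible vector length in bytes */
    uint16_t flags;
    uint16_t reserved;
};

/* Size of the regset: header only when ZA is disabled, header + ZA when on. */
size_t za_regset_size(unsigned int vl, int za_enabled)
{
    size_t size = sizeof(struct za_header_sketch);

    if (za_enabled)
        size += (size_t)vl * vl;    /* vl horizontal vectors of vl bytes */
    return size;
}

/* Byte offset of horizontal vector number `row` within the regset. */
size_t za_row_offset(unsigned int vl, unsigned int row)
{
    return sizeof(struct za_header_sketch) + (size_t)row * vl;
}
```

A tracer could compare the `size` field it reads back against `za_regset_size()` to tell whether the payload is present, which mirrors the get behaviour described above.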
  19. arm64/sme: Implement ptrace support for streaming mode SVE registers

    The streaming mode SVE registers are represented using the same data
    structures as for SVE but since the vector lengths supported and in use
    may not be the same as SVE we represent them with a new type NT_ARM_SSVE.
    Unfortunately we only have a single 16 bit reserved field available in
    the header so there is no space to fit the current and maximum vector
    length for both standard and streaming SVE mode without redefining the
    structure in a way that creates a complicated and fragile ABI. Since FFR
    is not present in streaming mode it is read and written as zero.
    
    Setting NT_ARM_SSVE registers will put the task into streaming mode,
    similarly setting NT_ARM_SVE registers will exit it. Reads that do not
    correspond to the current mode of the task will return the header with
    no register data. For compatibility reasons on write setting no flag for
    the register type will be interpreted as setting SVE registers, though
    users can provide no register data as an alternative mechanism for doing
    so.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  20. arm64/sme: Implement ZA signal handling

    Implement support for ZA in signal handling in a very similar way to how
    we implement support for SVE registers, using a signal context structure
    with optional register state after it. Where present this register state
    stores the ZA matrix as a series of horizontal vectors numbered from 0 to
    VL/8 in the endianness independent format used for vectors.
    
    As with SVE we do not allow changes in the vector length during signal
    return but we do allow ZA to be enabled or disabled.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  21. arm64/sme: Implement streaming SVE signal handling

    When in streaming mode we have the same set of SVE registers as we do in
    regular SVE mode with the exception of FFR and the use of the SME vector
    length. Provide signal handling for these registers by taking one of the
    reserved words in the SVE signal context as a flags field and defining a
    flag which is set for streaming mode. When the flag is set the
    vector length is set to the streaming mode vector length and we save and
    restore streaming mode data. We support entering or leaving streaming mode
    based on the value of the flag but do not support changing the vector
    length, which is also not currently supported by SVE signal handling.
    
    We could instead allocate a separate record in the signal frame for the
    streaming mode SVE context but this inflates the size of the maximal signal
    frame required and adds complication when validating signal frames from
    userspace, especially given the current structure of the code.
    
    Any implementation of support for streaming mode vectors in signals will
    have some potential for causing issues for applications that attempt to
    handle SVE vectors in signals, use streaming mode but do not understand
    streaming mode in their signal handling code, it is hard to identify a
    case that is clearly better than any other - they all have cases where
    they could cause unexpected register corruption or faults.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  22. arm64/sme: Disable ZA and streaming mode when handling signals

    The ABI requires that streaming mode and ZA are disabled when invoking
    signal handlers, do this in setup_return() when we prepare the task state
    for the signal handler.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  23. arm64/sme: Implement traps and syscall handling for SME

    By default all SME operations in userspace will trap.  When this happens
    we allocate storage space for the SME register state, set up the SVE
    registers and disable traps.  We do not need to initialize ZA since the
    architecture guarantees that it will be zeroed when enabled and when we
    trap ZA is disabled.
    
    On syscall we exit streaming mode if we were previously in it and ensure
    that all but the lower 128 bits of the registers are zeroed while
    preserving the state of ZA. This follows the aarch64 PCS for SME, ZA
    state is preserved over a function call and streaming mode is exited.
    Since the traps for SME do not distinguish between streaming mode SVE
    and ZA usage if ZA is in use rather than reenabling traps we instead
    zero the parts of the SVE registers not shared with FPSIMD and leave SME
    enabled, this simplifies handling SME traps. If ZA is not in use then we
    reenable SME traps and fall through to normal handling of SVE.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
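The effect of a syscall on SVCR described above comes down to two bit operations. In this sketch the bit positions (SM in bit 0, ZA in bit 1) follow the architecture, while the macro and function names are hypothetical and the kernel's own code is structured differently.

```c
#include <stdint.h>

#define SVCR_SM (UINT64_C(1) << 0)  /* PSTATE.SM: streaming mode enabled */
#define SVCR_ZA (UINT64_C(1) << 1)  /* PSTATE.ZA: ZA storage enabled */

/* SVCR as seen by userspace after a syscall: streaming mode off, ZA kept. */
uint64_t svcr_after_syscall(uint64_t svcr)
{
    return svcr & ~SVCR_SM;
}

/*
 * Whether SME traps are re-enabled on syscall: only when ZA is not in
 * use, since the trap cannot tell streaming SVE from ZA accesses.
 */
int sme_traps_reenabled(uint64_t svcr)
{
    return !(svcr & SVCR_ZA);
}
```

For example, entering a syscall with both bits set leaves only `SVCR_ZA` set afterwards, and SME stays enabled for the task.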
  24. arm64/sme: Implement ZA context switching

    Allocate space for storing ZA on first access to SME and use that to save
    and restore ZA state when context switching. We do this by using the vector
    form of the LDR and STR ZA instructions, these do not require streaming
    mode and have implementation recommendations that they avoid contention
    issues in shared SMCU implementations.
    
    Since ZA is architecturally guaranteed to be zeroed when enabled we do not
    need to explicitly zero ZA, either we will be restoring from a saved copy
    or trapping on first use of SME so we know that ZA must be disabled.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  25. arm64/sme: Implement streaming SVE context switching

    When in streaming mode we need to save and restore the streaming mode
    SVE register state rather than the regular SVE register state. This uses
    the streaming mode vector length and omits FFR but is otherwise identical,
    if TIF_SVE is enabled when we are in streaming mode then streaming mode
    takes precedence.
    
    This does not handle use of streaming SVE state with KVM, ptrace or
    signals. This will be updated in further patches.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  26. arm64/sme: Implement SVCR context switching

    In SME the use of both streaming SVE mode and ZA are tracked through
    PSTATE.SM and PSTATE.ZA, visible through the system register SVCR.  In
    order to context switch the floating point state for SME we need to
    context switch the contents of this register as part of context
    switching the floating point state.
    
    Since changing the vector length exits streaming SVE mode and disables
    ZA we also make sure we update SVCR appropriately when setting vector
    length, and similarly ensure that new threads have streaming SVE mode
    and ZA disabled.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  27. arm64/sme: Implement support for TPIDR2

    The Scalable Matrix Extension introduces support for a new thread specific
    data register TPIDR2 intended for use by libc. The kernel must save the
    value of TPIDR2 on context switch and should ensure that all new threads
    start off with a default value of 0. Add a field to the thread_struct to
    store TPIDR2 and context switch it with the other thread specific data.
    
    In case there are future extensions which also use TPIDR2 we introduce
    system_supports_tpidr2() and use that rather than system_supports_sme()
    for TPIDR2 handling.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  28. arm64/sme: Implement vector length configuration prctl()s

    As for SVE provide a prctl() interface which allows processes to
    configure their SME vector length.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
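A sketch of querying the SME vector length from userspace, assuming the interface mirrors the SVE one: the call returns the length in bytes in the low 16 bits with flag bits above it. The fallback `#define` values are those believed to be in `linux/prctl.h`, and the helper names are hypothetical.

```c
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed from linux/prctl.h. */
#ifndef PR_SME_GET_VL
#define PR_SME_GET_VL 64
#endif
#ifndef PR_SME_VL_LEN_MASK
#define PR_SME_VL_LEN_MASK 0xffff
#endif
#ifndef PR_SME_VL_INHERIT
#define PR_SME_VL_INHERIT (1 << 17)
#endif

/* Extract the vector length in bytes from a prctl() return value. */
int sme_vl_from_prctl(int ret)
{
    if (ret < 0)
        return -1;  /* SME not supported, or the call failed */
    return ret & PR_SME_VL_LEN_MASK;
}

/* Current SME vector length in bytes, or -1 if unavailable. */
int current_sme_vl(void)
{
    return sme_vl_from_prctl(prctl(PR_SME_GET_VL));
}
```

On a kernel or CPU without SME the prctl() fails, so `current_sme_vl()` returns -1 and callers can skip SME-specific paths.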
  29. arm64/sme: Implement sysctl to set the default vector length

    As for SVE provide a sysctl which allows the default SME vector length to
    be configured.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
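Reading the default back from userspace might look like the sketch below. The path is the one the kernel's SME ABI documentation is believed to use, and the helper name is hypothetical. A missing file is treated as "SME not available", the same check the nolibc-based TPIDR2 test relies on.

```c
#include <stdio.h>

/* Assumed location of the sysctl; confirm against the kernel ABI docs. */
#define SME_DEFAULT_VL_PATH "/proc/sys/abi/sme_default_vector_length"

/*
 * Read a default vector length (in bytes) from the given sysctl file.
 * Returns -1 if the file is missing or does not contain a number.
 */
int read_default_vl(const char *path)
{
    FILE *f = fopen(path, "r");
    int vl;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &vl) != 1)
        vl = -1;
    fclose(f);
    return vl;
}
```

Calling `read_default_vl(SME_DEFAULT_VL_PATH)` then yields either the configured default or -1 on systems without SME support.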
  30. arm64/sme: Identify supported SME vector lengths at boot

    The vector lengths used for SME are controlled through a similar set of
    registers to those for SVE and enumerated using a similar algorithm with
    some slight differences due to the fact that unlike SVE there are no
    restrictions on which combinations of vector lengths can be supported
    nor any mandatory vector lengths which must be implemented.  Add a new
    vector type and implement support for enumerating it.
    
    One slightly awkward feature is that we need to read the current vector
    length using a different instruction (or enter streaming mode which
    would have the same issue and be higher cost).  Rather than add an ops
    structure we add special cases directly in the otherwise generic
    vec_probe_vqs() function, this is a bit inelegant but it's the only
    place where this is an issue.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
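The enumeration can be sketched as a downward probe: request a length, read back what was actually selected, record it, and continue from just below it. The hardware is replaced here by `model_request()`, a hypothetical model that picks the largest supported length not exceeding the request, or the smallest supported length if there is none. All names are illustrative, and lengths are in quadwords (VQ, units of 128 bits).

```c
#include <stdint.h>

typedef unsigned int (*request_vq_fn)(unsigned int requested_vq, void *ctx);

/* Hypothetical hardware model; ctx points to a bitmap, bit (vq - 1) set. */
unsigned int model_request(unsigned int requested_vq, void *ctx)
{
    uint32_t supported = *(const uint32_t *)ctx;
    unsigned int vq;

    for (vq = requested_vq; vq >= 1; vq--)
        if (supported & (1u << (vq - 1)))
            return vq;          /* largest supported length <= request */
    for (vq = requested_vq + 1; vq <= 32; vq++)
        if (supported & (1u << (vq - 1)))
            return vq;          /* nothing smaller: smallest supported */
    return 0;
}

/* Probe from vq_max (at most 32) downwards; returns the supported bitmap. */
uint32_t probe_vqs(request_vq_fn request, void *ctx, unsigned int vq_max)
{
    uint32_t map = 0;
    unsigned int vq = vq_max;

    while (vq >= 1) {
        unsigned int got = request(vq, ctx);

        if (got == 0 || got > vq)
            break;              /* minimum length already identified */
        map |= 1u << (got - 1);
        vq = got - 1;           /* continue below the length we got */
    }
    return map;
}
```

Because SME has no mandatory lengths, the probe cannot assume VQ 1 exists. It stops as soon as the read-back length is larger than the one requested.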
  31. arm64/sme: Basic enumeration support

    This patch introduces basic cpufeature support for discovering the presence
    of the Scalable Matrix Extension.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  32. arm64/sme: Early CPU setup for SME

    SME requires similar setup to that for SVE: disable traps to EL2 and
    make sure that the maximum vector length is available to EL1, for SME we
    have two traps - one for SME itself and one for TPIDR2.
    
    In addition since we currently make no active use of priority control
    for SMCUs we map all SME priorities lower ELs may configure to 0, the
    architecture specified minimum priority, to ensure that nothing we
    manage is able to configure itself to consume excessive resources.  This
    will need to be revisited should there be a need to manage SME
    priorities at runtime.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  33. arm64/sme: Manually encode SME instructions

    As with SVE rather than impose ambitious toolchain requirements for SME
    we manually encode the few instructions which we require in order to
    perform the work the kernel needs to do. The instructions used to save
    and restore context are provided as assembler macros while those for
    entering and leaving streaming mode are done in asm volatile blocks
    since they are expected to be used from C.
    
    We could do the SMSTART and SMSTOP operations with read/modify/write
    cycles on SVCR but using the aliases provided for individual field
    accesses should be slightly faster. These instructions are aliases for
    MSR but since our minimum toolchain requirements are old enough to mean
    that we can't use the sX_X_cX_cX_X form and they always use xzr rather
    than taking a value like write_sysreg_s() wants we just use .inst.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
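The words passed to `.inst` can be derived from the architectural MSR (immediate) encoding that SMSTART and SMSTOP alias: op1 is 3, op2 is 3, and CRm holds a field mask in bits 2:1 plus an enable bit in bit 0. The sketch below computes those words so they can be checked against hand-written constants. The helper names are hypothetical and this is not the kernel's code.

```c
#include <stdint.h>

/* Field selectors carried in CRm<2:1> of the SMSTART/SMSTOP aliases. */
enum {
    SVCR_FIELD_SM = 1,  /* streaming mode */
    SVCR_FIELD_ZA = 2,  /* ZA storage */
};

/* MSR (immediate) with Rt = 31 (xzr): base word plus op1, CRm and op2. */
uint32_t msr_imm_encoding(unsigned int op1, unsigned int crm,
                          unsigned int op2)
{
    return 0xd500401fu | (op1 & 7u) << 16 | (crm & 0xfu) << 8 |
           (op2 & 7u) << 5;
}

/* Encoding of SMSTART (enable != 0) or SMSTOP for the given field mask. */
uint32_t smstart_stop_encoding(unsigned int fields, int enable)
{
    return msr_imm_encoding(3, (fields << 1) | (enable ? 1u : 0u), 3);
}
```

For instance `smstart_stop_encoding(SVCR_FIELD_SM, 1)` gives `0xd503437f`, the word for SMSTART SM, which needs no toolchain support when emitted via `.inst`.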
  34. arm64/sme: System register and exception syndrome definitions

    The arm64 Scalable Matrix Extension (SME) adds some new system registers,
    fields in existing system registers and exception syndromes. This patch
    adds definitions for these for use in future patches implementing support
    for this extension.
    
    Since SME will be the first user of FEAT_HCX in the kernel also include
    the definitions for enumerating it and the HCRX system register it adds.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022
  35. arm64/sme: Provide ABI documentation for SME

    Provide ABI documentation for SME similar to that for SVE. Due to the very
    large overlap around streaming SVE mode in both implementation and
    interfaces documentation for streaming mode SVE is added to the SVE
    document rather than the SME one.
    
    Signed-off-by: Mark Brown <broonie@kernel.org>
    broonie authored and intel-lab-lkp committed Feb 7, 2022