riscv: Support vendor extensions and xtheadvector #865

Closed
wants to merge 26 commits into from

Commits on Apr 28, 2024

  1. cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()

    When the cpus in the same cluster are all in the idle state, the kernel
    might put the cluster into a deeper low power state. Call the
    cluster_pm_enter() before entering the low power state and call the
    cluster_pm_exit() after the cluster has woken up.
    
    Signed-off-by: Nick Hu <nick.hu@sifive.com>
    Link: https://lore.kernel.org/r/20240226065113.1690534-1-nick.hu@sifive.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    nick650823 authored and palmer-dabbelt committed Apr 28, 2024
    3fd665f
  2. Merge patch series "riscv: fix patching with IPI"

    Alexandre Ghiti <alexghiti@rivosinc.com> says:
    
    patch 1 removes a useless memory barrier and patch 2 actually fixes the
    issue with IPI in the patching code.
    
    * b4-shazam-merge:
      riscv: Fix text patching when IPI are used
      riscv: Remove superfluous smp_mb()
    
    Link: https://lore.kernel.org/r/20240229121056.203419-1-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    palmer-dabbelt committed Apr 28, 2024
    30dae23
  3. Merge patch series "riscv: Create and document PR_RISCV_SET_ICACHE_FLUSH_CTX prctl"
    
    Charlie Jenkins <charlie@rivosinc.com> says:
    
    Improve the performance of icache flushing by creating a new prctl flag
    PR_RISCV_SET_ICACHE_FLUSH_CTX. The interface is left generic to allow
    for future expansions such as with the proposed J extension [1].
    
    Documentation is also provided to explain the use case.
    
    Patch sent to add PR_RISCV_SET_ICACHE_FLUSH_CTX to man-pages [2].
    
    [1] https://github.com/riscv/riscv-j-extension
    [2] https://lore.kernel.org/linux-man/20240124-fencei_prctl-v1-1-0bddafcef331@rivosinc.com
    
    * b4-shazam-merge:
      cpumask: Add assign cpu
      documentation: Document PR_RISCV_SET_ICACHE_FLUSH_CTX prctl
      riscv: Include riscv_set_icache_flush_ctx prctl
      riscv: Remove unnecessary irqflags processor.h include
    
    Link: https://lore.kernel.org/r/20240312-fencei-v13-0-4b6bdc2bbf32@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    palmer-dabbelt committed Apr 28, 2024
    a4c0451
  4. riscv: Remove redundant CONFIG_64BIT from pgtable_l{4,5}_enabled

    IS_ENABLED(CONFIG_64BIT) in initialization of pgtable_l{4,5}_enabled is
    redundant, remove it.
    
    Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240320064712.442579-2-dawei.li@shingroup.cn
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Dawei Li authored and palmer-dabbelt committed Apr 28, 2024
    3a0dc44
  5. riscv: Annotate pgtable_l{4,5}_enabled with __ro_after_init

    pgtable_l{4,5}_enabled are read only after initialization, make explicit
    annotation of __ro_after_init on them.
    
    Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240320064712.442579-3-dawei.li@shingroup.cn
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Dawei Li authored and palmer-dabbelt committed Apr 28, 2024
    7a04dd8
  6. riscv: mm: still create swiotlb buffer for kmalloc() bouncing if required
    
    After commit f51f7a0 ("riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC
    for !dma_coherent"), for non-coherent platforms with less than 4GB
    memory, we rely on users to pass "swiotlb=mmnn,force" kernel parameters
    to enable DMA bouncing for unaligned kmalloc() buffers. Now let's go
    further: if no bouncing is needed for ZONE_DMA, let the kernel
    automatically allocate a 1MB swiotlb buffer per 1GB of RAM for kmalloc()
    bouncing on non-coherent platforms, so that there is no need to pass
    "swiotlb=mmnn,force" any more.
    
    The math of "1MB swiotlb buffer per 1GB of RAM for kmalloc() bouncing"
    is taken from arm64. Users can still force a smaller swiotlb buffer by
    passing "swiotlb=mmnn".
    
    Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240325110036.1564-1-jszhang@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    xhackerustc authored and palmer-dabbelt committed Apr 28, 2024
    fc7a50e
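
Note on the sizing rule in the swiotlb commit above: a minimal sketch of the "1MB per 1GB of RAM" arithmetic. The function name `kmalloc_bounce_size()` is hypothetical, and the 64 MiB cap is an assumption (taken to be the generic swiotlb default, mirroring how arm64 appears to clamp the value); neither is quoted from the patch.

```c
#include <stdint.h>

/* Assumption: the result is capped at the generic swiotlb default (64 MiB). */
#define SKETCH_SWIOTLB_DEFAULT_BYTES (64ULL << 20)

/*
 * Hypothetical helper: size of the bounce buffer reserved for kmalloc()
 * bouncing, using the ratio from the commit message (RAM / 1024, rounded up).
 */
uint64_t kmalloc_bounce_size(uint64_t ram_bytes)
{
	uint64_t size = (ram_bytes + 1023) / 1024;

	return size < SKETCH_SWIOTLB_DEFAULT_BYTES ? size : SKETCH_SWIOTLB_DEFAULT_BYTES;
}
```

Under these assumptions a 4 GiB board would reserve 4 MiB, and anything at or above 64 GiB of RAM would stop growing at the cap.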
  7. Merge patch series "riscv: enable lockless lockref implementation"

    Jisheng Zhang <jszhang@kernel.org> says:
    
    This series selects ARCH_USE_CMPXCHG_LOCKREF to enable the
    cmpxchg-based lockless lockref implementation for riscv. Then,
    implement arch_cmpxchg64_{relaxed|acquire|release}.
    
    After patch1:
    Using Linus' test case [1] on the TH1520 platform, I see an 11.2%
    improvement. On the JH7110 platform, I see a 12.0% improvement.
    
    After patch2:
    On both TH1520 and JH7110 platforms, I didn't see an obvious
    performance improvement with Linus' test case [1]. IMHO, this may
    be related to the fence and lr.d/sc.d hw implementations. In theory,
    lr/sc without a fence could give a performance improvement over lr/sc
    plus a fence, so add the code here to leave room for performance
    improvement on newer HW platforms.
    
    * b4-shazam-merge:
      riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}
      riscv: select ARCH_USE_CMPXCHG_LOCKREF
    
    Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
    Link: https://lore.kernel.org/r/20240325111038.1700-1-jszhang@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    palmer-dabbelt committed Apr 28, 2024
    e445302
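
Note on the lockless lockref series above: a user-space sketch of the idea, written with C11 atomics rather than the kernel's cmpxchg helpers. The struct layout, the `SKETCH_*` constants and the function names are all hypothetical; the point is only that a lock flag and a reference count packed into one 64-bit word can be updated with a single relaxed compare-and-swap, which is what a cmpxchg64-based fast path relies on.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout: bit 0 is a "locked" flag and the high 32 bits
 * hold the reference count, so both live in one 64-bit word.
 */
#define SKETCH_LOCKED		1ULL
#define SKETCH_COUNT_ONE	(1ULL << 32)

struct lockref_sketch {
	_Atomic uint64_t lock_count;
};

/* Try to take a reference without acquiring the lock. */
bool lockref_get_fast(struct lockref_sketch *ref)
{
	uint64_t old = atomic_load_explicit(&ref->lock_count,
					    memory_order_relaxed);

	while (!(old & SKETCH_LOCKED)) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(&ref->lock_count,
							  &old,
							  old + SKETCH_COUNT_ONE,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
	return false;	/* lock is held: the caller takes the locked slow path */
}

/* Read the current reference count. */
uint32_t lockref_count(struct lockref_sketch *ref)
{
	return (uint32_t)(atomic_load_explicit(&ref->lock_count,
					       memory_order_relaxed) >> 32);
}
```

The relaxed ordering here corresponds to the `_relaxed` variant named in the series; whether dropping the fence helps in practice depends on the hardware, as the cover letter notes.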
  8. riscv: select ARCH_HAS_FAST_MULTIPLIER

    Currently, riscv linux requires at least IMA, so all platforms have a
    multiplier. And I assume the 'mul' efficiency is comparable or better
    than a sequence of five or so register-dependent arithmetic
    instructions. Select ARCH_HAS_FAST_MULTIPLIER to get slightly nicer
    codegen. Refer to commit f9b4192 ("[PATCH] bitops: hweight()
    speedup") for more details.
    
    In a simple benchmark test calling hweight64() in a loop, it got:
    about 14% performance improvement on JH7110, tested on Milkv Mars.
    
    about 23% performance improvement on TH1520 and SG2042, tested on
    Sipeed LPI4A and SG2042 platform.
    
    a slight performance drop on CV1800B, tested on milkv duo. Among all
    the riscv platforms I have on hand, this is the only one that sees a
    slight performance drop, which suggests its 'mul' isn't quick enough.
    However, the same situation exists on x86: as noted in the above
    commit, the P4 doesn't have fast integer multiplies, yet x86 still
    selects ARCH_HAS_FAST_MULTIPLIER. So let's select
    ARCH_HAS_FAST_MULTIPLIER, which benefits almost all riscv platforms.
    
    Samuel also provided some performance numbers:
    On Unmatched: 20% speedup for __sw_hweight32 and 30% speedup for
    __sw_hweight64.
    On D1: 8% speedup for __sw_hweight32 and 8% slowdown for
    __sw_hweight64.
    
    Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
    Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
    Tested-by: Samuel Holland <samuel.holland@sifive.com>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240325105823.1483-1-jszhang@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    xhackerustc authored and palmer-dabbelt committed Apr 28, 2024
    0a16a17
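
Note on the ARCH_HAS_FAST_MULTIPLIER commit above: the two population-count strategies it chooses between can be sketched as follows. This is a stand-alone illustration with hypothetical function names, not a copy of the kernel's lib/hweight.c; both variants share the same first three steps and differ only in how the per-byte counts are summed.

```c
#include <stdint.h>

/* Multiplier variant: a single 'mul' sums the eight per-byte counts. */
unsigned int hweight64_mul(uint64_t w)
{
	w -= (w >> 1) & 0x5555555555555555ULL;
	w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
	w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	return (unsigned int)((w * 0x0101010101010101ULL) >> 56);
}

/* Shift-and-add variant: a chain of register-dependent shifts and adds. */
unsigned int hweight64_shift(uint64_t w)
{
	w -= (w >> 1) & 0x5555555555555555ULL;
	w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
	w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	w += w >> 8;
	w += w >> 16;
	w += w >> 32;
	return (unsigned int)(w & 0xff);
}
```

The multiply replaces three dependent shift/add pairs and the final mask, which is why the benefit depends on how fast 'mul' is on a given core.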
  9. adding ci files

    Björn Töpel committed Apr 28, 2024
    ff9921f

Commits on May 3, 2024

  1. dt-bindings: riscv: Add xtheadvector ISA extension description

    The xtheadvector ISA extension is described on the T-Head extension spec
    GitHub page [1] at commit 95358cb2cca9.
    
    Link: https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361c61d335e03d3134b14133f/xtheadvector.adoc [1]
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    2e3f8bc
  2. dt-bindings: riscv: cpus: add a vlen register length property

    Add a property analogous to the vlenb CSR so that software can detect
    the vector length of each CPU prior to it being brought online.
    Currently software has to assume that the vector length read from the
    boot CPU applies to all possible CPUs. On T-Head CPUs implementing
    pre-ratification vector, reading the th.vlenb CSR may produce an illegal
    instruction trap, so this property is required on such systems.
    
    Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    ConchuOD authored and Björn Töpel committed May 3, 2024
    2224c88
  3. riscv: vector: Use vlenb from DT

    If vlenb is provided in the device tree, prefer that over reading the
    vlenb csr.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    f3dadcb
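
Note on the "Use vlenb from DT" commit above: the selection order it describes can be sketched like this. Every name here is hypothetical, and the CSR read is replaced by a stub with a made-up value so the sketch runs anywhere; the only behaviour taken from the commit message is that a device-tree value, when present, wins and the CSR is not touched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counts CSR reads so the sketch can show when the CSR is touched. */
unsigned int csr_reads;

/* Stand-in for reading the vlenb CSR; the returned value is made up. */
uint32_t read_vlenb_csr_stub(void)
{
	csr_reads++;
	return 16;
}

/* Prefer the device-tree value; fall back to the CSR only without one. */
uint32_t pick_vlenb(bool dt_has_vlenb, uint32_t dt_vlenb)
{
	if (dt_has_vlenb)
		return dt_vlenb;
	return read_vlenb_csr_stub();
}
```

Avoiding the CSR read matters on the T-Head parts mentioned in the dt-bindings commit, where reading th.vlenb may trap.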
  4. riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree

    The D1/D1s SoCs support xtheadvector so it can be included in the
    devicetree. Also include vlenb for the cpu.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    a44e48c
  5. riscv: Extend cpufeature.c to detect vendor extensions

    Separate vendor extensions out into one struct per vendor
    instead of adding vendor extensions onto riscv_isa_ext.
    
    Add a hidden config RISCV_ISA_VENDOR_EXT to conditionally include this
    code.
    
    The xtheadvector vendor extension is added using these changes.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    57d7044
  6. riscv: Add vendor extensions to /proc/cpuinfo

    All of the supported vendor extensions that have been listed in
    riscv_isa_vendor_ext_list can be exported through /proc/cpuinfo.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    015b2c0
  7. riscv: Introduce vendor variants of extension helpers

    Vendor extensions are maintained in per-vendor structs (separate from
    standard extensions which live in riscv_isa). Create vendor variants for
    the existing extension helpers to interface with the riscv_isa_vendor
    bitmaps.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    f13a2e0
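
Note on the per-vendor structs described above: a simplified sketch of what "one bitmap per vendor" buys. The struct, table and helper names are hypothetical and much smaller than the real framework; the vendor ID values are given from memory of asm/vendorid_list.h and should be treated as assumptions. Because each vendor has its own bitmap, extension numbers can restart at 0 per vendor without colliding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mvendorid values (recalled from asm/vendorid_list.h). */
#define SKETCH_ANDES_VENDOR_ID	0x31eUL
#define SKETCH_THEAD_VENDOR_ID	0x5b7UL

/* Hypothetical per-vendor numbering: both start at 0 without clashing. */
#define SKETCH_EXT_XANDESPMU	0
#define SKETCH_EXT_XTHEADVECTOR	0

struct vendor_ext_set {
	unsigned long vendor_id;
	uint64_t isa;	/* one bit per extension of this vendor */
};

/* Example state: a T-Head CPU reporting xtheadvector, nothing for Andes. */
struct vendor_ext_set vendor_sets[] = {
	{ SKETCH_ANDES_VENDOR_ID, 0 },
	{ SKETCH_THEAD_VENDOR_ID, 1ULL << SKETCH_EXT_XTHEADVECTOR },
};

/* Look up the vendor's bitmap first, then test the extension bit. */
bool vendor_ext_available(unsigned long vendor_id, unsigned int ext)
{
	for (size_t i = 0; i < sizeof(vendor_sets) / sizeof(vendor_sets[0]); i++) {
		if (vendor_sets[i].vendor_id == vendor_id)
			return ext < 64 && (vendor_sets[i].isa & (1ULL << ext)) != 0;
	}
	return false;	/* unknown vendor */
}
```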
  8. riscv: cpufeature: Extract common elements from extension checking

    The __riscv_has_extension_likely() and __riscv_has_extension_unlikely()
    functions from the vendor_extensions.h can be used to simplify the
    standard extension checking code as well. Migrate those functions to
    cpufeature.h and reorganize the code in the file to use the functions.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    5f323cb
  9. riscv: Convert xandespmu to use the vendor extension framework

    Migrate xandespmu out of riscv_isa_ext and into a new Andes-specific
    vendor namespace.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    96faf2e
  10. RISC-V: define the elements of the VCSR vector CSR

    The VCSR CSR contains two elements: VXRM[2:1] and VXSAT[0].
    
    Define constants for those to access the elements in a readable way.
    
    Acked-by: Guo Ren <guoren@kernel.org>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    mmind authored and Björn Töpel committed May 3, 2024
    cac64e3
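
Note on the VCSR layout in the commit above: the field positions it states (VXRM in bits [2:1], VXSAT in bit [0]) translate into shift/mask helpers along these lines. The `SKETCH_*` constant names and the helper functions are hypothetical and may not match the names the patch introduces.

```c
/* Field layout from the commit message: VXRM = bits [2:1], VXSAT = bit [0]. */
#define SKETCH_VCSR_VXRM_SHIFT	1
#define SKETCH_VCSR_VXRM_MASK	0x3UL
#define SKETCH_VCSR_VXSAT_MASK	0x1UL

/* Extract the fixed-point rounding mode. */
unsigned long vcsr_vxrm(unsigned long vcsr)
{
	return (vcsr >> SKETCH_VCSR_VXRM_SHIFT) & SKETCH_VCSR_VXRM_MASK;
}

/* Extract the fixed-point saturation flag. */
unsigned long vcsr_vxsat(unsigned long vcsr)
{
	return vcsr & SKETCH_VCSR_VXSAT_MASK;
}

/* Combine separate VXRM and VXSAT values into one VCSR value. */
unsigned long vcsr_pack(unsigned long vxrm, unsigned long vxsat)
{
	return ((vxrm & SKETCH_VCSR_VXRM_MASK) << SKETCH_VCSR_VXRM_SHIFT) |
	       (vxsat & SKETCH_VCSR_VXSAT_MASK);
}
```

For example, a VCSR value of 0x5 decodes to VXRM = 2 and VXSAT = 1.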
  11. riscv: csr: Add CSR encodings for VCSR_VXRM/VCSR_VXSAT

    The VXRM vector csr for xtheadvector has an encoding of 0xa and VXSAT
    has an encoding of 0x9.
    
    Co-developed-by: Heiko Stuebner <heiko@sntech.de>
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    9ef0115
  12. riscv: Add xtheadvector instruction definitions

    xtheadvector uses different encodings than standard vector for
    vsetvli and vector loads/stores. Write the instruction formats to be
    used in assembly code.
    
    Co-developed-by: Heiko Stuebner <heiko@sntech.de>
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    e549023
  13. riscv: vector: Support xtheadvector save/restore

    Use alternatives to add support for xtheadvector vector save/restore
    routines.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    f972f9d
  14. riscv: hwprobe: Add thead vendor extension probing

    Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0" which
    allows userspace to probe for the new RISCV_ISA_VENDOR_EXT_XTHEADVECTOR
    vendor extension.
    
    This new key will allow userspace code to probe for which thead vendor
    extensions are supported. This API is modeled to be consistent with
    RISCV_HWPROBE_KEY_IMA_EXT_0. In the returned bitmask, each set bit
    corresponds to a thead vendor extension supported by the given cpumask.
    Just like RISCV_HWPROBE_KEY_IMA_EXT_0, this allows a userspace program
    to determine all of the supported thead vendor extensions in one call.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Reviewed-by: Evan Green <evan@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    dd79a18
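
Note on consuming the new hwprobe key described above: once userspace has a key/value pair back from the hwprobe syscall, checking for an extension is a bit test on the value. The sketch below covers only that decoding step. The struct mirrors the key/value shape of `struct riscv_hwprobe`; the bit position chosen for xtheadvector and all `SKETCH_`/helper names are assumptions, not values taken from this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the key/value pair shape of struct riscv_hwprobe. */
struct hwprobe_pair {
	int64_t key;
	uint64_t value;
};

/* Assumed bit for xtheadvector within the T-Head vendor key's value. */
#define SKETCH_THEAD_EXT_XTHEADVECTOR	(1ULL << 0)

/* Return true if the probed pair reports the given vendor extension bit. */
bool thead_ext_present(const struct hwprobe_pair *pair, uint64_t ext_bit)
{
	/* hwprobe sets the key to -1 when the kernel does not know it. */
	if (pair->key < 0)
		return false;
	return (pair->value & ext_bit) != 0;
}
```

Checking the key first keeps an older kernel, which does not know the vendor key, from being misread as "extension absent for a known key".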
  15. riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
    
    Document support for thead vendor extensions using the key
    RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 and xtheadvector extension using
    the key RISCV_HWPROBE_VENDOR_EXT_XTHEADVECTOR.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Reviewed-by: Evan Green <evan@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    7c29145
  16. selftests: riscv: Fix vector tests

    Overhaul the riscv vector tests to use kselftest_harness to help the
    test cases correctly report the results and decouple the individual test
    cases from each other. With this refactoring, only run the test cases if
    vector is reported and properly report the test case as skipped
    otherwise. The v_initval_nolibc test was previously not checking if
    vector was supported and used a function (malloc) which invalidates
    the state of the vector registers.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    3eb103e
  17. selftests: riscv: Support xtheadvector in vector tests

    Extend existing vector tests to be compatible with the xtheadvector
    instruction set.
    
    Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    charlie-rivos authored and Björn Töpel committed May 3, 2024
    a5d53e5