Skip to content

Latest commit

 

History

History
684 lines (507 loc) · 48.1 KB

File metadata and controls

684 lines (507 loc) · 48.1 KB

User Guide for RISC-V Target

The RISC-V target provides code generation for processors implementing supported variations of the RISC-V specification. It lives in the llvm/lib/Target/RISCV directory.

There have been a number of revisions to the RISC-V specifications. LLVM aims to implement the most recent ratified version of the standard RISC-V base ISAs and ISA extensions with pragmatic variances. The most recent specification can be found at: https://github.com/riscv/riscv-isa-manual/releases/.

The official RISC-V International specification page. is also worth checking, but tends to significantly lag the specifications linked above. Make sure to check the wiki for not yet integrated extensions and note that in addition, we sometimes carry support for extensions that have not yet been ratified (these will be marked as experimental - see below) and support various vendor-specific extensions (see below).

The current known variances from the specification are:

  • Unconditionally allowing instructions from zifencei, zicsr, zicntr, and zihpm without gating them on the extensions being enabled. Previous revisions of the specification included these instructions in the base ISA, and we preserve this behavior to avoid breaking existing code. If a future revision of the specification reuses these opcodes for other extensions, we may need to reevaluate this choice, and thus recommend users migrate build systems so as not to rely on this.
  • Allowing CSRs to be named without gating on specific extensions. This applies to all CSR names, not just those in zicsr, zicntr, and zihpm.
  • The ordering of z*, s*, and x* prefixed extension names is not enforced in user-specified ISA naming strings (e.g. -march).

We are actively deciding not to support multiple specification revisions at this time. We acknowledge a likely future need, but actively defer the decisions making around handling this until we have a concrete example of real hardware having shipped and an incompatible change to the specification made afterwards.

The specification defines five base instruction sets: RV32I, RV32E, RV64I, RV64E, and RV128I. Currently, LLVM fully supports RV32I, and RV64I. RV32E and RV64E are supported by the assembly-based tools only. RV128I is not supported.

To specify the target triple:

RISC-V Architectures
Architecture Description
riscv32 RISC-V with XLEN=32 (i.e. RV32I or RV32E)
riscv64 RISC-V with XLEN=64 (i.e. RV64I or RV64E)

To select an E variant ISA (e.g. RV32E instead of RV32I), use the base architecture string (e.g. riscv32) with the extension e.

Supported profile names can be passed using -march instead of a standard ISA naming string. Currently supported profiles:

  • rvi20u32
  • rvi20u64
  • rva20u64
  • rva20s64
  • rva22u64
  • rva22s64
  • rva23u64
  • rva23s64
  • rvb23u64
  • rvb23s64

Note that you can also append additional extension names to be enabled, e.g. rva20u64_zicond will enable the zicond extension in addition to those in the rva20u64 profile.

Profiles that are not yet ratified cannot be used unless -menable-experimental-extensions (or equivalent for other tools) is specified. This applies to the following profiles:

  • rvm23u32

The following table provides a status summary for extensions which have been ratified and thus have finalized specifications. When relevant, detailed notes on support follow.

Ratified Extensions by Status
Extension Status
A Supported
B Supported
C Supported
D Supported
F Supported
E Supported (See note)
H Assembly Support
M Supported
Q Assembly Support
Sdext Assembly Support (See note)
Sdtrig Assembly Support (See note)
Sha Supported
Shcounterenw Assembly Support (See note)
Shgatpa Assembly Support (See note)
Shlcofideleg Supported
Shtvala Assembly Support (See note)
Shvsatpa Assembly Support (See note)
Shvstvala Assembly Support (See note)
Shvstvecd Assembly Support (See note)
Smaia Supported
Smcdeleg Supported
Smcntrpmf Supported
Smcsrind Supported
Smctr Assembly Support
Smdbltrp Supported
Smepmp Supported
Smmpm Supported
Smnpm Supported
Smrnmi Supported
Smstateen Assembly Support
Ssaia Supported
Ssccfg Supported
Ssccptr Assembly Support (See note)
Sscofpmf Assembly Support
Sscounterenw Assembly Support (See note)
Sscsrind Supported
Ssctr Assembly Support
Ssdbltrp Supported
Ssnpm Supported
Sspm Supported
Ssqosid Assembly Support
Ssstateen Assembly Support (See note)
Ssstrict Assembly Support (See note)
Sstc Assembly Support
Sstvala Assembly Support (See note)
Sstvecd Assembly Support (See note)
Ssu64xl Assembly Support (See note)
Supm Supported
Svade Assembly Support (See note)
Svadu Assembly Support
Svbare Assembly Support (See note)
Svinval Assembly Support
Svnapot Assembly Support
Svpbmt Supported
Svrsw60t59b Supported
Svvptc Supported
V Supported
Za128rs Supported (See note)
Za64rs Supported (See note)
Zaamo Assembly Support
Zabha Supported
Zacas Supported (See note)
Zalasr Supported
Zalrsc Assembly Support
Zama16b Supported (See note)
Zawrs Assembly Support
Zba Supported
Zbb Supported
Zbc Supported
Zbkb Supported (See note)
Zbkc Supported
Zbkx Supported (See note)
Zbs Supported
Zca Supported
Zcb Supported
Zcd Supported
Zcf Supported
Zclsd Assembly Support
Zcmop Supported
Zcmp Supported
Zcmt Assembly Support
Zdinx Supported
Zfa Supported
Zfbfmin Supported
Zfh Supported
Zfhmin Supported
Zfinx Supported
Zhinx Supported
Zhinxmin Supported
Zic64b Supported (See note)
Zicbom Assembly Support
Zicbop Supported
Zicboz Assembly Support
Ziccamoa Supported (See note)
Ziccamoc Supported (See note)
Ziccid Supported
Ziccif Supported (See note)
Zicclsm Supported (See note)
Ziccrse Supported (See note)
Zicntr (See Note)
Zicond Supported
Zicsr (See Note)
Zifencei (See Note)
Zihintntl Supported
Zihintpause Assembly Support
Zihpm (See Note)
Zilsd Supported
Zimop Supported
Zkn Supported
Zknd Supported (See note)
Zkne Supported (See note)
Zknh Supported (See note)
Zksed Supported (See note)
Zksh Supported (See note)
Zk Supported
Zkr Supported
Zks Supported
Zkt Supported
Zmmul Supported
Ztso Supported
Zvbb Supported
Zvbc Supported (See note)
Zve32x (Partially) Supported
Zve32f (Partially) Supported
Zve64x Supported
Zve64f Supported
Zve64d Supported
Zvfbfa Assembly Support
Zvfbfmin Supported
Zvfbfwma Supported
Zvfh Supported
Zvfhmin Supported
Zvfofp8min Assembly Support
Zvkb Supported
Zvkg Supported (See note)
Zvkn Supported (See note)
Zvknc Supported (See note)
Zvkned Supported (See note)
Zvkng Supported (See note)
Zvknha Supported (See note)
Zvknhb Supported (See note)
Zvks Supported (See note)
Zvksc Supported (See note)
Zvksed Supported (See note)
Zvksg Supported (See note)
Zvksh Supported (See note)
Zvkt Supported
Zvl32b (Partially) Supported
Zvl64b Supported
Zvl128b Supported
Zvl256b Supported
Zvl512b Supported
Zvl1024b Supported
Zvl2048b Supported
Zvl4096b Supported
Zvl8192b Supported
Zvl16384b Supported
Zvl32768b Supported
Zvl65536b Supported
Assembly Support
LLVM supports the associated instructions in assembly. All assembly related tools (e.g. assembler, disassembler, llvm-objdump, etc..) are supported. Compiler and linker will accept extension names, and linked binaries will contain appropriate ELF flags and attributes to reflect use of named extension.
Supported
Fully supported by the compiler. This includes everything in Assembly Support, along with - if relevant - C language intrinsics for the instructions and pattern matching by the compiler to recognize idiomatic patterns which can be lowered to the associated instructions.
E
Support of RV32E/RV64E and ilp32e/lp64e ABIs are experimental. To be compatible with the implementation of ilp32e in GCC, we don't use aligned registers to pass variadic arguments. Furthermore, we set the stack alignment to 4 bytes for types with length of 2*XLEN.
Zbkb, Zbkx
Pattern matching support for these instructions is incomplete.
Zknd, Zkne, Zknh, Zksed, Zksh
No pattern matching exists. As a result, these instructions can only be used from assembler or via intrinsic calls.
Zvbc, Zvkg, Zvkn, Zvknc, Zvkned, Zvkng, Zvknha, Zvknhb, Zvks, Zvks, Zvks, Zvksc, Zvksed, Zvksg, Zvksh.
No pattern matching exists. As a result, these instructions can only be used from assembler or via intrinsic calls.
Zve32x, Zve32f, Zvl32b
LLVM currently assumes a minimum VLEN (vector register width) of 64 bits during compilation, and as a result Zve32x and Zve32f are supported only for VLEN>=64. Assembly support doesn't have this restriction.
Zicntr, Zicsr, Zifencei, Zihpm
Between versions 2.0 and 2.1 of the base I specification, a backwards incompatible change was made to remove selected instructions and CSRs from the base ISA. These instructions were grouped into a set of new extensions, but were no longer required by the base ISA. This change is partially described in "Preface to Document Version 20190608-Base-Ratified" from the specification document (the zicntr and zihpm bits are not mentioned). LLVM currently implements version 2.1 of the base specification. To maintain compatibility, instructions from these extensions are accepted without being in the -march string. LLVM also allows the explicit specification of the extensions in an -march string.
Za128rs, Za64rs, Zama16b, Zic64b, Ziccamoa, Ziccamoc, Ziccif, Zicclsm, Ziccrse, Shcounterenvw, Shgatpa, Shtvala, Shvsatpa, Shvstvala, Shvstvecd, Ssccptr, Sscounterenw, Ssstateen, Ssstrict, Sstvala, Sstvecd, Ssu64xl, Svade, Svbare
These extensions are defined as part of the RISC-V Profiles specification. They do not introduce any new features themselves, but instead describe existing hardware features.

Sdext, Sdtrig The RISC-V Debug Specification.

Zacas
The compiler will not generate amocas.d on RV32 or amocas.q on RV64 due to ABI compatibility. These can only be used in the assembler.

At the time of writing there are three atomics mappings (ABIs) defined for RISC-V. As of LLVM 19, LLVM defaults to "A6S", which is compatible with both the original "A6" and the future "A7" ABI. See the psABI atomics document for more information on these mappings.

Note that although the "A6S" mapping is used, the ELF attribute recording the mapping isn't currently emitted by default due to a bug causing a crash in older versions of binutils when processing files containing this attribute.

LLVM supports (to various degrees) a number of experimental extensions. All experimental extensions have experimental- as a prefix. There is explicitly no compatibility promised between versions of the toolchain, and regular users are strongly advised not to make use of experimental extensions before they reach ratification.

The primary goal of experimental support is to assist in the process of ratification by providing an existence proof of an implementation, and simplifying efforts to validate the value of a proposed extension against large code bases. Experimental extensions are expected to either transition to ratified status, or be eventually removed. The decision on whether to accept an experimental extension is currently done on an entirely case by case basis; if you want to propose one, attending the bi-weekly RISC-V sync-up call is strongly advised.

experimental-p
LLVM implements the 0.21 draft specification.
experimental-zibi
LLVM implements the 0.1 release specification.
experimental-zicfilp, experimental-zicfiss
LLVM implements the 1.0 release specification.
experimental-zvbc32e, experimental-zvkgs
LLVM implements the 0.7 release specification.
experimental-svukte
LLVM implements the 0.3 draft specification.
experimental-zvdot4a8i
LLVM implements the 0.1 draft specification.
experimental-smpmpmt
LLVM implements the 0.6 draft specification.
experimental-zvabd
LLVM implements the 0.7 draft specification.
experimental-zvzip
LLVM implements the 0.1 draft specification.
experimental-zvvfmm
LLVM implements the 0.1 draft specification.
experimental-zvvmm
LLVM implements the 0.1 draft specification.

To use an experimental extension from clang, you must add -menable-experimental-extensions to the command line, and specify the exact version of the experimental extension you are using. To use an experimental extension with LLVM's internal developer tools (e.g. llc, llvm-objdump, llvm-mc), you must prefix the extension name with experimental-. Note that you don't need to specify the version with internal tools, and shouldn't include the experimental- prefix with clang.

Vendor extensions are extensions which are not standardized by RISC-V International, and are instead defined by a hardware vendor. The term vendor extension roughly parallels the definition of a non-standard extension from Section 1.3 of the Volume I: RISC-V Unprivileged ISA specification. In particular, we expect to eventually accept both custom extensions and non-conforming extensions.

Inclusion of a vendor extension will be considered on a case by case basis. All proposals should be brought to the bi-weekly RISC-V sync calls for discussion. For a general idea of the factors likely to be considered, please see the Clang documentation.

It is our intention to follow the naming conventions described in riscv-non-isa/riscv-toolchain-conventions. Exceptions to this naming will need to be strongly motivated.

The current vendor extensions supported are:

XAIFET
LLVM implements the AIFET (AI Foundry's ET) vendor-defined instructions specified in originally defined by Esperanto Technologies (and now under the AI Foundry non-profit). Instructions are prefixed with aif. as described in the specification.
XTHeadBa
LLVM implements the THeadBa (address-generation) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadBb
LLVM implements the THeadBb (basic bit-manipulation) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadBs
LLVM implements the THeadBs (single-bit operations) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadCondMov
LLVM implements the THeadCondMov (conditional move) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadCmo
LLVM implements the THeadCmo (cache management operations) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadFMemIdx
LLVM implements the THeadFMemIdx (indexed memory operations for floating point) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTheadMac
LLVM implements the XTheadMac (multiply-accumulate instructions) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadMemIdx
LLVM implements the THeadMemIdx (indexed memory operations) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadMemPair
LLVM implements the THeadMemPair (two-GPR memory operations) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadSync
LLVM implements the THeadSync (multi-core synchronization instructions) vendor-defined instructions specified in by T-HEAD of Alibaba. Instructions are prefixed with th. as described in the specification.
XTHeadVdot
LLVM implements version 1.0.0 of the THeadV-family custom instructions specification by T-HEAD of Alibaba. All instructions are prefixed with th. as described in the specification, and the riscv-toolchain-convention document linked above.
XVentanaCondOps
LLVM implements version 1.0.0 of the VTx-family custom instructions specification by Ventana Micro Systems. All instructions are prefixed with vt. as described in the specification, and the riscv-toolchain-convention document linked above. These instructions are only available for riscv64 at this time.
Xsfmm*
LLVM implements version 0.6 of the Xsfmm Family of Attached Matrix Extensions Specification by SiFive. All instructions are prefixed with sf. as described in the specification.
XSfvcp
LLVM implements version 1.1.0 of the SiFive Vector Coprocessor Interface (VCIX) Software Specification by SiFive. All instructions are prefixed with sf.vc. as described in the specification, and the riscv-toolchain-convention document linked above.
Xsfvfexp16e, Xsfvfbfexp16e, and Xsfvfexp32e
LLVM implements version 0.5 of the Vector Exponential Extension Specification by SiFive. All instructions are prefixed with sf. as described in the specification linked above.
Xsfvfexpa and Xsfvfexpa64e
LLVM implements version 0.2 of the Vector Exponential Approximation Extension Specification by SiFive. All instructions are prefixed with sf. as described in the specification linked above.
XSfvqmaccdod, XSfvqmaccqoq
LLVM implements version 1.1.0 of the SiFive Int8 Matrix Multiplication Extensions Specification by SiFive. All instructions are prefixed with sf. as described in the specification linked above.
Xsfvfnrclipxfqf
LLVM implements version 1.0.0 of the FP32-to-int8 Ranged Clip Instructions Extension Specification by SiFive. All instructions are prefixed with sf. as described in the specification linked above.
Xsfvfwmaccqqq
LLVM implements version 1.0.0 of the Matrix Multiply Accumulate Instruction Extension Specification by SiFive. All instructions are prefixed with sf. as described in the specification linked above.
XCVbitmanip
LLVM implements version 1.0.0 of the CORE-V Bit Manipulation custom instructions specification by OpenHW Group. All instructions are prefixed with cv. as described in the specification.
XCVelw
LLVM implements version 1.0.0 of the CORE-V Event load custom instructions specification by OpenHW Group. All instructions are prefixed with cv. as described in the specification. These instructions are only available for riscv32 at this time.
XCVmac
LLVM implements version 1.0.0 of the CORE-V Multiply-Accumulate (MAC) custom instructions specification by OpenHW Group. All instructions are prefixed with cv.mac as described in the specification. These instructions are only available for riscv32 at this time.
XCVmem
LLVM implements version 1.0.0 of the CORE-V Post-Increment load and stores custom instructions specification by OpenHW Group. All instructions are prefixed with cv. as described in the specification. These instructions are only available for riscv32 at this time.
XCValu
LLVM implements version 1.0.0 of the Core-V ALU custom instructions specification by Core-V. All instructions are prefixed with cv. as described in the specification. These instructions are only available for riscv32 at this time.
XCVsimd
LLVM implements version 1.0.0 of the CORE-V SIMD custom instructions specification by OpenHW Group. All instructions are prefixed with cv. as described in the specification.
XCVbi
LLVM implements version 1.0.0 of the CORE-V immediate branching custom instructions specification by OpenHW Group. All instructions are prefixed with cv. as described in the specification. These instructions are only available for riscv32 at this time.
XSiFivecdiscarddlone
LLVM implements the SiFive sf.cdiscard.d.l1 instruction by SiFive.
XSiFivecflushdlone
LLVM implements the SiFive sf.cflush.d.l1 instruction by SiFive.
XSfcease
LLVM implements the SiFive sf.cease instruction by SiFive.
Xwchc
LLVM implements the custom compressed opcodes present in some QingKe cores by WCH / Nanjing Qinheng Microelectronics. The vendor refers to these opcodes by the name "XW".
Xqccmp
LLVM implements version 0.3 of the 16-bit Push/Pop instructions and double-moves extension specification by Qualcomm. All instructions are prefixed with qc. as described in the specification.
Xqci
LLVM implements version 0.13 of the Qualcomm uC extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcia
LLVM implements version 0.7 of the Qualcomm uC Arithmetic extension specification by Qualcomm. These instructions are only available for riscv32.
Xqciac
LLVM implements version 0.3 of the Qualcomm uC Load-Store Address Calculation extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcibi
LLVM implements version 0.2 of the Qualcomm uC Branch Immediate extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcibm
LLVM implements version 0.8 of the Qualcomm uC Bit Manipulation extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcicli
LLVM implements version 0.3 of the Qualcomm uC Conditional Load Immediate extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcicm
LLVM implements version 0.2 of the Qualcomm uC Conditional Move extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcics
LLVM implements version 0.2 of the Qualcomm uC Conditional Select extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcicsr
LLVM implements version 0.4 of the Qualcomm uC CSR extension specification by Qualcomm. These instructions are only available for riscv32.
Xqciint
LLVM implements version 0.10 of the Qualcomm uC Interrupts extension specification by Qualcomm. These instructions are only available for riscv32.
Xqciio
LLVM implements version 0.1 of the Qualcomm uC External Input Output extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcilb
LLVM implements version 0.2 of the Qualcomm uC Long Branch extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcili
LLVM implements version 0.2 of the Qualcomm uC Load Large Immediate extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcilia
LLVM implements version 0.2 of the Qualcomm uC Large Immediate Arithmetic extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcilo
LLVM implements version 0.3 of the Qualcomm uC Large Offset Load Store extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcilsm
LLVM implements version 0.6 of the Qualcomm uC Load Store Multiple extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcisim
LLVM implements version 0.2 of the Qualcomm uC Simulation Hint extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcisls
LLVM implements version 0.2 of the Qualcomm uC Scaled Load Store extension specification by Qualcomm. These instructions are only available for riscv32.
Xqcisync
LLVM implements version 0.3 of the Qualcomm uC Sync Delay extension specification by Qualcomm. These instructions are only available for riscv32.
Xmipscbop
LLVM implements MIPS prefetch extension p8700 processor by MIPS.
Xmipscmov
LLVM implements conditional move for the p8700 processor by MIPS.
Xmipslsp
LLVM implements load/store pair instructions for the p8700 processor by MIPS.
experimental-XRivosVizip
LLVM implements version 0.1 of the Rivos Vector Register Zips extension specification.
XAndesPerf
LLVM implements version 5.0.0 of the Andes Performance Extension specification by Andes Technology. All instructions are prefixed with nds. as described in the specification.
XAndesBFHCvt
LLVM implements version 5.0.0 of the Andes Scalar BFLOAT16 Conversion Extension specification by Andes Technology. All instructions are prefixed with nds. as described in the specification.
XAndesVBFHCvt
LLVM implements version 5.0.0 of the Andes Vector BFLOAT16 Conversion Extension specification by Andes Technology. All instructions are prefixed with nds. as described in the specification.
XAndesVSINTH
LLVM implements version 5.0.0 of the Andes Vector Small Int Handling Extension specification by Andes Technology. All instructions are prefixed with nds. as described in the specification.
XAndesVSINTLoad
LLVM implements version 5.0.0 of the Andes Vector INT4 Load Extension specification by Andes Technology. All instructions are prefixed with nds. as described in the specification.
XAndesVPackFPH
LLVM implements version 5.0.0 of the Andes Vector Packed FP16 Extension specification by Andes Technology. All instructions are prefixed with nds. as described in the specification.
XAndesVDot
LLVM implements version 5.0.0 of the Andes Vector Dot Product Extension specification by Andes Technology. All instructions are prefixed with nds. as described in the specification.
XSMTVDot
SpacemiT defines Integrated Matrix Extension (IME) specification. LLVM implement the hardware-adapted subset for SpacemiT X60, defined in the feature document by SpacemiT. All instructions are prefixed with smt. as described in the implementation guide. Note that this implemented subset is version 1.0.0 of the SpacemiT Vector Dot Product Extension specification, which is strictly a subset of the full IME specification to reflect the capabilities of SpacemiT X60 hardware correctly.

In some cases an extension is non-experimental but the C intrinsics for that extension are still experimental. To use C intrinsics for such an extension from clang, you must add -menable-experimental-extensions to the command line. This currently applies to the following extensions:

No extensions have experimental intrinsics.

RISC-V is a variable-length ISA, but the standard currently only defines 16- and 32-bit instructions. The specification describes longer instruction encodings, but these are not ratified.

The LLVM disassembler, llvm-objdump, does use the longer instruction encodings described in the specification to guess the instruction length (up to 176 bits) and will group the disassembly view of encoding bytes correspondingly.

The LLVM integrated assembler for RISC-V supports two different kinds of .insn directive, for assembling instructions that LLVM does not yet support:

  • .insn type, args* which takes a known instruction type, and a list of fields. You are strongly recommended to use this variant of the directive if your instruction fits an existing instruction type.
  • .insn [ length , ] encoding which takes an (optional) explicit length (in bytes) and a raw encoding for the instruction. When given an explicit length, this variant can encode instructions up to 64 bits long. The encoding part of the directive must be given all bits for the instruction, none are filled in for the user. When used without the optional length, this variant of the directive will use the LSBs of the raw encoding to work out if an instruction is 16 or 32 bits long. LLVM does not infer that an instruction might be longer than 32 bits - in this case, the user must give the length explicitly.

It is strongly recommended to use the .insn directive for assembling unsupported instructions instead of .word or .hword, because it will produce the correct mapping symbols to mark the word as an instruction, not data.

Some of the RISC-V psABI variants reserve gp (x3) for use as a "Global Pointer", to make generating data addresses more efficient.

To use this functionality, you need to be doing all of the following:

  • Use the medlow (aka small) code model;
  • Not use the gp register for any other uses (some platforms use it for the shadow stack and others as a temporary -- as denoted by the Tag_RISCV_x3_reg_usage build attribute);
  • Compile your objects with Clang's -mrelax option, to enable relaxation annotations on relocatable objects (this is the default, but -mno-relax disables these relaxation annotations);
  • Compile for a position-dependent static executable (not a shared library, and -fno-PIC / -fno-pic / -fno-pie); and
  • Use LLD's --relax-gp option.

LLD will relax (rewrite) any code sequences that materialize an address within 2048 bytes of __global_pointer$ (which will be defined if it is used and does not already exist) to instead generate the address using gp and the correct (signed) 12-bit immediate. This usually saves at least one instruction compared to materialising a full 32-bit address value.

There can only be one gp value in a process (as gp is not changed when calling into a function in a shared library), so the symbol is is only defined and this relaxation is only done for executables, and not for shared libraries. The linker expects executable startup code to put the value of __global_pointer$ (from the executable) into gp before any user code is run.

Arguably, the most efficient use for this addressing mode is for smaller global variables, as larger global variables likely need many more loads or stores when they are being accessed anyway, so the cost of materializing the upper bits can be shared.

Therefore the compiler can place smaller global variables into sections with names starting with .sdata or .sbss (matching sections with names starting with .data and .bss respectively). LLD knows to define the global_pointer$ symbol close to these sections, and to lay these sections out adjacent to the .data section.

Clang's -msmall-data-limit= option controls what the threshold size is (in bytes) for a global variable to be considered small. -msmall-data-limit=0 disables the use of sections starting .sdata and .sbss. The -msmall-data-limit= option will not move global variables that have an explicit data section, and will keep globals in separate sections if you are using -fdata-sections.

The small data limit threshold is also used to separate small constants into sections with names starting with .srodata. LLD does not place these with the .sdata and .sbss sections as .srodata sections are read only and the other two are writable. Instead the .srodata sections are placed adjacent to .rodata.

Data suggests that these options can produce significant improvements across a range of benchmarks.

Note

This is a summary of the current state of sanitizers, and not an official support statement.

  • UBSan is not platform-specific, and should work out of the box.

  • ASan and TSan already have shadow mappings defined for Linux on RISC-V, and are likely to work.

  • HWASan is also likely to work, though RISC-V Pointer Masking (very new) is needed as well to make it run efficiently.

  • Memtag: N/A - there is currently no ratified RISC-V memory tagging spec.

  • MSan is unlikely to work: there are currently no RISC-V-specific shadow mappings (this is probably easy to fix; perhaps the default Linux 64-bit mapping will work) and MSan does not explicitly handle any of the @llvm.riscv.* intrinsics (this is significantly more work to fix).

    Some intrinsics will be handled correctly anyway, if the RISC-V intrinsic is auto-upgraded into cross-platform LLVM intrinsics. Some others will be "heuristically" handled (possibly incorrectly). The rest will default to the "strict" handler, which checks that all the parameters are fully initialized.

    MSan intrinsics support is only required if code (including dependencies) manually calls the intrinsic.

Due to RISC-V's highly configurable nature, it is often desirable to share a single scheduling model across multiple similar RISC-V processors that only differ in a small number of (uArch) tuning features. An example of such tuning feature could be whether the latency of vector operations depend on VL or not. This could be extended to tuning features that are not directly connected to scheduling model but other parts of the RISC-V backend, like the cost of vrgather.vv instruction.

To that end, RISC-V LLVM supports a tuning feature string format that helps users to build a performance model by "configuring" an existing tune CPU, along with its scheduling model. For example, this string

::
"sifive-x280:single-element-vec-fp64"

takes sifive-x280 as the "base" tune CPU and configured it with single-element-vec-fp64. This gives us a performance model that looks exactly like that of sifive-x280, except some of the 64-bit vector floating point instructions now produce only a single element per cycle due to single-element-vec-fp64. This string could eventually be used in places like -mtune at the frontend.

More formally speaking, each tuning feature string has the following format:

::
<tune-cpu>[":"<tune-features>]?

where

::
tune-cpu ::= 'tuning CPU name in lower case' directive ::= "[a-zA-Z0-9_-]+" tune-features ::= directive ["," directive]*

A directive can and can only _enable_ or _disable_ a certain tuning feature from the tuning CPU. A positive directive, like the single-element-vec-fp64 we just saw, enables an additional tuning feature in the associated tuning model. A negative directive, on the other hand, removes a certain tuning feature. For example, sifive-x390 already has the single-element-vec-fp64 feature, and we can use

::
"sifive-x390:full-vec-fp64"

to create a new performance model that looks nearly the same as sifive-x390 except single-element-vec-fp64 being cut out. In this case, full-vec-fp64 is a negative directive.

There are some rules for the list of directives, though:

  1. The same directive cannot appear more than once.
  2. The positive and negative directives that belong to the same feature cannot appear at the same time.
  3. If a feature implies other features -- for example, short-forward-branch-imul implies short-forward-branch-ialu -- then the _implied_ features are subject to the previous two rules, too. For example, we cannot write _"short-forward-branch-imul,no-short-forward-branch-ialu"_, because the feature implied by short-forward-branch-imul violates rule 2.

In addition to the rules listed above, right now, this string only accepts directives that are explicitly supported by the tune CPU. For example, _"sifive-x280:prefer-w-inst"_ is not a valid string as prefer-w-inst is not supported by sifive-x280 at this moment. Vendors of these processors are expected to maintain the compatibility of their supported directives across different versions. There have been lots of discussions on having "generic" features that are universally supported by all RISC-V CPUs, yet many concerns -- including the difficulty to maintain compatibility across _all_ CPU targets and versions -- make us decide to table this issue until we find a reliable process to select such features.