diff --git a/llvm/docs/CodeGenerator.rst b/llvm/docs/CodeGenerator.rst index d5a019dcb06ba..a2faedfd81056 100644 --- a/llvm/docs/CodeGenerator.rst +++ b/llvm/docs/CodeGenerator.rst @@ -2498,5 +2498,5 @@ The Lightweight Fault Isolation (LFI) sub-architecture LFI is a sub-architecture available for certain backends that allows programs compiled for the target to run in a sandboxed environment that is within the -same address space as host code. Refer to :doc:`LFI` for more information about -LFI. +same address space as host code. Refer to :doc:`LightweightFaultIsolation` for +more information about LFI. diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst deleted file mode 100644 index 65d8b70f17e0b..0000000000000 --- a/llvm/docs/LFI.rst +++ /dev/null @@ -1,441 +0,0 @@ -========================================= -Lightweight Fault Isolation (LFI) in LLVM -========================================= - -.. contents:: - :local: - -Introduction -++++++++++++ - -Lightweight Fault Isolation (LFI) is a compiler-based sandboxing technology for -native code. Like WebAssembly and Native Client, LFI isolates sandboxed code in-process -(i.e., in the same address space as a host application). - -LFI is designed from the ground up to sandbox existing code, such as C/C++ -libraries (including assembly code) and device drivers. - -LFI aims for the following goals: - -* Compatibility: LFI can be used to sandbox nearly all existing C/C++/assembly - libraries unmodified (they just need to be recompiled). Sandboxed libraries - work with existing system call interfaces, and are compatible with existing - development tools such as profilers, debuggers, and sanitizers. -* Performance: LFI aims for minimal overhead vs. unsandboxed code. -* Security: The LFI runtime and compiler elements aim to be simple and - verifiable when possible. -* Usability: LFI aims to make it as easy as possible to retrofit sandboxing, - i.e., to migrate from unsandboxed to sandboxed libraries with minimal effort. 
- -When building a program for the LFI target the compiler is designed to ensure -that the program will only be able to access memory within a limited region of -the virtual address space, starting from where the program is loaded (the -current design sets this region to a size of 4GiB of virtual memory). Programs -built for the LFI target are restricted to using a subset of the instruction -set, designed so that the programs can be soundly confined to their sandbox -region. LFI programs must run inside of an "emulator" (usually called the LFI -runtime), responsible for initializing the sandbox region, loading the program, -and servicing system call requests, or other forms of runtime calls. - -LFI uses an architecture-specific sandboxing scheme based on the general -technique of Software-Based Fault Isolation (SFI). Initial support for LFI in -LLVM is focused on the AArch64 platform, with x86-64 support planned for the -future. The initial version of LFI for AArch64 is designed to support the -Armv8.1 AArch64 architecture. - -See `https://github.com/lfi-project `__ for -details about the LFI project and additional software needed to run LFI -programs. - -Compiler Requirements -+++++++++++++++++++++ - -When building for the ``aarch64_lfi`` target, the compiler must restrict use of -the instruction set to a subset of instructions, which are known to be safe -from a sandboxing perspective. To do this, we apply a set of simple rewrites at -the assembly language level to transform standard native AArch64 assembly into -LFI-compatible AArch64 assembly. - -These rewrites (also called "expansions") are applied at the very end of the -LLVM compilation pipeline (during the assembler step). This allows the rewrites -to be applied to hand-written assembly, including inline assembly. - -Compiler Options -================ - -The LFI target has several configuration options. - -* ``+lfi-loads``: enable sandboxing for loads (default: true). 
-* ``+lfi-stores``: enable sandboxing for stores (default: true). - -Use ``+nolfi-loads`` to create a "stores-only" sandbox that may read, but not -write, outside the sandbox region. - -Use ``+nolfi-loads+nolfi-stores`` to create a "jumps-only" sandbox that may -read/write outside the sandbox region but may not transfer control outside -(e.g., may not execute system calls directly). This is primarily useful in -combination with some other form of memory sandboxing, such as Intel MPK. - -Reserved Registers -================== - -The LFI target uses a custom ABI that reserves additional registers for the -platform. The registers are listed below, along with the security invariant -that must be maintained. - -* ``x27``: always holds the sandbox base address. -* ``x28``: always holds an address within the sandbox. -* ``sp``: always holds an address within the sandbox. -* ``x30``: always holds an address within the sandbox. -* ``x26``: scratch register. -* ``x25``: points to a thread-local virtual register file for storing runtime context information. - -Linker Support -============== - -In the initial version, LFI only supports static linking, and only supports -creating ``static-pie`` binaries. There is nothing that fundamentally precludes -support for dynamic linking on the LFI target, but such support would require -that the code generated by the linker for PLT entries be slightly modified in -order to conform to the LFI architecture subset. - -Assembly Rewrites -================= - -Terminology -~~~~~~~~~~~ - -In the following assembly rewrites, some shorthand is used. - -* ``xN`` or ``wN``: refers to any general-purpose non-reserved register. -* ``{a,b,c}``: matches any of ``a``, ``b``, or ``c``. -* ``LDSTr``: a load/store instruction that supports register-register addressing modes, with one source/destination register. -* ``LDSTx``: a load/store instruction not matched by ``LDSTr``. 
- -Control flow -~~~~~~~~~~~~ - -Indirect branches get rewritten to branch through register ``x28``, which must -always contain an address within the sandbox. An ``add`` is used to safely -update ``x28`` with the destination address. Since ``ret`` uses ``x30`` by -default, which already must contain an address within the sandbox, it does not -require any rewrite. - -+--------------------+---------------------------+ -| Original | Rewritten | -+--------------------+---------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| {br,blr,ret} xN | add x28, x27, wN, uxtw | -| | {br,blr,ret} x28 | -| | | -+--------------------+---------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| ret | ret | -| | | -+--------------------+---------------------------+ - -Memory accesses -~~~~~~~~~~~~~~~ - -Memory accesses are rewritten to use the ``[x27, wM, uxtw]`` addressing mode if -it is available, which is automatically safe. Otherwise, rewrites fall back to -using ``x28`` along with an instruction to safely load it with the target -address. - -+---------------------------------+-------------------------------+ -| Original | Rewritten | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTr xN, [xM] | LDSTr xN, [x27, wM, uxtw] | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTr xN, [xM, #I] | add x28, x27, wM, uxtw | -| | LDSTr xN, [x28, #I] | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTr xN, [xM, #I]! | add xM, xM, #I | -| | LDSTr xN, [x27, wM, uxtw] | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. 
code-block:: | -| | | -| LDSTr xN, [xM], #I | LDSTr xN, [x27, wM, uxtw] | -| | add xM, xM, #I | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTr xN, [xM1, xM2] | add x26, xM1, xM2 | -| | LDSTr xN, [x27, w26, uxtw] | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTr xN, [xM1, xM2, MOD #I] | add x26, xM1, xM2, MOD #I | -| | LDSTr xN, [x27, w26, uxtw] | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTx ..., [xM] | add x28, x27, wM, uxtw | -| | LDSTx ..., [x28] | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTx ..., [xM, #I] | add x28, x27, wM, uxtw | -| | LDSTx ..., [x28, #I] | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTx ..., [xM, #I]! | add x28, x27, wM, uxtw | -| | LDSTx ..., [x28, #I] | -| | add xM, xM, #I | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTx ..., [xM], #I | add x28, x27, wM, uxtw | -| | LDSTx ..., [x28] | -| | add xM, xM, #I | -| | | -+---------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| LDSTx ..., [xM1], xM2 | add x28, x27, wM1, uxtw | -| | LDSTx ..., [x28] | -| | add xM1, xM1, xM2 | -| | | -+---------------------------------+-------------------------------+ - -Stack pointer modification -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -When the stack pointer is modified, we write the modified value to a temporary, -before moving it back into ``sp`` with a safe ``add``. 
- -+------------------------------+-------------------------------+ -| Original | Rewritten | -+------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| mov sp, xN | add sp, x27, wN, uxtw | -| | | -+------------------------------+-------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| {add,sub} sp, sp, {#I,xN} | {add,sub} x26, sp, {#I,xN} | -| | add sp, x27, w26, uxtw | -| | | -+------------------------------+-------------------------------+ - -Link register modification -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -When the link register is modified, we write the modified value to a -temporary, before loading it back into ``x30`` with a safe ``add``. - -+-----------------------+----------------------------+ -| Original | Rewritten | -+-----------------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| ldr x30, [...] | ldr x26, [...] | -| | add x30, x27, w26, uxtw | -| | | -+-----------------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| ldp xN, x30, [...] | ldp xN, x26, [...] | -| | add x30, x27, w26, uxtw | -| | | -+-----------------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| ldp x30, xN, [...] | ldp x26, xN, [...] | -| | add x30, x27, w26, uxtw | -| | | -+-----------------------+----------------------------+ - -System instructions -~~~~~~~~~~~~~~~~~~~ - -System calls are rewritten into a sequence that loads the address of the first -runtime call entrypoint and jumps to it. The runtime call entrypoint table is -stored at the start of the sandbox, so it can be referenced by ``x27``. The -rewrite also saves and restores the link register, since it is used for -branching into the runtime. - -+-----------------+----------------------------+ -| Original | Rewritten | -+-----------------+----------------------------+ -| .. code-block:: | .. 
code-block:: | -| | | -| svc #0 | mov w26, w30 | -| | ldr x30, [x27] | -| | blr x30 | -| | add x30, x27, w26, uxtw | -| | | -+-----------------+----------------------------+ - -Thread-local storage -~~~~~~~~~~~~~~~~~~~~ - -TLS accesses are rewritten into accesses offset from ``x25``, which is a -reserved register that points to a virtual register file, with a location for -storing the sandbox's thread pointer. ``TP`` is the offset into that virtual -register file where the thread pointer is stored. - -+----------------------+-----------------------+ -| Original | Rewritten | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| mrs xN, tpidr_el0 | ldr xN, [x25, #TP] | -| | | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| mrs tpidr_el0, xN | str xN, [x25, #TP] | -| | | -+----------------------+-----------------------+ - -Optimizations -============= - -Basic guard elimination -~~~~~~~~~~~~~~~~~~~~~~~ - -If a register is guarded multiple times in the same basic block without any -modifications to it during the intervening instructions, then subsequent guards -can be removed. - -+---------------------------+---------------------------+ -| Original | Rewritten | -+---------------------------+---------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| add x28, x27, wN, uxtw | add x28, x27, wN, uxtw | -| ldur xN, [x28] | ldur xN, [x28] | -| add x28, x27, wN, uxtw | ldur xN, [x28, #8] | -| ldur xN, [x28, #8] | ldur xN, [x28, #16] | -| add x28, x27, wN, uxtw | | -| ldur xN, [x28, #16] | | -| | | -+---------------------------+---------------------------+ - -Address generation -~~~~~~~~~~~~~~~~~~ - -Addresses to global symbols in position-independent executables are frequently -generated via ``adrp`` followed by ``ldr``. 
Since the address generated by -``adrp`` can be statically guaranteed to be within the sandbox, it is safe to -directly target ``x28`` for these sequences. This allows the omission of a -guard instruction before the ``ldr``. - -+----------------------+-----------------------+ -| Original | Rewritten | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| adrp xN, target | adrp x28, target | -| ldr xN, [xN, imm] | ldr xN, [x28, imm] | -| | | -+----------------------+-----------------------+ - -Stack guard elimination -~~~~~~~~~~~~~~~~~~~~~~~ - -**Note**: this optimization has not been implemented. - -If the stack pointer is modified by adding/subtracting a small immediate, and -then later used to perform a memory access without any intervening jumps, then -the guard on the stack pointer modification can be removed. This is because the -load/store is guaranteed to trap if the stack pointer has been moved outside of -the sandbox region. - -+---------------------------+---------------------------+ -| Original | Rewritten | -+---------------------------+---------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| add x26, sp, #8 | add sp, sp, #8 | -| add sp, x27, w26, uxtw | ... (same basic block) | -| ... (same basic block) | ldr xN, [sp] | -| ldr xN, [sp] | | -| | | -+---------------------------+---------------------------+ - -Guard hoisting -~~~~~~~~~~~~~~ - -**Note**: this optimization has not been implemented. - -In certain cases, guards may be hoisted outside of loops. - -+-----------------------+-------------------------------+ -| Original | Rewritten | -+-----------------------+-------------------------------+ -| .. code-block:: | .. 
code-block:: | -| | | -| mov w8, #10 | mov w8, #10 | -| mov w9, #0 | mov w9, #0 | -| .loop: | add x28, x27, wM, uxtw | -| add w9, w9, #1 | .loop: | -| ldr xN, [xM] | add w9, w9, #1 | -| cmp w9, w8 | ldr xN, [x28] | -| b.lt .loop | cmp w9, w8 | -| .end: | b.lt .loop | -| | .end: | -| | | -+-----------------------+-------------------------------+ - -Assembler Directives -==================== - -The LFI assembler supports the following directives for controlling the -rewriter. - -``.lfi_rewrite_disable`` -~~~~~~~~~~~~~~~~~~~~~~~~ - -Disables LFI assembly rewrites for all subsequent instructions, until -``.lfi_rewrite_enable`` is used. This can be useful for hand-written assembly -that is already safe and should not be modified by the rewriter. - -``.lfi_rewrite_enable`` -~~~~~~~~~~~~~~~~~~~~~~~ - -Re-enables LFI assembly rewrites after a previous ``.lfi_rewrite_disable``. - -Example: - -.. code-block:: gas - - .lfi_rewrite_disable - // No rewrites applied here. - ldr x0, [x27, w1, uxtw] - .lfi_rewrite_enable - -References -++++++++++ - -For more information, please see the following resources: - -* `LFI project page `__ -* `LFI RFC `__ -* `LFI paper `__ - -Contact info: - -* Zachary Yedidia - zyedidia@cs.stanford.edu -* Tal Garfinkel - tgarfinkel@google.com -* Sharjeel Khan - sharjeelkhan@google.com diff --git a/llvm/docs/LightweightFaultIsolation.rst b/llvm/docs/LightweightFaultIsolation.rst new file mode 100644 index 0000000000000..d266ce61e420e --- /dev/null +++ b/llvm/docs/LightweightFaultIsolation.rst @@ -0,0 +1,187 @@ +========================================= +Lightweight Fault Isolation (LFI) in LLVM +========================================= + +.. contents:: + :local: + +Introduction +++++++++++++ + +Lightweight Fault Isolation (LFI) is a compiler-based sandboxing technology for +native code. Like WebAssembly and Native Client, LFI isolates sandboxed code in-process +(i.e., in the same address space as a host application). 
+ +LFI is designed from the ground up to sandbox existing code, such as C/C++ +libraries (including assembly code) and device drivers. + +LFI aims for the following goals: + +* Compatibility: LFI can be used to sandbox nearly all existing C/C++/assembly + libraries unmodified (they just need to be recompiled). Sandboxed libraries + work with existing system call interfaces, and are compatible with existing + development tools such as profilers, debuggers, and sanitizers. +* Performance: LFI aims for minimal overhead vs. unsandboxed code. +* Security: The LFI runtime and compiler elements aim to be simple and + verifiable when possible. +* Usability: LFI aims to make it as easy as possible to retrofit sandboxing, + i.e., to migrate from unsandboxed to sandboxed libraries with minimal effort. + +When building a program for the LFI target the compiler is designed to ensure +that the program will only be able to access memory within a limited region of +the virtual address space, starting from where the program is loaded (the +current design sets this region to a size of 4GiB of virtual memory). Programs +built for the LFI target are restricted to using a subset of the instruction +set, designed so that the programs can be soundly confined to their sandbox +region. LFI programs must run inside of an "emulator" (usually called the LFI +runtime), responsible for initializing the sandbox region, loading the program, +and servicing system call requests, or other forms of runtime calls. + +LFI uses an architecture-specific sandboxing scheme based on the general +technique of Software-Based Fault Isolation (SFI). Initial support for LFI in +LLVM is focused on the AArch64 platform, with x86-64 support planned for the +future. The initial version of LFI for AArch64 is designed to support the +Armv8.1 AArch64 architecture. + +See `https://github.com/lfi-project `__ for +details about the LFI project and additional software needed to run LFI +programs. 
+ +Compiler Requirements ++++++++++++++++++++++ + +When building for the ``aarch64_lfi`` target, the compiler must restrict use of +the instruction set to a subset of instructions, which are known to be safe +from a sandboxing perspective. To do this, we apply a set of simple rewrites at +the assembly language level to transform standard native AArch64 assembly into +LFI-compatible AArch64 assembly. + +These rewrites (also called "expansions") are applied at the very end of the +LLVM compilation pipeline (during the assembler step). This allows the rewrites +to be applied to hand-written assembly, including inline assembly. + +Reserved Registers +================== + +The LFI target uses a custom ABI that reserves additional registers for the +platform. The registers are listed below, along with the security invariant +that must be maintained. + +* ``x27``: always holds the sandbox base address. +* ``x28``: always holds an address within the sandbox. +* ``sp``: always holds an address within the sandbox. +* ``x30``: always holds an address within the sandbox. +* ``x26``: scratch register. +* ``x25``: context register (see below). + +Context Register +~~~~~~~~~~~~~~~~ + +The context register (``x25``) points to a block of thread-local memory managed +by the LFI runtime. The layout is as follows: + ++--------+--------+----------------------------------------------+ +| Offset | Size | Description | ++--------+--------+----------------------------------------------+ +| 0 | 8 | Reserved for future use. | ++--------+--------+----------------------------------------------+ +| 8 | 8 | Reserved for use by the LFI runtime. | ++--------+--------+----------------------------------------------+ +| 16 | 8 | Virtual thread pointer (used for TP access). | ++--------+--------+----------------------------------------------+ + +Linker Support +============== + +In the initial version, LFI only supports static linking, and only supports +creating ``static-pie`` binaries. 
There is nothing that fundamentally precludes +support for dynamic linking on the LFI target, but such support would require +that the code generated by the linker for PLT entries be slightly modified in +order to conform to the LFI architecture subset. + +Assembly Rewrites +================= + +System instructions +~~~~~~~~~~~~~~~~~~~ + +System calls are rewritten into a sequence that loads the address of the first +runtime call entrypoint and jumps to it. The runtime call entrypoint table is +stored at a negative offset from the sandbox base, so it can be referenced by +``x27``. The rewrite also saves and restores the link register, since it is +used for branching into the runtime. + ++-----------------+------------------------------+ +| Original | Rewritten | ++-----------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| svc #0 | mov x26, x30 | +| | ldur x30, [x27, #-8] | +| | blr x30 | +| | add x30, x27, w26, uxtw | +| | | ++-----------------+------------------------------+ + +Thread-local storage +~~~~~~~~~~~~~~~~~~~~ + +TLS accesses are rewritten into loads/stores from the context register +(``x25``), which holds the virtual thread pointer at offset 16 (see +`Context Register`_). + ++----------------------+-------------------------+ +| Original | Rewritten | ++----------------------+-------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| mrs xN, tpidr_el0 | ldr xN, [x25, #16] | +| | | ++----------------------+-------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| msr tpidr_el0, xN | str xN, [x25, #16] | +| | | ++----------------------+-------------------------+ + +Assembler Directives +==================== + +The LFI assembler supports the following directives for controlling the +rewriter. + +``.lfi_rewrite_disable`` +~~~~~~~~~~~~~~~~~~~~~~~~ + +Disables LFI assembly rewrites for all subsequent instructions, until +``.lfi_rewrite_enable`` is used. 
This can be useful for hand-written assembly +that is already safe and should not be modified by the rewriter. + +``.lfi_rewrite_enable`` +~~~~~~~~~~~~~~~~~~~~~~~ + +Re-enables LFI assembly rewrites after a previous ``.lfi_rewrite_disable``. + +Example: + +.. code-block:: gas + + .lfi_rewrite_disable + // No rewrites applied here. + ldr x0, [x27, w1, uxtw] + .lfi_rewrite_enable + +References +++++++++++ + +For more information, please see the following resources: + +* `LFI project page `__ +* `LFI RFC `__ +* `LFI paper `__ + +Contact info: + +* Zachary Yedidia - zyedidia@cs.stanford.edu +* Tal Garfinkel - tgarfinkel@google.com +* Sharjeel Khan - sharjeelkhan@google.com diff --git a/llvm/docs/UserGuides.rst b/llvm/docs/UserGuides.rst index a712d4eb4c13e..bc12ca3d69168 100644 --- a/llvm/docs/UserGuides.rst +++ b/llvm/docs/UserGuides.rst @@ -50,7 +50,7 @@ intermediate LLVM representation. InstrProfileFormat InstrRefDebugInfo KeyInstructionsDebugInfo - LFI + LightweightFaultIsolation LinkTimeOptimization LoopTerminology MarkdownQuickstartTemplate @@ -319,5 +319,5 @@ Additional Topics :doc:`Telemetry` This document describes the Telemetry framework in LLVM. -:doc:`LFI ` +:doc:`LightweightFaultIsolation ` This document describes the Lightweight Fault Isolation (LFI) target in LLVM. diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp index 6c4a778b10f3f..d6beabb19331b 100644 --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp @@ -107,6 +107,19 @@ AArch64InstrInfo::AArch64InstrInfo(const AArch64Subtarget &STI) AArch64::ADJCALLSTACKUP, AArch64::CATCHRET), RI(STI.getTargetTriple(), STI.getHwMode()), Subtarget(STI) {} +/// Return the maximum number of bytes of code the specified instruction may be +/// after LFI rewriting. If the instruction is not rewritten, std::nullopt is +/// returned (use default sizing). 
+static std::optional<unsigned> getLFIInstSizeInBytes(const MachineInstr &MI) { + switch (MI.getOpcode()) { + case AArch64::SVC: + // SVC expands to 4 instructions. + return 16; + default: + return std::nullopt; + } +} + /// GetInstSize - Return the number of bytes of code the specified /// instruction may be. This returns the maximum number of bytes. unsigned AArch64InstrInfo::getInstSizeInBytes(const MachineInstr &MI) const { @@ -130,6 +143,11 @@ unsigned AArch64InstrInfo::getInstSizeInBytes(const MachineInstr &MI) const { unsigned NumBytes = 0; const MCInstrDesc &Desc = MI.getDesc(); + // LFI rewriter expansions that supersede normal sizing. + if (Subtarget.isLFI()) + if (auto Size = getLFIInstSizeInBytes(MI)) + return *Size; + if (!MI.isBundle() && isTailCallReturnInst(MI)) { NumBytes = Desc.getSize() ? Desc.getSize() : 4; diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.cpp new file mode 100644 index 0000000000000..fa122ce327ece --- /dev/null +++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.cpp @@ -0,0 +1,204 @@ +//===- AArch64MCLFIRewriter.cpp ---------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the AArch64MCLFIRewriter class, the AArch64 specific +// subclass of MCLFIRewriter. 
+// +//===----------------------------------------------------------------------===// + +#include "AArch64MCLFIRewriter.h" +#include "AArch64AddressingModes.h" +#include "MCTargetDesc/AArch64MCTargetDesc.h" +#include "Utils/AArch64BaseInfo.h" + +#include "llvm/MC/MCInst.h" +#include "llvm/MC/MCStreamer.h" +#include "llvm/MC/MCSubtargetInfo.h" + +using namespace llvm; + +// LFI reserved registers. +static constexpr MCRegister LFIBaseReg = AArch64::X27; +static constexpr MCRegister LFIAddrReg = AArch64::X28; +static constexpr MCRegister LFIScratchReg = AArch64::X26; +static constexpr MCRegister LFICtxReg = AArch64::X25; + +// Offset into the context register block (pointed to by LFICtxReg) where the +// thread pointer is stored. This is a scaled offset (multiplied by 8 for +// 64-bit loads), so a value of 2 means an actual byte offset of 16. +static constexpr unsigned LFITPOffset = 2; + +static bool isSyscall(const MCInst &Inst) { + return Inst.getOpcode() == AArch64::SVC; +} + +static bool isPrivilegedTP(int64_t Reg) { + return Reg == AArch64SysReg::TPIDR_EL1 || Reg == AArch64SysReg::TPIDR_EL2 || + Reg == AArch64SysReg::TPIDR_EL3; +} + +static bool isTPRead(const MCInst &Inst) { + return Inst.getOpcode() == AArch64::MRS && + Inst.getOperand(1).getImm() == AArch64SysReg::TPIDR_EL0; +} + +static bool isTPWrite(const MCInst &Inst) { + return Inst.getOpcode() == AArch64::MSR && + Inst.getOperand(0).getImm() == AArch64SysReg::TPIDR_EL0; +} + +static bool isPrivilegedTPAccess(const MCInst &Inst) { + if (Inst.getOpcode() == AArch64::MRS) + return isPrivilegedTP(Inst.getOperand(1).getImm()); + if (Inst.getOpcode() == AArch64::MSR) + return isPrivilegedTP(Inst.getOperand(0).getImm()); + return false; +} + +bool AArch64MCLFIRewriter::mayModifyReserved(const MCInst &Inst) const { + return mayModifyRegister(Inst, LFIAddrReg) || + mayModifyRegister(Inst, LFIBaseReg) || + mayModifyRegister(Inst, LFICtxReg); +} + +void AArch64MCLFIRewriter::emitInst(const MCInst &Inst, MCStreamer 
&Out, + const MCSubtargetInfo &STI) { + Out.emitInstruction(Inst, STI); +} + +void AArch64MCLFIRewriter::emitAddMask(MCRegister Dest, MCRegister Src, + MCStreamer &Out, + const MCSubtargetInfo &STI) { + // add Dest, LFIBaseReg, W(Src), uxtw + MCInst Inst; + Inst.setOpcode(AArch64::ADDXrx); + Inst.addOperand(MCOperand::createReg(Dest)); + Inst.addOperand(MCOperand::createReg(LFIBaseReg)); + Inst.addOperand(MCOperand::createReg(getWRegFromXReg(Src))); + Inst.addOperand( + MCOperand::createImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTW, 0))); + emitInst(Inst, Out, STI); +} + +void AArch64MCLFIRewriter::emitBranch(unsigned Opcode, MCRegister Target, + MCStreamer &Out, + const MCSubtargetInfo &STI) { + MCInst Branch; + Branch.setOpcode(Opcode); + Branch.addOperand(MCOperand::createReg(Target)); + emitInst(Branch, Out, STI); +} + +void AArch64MCLFIRewriter::emitMov(MCRegister Dest, MCRegister Src, + MCStreamer &Out, + const MCSubtargetInfo &STI) { + // orr Dest, xzr, Src + MCInst Inst; + Inst.setOpcode(AArch64::ORRXrs); + Inst.addOperand(MCOperand::createReg(Dest)); + Inst.addOperand(MCOperand::createReg(AArch64::XZR)); + Inst.addOperand(MCOperand::createReg(Src)); + Inst.addOperand(MCOperand::createImm(0)); + emitInst(Inst, Out, STI); +} + +// svc #0 +// -> +// mov x26, x30 +// ldur x30, [x27, #-8] +// blr x30 +// add x30, x27, w26, uxtw +void AArch64MCLFIRewriter::rewriteSyscall(const MCInst &, MCStreamer &Out, + const MCSubtargetInfo &STI) { + // Save LR to scratch. + emitMov(LFIScratchReg, AArch64::LR, Out, STI); + + // Load syscall handler address from negative offset from sandbox base. + MCInst Load; + Load.setOpcode(AArch64::LDURXi); + Load.addOperand(MCOperand::createReg(AArch64::LR)); + Load.addOperand(MCOperand::createReg(LFIBaseReg)); + Load.addOperand(MCOperand::createImm(-8)); + emitInst(Load, Out, STI); + + // Call the runtime. + emitBranch(AArch64::BLR, AArch64::LR, Out, STI); + + // Restore LR with guard. 
+ emitAddMask(AArch64::LR, LFIScratchReg, Out, STI); +} + +// mrs xN, tpidr_el0 +// -> +// ldr xN, [x25, #16] +void AArch64MCLFIRewriter::rewriteTPRead(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + MCRegister DestReg = Inst.getOperand(0).getReg(); + + MCInst Load; + Load.setOpcode(AArch64::LDRXui); + Load.addOperand(MCOperand::createReg(DestReg)); + Load.addOperand(MCOperand::createReg(LFICtxReg)); + Load.addOperand(MCOperand::createImm(LFITPOffset)); + emitInst(Load, Out, STI); +} + +// msr tpidr_el0, xN +// -> +// str xN, [x25, #16] +void AArch64MCLFIRewriter::rewriteTPWrite(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + MCRegister SrcReg = Inst.getOperand(1).getReg(); + + MCInst Store; + Store.setOpcode(AArch64::STRXui); + Store.addOperand(MCOperand::createReg(SrcReg)); + Store.addOperand(MCOperand::createReg(LFICtxReg)); + Store.addOperand(MCOperand::createImm(LFITPOffset)); + emitInst(Store, Out, STI); +} + +void AArch64MCLFIRewriter::doRewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + // Reserved register modification is an error. + if (mayModifyReserved(Inst)) { + error(Inst, "illegal modification of reserved LFI register"); + return; + } + + // System instructions. + if (isSyscall(Inst)) + return rewriteSyscall(Inst, Out, STI); + + if (isTPRead(Inst)) + return rewriteTPRead(Inst, Out, STI); + + if (isTPWrite(Inst)) + return rewriteTPWrite(Inst, Out, STI); + + if (isPrivilegedTPAccess(Inst)) { + error(Inst, "illegal access to privileged thread pointer register"); + return; + } + + emitInst(Inst, Out, STI); +} + +bool AArch64MCLFIRewriter::rewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + // The guard prevents rewrite-recursion when we emit instructions from inside + // the rewriter (such instructions should not be rewritten). 
+ if (!Enabled || Guard) + return false; + Guard = true; + + doRewriteInst(Inst, Out, STI); + + Guard = false; + return true; +} diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.h b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.h new file mode 100644 index 0000000000000..049860c1ab6cc --- /dev/null +++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCLFIRewriter.h @@ -0,0 +1,81 @@ +//===- AArch64MCLFIRewriter.h -----------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file declares the AArch64MCLFIRewriter class, the AArch64 specific +// subclass of MCLFIRewriter. +// +//===----------------------------------------------------------------------===// +#ifndef LLVM_LIB_TARGET_AARCH64_MCTARGETDESC_AARCH64MCLFIREWRITER_H +#define LLVM_LIB_TARGET_AARCH64_MCTARGETDESC_AARCH64MCLFIREWRITER_H + +#include "llvm/MC/MCInstrInfo.h" +#include "llvm/MC/MCLFIRewriter.h" +#include "llvm/MC/MCRegister.h" +#include "llvm/MC/MCRegisterInfo.h" + +namespace llvm { +class MCContext; +class MCInst; +class MCStreamer; +class MCSubtargetInfo; + +/// Rewrites AArch64 instructions for LFI sandboxing. +/// +/// This class implements the LFI (Lightweight Fault Isolation) rewriting +/// for AArch64 instructions. It transforms instructions to ensure memory +/// accesses and control flow are confined within the sandbox region. 
+/// +/// Reserved registers: +/// - X27: Sandbox base address (always holds the base) +/// - X28: Safe address register (always within sandbox) +/// - X26: Scratch register for intermediate calculations +/// - X25: Context register (points to thread-local runtime data) +/// - SP: Stack pointer (always within sandbox) +/// - X30: Link register (always within sandbox) +class AArch64MCLFIRewriter : public MCLFIRewriter { +public: + AArch64MCLFIRewriter(MCContext &Ctx, std::unique_ptr<MCRegisterInfo> &&RI, + std::unique_ptr<MCInstrInfo> &&II) + : MCLFIRewriter(Ctx, std::move(RI), std::move(II)) {} + + bool rewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) override; + +private: + /// Recursion guard to prevent infinite loops when emitting instructions. + bool Guard = false; + + // Instruction classification. + bool mayModifyReserved(const MCInst &Inst) const; + + // Instruction emission. + void emitInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + void emitAddMask(MCRegister Dest, MCRegister Src, MCStreamer &Out, + const MCSubtargetInfo &STI); + void emitBranch(unsigned Opcode, MCRegister Target, MCStreamer &Out, + const MCSubtargetInfo &STI); + void emitMov(MCRegister Dest, MCRegister Src, MCStreamer &Out, + const MCSubtargetInfo &STI); + + // Rewriting logic. + void doRewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + + // System instructions. 
+ void rewriteSyscall(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + void rewriteTPRead(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + void rewriteTPWrite(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); +}; + +} // namespace llvm + +#endif // LLVM_LIB_TARGET_AARCH64_MCTARGETDESC_AARCH64MCLFIREWRITER_H diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp index 5c8f57664a2cc..d681e75b314b7 100644 --- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp +++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp @@ -13,6 +13,7 @@ #include "AArch64MCTargetDesc.h" #include "AArch64ELFStreamer.h" #include "AArch64MCAsmInfo.h" +#include "AArch64MCLFIRewriter.h" #include "AArch64WinCOFFStreamer.h" #include "MCTargetDesc/AArch64AddressingModes.h" #include "MCTargetDesc/AArch64InstPrinter.h" @@ -503,6 +504,17 @@ static MCInstrAnalysis *createAArch64InstrAnalysis(const MCInstrInfo *Info) { return new AArch64MCInstrAnalysis(Info); } +static MCLFIRewriter * +createAArch64MCLFIRewriter(MCStreamer &S, + std::unique_ptr<MCRegisterInfo> &&RegInfo, + std::unique_ptr<MCInstrInfo> &&InstInfo) { + auto RW = std::make_unique<AArch64MCLFIRewriter>( + S.getContext(), std::move(RegInfo), std::move(InstInfo)); + auto *Ptr = RW.get(); + S.setLFIRewriter(std::move(RW)); + return Ptr; +} + // Force static initialization. extern "C" LLVM_ABI LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAArch64TargetMC() { @@ -532,6 +544,9 @@ LLVMInitializeAArch64TargetMC() { TargetRegistry::RegisterMachOStreamer(*T, createMachOStreamer); TargetRegistry::RegisterCOFFStreamer(*T, createAArch64WinCOFFStreamer); + // Register the LFI rewriter. + TargetRegistry::RegisterMCLFIRewriter(*T, createAArch64MCLFIRewriter); + // Register the obj target streamer. 
TargetRegistry::RegisterObjectTargetStreamer( *T, createAArch64ObjectTargetStreamer); diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/CMakeLists.txt b/llvm/lib/Target/AArch64/MCTargetDesc/CMakeLists.txt index 7f220657e45f8..7d8d825d7220b 100644 --- a/llvm/lib/Target/AArch64/MCTargetDesc/CMakeLists.txt +++ b/llvm/lib/Target/AArch64/MCTargetDesc/CMakeLists.txt @@ -6,6 +6,7 @@ add_llvm_component_library(LLVMAArch64Desc AArch64MCAsmInfo.cpp AArch64MCCodeEmitter.cpp AArch64MCExpr.cpp + AArch64MCLFIRewriter.cpp AArch64MCTargetDesc.cpp AArch64MachObjectWriter.cpp AArch64TargetStreamer.cpp diff --git a/llvm/test/MC/AArch64/LFI/reserved.s b/llvm/test/MC/AArch64/LFI/reserved.s new file mode 100644 index 0000000000000..8ad5e7c56bb9e --- /dev/null +++ b/llvm/test/MC/AArch64/LFI/reserved.s @@ -0,0 +1,45 @@ +// RUN: not llvm-mc -triple aarch64_lfi %s 2>&1 | FileCheck %s + +mov x27, x0 +// CHECK: error: illegal modification of reserved LFI register +// CHECK: mov x27, x0 + +ldr x27, [x0] +// CHECK: error: illegal modification of reserved LFI register +// CHECK: ldr x27, [x0] + +add x27, x0, x1 +// CHECK: error: illegal modification of reserved LFI register +// CHECK: add x27, x0, x1 + +mov x28, x0 +// CHECK: error: illegal modification of reserved LFI register +// CHECK: mov x28, x0 + +ldr x28, [x0] +// CHECK: error: illegal modification of reserved LFI register +// CHECK: ldr x28, [x0] + +add x28, x0, x1 +// CHECK: error: illegal modification of reserved LFI register +// CHECK: add x28, x0, x1 + +ldp x27, x28, [x0] +// CHECK: error: illegal modification of reserved LFI register +// CHECK: ldp x27, x28, [x0] + +ldp x0, x27, [x1] +// CHECK: error: illegal modification of reserved LFI register +// CHECK: ldp x0, x27, [x1] + +ldp x28, x0, [x1] +// CHECK: error: illegal modification of reserved LFI register +// CHECK: ldp x28, x0, [x1] + +ldr x0, [x27], #8 +// CHECK: error: illegal modification of reserved LFI register +// CHECK: ldr x0, [x27], #8 + +ldr x0, [x28, #8]! 
+// CHECK: error: illegal modification of reserved LFI register +// CHECK: ldr x0, [x28, #8]! diff --git a/llvm/test/MC/AArch64/LFI/sys.s b/llvm/test/MC/AArch64/LFI/sys.s new file mode 100644 index 0000000000000..df4b694600851 --- /dev/null +++ b/llvm/test/MC/AArch64/LFI/sys.s @@ -0,0 +1,7 @@ +// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s + +svc #0 +// CHECK: mov x26, x30 +// CHECK-NEXT: ldur x30, [x27, #-8] +// CHECK-NEXT: blr x30 +// CHECK-NEXT: add x30, x27, w26, uxtw diff --git a/llvm/test/MC/AArch64/LFI/tp.s b/llvm/test/MC/AArch64/LFI/tp.s new file mode 100644 index 0000000000000..0ddf839a1d56c --- /dev/null +++ b/llvm/test/MC/AArch64/LFI/tp.s @@ -0,0 +1,13 @@ +// RUN: llvm-mc -triple aarch64_lfi %s | FileCheck %s + +mrs x0, tpidr_el0 +// CHECK: ldr x0, [x25, #16] + +mrs x1, tpidr_el0 +// CHECK: ldr x1, [x25, #16] + +msr tpidr_el0, x0 +// CHECK: str x0, [x25, #16] + +msr tpidr_el0, x1 +// CHECK: str x1, [x25, #16]