[ELF][AArch64] Support for BTI and PAC
Branch Target Identification (BTI) and Pointer Authentication (PAC) are
architecture features introduced in Armv8.5-A and Armv8.3-A respectively. The
new instructions have been added in the hint space so that binaries can take
advantage of support where it exists yet still run on older hardware. The
impact of each feature is:

BTI: For executable pages that have been guarded, all indirect branches
must have a destination that is a BTI instruction of the appropriate type.
For the static linker, this means that PLT entries must have a "BTI c" as
the first instruction in the sequence. BTI is an all-or-nothing
property for a link unit: any indirect branch not landing on a valid
destination will cause a Branch Target Exception.

PAC: The dynamic loader encodes with PACIA the address of the destination
that the PLT entry will load from the .got.plt, placing the result in a
subset of the top bits that are not valid virtual addresses. The PLT entry
may authenticate these top bits using the AUTIA instruction before
branching to the destination. Use of PAC in PLT sequences is a contract
between the dynamic loader and the static linker; it is independent of
whether the relocatable objects use PAC.

BTI and PAC are independent features that can be combined. So we can have
several combinations of PLT:
- Standard with no BTI or PAC
- BTI PLT with "BTI c" as first instruction.
- PAC PLT with "AUTIA1716" before the indirect branch to X17.
- BTIPAC PLT with "BTI c" as first instruction and "AUTIA1716" before the
  first indirect branch to X17.
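
As an illustration of the four layouts listed above, the following C++ sketch
assembles a PLT entry from two flags. The instruction encodings are the ones
used in the lld/ELF/Arch/AArch64.cpp change in this commit; the names
buildPltEntry and Insn are hypothetical, the adrp/ldr/add words are left
unrelocated, and every variant is padded to 24 bytes as an assumption of the
sketch. It is not the lld implementation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using Insn = std::array<uint8_t, 4>; // one little-endian A64 instruction

constexpr Insn kBtiC      = {0x5f, 0x24, 0x03, 0xd5}; // bti c
constexpr Insn kAdrp      = {0x10, 0x00, 0x00, 0x90}; // adrp x16, Page(...)
constexpr Insn kLdr       = {0x11, 0x02, 0x40, 0xf9}; // ldr  x17, [x16, ...]
constexpr Insn kAdd       = {0x10, 0x02, 0x00, 0x91}; // add  x16, x16, ...
constexpr Insn kAutia1716 = {0x9f, 0x21, 0x03, 0xd5}; // autia1716
constexpr Insn kBrX17     = {0x20, 0x02, 0x1f, 0xd6}; // br x17
constexpr Insn kNop       = {0x1f, 0x20, 0x03, 0xd5}; // nop

// Hypothetical helper: lay out [bti c] adrp ldr add (autia1716 br | br nop) [nop].
std::vector<uint8_t> buildPltEntry(bool bti, bool pac) {
  std::vector<uint8_t> out;
  auto emit = [&out](const Insn &i) { out.insert(out.end(), i.begin(), i.end()); };

  if (bti)
    emit(kBtiC); // landing pad must be the first instruction
  emit(kAdrp);
  emit(kLdr);
  emit(kAdd);
  if (pac) {
    emit(kAutia1716); // authenticate x17 using the modifier in x16
    emit(kBrX17);
  } else {
    emit(kBrX17);
    emit(kNop);
  }
  if (!bti)
    emit(kNop); // no bti c was emitted, so pad to the common size

  assert(out.size() == 24);
  return out;
}
```

Calling buildPltEntry(true, true) yields the BTIPAC layout, with "bti c" at
offset 0 and "autia1716" immediately before "br x17".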
    
The use of BTI and PAC in relocatable object files is encoded by feature
bits in the .note.gnu.property section in a similar way to Intel CET. There
is one AArch64 specific program property GNU_PROPERTY_AARCH64_FEATURE_1_AND
and two target feature bits defined:
- GNU_PROPERTY_AARCH64_FEATURE_1_BTI
-- All executable sections are compatible with BTI.
- GNU_PROPERTY_AARCH64_FEATURE_1_PAC
-- All executable sections have return address signing enabled.

Due to the properties of FEATURE_1_AND the static linker can tell when all
input relocatable objects have the BTI and PAC feature bits set. The static
linker uses this to enable the appropriate PLT sequence.
Neither -> standard PLT
GNU_PROPERTY_AARCH64_FEATURE_1_BTI -> BTI PLT
GNU_PROPERTY_AARCH64_FEATURE_1_PAC -> PAC PLT
Both properties -> BTIPAC PLT
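
The AND semantics and the mapping above can be sketched in a few lines of C++.
This is a minimal illustration, not the lld code: mergeFeatures, selectPlt and
PltKind are hypothetical names, the bit values are those the AArch64 ELF ABI
assigns to the two flags, and the sketch assumes at least one relocatable
input.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bit values of the two flags carried by GNU_PROPERTY_AARCH64_FEATURE_1_AND.
constexpr uint32_t FEATURE_1_BTI = 1u << 0;
constexpr uint32_t FEATURE_1_PAC = 1u << 1;

enum class PltKind { Standard, Bti, Pac, BtiPac };

// AND the per-object feature words: a bit survives only if every input sets it.
// Assumption of this sketch: at least one relocatable input is present.
uint32_t mergeFeatures(const std::vector<uint32_t> &perObject) {
  assert(!perObject.empty());
  uint32_t merged = ~0u;
  for (uint32_t features : perObject)
    merged &= features;
  return merged;
}

// Map the merged feature word to one of the four PLT sequences.
PltKind selectPlt(uint32_t merged) {
  bool bti = (merged & FEATURE_1_BTI) != 0;
  bool pac = (merged & FEATURE_1_PAC) != 0;
  if (bti && pac)
    return PltKind::BtiPac;
  if (bti)
    return PltKind::Bti;
  if (pac)
    return PltKind::Pac;
  return PltKind::Standard;
}
```

A single input without the BTI bit is enough to clear it in the merged word,
which is why one unmarked object drops the link back to a non-BTI PLT.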

In addition to the .note.gnu.property section there are two new command line
options:
--force-bti : Act as if all relocatable inputs had
GNU_PROPERTY_AARCH64_FEATURE_1_BTI and warn for every relocatable object
that does not.
--pac-plt : Act as if all relocatable inputs had
GNU_PROPERTY_AARCH64_FEATURE_1_PAC. As PAC is a contract between the loader
and static linker no warning is given if it is not present in an input.
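
A rough C++ sketch of how the two options interact with the merged feature
word follows. It mirrors the behaviour described above (warn and force the BTI
bit per object, set the PAC bit silently at the end), but InputObject, Options
and computeAndFeatures are hypothetical names, and the final masking to the two
known bits is a simplification of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

constexpr uint32_t FEATURE_1_BTI = 1u << 0;
constexpr uint32_t FEATURE_1_PAC = 1u << 1;

struct InputObject {
  std::string name;
  uint32_t features; // bits read from the object's .note.gnu.property
};

struct Options {
  bool forceBti = false; // --force-bti
  bool pacPlt = false;   // --pac-plt
};

// Hypothetical helper: merge feature bits, applying the two options.
// Assumption of this sketch: at least one relocatable input is present.
uint32_t computeAndFeatures(const std::vector<InputObject> &objects,
                            const Options &opts,
                            std::vector<std::string> &warnings) {
  assert(!objects.empty());
  uint32_t merged = ~0u;
  for (const InputObject &obj : objects) {
    uint32_t features = obj.features;
    if (opts.forceBti && !(features & FEATURE_1_BTI)) {
      // --force-bti: warn, then treat the object as if it had the bit.
      warnings.push_back(obj.name +
                         ": --force-bti: file does not have BTI property");
      features |= FEATURE_1_BTI;
    }
    merged &= features;
  }
  // --pac-plt: no warning, the contract is with the dynamic loader only.
  if (opts.pacPlt)
    merged |= FEATURE_1_PAC;
  return merged & (FEATURE_1_BTI | FEATURE_1_PAC);
}
```

With inputs a.o (BTI and PAC) and b.o (no properties), --force-bti alone gives
a BTI PLT plus one warning for b.o, while adding --pac-plt selects the BTIPAC
PLT without further diagnostics.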

Two processor-specific dynamic tags are used to communicate that a
non-standard PLT sequence is being used:
DT_AARCH64_BTI_PLT and DT_AARCH64_PAC_PLT.

Differential Revision: https://reviews.llvm.org/D62609

llvm-svn: 362793
smithp35 committed Jun 7, 2019
1 parent f2ddd60 commit e208208
Showing 24 changed files with 961 additions and 12 deletions.
158 changes: 154 additions & 4 deletions lld/ELF/Arch/AArch64.cpp
@@ -28,7 +28,7 @@ uint64_t elf::getAArch64Page(uint64_t Expr) {
}

namespace {
-class AArch64 final : public TargetInfo {
+class AArch64 : public TargetInfo {
public:
AArch64();
RelExpr getRelExpr(RelType Type, const Symbol &S,
@@ -431,7 +431,157 @@ void AArch64::relaxTlsIeToLe(uint8_t *Loc, RelType Type, uint64_t Val) const {
llvm_unreachable("invalid relocation for TLS IE to LE relaxation");
}

-TargetInfo *elf::getAArch64TargetInfo() {
-static AArch64 Target;
-return &Target;
// AArch64 may use security features in variant PLT sequences. These are:
// Pointer Authentication (PAC), introduced in armv8.3-a and Branch Target
// Identification (BTI) introduced in armv8.5-a. The additional instructions
// used in the variant PLT sequences are encoded in the Hint space so they can
// be deployed on older architectures, which treat the instructions as a nop.
// PAC and BTI can be combined leading to the following combinations:
// writePltHeader
// writePltHeaderBti (no PAC Header needed)
// writePlt
// writePltBti (BTI only)
// writePltPac (PAC only)
// writePltBtiPac (BTI and PAC)
//
// When PAC is enabled the dynamic loader encrypts the address that it places
// in the .got.plt using the pacia1716 instruction which encrypts the value in
// x17 using the modifier in x16. The static linker places autia1716 before the
// indirect branch to x17 to authenticate the address in x17 with the modifier
// in x16. This makes it more difficult for an attacker to modify the value in
// the .got.plt.
//
// When BTI is enabled all indirect branches must land on a bti instruction.
// The static linker must place a bti instruction at the start of any PLT entry
// that may be the target of an indirect branch. As the PLT entries call the
// lazy resolver indirectly, the PLT header must have a bti instruction at its
// start. In general a bti instruction is not needed for a PLT entry as
// indirect calls are resolved to the function address and not the PLT entry
// for the function. There are a small number of cases where the PLT address
// can escape, such as taking the address of a function or ifunc via a non
// got-generating relocation, and a shared library refers to that symbol.
//
// We use the bti c variant of the instruction which permits indirect branches
// (br) via x16/x17 and indirect function calls (blr) via any register. The ABI
// guarantees that all indirect branches from code requiring BTI protection
// will go via x16/x17.

namespace {
class AArch64BtiPac final : public AArch64 {
public:
AArch64BtiPac();
void writePltHeader(uint8_t *Buf) const override;
void writePlt(uint8_t *Buf, uint64_t GotPltEntryAddr, uint64_t PltEntryAddr,
int32_t Index, unsigned RelOff) const override;

private:
bool BtiHeader; // bti instruction needed in PLT Header
bool BtiEntry; // bti instruction needed in PLT Entry
bool PacEntry; // autia1716 instruction needed in PLT Entry
};
} // namespace

AArch64BtiPac::AArch64BtiPac() {
BtiHeader = (Config->AndFeatures & GNU_PROPERTY_AARCH64_FEATURE_1_BTI);
// A BTI (Branch Target Identification) Plt Entry is only required if the
// address of the PLT entry can be taken by the program, which permits an
// indirect jump to the PLT entry. This can happen when the address
// of the PLT entry for a function is canonicalised due to the address of
// the function in an executable being taken by a shared library.
// FIXME: There is a potential optimization to omit the BTI if we detect
// that the address of the PLT entry isn't taken.
BtiEntry = BtiHeader && !Config->Shared;
PacEntry = (Config->AndFeatures & GNU_PROPERTY_AARCH64_FEATURE_1_PAC);

if (BtiEntry || PacEntry)
PltEntrySize = 24;
}

void AArch64BtiPac::writePltHeader(uint8_t *Buf) const {
const uint8_t BtiData[] = { 0x5f, 0x24, 0x03, 0xd5 }; // bti c
const uint8_t PltData[] = {
0xf0, 0x7b, 0xbf, 0xa9, // stp x16, x30, [sp,#-16]!
0x10, 0x00, 0x00, 0x90, // adrp x16, Page(&(.plt.got[2]))
0x11, 0x02, 0x40, 0xf9, // ldr x17, [x16, Offset(&(.plt.got[2]))]
0x10, 0x02, 0x00, 0x91, // add x16, x16, Offset(&(.plt.got[2]))
0x20, 0x02, 0x1f, 0xd6, // br x17
0x1f, 0x20, 0x03, 0xd5, // nop
0x1f, 0x20, 0x03, 0xd5 // nop
};
const uint8_t NopData[] = { 0x1f, 0x20, 0x03, 0xd5 }; // nop

uint64_t Got = In.GotPlt->getVA();
uint64_t Plt = In.Plt->getVA();

if (BtiHeader) {
// PltHeader is called indirectly by Plt[N]. Prefix PltData with a BTI C
// instruction.
memcpy(Buf, BtiData, sizeof(BtiData));
Buf += sizeof(BtiData);
Plt += sizeof(BtiData);
}
memcpy(Buf, PltData, sizeof(PltData));

relocateOne(Buf + 4, R_AARCH64_ADR_PREL_PG_HI21,
getAArch64Page(Got + 16) - getAArch64Page(Plt + 8));
relocateOne(Buf + 8, R_AARCH64_LDST64_ABS_LO12_NC, Got + 16);
relocateOne(Buf + 12, R_AARCH64_ADD_ABS_LO12_NC, Got + 16);
if (!BtiHeader)
// We didn't add the BTI c instruction so round out size with NOP.
memcpy(Buf + sizeof(PltData), NopData, sizeof(NopData));
}

void AArch64BtiPac::writePlt(uint8_t *Buf, uint64_t GotPltEntryAddr,
uint64_t PltEntryAddr, int32_t Index,
unsigned RelOff) const {
// The PLT entry is of the form:
// [BtiData] AddrInst (PacBr | StdBr) [NopData]
const uint8_t BtiData[] = { 0x5f, 0x24, 0x03, 0xd5 }; // bti c
const uint8_t AddrInst[] = {
0x10, 0x00, 0x00, 0x90, // adrp x16, Page(&(.plt.got[n]))
0x11, 0x02, 0x40, 0xf9, // ldr x17, [x16, Offset(&(.plt.got[n]))]
0x10, 0x02, 0x00, 0x91 // add x16, x16, Offset(&(.plt.got[n]))
};
const uint8_t PacBr[] = {
0x9f, 0x21, 0x03, 0xd5, // autia1716
0x20, 0x02, 0x1f, 0xd6 // br x17
};
const uint8_t StdBr[] = {
0x20, 0x02, 0x1f, 0xd6, // br x17
0x1f, 0x20, 0x03, 0xd5 // nop
};
const uint8_t NopData[] = { 0x1f, 0x20, 0x03, 0xd5 }; // nop

if (BtiEntry) {
memcpy(Buf, BtiData, sizeof(BtiData));
Buf += sizeof(BtiData);
PltEntryAddr += sizeof(BtiData);
}

memcpy(Buf, AddrInst, sizeof(AddrInst));
relocateOne(Buf, R_AARCH64_ADR_PREL_PG_HI21,
getAArch64Page(GotPltEntryAddr) -
getAArch64Page(PltEntryAddr));
relocateOne(Buf + 4, R_AARCH64_LDST64_ABS_LO12_NC, GotPltEntryAddr);
relocateOne(Buf + 8, R_AARCH64_ADD_ABS_LO12_NC, GotPltEntryAddr);

if (PacEntry)
memcpy(Buf + sizeof(AddrInst), PacBr, sizeof(PacBr));
else
memcpy(Buf + sizeof(AddrInst), StdBr, sizeof(StdBr));
if (!BtiEntry)
// We didn't add the BTI c instruction so round out size with NOP.
memcpy(Buf + sizeof(AddrInst) + sizeof(StdBr), NopData, sizeof(NopData));
}

static TargetInfo *getTargetInfo() {
if (Config->AndFeatures & (GNU_PROPERTY_AARCH64_FEATURE_1_BTI |
GNU_PROPERTY_AARCH64_FEATURE_1_PAC)) {
static AArch64BtiPac T;
return &T;
}
static AArch64 T;
return &T;
}

TargetInfo *elf::getAArch64TargetInfo() { return getTargetInfo(); }
2 changes: 2 additions & 0 deletions lld/ELF/Config.h
@@ -147,6 +147,7 @@ struct Configuration {
bool ExecuteOnly;
bool ExportDynamic;
bool FixCortexA53Errata843419;
bool ForceBTI;
bool FormatBinary = false;
bool RequireCET;
bool GcSections;
@@ -168,6 +169,7 @@ struct Configuration {
bool OFormatBinary;
bool Omagic;
bool OptRemarksWithHotness;
bool PacPlt;
bool PicThunk;
bool Pie;
bool PrintGcSections;
30 changes: 28 additions & 2 deletions lld/ELF/Driver.cpp
@@ -337,6 +337,13 @@ static void checkOptions() {

if (Config->ZRetpolineplt && Config->RequireCET)
error("--require-cet may not be used with -z retpolineplt");

if (Config->EMachine != EM_AARCH64) {
if (Config->PacPlt)
error("--pac-plt only supported on AArch64");
if (Config->ForceBTI)
error("--force-bti only supported on AArch64");
}
}

static const char *getReproduceOption(opt::InputArgList &Args) {
@@ -816,6 +823,7 @@ static void readConfigs(opt::InputArgList &Args) {
Config->FilterList = args::getStrings(Args, OPT_filter);
Config->Fini = Args.getLastArgValue(OPT_fini, "_fini");
Config->FixCortexA53Errata843419 = Args.hasArg(OPT_fix_cortex_a53_843419);
Config->ForceBTI = Args.hasArg(OPT_force_bti);
Config->RequireCET = Args.hasArg(OPT_require_cet);
Config->GcSections = Args.hasFlag(OPT_gc_sections, OPT_no_gc_sections, false);
Config->GnuUnique = Args.hasFlag(OPT_gnu_unique, OPT_no_gnu_unique, true);
@@ -851,6 +859,7 @@ static void readConfigs(opt::InputArgList &Args) {
Config->Optimize = args::getInteger(Args, OPT_O, 1);
Config->OrphanHandling = getOrphanHandling(Args);
Config->OutputFile = Args.getLastArgValue(OPT_o);
Config->PacPlt = Args.hasArg(OPT_pac_plt);
Config->Pie = Args.hasFlag(OPT_pie, OPT_no_pie, false);
Config->PrintIcfSections =
Args.hasFlag(OPT_print_icf_sections, OPT_no_print_icf_sections, false);
@@ -1594,20 +1603,32 @@ static void wrapSymbols(ArrayRef<WrappedSymbol> Wrapped) {
// with CET.
//
// This function returns the merged feature flags. If 0, we cannot enable CET.
// This is also the case with AArch64's BTI and PAC which use the similar
// GNU_PROPERTY_AARCH64_FEATURE_1_AND mechanism.
//
// Note that the CET-aware PLT is not implemented yet. We do error
// check only.
template <class ELFT> static uint32_t getAndFeatures() {
-if (Config->EMachine != EM_386 && Config->EMachine != EM_X86_64)
+if (Config->EMachine != EM_386 && Config->EMachine != EM_X86_64 &&
+Config->EMachine != EM_AARCH64)
return 0;

uint32_t Ret = -1;
for (InputFile *F : ObjectFiles) {
uint32_t Features = cast<ObjFile<ELFT>>(F)->AndFeatures;
-if (!Features && Config->RequireCET)
+if (Config->ForceBTI && !(Features & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)) {
+warn(toString(F) + ": --force-bti: file does not have BTI property");
+Features |= GNU_PROPERTY_AARCH64_FEATURE_1_BTI;
+} else if (!Features && Config->RequireCET)
error(toString(F) + ": --require-cet: file is not compatible with CET");
Ret &= Features;
}

// Force enable pointer authentication Plt, we don't warn in this case as
// this does not require support in the object for correctness.
if (Config->PacPlt)
Ret |= GNU_PROPERTY_AARCH64_FEATURE_1_PAC;

return Ret;
}

@@ -1793,6 +1814,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
// contain a hint to tweak linker's and loader's behaviors.
Config->AndFeatures = getAndFeatures<ELFT>();

// The Target instance handles target-specific stuff, such as applying
// relocations or writing a PLT section. It also contains target-dependent
// values such as a default image base address.
Target = getTarget();

Config->EFlags = Target->calcEFlags();
// MaxPageSize (sometimes called abi page size) is the maximum page size that
// the output can be run on. For example if the OS can use 4k or 64k page
11 changes: 8 additions & 3 deletions lld/ELF/InputFiles.cpp
@@ -787,6 +787,10 @@ static uint32_t readAndFeatures(ObjFile<ELFT> *Obj, ArrayRef<uint8_t> Data) {
continue;
}

uint32_t FeatureAndType = Config->EMachine == EM_AARCH64
? GNU_PROPERTY_AARCH64_FEATURE_1_AND
: GNU_PROPERTY_X86_FEATURE_1_AND;

// Read a body of a NOTE record, which consists of type-length-value fields.
ArrayRef<uint8_t> Desc = Note.getDesc();
while (!Desc.empty()) {
@@ -796,7 +800,7 @@ static uint32_t readAndFeatures(ObjFile<ELFT> *Obj, ArrayRef<uint8_t> Data) {
uint32_t Type = read32le(Desc.data());
uint32_t Size = read32le(Desc.data() + 4);

-if (Type == GNU_PROPERTY_X86_FEATURE_1_AND) {
+if (Type == FeatureAndType) {
// We found a FEATURE_1_AND field. There may be more than one of these
// in a .note.gnu.property section, for a relocatable object we
// accumulate the bits set.
@@ -966,8 +970,9 @@ InputSectionBase *ObjFile<ELFT>::createInputSection(const Elf_Shdr &Sec) {
if (Name == ".note.GNU-stack")
return &InputSection::Discarded;

-// If an object file is compatible with Intel Control-Flow Enforcement
-// Technology (CET), it has a .note.gnu.property section containing the
+// Object files that use processor features such as Intel Control-Flow
+// Enforcement (CET) or AArch64 Branch Target Identification (BTI) use a
+// .note.gnu.property section containing a bitfield of feature bits like the
// GNU_PROPERTY_X86_FEATURE_1_IBT flag. Read a bitmap containing the flag.
//
// Since we merge bitmaps from multiple object files to create a new
6 changes: 6 additions & 0 deletions lld/ELF/Options.td
@@ -175,6 +175,9 @@ def fix_cortex_a53_843419: F<"fix-cortex-a53-843419">,
// is not complete.
def require_cet: F<"require-cet">;

def force_bti: F<"force-bti">,
HelpText<"Force enable AArch64 BTI in PLT, warn if Input ELF file does not have GNU_PROPERTY_AARCH64_FEATURE_1_BTI property">;

defm format: Eq<"format", "Change the input format of the inputs following this option">,
MetaVarName<"[default,elf,binary]">;

@@ -269,6 +272,9 @@ defm pack_dyn_relocs:
Eq<"pack-dyn-relocs", "Pack dynamic relocations in the given format">,
MetaVarName<"[none,android,relr,android+relr]">;

def pac_plt: F<"pac-plt">,
HelpText<"AArch64 only, use pointer authentication in PLT">;

defm use_android_relr_tags: B<"use-android-relr-tags",
"Use SHT_ANDROID_RELR / DT_ANDROID_RELR* tags instead of SHT_RELR / DT_RELR*",
"Use SHT_RELR / DT_RELR* tags (default)">;
18 changes: 15 additions & 3 deletions lld/ELF/SyntheticSections.cpp
@@ -290,8 +290,9 @@ static size_t getHashSize() {

// This class represents a linker-synthesized .note.gnu.property section.
//
-// In x86, object files may contain feature flags indicating the features that
-// they are using. The flags are stored in a .note.gnu.property section.
+// In x86 and AArch64, object files may contain feature flags indicating the
+// features that they have used. The flags are stored in a .note.gnu.property
+// section.
//
// lld reads the sections from input files and merges them by computing AND of
// the flags. The result is written as a new .note.gnu.property section.
@@ -304,11 +305,15 @@ GnuPropertySection::GnuPropertySection()
".note.gnu.property") {}

void GnuPropertySection::writeTo(uint8_t *Buf) {
uint32_t FeatureAndType = Config->EMachine == EM_AARCH64
? GNU_PROPERTY_AARCH64_FEATURE_1_AND
: GNU_PROPERTY_X86_FEATURE_1_AND;

write32(Buf, 4); // Name size
write32(Buf + 4, Config->Is64 ? 16 : 12); // Content size
write32(Buf + 8, NT_GNU_PROPERTY_TYPE_0); // Type
memcpy(Buf + 12, "GNU", 4); // Name string
-write32(Buf + 16, GNU_PROPERTY_X86_FEATURE_1_AND); // Feature type
+write32(Buf + 16, FeatureAndType); // Feature type
write32(Buf + 20, 4); // Feature size
write32(Buf + 24, Config->AndFeatures); // Feature flags
if (Config->Is64)
@@ -1340,6 +1345,13 @@ template <class ELFT> void DynamicSection<ELFT>::finalizeContents() {
addInt(DT_PLTREL, Config->IsRela ? DT_RELA : DT_REL);
}

if (Config->EMachine == EM_AARCH64) {
if (Config->AndFeatures & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
addInt(DT_AARCH64_BTI_PLT, 0);
if (Config->AndFeatures & GNU_PROPERTY_AARCH64_FEATURE_1_PAC)
addInt(DT_AARCH64_PAC_PLT, 0);
}

addInSec(DT_SYMTAB, In.DynSymTab);
addInt(DT_SYMENT, sizeof(Elf_Sym));
addInSec(DT_STRTAB, In.DynStrTab);
4 changes: 4 additions & 0 deletions lld/docs/ld.lld.1
@@ -182,6 +182,8 @@ Set the
field to the specified value.
.It Fl -fini Ns = Ns Ar symbol
Specify a finalizer function.
.It Fl -force-bti
Force enable AArch64 BTI instruction in PLT, warn if Input ELF file does not have GNU_PROPERTY_AARCH64_FEATURE_1_BTI property.
.It Fl -format Ns = Ns Ar input-format , Fl b Ar input-format
Specify the format of the inputs following this option.
.Ar input-format
@@ -382,6 +384,8 @@ is the default. If
.Fl -use-android-relr-tags
is specified, use SHT_ANDROID_RELR instead of SHT_RELR.
.Pp
.It Fl -pac-plt
AArch64 only, use pointer authentication in PLT.
.It Fl -pic-veneer
Always generate position independent thunks.
.It Fl -pie , Fl -pic-executable
8 changes: 8 additions & 0 deletions lld/test/ELF/Inputs/aarch64-addrifunc.s
@@ -0,0 +1,8 @@
.text
.globl myfunc
.globl func1
.type func1, %function
func1:
adrp x8, :got: myfunc
ldr x8, [x8, :got_lo12: myfunc]
ret
