40 commits
81c5d46
[MLIR][NVVM] Propagate verification failure for unsupported SM target…
Men-cotton Dec 1, 2025
036279a
[lldb][debugserver] Return shared cache filepath in jGetSharedCacheIn…
jasonmolenda Dec 1, 2025
9416b19
[InstCombine] Add missing constant check (#170068)
dtcxzyw Dec 1, 2025
dc5ce79
[LV] Regenerate some check lines. NFC
lukel97 Dec 1, 2025
bbb0dba
[clang][AST] Add `RecordDecl::getNumFields()` (#170022)
tbaederr Dec 1, 2025
a751ed9
[BOLT] Support runtime library hook via DT_INIT_ARRAY (#167467)
vleonen Dec 1, 2025
4d7abe5
[mlir][arith] Add support for `cmpf` to `ArithToAPFloat` (#169753)
matthias-springer Dec 1, 2025
17677ad
[LV] Don't create WidePtrAdd recipes for scalar VFs (#169344)
david-arm Dec 1, 2025
7ce7141
[NFC][Linalg] Follow-up on ConvMatchBuilder (#170080)
Abhishek-Varma Dec 1, 2025
f67b018
[mlir][SPIRV] Improve ub.unreachable lowering test case (#170083)
matthias-springer Dec 1, 2025
05b1989
[mlir][arith] Add support for `negf` to `ArithToAPFloat` (#169759)
matthias-springer Dec 1, 2025
9afb651
Adding support for iterator in motion clauses. (#159112)
ShashwathiNavada Dec 1, 2025
147c466
[mlir][arith] Add support for min/max to `ArithToAPFloat` (#169760)
matthias-springer Dec 1, 2025
eb711d8
[clang-tidy][doc] Fix incorrect link syntax in cppcoreguidelines-pro-…
carlosgalvezp Dec 1, 2025
8079d03
[CAS] Temporarily skip tests on old windows version (#170063)
cachemeifyoucan Dec 1, 2025
8e6fb0e
Reapply "[BOLT][BTI] Skip inlining BasicBlocks containing indirect ta…
bgergely0 Dec 1, 2025
dda15ad
[mlir][spirv] Use MapVector for BlockMergeInfoMap (#169636)
IgWod-IMG Dec 1, 2025
1317083
[AArch64][SME] Support saving/restoring ZT0 in the MachineSMEABIPass …
MacDue Dec 1, 2025
34c44f2
[flang][TBAA] refine TARGET/POINTER encoding (#169544)
tblah Dec 1, 2025
8ec2112
[OMPIRBuilder] re-land cancel barriers patch #164586 (#169931)
tblah Dec 1, 2025
2c9e9ff
[SCCP] Handle llvm.experimental.get.vector.length calls (#169527)
lukel97 Dec 1, 2025
b162099
[clang][bytecode] Fix discarding ImplitiValueInitExprs (#170089)
tbaederr Dec 1, 2025
d1500d1
[SelectionDAG] Add SelectionDAG::getTypeSize. NFC (#169764)
lukel97 Dec 1, 2025
b7721c5
[RISCV] Remove the duplicate for RV32/RV64 in zicond-fp-select-zfinx.…
tclin914 Dec 1, 2025
8ceeba8
[MLIR][SCF] Canonicalize redundant scf.if from scf.while before regio…
NexMing Dec 1, 2025
29fef3a
[BOLT] Improve DWARF CFI generation for pac-ret binaries (#163381)
bgergely0 Dec 1, 2025
2c21790
Revert "[MLIR][SCF] Sink scf.if from scf.while before region into aft…
NexMing Dec 1, 2025
b60a84a
Revert "[flang][TBAA] refine TARGET/POINTER encoding" (#170105)
tblah Dec 1, 2025
bf22687
[OMPIRBuilder] CANCEL IF(FALSE) is still a cancellation point (#170095)
tblah Dec 1, 2025
6c0a02f
[X86] Add tests showing failure to concat sqrt intrinsics together. (…
RKSimon Dec 1, 2025
0e721b7
[X86] Add tests showing failure to concat RCPPS + RSQRTPS intrinsics …
RKSimon Dec 1, 2025
edd1856
[WebAssembly] Optimize away mask of 63 for shl ( zext (and i32 63))) …
badumbatish Dec 1, 2025
130746a
[MLIR] Fix build after #169982 (#170107)
jplehr Dec 1, 2025
577cd6f
[LIT] Workaround the 60 processed limit on Windows (#157759)
joker-eph Dec 1, 2025
48931e5
[clang][bytecode] Check memcmp builtin for one-past-the-end pointers …
tbaederr Dec 1, 2025
d0df51b
[ConstantRange] Allow casting to the same bitwidth. NFC (#170102)
lukel97 Dec 1, 2025
5877020
[DA] Clean up unnecessary member function declarations (#170106)
kasuga-fj Dec 1, 2025
6157d46
[MLIR|BUILD]: Fix for 8ceeba838 (#170110)
sohaibiftikhar Dec 1, 2025
989ac4c
[X86] Add tests showing failure to concat fp rounding intrinsics toge…
RKSimon Dec 1, 2025
efdebf8
merge main into amd-staging
ronlieb Dec 1, 2025
9 changes: 9 additions & 0 deletions bolt/docs/CommandLineArgumentReference.md
@@ -811,6 +811,15 @@

Specify file name of the runtime instrumentation library

- `--runtime-lib-init-hook=<value>`

Primary target for hooking runtime library initialization, used in
fallback order of availability in input binary (entry_point -> init
-> init_array) (default: entry_point)
- `entry_point`: use ELF Header Entry Point
- `init`: use ELF DT_INIT entry
- `init_array`: use ELF .init_array entry

- `--sctc-mode=<value>`

Mode for simplify conditional tail calls
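
The fallback order documented for `--runtime-lib-init-hook` above could be modelled roughly as follows. This is only a sketch: `InitHook`, `AvailableHooks`, and `selectInitHook` are hypothetical names, not BOLT APIs, and it assumes the search starts at the requested target and only moves forward through `entry_point -> init -> init_array`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical names for illustration; none of these come from BOLT itself.
enum class InitHook { EntryPoint, Init, InitArray };

struct AvailableHooks {
  bool HasEntryPoint; // ELF header entry point is usable
  bool HasInit;       // DT_INIT is present
  bool HasInitArray;  // .init_array is present
};

// Assumption: start at the requested target and walk the documented order
// (entry_point -> init -> init_array) until an available hook is found.
std::optional<InitHook> selectInitHook(InitHook Requested,
                                       const AvailableHooks &Avail) {
  switch (Requested) {
  case InitHook::EntryPoint:
    if (Avail.HasEntryPoint)
      return InitHook::EntryPoint;
    [[fallthrough]];
  case InitHook::Init:
    if (Avail.HasInit)
      return InitHook::Init;
    [[fallthrough]];
  case InitHook::InitArray:
    if (Avail.HasInitArray)
      return InitHook::InitArray;
    break;
  }
  // Nothing at or after the requested target is available.
  return std::nullopt;
}
```

Under that assumption, a binary without a usable entry point but with `DT_INIT` would resolve the default `entry_point` request to `init`.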
21 changes: 14 additions & 7 deletions bolt/docs/PacRetDesign.md
@@ -200,15 +200,22 @@ This pass runs after optimizations. It performns the _inverse_ of MarkRAState pa
Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have
to know what RA state these have.

The current solution has the `inferUnknownStates` function to cover these, using
a fairly simple strategy: unknown states inherit the last known state.
> [!important]
> As issue #160989 explains, unwind info is missing from stubs.
> For this same reason, we cannot generate correct pac-specific unwind info: the
> signedness of the _incorrect_ return address is meaningless.

This will be updated to a more robust solution.
Assignment of RAStates to newly generated instructions is done in `inferUnknownStates`.
We have two different cases to cover:

> [!important]
> As issue #160989 describes, unwind info is incorrect in stubs with multiple callers.
> For this same reason, we cannot generate correct pac-specific unwind info: the signess
> of the _incorrect_ return address is meaningless.
1. If a BasicBlock has some instructions with known RA state, and some without, we
can copy the RAState of known instructions to the unknown ones. As the control
flow only changes between BasicBlocks, instructions in the same BasicBlock have
the same return address. (The exception is noreturn calls, but these would only
cause problems if the newly inserted instruction is right after the call.)

2. If a BasicBlock has no instructions with known RAState, we have to copy the
RAState of the previous BasicBlock in layout order.
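
The two cases above can be illustrated on a toy model. This sketch is not the BOLT implementation: `Block`, `firstKnown`, and `inferStates` are hypothetical names, a block is just a vector of optional states, and signing/authenticating instructions are deliberately ignored.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy model (hypothetical): one optional RA state per instruction, where
// std::nullopt marks a newly inserted instruction with unknown state.
using Block = std::vector<std::optional<bool>>;

// Returns the first known state in the block, or std::nullopt if none.
std::optional<bool> firstKnown(const Block &BB) {
  for (const std::optional<bool> &S : BB)
    if (S.has_value())
      return S;
  return std::nullopt;
}

void inferStates(std::vector<Block> &Layout, bool InitialState) {
  // Case 1: blocks with at least one known state. Leading unknowns take the
  // first known state; later unknowns take the state of the previous inst.
  for (Block &BB : Layout) {
    std::optional<bool> Known = firstKnown(BB);
    if (!Known.has_value())
      continue; // fully unknown block, handled in case 2
    bool Current = *Known;
    for (std::optional<bool> &S : BB) {
      if (S.has_value())
        Current = *S;
      else
        S = Current;
    }
  }
  // Case 2: fully unknown blocks copy the state at the end of the previous
  // block in layout order (or the function's initial state if there is none).
  bool Prev = InitialState;
  for (Block &BB : Layout) {
    if (BB.empty())
      continue;
    if (!firstKnown(BB).has_value())
      for (std::optional<bool> &S : BB)
        S = Prev;
    Prev = *BB.back(); // every instruction has a state at this point
  }
}
```

With a layout of `{?, true, ?}`, `{?, ?}`, `{false, ?}`, the first block becomes all `true`, the second (fully unknown) block inherits `true` from the block before it, and the third becomes all `false`.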

### Optimizations requiring special attention

9 changes: 9 additions & 0 deletions bolt/include/bolt/Core/BinaryContext.h
@@ -807,6 +807,15 @@ class BinaryContext {
/// the execution of the binary is completed.
std::optional<uint64_t> FiniFunctionAddress;

/// DT_INIT.
std::optional<uint64_t> InitAddress;

/// DT_INIT_ARRAY. Only used when DT_INIT is not set.
std::optional<uint64_t> InitArrayAddress;

/// DT_INIT_ARRAYSZ. Only used when DT_INIT is not set.
std::optional<uint64_t> InitArraySize;

/// DT_FINI.
std::optional<uint64_t> FiniAddress;

25 changes: 23 additions & 2 deletions bolt/include/bolt/Passes/InsertNegateRAStatePass.h
@@ -1,4 +1,4 @@
//===- bolt/Passes/InsertNegateRAStatePass.cpp ----------------------------===//
//===- bolt/Passes/InsertNegateRAStatePass.h ------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -30,9 +30,30 @@ class InsertNegateRAState : public BinaryFunctionPass {
private:
/// Because states are tracked as MCAnnotations on individual instructions,
/// newly inserted instructions do not have a state associated with them.
/// New states are "inherited" from the last known state.
/// Uses fillUnknownStateInBB and fillUnknownStubs.
void inferUnknownStates(BinaryFunction &BF);

/// Simple case: copy RAStates to unknown insts from previous inst.
/// If the first inst has unknown state, set it to the first known state.
/// Accounts for signing and authenticating insts.
void fillUnknownStateInBB(BinaryContext &BC, BinaryBasicBlock &BB);

/// Fill in RAState in BasicBlocks consisting entirely of new instructions.
/// As of #160989, we have to copy the RAState from the previous BB in the
/// layout, because CFIs are already incorrect here.
void fillUnknownStubs(BinaryFunction &BF);

/// Returns the first known RAState from \p BB, or std::nullopt if all are
/// unknown.
std::optional<bool> getFirstKnownRAState(BinaryContext &BC,
BinaryBasicBlock &BB);

/// Returns true if all instructions in \p BB have unknown RAState.
bool isUnknownBlock(BinaryContext &BC, BinaryBasicBlock &BB);

/// Set all instructions in \p BB to \p State.
void markUnknownBlock(BinaryContext &BC, BinaryBasicBlock &BB, bool State);

/// Support for function splitting:
/// if two consecutive BBs with Signed state are going to end up in different
/// functions (so are held by different FunctionFragments), we have to add a
11 changes: 10 additions & 1 deletion bolt/include/bolt/Rewrite/RewriteInstance.h
@@ -93,14 +93,23 @@ class RewriteInstance {
/// section allocations if found.
void discoverBOLTReserved();

/// Check whether we should use DT_INIT or DT_INIT_ARRAY for instrumentation.
/// DT_INIT is preferred; DT_INIT_ARRAY is only used when no DT_INIT entry was
/// found.
Error discoverRtInitAddress();

/// Check whether we should use DT_FINI or DT_FINI_ARRAY for instrumentation.
/// DT_FINI is preferred; DT_FINI_ARRAY is only used when no DT_FINI entry was
/// found.
Error discoverRtFiniAddress();

/// If DT_INIT_ARRAY is used for instrumentation, update the relocation of its
/// first entry to point to the instrumentation library's init address.
Error updateRtInitReloc();

/// If DT_FINI_ARRAY is used for instrumentation, update the relocation of its
/// first entry to point to the instrumentation library's fini address.
void updateRtFiniReloc();
Error updateRtFiniReloc();

/// Create and initialize metadata rewriters for this instance.
void initializeMetadataManager();
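
The preference described in the `discoverRtInitAddress` comment (DT_INIT first, DT_INIT_ARRAY only when no DT_INIT entry was found) might look like the sketch below. `DynamicInitInfo`, `RtInitKind`, and `chooseRtInit` are hypothetical names; the struct only mirrors the three optional fields added to `BinaryContext`. Treating a missing or zero DT_INIT_ARRAYSZ as an error is an assumption made here because there would be no first entry to retarget; the actual error handling in the patch is not shown in this diff.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical mirror of the optional fields added to BinaryContext.
struct DynamicInitInfo {
  std::optional<uint64_t> InitAddress;      // DT_INIT
  std::optional<uint64_t> InitArrayAddress; // DT_INIT_ARRAY
  std::optional<uint64_t> InitArraySize;    // DT_INIT_ARRAYSZ
};

enum class RtInitKind { Init, InitArray };

// Returns which dynamic entry to hook, or std::nullopt with Err filled in.
// The error conditions are assumptions for illustration only.
std::optional<RtInitKind> chooseRtInit(const DynamicInitInfo &Info,
                                       std::string &Err) {
  // DT_INIT is preferred whenever it is present.
  if (Info.InitAddress.has_value())
    return RtInitKind::Init;
  if (!Info.InitArrayAddress.has_value()) {
    Err = "neither DT_INIT nor DT_INIT_ARRAY found";
    return std::nullopt;
  }
  // Assumed check: an empty array has no first entry to retarget.
  if (!Info.InitArraySize.has_value() || *Info.InitArraySize == 0) {
    Err = "DT_INIT_ARRAY present but empty";
    return std::nullopt;
  }
  return RtInitKind::InitArray;
}
```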
26 changes: 26 additions & 0 deletions bolt/lib/Passes/Inliner.cpp
@@ -491,6 +491,32 @@ bool Inliner::inlineCallsInFunction(BinaryFunction &Function) {
}
}

// AArch64 BTI:
// If the callee has an indirect tailcall (BR), we would transform it to
// an indirect call (BLR) in InlineCall. Because of this, we would have to
// update the BTI at the target of the tailcall. However, these targets
// are not known. Instead, we skip inlining blocks with indirect
// tailcalls.
auto HasIndirectTailCall = [&](const BinaryFunction &BF) -> bool {
for (const auto &BB : BF) {
for (const auto &II : BB) {
if (BC.MIB->isIndirectBranch(II) && BC.MIB->isTailCall(II)) {
return true;
}
}
}
return false;
};

if (BC.isAArch64() && BC.usesBTI() &&
HasIndirectTailCall(*TargetFunction)) {
++InstIt;
LLVM_DEBUG(dbgs() << "BOLT-DEBUG: Skipping inlining block with tailcall"
<< " in " << Function << " : " << BB->getName()
<< " to keep BTIs consistent.\n");
continue;
}
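
As a standalone illustration of the check added above, the sketch below models the same decision on a toy instruction type. `ToyInst`, `ToyFunction`, `hasIndirectTailCall`, and `shouldSkipInlining` are hypothetical names; in the patch the per-instruction queries go through `BC.MIB->isIndirectBranch` and `BC.MIB->isTailCall`.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins (hypothetical) for MCInst, BinaryBasicBlock, BinaryFunction.
struct ToyInst {
  bool IsIndirectBranch;
  bool IsTailCall;
};
using ToyBlock = std::vector<ToyInst>;
using ToyFunction = std::vector<ToyBlock>;

// True if any instruction in the callee is both an indirect branch and a
// tail call, i.e. an indirect tail call.
bool hasIndirectTailCall(const ToyFunction &F) {
  for (const ToyBlock &BB : F)
    for (const ToyInst &I : BB)
      if (I.IsIndirectBranch && I.IsTailCall)
        return true;
  return false;
}

// Inlining is skipped only when all three conditions hold: the target is
// AArch64, the binary uses BTI, and the callee has an indirect tail call.
bool shouldSkipInlining(bool IsAArch64, bool UsesBTI,
                        const ToyFunction &Callee) {
  return IsAArch64 && UsesBTI && hasIndirectTailCall(Callee);
}
```

A callee with a direct tail call only, or one inlined into a binary that does not use BTI, is still eligible for inlining under this check.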

LLVM_DEBUG(dbgs() << "BOLT-DEBUG: inlining call to " << *TargetFunction
<< " in " << Function << " : " << BB->getName()
<< ". Count: " << BB->getKnownExecutionCount()
147 changes: 124 additions & 23 deletions bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -52,8 +52,8 @@ void InsertNegateRAState::runOnFunction(BinaryFunction &BF) {
MCInst &Inst = *It;
if (BC.MIB->isCFI(Inst))
continue;
auto RAState = BC.MIB->getRAState(Inst);
if (!RAState) {
std::optional<bool> RAState = BC.MIB->getRAState(Inst);
if (!RAState.has_value()) {
BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates "
<< " in function " << BF.getPrintName() << "\n";
PassFailed = true;
@@ -74,6 +74,20 @@ void InsertNegateRAState::runOnFunction(BinaryFunction &BF) {
}
}

void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
BinaryContext &BC = BF.getBinaryContext();

// Fill in missing RAStates in simple cases (inside BBs).
for (BinaryBasicBlock &BB : BF) {
fillUnknownStateInBB(BC, BB);
}
// BasicBlocks which are made entirely of "new instructions" (instructions
// without RAState annotation) are stubs, and do not have correct unwind info.
// We should iterate in layout order and fill them based on previous known
// RAState.
fillUnknownStubs(BF);
}

void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
FunctionFragment &FF) {
BinaryContext &BC = BF.getBinaryContext();
@@ -92,8 +106,8 @@ void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
// If a function is already split in the input, the first FF can also start
// with Signed state. This covers that scenario as well.
auto II = (*FirstNonEmpty)->getFirstNonPseudo();
auto RAState = BC.MIB->getRAState(*II);
if (!RAState) {
std::optional<bool> RAState = BC.MIB->getRAState(*II);
if (!RAState.has_value()) {
BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates "
<< " in function " << BF.getPrintName() << "\n";
PassFailed = true;
@@ -104,32 +118,119 @@ void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
MCCFIInstruction::createNegateRAState(nullptr));
}

void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
std::optional<bool>
InsertNegateRAState::getFirstKnownRAState(BinaryContext &BC,
BinaryBasicBlock &BB) {
for (const MCInst &Inst : BB) {
if (BC.MIB->isCFI(Inst))
continue;
std::optional<bool> RAState = BC.MIB->getRAState(Inst);
if (RAState.has_value())
return RAState;
}
return std::nullopt;
}

bool InsertNegateRAState::isUnknownBlock(BinaryContext &BC,
BinaryBasicBlock &BB) {
std::optional<bool> FirstRAState = getFirstKnownRAState(BC, BB);
return !FirstRAState.has_value();
}

void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
BinaryBasicBlock &BB) {

auto First = BB.getFirstNonPseudo();
if (First == BB.end())
return;
// If the first instruction has unknown RAState, we should copy the first
// known RAState.
std::optional<bool> RAState = BC.MIB->getRAState(*First);
if (!RAState.has_value()) {
std::optional<bool> FirstRAState = getFirstKnownRAState(BC, BB);
if (!FirstRAState.has_value())
// We fill unknown BBs later.
return;

BC.MIB->setRAState(*First, *FirstRAState);
}

// At this point we know the RAState of the first instruction,
// so we can propagate the RAStates to all subsequent unknown instructions.
MCInst Prev = *First;
for (auto It = First + 1; It != BB.end(); ++It) {
MCInst &Inst = *It;
if (BC.MIB->isCFI(Inst))
continue;

// No need to check for nullopt: we only entered this loop after the first
// instruction had its RAState set, and RAState is always set for the
// previous instruction in the previous iteration of the loop.
std::optional<bool> PrevRAState = BC.MIB->getRAState(Prev);

std::optional<bool> RAState = BC.MIB->getRAState(Inst);
if (!RAState.has_value()) {
if (BC.MIB->isPSignOnLR(Prev))
PrevRAState = true;
else if (BC.MIB->isPAuthOnLR(Prev))
PrevRAState = false;
BC.MIB->setRAState(Inst, *PrevRAState);
}
Prev = Inst;
}
}

void InsertNegateRAState::markUnknownBlock(BinaryContext &BC,
BinaryBasicBlock &BB, bool State) {
// If we call this when an Instruction has either kRASigned or kRAUnsigned
// annotation, setRASigned or setRAUnsigned would fail.
assert(isUnknownBlock(BC, BB) &&
"markUnknownBlock should only be called on unknown blocks");
for (MCInst &Inst : BB) {
if (BC.MIB->isCFI(Inst))
continue;
BC.MIB->setRAState(Inst, State);
}
}

void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
BinaryContext &BC = BF.getBinaryContext();
bool FirstIter = true;
MCInst PrevInst;
for (BinaryBasicBlock &BB : BF) {
for (MCInst &Inst : BB) {
if (BC.MIB->isCFI(Inst))
continue;
for (FunctionFragment &FF : BF.getLayout().fragments()) {
for (BinaryBasicBlock *BB : FF) {
if (FirstIter) {
FirstIter = false;
if (isUnknownBlock(BC, *BB))
// If the first BasicBlock is unknown, the function's entry RAState
// should be used.
markUnknownBlock(BC, *BB, BF.getInitialRAState());
} else if (isUnknownBlock(BC, *BB)) {
// As explained in issue #160989, the unwind info is incorrect for
// stubs. Indicating the correct RAState without the rest of the unwind
// info being correct is not useful. Instead, we copy the RAState from
// the previous instruction.
std::optional<bool> PrevRAState = BC.MIB->getRAState(PrevInst);
if (!PrevRAState.has_value()) {
// No non-cfi instruction encountered in the function yet.
// This means the RAState is the same as at the function entry.
markUnknownBlock(BC, *BB, BF.getInitialRAState());
continue;
}

auto RAState = BC.MIB->getRAState(Inst);
if (!FirstIter && !RAState) {
if (BC.MIB->isPSignOnLR(PrevInst))
RAState = true;
PrevRAState = true;
else if (BC.MIB->isPAuthOnLR(PrevInst))
RAState = false;
else {
auto PrevRAState = BC.MIB->getRAState(PrevInst);
RAState = PrevRAState ? *PrevRAState : false;
}
BC.MIB->setRAState(Inst, *RAState);
} else {
FirstIter = false;
if (!RAState)
BC.MIB->setRAState(Inst, BF.getInitialRAState());
PrevRAState = false;
markUnknownBlock(BC, *BB, *PrevRAState);
}
PrevInst = Inst;
// This function iterates on BasicBlocks, so the PrevInst has to be
// updated to the last instruction of the current BasicBlock. If the
// BasicBlock is empty, or only has PseudoInstructions, PrevInst will not
// be updated.
auto Last = BB->getLastNonPseudo();
if (Last != BB->rend())
PrevInst = *Last;
}
}
}
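
The in-block propagation in `fillUnknownStateInBB`, including the handling of signing and authenticating instructions, can be traced on a toy model. This sketch is an approximation under stated assumptions: `PacKind`, `PacInst`, and `fillUnknownInBlock` are hypothetical names, CFI and pseudo instructions are left out, and the `SignLR`/`AuthLR` kinds stand in for the `isPSignOnLR` and `isPAuthOnLR` queries.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical toy instruction: its kind plus an optional RA state.
enum class PacKind { Other, SignLR, AuthLR };

struct PacInst {
  PacKind Kind;
  std::optional<bool> RAState; // std::nullopt means unknown
};

void fillUnknownInBlock(std::vector<PacInst> &BB) {
  if (BB.empty())
    return;
  // If the first instruction is unknown, seed it with the first known state.
  if (!BB.front().RAState.has_value()) {
    std::optional<bool> First;
    for (const PacInst &I : BB) {
      if (I.RAState.has_value()) {
        First = I.RAState;
        break;
      }
    }
    if (!First.has_value())
      return; // fully unknown block: filled by a later step
    BB.front().RAState = First;
  }
  // Every earlier instruction now has a state, so Prev.RAState is always set.
  for (std::size_t Idx = 1; Idx < BB.size(); ++Idx) {
    PacInst &Inst = BB[Idx];
    if (Inst.RAState.has_value())
      continue;
    const PacInst &Prev = BB[Idx - 1];
    bool State = *Prev.RAState;
    if (Prev.Kind == PacKind::SignLR)
      State = true; // return address is signed after a signing inst
    else if (Prev.Kind == PacKind::AuthLR)
      State = false; // and unsigned again after an authenticating inst
    Inst.RAState = State;
  }
}
```

An unknown instruction placed right after a signing instruction therefore becomes signed even when the signing instruction itself is annotated as unsigned.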