26 changes: 19 additions & 7 deletions .github/workflows/pr-code-format.yml
@@ -1,12 +1,13 @@
name: "Check code formatting"

permissions:
contents: read

on:
pull_request_target:
pull_request:
branches:
- main

permissions:
pull-requests: write

jobs:
code_formatter:
runs-on: ubuntu-latest
@@ -31,12 +32,13 @@ jobs:
separator: ","
skip_initial_fetch: true

# We need to make sure that we aren't executing/using any code from the
# PR for security reasons as we're using pull_request_target. Checkout
# the target branch with the necessary files.
# We need to pull the script from the main branch, so that we ensure
# we get the latest version of this script.
- name: Fetch code formatting utils
uses: actions/checkout@v4
with:
repository: ${{ github.repository }}
ref: ${{ github.base_ref }}
sparse-checkout: |
llvm/utils/git/requirements_formatting.txt
llvm/utils/git/code-format-helper.py
@@ -75,10 +77,20 @@ jobs:
# to take advantage of the new --diff_from_common_commit option
# explicitly in code-format-helper.py and not have to diff starting at
# the merge base.
# Create an empty comments file so the pr-write job doesn't fail.
run: |
echo "[]" > comments &&
python ./code-format-tools/llvm/utils/git/code-format-helper.py \
--write-comment-to-file \
--token ${{ secrets.GITHUB_TOKEN }} \
--issue-number $GITHUB_PR_NUMBER \
--start-rev $(git merge-base $START_REV $END_REV) \
--end-rev $END_REV \
--changed-files "$CHANGED_FILES"
- uses: actions/upload-artifact@26f96dfa697d77e81fd5907df203aa23a56210a8 #v4.3.0
if: always()
with:
name: workflow-args
path: |
comments
2 changes: 1 addition & 1 deletion .github/workflows/release-lit.yml
@@ -58,7 +58,7 @@ jobs:
cd llvm/utils/lit
# Remove 'dev' suffix from lit version.
sed -i 's/ + "dev"//g' lit/__init__.py
python3 setup.py sdist
python3 setup.py sdist bdist_wheel
- name: Upload lit to test.pypi.org
uses: pypa/gh-action-pypi-publish@release/v1
2 changes: 1 addition & 1 deletion .github/workflows/scorecard.yml
@@ -36,7 +36,7 @@ jobs:
persist-credentials: false

- name: "Run analysis"
uses: ossf/scorecard-action@e38b1902ae4f44df626f11ba0734b14fb91f8f86 # v2.1.2
uses: ossf/scorecard-action@0864cf19026789058feabb7e87baa5f140aac736 # v2.3.1
with:
results_file: results.sarif
results_format: sarif
71 changes: 42 additions & 29 deletions bolt/docs/BAT.md
@@ -42,21 +42,21 @@ and [BoltAddressTranslation.cpp](/bolt/lib/Profile/BoltAddressTranslation.cpp).
### Layout
The general layout is as follows:
```
Hot functions table header
|------------------|
| Function entry |
| |--------------| |
| | OutOff InOff | |
| |--------------| |
~~~~~~~~~~~~~~~~~~~~
Hot functions table
Cold functions table
Cold functions table header
Functions table:
|------------------|
| Function entry |
| |--------------| |
| | OutOff InOff | |
| |--------------| |
~~~~~~~~~~~~~~~~~~~~
| |
| Address |
| translation |
| table |
| |
| Secondary entry |
| points |
|------------------|
```

### Functions table
@@ -74,30 +74,43 @@ internal offsets, and between hot and cold fragments, to better spread deltas
and save space.

Hot indices are delta encoded, implicitly starting at zero.
| Entry | Encoding | Description |
| ------ | ------| ----------- |
| `Address` | Continuous, Delta, ULEB128 | Function address in the output binary |
| `HotIndex` | Delta, ULEB128 | Cold functions only: index of corresponding hot function in hot functions table |
| `FuncHash` | 8b | Hot functions only: function hash for input function |
| `NumEntries` | ULEB128 | Number of address translation entries for a function |
| `EqualElems` | ULEB128 | Hot functions only: number of equal offsets in the beginning of a function |
| `BranchEntries` | Bitmask, `alignTo(EqualElems, 8)` bits | Hot functions only: if `EqualElems` is non-zero, bitmask denoting entries with `BRANCHENTRY` bit |

Function header is followed by `EqualElems` offsets (hot functions only) and
`NumEntries-EqualElems` (`NumEntries` for cold functions) pairs of offsets for
current function.
| Entry | Encoding | Description | Hot/Cold |
| ------ | ------| ----------- | ------ |
| `Address` | Continuous, Delta, ULEB128 | Function address in the output binary | Both |
| `HotIndex` | Delta, ULEB128 | Index of corresponding hot function in hot functions table | Cold |
| `FuncHash` | 8b | Function hash for input function | Hot |
| `NumBlocks` | ULEB128 | Number of basic blocks in the original function | Hot |
| `NumSecEntryPoints` | ULEB128 | Number of secondary entry points in the original function | Hot |
| `NumEntries` | ULEB128 | Number of address translation entries for a function | Both |
| `EqualElems` | ULEB128 | Number of equal offsets in the beginning of a function | Hot |
| `BranchEntries` | Bitmask, `alignTo(EqualElems, 8)` bits | If `EqualElems` is non-zero, bitmask denoting entries with `BRANCHENTRY` bit | Hot |

Function header is followed by *Address Translation Table* with `NumEntries`
total entries, and *Secondary Entry Points* table with `NumSecEntryPoints`
entries (hot functions only).
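
The `BranchEntries` row gives the bitmask width as `alignTo(EqualElems, 8)` bits. A minimal sketch of the resulting byte count follows; `alignToMultiple` and `branchEntriesBytes` are hypothetical names used only for illustration and are not taken from the BOLT sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round Value up to the next multiple of Align (Align must be non-zero).
// Hypothetical stand-in for an alignTo-style helper.
inline uint64_t alignToMultiple(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Value + Align - 1) / Align * Align;
}

// Size in bytes of a bitmask that is alignTo(EqualElems, 8) bits wide.
inline size_t branchEntriesBytes(uint64_t EqualElems) {
  return static_cast<size_t>(alignToMultiple(EqualElems, 8) / 8);
}
```

Under this reading, a function with nine equal leading offsets would carry a two-byte bitmask, and a function with none would carry no bitmask at all.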

### Address translation table
Delta encoding means that only the difference with the previous corresponding
entry is encoded. Input offsets implicitly start at zero.
| Entry | Encoding | Description |
| ------ | ------| ----------- |
| `OutputOffset` | Continuous, Delta, ULEB128 | Function offset in output binary |
| `InputOffset` | Optional, Delta, SLEB128 | Function offset in input binary with `BRANCHENTRY` LSB bit |
| `BBHash` | Optional, 8b | Basic block entries only: basic block hash in input binary |
| Entry | Encoding | Description | Branch/BB |
| ------ | ------| ----------- | ------ |
| `OutputOffset` | Continuous, Delta, ULEB128 | Function offset in output binary | Both |
| `InputOffset` | Optional, Delta, SLEB128 | Function offset in input binary with `BRANCHENTRY` LSB bit | Both |
| `BBHash` | Optional, 8b | Basic block hash in input binary | BB |
| `BBIdx` | Optional, Delta, ULEB128 | Basic block index in input binary | BB |

For hot fragments, the table omits the first `EqualElems` input offsets
where the input offset equals output offset.

`BRANCHENTRY` bit denotes whether a given offset pair is a control flow source
(branch or call instruction). If not set, it signifies a control flow target
(basic block offset).
`InputAddr` is omitted for equal offsets in input and output function. In this
case, `BRANCHENTRY` bits are encoded separately in a `BranchEntries` bitvector.
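
The sketch below illustrates one way an input offset and the `BRANCHENTRY` flag could share a single value. It assumes the flag occupies the least significant bit and the offset the remaining bits; the helper names are hypothetical, and the exact packing used by BOLT should be checked against `BoltAddressTranslation.cpp`.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the BRANCHENTRY constant declared in BoltAddressTranslation.h.
constexpr uint32_t BRANCHENTRY = 0x1;

// Assumption: the offset is shifted left by one to make room for the flag.
inline uint32_t packInputOffset(uint32_t InputOffset, bool IsBranch) {
  assert(InputOffset <= (UINT32_MAX >> 1) && "offset must fit in 31 bits");
  return (InputOffset << 1) | (IsBranch ? BRANCHENTRY : 0u);
}

// True if the packed value marks a control flow source (branch or call).
inline bool isBranchEntry(uint32_t Packed) {
  return (Packed & BRANCHENTRY) != 0;
}

// Recover the input offset by dropping the flag bit.
inline uint32_t unpackInputOffset(uint32_t Packed) { return Packed >> 1; }
```

With this layout a basic block entry and a branch entry at the same input offset differ only in the low bit, which keeps the deltas between consecutive entries small.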

### Secondary Entry Points table
The table is emitted for hot fragments only. It contains `NumSecEntryPoints`
offsets denoting secondary entry points, delta encoded, implicitly starting at zero.
| Entry | Encoding | Description |
| ----- | -------- | ----------- |
| `SecEntryPoint` | Delta, ULEB128 | Secondary entry point offset |
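
As an illustration of the "Delta, ULEB128" encoding used for this table, here is a self-contained sketch. The function names are hypothetical and the code does not reproduce the BOLT writer; it only assumes that offsets are sorted in ascending order and that the first delta is taken relative to zero, as stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append Value as ULEB128: 7 payload bits per byte, with the high bit set on
// every byte except the last.
inline void encodeULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
}

// Decode one ULEB128 value starting at Pos and advance Pos past it.
// Assumes well-formed input of at most ten bytes per value.
inline uint64_t decodeULEB128(const std::vector<uint8_t> &In, size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (true) {
    uint8_t Byte = In.at(Pos++);
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break;
    Shift += 7;
  }
  return Value;
}

// Delta-encode ascending offsets, implicitly starting at zero.
inline std::vector<uint8_t>
encodeSecEntryPoints(const std::vector<uint32_t> &Offsets) {
  std::vector<uint8_t> Out;
  uint32_t Prev = 0;
  for (uint32_t Off : Offsets) {
    assert(Off >= Prev && "offsets must be sorted in ascending order");
    encodeULEB128(Off - Prev, Out);
    Prev = Off;
  }
  return Out;
}

// Inverse of encodeSecEntryPoints: read Count deltas and accumulate them.
inline std::vector<uint32_t>
decodeSecEntryPoints(const std::vector<uint8_t> &In, size_t Count) {
  std::vector<uint32_t> Offsets;
  size_t Pos = 0;
  uint32_t Prev = 0;
  for (size_t I = 0; I < Count; ++I) {
    Prev += static_cast<uint32_t>(decodeULEB128(In, Pos));
    Offsets.push_back(Prev);
  }
  return Offsets;
}
```

For example, the offsets 16, 48 and 300 become the deltas 16, 32 and 252, which occupy four bytes in total because only the last delta needs a second ULEB128 byte.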
1 change: 0 additions & 1 deletion bolt/include/bolt/Core/AddressMap.h
@@ -14,7 +14,6 @@
#ifndef BOLT_CORE_ADDRESS_MAP_H
#define BOLT_CORE_ADDRESS_MAP_H

#include "llvm/ADT/StringRef.h"
#include "llvm/MC/MCSymbol.h"

#include <optional>
3 changes: 2 additions & 1 deletion bolt/include/bolt/Core/BinaryContext.h
@@ -265,7 +265,8 @@ class BinaryContext {

public:
static Expected<std::unique_ptr<BinaryContext>>
createBinaryContext(const ObjectFile *File, bool IsPIC,
createBinaryContext(Triple TheTriple, StringRef InputFileName,
SubtargetFeatures *Features, bool IsPIC,
std::unique_ptr<DWARFContext> DwCtx,
JournalingStreams Logger);

1 change: 0 additions & 1 deletion bolt/include/bolt/Core/BinaryData.h
@@ -18,7 +18,6 @@
#include "llvm/ADT/Twine.h"
#include "llvm/MC/MCSymbol.h"
#include "llvm/Support/raw_ostream.h"
#include <algorithm>
#include <string>
#include <vector>

1 change: 0 additions & 1 deletion bolt/include/bolt/Core/BinaryDomTree.h
@@ -16,7 +16,6 @@

#include "bolt/Core/BinaryBasicBlock.h"
#include "llvm/IR/Dominators.h"
#include "llvm/Support/GenericDomTreeConstruction.h"

namespace llvm {
namespace bolt {
16 changes: 13 additions & 3 deletions bolt/include/bolt/Core/BinaryFunction.h
@@ -27,6 +27,7 @@

#include "bolt/Core/BinaryBasicBlock.h"
#include "bolt/Core/BinaryContext.h"
#include "bolt/Core/BinaryDomTree.h"
#include "bolt/Core/BinaryLoop.h"
#include "bolt/Core/BinarySection.h"
#include "bolt/Core/DebugData.h"
@@ -51,7 +52,6 @@
#include <iterator>
#include <limits>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

@@ -266,6 +266,7 @@ class BinaryFunction {
BinaryContext &BC;

std::unique_ptr<BinaryLoopInfo> BLI;
std::unique_ptr<BinaryDominatorTree> BDT;

/// All labels in the function that are referenced via relocations from
/// data objects. Typically these are jump table destinations and computed
@@ -838,6 +839,14 @@ class BinaryFunction {
/// stats.
void calculateMacroOpFusionStats();

/// Returns if BinaryDominatorTree has been constructed for this function.
bool hasDomTree() const { return BDT != nullptr; }

BinaryDominatorTree &getDomTree() { return *BDT.get(); }

/// Constructs DomTree for this function.
void constructDomTree();

/// Returns if loop detection has been run for this function.
bool hasLoopInfo() const { return BLI != nullptr; }

@@ -1159,7 +1168,7 @@ class BinaryFunction {
/// Pass an offset of the entry point in the input binary and a corresponding
/// global symbol to the callback function.
///
/// Return true of all callbacks returned true, false otherwise.
/// Return true if all callbacks returned true, false otherwise.
bool forEachEntryPoint(EntryPointCallbackTy Callback) const;

/// Return MC symbol associated with the end of the function.
@@ -1393,7 +1402,8 @@ class BinaryFunction {

/// Return true if the function has CFI instructions
bool hasCFI() const {
return !FrameInstructions.empty() || !CIEFrameInstructions.empty();
return !FrameInstructions.empty() || !CIEFrameInstructions.empty() ||
IsInjected;
}

/// Return unique number associated with the function.
2 changes: 1 addition & 1 deletion bolt/include/bolt/Core/BinaryLoop.h
@@ -15,7 +15,7 @@
#ifndef BOLT_CORE_BINARY_LOOP_H
#define BOLT_CORE_BINARY_LOOP_H

#include "llvm/Support/GenericLoopInfoImpl.h"
#include "llvm/Support/GenericLoopInfo.h"

namespace llvm {
namespace bolt {
1 change: 0 additions & 1 deletion bolt/include/bolt/Core/BinarySection.h
@@ -18,7 +18,6 @@
#include "bolt/Core/DebugData.h"
#include "bolt/Core/Relocation.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/BinaryFormat/ELF.h"
#include "llvm/Object/ELFObjectFile.h"
#include "llvm/Object/MachO.h"
1 change: 0 additions & 1 deletion bolt/include/bolt/Core/DebugData.h
@@ -27,7 +27,6 @@
#include <mutex>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

13 changes: 12 additions & 1 deletion bolt/include/bolt/Core/DebugNames.h
@@ -14,7 +14,7 @@
#ifndef BOLT_CORE_DEBUG_NAMES_H
#define BOLT_CORE_DEBUG_NAMES_H

#include "DebugData.h"
#include "bolt/Core/DebugData.h"
#include "llvm/CodeGen/AccelTable.h"

namespace llvm {
Expand Down Expand Up @@ -68,6 +68,16 @@ class DWARF5AcceleratorTable {
std::unique_ptr<DebugBufferVector> releaseBuffer() {
return std::move(FullTableBuffer);
}
/// Adds a DIE that is referenced across CUs.
void addCrossCUDie(const DIE *Die) {
CrossCUDies.insert({Die->getOffset(), Die});
}
/// Returns true if the DIE can generate an entry for a cross-CU reference.
/// This only checks TAGs of a DIE because when this is invoked DIE might not
/// be fully constructed.
bool canGenerateEntryWithCrossCUReference(
const DWARFUnit &Unit, const DIE &Die,
const DWARFAbbreviationDeclaration::AttributeSpec &AttrSpec);

private:
BinaryContext &BC;
@@ -128,6 +138,7 @@ class DWARF5AcceleratorTable {
llvm::DenseMap<uint64_t, uint32_t> CUOffsetsToPatch;
// Contains a map of Entry ID to Entry relative offset.
llvm::DenseMap<uint64_t, uint32_t> EntryRelativeOffsets;
llvm::DenseMap<uint64_t, const DIE *> CrossCUDies;
/// Adds Unit to either CUList, LocalTUList or ForeignTUList.
/// Input Unit being processed, and DWO ID if Unit is being processed comes
/// from a DWO section.
1 change: 0 additions & 1 deletion bolt/include/bolt/Core/FunctionLayout.h
@@ -25,7 +25,6 @@
#include "llvm/ADT/iterator.h"
#include "llvm/ADT/iterator_range.h"
#include <iterator>
#include <utility>

namespace llvm {
namespace bolt {
2 changes: 0 additions & 2 deletions bolt/include/bolt/Core/MCPlus.h
@@ -14,10 +14,8 @@
#ifndef BOLT_CORE_MCPLUS_H
#define BOLT_CORE_MCPLUS_H

#include "llvm/CodeGen/TargetOpcodes.h"
#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCInst.h"
#include "llvm/Support/Casting.h"
#include <vector>

namespace llvm {
6 changes: 3 additions & 3 deletions bolt/include/bolt/Core/MCPlusBuilder.h
@@ -19,6 +19,7 @@
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/CodeGen/TargetOpcodes.h"
#include "llvm/MC/MCAsmBackend.h"
#include "llvm/MC/MCDisassembler/MCSymbolizer.h"
#include "llvm/MC/MCExpr.h"
@@ -27,6 +28,7 @@
#include "llvm/MC/MCInstrDesc.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/Support/Allocator.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/ErrorOr.h"
#include "llvm/Support/RWMutex.h"
@@ -533,9 +535,7 @@ class MCPlusBuilder {
return Analysis->isReturn(Inst);
}

virtual bool isTerminator(const MCInst &Inst) const {
return Analysis->isTerminator(Inst);
}
virtual bool isTerminator(const MCInst &Inst) const;

virtual bool isNoop(const MCInst &Inst) const {
llvm_unreachable("not implemented");
1 change: 0 additions & 1 deletion bolt/include/bolt/Passes/BinaryPasses.h
@@ -18,7 +18,6 @@
#include "bolt/Core/DynoStats.h"
#include "llvm/Support/CommandLine.h"
#include <atomic>
#include <map>
#include <set>
#include <string>
#include <unordered_set>
1 change: 0 additions & 1 deletion bolt/include/bolt/Passes/CacheMetrics.h
@@ -13,7 +13,6 @@
#ifndef BOLT_PASSES_CACHEMETRICS_H
#define BOLT_PASSES_CACHEMETRICS_H

#include <cstdint>
#include <vector>

namespace llvm {
1 change: 0 additions & 1 deletion bolt/include/bolt/Passes/DominatorAnalysis.h
@@ -11,7 +11,6 @@

#include "bolt/Passes/DataflowAnalysis.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Timer.h"

namespace opts {
extern llvm::cl::opt<bool> TimeOpts;
2 changes: 0 additions & 2 deletions bolt/include/bolt/Passes/ReachingDefOrUse.h
@@ -11,9 +11,7 @@

#include "bolt/Passes/DataflowAnalysis.h"
#include "bolt/Passes/RegAnalysis.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Timer.h"
#include <optional>

namespace opts {
1 change: 0 additions & 1 deletion bolt/include/bolt/Passes/ReachingInsns.h
@@ -11,7 +11,6 @@

#include "bolt/Passes/DataflowAnalysis.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Timer.h"

namespace opts {
extern llvm::cl::opt<bool> TimeOpts;
1 change: 0 additions & 1 deletion bolt/include/bolt/Passes/ReorderUtils.h
@@ -14,7 +14,6 @@
#ifndef BOLT_PASSES_REORDER_UTILS_H
#define BOLT_PASSES_REORDER_UTILS_H

#include <memory>
#include <vector>

#include "llvm/ADT/BitVector.h"
137 changes: 127 additions & 10 deletions bolt/include/bolt/Profile/BoltAddressTranslation.h
@@ -19,6 +19,7 @@
#include <unordered_map>

namespace llvm {
class MCSymbol;
class raw_ostream;

namespace object {
@@ -115,23 +116,23 @@ class BoltAddressTranslation {
/// Save function and basic block hashes used for metadata dump.
void saveMetadata(BinaryContext &BC);

/// Returns BB hash by function output address (after BOLT) and basic block
/// input offset.
size_t getBBHash(uint64_t FuncOutputAddress, uint32_t BBInputOffset) const;

/// Returns BF hash by function output address (after BOLT).
size_t getBFHash(uint64_t OutputAddress) const;

/// True if a given \p Address is a function with translation table entry.
bool isBATFunction(uint64_t Address) const { return Maps.count(Address); }

/// For a given \p Symbol in the output binary and known \p InputOffset
/// return a corresponding pair of parent BinaryFunction and secondary entry
/// point in it.
std::pair<const BinaryFunction *, unsigned>
translateSymbol(const BinaryContext &BC, const MCSymbol &Symbol,
uint32_t InputOffset) const;

private:
/// Helper to update \p Map by inserting one or more BAT entries reflecting
/// \p BB for function located at \p FuncAddress. At least one entry will be
/// emitted for the start of the BB. More entries may be emitted to cover
/// the location of calls or any instruction that may change control flow.
void writeEntriesForBB(MapTy &Map, const BinaryBasicBlock &BB,
uint64_t FuncAddress);
uint64_t FuncInputAddress, uint64_t FuncOutputAddress);

/// Write the serialized address translation table for a function.
template <bool Cold>
@@ -154,8 +155,15 @@ class BoltAddressTranslation {

std::map<uint64_t, MapTy> Maps;

using BBHashMap = std::unordered_map<uint32_t, size_t>;
std::unordered_map<uint64_t, std::pair<size_t, BBHashMap>> FuncHashes;
/// Map a function to its basic blocks count
std::unordered_map<uint64_t, size_t> NumBasicBlocksMap;

/// Map a function to its secondary entry points vector
std::unordered_map<uint64_t, std::vector<uint32_t>> SecondaryEntryPointsMap;

/// Return a secondary entry point ID for a function located at \p Address and
/// \p Offset within that function.
unsigned getSecondaryEntryPointId(uint64_t Address, uint32_t Offset) const;

/// Links outlined cold blocks to their original function
std::map<uint64_t, uint64_t> ColdPartSource;
@@ -166,6 +174,115 @@ class BoltAddressTranslation {
/// Identifies the address of a control-flow changing instruction in a
/// translation map entry
const static uint32_t BRANCHENTRY = 0x1;

public:
/// Map basic block input offset to a basic block index and hash pair.
class BBHashMapTy {
class EntryTy {
unsigned Index;
size_t Hash;

public:
unsigned getBBIndex() const { return Index; }
size_t getBBHash() const { return Hash; }
EntryTy(unsigned Index, size_t Hash) : Index(Index), Hash(Hash) {}
};

std::map<uint32_t, EntryTy> Map;
const EntryTy &getEntry(uint32_t BBInputOffset) const {
auto It = Map.find(BBInputOffset);
assert(It != Map.end());
return It->second;
}

public:
bool isInputBlock(uint32_t InputOffset) const {
return Map.count(InputOffset);
}

unsigned getBBIndex(uint32_t BBInputOffset) const {
return getEntry(BBInputOffset).getBBIndex();
}

size_t getBBHash(uint32_t BBInputOffset) const {
return getEntry(BBInputOffset).getBBHash();
}

void addEntry(uint32_t BBInputOffset, unsigned BBIndex, size_t BBHash) {
Map.emplace(BBInputOffset, EntryTy(BBIndex, BBHash));
}

size_t getNumBasicBlocks() const { return Map.size(); }

auto begin() const { return Map.begin(); }
auto end() const { return Map.end(); }
auto upper_bound(uint32_t Offset) const { return Map.upper_bound(Offset); }
};

/// Map function output address to its hash and basic blocks hash map.
class FuncHashesTy {
class EntryTy {
size_t Hash;
BBHashMapTy BBHashMap;

public:
size_t getBFHash() const { return Hash; }
const BBHashMapTy &getBBHashMap() const { return BBHashMap; }
EntryTy(size_t Hash) : Hash(Hash) {}
};

std::unordered_map<uint64_t, EntryTy> Map;
const EntryTy &getEntry(uint64_t FuncOutputAddress) const {
auto It = Map.find(FuncOutputAddress);
assert(It != Map.end());
return It->second;
}

public:
size_t getBFHash(uint64_t FuncOutputAddress) const {
return getEntry(FuncOutputAddress).getBFHash();
}

const BBHashMapTy &getBBHashMap(uint64_t FuncOutputAddress) const {
return getEntry(FuncOutputAddress).getBBHashMap();
}

void addEntry(uint64_t FuncOutputAddress, size_t BFHash) {
Map.emplace(FuncOutputAddress, EntryTy(BFHash));
}

size_t getNumFunctions() const { return Map.size(); };

size_t getNumBasicBlocks() const {
size_t NumBasicBlocks{0};
for (auto &I : Map)
NumBasicBlocks += I.second.getBBHashMap().getNumBasicBlocks();
return NumBasicBlocks;
}
};

/// Returns BF hash by function output address (after BOLT).
size_t getBFHash(uint64_t FuncOutputAddress) const {
return FuncHashes.getBFHash(FuncOutputAddress);
}

/// Returns BBHashMap by function output address (after BOLT).
const BBHashMapTy &getBBHashMap(uint64_t FuncOutputAddress) const {
return FuncHashes.getBBHashMap(FuncOutputAddress);
}

BBHashMapTy &getBBHashMap(uint64_t FuncOutputAddress) {
return const_cast<BBHashMapTy &>(
std::as_const(*this).getBBHashMap(FuncOutputAddress));
}

/// Returns the number of basic blocks in a function.
size_t getNumBasicBlocks(uint64_t OutputAddress) const {
return NumBasicBlocksMap.at(OutputAddress);
}

private:
FuncHashesTy FuncHashes;
};
} // namespace bolt

16 changes: 9 additions & 7 deletions bolt/include/bolt/Profile/DataAggregator.h
@@ -225,6 +225,10 @@ class DataAggregator : public DataReader {
/// Aggregation statistics
uint64_t NumInvalidTraces{0};
uint64_t NumLongRangeTraces{0};
/// Specifies how many samples were recorded in cold areas if we are dealing
/// with profiling data collected in a bolted binary. For LBRs, incremented
/// for the source of the branch to avoid counting cold activity twice (one
/// for source and another for destination).
uint64_t NumColdSamples{0};

/// Looks into system PATH for Linux Perf and set up the aggregator to use it
@@ -245,14 +249,12 @@ class DataAggregator : public DataReader {
/// disassembled BinaryFunctions
BinaryFunction *getBinaryFunctionContainingAddress(uint64_t Address) const;

/// Perform BAT translation for a given \p Func and return the parent
/// BinaryFunction or nullptr.
BinaryFunction *getBATParentFunction(const BinaryFunction &Func) const;

/// Retrieve the location name to be used for samples recorded in \p Func.
/// If doing BAT translation, link cold parts to the hot part names (used by
/// the original binary). \p Count specifies how many samples were recorded
/// at that location, so we can tally total activity in cold areas if we are
/// dealing with profiling data collected in a bolted binary. For LBRs,
/// \p Count should only be used for the source of the branch to avoid
/// counting cold activity twice (one for source and another for destination).
StringRef getLocationName(BinaryFunction &Func, uint64_t Count);
StringRef getLocationName(const BinaryFunction &Func) const;

/// Semantic actions - parser hooks to interpret parsed perf samples
/// Register a sample (non-LBR mode), i.e. a new hit at \p Address
2 changes: 1 addition & 1 deletion bolt/include/bolt/Profile/ProfileReaderBase.h
@@ -65,7 +65,7 @@ class ProfileReaderBase {
/// Return true if the function \p BF may have a profile available.
/// The result is based on the name(s) of the function alone and the profile
/// match is not guaranteed.
virtual bool mayHaveProfileData(const BinaryFunction &BF);
virtual bool mayHaveProfileData(const BinaryFunction &BF) { return true; }

/// Return true if the profile contains an entry for a local object
/// that has an associated file name.
1 change: 0 additions & 1 deletion bolt/include/bolt/Profile/ProfileYAMLMapping.h
@@ -14,7 +14,6 @@
#define BOLT_PROFILE_PROFILEYAMLMAPPING_H

#include "bolt/Core/BinaryFunction.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/YAMLTraits.h"
#include <vector>

13 changes: 11 additions & 2 deletions bolt/include/bolt/Profile/YAMLProfileWriter.h
@@ -15,6 +15,7 @@

namespace llvm {
namespace bolt {
class BoltAddressTranslation;
class RewriteInstance;

class YAMLProfileWriter {
@@ -31,8 +32,16 @@ class YAMLProfileWriter {
/// Save execution profile for that instance.
std::error_code writeProfile(const RewriteInstance &RI);

static yaml::bolt::BinaryFunctionProfile convert(const BinaryFunction &BF,
bool UseDFS);
static yaml::bolt::BinaryFunctionProfile
convert(const BinaryFunction &BF, bool UseDFS,
const BoltAddressTranslation *BAT = nullptr);

/// Set CallSiteInfo destination fields from \p Symbol and return a target
/// BinaryFunction for that symbol.
static const BinaryFunction *
setCSIDestination(const BinaryContext &BC, yaml::bolt::CallSiteInfo &CSI,
const MCSymbol *Symbol, const BoltAddressTranslation *BAT,
uint32_t Offset = 0);
};

} // namespace bolt
2 changes: 0 additions & 2 deletions bolt/include/bolt/Rewrite/DWARFRewriter.h
@@ -22,9 +22,7 @@
#include <memory>
#include <mutex>
#include <optional>
#include <set>
#include <unordered_map>
#include <unordered_set>
#include <vector>

namespace llvm {
1 change: 0 additions & 1 deletion bolt/include/bolt/Rewrite/MetadataManager.h
@@ -11,7 +11,6 @@

#include "bolt/Rewrite/MetadataRewriter.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Support/Error.h"

namespace llvm {
namespace bolt {
1 change: 0 additions & 1 deletion bolt/include/bolt/Rewrite/RewriteInstance.h
@@ -17,7 +17,6 @@
#include "bolt/Core/Linker.h"
#include "bolt/Rewrite/MetadataManager.h"
#include "bolt/Utils/NameResolver.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/MC/StringTableBuilder.h"
#include "llvm/Object/ELFObjectFile.h"
#include "llvm/Object/ObjectFile.h"
1 change: 0 additions & 1 deletion bolt/include/bolt/RuntimeLibs/RuntimeLibrary.h
@@ -17,7 +17,6 @@

#include "bolt/Core/Linker.h"
#include "llvm/ADT/StringRef.h"
#include <functional>
#include <vector>

namespace llvm {
1 change: 0 additions & 1 deletion bolt/include/bolt/Utils/NameShortener.h
@@ -14,7 +14,6 @@
#define BOLT_UTILS_NAME_SHORTENER_H

#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/Twine.h"

namespace llvm {
namespace bolt {
67 changes: 41 additions & 26 deletions bolt/lib/Core/BinaryContext.cpp
@@ -14,7 +14,6 @@
#include "bolt/Core/BinaryEmitter.h"
#include "bolt/Core/BinaryFunction.h"
#include "bolt/Utils/CommandLineOpts.h"
#include "bolt/Utils/NameResolver.h"
#include "bolt/Utils/Utils.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/Twine.h"
@@ -39,7 +38,6 @@
#include <algorithm>
#include <functional>
#include <iterator>
#include <numeric>
#include <unordered_set>

using namespace llvm;
@@ -162,28 +160,30 @@ BinaryContext::~BinaryContext() {

/// Create BinaryContext for a given architecture \p ArchName and
/// triple \p TripleName.
Expected<std::unique_ptr<BinaryContext>>
BinaryContext::createBinaryContext(const ObjectFile *File, bool IsPIC,
std::unique_ptr<DWARFContext> DwCtx,
JournalingStreams Logger) {
Expected<std::unique_ptr<BinaryContext>> BinaryContext::createBinaryContext(
Triple TheTriple, StringRef InputFileName, SubtargetFeatures *Features,
bool IsPIC, std::unique_ptr<DWARFContext> DwCtx, JournalingStreams Logger) {
StringRef ArchName = "";
std::string FeaturesStr = "";
switch (File->getArch()) {
switch (TheTriple.getArch()) {
case llvm::Triple::x86_64:
if (Features)
return createFatalBOLTError(
"x86_64 target does not use SubtargetFeatures");
ArchName = "x86-64";
FeaturesStr = "+nopl";
break;
case llvm::Triple::aarch64:
if (Features)
return createFatalBOLTError(
"AArch64 target does not use SubtargetFeatures");
ArchName = "aarch64";
FeaturesStr = "+all";
break;
case llvm::Triple::riscv64: {
ArchName = "riscv64";
Expected<SubtargetFeatures> Features = File->getFeatures();

if (auto E = Features.takeError())
return std::move(E);

if (!Features)
return createFatalBOLTError("RISCV target needs SubtargetFeatures");
// We rely on relaxation for some transformations (e.g., promoting all calls
// to PseudoCALL and then making JITLink relax them). Since the relax
// feature is not stored in the object file, we manually enable it.
@@ -196,12 +196,11 @@ BinaryContext::createBinaryContext(const ObjectFile *File, bool IsPIC,
"BOLT-ERROR: Unrecognized machine in ELF file");
}

auto TheTriple = std::make_unique<Triple>(File->makeTriple());
const std::string TripleName = TheTriple->str();
const std::string TripleName = TheTriple.str();

std::string Error;
const Target *TheTarget =
TargetRegistry::lookupTarget(std::string(ArchName), *TheTriple, Error);
TargetRegistry::lookupTarget(std::string(ArchName), TheTriple, Error);
if (!TheTarget)
return createStringError(make_error_code(std::errc::not_supported),
Twine("BOLT-ERROR: ", Error));
@@ -240,13 +239,13 @@ BinaryContext::createBinaryContext(const ObjectFile *File, bool IsPIC,
Twine("BOLT-ERROR: no instruction info for target ", TripleName));

std::unique_ptr<MCContext> Ctx(
new MCContext(*TheTriple, AsmInfo.get(), MRI.get(), STI.get()));
new MCContext(TheTriple, AsmInfo.get(), MRI.get(), STI.get()));
std::unique_ptr<MCObjectFileInfo> MOFI(
TheTarget->createMCObjectFileInfo(*Ctx, IsPIC));
Ctx->setObjectFileInfo(MOFI.get());
// We do not support X86 Large code model. Change this in the future.
bool Large = false;
if (TheTriple->getArch() == llvm::Triple::aarch64)
if (TheTriple.getArch() == llvm::Triple::aarch64)
Large = true;
unsigned LSDAEncoding =
Large ? dwarf::DW_EH_PE_absptr : dwarf::DW_EH_PE_udata4;
@@ -273,7 +272,7 @@ BinaryContext::createBinaryContext(const ObjectFile *File, bool IsPIC,

int AsmPrinterVariant = AsmInfo->getAssemblerDialect();
std::unique_ptr<MCInstPrinter> InstructionPrinter(
TheTarget->createMCInstPrinter(*TheTriple, AsmPrinterVariant, *AsmInfo,
TheTarget->createMCInstPrinter(TheTriple, AsmPrinterVariant, *AsmInfo,
*MII, *MRI));
if (!InstructionPrinter)
return createStringError(
@@ -285,8 +284,8 @@ BinaryContext::createBinaryContext(const ObjectFile *File, bool IsPIC,
TheTarget->createMCCodeEmitter(*MII, *Ctx));

auto BC = std::make_unique<BinaryContext>(
std::move(Ctx), std::move(DwCtx), std::move(TheTriple), TheTarget,
std::string(TripleName), std::move(MCE), std::move(MOFI),
std::move(Ctx), std::move(DwCtx), std::make_unique<Triple>(TheTriple),
TheTarget, std::string(TripleName), std::move(MCE), std::move(MOFI),
std::move(AsmInfo), std::move(MII), std::move(STI),
std::move(InstructionPrinter), std::move(MIA), nullptr, std::move(MRI),
std::move(DisAsm), Logger);
@@ -296,7 +295,7 @@ BinaryContext::createBinaryContext(const ObjectFile *File, bool IsPIC,
BC->MAB = std::unique_ptr<MCAsmBackend>(
BC->TheTarget->createMCAsmBackend(*BC->STI, *BC->MRI, MCTargetOptions()));

BC->setFilename(File->getFileName());
BC->setFilename(InputFileName);

BC->HasFixedLoadAddress = !IsPIC;

@@ -556,6 +555,9 @@ bool BinaryContext::analyzeJumpTable(const uint64_t Address,
const uint64_t NextJTAddress,
JumpTable::AddressesType *EntriesAsAddress,
bool *HasEntryInFragment) const {
// Target address of __builtin_unreachable.
const uint64_t UnreachableAddress = BF.getAddress() + BF.getSize();

// Is one of the targets __builtin_unreachable?
bool HasUnreachable = false;

@@ -565,9 +567,15 @@ bool BinaryContext::analyzeJumpTable(const uint64_t Address,
// Number of targets other than __builtin_unreachable.
uint64_t NumRealEntries = 0;

auto addEntryAddress = [&](uint64_t EntryAddress) {
if (EntriesAsAddress)
EntriesAsAddress->emplace_back(EntryAddress);
// Size of the jump table without trailing __builtin_unreachable entries.
size_t TrimmedSize = 0;

auto addEntryAddress = [&](uint64_t EntryAddress, bool Unreachable = false) {
if (!EntriesAsAddress)
return;
EntriesAsAddress->emplace_back(EntryAddress);
if (!Unreachable)
TrimmedSize = EntriesAsAddress->size();
};

ErrorOr<const BinarySection &> Section = getSectionForAddress(Address);
@@ -619,8 +627,8 @@ bool BinaryContext::analyzeJumpTable(const uint64_t Address,
: *getPointerAtAddress(EntryAddress);

// __builtin_unreachable() case.
if (Value == BF.getAddress() + BF.getSize()) {
addEntryAddress(Value);
if (Value == UnreachableAddress) {
addEntryAddress(Value, /*Unreachable*/ true);
HasUnreachable = true;
LLVM_DEBUG(dbgs() << formatv("OK: {0:x} __builtin_unreachable\n", Value));
continue;
@@ -674,6 +682,13 @@ bool BinaryContext::analyzeJumpTable(const uint64_t Address,
addEntryAddress(Value);
}

// Trim direct/normal jump table to exclude trailing unreachable entries that
// can collide with a function address.
if (Type == JumpTable::JTT_NORMAL && EntriesAsAddress &&
TrimmedSize != EntriesAsAddress->size() &&
getBinaryFunctionAtAddress(UnreachableAddress))
EntriesAsAddress->resize(TrimmedSize);

// It's a jump table if the number of real entries is more than 1, or there's
// one real entry and one or more special targets. If there are only multiple
// special targets, then it's not a jump table.
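Reviewer note: the trimming logic in the hunks above can be read in isolation as "remember the size after the last real entry, then shrink to it". The following is a minimal sketch of that idea only; `EntryCollector` and `collectTrimmed` are hypothetical names, not part of BOLT. Unlike the patch, it trims unconditionally, whereas the patch trims only for `JTT_NORMAL` tables when a function starts at the unreachable address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the entry collection done in analyzeJumpTable.
struct EntryCollector {
  std::vector<uint64_t> Entries;
  // Number of entries up to and including the last non-unreachable one.
  std::size_t TrimmedSize = 0;

  void add(uint64_t Address, bool Unreachable = false) {
    Entries.push_back(Address);
    if (!Unreachable)
      TrimmedSize = Entries.size();
  }

  // Drops trailing unreachable entries; unreachable entries that are
  // followed by a real entry are kept.
  void trimTrailingUnreachable() {
    assert(TrimmedSize <= Entries.size());
    Entries.resize(TrimmedSize);
  }
};

// Hypothetical helper: feed raw table values through the collector and trim.
inline std::vector<uint64_t> collectTrimmed(const std::vector<uint64_t> &Raw,
                                            uint64_t UnreachableAddress) {
  EntryCollector Collector;
  for (uint64_t Value : Raw)
    Collector.add(Value, Value == UnreachableAddress);
  Collector.trimTrailingUnreachable();
  return Collector.Entries;
}
```

Tracking the size while appending avoids a second backwards scan over the table, which is presumably why the patch threads the `Unreachable` flag through `addEntryAddress`.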
20 changes: 13 additions & 7 deletions bolt/lib/Core/BinaryFunction.cpp
@@ -12,7 +12,6 @@

#include "bolt/Core/BinaryFunction.h"
#include "bolt/Core/BinaryBasicBlock.h"
#include "bolt/Core/BinaryDomTree.h"
#include "bolt/Core/DynoStats.h"
#include "bolt/Core/HashUtilities.h"
#include "bolt/Core/MCPlusBuilder.h"
@@ -35,6 +34,8 @@
#include "llvm/Object/ObjectFile.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/GenericDomTreeConstruction.h"
#include "llvm/Support/GenericLoopInfoImpl.h"
#include "llvm/Support/GraphWriter.h"
#include "llvm/Support/LEB128.h"
#include "llvm/Support/Regex.h"
@@ -3547,7 +3548,7 @@ MCSymbol *BinaryFunction::getSymbolForEntryID(uint64_t EntryID) {
if (!isMultiEntry())
return nullptr;

uint64_t NumEntries = 0;
uint64_t NumEntries = 1;
if (hasCFG()) {
for (BinaryBasicBlock *BB : BasicBlocks) {
MCSymbol *EntrySymbol = getSecondaryEntryPointSymbol(*BB);
@@ -3580,7 +3581,7 @@ uint64_t BinaryFunction::getEntryIDForSymbol(const MCSymbol *Symbol) const {
return 0;

// Check all secondary entries available as either basic blocks or labels.
uint64_t NumEntries = 0;
uint64_t NumEntries = 1;
for (const BinaryBasicBlock *BB : BasicBlocks) {
MCSymbol *EntrySymbol = getSecondaryEntryPointSymbol(*BB);
if (!EntrySymbol)
@@ -3589,7 +3590,7 @@ uint64_t BinaryFunction::getEntryIDForSymbol(const MCSymbol *Symbol) const {
return NumEntries;
++NumEntries;
}
NumEntries = 0;
NumEntries = 1;
for (const std::pair<const uint32_t, MCSymbol *> &KV : Labels) {
MCSymbol *EntrySymbol = getSecondaryEntryPointSymbol(KV.second);
if (!EntrySymbol)
@@ -4076,12 +4077,17 @@ BinaryFunction::~BinaryFunction() {
delete BB;
}

void BinaryFunction::constructDomTree() {
BDT.reset(new BinaryDominatorTree);
BDT->recalculate(*this);
}

void BinaryFunction::calculateLoopInfo() {
if (!hasDomTree())
constructDomTree();
// Discover loops.
BinaryDominatorTree DomTree;
DomTree.recalculate(*this);
BLI.reset(new BinaryLoopInfo());
BLI->analyze(DomTree);
BLI->analyze(getDomTree());

// Traverse discovered loops and add depth and profile information.
std::stack<BinaryLoop *> St;
7 changes: 5 additions & 2 deletions bolt/lib/Core/DIEBuilder.cpp
@@ -22,7 +22,6 @@
#include "llvm/Support/Casting.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/Format.h"
#include "llvm/Support/LEB128.h"

#include <algorithm>
@@ -545,6 +544,10 @@ void DIEBuilder::cloneDieReferenceAttribute(
NewRefDie = DieInfo.Die;

if (AttrSpec.Form == dwarf::DW_FORM_ref_addr) {
// Add the referenced DIE to DebugNames so that it can be used when creating
// entries that contain cross-CU references.
if (DebugNamesTable.canGenerateEntryWithCrossCUReference(U, Die, AttrSpec))
DebugNamesTable.addCrossCUDie(DieInfo.Die);
// Whether this is a forward or a backward reference, it has to be
// resolved in `finish`, because the DIE may still be modified before
// then.
@@ -554,7 +557,7 @@ void DIEBuilder::cloneDieReferenceAttribute(
std::make_pair(CurDieInfo, AddrReferenceInfo(&DieInfo, AttrSpec)));

Die.addValue(getState().DIEAlloc, AttrSpec.Attr, dwarf::DW_FORM_ref_addr,
DIEInteger(0xDEADBEEF));
DIEInteger(DieInfo.Die->getOffset()));
return;
}

3 changes: 0 additions & 3 deletions bolt/lib/Core/DebugData.cpp
@@ -13,7 +13,6 @@
#include "bolt/Core/DebugData.h"
#include "bolt/Core/BinaryContext.h"
#include "bolt/Core/DIEBuilder.h"
#include "bolt/Rewrite/RewriteInstance.h"
#include "bolt/Utils/Utils.h"
#include "llvm/BinaryFormat/Dwarf.h"
#include "llvm/CodeGen/DIE.h"
@@ -23,7 +22,6 @@
#include "llvm/MC/MCAssembler.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCObjectStreamer.h"
#include "llvm/Support/Allocator.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/EndianStream.h"
#include "llvm/Support/LEB128.h"
@@ -32,7 +30,6 @@
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <memory>
#include <unordered_map>
#include <vector>
109 changes: 68 additions & 41 deletions bolt/lib/Core/DebugNames.cpp
@@ -146,6 +146,55 @@ static bool shouldIncludeVariable(const DWARFUnit &Unit, const DIE &Die) {
return false;
}

bool static canProcess(const DWARFUnit &Unit, const DIE &Die,
std::string &NameToUse, const bool TagsOnly) {
switch (Die.getTag()) {
case dwarf::DW_TAG_base_type:
case dwarf::DW_TAG_class_type:
case dwarf::DW_TAG_enumeration_type:
case dwarf::DW_TAG_imported_declaration:
case dwarf::DW_TAG_pointer_type:
case dwarf::DW_TAG_structure_type:
case dwarf::DW_TAG_typedef:
case dwarf::DW_TAG_unspecified_type:
if (TagsOnly || Die.findAttribute(dwarf::Attribute::DW_AT_name))
return true;
return false;
case dwarf::DW_TAG_namespace:
// According to the DWARF5 spec, namespaces without DW_AT_name need to use
// the name "(anonymous namespace)".
if (!Die.findAttribute(dwarf::Attribute::DW_AT_name))
NameToUse = "(anonymous namespace)";
return true;
case dwarf::DW_TAG_inlined_subroutine:
case dwarf::DW_TAG_label:
case dwarf::DW_TAG_subprogram:
if (TagsOnly || Die.findAttribute(dwarf::Attribute::DW_AT_low_pc) ||
Die.findAttribute(dwarf::Attribute::DW_AT_high_pc) ||
Die.findAttribute(dwarf::Attribute::DW_AT_ranges) ||
Die.findAttribute(dwarf::Attribute::DW_AT_entry_pc))
return true;
return false;
case dwarf::DW_TAG_variable:
return TagsOnly || shouldIncludeVariable(Unit, Die);
default:
break;
}
return false;
}

bool DWARF5AcceleratorTable::canGenerateEntryWithCrossCUReference(
const DWARFUnit &Unit, const DIE &Die,
const DWARFAbbreviationDeclaration::AttributeSpec &AttrSpec) {
if (!isCreated())
return false;
std::string NameToUse = "";
if (!canProcess(Unit, Die, NameToUse, true))
return false;
return (AttrSpec.Attr == dwarf::Attribute::DW_AT_abstract_origin ||
AttrSpec.Attr == dwarf::Attribute::DW_AT_specification) &&
AttrSpec.Form == dwarf::DW_FORM_ref_addr;
}
/// Returns name offset in String Offset section.
static uint64_t getNameOffset(BinaryContext &BC, DWARFUnit &Unit,
const uint64_t Index) {
@@ -175,41 +224,6 @@ DWARF5AcceleratorTable::addAccelTableEntry(
if (Unit.getVersion() < 5 || !NeedToCreate)
return std::nullopt;
std::string NameToUse = "";
auto canProcess = [&](const DIE &Die) -> bool {
switch (Die.getTag()) {
case dwarf::DW_TAG_base_type:
case dwarf::DW_TAG_class_type:
case dwarf::DW_TAG_enumeration_type:
case dwarf::DW_TAG_imported_declaration:
case dwarf::DW_TAG_pointer_type:
case dwarf::DW_TAG_structure_type:
case dwarf::DW_TAG_typedef:
case dwarf::DW_TAG_unspecified_type:
if (Die.findAttribute(dwarf::Attribute::DW_AT_name))
return true;
return false;
case dwarf::DW_TAG_namespace:
// According to the DWARF5 spec, namespaces without DW_AT_name need to use
// the name "(anonymous namespace)".
if (!Die.findAttribute(dwarf::Attribute::DW_AT_name))
NameToUse = "(anonymous namespace)";
return true;
case dwarf::DW_TAG_inlined_subroutine:
case dwarf::DW_TAG_label:
case dwarf::DW_TAG_subprogram:
if (Die.findAttribute(dwarf::Attribute::DW_AT_low_pc) ||
Die.findAttribute(dwarf::Attribute::DW_AT_high_pc) ||
Die.findAttribute(dwarf::Attribute::DW_AT_ranges) ||
Die.findAttribute(dwarf::Attribute::DW_AT_entry_pc))
return true;
return false;
case dwarf::DW_TAG_variable:
return shouldIncludeVariable(Unit, Die);
default:
break;
}
return false;
};

auto getUnitID = [&](const DWARFUnit &Unit, bool &IsTU,
uint32_t &DieTag) -> uint32_t {
@@ -223,7 +237,7 @@ DWARF5AcceleratorTable::addAccelTableEntry(
return CUList.size() - 1;
};

if (!canProcess(Die))
if (!canProcess(Unit, Die, NameToUse, false))
return std::nullopt;

// Adds a Unit to either CU, LocalTU or ForeignTU list the first time we
@@ -318,10 +332,24 @@ DWARF5AcceleratorTable::addAccelTableEntry(
const DIEValue Value = Die.findAttribute(Attr);
if (!Value)
return std::nullopt;
const DIEEntry &DIEENtry = Value.getDIEEntry();
DIE &EntryDie = DIEENtry.getEntry();
addEntry(EntryDie.findAttribute(dwarf::Attribute::DW_AT_linkage_name));
return addEntry(EntryDie.findAttribute(dwarf::Attribute::DW_AT_name));
const DIE *EntryDie = nullptr;
if (Value.getForm() == dwarf::DW_FORM_ref_addr) {
auto Iter = CrossCUDies.find(Value.getDIEInteger().getValue());
if (Iter == CrossCUDies.end()) {
BC.errs() << "BOLT-WARNING: [internal-dwarf-warning]: Could not find "
"referenced DIE in CrossCUDies for "
<< Twine::utohexstr(Value.getDIEInteger().getValue())
<< ".\n";
return std::nullopt;
}
EntryDie = Iter->second;
} else {
const DIEEntry &DIEENtry = Value.getDIEEntry();
EntryDie = &DIEENtry.getEntry();
}

addEntry(EntryDie->findAttribute(dwarf::Attribute::DW_AT_linkage_name));
return addEntry(EntryDie->findAttribute(dwarf::Attribute::DW_AT_name));
};

if (std::optional<BOLTDWARF5AccelTableData *> Entry =
@@ -332,7 +360,6 @@ DWARF5AcceleratorTable::addAccelTableEntry(
return *Entry;

return addEntry(Die.findAttribute(dwarf::Attribute::DW_AT_name));
;
}

/// Algorithm from llvm implementation.
13 changes: 9 additions & 4 deletions bolt/lib/Core/FunctionLayout.cpp
@@ -1,12 +1,17 @@
//===- bolt/Core/FunctionLayout.cpp - Fragmented Function Layout -*- C++ -*-==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "bolt/Core/FunctionLayout.h"
#include "bolt/Core/BinaryFunction.h"
#include "bolt/Core/BinaryBasicBlock.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/edit_distance.h"
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <memory>

using namespace llvm;
using namespace bolt;
1 change: 0 additions & 1 deletion bolt/lib/Core/HashUtilities.cpp
@@ -12,7 +12,6 @@

#include "bolt/Core/HashUtilities.h"
#include "bolt/Core/BinaryContext.h"
#include "bolt/Core/BinaryFunction.h"
#include "llvm/MC/MCInstPrinter.h"

namespace llvm {
15 changes: 14 additions & 1 deletion bolt/lib/Core/MCPlusBuilder.cpp
@@ -12,22 +12,30 @@

#include "bolt/Core/MCPlusBuilder.h"
#include "bolt/Core/MCPlus.h"
#include "bolt/Utils/CommandLineOpts.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCInstrAnalysis.h"
#include "llvm/MC/MCInstrDesc.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#include <cstdint>
#include <queue>

#define DEBUG_TYPE "mcplus"

using namespace llvm;
using namespace bolt;
using namespace MCPlus;

namespace opts {
cl::opt<bool>
TerminalTrap("terminal-trap",
cl::desc("Assume that execution stops at trap instruction"),
cl::init(true), cl::Hidden, cl::cat(BoltCategory));
}

bool MCPlusBuilder::equals(const MCInst &A, const MCInst &B,
CompFuncTy Comp) const {
if (A.getOpcode() != B.getOpcode())
@@ -121,6 +129,11 @@ bool MCPlusBuilder::equals(const MCTargetExpr &A, const MCTargetExpr &B,
llvm_unreachable("target-specific expressions are unsupported");
}

bool MCPlusBuilder::isTerminator(const MCInst &Inst) const {
return Analysis->isTerminator(Inst) ||
(opts::TerminalTrap && Info->get(Inst.getOpcode()).isTrap());
}

void MCPlusBuilder::setTailCall(MCInst &Inst) const {
assert(!hasAnnotation(Inst, MCAnnotation::kTailCall));
setAnnotationOpValue(Inst, MCAnnotation::kTailCall, true);
195 changes: 144 additions & 51 deletions bolt/lib/Core/Relocation.cpp
@@ -774,60 +774,95 @@ static bool isPCRelativeRISCV(uint64_t Type) {
}

bool Relocation::isSupported(uint64_t Type) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
return false;
case Triple::aarch64:
return isSupportedAArch64(Type);
if (Arch == Triple::riscv64)
case Triple::riscv64:
return isSupportedRISCV(Type);
return isSupportedX86(Type);
case Triple::x86_64:
return isSupportedX86(Type);
}
}

size_t Relocation::getSizeForType(uint64_t Type) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return getSizeForTypeAArch64(Type);
if (Arch == Triple::riscv64)
case Triple::riscv64:
return getSizeForTypeRISCV(Type);
return getSizeForTypeX86(Type);
case Triple::x86_64:
return getSizeForTypeX86(Type);
}
}

bool Relocation::skipRelocationType(uint64_t Type) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return skipRelocationTypeAArch64(Type);
if (Arch == Triple::riscv64)
case Triple::riscv64:
return skipRelocationTypeRISCV(Type);
return skipRelocationTypeX86(Type);
case Triple::x86_64:
return skipRelocationTypeX86(Type);
}
}

bool Relocation::skipRelocationProcess(uint64_t &Type, uint64_t Contents) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return skipRelocationProcessAArch64(Type, Contents);
if (Arch == Triple::riscv64)
skipRelocationProcessRISCV(Type, Contents);
return skipRelocationProcessX86(Type, Contents);
case Triple::riscv64:
return skipRelocationProcessRISCV(Type, Contents);
case Triple::x86_64:
return skipRelocationProcessX86(Type, Contents);
}
}

uint64_t Relocation::encodeValue(uint64_t Type, uint64_t Value, uint64_t PC) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return encodeValueAArch64(Type, Value, PC);
if (Arch == Triple::riscv64)
case Triple::riscv64:
return encodeValueRISCV(Type, Value, PC);
return encodeValueX86(Type, Value, PC);
case Triple::x86_64:
return encodeValueX86(Type, Value, PC);
}
}

uint64_t Relocation::extractValue(uint64_t Type, uint64_t Contents,
uint64_t PC) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return extractValueAArch64(Type, Contents, PC);
if (Arch == Triple::riscv64)
case Triple::riscv64:
return extractValueRISCV(Type, Contents, PC);
return extractValueX86(Type, Contents, PC);
case Triple::x86_64:
return extractValueX86(Type, Contents, PC);
}
}

bool Relocation::isGOT(uint64_t Type) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return isGOTAArch64(Type);
if (Arch == Triple::riscv64)
case Triple::riscv64:
return isGOTRISCV(Type);
return isGOTX86(Type);
case Triple::x86_64:
return isGOTX86(Type);
}
}

bool Relocation::isX86GOTPCRELX(uint64_t Type) {
@@ -845,27 +880,42 @@ bool Relocation::isX86GOTPC64(uint64_t Type) {
bool Relocation::isNone(uint64_t Type) { return Type == getNone(); }

bool Relocation::isRelative(uint64_t Type) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return Type == ELF::R_AARCH64_RELATIVE;
if (Arch == Triple::riscv64)
case Triple::riscv64:
return Type == ELF::R_RISCV_RELATIVE;
return Type == ELF::R_X86_64_RELATIVE;
case Triple::x86_64:
return Type == ELF::R_X86_64_RELATIVE;
}
}

bool Relocation::isIRelative(uint64_t Type) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return Type == ELF::R_AARCH64_IRELATIVE;
if (Arch == Triple::riscv64)
case Triple::riscv64:
llvm_unreachable("not implemented");
return Type == ELF::R_X86_64_IRELATIVE;
case Triple::x86_64:
return Type == ELF::R_X86_64_IRELATIVE;
}
}

bool Relocation::isTLS(uint64_t Type) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return isTLSAArch64(Type);
if (Arch == Triple::riscv64)
case Triple::riscv64:
return isTLSRISCV(Type);
return isTLSX86(Type);
case Triple::x86_64:
return isTLSX86(Type);
}
}

bool Relocation::isInstructionReference(uint64_t Type) {
@@ -882,49 +932,81 @@ bool Relocation::isInstructionReference(uint64_t Type) {
}

uint64_t Relocation::getNone() {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return ELF::R_AARCH64_NONE;
if (Arch == Triple::riscv64)
case Triple::riscv64:
return ELF::R_RISCV_NONE;
return ELF::R_X86_64_NONE;
case Triple::x86_64:
return ELF::R_X86_64_NONE;
}
}

uint64_t Relocation::getPC32() {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return ELF::R_AARCH64_PREL32;
if (Arch == Triple::riscv64)
case Triple::riscv64:
return ELF::R_RISCV_32_PCREL;
return ELF::R_X86_64_PC32;
case Triple::x86_64:
return ELF::R_X86_64_PC32;
}
}

uint64_t Relocation::getPC64() {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return ELF::R_AARCH64_PREL64;
if (Arch == Triple::riscv64)
case Triple::riscv64:
llvm_unreachable("not implemented");
return ELF::R_X86_64_PC64;
case Triple::x86_64:
return ELF::R_X86_64_PC64;
}
}

bool Relocation::isPCRelative(uint64_t Type) {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return isPCRelativeAArch64(Type);
if (Arch == Triple::riscv64)
case Triple::riscv64:
return isPCRelativeRISCV(Type);
return isPCRelativeX86(Type);
case Triple::x86_64:
return isPCRelativeX86(Type);
}
}

uint64_t Relocation::getAbs64() {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return ELF::R_AARCH64_ABS64;
if (Arch == Triple::riscv64)
case Triple::riscv64:
return ELF::R_RISCV_64;
return ELF::R_X86_64_64;
case Triple::x86_64:
return ELF::R_X86_64_64;
}
}

uint64_t Relocation::getRelative() {
if (Arch == Triple::aarch64)
switch (Arch) {
default:
llvm_unreachable("Unsupported architecture");
case Triple::aarch64:
return ELF::R_AARCH64_RELATIVE;
return ELF::R_X86_64_RELATIVE;
case Triple::riscv64:
llvm_unreachable("not implemented");
case Triple::x86_64:
return ELF::R_X86_64_RELATIVE;
}
}

size_t Relocation::emit(MCStreamer *Streamer) const {
@@ -991,9 +1073,16 @@ void Relocation::print(raw_ostream &OS) const {
static const char *AArch64RelocNames[] = {
#include "llvm/BinaryFormat/ELFRelocs/AArch64.def"
};
if (Arch == Triple::aarch64)
switch (Arch) {
default:
OS << "RType:" << Twine::utohexstr(Type);
break;

case Triple::aarch64:
OS << AArch64RelocNames[Type];
else if (Arch == Triple::riscv64) {
break;

case Triple::riscv64:
// RISC-V relocations are not sequentially numbered so we cannot use an
// array
switch (Type) {
@@ -1006,8 +1095,12 @@ void Relocation::print(raw_ostream &OS) const {
break;
#include "llvm/BinaryFormat/ELFRelocs/RISCV.def"
}
} else
break;

case Triple::x86_64:
OS << X86RelocNames[Type];
break;
}
OS << ", 0x" << Twine::utohexstr(Offset);
if (Symbol) {
OS << ", " << Symbol->getName();
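Reviewer note: the recurring change in this file replaces an `if` chain whose final fallthrough is x86 with a `switch` that names every supported architecture. The sketch below shows why that matters, using hypothetical names (`ArchKind`, `relocFamilyIfChain`, `relocFamilySwitch`) rather than the real `Triple::ArchType` API. The patch uses `llvm_unreachable` in the `default` case; the sketch returns `std::nullopt` instead so the behavior can be observed without aborting.

```cpp
#include <optional>
#include <string_view>

// Hypothetical architecture enum; PPC64 stands for "anything unsupported".
enum class ArchKind { X86_64, AArch64, RISCV64, PPC64 };

// Old shape: whatever is not AArch64 or RISC-V is silently treated as x86.
inline const char *relocFamilyIfChain(ArchKind A) {
  if (A == ArchKind::AArch64)
    return "aarch64";
  if (A == ArchKind::RISCV64)
    return "riscv";
  return "x86";
}

// New shape: each supported architecture is listed explicitly and every
// other value is rejected instead of being routed to the x86 helpers.
inline std::optional<std::string_view> relocFamilySwitch(ArchKind A) {
  switch (A) {
  case ArchKind::AArch64:
    return "aarch64";
  case ArchKind::RISCV64:
    return "riscv";
  case ArchKind::X86_64:
    return "x86";
  default:
    return std::nullopt;
  }
}
```

With the `if` chain, an unsupported architecture would reach the x86 relocation helpers and produce wrong answers quietly; with the `switch`, the mistake surfaces at the dispatch point.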
1 change: 0 additions & 1 deletion bolt/lib/Passes/CMOVConversion.cpp
@@ -17,7 +17,6 @@
#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/ErrorHandling.h"
#include <numeric>

#define DEBUG_TYPE "cmov"

8 changes: 8 additions & 0 deletions bolt/lib/Passes/FixRISCVCallsPass.cpp
@@ -1,3 +1,11 @@
//===- bolt/Passes/FixRISCVCallsPass.cpp ------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "bolt/Passes/FixRISCVCallsPass.h"
#include "bolt/Core/ParallelUtilities.h"

8 changes: 8 additions & 0 deletions bolt/lib/Passes/FixRelaxationPass.cpp
@@ -1,3 +1,11 @@
//===- bolt/Passes/FixRelaxationPass.cpp ------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "bolt/Passes/FixRelaxationPass.h"
#include "bolt/Core/ParallelUtilities.h"

1 change: 0 additions & 1 deletion bolt/lib/Passes/FrameOptimizer.cpp
@@ -20,7 +20,6 @@
#include "bolt/Utils/CommandLineOpts.h"
#include "llvm/Support/Timer.h"
#include <deque>
#include <unordered_map>

#define DEBUG_TYPE "fop"

1 change: 0 additions & 1 deletion bolt/lib/Passes/Hugify.cpp
@@ -7,7 +7,6 @@
//===----------------------------------------------------------------------===//

#include "bolt/Passes/Hugify.h"
#include "llvm/Support/CommandLine.h"

#define DEBUG_TYPE "bolt-hugify"

1 change: 0 additions & 1 deletion bolt/lib/Passes/Inliner.cpp
@@ -27,7 +27,6 @@
#include "bolt/Passes/Inliner.h"
#include "bolt/Core/MCPlus.h"
#include "llvm/Support/CommandLine.h"
#include <map>

#define DEBUG_TYPE "bolt-inliner"

1 change: 0 additions & 1 deletion bolt/lib/Passes/ShrinkWrapping.cpp
@@ -11,7 +11,6 @@
//===----------------------------------------------------------------------===//

#include "bolt/Passes/ShrinkWrapping.h"
#include "bolt/Core/MCPlus.h"
#include "bolt/Passes/DataflowInfoManager.h"
#include "bolt/Passes/MCF.h"
#include "bolt/Utils/CommandLineOpts.h"
1 change: 0 additions & 1 deletion bolt/lib/Passes/SplitFunctions.cpp
@@ -17,7 +17,6 @@
#include "bolt/Core/ParallelUtilities.h"
#include "bolt/Utils/CommandLineOpts.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/Sequence.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/iterator_range.h"
#include "llvm/Support/CommandLine.h"
2 changes: 1 addition & 1 deletion bolt/lib/Passes/TailDuplication.cpp
@@ -13,9 +13,9 @@
#include "bolt/Passes/TailDuplication.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/MC/MCRegisterInfo.h"
#include <queue>

#include <numeric>
#include <queue>

#define DEBUG_TYPE "taildup"

1 change: 0 additions & 1 deletion bolt/lib/Passes/ValidateInternalCalls.cpp
@@ -14,7 +14,6 @@
#include "bolt/Core/BinaryBasicBlock.h"
#include "bolt/Passes/DataflowInfoManager.h"
#include "bolt/Passes/FrameAnalysis.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/MC/MCInstPrinter.h"
#include <optional>
#include <queue>
210 changes: 164 additions & 46 deletions bolt/lib/Profile/BoltAddressTranslation.cpp

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion bolt/lib/Profile/CMakeLists.txt
@@ -3,7 +3,6 @@ add_llvm_library(LLVMBOLTProfile
DataAggregator.cpp
DataReader.cpp
Heatmap.cpp
ProfileReaderBase.cpp
StaleProfileMatching.cpp
YAMLProfileReader.cpp
YAMLProfileWriter.cpp
171 changes: 133 additions & 38 deletions bolt/lib/Profile/DataAggregator.cpp
@@ -662,18 +662,19 @@ DataAggregator::getBinaryFunctionContainingAddress(uint64_t Address) const {
/*UseMaxSize=*/true);
}

StringRef DataAggregator::getLocationName(BinaryFunction &Func,
uint64_t Count) {
BinaryFunction *
DataAggregator::getBATParentFunction(const BinaryFunction &Func) const {
if (BAT)
if (const uint64_t HotAddr = BAT->fetchParentAddress(Func.getAddress()))
return getBinaryFunctionContainingAddress(HotAddr);
return nullptr;
}

StringRef DataAggregator::getLocationName(const BinaryFunction &Func) const {
if (!BAT)
return Func.getOneName();

const BinaryFunction *OrigFunc = &Func;
if (const uint64_t HotAddr = BAT->fetchParentAddress(Func.getAddress())) {
NumColdSamples += Count;
BinaryFunction *HotFunc = getBinaryFunctionContainingAddress(HotAddr);
if (HotFunc)
OrigFunc = HotFunc;
}
// If it is a local function, prefer the name containing the file name where
// the local function was declared
for (StringRef AlternativeName : OrigFunc->getNames()) {
@@ -688,12 +689,17 @@ StringRef DataAggregator::getLocationName(BinaryFunction &Func,
return OrigFunc->getOneName();
}

bool DataAggregator::doSample(BinaryFunction &Func, uint64_t Address,
bool DataAggregator::doSample(BinaryFunction &OrigFunc, uint64_t Address,
uint64_t Count) {
BinaryFunction *ParentFunc = getBATParentFunction(OrigFunc);
BinaryFunction &Func = ParentFunc ? *ParentFunc : OrigFunc;
if (ParentFunc)
NumColdSamples += Count;

auto I = NamesToSamples.find(Func.getOneName());
if (I == NamesToSamples.end()) {
bool Success;
StringRef LocName = getLocationName(Func, Count);
StringRef LocName = getLocationName(Func);
std::tie(I, Success) = NamesToSamples.insert(
std::make_pair(Func.getOneName(),
FuncSampleData(LocName, FuncSampleData::ContainerTy())));
@@ -713,22 +719,12 @@ bool DataAggregator::doIntraBranch(BinaryFunction &Func, uint64_t From,
FuncBranchData *AggrData = getBranchData(Func);
if (!AggrData) {
AggrData = &NamesToBranches[Func.getOneName()];
AggrData->Name = getLocationName(Func, Count);
AggrData->Name = getLocationName(Func);
setBranchData(Func, AggrData);
}

From -= Func.getAddress();
To -= Func.getAddress();
LLVM_DEBUG(dbgs() << "BOLT-DEBUG: bumpBranchCount: "
<< formatv("{0} @ {1:x} -> {0} @ {2:x}\n", Func, From, To));
if (BAT) {
From = BAT->translate(Func.getAddress(), From, /*IsBranchSrc=*/true);
To = BAT->translate(Func.getAddress(), To, /*IsBranchSrc=*/false);
LLVM_DEBUG(
dbgs() << "BOLT-DEBUG: BAT translation on bumpBranchCount: "
<< formatv("{0} @ {1:x} -> {0} @ {2:x}\n", Func, From, To));
}

AggrData->bumpBranchCount(From, To, Count, Mispreds);
return true;
}
@@ -742,30 +738,24 @@ bool DataAggregator::doInterBranch(BinaryFunction *FromFunc,
StringRef SrcFunc;
StringRef DstFunc;
if (FromFunc) {
SrcFunc = getLocationName(*FromFunc, Count);
SrcFunc = getLocationName(*FromFunc);
FromAggrData = getBranchData(*FromFunc);
if (!FromAggrData) {
FromAggrData = &NamesToBranches[FromFunc->getOneName()];
FromAggrData->Name = SrcFunc;
setBranchData(*FromFunc, FromAggrData);
}
From -= FromFunc->getAddress();
if (BAT)
From = BAT->translate(FromFunc->getAddress(), From, /*IsBranchSrc=*/true);

recordExit(*FromFunc, From, Mispreds, Count);
}
if (ToFunc) {
DstFunc = getLocationName(*ToFunc, 0);
DstFunc = getLocationName(*ToFunc);
ToAggrData = getBranchData(*ToFunc);
if (!ToAggrData) {
ToAggrData = &NamesToBranches[ToFunc->getOneName()];
ToAggrData->Name = DstFunc;
setBranchData(*ToFunc, ToAggrData);
}
To -= ToFunc->getAddress();
if (BAT)
To = BAT->translate(ToFunc->getAddress(), To, /*IsBranchSrc=*/false);

recordEntry(*ToFunc, To, Mispreds, Count);
}
@@ -781,15 +771,32 @@ bool DataAggregator::doInterBranch(BinaryFunction *FromFunc,

bool DataAggregator::doBranch(uint64_t From, uint64_t To, uint64_t Count,
uint64_t Mispreds) {
BinaryFunction *FromFunc = getBinaryFunctionContainingAddress(From);
BinaryFunction *ToFunc = getBinaryFunctionContainingAddress(To);
auto handleAddress = [&](uint64_t &Addr, bool IsFrom) -> BinaryFunction * {
if (BinaryFunction *Func = getBinaryFunctionContainingAddress(Addr)) {
Addr -= Func->getAddress();

if (BAT)
Addr = BAT->translate(Func->getAddress(), Addr, IsFrom);

if (BinaryFunction *ParentFunc = getBATParentFunction(*Func)) {
Func = ParentFunc;
if (IsFrom)
NumColdSamples += Count;
}

return Func;
}
return nullptr;
};

BinaryFunction *FromFunc = handleAddress(From, /*IsFrom=*/true);
BinaryFunction *ToFunc = handleAddress(To, /*IsFrom=*/false);
if (!FromFunc && !ToFunc)
return false;

// Treat recursive control transfers as inter-branches.
if (FromFunc == ToFunc && (To != ToFunc->getAddress())) {
recordBranch(*FromFunc, From - FromFunc->getAddress(),
To - FromFunc->getAddress(), Count, Mispreds);
if (FromFunc == ToFunc && To != 0) {
recordBranch(*FromFunc, From, To, Count, Mispreds);
return doIntraBranch(*FromFunc, From, To, Count, Mispreds);
}
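
Reviewer sketch for the hunk above: after this change `doBranch` works on function-relative offsets rather than absolute addresses, which is why the recursion check becomes `To != 0`. The snippet below is a minimal standalone illustration of that idea, not BOLT code; `toFunctionOffset` and `isIntraFunctionBranch` are hypothetical names, and BAT translation and parent-function lookup are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert an absolute address into an offset from the
// start of the function that contains it.
inline uint64_t toFunctionOffset(uint64_t Addr, uint64_t FuncStart) {
  assert(Addr >= FuncStart && "address must lie inside the function");
  return Addr - FuncStart;
}

// Hypothetical helper: with the target expressed as an offset, a transfer
// inside one function that lands on offset 0 re-enters the function at its
// start (a recursive call). Any other same-function transfer is treated as an
// ordinary intra-function branch.
inline bool isIntraFunctionBranch(bool SameFunction, uint64_t ToOffset) {
  return SameFunction && ToOffset != 0;
}
```

Under these assumptions, a branch from `0x401010` to `0x401020` in a function starting at `0x401000` becomes the offset pair `(0x10, 0x20)` and is classified as intra-function.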

@@ -840,9 +847,14 @@ bool DataAggregator::doTrace(const LBREntry &First, const LBREntry &Second,
<< FromFunc->getPrintName() << ":"
<< Twine::utohexstr(First.To) << " to "
<< Twine::utohexstr(Second.From) << ".\n");
for (const std::pair<uint64_t, uint64_t> &Pair : *FTs)
doIntraBranch(*FromFunc, Pair.first + FromFunc->getAddress(),
Pair.second + FromFunc->getAddress(), Count, false);
BinaryFunction *ParentFunc = getBATParentFunction(*FromFunc);
for (auto [From, To] : *FTs) {
if (BAT) {
From = BAT->translate(FromFunc->getAddress(), From, /*IsBranchSrc=*/true);
To = BAT->translate(FromFunc->getAddress(), To, /*IsBranchSrc=*/false);
}
doIntraBranch(ParentFunc ? *ParentFunc : *FromFunc, From, To, Count, false);
}

return true;
}
@@ -2308,7 +2320,90 @@ std::error_code DataAggregator::writeBATYAML(BinaryContext &BC,
if (BAT->isBATFunction(Function.getAddress()))
continue;
BP.Functions.emplace_back(
YAMLProfileWriter::convert(Function, /*UseDFS=*/false));
YAMLProfileWriter::convert(Function, /*UseDFS=*/false, BAT));
}

for (const auto &KV : NamesToBranches) {
const StringRef FuncName = KV.first;
const FuncBranchData &Branches = KV.second;
yaml::bolt::BinaryFunctionProfile YamlBF;
BinaryData *BD = BC.getBinaryDataByName(FuncName);
assert(BD);
uint64_t FuncAddress = BD->getAddress();
if (!BAT->isBATFunction(FuncAddress))
continue;
BinaryFunction *BF = BC.getBinaryFunctionAtAddress(FuncAddress);
assert(BF);
YamlBF.Name = FuncName.str();
YamlBF.Id = BF->getFunctionNumber();
YamlBF.Hash = BAT->getBFHash(FuncAddress);
YamlBF.ExecCount = BF->getKnownExecutionCount();
YamlBF.NumBasicBlocks = BAT->getNumBasicBlocks(FuncAddress);
const BoltAddressTranslation::BBHashMapTy &BlockMap =
BAT->getBBHashMap(FuncAddress);
YamlBF.Blocks.resize(YamlBF.NumBasicBlocks);

for (auto &&[Idx, YamlBB] : llvm::enumerate(YamlBF.Blocks))
YamlBB.Index = Idx;

for (auto BI = BlockMap.begin(), BE = BlockMap.end(); BI != BE; ++BI)
YamlBF.Blocks[BI->second.getBBIndex()].Hash = BI->second.getBBHash();

auto getSuccessorInfo = [&](uint32_t SuccOffset, unsigned SuccDataIdx) {
const llvm::bolt::BranchInfo &BI = Branches.Data.at(SuccDataIdx);
yaml::bolt::SuccessorInfo SI;
SI.Index = BlockMap.getBBIndex(SuccOffset);
SI.Count = BI.Branches;
SI.Mispreds = BI.Mispreds;
return SI;
};

auto getCallSiteInfo = [&](Location CallToLoc, unsigned CallToIdx,
uint32_t Offset) {
const llvm::bolt::BranchInfo &BI = Branches.Data.at(CallToIdx);
yaml::bolt::CallSiteInfo CSI;
CSI.DestId = 0; // designated for unknown functions
CSI.EntryDiscriminator = 0;
CSI.Count = BI.Branches;
CSI.Mispreds = BI.Mispreds;
CSI.Offset = Offset;
if (BinaryData *BD = BC.getBinaryDataByName(CallToLoc.Name))
YAMLProfileWriter::setCSIDestination(BC, CSI, BD->getSymbol(), BAT,
CallToLoc.Offset);
return CSI;
};

for (const auto &[FromOffset, SuccKV] : Branches.IntraIndex) {
if (!BlockMap.isInputBlock(FromOffset))
continue;
const unsigned Index = BlockMap.getBBIndex(FromOffset);
yaml::bolt::BinaryBasicBlockProfile &YamlBB = YamlBF.Blocks[Index];
for (const auto &[SuccOffset, SuccDataIdx] : SuccKV)
if (BlockMap.isInputBlock(SuccOffset))
YamlBB.Successors.emplace_back(
getSuccessorInfo(SuccOffset, SuccDataIdx));
}
for (const auto &[FromOffset, CallTo] : Branches.InterIndex) {
auto BlockIt = BlockMap.upper_bound(FromOffset);
--BlockIt;
const unsigned BlockOffset = BlockIt->first;
const unsigned BlockIndex = BlockIt->second.getBBIndex();
yaml::bolt::BinaryBasicBlockProfile &YamlBB = YamlBF.Blocks[BlockIndex];
const uint32_t Offset = FromOffset - BlockOffset;
for (const auto &[CallToLoc, CallToIdx] : CallTo)
YamlBB.CallSites.emplace_back(
getCallSiteInfo(CallToLoc, CallToIdx, Offset));
llvm::sort(YamlBB.CallSites, [](yaml::bolt::CallSiteInfo &A,
yaml::bolt::CallSiteInfo &B) {
return A.Offset < B.Offset;
});
}
// Drop blocks without a hash, won't be useful for stale matching.
llvm::erase_if(YamlBF.Blocks,
[](const yaml::bolt::BinaryBasicBlockProfile &YamlBB) {
return YamlBB.Hash == (yaml::Hex64)0;
});
BP.Functions.emplace_back(YamlBF);
}
}

1 change: 0 additions & 1 deletion bolt/lib/Profile/DataReader.cpp
@@ -18,7 +18,6 @@
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/Errc.h"
#include <map>

#undef DEBUG_TYPE
#define DEBUG_TYPE "bolt-prof"
1 change: 0 additions & 1 deletion bolt/lib/Profile/Heatmap.cpp
@@ -10,7 +10,6 @@
#include "bolt/Utils/CommandLineOpts.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/Twine.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Format.h"
47 changes: 30 additions & 17 deletions bolt/lib/Profile/YAMLProfileWriter.cpp
@@ -9,6 +9,7 @@
#include "bolt/Profile/YAMLProfileWriter.h"
#include "bolt/Core/BinaryBasicBlock.h"
#include "bolt/Core/BinaryFunction.h"
#include "bolt/Profile/BoltAddressTranslation.h"
#include "bolt/Profile/ProfileReaderBase.h"
#include "bolt/Rewrite/RewriteInstance.h"
#include "llvm/Support/CommandLine.h"
@@ -25,8 +26,30 @@ extern llvm::cl::opt<bool> ProfileUseDFS;
namespace llvm {
namespace bolt {

const BinaryFunction *YAMLProfileWriter::setCSIDestination(
const BinaryContext &BC, yaml::bolt::CallSiteInfo &CSI,
const MCSymbol *Symbol, const BoltAddressTranslation *BAT,
uint32_t Offset) {
CSI.DestId = 0; // designated for unknown functions
CSI.EntryDiscriminator = 0;

if (Symbol) {
uint64_t EntryID = 0;
if (const BinaryFunction *Callee =
BC.getFunctionForSymbol(Symbol, &EntryID)) {
if (BAT && BAT->isBATFunction(Callee->getAddress()))
std::tie(Callee, EntryID) = BAT->translateSymbol(BC, *Symbol, Offset);
CSI.DestId = Callee->getFunctionNumber();
CSI.EntryDiscriminator = EntryID;
return Callee;
}
}
return nullptr;
}

yaml::bolt::BinaryFunctionProfile
YAMLProfileWriter::convert(const BinaryFunction &BF, bool UseDFS) {
YAMLProfileWriter::convert(const BinaryFunction &BF, bool UseDFS,
const BoltAddressTranslation *BAT) {
yaml::bolt::BinaryFunctionProfile YamlBF;
const BinaryContext &BC = BF.getBinaryContext();

@@ -79,31 +102,21 @@ YAMLProfileWriter::convert(const BinaryFunction &BF, bool UseDFS) {
continue;
for (const IndirectCallProfile &CSP : ICSP.get()) {
StringRef TargetName = "";
CSI.DestId = 0; // designated for unknown functions
CSI.EntryDiscriminator = 0;
if (CSP.Symbol) {
const BinaryFunction *Callee = BC.getFunctionForSymbol(CSP.Symbol);
if (Callee) {
CSI.DestId = Callee->getFunctionNumber();
TargetName = Callee->getOneName();
}
}
const BinaryFunction *Callee =
setCSIDestination(BC, CSI, CSP.Symbol, BAT);
if (Callee)
TargetName = Callee->getOneName();
CSI.Count = CSP.Count;
CSI.Mispreds = CSP.Mispreds;
CSTargets.emplace_back(TargetName, CSI);
}
} else { // direct call or a tail call
uint64_t EntryID = 0;
CSI.DestId = 0;
StringRef TargetName = "";
const MCSymbol *CalleeSymbol = BC.MIB->getTargetSymbol(Instr);
const BinaryFunction *const Callee =
BC.getFunctionForSymbol(CalleeSymbol, &EntryID);
if (Callee) {
CSI.DestId = Callee->getFunctionNumber();
CSI.EntryDiscriminator = EntryID;
setCSIDestination(BC, CSI, CalleeSymbol, BAT);
if (Callee)
TargetName = Callee->getOneName();
}

auto getAnnotationWithDefault = [&](const MCInst &Inst, StringRef Ann) {
return BC.MIB->getAnnotationWithDefault(Instr, Ann, 0ull);
7 changes: 4 additions & 3 deletions bolt/lib/Rewrite/BinaryPassManager.cpp
@@ -72,7 +72,7 @@ static cl::opt<bool> JTFootprintReductionFlag(
"instructions at jump sites"),
cl::cat(BoltOptCategory));

static cl::opt<bool>
cl::opt<bool>
KeepNops("keep-nops",
cl::desc("keep no-op instructions. By default they are removed."),
cl::Hidden, cl::cat(BoltOptCategory));
@@ -377,8 +377,9 @@ Error BinaryFunctionPassManager::runAllPasses(BinaryContext &BC) {

Manager.registerPass(std::make_unique<NormalizeCFG>(PrintNormalized));

Manager.registerPass(std::make_unique<StripRepRet>(NeverPrint),
opts::StripRepRet);
if (BC.isX86())
Manager.registerPass(std::make_unique<StripRepRet>(NeverPrint),
opts::StripRepRet);

Manager.registerPass(std::make_unique<IdenticalCodeFolding>(PrintICF),
opts::ICF);
46 changes: 34 additions & 12 deletions bolt/lib/Rewrite/DWARFRewriter.cpp
@@ -14,7 +14,6 @@
#include "bolt/Core/DynoStats.h"
#include "bolt/Core/ParallelUtilities.h"
#include "bolt/Rewrite/RewriteInstance.h"
#include "bolt/Utils/Utils.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
@@ -375,12 +374,11 @@ static cl::opt<bool> AlwaysConvertToRanges(
extern cl::opt<std::string> CompDirOverride;
} // namespace opts

static bool getLowAndHighPC(const DIE &Die, const DWARFUnit &DU,
uint64_t &LowPC, uint64_t &HighPC,
uint64_t &SectionIndex) {
/// If DW_AT_low_pc exists sets LowPC and returns true.
static bool getLowPC(const DIE &Die, const DWARFUnit &DU, uint64_t &LowPC,
uint64_t &SectionIndex) {
DIEValue DvalLowPc = Die.findAttribute(dwarf::DW_AT_low_pc);
DIEValue DvalHighPc = Die.findAttribute(dwarf::DW_AT_high_pc);
if (!DvalLowPc || !DvalHighPc)
if (!DvalLowPc)
return false;

dwarf::Form Form = DvalLowPc.getForm();
@@ -403,14 +401,39 @@ static bool getLowAndHighPC(const DIE &Die, const DWARFUnit &DU,
LowPC = LowPcValue;
SectionIndex = 0;
}
return true;
}

/// If DW_AT_high_pc exists sets HighPC and returns true.
static bool getHighPC(const DIE &Die, const uint64_t LowPC, uint64_t &HighPC) {
DIEValue DvalHighPc = Die.findAttribute(dwarf::DW_AT_high_pc);
if (!DvalHighPc)
return false;
if (DvalHighPc.getForm() == dwarf::DW_FORM_addr)
HighPC = DvalHighPc.getDIEInteger().getValue();
else
HighPC = LowPC + DvalHighPc.getDIEInteger().getValue();

return true;
}
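
Reviewer sketch for `getHighPC` above: `DW_AT_high_pc` is either an absolute address (address form) or an unsigned offset from `DW_AT_low_pc` (constant forms), and the new helper distinguishes the two by checking for `DW_FORM_addr`. A minimal standalone version of that rule, assuming a hypothetical `HighPCForm` enum in place of the real DWARF form codes and a hypothetical function name `resolveHighPC`:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two cases the diff distinguishes:
// DW_FORM_addr versus any constant-class form.
enum class HighPCForm { Address, Offset };

// Returns the absolute high PC. For the address form the raw value is already
// absolute; for the offset form it is relative to the low PC.
inline uint64_t resolveHighPC(uint64_t LowPC, uint64_t RawValue,
                              HighPCForm Form) {
  if (Form == HighPCForm::Address)
    return RawValue;
  assert(RawValue <= UINT64_MAX - LowPC && "offset must not wrap around");
  return LowPC + RawValue;
}
```

The overflow assertion is an addition for the sketch only; the code in the diff performs the plain addition.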

/// If DW_AT_low_pc and DW_AT_high_pc exist sets LowPC and HighPC and returns
/// true.
static bool getLowAndHighPC(const DIE &Die, const DWARFUnit &DU,
uint64_t &LowPC, uint64_t &HighPC,
uint64_t &SectionIndex) {
uint64_t TempLowPC = LowPC;
uint64_t TempHighPC = HighPC;
uint64_t TempSectionIndex = SectionIndex;
if (getLowPC(Die, DU, TempLowPC, TempSectionIndex) &&
getHighPC(Die, TempLowPC, TempHighPC)) {
LowPC = TempLowPC;
HighPC = TempHighPC;
SectionIndex = TempSectionIndex;
return true;
}
return false;
}

static Expected<llvm::DWARFAddressRangesVector>
getDIEAddressRanges(const DIE &Die, DWARFUnit &DU) {
uint64_t LowPC, HighPC, Index;
@@ -1248,10 +1271,9 @@ void DWARFRewriter::updateUnitDebugInfo(
}
}
} else if (LowPCAttrInfo) {
const std::optional<uint64_t> Result =
LowPCAttrInfo.getDIEInteger().getValue();
if (Result.has_value()) {
const uint64_t Address = Result.value();
uint64_t Address = 0;
uint64_t SectionIndex = 0;
if (getLowPC(*Die, Unit, Address, SectionIndex)) {
uint64_t NewAddress = 0;
if (const BinaryFunction *Function =
BC.getBinaryFunctionContainingAddress(Address)) {
@@ -1662,7 +1684,7 @@ namespace {
std::unique_ptr<BinaryContext>
createDwarfOnlyBC(const object::ObjectFile &File) {
return cantFail(BinaryContext::createBinaryContext(
&File, false,
File.makeTriple(), File.getFileName(), nullptr, false,
DWARFContext::create(File, DWARFContext::ProcessDebugRelocations::Ignore,
nullptr, "", WithColor::defaultErrorHandler,
WithColor::defaultWarningHandler),
4 changes: 3 additions & 1 deletion bolt/lib/Rewrite/JITLinkLinker.cpp
@@ -5,9 +5,11 @@
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "bolt/Rewrite/JITLinkLinker.h"
#include "bolt/Core/BinaryContext.h"
#include "bolt/Core/BinaryData.h"
#include "bolt/Rewrite/RewriteInstance.h"
#include "bolt/Core/BinarySection.h"
#include "llvm/ExecutionEngine/JITLink/ELF_riscv.h"
#include "llvm/ExecutionEngine/JITLink/JITLink.h"
#include "llvm/ExecutionEngine/Orc/Shared/ExecutorAddress.h"
119 changes: 113 additions & 6 deletions bolt/lib/Rewrite/LinuxKernelRewriter.cpp
@@ -212,6 +212,11 @@ class LinuxKernelRewriter final : public MetadataRewriter {
/// Size of bug_entry struct.
static constexpr size_t BUG_TABLE_ENTRY_SIZE = 12;

/// List of bug entries per function.
using FunctionBugListType =
DenseMap<BinaryFunction *, SmallVector<uint32_t, 2>>;
FunctionBugListType FunctionBugList;

/// .pci_fixup section.
ErrorOr<BinarySection &> PCIFixupSection = std::errc::bad_address;
static constexpr size_t PCI_FIXUP_ENTRY_SIZE = 16;
@@ -252,11 +257,19 @@ class LinuxKernelRewriter final : public MetadataRewriter {

/// Paravirtual instruction patch sites.
Error readParaInstructions();
Error rewriteParaInstructions();

/// __bug_table section handling.
Error readBugTable();
Error rewriteBugTable();

/// Do not process functions containing instructions annotated with
/// \p Annotation.
void skipFunctionsWithAnnotation(StringRef Annotation) const;

/// Read alternative instruction info from .altinstructions.
/// Handle alternative instruction info from .altinstructions.
Error readAltInstructions();
Error rewriteAltInstructions();

/// Read .pci_fixup
Error readPCIFixupTable();
@@ -318,6 +331,12 @@ class LinuxKernelRewriter final : public MetadataRewriter {
if (Error E = rewriteExceptionTable())
return E;

if (Error E = rewriteAltInstructions())
return E;

if (Error E = rewriteParaInstructions())
return E;

if (Error E = rewriteORCTables())
return E;

@@ -327,6 +346,9 @@ class LinuxKernelRewriter final : public MetadataRewriter {
if (Error E = rewriteStaticKeysJumpTable())
return E;

if (Error E = rewriteBugTable())
return E;

return Error::success();
}

@@ -1126,16 +1148,43 @@ Error LinuxKernelRewriter::readParaInstructions() {
return Error::success();
}

void LinuxKernelRewriter::skipFunctionsWithAnnotation(
StringRef Annotation) const {
for (BinaryFunction &BF : llvm::make_second_range(BC.getBinaryFunctions())) {
if (!BC.shouldEmit(BF))
continue;
for (const BinaryBasicBlock &BB : BF) {
const bool HasAnnotation = llvm::any_of(BB, [&](const MCInst &Inst) {
return BC.MIB->hasAnnotation(Inst, Annotation);
});
if (HasAnnotation) {
BF.setSimple(false);
break;
}
}
}
}

Error LinuxKernelRewriter::rewriteParaInstructions() {
// Disable output of functions with paravirtual instructions before the
// rewrite support is complete.
skipFunctionsWithAnnotation("ParaSite");

return Error::success();
}

/// Process __bug_table section.
/// This section contains information useful for kernel debugging.
/// This section contains information useful for kernel debugging, mostly
/// utilized by WARN()/WARN_ON() macros and deprecated BUG()/BUG_ON().
///
/// Each entry in the section is a struct bug_entry that contains a pointer to
/// the ud2 instruction corresponding to the bug, corresponding file name (both
/// pointers use PC relative offset addressing), line number, and flags.
/// The definition of the struct bug_entry can be found in
/// `include/asm-generic/bug.h`
///
/// NB: find_bug() uses linear search to match an address to an entry in the bug
/// table. Hence there is no need to sort entries when rewriting the table.
/// `include/asm-generic/bug.h`. The first entry in the struct is an instruction
/// address encoded as a PC-relative offset. In theory, it could be an absolute
/// address if CONFIG_GENERIC_BUG_RELATIVE_POINTERS is not set, but in practice
/// the kernel code relies on it being a relative offset on x86-64.
Error LinuxKernelRewriter::readBugTable() {
BugTableSection = BC.getUniqueSectionByName("__bug_table");
if (!BugTableSection)
@@ -1178,6 +1227,8 @@ Error LinuxKernelRewriter::readBugTable() {
" referenced by bug table entry %d",
InstAddress, EntryID);
BC.MIB->addAnnotation(*Inst, "BugEntry", EntryID);

FunctionBugList[BF].push_back(EntryID);
}
}

@@ -1186,6 +1237,52 @@ Error LinuxKernelRewriter::readBugTable() {
return Error::success();
}

/// find_bug() uses linear search to match an address to an entry in the bug
/// table. Hence, there is no need to sort entries when rewriting the table.
/// When we need to erase an entry, we set its instruction address to zero.
Error LinuxKernelRewriter::rewriteBugTable() {
if (!BugTableSection)
return Error::success();

for (BinaryFunction &BF : llvm::make_second_range(BC.getBinaryFunctions())) {
if (!BC.shouldEmit(BF))
continue;

if (!FunctionBugList.count(&BF))
continue;

// Bugs that will be emitted for this function.
DenseSet<uint32_t> EmittedIDs;
for (BinaryBasicBlock &BB : BF) {
for (MCInst &Inst : BB) {
if (!BC.MIB->hasAnnotation(Inst, "BugEntry"))
continue;
const uint32_t ID = BC.MIB->getAnnotationAs<uint32_t>(Inst, "BugEntry");
EmittedIDs.insert(ID);

// Create a relocation entry for this bug entry.
MCSymbol *Label =
BC.MIB->getOrCreateInstLabel(Inst, "__BUG_", BC.Ctx.get());
const uint64_t EntryOffset = (ID - 1) * BUG_TABLE_ENTRY_SIZE;
BugTableSection->addRelocation(EntryOffset, Label, ELF::R_X86_64_PC32,
/*Addend*/ 0);
}
}

// Clear bug entries that were not emitted for this function, e.g. as a
// result of DCE, by setting their instruction address to zero.
for (const uint32_t ID : FunctionBugList[&BF]) {
if (!EmittedIDs.count(ID)) {
const uint64_t EntryOffset = (ID - 1) * BUG_TABLE_ENTRY_SIZE;
BugTableSection->addRelocation(EntryOffset, nullptr, ELF::R_X86_64_PC32,
/*Addend*/ 0);
}
}
}

return Error::success();
}
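
Reviewer sketch for `rewriteBugTable` above: each relocation is placed at `(ID - 1) * BUG_TABLE_ENTRY_SIZE`, which suggests that entry IDs are 1-based and that the instruction address is the first field of each 12-byte entry. The standalone snippet below illustrates that arithmetic only; `BugTableEntrySize` and `bugEntryOffset` are hypothetical names mirroring the diff, and the 1-based numbering is an assumption read off the `ID - 1` expression.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors BUG_TABLE_ENTRY_SIZE from the diff (size of one bug table entry).
constexpr size_t BugTableEntrySize = 12;

// Hypothetical helper: byte offset of a bug entry inside __bug_table.
// IDs are assumed to start at 1, hence the (ID - 1).
inline uint64_t bugEntryOffset(uint32_t ID) {
  assert(ID >= 1 && "bug entry IDs are assumed to start at 1");
  return static_cast<uint64_t>(ID - 1) * BugTableEntrySize;
}
```

With this layout, entry 1 sits at offset 0 and entry 3 at offset 24, so a relocation at that offset patches the entry's instruction-address field.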

/// The kernel can replace certain instruction sequences depending on hardware
/// it is running on and features specified during boot time. The information
/// about alternative instruction sequences is stored in .altinstructions
@@ -1305,6 +1402,14 @@ Error LinuxKernelRewriter::readAltInstructions() {
return Error::success();
}

Error LinuxKernelRewriter::rewriteAltInstructions() {
// Disable output of functions with alt instructions before the rewrite
// support is complete.
skipFunctionsWithAnnotation("AltInst");

return Error::success();
}

/// When the Linux kernel needs to handle an error associated with a given PCI
/// device, it uses a table stored in .pci_fixup section to locate a fixup code
/// specific to the vendor and the problematic device. The section contains a
@@ -1679,6 +1784,8 @@ Error LinuxKernelRewriter::updateStaticKeysJumpTablePostEmit() {
<< "\n\tTargetAddress: 0x" << Twine::utohexstr(TargetAddress)
<< "\n\tKeyAddress: 0x" << Twine::utohexstr(KeyAddress) << '\n';
});
(void)TargetAddress;
(void)KeyAddress;

BinaryFunction *BF =
BC.getBinaryFunctionContainingAddress(JumpAddress,
35 changes: 3 additions & 32 deletions bolt/lib/Rewrite/MachORewriteInstance.cpp
@@ -18,6 +18,7 @@
#include "bolt/Rewrite/BinaryPassManager.h"
#include "bolt/Rewrite/ExecutableFileMemoryManager.h"
#include "bolt/Rewrite/JITLinkLinker.h"
#include "bolt/Rewrite/RewriteInstance.h"
#include "bolt/RuntimeLibs/InstrumentationRuntimeLibrary.h"
#include "bolt/Utils/Utils.h"
#include "llvm/MC/MCObjectStreamer.h"
@@ -54,37 +55,6 @@ extern cl::opt<unsigned> Verbosity;
namespace llvm {
namespace bolt {

extern MCPlusBuilder *createX86MCPlusBuilder(const MCInstrAnalysis *,
const MCInstrInfo *,
const MCRegisterInfo *,
const MCSubtargetInfo *);
extern MCPlusBuilder *createAArch64MCPlusBuilder(const MCInstrAnalysis *,
const MCInstrInfo *,
const MCRegisterInfo *,
const MCSubtargetInfo *);

namespace {

MCPlusBuilder *createMCPlusBuilder(const Triple::ArchType Arch,
const MCInstrAnalysis *Analysis,
const MCInstrInfo *Info,
const MCRegisterInfo *RegInfo,
const MCSubtargetInfo *STI) {
#ifdef X86_AVAILABLE
if (Arch == Triple::x86_64)
return createX86MCPlusBuilder(Analysis, Info, RegInfo, STI);
#endif

#ifdef AARCH64_AVAILABLE
if (Arch == Triple::aarch64)
return createAArch64MCPlusBuilder(Analysis, Info, RegInfo, STI);
#endif

llvm_unreachable("architecture unsupported by MCPlusBuilder");
}

} // anonymous namespace

#define DEBUG_TYPE "bolt"

Expected<std::unique_ptr<MachORewriteInstance>>
@@ -103,7 +73,8 @@ MachORewriteInstance::MachORewriteInstance(object::MachOObjectFile *InputFile,
: InputFile(InputFile), ToolPath(ToolPath) {
ErrorAsOutParameter EAO(&Err);
auto BCOrErr = BinaryContext::createBinaryContext(
InputFile, /* IsPIC */ true, DWARFContext::create(*InputFile),
InputFile->makeTriple(), InputFile->getFileName(), nullptr,
/* IsPIC */ true, DWARFContext::create(*InputFile),
{llvm::outs(), llvm::errs()});
if (Error E = BCOrErr.takeError()) {
Err = std::move(E);
60 changes: 46 additions & 14 deletions bolt/lib/Rewrite/RewriteInstance.cpp
@@ -81,8 +81,10 @@ extern cl::list<std::string> HotTextMoveSections;
extern cl::opt<bool> Hugify;
extern cl::opt<bool> Instrument;
extern cl::opt<JumpTableSupportLevel> JumpTables;
extern cl::opt<bool> KeepNops;
extern cl::list<std::string> ReorderData;
extern cl::opt<bolt::ReorderFunctions::ReorderType> ReorderFunctions;
extern cl::opt<bool> TerminalTrap;
extern cl::opt<bool> TimeBuild;

cl::opt<bool> AllowStripped("allow-stripped",
@@ -266,6 +268,10 @@ namespace bolt {

extern const char *BoltRevision;

// Weird location for createMCPlusBuilder, but this is here to avoid a
// cyclic dependency of libCore (its natural place) and libTarget. libRewrite
// can depend on libTarget, but not libCore. Since libRewrite is the only
// user of this function, we define it here.
MCPlusBuilder *createMCPlusBuilder(const Triple::ArchType Arch,
const MCInstrAnalysis *Analysis,
const MCInstrInfo *Info,
@@ -343,8 +349,21 @@ RewriteInstance::RewriteInstance(ELFObjectFileBase *File, const int Argc,
Stderr.SetUnbuffered();
LLVM_DEBUG(dbgs().SetUnbuffered());

// Read RISCV subtarget features from input file
std::unique_ptr<SubtargetFeatures> Features;
Triple TheTriple = File->makeTriple();
if (TheTriple.getArch() == llvm::Triple::riscv64) {
Expected<SubtargetFeatures> FeaturesOrErr = File->getFeatures();
if (auto E = FeaturesOrErr.takeError()) {
Err = std::move(E);
return;
} else {
Features.reset(new SubtargetFeatures(*FeaturesOrErr));
}
}

auto BCOrErr = BinaryContext::createBinaryContext(
File, IsPIC,
TheTriple, File->getFileName(), Features.get(), IsPIC,
DWARFContext::create(*File, DWARFContext::ProcessDebugRelocations::Ignore,
nullptr, opts::DWPPathName,
WithColor::defaultErrorHandler,
@@ -537,7 +556,7 @@ Error RewriteInstance::discoverStorage() {
if (Error E = SectionNameOrErr.takeError())
return E;
StringRef SectionName = SectionNameOrErr.get();
if (SectionName == ".text") {
if (SectionName == BC->getMainCodeSectionName()) {
BC->OldTextSectionAddress = Section.getAddress();
BC->OldTextSectionSize = Section.getSize();

@@ -1845,7 +1864,8 @@ Error RewriteInstance::readSpecialSections() {
"Use -update-debug-sections to keep it.\n";
}

HasTextRelocations = (bool)BC->getUniqueSectionByName(".rela.text");
HasTextRelocations = (bool)BC->getUniqueSectionByName(
".rela" + std::string(BC->getMainCodeSectionName()));
HasSymbolTable = (bool)BC->getUniqueSectionByName(".symtab");
EHFrameSection = BC->getUniqueSectionByName(".eh_frame");
BuildIDSection = BC->getUniqueSectionByName(".note.gnu.build-id");
@@ -2031,6 +2051,15 @@ void RewriteInstance::adjustCommandLineOptions() {

if (opts::Lite)
BC->outs() << "BOLT-INFO: enabling lite mode\n";

if (BC->IsLinuxKernel) {
if (!opts::KeepNops.getNumOccurrences())
opts::KeepNops = true;

// Linux kernel may resume execution after a trap instruction in some cases.
if (!opts::TerminalTrap.getNumOccurrences())
opts::TerminalTrap = false;
}
}

namespace {
@@ -2277,9 +2306,13 @@ void RewriteInstance::processRelocations() {
return;

for (const SectionRef &Section : InputFile->sections()) {
if (cantFail(Section.getRelocatedSection()) != InputFile->section_end() &&
!BinarySection(*BC, Section).isAllocatable())
readRelocations(Section);
section_iterator SecIter = cantFail(Section.getRelocatedSection());
if (SecIter == InputFile->section_end())
continue;
if (BinarySection(*BC, Section).isAllocatable())
continue;

readRelocations(Section);
}

if (NumFailedRelocations)
@@ -3413,7 +3446,8 @@ void RewriteInstance::emitAndLink() {
ErrorOr<BinarySection &> TextSection =
BC->getUniqueSectionByName(BC->getMainCodeSectionName());
if (BC->HasRelocations && TextSection)
BC->renameSection(*TextSection, getOrgSecPrefix() + ".text");
BC->renameSection(*TextSection,
getOrgSecPrefix() + BC->getMainCodeSectionName());

//////////////////////////////////////////////////////////////////////////////
// Assign addresses to new sections.
@@ -4270,18 +4304,17 @@ RewriteInstance::getOutputSections(ELFObjectFile<ELFT> *File,
for (auto &SectionKV : OutputSections) {
ELFShdrTy &Section = SectionKV.second;

// Ignore TLS sections as they don't take any space in the file.
// Ignore NOBITS sections as they don't take any space in the file.
if (Section.sh_type == ELF::SHT_NOBITS)
continue;

// Note that address continuity is not guaranteed as sections could be
// placed in different loadable segments.
if (PrevSection &&
PrevSection->sh_offset + PrevSection->sh_size > Section.sh_offset) {
if (opts::Verbosity > 1) {
if (opts::Verbosity > 1)
BC->outs() << "BOLT-INFO: adjusting size for section "
<< PrevBinSec->getOutputName() << '\n';
}
PrevSection->sh_size = Section.sh_offset - PrevSection->sh_offset;
}

@@ -4389,6 +4422,7 @@ void RewriteInstance::patchELFSectionHeaderTable(ELFObjectFile<ELFT> *File) {
raw_fd_ostream &OS = Out->os();
const ELFFile<ELFT> &Obj = File->getELFFile();

// Mapping from old section indices to new ones
std::vector<uint32_t> NewSectionIndex;
std::vector<ELFShdrTy> OutputSections =
getOutputSections(File, NewSectionIndex);
@@ -4406,10 +4440,8 @@ void RewriteInstance::patchELFSectionHeaderTable(ELFObjectFile<ELFT> *File) {
// Write all section header entries while patching section references.
for (ELFShdrTy &Section : OutputSections) {
Section.sh_link = NewSectionIndex[Section.sh_link];
if (Section.sh_type == ELF::SHT_REL || Section.sh_type == ELF::SHT_RELA) {
if (Section.sh_info)
Section.sh_info = NewSectionIndex[Section.sh_info];
}
if (Section.sh_type == ELF::SHT_REL || Section.sh_type == ELF::SHT_RELA)
Section.sh_info = NewSectionIndex[Section.sh_info];
OS.write(reinterpret_cast<const char *>(&Section), sizeof(Section));
}

3 changes: 1 addition & 2 deletions bolt/lib/RuntimeLibs/HugifyRuntimeLibrary.cpp
@@ -11,10 +11,9 @@
//===----------------------------------------------------------------------===//

#include "bolt/RuntimeLibs/HugifyRuntimeLibrary.h"
#include "bolt/Core/BinaryFunction.h"
#include "bolt/Core/BinaryContext.h"
#include "bolt/Core/Linker.h"
#include "llvm/MC/MCStreamer.h"
#include "llvm/Support/Alignment.h"
#include "llvm/Support/CommandLine.h"

using namespace llvm;
3 changes: 0 additions & 3 deletions bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
@@ -16,10 +16,7 @@
#include "llvm/BinaryFormat/ELF.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/Format.h"
#include "llvm/Support/raw_ostream.h"

#define DEBUG_TYPE "mcplus"

7 changes: 0 additions & 7 deletions bolt/lib/Target/X86/X86MCPlusBuilder.cpp
@@ -211,13 +211,6 @@ class X86MCPlusBuilder : public MCPlusBuilder {
return false;
}

// FIXME: For compatibility with old LLVM only!
bool isTerminator(const MCInst &Inst) const override {
unsigned Opcode = Inst.getOpcode();
return Info->get(Opcode).isTerminator() || X86::isUD1(Opcode) ||
X86::isUD2(Opcode);
}

bool isIndirectCall(const MCInst &Inst) const override {
return isCall(Inst) &&
((getMemoryOperandNo(Inst) != -1) || Inst.getOperand(0).isReg());
33 changes: 31 additions & 2 deletions bolt/test/X86/bolt-address-translation-yaml.test
@@ -18,13 +18,14 @@ RUN: | FileCheck --check-prefix CHECK-BOLT-YAML %s

WRITE-BAT-CHECK: BOLT-INFO: Wrote 5 BAT maps
WRITE-BAT-CHECK: BOLT-INFO: Wrote 4 function and 22 basic block hashes
WRITE-BAT-CHECK: BOLT-INFO: BAT section size (bytes): 344
WRITE-BAT-CHECK: BOLT-INFO: BAT section size (bytes): 384

READ-BAT-CHECK-NOT: BOLT-ERROR: unable to save profile in YAML format for input file processed by BOLT
READ-BAT-CHECK: BOLT-INFO: Parsed 5 BAT entries
READ-BAT-CHECK: PERF2BOLT: read 79 aggregated LBR entries

YAML-BAT-CHECK: functions:
# Function not covered by BAT - has insns in basic block
YAML-BAT-CHECK: - name: main
YAML-BAT-CHECK-NEXT: fid: 2
YAML-BAT-CHECK-NEXT: hash: 0x9895746D48B2C876
@@ -35,6 +36,34 @@ YAML-BAT-CHECK-NEXT: - bid: 0
YAML-BAT-CHECK-NEXT: insns: 26
YAML-BAT-CHECK-NEXT: hash: 0xA900AE79CFD40000
YAML-BAT-CHECK-NEXT: succ: [ { bid: 3, cnt: 0 }, { bid: 1, cnt: 0 } ]
# Calls from no-BAT to BAT function
YAML-BAT-CHECK: - bid: 28
YAML-BAT-CHECK-NEXT: insns: 13
YAML-BAT-CHECK-NEXT: hash: 0xB2F04C1F25F00400
YAML-BAT-CHECK-NEXT: calls: [ { off: 0x21, fid: [[#SOLVECUBIC:]], cnt: 25 }, { off: 0x2D, fid: [[#]], cnt: 9 } ]
# Function covered by BAT with calls
YAML-BAT-CHECK: - name: SolveCubic
YAML-BAT-CHECK-NEXT: fid: [[#SOLVECUBIC]]
YAML-BAT-CHECK-NEXT: hash: 0x6AF7E61EA3966722
YAML-BAT-CHECK-NEXT: exec: 25
YAML-BAT-CHECK-NEXT: nblocks: 15
YAML-BAT-CHECK-NEXT: blocks:
YAML-BAT-CHECK: - bid: 3
YAML-BAT-CHECK-NEXT: insns: [[#]]
YAML-BAT-CHECK-NEXT: hash: 0xDDA1DC5F69F900AC
YAML-BAT-CHECK-NEXT: calls: [ { off: 0x26, fid: [[#]], cnt: [[#]] } ]
YAML-BAT-CHECK-NEXT: succ: [ { bid: 5, cnt: [[#]] }
# Function covered by BAT - doesn't have insns in basic block
YAML-BAT-CHECK: - name: usqrt
YAML-BAT-CHECK-NEXT: fid: [[#]]
YAML-BAT-CHECK-NEXT: hash: 0x99E67ED32A203023
YAML-BAT-CHECK-NEXT: exec: 21
YAML-BAT-CHECK-NEXT: nblocks: 5
YAML-BAT-CHECK-NEXT: blocks:
YAML-BAT-CHECK: - bid: 1
YAML-BAT-CHECK-NEXT: insns: [[#]]
YAML-BAT-CHECK-NEXT: hash: 0xD70DC695320E0010
YAML-BAT-CHECK-NEXT: succ: {{.*}} { bid: 2, cnt: [[#]] }

CHECK-BOLT-YAML: pre-processing profile using YAML profile reader
CHECK-BOLT-YAML-NEXT: 1 out of 16 functions in the binary (6.2%) have non-empty execution profile
CHECK-BOLT-YAML-NEXT: 5 out of 16 functions in the binary (31.2%) have non-empty execution profile
2 changes: 1 addition & 1 deletion bolt/test/X86/bolt-address-translation.test
@@ -37,7 +37,7 @@
# CHECK: BOLT: 3 out of 7 functions were overwritten.
# CHECK: BOLT-INFO: Wrote 6 BAT maps
# CHECK: BOLT-INFO: Wrote 3 function and 58 basic block hashes
# CHECK: BOLT-INFO: BAT section size (bytes): 816
# CHECK: BOLT-INFO: BAT section size (bytes): 924
#
# usqrt mappings (hot part). We match against any key (left side containing
# the bolted binary offsets) because BOLT may change where it puts instructions