This repository has been archived by the owner on Jul 1, 2023. It is now read-only.

[BOLT] [NFC] Remove special DWARF expressions handling from LLVM #196

Closed
wants to merge 921 commits into from
921 commits
ef9bc41
[BOLT] Delete ExecutableFileMemoryManager::registerNoteSection()
maksfb Feb 24, 2020
1bd8813
[BOLT][NFC] Remove unused BinarySection member functions
maksfb Feb 25, 2020
d409c68
[BOLT][NFC] Minor refactoring of RewriteInstance
maksfb Feb 25, 2020
dadb567
[BOLT] Fix shrink wrapping to check pops
rafaelauler Feb 19, 2020
13093cc
[BOLT][NFC] Factor out relocation processing
maksfb Feb 25, 2020
2c54b7f
[BOLT] Fix begin decrementing
Mar 3, 2020
9864776
[BOLT][NFC] Get rid of BestFit parameter
maksfb Mar 3, 2020
876a732
[BOLT] Remove allow-section-relocations option
maksfb Mar 3, 2020
78e4214
[BOLT] Mark functions containing data as non-simple
Mar 3, 2020
7e83a0c
[BOLT] Uniquify names of local symbols
Mar 5, 2020
72471e7
[BOLT] Refactor emission of original .eh_frame
maksfb Mar 7, 2020
d5c9267
[BOLT] Refactor ELF parts of instrumentation code
maksfb Mar 9, 2020
1cffaaf
[BOLT] Refactor code and data emission code
maksfb Mar 6, 2020
d963884
[BOLT] Refactor section prefixes
maksfb Mar 11, 2020
69d1e88
[BOLT] Refactor ELF symbol table rewriting code
maksfb Feb 27, 2020
55c46c0
[BOLT][DWARF] Add support for base address in DWARF location lists
maksfb Mar 25, 2020
c3b2320
[BOLT] Verify exceptions action table equivalence in ICF
maksfb Mar 31, 2020
cac0abb
[BOLT] Fix ICF non-determinism in non-relocation mode
maksfb Apr 5, 2020
75a872f
[BOLT] Speedup ICF by better function hashing
maksfb Apr 7, 2020
b2613fc
[BOLT] Further speedup ICF
maksfb Apr 8, 2020
8984e5c
[BOLT-X86] Fix instrumentation issue with indirect calls
rafaelauler Apr 7, 2020
98ea95c
[BOLT] Speedup RTDyld external symbol resolution
maksfb Nov 11, 2019
acbb22e
[BOLT] Fix .eh_frame update with ICF in non-relocation mode
maksfb Apr 16, 2020
ce288a1
[BOLT] Emit ICF symbols for large functions
maksfb Apr 16, 2020
ce92b98
[BOLT] Option to control .text alignment
maksfb Apr 19, 2020
d0f6e86
[BOLT] Do not emit old .eh_frame in relocation mode
maksfb Apr 19, 2020
57b05a6
[BOLT] Option to fail if invalid profile detected
maksfb Apr 22, 2020
ed19f13
[BOLT] Speedup PLT processing
maksfb Apr 24, 2020
c2c2d8b
[BOLT][NFC] Change wording while reporting functions stats
maksfb Apr 24, 2020
65eb201
[BOLT] Change symbol handling for secondary function entries
maksfb Apr 20, 2020
dd6546e
[BOLT][BFC] Refactor code for adding secondary function entries
maksfb Apr 27, 2020
3b4af7c
[BOLT] Cover PIC jump table reference in non-strict mode
maksfb Apr 27, 2020
3525619
[BOLT] Fix dyno stats after ICF in non-reloc mode
maksfb May 2, 2020
4316646
[BOLT] Introduce isIgnored() function attribute
maksfb May 3, 2020
cacd1cb
[BOLT] Introduce lite processing mode without relocations
maksfb May 3, 2020
eab8b76
Check runtime lib format within archiver
lxfind May 4, 2020
27ce83d
[BOLT] Ignore kernel interrupts by default
maksfb May 6, 2020
8c6549f
[BOLT] Change .debug_line emission for non-simple functions
maksfb May 6, 2020
8874183
[BOLT] Add option to tag version
rafaelauler May 7, 2020
a6d0712
[BOLT] Remove StringRef from IndirectCallProfile
maksfb May 15, 2020
3586a86
[BOLT] Refactor profile-handling code
maksfb May 8, 2020
d67ad8f
Remove const call to take_front
lxfind May 21, 2020
26d6c8e
Use shuffle instead of random_shuffle
lxfind May 21, 2020
fdb05d8
Emit functions on MachO
May 26, 2020
c4574e5
Refactor runtime library
lxfind May 21, 2020
1bd3c99
Adding automatic huge page support
lxfind May 2, 2020
5b878a3
[BOLT] Update section index for symbols from unemitted functions
maksfb Jun 10, 2020
578ca97
Generate heatmap for linux kernel
takhandipu Jun 11, 2020
25ec72e
Provide a redundant declaration of KernelBaseAddr
Jun 15, 2020
bf19156
Link functions on MachO
Jun 13, 2020
80e0083
Be more flexible when locating runtime libs
lxfind Jun 16, 2020
acb4384
[BOLT] Support for lite mode with relocations
maksfb Jun 15, 2020
bab2927
[BOLT] Disable trapping on AVX-512 by default
maksfb Jun 18, 2020
da8417a
[BOLT] Support -hot-text in lite mode
maksfb Jun 18, 2020
fc343ec
[BOLT] Fix memory error
maksfb Jun 19, 2020
8f05818
[BOLT] Properly register symbols at secondary entry points
maksfb Jun 22, 2020
8cbe626
[BOLT] Fixes for scanExternalRefs()
maksfb Jun 22, 2020
a49c9d7
[BOLT] Create entry points for internal refs from external code
maksfb Jun 22, 2020
dd587ac
[BOLT] Ignore functions that failed validation
maksfb Jun 22, 2020
2d2b00d
[BOLT] Allow to overwrite -use-old-text option
maksfb Jun 22, 2020
6a57050
[BOLT] Fix getNewValueForSymbol()
maksfb Jun 22, 2020
e227010
[BOLT] Add '-force-patch' to forcefully patch old entries
maksfb Jun 22, 2020
c6122d2
[BOLT] Ignore duplicate relocations
maksfb Jun 23, 2020
1348698
[perf2bolt] Relax rules for aggregation in strict mode
maksfb Jun 25, 2020
1d70551
[BOLT] Add static binary support
maksfb Jun 26, 2020
c182eb8
[BOLT] Do not emit duplicate org symbols
maksfb Jun 24, 2020
2fb54b6
Update X86/pre-aggregated-perf.test
rafaelauler Jun 25, 2020
ea1d188
[TESTS] Re-add issue20/issue26 tests
rafaelauler Jul 1, 2020
81ddfb6
[BOLT] Skip R_X86_64_PLT32 relocation verification
maksfb Jul 1, 2020
357efca
[Bolt] Improve coding style for runtime lib related code
lxfind Jul 2, 2020
cd288e4
Support for CDF distribution of heatmap buckets
takhandipu Jun 18, 2020
376a3f9
[BOLT] Ignore addresses from non-allocatable sections
maksfb Jul 6, 2020
49dbcf5
Report stale sample count and percentage
takhandipu Jul 7, 2020
e9dc0a0
[BOLT] Add the FeatureMiner pass to extract Calder's features.
angelica-moreira Jul 8, 2020
4e9c55e
[BOLT] Fix fix-branches in presence of JRCXZ and friends
rafaelauler Jul 16, 2020
9301fe5
Revert "[BOLT] Add the FeatureMiner pass to extract Calder's features."
rafaelauler Jul 17, 2020
709c438
[BOLT] Allow to specify -reorder-functions option multiple times
maksfb Jul 17, 2020
10e90f8
Extracted sequence insertion function into helper function
aaupov Jul 18, 2020
866c7d8
Handle intra-function call in instrumentOneTarget
aaupov Jul 18, 2020
e66aff7
[BOLT] Fix hot_end symbol update with user function order
rafaelauler Jul 24, 2020
c277543
[BOLT] Fix stack alignment for runtime lib
rafaelauler Jul 27, 2020
6d393d5
Added execution count threshold option
aaupov Jul 28, 2020
3d28992
[perf2bolt] Fix for SKL bug workaround
maksfb Aug 4, 2020
2725163
Linux kernel marker to update special sections
takhandipu Aug 4, 2020
6938c1e
Print when we are operating in lite mode
rafaelauler Aug 6, 2020
7bc1c33
Add first bits to support emitting more than 255 sections on MachO
Jul 22, 2020
b183e54
[perf2bolt] Issue error when writing YAML for BOLTed input
maksfb Aug 13, 2020
1dc7172
Fix BAT cold-to-hot mappings
rafaelauler Aug 18, 2020
8920da6
Bugfix for splitting critical edges in shrink wrapping
aaupov Aug 21, 2020
6a9a296
[BOLT] Do no map sections with zero address
maksfb Sep 14, 2020
5d7386b
[BOLT] Eliminate "shallow" function lookup
maksfb Sep 14, 2020
5e74b22
[BOLT][Linux] Initial support for special Linux Kernel sections
maksfb Sep 15, 2020
3882638
Set InputFileOffset for MachO sections
Sep 24, 2020
bf30398
postProcessEntryPoints: return after setIgnored and setSimple are set
aaupov Sep 30, 2020
50a5c1e
Read the entry point address on MachO
Oct 1, 2020
e41f7d1
[BOLT] Fix sign issue when validating X86 relocations
rafaelauler Oct 5, 2020
36c9c0e
Add -check-overlapping-elements option
Oct 5, 2020
5ad0da4
Precompute symbol section indices on MachO
Oct 6, 2020
499e7de
[BOLT] Refactor relocations class impl per arch, NFC
rafaelauler Oct 7, 2020
fc1198b
Add ToolPath field to MachORewriteInstance
Oct 8, 2020
d5d93bc
[BOLT] Refactor PatchEntries pass
maksfb Oct 9, 2020
0437140
[BOLT] Disable PatchEntries in non-relocation mode on ELF
maksfb Oct 10, 2020
edd9b7f
Add support for emitting code into a new segment on MachO
Oct 3, 2020
183edce
[BOLT] Change label name for cold fragments
maksfb Oct 12, 2020
0add15d
Fix handling of _end symbol on MachO
Oct 12, 2020
9d60cdc
[BOLT] Emit symbol size for functions
maksfb Oct 12, 2020
11606f3
Add first bits to support emitting instrumented code on MachO
Oct 12, 2020
b78828f
[BOLT] Fix debug line info in lite relocation mode
maksfb Oct 13, 2020
6a65776
[BOLT] Refactor reading of debug line info
maksfb Oct 13, 2020
ceb6b65
[BOLT] In shrinkwrap, do not split prefix/instr
rafaelauler Oct 14, 2020
2c861df
Add first bits to cross-compile the runtime for OSX
Oct 15, 2020
26601a5
[BOLT][DWARF] Streamline processing of DWARF unit DIEs
maksfb Oct 16, 2020
0f4a72f
Inject a hook into the entry point on MachO
Oct 15, 2020
13a20b6
[BOLT] Ignore __hot_start, __hot_end from input
rafaelauler Oct 17, 2020
7079462
[BOLT] Enable lite mode by default with relocations
maksfb Oct 17, 2020
da27f1d
[BOLT] Fix PatchEntries pass
maksfb Oct 21, 2020
7ffb486
Add pass number to dot dump filename
aaupov Oct 22, 2020
1533f92
[BOLT] Always keep dynamic symbols defined
maksfb Oct 22, 2020
468062d
[BOLT] Fix no-asserts build
rafaelauler Oct 31, 2020
c3e9378
[DOCS] Add instrumentation instructions to README
rafaelauler Oct 30, 2020
f155de3
[BOLT] Please sanitizers
rafaelauler Oct 30, 2020
c090188
[BOLT] Remove threaded EliminateUnreachableBlock version
rafaelauler Nov 3, 2020
92e61a3
[BOLT] Fix C++ exceptions for shared objects
maksfb Nov 4, 2020
d9e2ddf
[BOLT][PR] Handle TLS relocations on AArch64
yota9 Nov 5, 2020
a588d12
Extract BinaryContext::registerFragment
aaupov Nov 6, 2020
013ab98
processInterproceduralReferences: record references to cold fragments…
aaupov Nov 6, 2020
09f5ed6
Conservatively handle jump tables in split functions
aaupov Nov 6, 2020
50ef29b
Lost in rebase: call registerFragment with a reference to TargetBF
aaupov Nov 6, 2020
463ce24
Improve cold fragment name matching
aaupov Nov 9, 2020
08be0d0
[BOLT] Disable DynoStats printing after SCTC
aaupov Nov 10, 2020
8b13e5e
Minimize X86/shrinkwrapping-critedge test case
aaupov Nov 11, 2020
658fbd2
[BOLT] Debug logging in analyzeJumpTable
aaupov Nov 12, 2020
91a6e0f
[BOLT] Add invalid offset for a JT entry pointing to a fragment
aaupov Nov 12, 2020
7a01cd0
[BOLT] Support jump tables in split fragments with entries pointing b…
aaupov Nov 12, 2020
5ab4d00
a new version of hfsort+
spupyrev Nov 14, 2020
4ca141a
[BOLT] Fix data race while running split functions pass
maksfb Nov 16, 2020
a11b02f
Link the instrumentation runtime on OSX
Nov 17, 2020
450b88d
[BOLT] Handle insertion of updated CFI at the first basic block
aaupov Nov 18, 2020
9a04b54
Refactor syscall wrappers for OSX
Nov 19, 2020
9ff0c8b
Inject instrumentation's global dtor on MachO
Nov 20, 2020
10895c4
[BOLT] Fix shrinkwrapping bug when changing frame alignment
rafaelauler Dec 4, 2020
09e8658
[TEST] Remove dependency on debug output
rafaelauler Dec 9, 2020
ae60c9d
[BOLT] Add threshold options for lite mode
rafaelauler Dec 30, 2020
f5fa8a5
[PERF2BOLT] Relax segment matching requirements
rafaelauler Jan 11, 2021
2850777
[BOLT] Fix missing newlines in debug prints
aaupov Jan 20, 2021
bded93c
[BOLT] Fix operator new signature
Jan 20, 2021
6519118
[BOLT] Enable intToStr for MacOS
Jan 21, 2021
ffa8f71
an updated version of ExtTSP
spupyrev Jan 28, 2021
9923646
[BOLT] Add support for __literal16 section on MachO
Jan 28, 2021
35ea7d1
[BOLT] Add support for dumping counters on MacOS
Jan 28, 2021
0f68e4b
[BOLT] Add support for dumping profile on MacOS
Jan 28, 2021
53e617f
[BOLT] Add support for reading profile on Mach-O
Jan 30, 2021
4557216
Rebase: Merge BOLT codebase in monorepo
aaupov Dec 2, 2020
b661fe3
[BOLT] Update license headers
rafaelauler Mar 16, 2021
b8875b2
Update DW_AT_stmt_list for .debug_types
ayermolo Feb 17, 2021
882967a
Fix license for a few remaining files
rafaelauler Mar 17, 2021
0da3ee4
Fix up test for Update DW_AT_stmt_list for .debug_types
ayermolo Mar 18, 2021
caa6a44
[BOLT][PR] readDynamicRelocations: Skip NONE relocations
yota9 Feb 17, 2021
8e53a2d
[BOLT] Ignore TBSS section at layout time
maksfb Mar 5, 2021
e390f96
[BOLT][PR] Instrumentation: Introduce -no-counters-clear and -wait-fo…
yota9 Mar 10, 2021
b9b0349
[BOLT] Fix false references to zero-sized objects
maksfb Mar 15, 2021
3acfc4b
[BOLT] Fix instrumentation bug in duplicated JTs
rafaelauler Mar 15, 2021
0b03337
[BOLT] Do not assert on jump table heuristic failure
maksfb Mar 23, 2021
b8a9245
Rebase: [cherry-pick] [BOLT] Add option to skip writing an output file
aaupov Mar 29, 2021
460ba02
[BOLT] Refactor SectionPatchers map to a Patcher in BinarySection
aaupov Mar 18, 2021
41f1dad
[BOLT] Remove cantFail in getAddressRanges calls
rafaelauler Apr 6, 2021
ccd63a9
[BOLT] Fix value invalidation bug in runtimelib
rafaelauler Apr 8, 2021
db65d39
Rebase: [BOLT][NFC] Expand auto types
aaupov Apr 8, 2021
6c0b2b5
[BOLT][NFC] Use const reference for MCInstrDesc
aaupov Apr 18, 2021
1c8c456
[BOLT][NFC] Remove RewriteInstance::EHFrame
maksfb Apr 21, 2021
0bcd158
[BOLT] Remove -dump-eh-frame option
maksfb Apr 21, 2021
4592e45
[BOLT][NFC] Remove CFIReaderWriter::fdes()
maksfb Apr 21, 2021
450b272
[perf2bolt] Further relax segment matching
maksfb Apr 30, 2021
013482a
Rebase: [BOLT][NFC] Remove unneeded includes with include-what-you-use
aaupov Apr 30, 2021
6c2c9ed
Rebase: [BOLT][NFC] Avoid binutils in tests
aaupov May 4, 2021
b8e500f
[BOLT][NFC] Avoid unnecessary copies with push_back
aaupov May 8, 2021
ea18a73
[PR] Fix bb reordering optimization
Apr 23, 2021
aa88709
[PR] Fix tests build with -no-pie option
yota9 May 11, 2021
58ed088
[PR] Add missing includes
yota9 May 11, 2021
72f3094
[BOLT][NFC] Follow LLVM variable initialization style
maksfb May 13, 2021
25dbf22
[BOLT][NFC] Address warning about ProgramPoint implicit copy constructor
aaupov May 10, 2021
37eed9d
[BOLT][NFC] Change interface for searching relocations
maksfb May 13, 2021
793833f
[BOLT] Preserve original jump table relocations
maksfb May 13, 2021
80f8a68
[BOLT][NFC][TEST] Added llvm-dwarfdump and llvm-mc to BOLT_TEST_DEPS
aaupov May 13, 2021
4d7f3fd
Rebase: [BOLT] DebugFission Support
aaupov Apr 1, 2021
8f7f090
[PR] Introduce loop inversion pass
yota9 May 11, 2021
23d75aa
[PR] Instrumentation: Emit paddings to preserve data alignment
yota9 May 14, 2021
a8896f8
[BOLT][NFC] Disable ProcessAllSections in RuntimeDyld
maksfb May 26, 2021
0a51b7a
Added Github Actions workflow for building Docker image
aaupov May 14, 2021
db9cd7f
[BOLT] Resolve JumpTable namespace issue in pseudo probe decoder migr…
luoj1 Jun 3, 2021
b9d84e4
[BOLT][TEST] Fix test case to conform to analyzePICJumpTable pattern …
aaupov Jun 2, 2021
cf30f4d
[BOLT][NFC] Fix debug info printouts for inlined functions
maksfb Jun 4, 2021
a525069
[BOLT] Hugify: check for THP support via sysfs
aaupov Jun 3, 2021
32ee104
[BOLT] Change how DF DWO logging is handled
ayermolo Jun 9, 2021
ae29532
Rebase: [BOLT] Refactor the Pseudo Probe decoder
aaupov Jun 9, 2021
a02b16a
[BOLT] Add pseudo probe print out for all addesses
luoj1 Jun 11, 2021
032efcf
[BOLT][CSSPGO] Pseudo probe decoding
luoj1 Jun 11, 2021
18e4a9c
[BOLT][NFC] Suppress addList override warning
aaupov Jun 3, 2021
e57d8cd
[BOLT] Fix rodata load simplification pass
maksfb Jun 13, 2021
8609e16
[PR] Instrumentation: Disable signals on mutex lock
yota9 Jun 4, 2021
02154fb
[PR] Patch allocatable relocations for AArch64
yota9 Jun 1, 2021
fd9fbed
[BOLT][DebugFission] Fix reading support for DWP
ayermolo Jun 16, 2021
f18c7e4
[PR][BOLT] Print revision in perf2bolt and bolt-diff modes"
Sameeranjoshi Jun 8, 2021
90caf47
[BOLT] Fix undefined symbol warnings/errors
maksfb Jun 18, 2021
7181e52
Throw an error in instrument for dynamic libs
jthamanfb Jun 21, 2021
5a9e664
[BOLT][TESTS] Fix ICF test case
maksfb Jun 23, 2021
71f0047
[BOLT] Handle R_X86_64_64 in flushPendingRelocations
aaupov Jun 24, 2021
817c225
[BOLT][NFC] Use MCPlusBuilder::isPseudo
aaupov Jun 16, 2021
1e88071
[BOLT][CSSPGO] Relate decoded pseudo probe basic blocks
luoj1 Jun 25, 2021
9bd30f8
[BOLT][NFC] Readability improvements in X86,Aarch64 MCPlusBuilder
aaupov Jun 18, 2021
cb62b32
[BOLT][NFC] Refactor handlePCRelOperand
aaupov Jun 18, 2021
a3c768d
[BOLT][NFC] Always process runtime relocations
maksfb Jun 22, 2021
863b4e2
[BOLT][NFC] Delete MoveRelocations entirely
aaupov Jun 26, 2021
06eb7d9
[BOLT][NFC] Un-inline adding external references out of disassemble loop
aaupov Jun 26, 2021
805d4d2
[BOLT][NFC] Un-inline indirect branch handling out of disassemble loop
aaupov Jun 26, 2021
55be664
[BOLT][NFC] Un-inline checking AArch64 linker veneers out of disassem…
aaupov Jun 26, 2021
cb49286
[BOLT][TESTS] Remove dynamic relocations from YAML tests
maksfb Jun 30, 2021
d64c4ad
[BOLT][DWARF] Fix writing out dwo with DWP as input
ayermolo Jun 18, 2021
983fdc1
[BOLT] Read all dynamic relocations and refactor code
maksfb Jun 30, 2021
04f1f11
[BOLT][NFC] Resolved all clang-12 warnings for bolt
jthamanfb Jun 29, 2021
572e468
[BOLT] Add support for .plt.sec and refactor PLT-reading code
maksfb Jun 30, 2021
75f898a
fixup! Rebase: Merge BOLT codebase in monorepo
aaupov Jul 9, 2021
5e5d034
[BOLT] Dump dynamic execution per instruction opcode
bzinodev May 25, 2021
2dc731a
[BOLT] Tail duplication analysis pass
jthamanfb Jul 1, 2021
975de53
fixup! [BOLT] Refactor the Pseudo Probe decoder
luoj1 Jul 15, 2021
253508d
[BOLT][CSSPGO] Encode pseudo probe section to binary
luoj1 Jul 15, 2021
ac6ab95
[BOLT][CSSPGO] Handle indirect call promotion in Pseudo Probe Integra…
luoj1 Jul 16, 2021
0a7a6aa
RewriteInstance: account .stab and .stabstr as debug sections
Jun 25, 2021
0e443f8
Replace LLVM's README with our own
rafaelauler Jul 21, 2021
c9f0287
[BOLT][GitHub] Updated the branch that triggers docker-image Action
aaupov Jul 16, 2021
fa8b0a4
[BOLT] Tail Duplication active pass
jthamanfb Jul 16, 2021
fb12b4e
[BOLT] Update build instructions in README
aaupov Jul 28, 2021
ac54ba7
[BOLT] Support PLT sections with variable entry sizes
maksfb Jul 14, 2021
c770c11
[BOLT][NFC] Un-const some MCPlusBuilder methods in preparation for ta…
aaupov Jul 22, 2021
914fbfc
fixup! Rebase: Merge BOLT codebase in monorepo
aaupov Jul 21, 2021
08c9c99
[BOLT][NFC] Unify isTailCall interface across X86 and AArch64
aaupov Jul 30, 2021
0c4ae33
[PR] Instrumentation: Generate and use _start and _fini trampolines
Jun 18, 2021
e264ce8
[PR] Instrumentation: Add readlink and getdents support
ElvinaYakubova Jan 18, 2021
c09246b
[PR] Instrumentation: Add support for opening libs based on links /pr…
ElvinaYakubova Jan 18, 2021
8664d97
[PR] Instrumentation: Initial support for static executables
Jun 20, 2021
03700b7
[PR] Instrumentation: Fix runtime handlers for PIE files
yota9 Jun 23, 2021
1437d30
[PR] README: remove note about experimental status of instrumentation
Jun 25, 2021
c09bf3f
[PR] Instrumentation: Introduce instrumentation-binpath argument
Jul 30, 2021
083a705
[PR] Instrumentation: Fix start and fini trampoline pointers
yota9 Jul 30, 2021
db9a6a0
[PR] Instrumentation: Avoid generating GOT table in instrumentation l…
yota9 Jul 21, 2021
440f71f
[PR] Tests: add instrumentation tests for PIE exec & shared libs
Jun 19, 2021
a1a27f3
[DWP] Refactoring llvm-dwp in to a library
ayermolo Aug 4, 2021
ddb78af
[DWP] Fix for Refactoring llvm-dwp in to a library
ayermolo Aug 4, 2021
dafed12
[DWP] Refactoring llvm-dwp in to a library part 2
ayermolo Aug 4, 2021
743e9a2
[BOLT] [NFC] Remove special DWARF expressions handling from LLVM
rafaelauler Jul 1, 2021
1 change: 1 addition & 0 deletions .dockerignore
@@ -0,0 +1 @@
.git
31 changes: 31 additions & 0 deletions .github/workflows/Dockerfile
@@ -0,0 +1,31 @@
FROM ubuntu:20.04 AS builder

ARG DEBIAN_FRONTEND=noninteractive
ENV TZ=UTC

RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates git \
    build-essential cmake ninja-build python3 libjemalloc-dev && \
    rm -rf /var/lib/apt/lists

WORKDIR /home/bolt

COPY . llvm-bolt

WORKDIR build

RUN cmake -G Ninja ../llvm-bolt/llvm \
    -DLLVM_ENABLE_PROJECTS="bolt" \
    -DLLVM_TARGETS_TO_BUILD="X86;AArch64" \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DCMAKE_EXE_LINKER_FLAGS="-Wl,--push-state -Wl,-whole-archive -ljemalloc_pic -Wl,--pop-state -lpthread -lstdc++ -lm -ldl" \
    -DCMAKE_INSTALL_PREFIX=/home/bolt/install

RUN ninja check-bolt && \
    ninja install-llvm-bolt install-perf2bolt install-merge-fdata \
    install-llvm-boltdiff install-bolt_rt

FROM ubuntu:20.04

COPY --from=builder /home/bolt/install /usr/local
16 changes: 16 additions & 0 deletions .github/workflows/docker-image.yml
@@ -0,0 +1,16 @@
name: Docker Image CI

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2
    - name: Build the Docker image and test
      run: docker build . --file .github/workflows/Dockerfile --tag ubuntu-bolt:$(date +%s)
266 changes: 185 additions & 81 deletions README.md
@@ -1,106 +1,210 @@
# The LLVM Compiler Infrastructure
# BOLT

This directory and its sub-directories contain source code for LLVM,
a toolkit for the construction of highly optimized compilers,
optimizers, and run-time environments.
BOLT is a post-link optimizer developed to speed up large applications.
It achieves the improvements by optimizing the application's code layout based on
an execution profile gathered by a sampling profiler, such as the Linux `perf` tool.
An overview of the ideas implemented in BOLT, along with a discussion of its
potential and current results, is available in the
[CGO'19 paper](https://research.fb.com/publications/bolt-a-practical-binary-optimizer-for-data-centers-and-beyond/).

The README briefly describes how to get started with building LLVM.
For more information on how to contribute to the LLVM project, please
take a look at the
[Contributing to LLVM](https://llvm.org/docs/Contributing.html) guide.
## Input Binary Requirements

## Getting Started with the LLVM System
BOLT operates on X86-64 and AArch64 ELF binaries. At the minimum, the binaries
should have an unstripped symbol table, and, to get maximum performance gains,
they should be linked with relocations (`--emit-relocs` or `-q` linker flag).

Taken from https://llvm.org/docs/GettingStarted.html.
BOLT disassembles functions and reconstructs the control flow graph (CFG)
before it runs optimizations. Since this is a nontrivial task,
especially when indirect branches are present, we rely on certain heuristics
to accomplish it. These heuristics have been tested on code generated by the
Clang and GCC compilers. The main requirement for C/C++ code is not to rely
on code layout properties, such as function pointer deltas.
Assembly code can be processed too. Requirements for it include a clear
separation of code and data, with data objects being placed into data
sections/segments. If indirect jumps are used for intra-function control
transfer (e.g., jump tables), the code patterns should match those
generated by Clang/GCC.

### Overview
NOTE: BOLT is currently incompatible with the `-freorder-blocks-and-partition`
compiler option. Since GCC8 enables this option by default, you have to
explicitly disable it by adding `-fno-reorder-blocks-and-partition` flag if
you are compiling with GCC8.

PIE and .so support has been added recently. Please report bugs if you
encounter any issues.

Welcome to the LLVM project!
## Installation

The LLVM project has multiple components. The core of the project is
itself called "LLVM". This contains all of the tools, libraries, and header
files needed to process intermediate representations and convert them into
object files. Tools include an assembler, disassembler, bitcode analyzer, and
bitcode optimizer. It also contains basic regression tests.
### Docker Image

C-like languages use the [Clang](http://clang.llvm.org/) front end. This
component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode
-- and from there into object files, using LLVM.
You can build and use the docker image containing BOLT using our [docker file](./bolt/utils/docker/Dockerfile).
Alternatively, you can build BOLT manually using the steps below.

### Manual Build

BOLT heavily uses LLVM libraries, and by design, it is built as one of the LLVM
tools. The build process is not much different from a regular LLVM build.
The following instructions assume that you are running under Linux.

Start with cloning LLVM and BOLT repos:

```
> git clone https://github.com/facebookincubator/BOLT llvm-bolt
> mkdir build
> cd build
> cmake -G Ninja ../llvm-bolt/llvm -DLLVM_TARGETS_TO_BUILD="X86;AArch64" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="bolt"
> ninja
```

Other components include:
the [libc++ C++ standard library](https://libcxx.llvm.org),
the [LLD linker](https://lld.llvm.org), and more.
`llvm-bolt` will be available under `bin/`. Add this directory to your path to
ensure the rest of the commands in this tutorial work.

### Getting the Source Code and Building LLVM
Note that we use a specific snapshot of LLVM monorepo as we currently
rely on a set of patches that are not yet upstreamed.

## Optimizing BOLT's Performance

The LLVM Getting Started documentation may be out of date. The [Clang
Getting Started](http://clang.llvm.org/get_started.html) page might have more
accurate information.
BOLT runs many internal passes in parallel. If you foresee heavy usage of
BOLT, you can improve the processing time by linking against one of the memory
allocation libraries with good support for concurrency. E.g., to use jemalloc:

This is an example work-flow and configuration to get and build the LLVM source:
```
> sudo yum install jemalloc-devel
> LD_PRELOAD=/usr/lib64/libjemalloc.so llvm-bolt ....
```
Or, if you would rather use tcmalloc:
```
> sudo yum install gperftools-devel
> LD_PRELOAD=/usr/lib64/libtcmalloc_minimal.so llvm-bolt ....
```

## Usage

1. Checkout LLVM (including related sub-projects like Clang):
For a complete practical guide to using BOLT, see [Optimizing Clang with BOLT](./bolt/docs/OptimizingClang.md).

* ``git clone https://github.com/llvm/llvm-project.git``
### Step 0

* Or, on windows, ``git clone --config core.autocrlf=false
https://github.com/llvm/llvm-project.git``
In order to allow BOLT to re-arrange functions (in addition to re-arranging
code within functions) in your program, it needs a little help from the linker.
Add `--emit-relocs` to the final link step of your application. You can verify
the presence of relocations by checking for `.rela.text` section in the binary.
BOLT will also report if it detects relocations while processing the binary.
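
One way to perform the `.rela.text` check described above is to scan the section headers printed by `readelf`. The sketch below is an illustration rather than part of BOLT: the helper name `has_rela_text` and the path `./myapp` are made up for this example, and it assumes binutils `readelf` is installed.

```shell
# has_rela_text: hypothetical helper, not part of BOLT.
# Reads a section-header listing (e.g. from `readelf -S -W`) on stdin and
# succeeds if a section named exactly .rela.text is present.
has_rela_text() {
    grep -Eq '(^|[[:space:]])\.rela\.text([[:space:]]|$)'
}

# Usage (./myapp is a placeholder path):
#   readelf -S -W ./myapp | has_rela_text && echo "relocations present"
```

Keeping the parsing separate from the `readelf` call means the same helper can be fed output from another section-listing tool, provided the section name appears as a whitespace-delimited field.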

### Step 1: Collect Profile

This step is different for different kinds of executables. If you can invoke
your program to run on a representative input from a command line, then check
**For Applications** section below. If your program typically runs as a
server/service, then skip to **For Services** section.

2. Configure and build LLVM and Clang:
The version of `perf` command used for the following steps has to support
`-F brstack` option. We recommend using `perf` version 4.5 or later.

#### For Applications

This assumes you can run your program from a command line with a typical input.
In this case, simply prepend the command line invocation with `perf`:
```
$ perf record -e cycles:u -j any,u -o perf.data -- <executable> <args> ...
```

#### For Services

Once you get the service deployed and warmed-up, it is time to collect perf
data with LBR (branch information). The exact perf command to use will depend
on the service. E.g., to collect the data for all processes running on the
server for the next 3 minutes use:
```
$ perf record -e cycles:u -j any,u -a -o perf.data -- sleep 180
```

* ``cd llvm-project``
Depending on the application, you may need more samples to be included with
your profile. It's hard to tell upfront what would be a sweet spot for your
application. We recommend that the profile cover 1B instructions, as reported
by the BOLT `-dyno-stats` option. If you need to increase the number of samples
in the profile, you can either run the `sleep` command for longer or use the
`-F<N>` option with `perf` to increase the sampling frequency.

* ``cmake -S llvm -B build -G <generator> [options]``
Note that for profile collection we recommend using cycle events and not
`BR_INST_RETIRED.*`. Empirically we found it to produce better results.

Some common build system generators are:
If the collection of a profile with branches is not available, e.g., when you run on
a VM or on hardware that does not support it, then you can use only sample
events, such as cycles. In this case, the quality of the profile information
would not be as good, and performance gains with BOLT are expected to be lower.

* ``Ninja`` --- for generating [Ninja](https://ninja-build.org)
build files. Most llvm developers use Ninja.
* ``Unix Makefiles`` --- for generating make-compatible parallel makefiles.
* ``Visual Studio`` --- for generating Visual Studio projects and
solutions.
* ``Xcode`` --- for generating Xcode projects.
#### With instrumentation (experimental)

Some Common options:
If `perf record` is not available to you, you can collect a profile by first
instrumenting the binary with BOLT and then running it.
```
llvm-bolt <executable> -instrument -o <instrumented-executable>
```

After you run instrumented-executable with the desired workload, its BOLT
profile should be ready for you in `/tmp/prof.fdata` and you can skip
**Step 2**.

Run BOLT with the `-help` option and check the category "BOLT instrumentation
options" for a quick reference on instrumentation knobs. Instrumentation is
experimental and currently does not work for PIEs/SOs.

### Step 2: Convert Profile to BOLT Format

NOTE: you can skip this step and feed `perf.data` directly to BOLT using
experimental `-p perf.data` option.

* ``-DLLVM_ENABLE_PROJECTS='...'`` --- semicolon-separated list of the LLVM
sub-projects you'd like to additionally build. Can include any of: clang,
clang-tools-extra, libcxx, libcxxabi, libunwind, lldb, compiler-rt, lld,
polly, or cross-project-tests.
For this step, you will need `perf.data` file collected from the previous step and
a copy of the binary that was running. The binary has to be either
unstripped, or should have a symbol table intact (i.e., running `strip -g` is
okay).

For example, to build LLVM, Clang, libcxx, and libcxxabi, use
``-DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi"``.
Make sure `perf` is in your `PATH`, and execute `perf2bolt`:
```
$ perf2bolt -p perf.data -o perf.fdata <executable>
```

This command will aggregate branch data from `perf.data` and store it in a
format that is both more compact and more resilient to binary modifications.

If the profile was collected without LBRs, you will need to add `-nl` flag to
the command line above.
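
The two collection modes described above differ only in the `-j any,u` option at record time and the `-nl` flag at conversion time. The following dry-run sketch prints the matching pair of commands instead of running them. The function name `profile_cmds` and its `lbr`/`nolbr` argument are hypothetical, and the non-LBR `perf record` line is an assumption obtained by dropping `-j any,u` from the command shown earlier.

```shell
# profile_cmds APP MODE: print (do not run) the perf and perf2bolt commands.
# MODE is "lbr" when branch records are available, anything else otherwise.
# Hypothetical helper for illustration; not shipped with BOLT.
profile_cmds() {
    app=$1
    mode=$2
    if [ "$mode" = lbr ]; then
        echo "perf record -e cycles:u -j any,u -o perf.data -- $app"
        echo "perf2bolt -p perf.data -o perf.fdata $app"
    else
        # No LBR: plain cycle samples, and perf2bolt needs -nl.
        echo "perf record -e cycles:u -o perf.data -- $app"
        echo "perf2bolt -p perf.data -o perf.fdata -nl $app"
    fi
}

# Example: show the commands for a machine without LBR support.
profile_cmds ./myapp nolbr
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; piping the output to `sh` would execute them on a host where `perf` and `perf2bolt` are in `PATH`.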

### Step 3: Optimize with BOLT

Once you have `perf.fdata` ready, you can use it for optimizations with
BOLT. Assuming your environment is set up to include the right path, execute
`llvm-bolt`:
```
$ llvm-bolt <executable> -o <executable>.bolt -data=perf.fdata -reorder-blocks=cache+ -reorder-functions=hfsort -split-functions=2 -split-all-cold -split-eh -dyno-stats
```

* ``-DCMAKE_INSTALL_PREFIX=directory`` --- Specify for *directory* the full
path name of where you want the LLVM tools and libraries to be installed
(default ``/usr/local``).
If you need updated debug info, add the `-update-debug-sections` option
to the command above. The processing time will be slightly longer.

* ``-DCMAKE_BUILD_TYPE=type`` --- Valid options for *type* are Debug,
Release, RelWithDebInfo, and MinSizeRel. Default is Debug.

* ``-DLLVM_ENABLE_ASSERTIONS=On`` --- Compile with assertion checks enabled
(default is Yes for Debug builds, No for all other build types).

* ``cmake --build build [-- [options] <target>]`` or your build system specified above
directly.

* The default target (i.e. ``ninja`` or ``make``) will build all of LLVM.

* The ``check-all`` target (i.e. ``ninja check-all``) will run the
regression tests to ensure everything is in working order.

* CMake will generate targets for each tool and library, and most
LLVM sub-projects generate their own ``check-<project>`` target.

* Running a serial build will be **slow**. To improve speed, try running a
parallel build. That's done by default in Ninja; for ``make``, use the option
``-j NNN``, where ``NNN`` is the number of parallel jobs, e.g. the number of
CPUs you have.

* For more information see [CMake](https://llvm.org/docs/CMake.html)

Consult the
[Getting Started with LLVM](https://llvm.org/docs/GettingStarted.html#getting-started-with-llvm)
page for detailed information on configuring and compiling LLVM. You can visit
[Directory Layout](https://llvm.org/docs/GettingStarted.html#directory-layout)
to learn about the layout of the source code tree.
For a full list of options see `-help`/`-help-hidden` output.

The input binary for this step does not have to 100% match the binary used for
profile collection in **Step 1**. This could happen when you are doing active
development, and the source code constantly changes, yet you want to benefit
from profile-guided optimizations. However, since the binary is not precisely the
same, the profile information could become invalid or stale, and BOLT will
report the number of functions with a stale profile. The higher the
number, the less performance improvement should be expected. Thus, it is
crucial to update `.fdata` for release branches.

## Multiple Profiles

Suppose your application can run in different modes, and you can generate
a separate profile for each one of them. To generate a single binary that can
benefit all modes (assuming the profiles don't contradict each other), you can
use the `merge-fdata` tool:
```
$ merge-fdata *.fdata > combined.fdata
```
Use `combined.fdata` for **Step 3** above to generate a universally optimized
binary.

## License

BOLT is licensed under the [Apache License v2.0 with LLVM Exceptions](./LICENSE.TXT).
29 changes: 29 additions & 0 deletions bolt/CMakeLists.txt
@@ -0,0 +1,29 @@
include(ExternalProject)

set(BOLT_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
set(BOLT_BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR})
set(CMAKE_CXX_STANDARD 14)

ExternalProject_Add(bolt_rt
  SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/runtime"
  STAMP_DIR ${CMAKE_CURRENT_BINARY_DIR}/bolt_rt-stamps
  BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR}/bolt_rt-bins
  CMAKE_ARGS -DCMAKE_C_COMPILER=${CMAKE_C_COMPILER}
             -DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER}
             -DCMAKE_BUILD_TYPE=Release
             -DCMAKE_MAKE_PROGRAM=${CMAKE_MAKE_PROGRAM}
             -DCMAKE_INSTALL_PREFIX=${LLVM_BINARY_DIR}
  # You might want to set this to True if actively developing bolt_rt, otherwise
  # cmake will not rebuild it after source code changes
  BUILD_ALWAYS True
)

install(CODE "execute_process\(COMMAND \${CMAKE_COMMAND} -DCMAKE_INSTALL_PREFIX=\${CMAKE_INSTALL_PREFIX} -P ${CMAKE_CURRENT_BINARY_DIR}/bolt_rt-bins/cmake_install.cmake \)"
  COMPONENT bolt_rt)

add_llvm_install_targets(install-bolt_rt
  DEPENDS bolt_rt
  COMPONENT bolt_rt)

add_subdirectory(src)
add_subdirectory(test)