Commit ceb7214

[BOLT] Introduce binary analysis tool based on BOLT (llvm#115330)
This initial commit does not add any specific binary analyses yet; it merely contains the boilerplate to introduce a new BOLT-based tool.

This basically combines the first 4 patches from the prototype pac-ret and stack-clash binary analyzer discussed in RFC https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148 and published at llvm/llvm-project@main...kbeyls:llvm-project:bolt-gadget-scanner-prototype

The introduction of such a BOLT-based binary analysis tool was proposed and discussed in at least the following places:
- The RFC pointed to above
- EuroLLVM 2024 round table https://discourse.llvm.org/t/summary-of-bolt-as-a-binary-analysis-tool-round-table-at-eurollvm/78441
  The round table showed quite a few people interested in being able to build a custom binary analysis quickly with a tool like this.
- Also at the US LLVM dev meeting a few weeks ago, I heard interest from a few people, asking when the tool would be available upstream.
- The presentation "Adding Pointer Authentication ABI support for your ELF platform" (https://llvm.swoogo.com/2024devmtg/session/2512720/adding-pointer-authentication-abi-support-for-your-elf-platform) explicitly mentioned interest in extending the prototype tool to verify correct implementation of pauthabi.
1 parent 5ca26d7 commit ceb7214

File tree

13 files changed

+219
-0
lines changed

bolt/docs/BinaryAnalysis.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# BOLT-based binary analysis
+
+As part of post-link-time optimizing, BOLT needs to perform a range of analyses
+on binaries such as reconstructing control flow graphs, and more.
+
+The `llvm-bolt-binary-analysis` tool enables running requested binary analyses
+on binaries, and generating reports. It does this by building on top of the
+analyses implemented in the BOLT libraries.
+
+## Which binary analyses are implemented?
+
+At the moment, no binary analyses are implemented.
+
+The goal is to make it easy using a plug-in framework to add your own analyses.
+
+## How to add your own binary analysis
+
+_TODO: this section needs to be written. Ideally, we should have a simple
+"example" or "template" analysis that can be the starting point for implementing
+custom analyses_

bolt/include/bolt/Rewrite/RewriteInstance.h

Lines changed: 3 additions & 0 deletions
@@ -164,6 +164,9 @@ class RewriteInstance {

   void preregisterSections();

+  /// run analyses requested in binary analysis mode.
+  void runBinaryAnalyses();
+
   /// Run optimizations that operate at the binary, or post-linker, level.
   void runOptimizationPasses();

bolt/include/bolt/Utils/CommandLineOpts.h

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,7 @@
 namespace opts {

 extern bool HeatmapMode;
+extern bool BinaryAnalysisMode;

 extern llvm::cl::OptionCategory BoltCategory;
 extern llvm::cl::OptionCategory BoltDiffCategory;
@@ -27,6 +28,7 @@ extern llvm::cl::OptionCategory BoltOutputCategory;
 extern llvm::cl::OptionCategory AggregatorCategory;
 extern llvm::cl::OptionCategory BoltInstrCategory;
 extern llvm::cl::OptionCategory HeatmapCategory;
+extern llvm::cl::OptionCategory BinaryAnalysisCategory;

 extern llvm::cl::opt<unsigned> AlignText;
 extern llvm::cl::opt<unsigned> AlignFunctions;

bolt/lib/Rewrite/RewriteInstance.cpp

Lines changed: 7 additions & 0 deletions
@@ -699,6 +699,11 @@ Error RewriteInstance::run() {
   if (opts::DiffOnly)
     return Error::success();

+  if (opts::BinaryAnalysisMode) {
+    runBinaryAnalyses();
+    return Error::success();
+  }
+
   preregisterSections();

   runOptimizationPasses();
@@ -3475,6 +3480,8 @@ void RewriteInstance::runOptimizationPasses() {
   BC->logBOLTErrorsAndQuitOnFatal(BinaryFunctionPassManager::runAllPasses(*BC));
 }

+void RewriteInstance::runBinaryAnalyses() {}
+
 void RewriteInstance::preregisterSections() {
   // Preregister sections before emission to set their order in the output.
   const unsigned ROFlags = BinarySection::getFlags(/*IsReadOnly*/ true,
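
The first hunk above adds an early return to `RewriteInstance::run()`: in binary analysis mode the analyses run and the function returns before any section preregistration or optimization pass. The control flow can be modelled in isolation as the sketch below. It is a simplified stand-in that does not use BOLT. `ModeFlags` and `stagesFor` are hypothetical names introduced only for illustration, and only the stages visible in the hunk are modelled.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the opts:: globals; not the real BOLT flags.
struct ModeFlags {
  bool DiffOnly = false;
  bool BinaryAnalysisMode = false;
};

// Simplified model of the part of RewriteInstance::run() shown in the hunk:
// returns the names of the stages that would execute, in order.
std::vector<std::string> stagesFor(const ModeFlags &Flags) {
  std::vector<std::string> Stages;
  if (Flags.DiffOnly)
    return Stages; // Diff-only mode stops before any of these stages.
  if (Flags.BinaryAnalysisMode) {
    Stages.push_back("runBinaryAnalyses");
    return Stages; // Early return: nothing is optimized or rewritten.
  }
  Stages.push_back("preregisterSections");
  Stages.push_back("runOptimizationPasses");
  return Stages;
}
```

With `BinaryAnalysisMode` set, the model yields only `runBinaryAnalyses`, which matches the intent that the analysis tool reads a binary without rewriting it.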

bolt/lib/Utils/CommandLineOpts.cpp

Lines changed: 2 additions & 0 deletions
@@ -29,6 +29,7 @@ const char *BoltRevision =
 namespace opts {

 bool HeatmapMode = false;
+bool BinaryAnalysisMode = false;

 cl::OptionCategory BoltCategory("BOLT generic options");
 cl::OptionCategory BoltDiffCategory("BOLTDIFF generic options");
@@ -38,6 +39,7 @@ cl::OptionCategory BoltOutputCategory("Output options");
 cl::OptionCategory AggregatorCategory("Data aggregation options");
 cl::OptionCategory BoltInstrCategory("BOLT instrumentation options");
 cl::OptionCategory HeatmapCategory("Heatmap options");
+cl::OptionCategory BinaryAnalysisCategory("BinaryAnalysis options");

 cl::opt<unsigned> AlignText("align-text",
                             cl::desc("alignment of .text section"), cl::Hidden,

bolt/test/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ list(APPEND BOLT_TEST_DEPS
   lld
   llvm-config
   llvm-bolt
+  llvm-bolt-binary-analysis
   llvm-bolt-heatmap
   llvm-bat-dump
   llvm-dwarfdump
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+dummy
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# This file tests error messages produced on invalid command line arguments.
+# It also checks that help messages are generated as expected.
+
+# Verify that an error message is provided if an input file is missing or incorrect
+
+RUN: not llvm-bolt-binary-analysis 2>&1 | FileCheck -check-prefix=NOFILEARG %s
+NOFILEARG: llvm-bolt-binary-analysis: Not enough positional command line arguments specified!
+NOFILEARG-NEXT: Must specify at least 1 positional argument: See: {{.*}}llvm-bolt-binary-analysis --help
+
+RUN: not llvm-bolt-binary-analysis non-existing-file 2>&1 | FileCheck -check-prefix=NONEXISTINGFILEARG %s
+NONEXISTINGFILEARG: llvm-bolt-binary-analysis: 'non-existing-file': No such file or directory.
+
+RUN: not llvm-bolt-binary-analysis %p/Inputs/dummy.txt 2>&1 | FileCheck -check-prefix=NOELFFILEARG %s
+NOELFFILEARG: llvm-bolt-binary-analysis: '{{.*}}/Inputs/dummy.txt': The file was not recognized as a valid object file.
+
+RUN: %clang %cflags %p/../../Inputs/asm_foo.s %p/../../Inputs/asm_main.c -o %t.exe
+RUN: llvm-bolt-binary-analysis %t.exe 2>&1 | FileCheck -check-prefix=VALIDELFFILEARG --allow-empty %s
+# Check that there are no BOLT-WARNING or BOLT-ERROR output lines
+VALIDELFFILEARG: BOLT-INFO:
+VALIDELFFILEARG-NOT: BOLT-WARNING:
+VALIDELFFILEARG-NOT: BOLT-ERROR:
+
+# Check --help output
+
+RUN: llvm-bolt-binary-analysis --help 2>&1 | FileCheck -check-prefix=HELP %s
+
+HELP: OVERVIEW: BinaryAnalysis
+HELP-EMPTY:
+HELP-NEXT: USAGE: llvm-bolt-binary-analysis [options] <executable>
+HELP-EMPTY:
+HELP-NEXT: OPTIONS:
+HELP-EMPTY:
+HELP-NEXT: Generic Options:
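
The NONEXISTINGFILEARG and NOELFFILEARG checks above fix the shape of the tool's error line: the tool name, the quoted subject, the reason, and a trailing period. A minimal sketch of that formatting follows, written against the C++ standard library only rather than LLVM. `formatErrorLine` is a hypothetical helper introduced here for illustration; the tool itself prints to `errs()` and calls `exit(1)` in `report_error`.

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical helper: builds the error line instead of printing it and
// exiting, so the format can be inspected.
std::string formatErrorLine(const std::string &Tool, const std::string &Subject,
                            const std::string &Reason) {
  assert(!Reason.empty() && "an error line needs a reason");
  return Tool + ": '" + Subject + "': " + Reason + ".\n";
}

// Overload for std::error_code. The message text comes from the C++ runtime
// and may differ between platforms.
std::string formatErrorLine(const std::string &Tool, const std::string &Subject,
                            std::error_code EC) {
  assert(EC && "expected a failure code");
  return formatErrorLine(Tool, Subject, EC.message());
}
```

Keeping the formatting separate from the exit path is a choice made for this sketch so that the string can be checked directly. It is not how the commit structures `report_error`.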
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+if "AArch64" not in config.root.targets:
+    config.unsupported = True
+
+flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding -Wl,--emit-relocs"
+
+config.substitutions.insert(0, ("%cflags", f"%cflags {flags}"))
+config.substitutions.insert(0, ("%cxxflags", f"%cxxflags {flags}"))

bolt/test/lit.cfg.py

Lines changed: 1 addition & 0 deletions
@@ -110,6 +110,7 @@
     ),
     ToolSubst("llvm-boltdiff", unresolved="fatal"),
     ToolSubst("llvm-bolt-heatmap", unresolved="fatal"),
+    ToolSubst("llvm-bolt-binary-analysis", unresolved="fatal"),
     ToolSubst("llvm-bat-dump", unresolved="fatal"),
     ToolSubst("perf2bolt", unresolved="fatal"),
     ToolSubst("yaml2obj", unresolved="fatal"),

bolt/tools/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -7,3 +7,4 @@ add_subdirectory(llvm-bolt-fuzzer)
 add_subdirectory(bat-dump)
 add_subdirectory(merge-fdata)
 add_subdirectory(heatmap)
+add_subdirectory(binary-analysis)
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+set(LLVM_LINK_COMPONENTS
+  ${LLVM_TARGETS_TO_BUILD}
+  MC
+  Object
+  Support
+  )
+
+add_bolt_tool(llvm-bolt-binary-analysis
+  binary-analysis.cpp
+  DISABLE_LLVM_LINK_LLVM_DYLIB
+  )
+
+target_link_libraries(llvm-bolt-binary-analysis
+  PRIVATE
+  LLVMBOLTRewrite
+  LLVMBOLTUtils
+  )
+
+add_dependencies(bolt llvm-bolt-binary-analysis)
Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
+//===- bolt/tools/binary-analysis/binary-analysis.cpp ---------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This is a generic binary analysis tool, where multiple different specific
+// binary analyses can be plugged in to. The binary analyses are mostly built
+// on top of BOLT components.
+//
+//===----------------------------------------------------------------------===//
+
+#include "bolt/Rewrite/RewriteInstance.h"
+#include "bolt/Utils/CommandLineOpts.h"
+#include "llvm/MC/TargetRegistry.h"
+#include "llvm/Object/Binary.h"
+#include "llvm/Object/ELFObjectFile.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Errc.h"
+#include "llvm/Support/ManagedStatic.h"
+#include "llvm/Support/PrettyStackTrace.h"
+#include "llvm/Support/Program.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/TargetSelect.h"
+#include "llvm/Support/VirtualFileSystem.h"
+
+#define DEBUG_TYPE "bolt"
+
+using namespace llvm;
+using namespace object;
+using namespace bolt;
+
+namespace opts {
+
+static cl::OptionCategory *BinaryAnalysisCategories[] = {
+    &BinaryAnalysisCategory};
+
+static cl::opt<std::string> InputFilename(cl::Positional,
+                                          cl::desc("<executable>"),
+                                          cl::Required,
+                                          cl::cat(BinaryAnalysisCategory),
+                                          cl::sub(cl::SubCommand::getAll()));
+
+} // namespace opts
+
+static StringRef ToolName = "llvm-bolt-binary-analysis";
+
+static void report_error(StringRef Message, std::error_code EC) {
+  assert(EC);
+  errs() << ToolName << ": '" << Message << "': " << EC.message() << ".\n";
+  exit(1);
+}
+
+static void report_error(StringRef Message, Error E) {
+  assert(E);
+  errs() << ToolName << ": '" << Message << "': " << toString(std::move(E))
+         << ".\n";
+  exit(1);
+}
+
+void ParseCommandLine(int argc, char **argv) {
+  cl::HideUnrelatedOptions(ArrayRef(opts::BinaryAnalysisCategories));
+  // Register the target printer for --version.
+  cl::AddExtraVersionPrinter(TargetRegistry::printRegisteredTargetsForVersion);
+
+  cl::ParseCommandLineOptions(argc, argv, "BinaryAnalysis\n");
+}
+
+static std::string GetExecutablePath(const char *Argv0) {
+  SmallString<256> ExecutablePath(Argv0);
+  // Do a PATH lookup if Argv0 isn't a valid path.
+  if (!llvm::sys::fs::exists(ExecutablePath))
+    if (llvm::ErrorOr<std::string> P =
+            llvm::sys::findProgramByName(ExecutablePath))
+      ExecutablePath = *P;
+  return std::string(ExecutablePath.str());
+}
+
+int main(int argc, char **argv) {
+  // Print a stack trace if we signal out.
+  sys::PrintStackTraceOnErrorSignal(argv[0]);
+  PrettyStackTraceProgram X(argc, argv);
+
+  std::string ToolPath = GetExecutablePath(argv[0]);
+
+  llvm_shutdown_obj Y; // Call llvm_shutdown() on exit.
+
+  // Initialize targets and assembly printers/parsers.
+  llvm::InitializeAllTargetInfos();
+  llvm::InitializeAllTargetMCs();
+  llvm::InitializeAllAsmParsers();
+  llvm::InitializeAllDisassemblers();
+
+  llvm::InitializeAllTargets();
+  llvm::InitializeAllAsmPrinters();
+
+  ParseCommandLine(argc, argv);
+
+  opts::BinaryAnalysisMode = true;
+
+  if (!sys::fs::exists(opts::InputFilename))
+    report_error(opts::InputFilename, errc::no_such_file_or_directory);
+
+  Expected<OwningBinary<Binary>> BinaryOrErr =
+      createBinary(opts::InputFilename);
+  if (Error E = BinaryOrErr.takeError())
+    report_error(opts::InputFilename, std::move(E));
+  Binary &Binary = *BinaryOrErr.get().getBinary();
+
+  if (auto *e = dyn_cast<ELFObjectFileBase>(&Binary)) {
+    auto RIOrErr = RewriteInstance::create(e, argc, argv, ToolPath);
+    if (Error E = RIOrErr.takeError())
+      report_error(opts::InputFilename, std::move(E));
+    RewriteInstance &RI = *RIOrErr.get();
+    if (Error E = RI.run())
+      report_error(opts::InputFilename, std::move(E));
+  }
+
+  return EXIT_SUCCESS;
+}
