Commit 39e9c40

necipfazil authored and Prabhuk committed
[𝘀𝗽𝗿] initial version
Created using spr 1.3.6-beta.1
2 parents 97c7124 + fd28df5 commit 39e9c40

45 files changed: 1387 additions and 19 deletions

clang/docs/CallGraphSection.rst

Lines changed: 251 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,251 @@
1+
==================
2+
Call Graph Section
3+
==================
4+
5+
Introduction
6+
============
7+
8+
With ``-fcall-graph-section``, the compiler will create a call graph section
9+
in the object file. It will include type identifiers for indirect calls and
10+
targets. This information can be used to map indirect calls to their receivers
11+
with matching types. A complete and high-precision call graph can be
12+
reconstructed by complementing this information with disassembly
13+
(see ``llvm-objdump --call-graph-info``).
14+
15+
Semantics
16+
=========
17+
18+
A coarse-grained, type-agnostic call graph may allow indirect calls to target
19+
any function in the program. This approach ensures completeness since no
20+
indirect call edge is missing. However, its precision is generally poor
21+
because it contains many spurious edges.
22+
23+
A call graph section provides type identifiers for indirect calls and targets.
24+
This information can be used to restrict the receivers of an indirect call to
25+
indirect targets with a matching type. Consequently, the precision of indirect
26+
call edges is improved while completeness is maintained.
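As an illustration that is not part of this commit, the matching rule can be sketched in C++. The `IndirectTargets` struct and `possibleReceivers` function are hypothetical names. The sketch also assumes the rule from the "Missing type identifiers" section: targets without a known type id remain possible receivers of every indirect call.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical container for the indirect targets of a binary.
struct IndirectTargets {
  // Type id -> entry addresses of targets with that type id.
  std::map<uint64_t, std::vector<uint64_t>> ByTypeId;
  // Entry addresses of targets whose type id is unknown.
  std::vector<uint64_t> UnknownTypeId;
};

// Returns the functions an indirect call with the given type id may reach:
// every target with an unknown type id, plus every target whose type id
// matches the call's type id.
std::vector<uint64_t> possibleReceivers(const IndirectTargets &Targets,
                                        uint64_t CallTypeId) {
  std::vector<uint64_t> Receivers = Targets.UnknownTypeId;
  auto It = Targets.ByTypeId.find(CallTypeId);
  if (It != Targets.ByTypeId.end())
    Receivers.insert(Receivers.end(), It->second.begin(), It->second.end());
  return Receivers;
}
```

A type-agnostic call graph would instead return every function for every call, so the filter by type id is what removes the spurious edges.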
27+
28+
The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract
29+
full call graph information by parsing the content of the call graph section
30+
and disassembling the program for complementary information, e.g., direct
31+
calls.
32+
33+
Section layout
34+
==============
35+
36+
A call graph section consists of zero or more call graph entries.
37+
Each entry contains information on a function and its indirect calls.
38+
39+
An entry of a call graph section has the following layout in the binary:
40+
41+
+---------------------+-----------------------------------------------------------------------+
42+
| Element | Content |
43+
+=====================+=======================================================================+
44+
| FormatVersionNumber | Format version number. |
45+
+---------------------+-----------------------------------------------------------------------+
46+
| FunctionEntryPc | Function entry address. |
47+
+---------------------+-----------------------------------+-----------------------------------+
48+
| | A flag whether the function is an | - 0: not an indirect target |
49+
| FunctionKind | indirect target, and if so, | - 1: indirect target, unknown id |
50+
| | whether its type id is known. | - 2: indirect target, known id |
51+
+---------------------+-----------------------------------+-----------------------------------+
52+
| FunctionTypeId | Type id for the indirect target. Present only when FunctionKind is 2. |
53+
+---------------------+-----------------------------------------------------------------------+
54+
| CallSiteCount | Number of type id to indirect call site mappings that follow. |
55+
+---------------------+-----------------------------------------------------------------------+
56+
| CallSiteList | List of type id and indirect call site pc pairs. |
57+
+---------------------+-----------------------------------------------------------------------+
58+
59+
Each element in an entry (including each element of the contained lists and
60+
pairs) occupies 64-bit space.
61+
62+
The format version number is repeated per entry to support concatenation of
63+
call graph sections with different format versions by the linker.
64+
65+
As of now, the only supported format version is described above and has version
66+
number 0.
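As an illustration that is not part of this commit, the entry layout above can be decoded with a short C++ routine. The names `CallGraphEntry` and `parseCallGraphSection` are hypothetical. The sketch assumes the section bytes have already been converted to 64-bit words using the target's byte order, which the layout description does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// One decoded call graph entry (format version 0).
struct CallGraphEntry {
  uint64_t FormatVersion;
  uint64_t FunctionEntryPc;
  uint64_t FunctionKind;                  // 0, 1, or 2
  std::optional<uint64_t> FunctionTypeId; // set only when FunctionKind == 2
  // Pairs of (type id, indirect call site pc).
  std::vector<std::pair<uint64_t, uint64_t>> CallSites;
};

// Decodes a section given as a sequence of 64-bit words. Returns
// std::nullopt on truncated input, an unsupported format version,
// or an out-of-range FunctionKind.
std::optional<std::vector<CallGraphEntry>>
parseCallGraphSection(const std::vector<uint64_t> &Words) {
  std::vector<CallGraphEntry> Entries;
  size_t I = 0;
  // Reads the next word, or fails if the input is exhausted.
  auto Next = [&](uint64_t &Out) {
    if (I >= Words.size())
      return false;
    Out = Words[I++];
    return true;
  };
  while (I < Words.size()) {
    CallGraphEntry E{};
    if (!Next(E.FormatVersion) || E.FormatVersion != 0)
      return std::nullopt;
    if (!Next(E.FunctionEntryPc) || !Next(E.FunctionKind))
      return std::nullopt;
    if (E.FunctionKind > 2)
      return std::nullopt;
    if (E.FunctionKind == 2) { // FunctionTypeId is present only for kind 2
      uint64_t Id;
      if (!Next(Id))
        return std::nullopt;
      E.FunctionTypeId = Id;
    }
    uint64_t Count;
    if (!Next(Count))
      return std::nullopt;
    for (uint64_t N = 0; N < Count; ++N) {
      uint64_t TypeId, Pc;
      if (!Next(TypeId) || !Next(Pc))
        return std::nullopt;
      E.CallSites.emplace_back(TypeId, Pc);
    }
    Entries.push_back(std::move(E));
  }
  return Entries;
}
```

Because the version number is checked per entry, a reader written this way can reject or skip entries of an unknown format even when the linker has concatenated sections produced by different compilers.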
67+
68+
Type identifiers
69+
================
70+
71+
The type for an indirect call or target is the function signature.
72+
The mapping from a type to an identifier is an ABI detail.
73+
In the current experimental implementation, an identifier of type T is
74+
computed as follows:
75+
76+
- Obtain the generalized mangled name for “typeinfo name for T”.
77+
- Compute the MD5 hash of the name as a string.
78+
- Reinterpret the first 8 bytes of the hash as a little-endian 64-bit integer.
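As an illustration that is not part of this commit, the last step can be sketched in C++. The function name `typeIdFromMd5Digest` is hypothetical, and the 16-byte digest is assumed to come from any MD5 implementation applied to the generalized mangled name.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Reinterprets the first 8 bytes of a 16-byte MD5 digest as a
// little-endian 64-bit integer: byte 0 is the least significant byte.
uint64_t typeIdFromMd5Digest(const std::array<uint8_t, 16> &Digest) {
  uint64_t Id = 0;
  for (int I = 7; I >= 0; --I)
    Id = (Id << 8) | Digest[I];
  return Id;
}
```

Under this rule, a digest whose first eight bytes are `a2 20 ef b8 9b 69 5c f8` yields the identifier `f85c699bb8ef20a2`. The remaining eight bytes of the digest do not contribute to the identifier.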
79+
80+
To avoid mismatched pointer types, generalizations are applied.
81+
Pointers in return and argument types are treated as equivalent as long as the
82+
qualifiers for the type they point to match.
83+
For example, ``char*``, ``char**``, and ``int*`` are considered equivalent
84+
types. However, ``char*`` and ``const char*`` are considered separate types.
85+
86+
Missing type identifiers
87+
========================
88+
89+
For functions, two cases need to be considered. First, if the compiler cannot
90+
deduce a type id for an indirect target, it will be listed as an indirect target
91+
without a type id. Second, if an object without a call graph section gets
92+
linked, the final call graph section will lack information on functions from
93+
the object. For completeness, these functions must be treated as potential
94+
receivers of any indirect call, regardless of their type id.
95+
``llvm-objdump --call-graph-info`` lists these functions as indirect targets
96+
with an ``UNKNOWN`` type id.
97+
98+
For indirect calls, the current implementation guarantees a type id for each
99+
compiled call. However, if an object without a call graph section gets linked,
100+
no type id will be present for its indirect calls. For completeness, these calls
101+
must be assumed to reach any indirect target, regardless of the target's type
102+
id. For indirect calls, ``llvm-objdump --call-graph-info`` prints (1) a complete
103+
list of indirect calls and (2) the type id to indirect call mappings. The
104+
difference between these lists identifies the indirect calls with missing type ids.
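As an illustration that is not part of this commit, the difference of the two lists can be computed as follows. The function name `callSitesWithoutTypeId` is hypothetical, and both inputs are assumed to have been collected from the tool's output into sorted sets of call site addresses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

// Returns the indirect call sites that appear in the complete list but
// have no type id mapping. std::set iterates in sorted order, which is
// what std::set_difference requires.
std::vector<uint64_t>
callSitesWithoutTypeId(const std::set<uint64_t> &AllIndirectCallSites,
                       const std::set<uint64_t> &TypedCallSites) {
  std::vector<uint64_t> Missing;
  std::set_difference(AllIndirectCallSites.begin(), AllIndirectCallSites.end(),
                      TypedCallSites.begin(), TypedCallSites.end(),
                      std::back_inserter(Missing));
  return Missing;
}
```

Applied to the addresses in the Example section of this document, the complete list is 401012, 401181, 401191, 4011a1, 401215, and 40104a, while the typed list is 401181, 401191, and 4011a1. The difference leaves 401012, 40104a, and 401215, which agrees with the tool's warning about 3 indirect calls without type ids.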
105+
106+
TODO: measure and report the ratio of missed type ids
107+
108+
Performance
109+
===========
110+
111+
A call graph section does not affect the executable code and does not occupy
112+
memory during process execution. Therefore, there is no performance overhead.
113+
114+
The scheme has not yet been optimized for binary size.
115+
116+
TODO: measure and report the increase in the binary size
117+
118+
Example
119+
=======
120+
121+
For example, consider the following C++ code:
122+
123+
.. code-block:: cpp
124+
125+
namespace {
126+
// Not an indirect target
127+
void foo() {}
128+
}
129+
130+
// Indirect target 1
131+
void bar() {}
132+
133+
// Indirect target 2
134+
int baz(char a, float *b) {
135+
return 0;
136+
}
137+
138+
// Indirect target 3
139+
int main() {
140+
char a;
141+
float b;
142+
void (*fp_bar)() = bar;
143+
int (*fp_baz1)(char, float*) = baz;
144+
int (*fp_baz2)(char, float*) = baz;
145+
146+
// Indirect call site 1
147+
fp_bar();
148+
149+
// Indirect call site 2
150+
fp_baz1(a, &b);
151+
152+
// Indirect call site 3: shares the type id with indirect call site 2
153+
fp_baz2(a, &b);
154+
155+
// Direct call sites
156+
foo();
157+
bar();
158+
baz(a, &b);
159+
160+
return 0;
161+
}
162+
163+
The following command compiles it and creates a call graph section in the binary:
164+
165+
.. code-block:: bash
166+
167+
$ clang -fcall-graph-section example.cpp
168+
169+
During the construction of the call graph section, the type identifiers are
170+
computed as follows:
171+
172+
+---------------+-----------------------+----------------------------+----------------------------+
173+
| Function name | Generalized signature | Mangled name (Itanium ABI) | Numeric type id (MD5 hash) |
174+
+===============+=======================+============================+============================+
175+
| bar | void () | _ZTSFvvE.generalized | f85c699bb8ef20a2 |
176+
+---------------+-----------------------+----------------------------+----------------------------+
177+
| baz | int (char, void*) | _ZTSFicPvE.generalized | e3804d2a7f2b03fe |
178+
+---------------+-----------------------+----------------------------+----------------------------+
179+
| main | int () | _ZTSFivE.generalized | a9494def81a01dc |
180+
+---------------+-----------------------+----------------------------+----------------------------+
181+
182+
The call graph section will have the following content:
183+
184+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
185+
| FormatVersion | FunctionEntryPc | FunctionKind | FunctionTypeId | CallSiteCount | CallSiteList |
186+
+===============+=================+==============+================+===============+======================================+
187+
| 0 | EntryPc(foo) | 0 | (empty) | 0 | (empty) |
188+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
189+
| 0 | EntryPc(bar) | 2 | TypeId(bar) | 0 | (empty) |
190+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
191+
| 0 | EntryPc(baz) | 2 | TypeId(baz) | 0 | (empty) |
192+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
193+
| 0 | EntryPc(main) | 2 | TypeId(main) | 3 | * TypeId(bar), CallSitePc(fp_bar()) |
194+
| | | | | | * TypeId(baz), CallSitePc(fp_baz1()) |
195+
| | | | | | * TypeId(baz), CallSitePc(fp_baz2()) |
196+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
197+
198+
199+
The ``llvm-objdump`` utility can parse the call graph section and disassemble
200+
the program to provide complete call graph information. This includes any
201+
additional call sites from the binary:
202+
203+
.. code-block:: bash
204+
205+
$ llvm-objdump --call-graph-info a.out
206+
207+
# Comments are not part of llvm-objdump's output; they are inserted for clarification.
208+
209+
a.out: file format elf64-x86-64
210+
# These warnings are due to the functions and the indirect calls coming from linked objects.
211+
llvm-objdump: warning: 'a.out': callgraph section does not have type ids for 3 indirect calls
212+
llvm-objdump: warning: 'a.out': callgraph section does not have information for 10 functions
213+
214+
# Unknown targets are the 10 functions the warnings mention.
215+
INDIRECT TARGET TYPES (TYPEID [FUNC_ADDR,])
216+
UNKNOWN 401000 401100 401234 401050 401090 4010d0 4011d0 401020 401060 401230
217+
a9494def81a01dc 401150 # main()
218+
f85c699bb8ef20a2 401120 # bar()
219+
e3804d2a7f2b03fe 401130 # baz()
220+
221+
# Notice that the call sites share the same type id as target functions
222+
INDIRECT CALL TYPES (TYPEID [CALL_SITE_ADDR,])
223+
f85c699bb8ef20a2 401181 # Indirect call site 1 (fp_bar())
224+
e3804d2a7f2b03fe 401191 4011a1 # Indirect call site 2 and 3 (fp_baz1() and fp_baz2())
225+
226+
INDIRECT CALL SITES (CALLER_ADDR [CALL_SITE_ADDR,])
227+
401000 401012 # _init
228+
401150 401181 401191 4011a1 # main calls fp_bar(), fp_baz1(), fp_baz2()
229+
4011d0 401215 # __libc_csu_init
230+
401020 40104a # _start
231+
232+
DIRECT CALL SITES (CALLER_ADDR [(CALL_SITE_ADDR, TARGET_ADDR),])
233+
4010d0 4010e2 401060 # __do_global_dtors_aux
234+
401150 4011a6 401110 4011ab 401120 4011ba 401130 # main calls foo(), bar(), baz()
235+
4011d0 4011fd 401000 # __libc_csu_init
236+
237+
FUNCTIONS (FUNC_ENTRY_ADDR, SYM_NAME)
238+
401000 _init
239+
401100 frame_dummy
240+
401234 _fini
241+
401050 _dl_relocate_static_pie
242+
401090 register_tm_clones
243+
4010d0 __do_global_dtors_aux
244+
401110 _ZN12_GLOBAL__N_13fooEv # (anonymous namespace)::foo()
245+
401150 main # main
246+
4011d0 __libc_csu_init
247+
401020 _start
248+
401060 deregister_tm_clones
249+
401120 _Z3barv # bar()
250+
401130 _Z3bazcPf # baz(char, float*)
251+
401230 __libc_csu_fini

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 2 additions & 0 deletions
@@ -76,6 +76,8 @@ CODEGENOPT(EnableNoundefAttrs, 1, 0) ///< Enable emitting `noundef` attributes o
7676
CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
7777
///< pass manager.
7878
CODEGENOPT(DisableRedZone , 1, 0) ///< Set when -mno-red-zone is enabled.
79+
CODEGENOPT(CallGraphSection, 1, 0) ///< Emit a call graph section into the
80+
///< object file.
7981
CODEGENOPT(EmitCallSiteInfo, 1, 0) ///< Emit call site info only in the case of
8082
///< '-g' + 'O>0' level.
8183
CODEGENOPT(IndirectTlsSegRefs, 1, 0) ///< Set when -mno-tls-direct-seg-refs

clang/include/clang/Driver/Options.td

Lines changed: 4 additions & 0 deletions
@@ -4099,6 +4099,10 @@ defm data_sections : BoolFOption<"data-sections",
40994099
PosFlag<SetTrue, [], [ClangOption, CC1Option],
41004100
"Place each data in its own section">,
41014101
NegFlag<SetFalse>>;
4102+
defm call_graph_section : BoolFOption<"call-graph-section",
4103+
CodeGenOpts<"CallGraphSection">, DefaultFalse,
4104+
PosFlag<SetTrue, [], [CC1Option], "Emit a call graph section">,
4105+
NegFlag<SetFalse>>;
41024106
defm stack_size_section : BoolFOption<"stack-size-section",
41034107
CodeGenOpts<"StackSizeSection">, DefaultFalse,
41044108
PosFlag<SetTrue, [], [ClangOption, CC1Option],

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 1 addition & 0 deletions
@@ -416,6 +416,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
416416
Options.StackUsageOutput = CodeGenOpts.StackUsageOutput;
417417
Options.EmitAddrsig = CodeGenOpts.Addrsig;
418418
Options.ForceDwarfFrameSection = CodeGenOpts.ForceDwarfFrameSection;
419+
Options.EmitCallGraphSection = CodeGenOpts.CallGraphSection;
419420
Options.EmitCallSiteInfo = CodeGenOpts.EmitCallSiteInfo;
420421
Options.EnableAIXExtendedAltivecABI = LangOpts.EnableAIXExtendedAltivecABI;
421422
Options.XRayFunctionIndex = CodeGenOpts.XRayFunctionIndex;

clang/lib/CodeGen/CGCall.cpp

Lines changed: 22 additions & 0 deletions
@@ -5687,6 +5687,28 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
56875687
AllocAlignAttrEmitter AllocAlignAttrEmitter(*this, TargetDecl, CallArgs);
56885688
Attrs = AllocAlignAttrEmitter.TryEmitAsCallSiteAttribute(Attrs);
56895689

5690+
if (CGM.getCodeGenOpts().CallGraphSection) {
5691+
// FIXME: create operand bundle only for indirect calls, not for all
5692+
5693+
assert((TargetDecl && TargetDecl->getFunctionType() ||
5694+
Callee.getAbstractInfo().getCalleeFunctionProtoType()) &&
5695+
"cannot find callsite type");
5696+
5697+
QualType CST;
5698+
if (TargetDecl && TargetDecl->getFunctionType())
5699+
CST = QualType(TargetDecl->getFunctionType(), 0);
5700+
else if (const auto *FPT =
5701+
Callee.getAbstractInfo().getCalleeFunctionProtoType())
5702+
CST = QualType(FPT, 0);
5703+
5704+
if (!CST.isNull()) {
5705+
auto *TypeIdMD = CGM.CreateMetadataIdentifierGeneralized(CST);
5706+
auto *TypeIdMDVal =
5707+
llvm::MetadataAsValue::get(getLLVMContext(), TypeIdMD);
5708+
BundleList.emplace_back("type", TypeIdMDVal);
5709+
}
5710+
}
5711+
56905712
// Emit the actual call/invoke instruction.
56915713
llvm::CallBase *CI;
56925714
if (!InvokeDest) {

clang/lib/CodeGen/CodeGenModule.cpp

Lines changed: 16 additions & 3 deletions
@@ -2685,7 +2685,17 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const NamedDecl *ND) {
26852685

26862686
void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
26872687
llvm::Function *F) {
2688-
// Only if we are checking indirect calls.
2688+
bool EmittedMDIdGeneralized = false;
2689+
if (CodeGenOpts.CallGraphSection &&
2690+
(!F->hasLocalLinkage() ||
2691+
F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
2692+
/* IgnoreAssumeLikeCalls */ true,
2693+
/* IgnoreLLVMUsed */ false))) {
2694+
F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
2695+
EmittedMDIdGeneralized = true;
2696+
}
2697+
2698+
// Add additional metadata only if we are checking indirect calls with CFI.
26892699
if (!LangOpts.Sanitize.has(SanitizerKind::CFIICall))
26902700
return;
26912701

@@ -2696,7 +2706,9 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
26962706

26972707
llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
26982708
F->addTypeMetadata(0, MD);
2699-
F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
2709+
// Add the generalized identifier if not added already.
2710+
if (!EmittedMDIdGeneralized)
2711+
F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
27002712

27012713
// Emit a hash-based bit set entry for cross-DSO calls.
27022714
if (CodeGenOpts.SanitizeCfiCrossDso)
@@ -2831,7 +2843,8 @@ void CodeGenModule::SetFunctionAttributes(GlobalDecl GD, llvm::Function *F,
28312843
// are non-canonical then we need type metadata in order to produce the local
28322844
// jump table.
28332845
if (!CodeGenOpts.SanitizeCfiCrossDso ||
2834-
!CodeGenOpts.SanitizeCfiCanonicalJumpTables)
2846+
!CodeGenOpts.SanitizeCfiCanonicalJumpTables ||
2847+
CodeGenOpts.CallGraphSection)
28352848
CreateFunctionTypeMetadataForIcall(FD, F);
28362849

28372850
if (LangOpts.Sanitize.has(SanitizerKind::KCFI))

clang/lib/Driver/ToolChains/Clang.cpp

Lines changed: 4 additions & 0 deletions
@@ -6444,6 +6444,10 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
64446444
CmdArgs.push_back(A->getValue());
64456445
}
64466446

6447+
if (Args.hasFlag(options::OPT_fcall_graph_section,
6448+
options::OPT_fno_call_graph_section, false))
6449+
CmdArgs.push_back("-fcall-graph-section");
6450+
64476451
Args.addOptInFlag(CmdArgs, options::OPT_fstack_size_section,
64486452
options::OPT_fno_stack_size_section);
64496453
