| 1 | +================== |
| 2 | +Call Graph Section |
| 3 | +================== |
| 4 | + |
| 5 | +Introduction |
| 6 | +============ |
| 7 | + |
| 8 | +With ``-fcall-graph-section``, the compiler will create a call graph section |
| 9 | +in the object file. It will include type identifiers for indirect calls and |
| 10 | +targets. This information can be used to map indirect calls to their receivers |
| 11 | +with matching types. A complete and high-precision call graph can be |
| 12 | +reconstructed by complementing this information with disassembly |
| 13 | +(see ``llvm-objdump --call-graph-info``). |
| 14 | + |
| 15 | +Semantics |
| 16 | +========= |
| 17 | + |
| 18 | +A coarse-grained, type-agnostic call graph may allow indirect calls to target |
| 19 | +any function in the program. This approach ensures completeness since no |
| 20 | +indirect call edge is missing. However, it is generally poor in precision |
| 21 | +due to having unneeded edges. |
| 22 | + |
| 23 | +A call graph section provides type identifiers for indirect calls and targets. |
| 24 | +This information can be used to restrict the receivers of an indirect call to |
| 25 | +indirect targets with a matching type. Consequently, the precision of indirect |
| 26 | +call edges is improved while completeness is maintained. |
| 27 | + |
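The matching rule above can be sketched as follows. This is a minimal illustration, not the ``llvm-objdump`` implementation; the function name ``resolveIndirectCalls`` and the container choices are assumptions made for the example. It maps each indirect call site to the targets that carry the same type id.

```cpp
#include <cstdint>
#include <map>
#include <vector>

using TypeId = uint64_t;
using Address = uint64_t;

// Hypothetical helper: for every indirect call site, collect the addresses of
// the indirect targets whose type id equals the call site's type id.
std::map<Address, std::vector<Address>>
resolveIndirectCalls(const std::map<Address, TypeId> &CallSiteTypes,
                     const std::multimap<TypeId, Address> &TargetsByType) {
  std::map<Address, std::vector<Address>> Edges;
  for (const auto &[CallSite, Id] : CallSiteTypes) {
    // equal_range yields all targets registered under this type id.
    auto [First, Last] = TargetsByType.equal_range(Id);
    std::vector<Address> &Receivers = Edges[CallSite];
    for (auto It = First; It != Last; ++It)
      Receivers.push_back(It->second);
  }
  return Edges;
}
```

A call site whose type id matches no target ends up with an empty receiver list, which keeps the graph precise without dropping the call site itself.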
| 28 | +The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract |
| 29 | +full call graph information by parsing the content of the call graph section |
| 30 | +and disassembling the program for complementary information, e.g., direct |
| 31 | +calls. |
| 32 | + |
| 33 | +Section layout |
| 34 | +============== |
| 35 | + |
| 36 | +A call graph section consists of zero or more call graph entries. |
| 37 | +Each entry contains information on a function and its indirect calls. |
| 38 | + |
| 39 | +An entry of a call graph section has the following layout in the binary: |
| 40 | + |
| 41 | ++---------------------+-----------------------------------------------------------------------+ |
| 42 | +| Element | Content | |
| 43 | ++=====================+=======================================================================+ |
| 44 | +| FormatVersionNumber | Format version number. | |
| 45 | ++---------------------+-----------------------------------------------------------------------+ |
| 46 | +| FunctionEntryPc | Function entry address. | |
| 47 | ++---------------------+-----------------------------------+-----------------------------------+ |
| 48 | +| | A flag whether the function is an | - 0: not an indirect target | |
| 49 | +| FunctionKind | indirect target, and if so, | - 1: indirect target, unknown id | |
| 50 | +| | whether its type id is known. | - 2: indirect target, known id | |
| 51 | ++---------------------+-----------------------------------+-----------------------------------+ |
| 52 | +| FunctionTypeId | Type id for the indirect target. Present only when FunctionKind is 2. | |
| 53 | ++---------------------+-----------------------------------------------------------------------+ |
| 54 | +| CallSiteCount | Number of type id to indirect call site mappings that follow. | |
| 55 | ++---------------------+-----------------------------------------------------------------------+ |
| 56 | +| CallSiteList | List of type id and indirect call site pc pairs. | |
| 57 | ++---------------------+-----------------------------------------------------------------------+ |
| 58 | + |
| 59 | +Each element in an entry (including each element of the contained lists and |
| 60 | +pairs) occupies 64 bits. |
| 61 | + |
| 62 | +The format version number is repeated per entry to support concatenation of |
| 63 | +call graph sections with different format versions by the linker. |
| 64 | + |
| 65 | +Currently, the only supported format version is the one described above, which |
| 66 | +has version number 0. |
| 67 | + |
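As a rough sketch of how a consumer might decode this layout, the following walks a section that has already been split into 64-bit words. The struct ``CallGraphEntry`` and the function ``parseCallGraphSection`` are hypothetical names, and reading the raw bytes into words (including byte order) is assumed to have happened elsewhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one call graph entry.
struct CallGraphEntry {
  uint64_t FunctionEntryPc = 0;
  uint64_t FunctionKind = 0;              // 0, 1, or 2, as in the layout table
  std::optional<uint64_t> FunctionTypeId; // set only when FunctionKind == 2
  std::vector<std::pair<uint64_t, uint64_t>> CallSites; // (type id, call pc)
};

// Decodes all entries; returns std::nullopt on truncated input, an
// unsupported format version, or an unknown FunctionKind.
std::optional<std::vector<CallGraphEntry>>
parseCallGraphSection(const std::vector<uint64_t> &Words) {
  std::vector<CallGraphEntry> Entries;
  size_t I = 0;
  auto Next = [&](uint64_t &Out) {
    if (I >= Words.size())
      return false;
    Out = Words[I++];
    return true;
  };
  while (I < Words.size()) {
    uint64_t Version = 0, Count = 0;
    CallGraphEntry E;
    if (!Next(Version) || Version != 0)
      return std::nullopt;
    if (!Next(E.FunctionEntryPc) || !Next(E.FunctionKind))
      return std::nullopt;
    if (E.FunctionKind > 2)
      return std::nullopt;
    if (E.FunctionKind == 2) {
      uint64_t Id = 0;
      if (!Next(Id))
        return std::nullopt;
      E.FunctionTypeId = Id;
    }
    if (!Next(Count))
      return std::nullopt;
    // Each pair takes two words; reject counts that overrun the section.
    if (Count > (Words.size() - I) / 2)
      return std::nullopt;
    for (uint64_t K = 0; K < Count; ++K) {
      uint64_t Id = 0, Pc = 0;
      Next(Id);
      Next(Pc);
      E.CallSites.emplace_back(Id, Pc);
    }
    Entries.push_back(std::move(E));
  }
  return Entries;
}
```

Because the version number is checked per entry, the same loop would be the natural place to dispatch on future format versions in a concatenated section.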
| 68 | +Type identifiers |
| 69 | +================ |
| 70 | + |
| 71 | +The type for an indirect call or target is the function signature. |
| 72 | +The mapping from a type to an identifier is an ABI detail. |
| 73 | +In the current experimental implementation, an identifier of type T is |
| 74 | +computed as follows: |
| 75 | + |
| 76 | + - Obtain the generalized mangled name for “typeinfo name for T”. |
| 77 | + - Compute the MD5 hash of the name as a string. |
| 78 | + - Reinterpret the first 8 bytes of the hash as a little-endian 64-bit integer. |
| 79 | + |
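The last step can be illustrated in isolation. The sketch below assumes a 16-byte MD5 digest has already been computed by some MD5 library (not shown); the helper name ``typeIdFromDigest`` is made up for this example.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: reinterpret the first 8 bytes of a 16-byte MD5 digest
// as a little-endian 64-bit integer (Digest[0] is the least significant byte).
uint64_t typeIdFromDigest(const std::array<uint8_t, 16> &Digest) {
  uint64_t Id = 0;
  for (int I = 7; I >= 0; --I)
    Id = (Id << 8) | Digest[I];
  return Id;
}
```

Assembling the value byte by byte keeps the result independent of the host's endianness.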
| 80 | +To avoid mismatched pointer types, generalizations are applied. |
| 81 | +Pointers in return and argument types are treated as equivalent as long as the |
| 82 | +qualifiers for the type they point to match. |
| 83 | +For example, ``char*``, ``char**``, and ``int*`` are considered equivalent |
| 84 | +types. However, ``char*`` and ``const char*`` are considered separate types. |
| 85 | + |
| 86 | +Missing type identifiers |
| 87 | +======================== |
| 88 | + |
| 89 | +For functions, two cases need to be considered. First, if the compiler cannot |
| 90 | +deduce a type id for an indirect target, it will be listed as an indirect target |
| 91 | +without a type id. Second, if an object without a call graph section gets |
| 92 | +linked, the final call graph section will lack information on functions from |
| 93 | +the object. For completeness, these functions must be treated as potential |
| 94 | +receivers of any indirect call, regardless of its type id. |
| 95 | +``llvm-objdump --call-graph-info`` lists these functions as indirect targets |
| 96 | +with an ``UNKNOWN`` type id. |
| 97 | + |
| 98 | +For indirect calls, the current implementation guarantees a type id for each |
| 99 | +compiled call. However, if an object without a call graph section gets linked, |
| 100 | +no type id will be present for its indirect calls. For completeness, these calls |
| 101 | +must be assumed to target any indirect target, regardless of its type id. For |
| 102 | +indirect calls, ``llvm-objdump --call-graph-info`` prints 1) a complete list of |
| 103 | +indirect calls and 2) type id to indirect call mappings. The difference between |
| 104 | +these lists identifies the indirect calls with missing type ids. |
| 105 | + |
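The list difference described above could be computed along these lines. This is only a sketch: the function name ``callsMissingTypeId`` is hypothetical, and both inputs are assumed to have been collected from the tool's output beforehand.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <set>

// Hypothetical helper: return the indirect call sites that appear in the
// complete list but have no type id mapping.
std::set<uint64_t>
callsMissingTypeId(const std::set<uint64_t> &AllIndirectCalls,
                   const std::set<uint64_t> &TypedCalls) {
  std::set<uint64_t> Missing;
  // std::set iterates in sorted order, as std::set_difference requires.
  std::set_difference(AllIndirectCalls.begin(), AllIndirectCalls.end(),
                      TypedCalls.begin(), TypedCalls.end(),
                      std::inserter(Missing, Missing.end()));
  return Missing;
}
```

Every call site returned here has to be treated conservatively, that is, as possibly reaching any indirect target.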
| 106 | +TODO: measure and report the ratio of missed type ids |
| 107 | + |
| 108 | +Performance |
| 109 | +=========== |
| 110 | + |
| 111 | +A call graph section does not affect the executable code and does not occupy |
| 112 | +memory during process execution. Therefore, it adds no runtime overhead. |
| 113 | + |
| 114 | +The scheme has not yet been optimized for binary size. |
| 115 | + |
| 116 | +TODO: measure and report the increase in the binary size |
| 117 | + |
| 118 | +Example |
| 119 | +======= |
| 120 | + |
| 121 | +For example, consider the following C++ code: |
| 122 | + |
| 123 | +.. code-block:: cpp |
| 124 | + |
| 125 | + namespace { |
| 126 | + // Not an indirect target |
| 127 | + void foo() {} |
| 128 | + } |
| 129 | + |
| 130 | + // Indirect target 1 |
| 131 | + void bar() {} |
| 132 | + |
| 133 | + // Indirect target 2 |
| 134 | + int baz(char a, float *b) { |
| 135 | + return 0; |
| 136 | + } |
| 137 | + |
| 138 | + // Indirect target 3 |
| 139 | + int main() { |
| 140 | + char a; |
| 141 | + float b; |
| 142 | + void (*fp_bar)() = bar; |
| 143 | + int (*fp_baz1)(char, float*) = baz; |
| 144 | + int (*fp_baz2)(char, float*) = baz; |
| 145 | + |
| 146 | + // Indirect call site 1 |
| 147 | + fp_bar(); |
| 148 | + |
| 149 | + // Indirect call site 2 |
| 150 | + fp_baz1(a, &b); |
| 151 | + |
| 152 | + // Indirect call site 3: shares the type id with indirect call site 2 |
| 153 | + fp_baz2(a, &b); |
| 154 | + |
| 155 | + // Direct call sites |
| 156 | + foo(); |
| 157 | + bar(); |
| 158 | + baz(a, &b); |
| 159 | + |
| 160 | + return 0; |
| 161 | + } |
| 162 | + |
| 163 | +The following command compiles it with a call graph section in the binary: |
| 164 | + |
| 165 | +.. code-block:: bash |
| 166 | + |
| 167 | + $ clang -fcall-graph-section example.cpp |
| 168 | + |
| 169 | +During the construction of the call graph section, the type identifiers are |
| 170 | +computed as follows: |
| 171 | + |
| 172 | ++---------------+-----------------------+----------------------------+----------------------------+ |
| 173 | +| Function name | Generalized signature | Mangled name (Itanium ABI) | Numeric type id (MD5 hash) | |
| 174 | ++===============+=======================+============================+============================+ |
| 175 | +| bar | void () | _ZTSFvvE.generalized | f85c699bb8ef20a2 | |
| 176 | ++---------------+-----------------------+----------------------------+----------------------------+ |
| 177 | +| baz | int (char, void*) | _ZTSFicPvE.generalized | e3804d2a7f2b03fe | |
| 178 | ++---------------+-----------------------+----------------------------+----------------------------+ |
| 179 | +| main | int () | _ZTSFivE.generalized | a9494def81a01dc | |
| 180 | ++---------------+-----------------------+----------------------------+----------------------------+ |
| 181 | + |
| 182 | +The call graph section will have the following content: |
| 183 | + |
| 184 | ++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+ |
| 185 | +| FormatVersion | FunctionEntryPc | FunctionKind | FunctionTypeId | CallSiteCount | CallSiteList | |
| 186 | ++===============+=================+==============+================+===============+======================================+ |
| 187 | +| 0 | EntryPc(foo) | 0 | (empty) | 0 | (empty) | |
| 188 | ++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+ |
| 189 | +| 0 | EntryPc(bar) | 2 | TypeId(bar) | 0 | (empty) | |
| 190 | ++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+ |
| 191 | +| 0 | EntryPc(baz) | 2 | TypeId(baz) | 0 | (empty) | |
| 192 | ++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+ |
| 193 | +| 0 | EntryPc(main) | 2 | TypeId(main) | 3 | * TypeId(bar), CallSitePc(fp_bar()) | |
| 194 | +| | | | | | * TypeId(baz), CallSitePc(fp_baz1()) | |
| 195 | +| | | | | | * TypeId(baz), CallSitePc(fp_baz2()) | |
| 196 | ++---------------+-----------------+--------------+----------------+---------------+--------------------------------------+ |
| 197 | + |
| 198 | + |
| 199 | +The ``llvm-objdump`` utility can parse the call graph section and disassemble |
| 200 | +the program to provide complete call graph information. This includes any |
| 201 | +additional call sites from the binary: |
| 202 | + |
| 203 | +.. code-block:: bash |
| 204 | + |
| 205 | + $ llvm-objdump --call-graph-info a.out |
| 206 | + |
| 207 | + # Comments are not part of llvm-objdump's output; they are inserted for clarification. |
| 208 | + |
| 209 | + a.out: file format elf64-x86-64 |
| 210 | + # These warnings are due to the functions and the indirect calls coming from linked objects. |
| 211 | + llvm-objdump: warning: 'a.out': callgraph section does not have type ids for 3 indirect calls |
| 212 | + llvm-objdump: warning: 'a.out': callgraph section does not have information for 10 functions |
| 213 | + |
| 214 | + # Unknown targets are the 10 functions the warnings mention. |
| 215 | + INDIRECT TARGET TYPES (TYPEID [FUNC_ADDR,]) |
| 216 | + UNKNOWN 401000 401100 401234 401050 401090 4010d0 4011d0 401020 401060 401230 |
| 217 | + a9494def81a01dc 401150 # main() |
| 218 | + f85c699bb8ef20a2 401120 # bar() |
| 219 | + e3804d2a7f2b03fe 401130 # baz() |
| 220 | + |
| 221 | + # Notice that the call sites share the same type id as target functions |
| 222 | + INDIRECT CALL TYPES (TYPEID [CALL_SITE_ADDR,]) |
| 223 | + f85c699bb8ef20a2 401181 # Indirect call site 1 (fp_bar()) |
| 224 | + e3804d2a7f2b03fe 401191 4011a1 # Indirect call site 2 and 3 (fp_baz1() and fp_baz2()) |
| 225 | + |
| 226 | + INDIRECT CALL SITES (CALLER_ADDR [CALL_SITE_ADDR,]) |
| 227 | + 401000 401012 # _init |
| 228 | + 401150 401181 401191 4011a1 # main calls fp_bar(), fp_baz1(), fp_baz2() |
| 229 | + 4011d0 401215 # __libc_csu_init |
| 230 | + 401020 40104a # _start |
| 231 | + |
| 232 | + DIRECT CALL SITES (CALLER_ADDR [(CALL_SITE_ADDR, TARGET_ADDR),]) |
| 233 | + 4010d0 4010e2 401060 # __do_global_dtors_aux |
| 234 | + 401150 4011a6 401110 4011ab 401120 4011ba 401130 # main calls foo(), bar(), baz() |
| 235 | + 4011d0 4011fd 401000 # __libc_csu_init |
| 236 | + |
| 237 | + FUNCTIONS (FUNC_ENTRY_ADDR, SYM_NAME) |
| 238 | + 401000 _init |
| 239 | + 401100 frame_dummy |
| 240 | + 401234 _fini |
| 241 | + 401050 _dl_relocate_static_pie |
| 242 | + 401090 register_tm_clones |
| 243 | + 4010d0 __do_global_dtors_aux |
| 244 | + 401110 _ZN12_GLOBAL__N_13fooEv # (anonymous namespace)::foo() |
| 245 | + 401150 main # main |
| 246 | + 4011d0 __libc_csu_init |
| 247 | + 401020 _start |
| 248 | + 401060 deregister_tm_clones |
| 249 | + 401120 _Z3barv # bar() |
| 250 | + 401130 _Z3bazcPf # baz(char, float*) |
| 251 | + 401230 __libc_csu_fini |