-
Notifications
You must be signed in to change notification settings - Fork 0
The RepoMetadataGeneration and RepoPruning Passes
The program repository adds two passes — one for metadata generation and a second for “pruning” — to the LLVM pass pipeline as well as a new metadata type for communication. Together they enable the compiler to avoid the code generation phases for functions and global variables whose definitions already have binary representations in the repository.
These new passes derive from the ModulePass class and are placed as early as possible in the pass pipeline and to ensure that we can avoid wasteful work in the later phases of the compiler.
The Program Repository adds a new RepoDefinition type which is used to communicate between the metadata generation, pruning, and back-end “object” writing components. It contains the hash digest of the associated object along with an (initially false) pruned boolean field and other values (such as linkage and visibility).
This pass — named RepoMetadataGeneration — loops over all of the functions and variables defined by the input module. For each, a cryptographic hash digest is calculated to identify it. We then create RepoDefinition metadata for the object with that digest attached.
The hash calculation for functions and variables must consider all aspects of the IR definition and its environment that affects the final generated code. The environment includes the target triple, the data layout for the module’s target platform, and the collection of enabled LLVM passes.
The initial digest of an individual object considers a number of factors including its environments and properties determined by whether it is a function or global variable. It does not include information relating to its connection to other objects in the translation unit.
The function hash calculation needs to consider the following:
- The type and number of formal arguments.
- The function return type.
- The function attributes (including those for its return type and formal arguments).
- Whether the function takes a variable number of arguments.
- The calling convention. Calling conventions may differ in where parameters, return values and return addresses are placed (in registers, on the call stack, a mix of both, or in other memory structures). If the function has input parameters, and the function return type is not void, then the calling conventions need to be considered.
- The function body. We do a control-flow-graph ordered walk since the actual ordering of the basic blocks in the linked list is immaterial. Our walk starts at the entry block for the function, then visits the next block from each terminator in order; as an artefact, this also means that unreachable blocks are ignored. When hashing each instruction, the opcode, return type, operand types, operand values, and any other factors affecting the operation must be considered.
The function name doesn’t need to be considered in the hash calculation.
The global variable hash calculation needs to consider the following:
- The variable’s type.
- Whether the variable is a constant.
- The thread local mode.
- The alignment.
- The
GlobalVariable::UnnamedAddrtype. - The initial value (if present) or the default initializer.
In addition to the object itself, to allow for the effects of any potential optimizations, we need to consider the functions and variables that the object references as well as those by which it is referenced. These relationships take the form of a directed, potentially cyclic, graph. The vertices of the graph are the compilation’s definitions; its edges are formed by two categories of association:
- Contributions
- The contributions of an object x is the set of objects y which transitively reference x and where a potential optimisation to either x or any of y may invalidate both.
- Dependencies
- The dependencies of an object x is the set of objects y which x transitively references but are not contributions.
(Here, an object is an LLVM global object: that is, a function or global variable.) A contribution represents a bi-directional relationship between two objects; a dependency is un-directional.
The final hash of each object is computed by hashing together its initial digest along with the initial digest of each transitively connected object following the graph edges created by the contribution and dependency arrows. Note that the graph-hashing algorithm takes special care to ensure that the shape of the graph is entirely captured including loops such as those caused by recursive functions.
The examples below use the following notation to describe the computation of each object’s final digest.
- Hf(x)
- The final digest of vertex x.
- Hi(x)
- The initial digest of vertex x.
- +
- The concatenation of two hash digests.
int f(void) { return 2; }
int g(void) { return f() + 1; }It is a common optimization for the compiler to “inline” f(), that is, to replace the call to f() in g() with a copy of the body of the function. This gives:
int f(void) { return 2; }
int g(void) { return 2 + 1; }Here if the body of f() changes between compiles then we must also recompile g() so that the behaviour of g() remains correct. However, changing g() does not affect f(). The hash of f() must be included in the final hash of g(). If we encounter a call to a function whose body is present in the same compilation, then we have a dependency from the caller to the callee. This relationship can be shown by a graph:
The blue edge from g→f in the above diagram indicates a dependency relationship recorded on the sayHello object and representing the bi-directional relationship between these two objects. To compute the final hash for each object, we hash its initial digest and walk the graph of connected objects, adding their initial digest as we visit each vertex:
- Hf(f) = Hi(f)
- Hf(g) = Hi(g) + Hi(f)
Consider the following snippet:
#include <stdio.h>
static const char * Str = "Hello, World\n";
void sayHello(void) {
printf(Str);
}It is possible for an optimization to transform this code to:
#include <stdio.h>
static const char * Str1 = "Hello, World";
void sayHello(void) {
puts(Str1);
}It can be seen that the call to printf() in sayHello() has been replaced by puts(). This has also entailed creating a new string from Str with the same contents but without the trailing carriage return. For this reason, a change to either Str or sayHello() must trigger the recompilation of both. This means that the hash of sayHello() must incorporate the hash of Str and vice-versa.
The red bi-directional edge in the above digram indicates a contribution relationship between the sayHello() function and the Str variable where a change to either must trigger re-compilation of both. We can now compute the final digest of each object:
- Hf(Str) = Hi(Str) + Hi(sayHello)
- Hf(sayHello) = Hi(sayHello) + Hi(Str)
static int G = 0;
void setTo(int * const P, int V) {
*P += V;
}
int test(void) {
setTo(&G, 2);
return G;
}Here, test() passes the address of global variable G to setTo() which modifies its value.
Clearly a change to setTo() could change the binary code for test() in the event that it is inlined: this can be represented by a dependency edge (test → setTo).
If we allow that an optimization could propagate the initial value of G to test() (and then potentially remove G altogether), updating G between compilations should trigger a re-compile of test(): we have an edge G → test.
Likewise, an optimization to alter the representation of G based on the possible values it might contain can be imagined in the same way that the optimizer can rewrite the value of Str in example 2. This suggests that, as in that earlier case, the graph must calso ontain an edge test → G.
The bi-directional connection test ⇆ G is represented by a single contribution record.
For this example, the contributions and dependencies of each object are shown in the table below:
| Object | Dependencies | Contributions |
|---|---|---|
G |
∅ | ∅ |
setTo |
∅ | ∅ |
test |
{ setTo } |
{ G } |
This may also be expressed as a directed graph:
We can now compute the final digest of each object:
- Hf(G) = Hi(G) + Hi(test)
- Hf(setTo) = Hi(setTo)
- Hf(test) = Hi(test) + Hi(G) + Hi(setTo)
The RepoPruning pass removes unchanged GOs whose definitions already have binary representations in the repository. If a GO is pruned, it is changed from a definition to a declaration to avoid the later phases of the compiling process. At the same time, RepoDefinition metadata is added for all of the fragment's dependents.
The following example explains the GO’s dependent list.
test.ll:
target triple = "x86_64-pc-linux-gnu"
@.input_str = private unnamed_addr constant [7 x i8] c"hello\0A\00", align 1
declare i32 @printf(i8* nocapture readonly, ...)
define void @test() align 2 {
entry:
%call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.input_str, i32 0, i32 0))
ret void
}Optimised IR can be generated using the following command:
$ rm clang.db
$ opt -O3 -S test.ll -mtriple x64_64-pc-linux-gnu-repo -o test.opt.ll
The RepoMetadataGeneration pass creates the RepoDefinition metadata for each GO. The optimised IR code is given below:
target triple = "x64_64-pc-linux-gnu-repo"
@str = private unnamed_addr constant [6 x i8] c"hello\00"
define void @test() local_unnamed_addr align 2 !repo_ticket !0 {
entry:
%puts = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8], [6 x i8]* @str, i64 0, i64 0))
ret void
}
declare i32 @puts(i8* nocapture readonly) local_unnamed_addr
!repo.tickets = !{!0, !1}
!0 = !RepoDefinition(name: "test", digest: [16 x i8] c"l1uX\9Fr\90o\B4\D8\F9\184\A3\9E\AA", linkage: external, pruned: false)
!1 = !RepoDefinition(name: ".input_str", digest: [16 x i8] c"\B4\82v\D2\94k\8A5\A9\BD\E4P\DB\EB\BA\D4", linkage: private, pruned: false)The test() function’s fragment has an external fixup to a symbol str which is a global variable generated by the optimizer. This new global variable doesn’t have RepoDefinition metadata since this is opimization occurs after the RepoMetadataGeneration pass. The MC-repo backend will create a ticket member for the symbol str, save it to the repository, and add it to the fragment's dependents. When the function test is pruned, the RepoDefinition metadata must include test() itself and its dependent symbol str.