### The BAP Handbook

David Brumley and Ivan Jager

February 18, 2011

# **Contents**

| 1 | Intro | roduction                              | 1  |
|---|-------|----------------------------------------|----|
|   | 1.1   | The Need for Binary Analysis           | 1  |
|   | 1.2   | Desired Properties                     | 3  |
|   | 1.3   | BAP Overview                           | 4  |
| 2 | The   | BAP Toolkit                            | 7  |
|   | 2.1   | Toolkit Overview                       | 7  |
|   | 2.2   | Obtaining BAP                          | 7  |
|   | 2.3   | Lifting Binary Code to BIL             | 7  |
|   | 2.4   | The internals of toil                  | 7  |
|   | 2.5   | Transformations on BIL Code            | 9  |
|   | 2.6   | Tools from BIL                         | 10 |
|   |       | 2.6.1 topredicate                      | 10 |
|   |       | •                                      | 10 |
|   |       | 2.6.3 toc                              | 10 |
|   | 2.7   | Discussion                             | 11 |
|   |       | 2.7.1 Why Design a New Infrastructure? | 11 |
|   |       | 2.7.2 Limitations of BAP               | 11 |
|   |       | 2.7.3 Is Lifting Correct?              | 11 |
|   |       | 2.7.4 Size of Lifting IL for a Program | 12 |
| 3 | Forr  | malization of BIL                      | 13 |
|   | 3.1   | The BIL Language                       | 13 |
|   |       |                                        | 13 |
|   |       | 3.1.2 Expressions                      | 15 |
|   |       |                                        | 15 |
|   | 3.2   | Normalized Memory                      | 16 |
|   |       | •                                      | 17 |
| 4 | BAP   | P Development Guide                    | 21 |
|   | 4.1   | •                                      | 21 |
| 5 | Rela  | ated Work                              | 23 |
| 6 | Cred  | dits                                   | 25 |

CONTENTS

# **List of Figures**

| 1.1 | A three instruction x86 program, and its corresponding control flow graph. Note the logic                                                                                              |    |
|-----|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
|     | for deciding when the jump on overflow instruction (statement 3) is taken, when is often                                                                                               |    |
|     | omitted by other binary analysis platforms.                                                                                                                                            | 2  |
| 1.2 | The semantics of the rep instruction according to Intel [17]. Modern architectures often                                                                                               |    |
|     | have hundreds of instructions that have complex semantics like rep                                                                                                                     | 3  |
| 1.3 | The BAP binary analysis architecture and components. BAP is divided into a front-end, which is responsible for lifting instructions to the BAP IL, and a platform-independent back-    |    |
|     | end for analyses.                                                                                                                                                                      | 4  |
| 2.1 | High-level view of tools provided with BAP                                                                                                                                             | 8  |
| 3.1 | (a) shows an example of little-endian stores as found in x86 that partially overlap. (b) shows memory after executing line 1, and (c) shows memory after executing line 5. Line 7 will |    |
|     | load the value 0x22bb.                                                                                                                                                                 | 16 |
| 3.2 | Normalized version of the store and load from Figure 3.1a                                                                                                                              | 16 |

LIST OF FIGURES LIST OF FIGURES

# **List of Tables**

| 3.1 | The Binary Intermediate Language. Note commas separte operators | 14 |
|-----|-----------------------------------------------------------------|----|
| 3.2 | Operational semantics notation                                  | 17 |
| 3.3 | Operational Semantics                                           | 18 |

LIST OF TABLES

LIST OF TABLES

### Introduction

### 1.1 The Need for Binary Analysis

Binary code is everywhere. In most situations, users only have access to code in binary (i.e., executable) form. Most common, off-the-shelf (COTS) software (e.g., Microsoft Windows, Adobe Acrobat, etc.) is only available to end-users in binary form. Malicious code (i.e., malware) created by attackers is typically only available in binary form. The ubiquity of binary code means any security techniques that only require access to the program binary are likely to be widely applicable. Further, binary code analysis allows us to argue about the security of the code that will run, not just the code that was compiled.

A binary-centric approach to software security requires the ability to perform program analysis on binary code. A program analysis (whether it be static or dynamic) is an algorithm for determining the effects of a set of statements in the programming language under consideration. Thus, a binary-centric approach requires that we 1) be able to analyze each instruction in a manner faithful to its semantics, and 2) provide a way of encoding an algorithm over those instructions.

However, there are two primary challenges to performing accurate and faithful analysis on modern binary code. First, code analysis at the binary level is different than source code because binary code lacks many abstractions found in higher-level languages. Second, binary code is significantly more complex than source code, which introduces new engineering challenges for binary code analysis.

**Binary Code Analysis is Different than Source Code Analysis.** Binary code is different than source code. Thus, we must develop and only use program analyses that are suitable for the unique characteristics of binary code. In particular, binary code lacks abstractions that are often fundamental to source code and source code analysis, such as:

- Functions. The function abstraction does not exist at the binary level. Instead, control flow in a binary program is performed by jumps. For example, the x86 instruction call x is just syntactic sugar (i.e., shorthand) for storing the number in the instruction pointer register eip at the address named by the register esp, decrementing esp by the architecture word size, then loading the eip with number x. Indeed, it is perfectly valid in assembly, and sometimes happens in practice, that one may call into the middle of a "function", or have a single "function" separated into non-contiguous pieces. The lack of a function abstraction poses significant scalability challenges to binary analysis.
- Memory vs. Buffers. Binary code does not have buffers, it has memory. While the OS may determine
  a particular memory page is not valid, memory does not have the traditional semantics of a userspecified type and size. One implication of the difference between buffers and memory is that in



Figure 1.1: A three instruction x86 program, and its corresponding control flow graph. Note the logic for deciding when the jump on overflow instruction (statement 3) is taken, when is often omitted by other binary analysis platforms.

binary code there is no such thing as a buffer overflow. While we may say a particular store violates a higher-level semantics given by the source code, such facts are inferences with respect to the higher-level semantics, not part of the binary code itself. The lack of buffers means we have to conservatively reason about the entire memory space (unless proven otherwise) at each operation.

• No Types. New types cannot be created or used since there is no such thing as a type constructor in binary code. The only types available are those provided by the hardware: registers and memory. Even register types are not necessarily a good choice, since it is common to store values from one register type (e.g., 32-bit register) and read them as another (e.g., 8-bit register). The lack of types means we cannot use type-based analysis, which is often instrumental in scaling traditional analyses.

**Binary Code is Complex.** One approach is to disassemble binary code into a sequence of assembly instructions, then perform program analysis directly over the assembly instructions. This type of assembly-specific approach is naïve because each analysis would have to individually understand the semantics of the assembly, which is difficult.

For example, consider the basic program analysis problem of determining when the conditional jump is taken on line 3 of Figure 1.1. In this example, all operands are 32-bit registers, and all arithmetic is therefore performed mod  $2^{32}$ . Instruction 1 computes the sum eax := eax+ebx (mod  $2^{32}$ ). Instruction 2 computes eax = eax  $\ll$  edx. Both instructions may set the overflow status register OF. The add instruction will set the OF flag if eax + ebx  $\geq 2^{32}$ . The shl instruction will set the OF flag if edx is 1 and the top two bits of eax are not equal on line 2. The instruction on line 3 tests to see if OF is set, and if so, jumps to target, else executes the next sequential instruction.

In order to determine when the jump on line 3 is taken, an analysis must reconstruct all side-effects of add and shl. The proper control flow diagram is shown in Figure 1.1. As demonstrated by this three line program, even though a program may look simple, determining the effects may be complicated. An assembly-specific approach would require each analysis to reason about the complex semantics.

```
IF AddressSize = 16
THEN
   Use CX for CountReg;
ELSE IF AddressSize = 64 and REX.W used
   THEN Use RCX for CountReg; FI;
   Use ECX for CountReg;
FI;
WHILE CountReg \neq 0
    Service pending interrupts (if any);
    Execute associated string instruction (x);
    CountReg \leftarrow (CountReg 1);
    IF CountReg = 0
    THEN exit WHILE loop; FI;
    IF (Repeat prefix is REPZ or REPE) and (ZF = 0)
       or (Repeat prefix is REPNZ or REPNE) and (ZF = 1)
    THEN exit WHILE loop; FI;
OD;
```

Figure 1.2: The semantics of the rep instruction according to Intel [17]. Modern architectures often have hundreds of instructions that have complex semantics like rep.

Worse, many architectures allow instruction prefixes, which can further complicate the semantics. For example, in x86 a rep prefix turns an instruction essentially into a single instruction loop, e.g., rep i s, d will repeatedly execute operation i on arguments s and d. The exact semantics (per Intel [17]) of rep are shown in Figure 1.2. An assembly-specific approach would have to consider this complicated logic for any instruction that may carry to rep prefix. If analyses are written directly on assembly, each analysis would duplicate this logic.

If there were only a few instructions, perhaps the complexity would not be too onerous. Modern architectures, however, typically have hundreds of instructions. x86, for example, has well over 300 instructions (recall the Intel x86 manuals [17] weigh over 11 lbs).

Overall, an assembly specific approach is unattractive because writing analyses over modern complex instruction, such as the logic found in Figures 1.2 and 1.1, tends to be tedious and error-prone. For example, the current version of DynInst [28] (version 5.2) and Phoenix [21] (April 2008 SDK Build), two popular architectures for binary analysis, will not create a correct control flow graph for the three line program. Verifying a program analysis is correct over a large and complicated instruction set seems even more difficult.

### 1.2 Desired Properties

We would like a platform for analyzing binary code that 1) supports writing analyses in a concise and straight-forward fashion, 2) provides abstractions for common semantics across all assemblies, and 3) is architecture independent when possible.

We want an architecture that supports writing analyses in a concise, straight-forward fashion because that makes it easier to write analyses that are correct. We do not want an analysis to have to tangle with complicated instruction semantics: that should be the job of the architecture.

We would also like to provide abstractions that are common to typical program analyses. There are



Figure 1.3: The BAP binary analysis architecture and components. BAP is divided into a front-end, which is responsible for lifting instructions to the BAP IL, and a platform-independent back-end for analyses.

many recurring abstractions. One is to be able to iterate over each instruction type. Another is to build a control flow graph. Yet another is finding data dependencies. We would like to build a single platform so that a new program analysis can easily reuse common abstractions.

Finally, we would like to be architecture independent. Architecture independence would allow us to easily re-target the entire platform to a new architecture without changing the analyses. The more architectures we can handle, the more widely our techniques will be applicable.

#### 1.3 BAP Overview

BAP is designed to facilitate faithful security-relevant binary program analysis by 1) reducing complex instruction sets to a single, small, and formally specified intermediate language that supports writing concise, easy-to-understand analyses 2) providing a set of core program analyses abstractions, and 3) being architecture independent when possible in order to support easy re-targeting.

Figure 1.3 shows a high-level picture of BAP. BAP is divided into an architecture-specific front-end and an architecture-independent back-end. At the core of BAP is an architecture-independent intermediate language (IL), called *BIL*, for assembly. Assembly instructions in the underlying architecture are lifted up to the BIL via the BAP front-end. All analyses are performed on the platform-independent BIL in the back-end.

We lift to the BIL by using open-source utilities to parse the binary format and produce assembly. The assembly is then lifted up to the BAP IL in a syntax-directed manner. The BAP front-end currently supports lifting x86 [17] and ARMv4 [7] to the IL.

The BAP back-end supports a variety of core program analyses and utilities. The back-end has utilities for creating a variety of different graphs, such as control flow and program dependence graphs. The back-

end also provides an optimization framework. The optimization framework is usually used to simplify a specific set of instructions. We also provide program verification capabilities such as symbolic execution, calculating the weakest precondition, and interfacing with decision procedures. BAP can also write out lifted BAP instructions as valid C code via the code generator back-end.

For example, one may want to analyze an execution trace. The execution trace can be lifted to BAP, then optimized to remove instructions from the trace that are not relevant to the analysis. The analysis could then verify properties of the trace by using program verification routines. Finally, the trace itself could be written out as a valid C program, compiled down to assembly, and then run as a straight-line program.

## The BAP Toolkit

#### 2.1 Toolkit Overview

There are several tools built and distributed with BAP. First, we provide a tool called toil which lifts binary code to BIL. Second, we provide a tool called iltrans which takes as input BIL code, transforms it, and then outputs the resulting BIL code. Thus, iltrans is a mapping from BIL to BIL. Finally, we provide a set of additional utilities such as one that evaluates BIL programs, and another that calculates the weakest precondition for a BIL program. Figure 2.1 shows the relationship of these tools. Different output options include optimizing the IL, generating CFG's, PDG's, DDG's, and CDG's, calculating the weakest precondition, and several other options. We describe these tools at a high level in this chapter.

Note: Each tool provides a -help option, which provides the latest set of command-line options.

### 2.2 Obtaining BAP

Please email David Brumley to get the current BAP URL and credentials. We currently distribute BAP as a VMWare virtual machine. We distribute BAP in a virtual machine for several reasons. First, BAP has quite a few dependencies, thus setup is potentially complex. Second, while we do not provide support, issues that we choose to resolve can be done more quickly when we know the exact setup.

### 2.3 Lifting Binary Code to BIL

The toil tool lifts binary code to BIL. The syntax for toil is yet to be determined.

#### **2.4** The internals of toil

Lifting up binary code to the IL consists of three steps:

- **Step 1.** First, the binary file is disassembled. toil currently interfaces with two disassemblers: IDA Pro [13], a commercial disassembler, and a linear sweep disassembler built on top of GNU libopcodes. Interfacing with other disassemblers is straight-forward.
- Step 2. The disassembly is passed to VEX, a third-party library which turns assembly instructions into the VEX intermediate language. The VEX IL is part of the Valgrind dynamic instrumentation tool [27]. The VEX IL is also similar to a RISC-based language. As a result, the lifted IL has only



Figure 2.1: High-level view of tools provided with BAP.

a few instruction types, similar to BAP. However, the VEX IL is insufficient for performing program analysis because it lacks information about the implicit side effects of instructions, e.g., what eflags are set by x86 instructions. This step is mainly performed in order to simplify the development of BAP: we let the existing tool take care of reducing assembly instructions to a basic IL, then step 3 exposes all implicit side-effects so that the analysis is faithful.

• **Step 3.** We lift the VEX IL to BIL. The resulting BIL IL is intended to be faithful to the semantics of the disassembled assembly instructions.

Lifted assembly instructions have all of the side-effects explicitly exposed. As a result, a single typical assembly instruction will be lifted as a sequence of BIL instructions. For example, the add eax, 0x2x86 instruction is lifted as:

The lifted BIL code explicitly detail all the side-effects of the add instruction, including all 6 eflags that are updated by the operation. As another example, an instruction with the rep prefix (whose semantics are in Figure 1.2) is lifted as a sequence of instructions that form a loop.

In addition to binary files, toil can also lift an instruction trace to the IL. Conditional branches in a trace are lifted as assert statements so that the executed branch is followed. This is done to prevent

branching outside the trace to an unknown instruction. The TEMU [4] dynamic analysis tool currently supports generating traces in the BAP trace format.

#### 2.5 Transformations on BIL Code

The iltrans tool translates our IL in various ways. The options to iltrans specify a sequence of transformations, optionally printing out the result of the transformation. For example, executing:

```
iltrans -trans1 -trans2 outfile -trans3 infile
```

would read in infile, perfore transformation trans1, the output of which is fed to trans2, the output of which is again fed to trans3 as well as printed to outfile. Although this may seem a little strange at first, the tool is designed to make it easy to specify exactly what algorithms you wish to perform over the II.

iltrans provides a library of common analyses and utilities, which can be used as building blocks for more advanced analyses. Please consult iltrans —help for exact arguments. We detail below at a high level several of the analysis that can be performed by iltrans

**Graphs.** iltrans provides options for building and manipulating control flow graphs (CFG), including a pretty-printer for the graphviz DOT graph language [2]. iltrans also provides options for building data, control, and program dependence graphs [23]. These graphs can be output in the dot language. The graphs themselves can be quite large. We currently use zgrviewer [5] for viewing the resulting dot files.

One issue when constructing graphs of an assembly program is determining the successors of jumps to computed values, called *indirect* jumps. Resolving indirect jumps usually involves a program analysis that require a CFG, e.g., VSA [8]. Thus, there is a potential circular dependency. Note that an indirect jump may potentially go anywhere, including the heap or code that has not been previously disassembled.

Our solution is to designate a special node as a successor of unresolved indirect jump targets in the CFG. We provide this so an analysis that depends on a correct CFG can recognize that we do not know the subsequent state. For example, a data-flow analysis could widen all facts to the lattice bottom. Most normal analyses will first run an indirect jump resolution analysis in order to build a more precise CFG that resolves indirect jumps to a list of possible jump targets. iltrans provides one such analysis, called Value Set Analysis [8].

**Single Static Assignment.** iltrans supports conversion to and from single static assignment (SSA) form [23]. SSA form makes writing an analysis easier because every variable is defined statically only once. Note we convert both memory and scalars to SSA form. We convert memories so that an analysis can syntactically distinguish between memories before and after a write operation instead of requiring the analysis itself to maintain similar bookkeeping. For example, in the memory normalization example in Figure 3.2, an analysis can syntactically distinguish between the memory state before the write on line 1, the write on line 5, and the read on line 7.

**Chopping.** Given a source and sink node, a program chop [18] is a graph showing the statements that cause definitions of the source to affect uses of the sink. For example, chopping can be used to restrict subsequent analyses to only a portion of code relevant to a given source and sink instead of the whole program. We describe our implementation of chopping in more detail in ??.

**Data-flow and Optimizations.** iltrans interfaces with the generic data-flow engine in BAP. The data-flow engine works on user-defined lattices, such as those for constant propagation, dead code elimination, etc. iltrans interfaces with several implements several data-flow analyses. BAP currently implements Simpson's global value numbering [29], constant propagation and folding [23], dead-code elimination [23], live-variable analysis [23], integer range analysis, and VSA [8].

We have implemented value set analysis [8]. Value set analysis is a data-flow analysis that over-approximates the values for each variable at each program point. Value-set analysis can be used to help resolve indirect jumps. It can also be used as an alias analysis. Two memory accesses are potentially aliased if the intersection of their value sets is non-empty.

Optimizations are useful for simplifying or speeding up subsequent analyses. For example, we have found that the time for the decision procedure STP to return a satisfying answer for a query can be cut in half by first using program optimization to simplify the query [10].

#### 2.6 Tools from BIL

#### 2.6.1 topredicate

topredicate supports program verification in two ways. First, topredicate can convert the IL into Dijkstra's Guarded Command Language (GCL), and calculate the weakest precondition with respect to GCL programs [14]. The weakest precondition for a program with respect to a predicate q is the most general condition such that any input satisfying the condition is guaranteed to terminate (normally) in a state satisfying q. Currently we only support acyclic programs, i.e., we do not support GCL while.

topredicate also interfaces with decision procedures. topredicate can write out expressions (e.g., weakest preconditions) in CVC Lite syntax [1], which is supported by several decision procedures. In addition, topredicate interfaces directly with the STP [16] decision procedure through calls from BAP to the STP library.

#### 2.6.2 evaluate

evaluator evaluates a given BIL program using the operational semantics in in 3.2.1. The evaluator allows us to execute programs without recompiling the IL back down to assembly. For example, we can test that a raised program is correct by executing the IL on an input i, observing the value v, executing the original binary program on i, observing the value v', and verifying v = v'.

#### **2.6.3** toc

toc generates valid C code from the IL. For example, one could use this as a rudimentary decompiler by first raising assembly to BIL, then writing it out as valid C. The ability to export to C also provides a way to compile BIL programs: the IL is written as C, then compiled with a C compiler.

The C code generator implements memories in the IL as arrays. A store operation is a store on the array, and a load is a load from the array. Thus, C-generated code simulates real memory. For example, consider a program that is vulnerable to a buffer overflow attack. It is raised to BIL, then written as C and recompiled. An out-of-bound write on the original program will be simulated in the corresponding C array, but will not lead to a real buffer overflow.

#### 2.7 Discussion

#### 2.7.1 Why Design a New Infrastructure?

At a high level, we designed BAP as a new platform because existing platforms are a) defined for higher-level languages and thus not appropriate for binary code analysis, b) designed for orthogonal goals such as binary instrumentation or decompilation, and/or c) unavailable to use for research purposes. As a result, other tools we tried (e.g., DynInst [28] version 5.2, Phoenix [21] April 2008 SDK, IDA Pro [13] version 5, and others) were inadequate, e.g., could not create a correct control flow graph for the 3 line assembly shown in Figure 1.1. Another reason we created BAP is so that we could be sure of the semantics of analysis. Existing platforms tend not to have publicly available formally defined semantics, thus we would not know exactly what we are using.

Formal semantics are important for several practical reasons. We found that defining the semantics of BAP was helpful in catching design bugs, unsupported assumptions, and other errors that could affect many different kinds of analysis. In addition, the semantics are helpful for communicating how BAP works with other researchers. Further, without a specified semantics, it is difficult to show an analysis is correct. For example, in [11] we show our proposed assembly-level alias analysis [11] is correct with respect to the operational semantics of BAP.

#### 2.7.2 Limitations of BAP

BAP is designed to enable program analysis of binary program states. Therefore, analyses that depends upon more than the operational semantics of the instruction fall outside the scope of BAP. For example, creating an analysis of the timing behavior of binary programs falls outside the current scope of BAP.

Some architectures, such as later versions of ARM, are bi-endian where the endianness is specified by a register status flag and can be changed mid-program. BAP currently supports big and little endian, but does not support bi-endian architectures. Adding support is straight-forward by changing store and load instructions to have an additional field indicating the endianness (which can then be removed by normalizing memory appropriately). If we annotate individual loads and stores, we would need to remove the endianness as part of the memory type declaration in order to prevent the semantics allowing type-incompatible loads and stores on memory.

BAP does not strongly associated an assembly instruction to the corresponding sequence of lifted IL instructions. Currently BAP prefixes a sequence of IL instructions with a comment containing the pretty-printed assembly. However, an analysis itself does not know which IL instructions arose from which assembly. Although users of BAP often seem to want to tie IL instructions to assembly instructions, it is unclear how to accomplish this in flexible way that is also non-intrusive to analyses themselves.

One principle in BAP is to note explicitly in the IL any unhandled instructions during lifting. For each architecture supported, there may be many instructions which BAP cannot lift. Such instructions are noted in the IL as either special or unknown. For example, BAP does not currently support floating point numbers. The central reason lies in properly defining the semantics of floating point with respect to rounding. This is a problem we plan on tackling in future versions of BAP.

#### 2.7.3 Is Lifting Correct?

Assembly instructions are lifted to BAP in a syntax directed manner. One may view that the BAP IL is a model of the underlying assembly. There is a chance, however, that lifting could produce incorrect IL.

Although it is impossible to say that all assembly instructions are correctly lifted, the advantage of BAP's design is that only the lifting process needs to understand the semantics of the original assembly.

We have developed several measures to make sure our IL is correct. First, we have a series of unit tests for IL translations. A unit test executes both the real instruction and the lifted IL in a know pre-state and verifies both terminate in the same post-state. Second, we have performed larger scale tests that verify the executions consisting of millions of instructions produce the same output value on the corresponding IL.

#### 2.7.4 Size of Lifting IL for a Program

A single assembly instruction will typically be translated into several BAP instructions. Thus, the resulting BAP program will have more statements than the corresponding assembly. For example and roughly speaking, x86 assembly instructions are raised to be about 7 BAP instructions: 1 BAP instruction for the direct effect, and 6 for updating processor-specific status flags.

In our experience, the constant-size factor in code size is worth the benefits of simplifying the semantics of assembly. Assembly instructions are designed to be efficient for a computer to execute, not for a human to understand. The IL, on the other hand, is designed to be easy for a human to understand. We have found even experienced assembly-level programmers will comment that they have a hard time keeping track of control and data dependencies since there are few syntactic cues in assembly to help. BAP, on the other hand, obviates all data and control dependencies within the code.

### Formalization of BIL

At the core of BAP is the BAP intermediate language, called BIL. In this chapter we present a formalization of BIL. The formalization is intended to provide an exact description of what we mean by statements in the IL.

*Note:* In our implementation, as discussed elsewhere, there are actually several IL's. Throughout this chapter we discuss the IL as implemented in ast.ml. The other IL's differ in uninteresting ways and are variants of the IL discussed here. In particular, developers will be interested in ssa.ml, which defines the IL in SSA form, and also does not allow recursively defined expressions (i.e., it is an SSA three-address code). Thus, this chapter can be viewed as a specification for the actual code.

### 3.1 The BIL Language

Table 3.1 shows the syntax of BIL. We use the term "instruction" to refer to an assembly-level instruction, and the term "statement" to refer to instructions within BIL. Thus, BAP raises instructions to statements in BIL. In ?? we give basic type-checking rules which disallow certain syntactically valid but nonsensical expressions, and in 3.2.1 we provide the operational semantics. In the remainder of this section we give an informal description and motivation for constructs in the IL.

#### 3.1.1 Values and Types

The base types  $\tau_{\text{reg}}$  in BIL IL are 1, 8, 16, 32, and 64-bit registers (i.e., n-bit vectors), and memories. Memories are given type  $\text{mem}_{\text{treg}}(\tau_{\text{reg}})$ , where  $\tau_{\text{reg}}$  determines the type for memory addresses. For example,  $\text{mem}(\tau_{\text{reg}32,t})$  corresponds to memory on a typical 32-bit machine.

We also have arrays, which are given type  $\mathtt{array\_t}(\tau_{reg}, \tau_{reg})$ . The tuple  $(\tau_{reg}, \tau_{reg})$  specifies the index and element type of an array, respectively. As we will see, we use arrays to *normalize* endianed memory accesses.

There are three types of values in BIL. First, BIL has numbers n of type  $\tau_{\rm reg}$ . Second, BIL has memory values  $\{n_{a1} \to n_{v1}, n_{a2} \to n_{v2}, ...\}$ , where  $n_{ai}$  denotes a number used as an address, and  $n_{vi}$  denotes the value stored at the address. Finally, BIL has a nonsense value  $\perp$ .  $\perp$  values are not exposed to the user and cannot be constructed in the presentation language.  $\perp$  is used internally to indicate a failed execution.

```
::= stmt*
program
stmt
                ::= var := exp \mid jmp(exp) \mid cjmp(exp,exp,exp)
                        | halt(exp) | assert(exp) | label label_kind | special(string)
                ::= \  \, \log(\exp,\exp,\exp,\tau_{\rm reg}) \, | \, {\rm store}(\exp,\exp,\exp,\exp,\tau_{\rm reg}) \, | \, \exp \lozenge_b \, \exp
exp
                        | \lozenge_u exp | var | lab(string) | integer | cast(cast\_kind, \tau_{reg}, exp)
                        | let var = exp in exp | unknown(string, \tau) | name(exp)
label_kind
                ::= integer | string
cast_kind
                ::= unsigned | signed | high | low
                 ::= (string, id<sub>v</sub>, \tau)
var
                ::= +, -, *, /, /_s, \mod, \mod_s, \ll, \gg, \gg_a, \&, |, \oplus, ==, !=, <, \leq, <_s, \leq_s
\Diamond_b
\Diamond_u
                := - (unary minus), \sim (bit-wise not)
                 ::= integer \mid memory \mid string \mid \bot
value
integer
                ::= n(:\tau_{reg})
                ::= \; \left\{ \; \textit{integer} \rightarrow \textit{integer}, \textit{integer} \rightarrow \textit{integer}, \ldots \right\} (:\tau_{\text{mem}})
memory
                 := \tau_{reg} \mid \tau_{mem}
                ::= mem_t(\tau_{reg}) | array_t(\tau_{reg}, \tau_{reg})
\tau_{mem}
                ::= reg1_t | reg8_t | reg16_t | reg32_t | reg64_t
	au_{\it reg}
```

Table 3.1: The Binary Intermediate Language. Note commas separte operators.

#### 3.1.2 Expressions

Expressions in BIL are side-effect free, and are similar to those found in most languages. BIL has binary operations  $\Diamond_b$  (note "&" and "|" are bit-wise), unary operations  $\Diamond_u$ , constants, let bindings, and casting. Casting is used when indexing registers under different addressing modes. For example, the lower 8 bits of eax in x86 are known as al. When lifting x86 instructions, we use casting to project out the lower-bits of the corresponding eax register variable to an al register variable when al is accessed.

The semantics of  $load(e_1, e_2, e_3, \tau_{reg})$  is to load from the memory specified by  $e_1$  at address  $e_2$ . In C, this would loosely be written  $e_1[e_2]$ . The parameter  $e_3$  tells us the endianness to use when loading bytes from memory.  $e_3$  is forced to be of type bool, where we arbitrary affix the meaning that 0 is little endian and 1 is big endian (since 0 is "littler" than 1). Some architecture consistently use the same endianness, e.g., for x86, the value of  $e_3$  will always correspond to little-endianness. However, other architectures such as ARM specify the endianness of a load at run-time. Finally,  $\tau_{reg}$  tells us how many bytes to load. In C, if  $e_1$  is of type  $\tau$ , then  $e_1[e_2]$  loads  $sizeof(\tau)$  bytes.  $\tau_{reg}$  similarly tells us how many bytes to load from memory. While technically we could infer  $\tau_{reg}$  given  $e_1$ , we keep it in the IL explicitly for efficiency.

In BIL, the store operation is pure (i.e., side-effect free). The advantage of pure memory operations in BIL notation is it makes it possible to syntactically distinguish what memory is modified or read. One place we take advantage of this is in SSA (see ??) where both scalars and memory have a unique single static assignment location.

Each store expression must specify what memory to load or store from. The resulting memory is returned as a value. The semantics of  $store(e_1, e_2, e_3, e_4, \tau_{reg})$  are to store in memory  $e_1$ , starting at address  $e_2$  the value  $e_3$ . The store is performed given the endianness of  $e_4$ . This may seem very complicated when reading. However, the operational semantics are quite simple: you may want to read the STORE rules.

The last expression type of note is unknown. An unknown specifies an operation we could not lift to BIL. The purpose of an unknown is to adhere to the BAP principle to *know what you do not know*. Consider the case where Intel adds a new instruction, e.g., as happens in each processor revision. BAP may not know about such instructions, thus cannot raise it to BIL. One option would be to ignore such instructions. However, the result of any analysis would be suspect in this case. A sound option is to abort lifting and raise an error. However, we often end up not interested in particular instructions. Our solution is to raise such instructions, when possible, to an assignment (where the left hand side is of the correct type and name) with the actual operation left unspecified as unknown.

#### 3.1.3 Statements and Programs

A program in BIL is a sequence of statements. There are 7 different kinds of instructions. The language has assignments, jumps, conditional jumps, and labels. The target of all jumps and conditional jumps must be a valid label in our operational semantics, else the program terminates in the error state ( $\bot$ ). Note that a jump to an undefined location (e.g., a location that was not disassembled such as to dynamically generated code) results in the BAP program halting with  $\bot$  (see 3.2.1). A program can halt normally at any time by issuing the halt statement. We also provide assert, which acts similar to a C assert: the asserted expression must be true, else the machine halts with  $\bot$ .

A special in BIL corresponds to a call to an externally defined procedure or function. special statements typically arise from system calls. The id of a special indexes the kind of special, e.g., what system call.

The semantics of special are up to the analysis; its operational semantics are not defined (Chapter 3.2.1). We include special as an instruction type to explicitly distinguish calls that alter the soundness



Figure 3.1: (a) shows an example of little-endian stores as found in x86 that partially overlap. (b) shows memory after executing line 1, and (c) shows memory after executing line 5. Line 7 will load the value 0x22bb.

Figure 3.2: Normalized version of the store and load from Figure 3.1a.

of an analysis. A typical approach to dealing with special is to replace special with an analysis-specific summary function written in the BAP IL that is appropriate for the analysis.

### 3.2 Normalized Memory

The endianness of a machine is usually specified by the byte-ordering of the hardware. Little endian architectures such as x86 put the low-order byte first. Big-endian architecture put the high-order byte first. Some architectures, such as ARM, allow the endianness to be specified by the instruction, e.g., mov would take three arguments: the source, destination, and endianness for which to perform the move.

We must take endianness into account when analyzing memory accesses. Consider the assembly in Figure 3.1a. The mov operation on line 2 writes 4 bytes to memory in little endian order (since x86 is little endian). After executing line 2, the address given by eax contains byte 0xdd, eax+1 contains byte 0xcd, and so on, as shown in Figure 3.1b. Lines 2 and 3 set ebx = eax+2. Line 4 and 5 write the 16-bit value 0x1122 to ebx. An analysis of these few lines of code needs to consider that the write on line 4 overwrites the last byte written on line 1, as shown in Figure 3.1c. Considering such cases requires additional logic in each analysis. For example, the value loaded on line 7 will contain one byte from each of the two stores.

We say a memory is *normalized* for a b-byte addressable memory if all loads and stores are exactly b-bytes and b-byte aligned. In x86, memory is byte addressable, so a normalized memory for x86 has all loads

#### Contexts

Instruction  $\Pi$   $n \mapsto instr$  Maps an instruction address to an instruction.

Variable  $\Delta id \mapsto var$  Maps a variable ID to its value.

Labels  $\Lambda$  label\_kind  $\mapsto n$  Maps a label to the address of the corresponding label instruc-

tion number.

#### **Notation**

 $\Delta \vdash e \Downarrow v$  Expression e evaluates to value v given variable context  $\Delta$  as given

by the expression evaluation rules.

 $\Delta' = \Delta[x \leftarrow v]$   $\Delta'$  is the same as  $\Delta$  except extended to map x to v.

 $\Pi \vdash p : i$   $\Pi$  maps instruction address p to instruction i. If  $p \notin \Pi$ , the machine

gets stuck.

 $\Lambda \vdash v : p$   $\Lambda$  maps instruction label v to instruction address p. If  $v \notin \Lambda$ , then

machine gets stuck. In addition, a well-formed machine should have

 $\Pi \vdash p : i \text{ where } i = \texttt{label} \ \ v, \text{ otherwise the machine is stuck}.$ 

 $(\Delta, p, i) \leadsto (\Delta', p', i')$  An execution step. p and p' are the pre and post step program counters, i and i' are the pre and post step instructions, and  $\Delta$  and  $\Delta'$  are the

pre and post step variable contexts. Note  $\Lambda$  and  $\Pi$  are currently static,

thus for brevity not included in the execution context.

Table 3.2: Operational semantics notation.

and stores at the byte level. The normalized form for the write on Line 1 of Figure 3.1a in BIL is shown in Figure 3.2. Note that the subsequent load on line 7 is with respect to the current memory mem6.

Normalized memory makes writing program analyses involving memory easier. Analysis is easier because normalized memory syntactically exposes memory updates that are otherwise implicitly defined by the endianness. As a result, analyses do not have to reason explicitly about overlapping memory, byte order, etc. BAP provides utilities for normalizing all memory operations so that users can write analysis more easily.

#### 3.2.1 Operational Semantics

The operational semantics for BIL are shown in Table 3.3. The abstract machine configuration is given by the tuple  $(\Pi, \Delta, p, i)$  where  $\Pi$  is the list of instructions,  $\Delta$  is the variable context, p is the instruction pointer, and i is the current instruction. We write  $\Delta' = \Delta[x \leftarrow v]$  to indicate that  $\Delta'$  is the same as  $\Delta$  except that variable x is updated with value v. For simplicity, we use  $\Delta$  both as a scalar and a memory context. When ambiguous, such as in the STORE rule, we indicate the type of the variable in  $\Delta$  in parentheses. We write  $\Pi[p]$  to indicate the instruction given by address p.

The operational semantics can be read as follows. Each step of the execution is associated with a machine configuration  $M = (\Pi, \Delta, p, i)$ . A transition is given by  $M \rightsquigarrow M'$  where the current configuration M matches the left side of  $\rightsquigarrow$  in the conclusion (below the horizontal bar), resulting in a state M' to the right. The transformation from M to M' is given by the rule premise (above the horizontal bar).

*Note:* BAP and BIL do not analyze dynamically generated code. Thus, in a machine state transition  $(\Pi, \Delta, p, i) \rightarrow (\Pi', \Delta, p, i)$ ,  $\Pi = \Pi'$  always. Since  $\Pi$  (the list of instructions) never changes, we omit

#### Instructions

$$\frac{\Delta \vdash e \Downarrow v \quad \Delta' = \Delta[x \leftarrow v] \quad \Pi \vdash p + 1 : i}{\Delta, p, \mathbf{x} := e \leadsto \Delta', p + 1, i} \quad \text{Assign} \quad \frac{\Delta \vdash \ell \Downarrow v \quad \Lambda \vdash v : p' \quad \Pi \vdash p' : \text{label } v}{\Delta, p, \text{jmp} \, (\ell) \, \leadsto \Delta, p', \text{label } v} \quad \text{JMP}$$
 
$$\frac{\Pi[p+1] = i}{\Delta, p, \text{label} \quad \ell \leadsto \Delta, p + 1, i} \quad \text{LABEL} \qquad \frac{\Delta \vdash e \Downarrow v}{\Delta, p, \text{halt} \quad e \leadsto \text{terminate with } v} \quad \text{HALT}$$
 
$$\frac{\Delta \vdash e \Downarrow 1 \quad \Pi[p+1] = i}{\Delta, p, \text{assert}(e) \leadsto \Delta, p + 1, i} \quad \text{ASSERT-T} \qquad \frac{\Delta \vdash e \Downarrow 0}{\Delta, p, \text{assert}(e) \leadsto \text{terminate with } \bot} \quad \text{ASSERT-F}$$
 
$$\frac{\Delta \vdash e \Downarrow 1 \quad \Delta \vdash \ell_T \Downarrow v \quad \Lambda \vdash v : p' \quad \Pi \vdash p' : \text{label} \quad v}{\Delta, p, \text{cjmp} \, (e, \quad \ell_T, \quad \ell_F) \quad \leadsto \Delta, p', \text{label} \quad v} \quad \text{CJMP-T}$$
 
$$\frac{\Delta \vdash e \Downarrow 0 \quad \Delta \vdash \ell_F \Downarrow v \quad \Lambda \vdash v : p' \quad \Pi \vdash p' : \text{label} \quad v}{\Delta, p, \text{cjmp} \, (e, \quad \ell_T, \quad \ell_F) \quad \leadsto \Delta, p', \text{label} \quad v} \quad \text{CJMP-F}$$

No rule for special, when  $v \notin \Lambda$ , and when  $p \notin \Pi$ .

#### **Expressions**

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v_2 \quad v = v_1 \Diamond_b v_2}{\Delta \vdash e_1 \Diamond_b e_2 \Downarrow v} \quad \text{BINOP} \quad \frac{\Delta \vdash e_1 \Downarrow v_1 \quad v = \Diamond_u v_1}{\Delta \vdash \Diamond_u e_1 \Downarrow v} \quad \text{UNOP}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v}{\Delta \vdash v \Downarrow v} \quad \text{VAR} \quad \frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta' = \Delta[x \leftarrow v_1] \quad \Delta' \vdash e_2 \Downarrow v}{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v} \quad \text{LET}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v_2 \quad \Delta \vdash e_3 \Downarrow 0 \quad n = \# \text{ bytes of } \tau_{\text{reg}} \quad v = v_1[v_2..v_2 + n] \text{ in little endian byte order}}{\Delta \vdash 1 \text{ load } (e_1, \quad e_2, \quad e_3, \quad \tau_{\text{reg}}) \Downarrow v} \quad \text{LOAD}_{\text{BIG}}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v_2 \quad \Delta \vdash e_3 \Downarrow 1 \quad n = \# \text{ bytes of } \tau_{\text{reg}} \quad v = v_1[v_2..v_2 + n] \text{ in big endian byte order}}{\Delta \vdash 1 \text{ load } (e_1, \quad e_2, \quad e_3, \quad \tau_{\text{reg}}) \Downarrow v} \quad \text{LOAD}_{\text{BIG}}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v_2 \quad \Delta \vdash e_3 \Downarrow 1 \quad n = \# \text{ bytes } \sigma_{\text{reg}} \quad v = v_1[v_2..v_2 + n] \text{ in big endian byte order}}{\Delta \vdash 1 \text{ load } (e_1, \quad e_2, \quad e_3, \quad \tau_{\text{reg}}) \Downarrow v} \quad \text{LOAD}_{\text{ARRAY}}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v_2 \quad \Delta \vdash e_3 \Downarrow v_3 \quad \Delta \vdash e_4 \Downarrow 0 \quad n = \# \text{ bytes } \tau_{\text{reg}} \quad v = v_1[v_2..v_2 + n \leftarrow v_3] \text{ (little endian)}}{\Delta \vdash \text{ store } (e_1, \quad e_2, \quad e_3, \quad e_4, \quad \tau_{\text{reg}}) \Downarrow v} \quad \text{STORE}_{\text{BIG}}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v_2 \quad \Delta \vdash e_3 \Downarrow v_3 \quad \Delta \vdash e_4 \Downarrow 1 \quad n = \# \text{ bytes } \tau_{\text{reg}} \quad v = v_1[v_2..v_2 + n \leftarrow v_3] \text{ (big endian)}}{\Delta \vdash \text{ store } (e_1, \quad e_2, \quad e_3, \quad e_4, \quad \tau_{\text{reg}}) \Downarrow v} \quad \text{STORE}_{\text{BIG}}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad \Delta \vdash e_2 \Downarrow v_2 \quad \Delta \vdash e_3 \Downarrow v_3 \quad \Delta \vdash e_4 \Downarrow 1 \quad n = \# \text{ bytes } \tau_{\text{reg}} \quad v = v_1[v_2..v_2 + n \leftarrow v_3] \text{ (big endian)}}}{\Delta \vdash \text{ store } (e_1, \quad e_2, \quad e_3, \quad e_4, \quad \tau_{\text{reg}}) \Downarrow v} \quad \text{STORE}_{\text{BIG}}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad (e_1 \colon \text{ array.t.}) \quad \Delta \vdash e_2 \Downarrow v_2 \quad \Delta \vdash e_3 \Downarrow v_3 \quad (e_4 \text{ ignored}) \quad v = v_1[v_2 \leftarrow v_3]}}{\Delta \vdash \text{ cast } \text{ (unsing end, } \quad \tau_{\text{reg}} \text{ (big endian)}} \quad \text{STORE}_{\text{ARRAY}}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad (e_1 \colon \text{ array.t.}) \quad \Delta \vdash e_2 \Downarrow v_2 \quad \Delta \vdash e_3 \Downarrow v_3 \quad (e_4 \text{ ignored}) \quad v = v_1[v_2 \leftarrow v_3]}}{\Delta \vdash \text{ cast } \text{ (unsing end, } \quad \tau_{\text{reg}} \text{ (big endian)}} \quad \text{STORE}_{\text{BIG}}$$

$$\frac{\Delta \vdash e_1 \Downarrow v_1 \quad (e_1 \colon \text{ array.t.}) \quad \Delta \vdash e_2 \Downarrow v_2$$

Table 3.3: Operational Semantics.

 $\Pi$  from the rules for brevity. One could add support for dynamically generated code by adding rules for updating  $\Pi$ .

ASSIGN and LABEL are sequential instructions that carry out the respective operation, then look up and transition to the next sequential instruction p+1. The semantics of LABEL is a no-op: we use labels for jump targets. ASSIGN updates the variable context  $\Delta$  resulting in a new context  $\Delta'$ . As mentioned, there is no rule for special; any program with a special remaining may get stuck.

Control flow is handled by jmp (since cjmp is a derived form per Chapter  $\ref{eq:pmp}$ ). A jmp instruction evaluates the jump target e to a value v, then looks up the instruction associated with v. The NO-INST rule terminates the program in the error state when v is not associated with an instruction. For example, consider the case when a program reads in user input at location v, then issues a jump to v. The user input will be decoded as instructions at run-time. However, since the instruction comes from user input, we cannot include it in the analysis. In BAP, we indicate such possibilities by terminating in error.

# **BAP Development Guide**

#### 4.1 Overview

The BAP infrastructure is implemented in C++ and OCaml. The front-end lifting is implemented primarily in C++. The back-end is implemented in OCaml. We interface the C++ front-end with the OCaml back-end using OCaml via IDL generated stubs.

The front-end interfaces with Valgrind's VEX [27] to help lift instructions (see Chapter ??), GNU BFD for parsing executable objects, and GNU libopcodes for pretty-printing the disassembly. Each disassembled instruction is lifted to the IL, then passed to the back-end.

The BAP top-level directory structure is:

- **libasmir:** This directory contains all the code that interfaces which parses a binary file (when appropriate), interfaces with Valgrind's VEX [27], and produces an initial IL. We are planning to migrate away from most of the C++ code in this directory to OCaml.
- ocaml: This directory contains the core BAP library and routines. All the code is written in ocaml. Only core BAP code should go on this directory. If you must modify it in a project-specific way, please make a project-specific branch for those changes.
- utils: This contains user utilities, again written in OCaml.
- doc: The directory containing this documentation, as well as ocamldoc generated documentation. (Note: ocamldoc needs to be run after compiling.)

### **Related Work**

**Other Binary Analysis Platforms.** BAP is designed to 1) have a formal, well-defined IL, 2) explicitly expose the semantics of complex assembly in terms of the simpler BAP IL, and 3) be easy to re-target. There are several other binary analysis platforms which may have some similarity to BAP, do not fulfill all requirements.

Phoenix is a program analysis environment developed by Microsoft as part of their next generation compiler [21]. One of the Phoenix tools allows Microsoft compiled code to be raised up to a register transfer language (RTL). A RTL is a low-level IR that resembles an architecture-neutral assembly.

Phoenix differs from BAP in several ways. First, Phoenix can only lift code produced by a Microsoft compiler. Second, Phoenix requires debugging information, thus is not a true binary-only analysis platform. Third, Phoenix lifts assembly to a low-level IR that does not expose the semantics of complicated instructions, e.g., register status flags, as part of the IR [22]. Fourth, the semantics of the lifted IR, as well as the lifting semantics and goals, are not well specified [22], and thus not suitable for our research purposes.

The CodeSurfer/x86 platform [9] is a proprietary platform for analyzing x86 programs. At the core of CodeSurfer/x86 is a value-set analysis (VSA) [8]. We have also implemented VSA in BAP. CodeSurfer/x86 was not made available for comparison, thus we do not know whether it supports a well-defined IL, is re-targetable, or what other comparable aspects it may share with BAP.

**Decompilation.** Decompilation is the process of inverting the compilation of a program back to the original source language. Generally, the goal of decompilation is to recover a valid program in the original source language that is semantically equivalent (though need not be exactly the same) as the original source program.

Program analysis is often an important component in the decompilation process. Cifuentes has proposed using data and control flow analyses as part of decompilation [12]. Van Emmerik has shown that analysis on SSA flow graphs is helpful in a variety of tasks such as reconstructing types and resolving indirect jumps [15]. BAP also has SSA; thus it should be straight-forward to implement Van Emmerik's algorithms. Mycroft has proposed type inference for recovering C types during decompilation [24]. Adding type inference is a possible future direction for BAP.

**Binary Instrumentation.** Binary instrumentation is a technique to insert extra code into a binary that monitors the instrumented program's behavior (e.g., [6, 19, 20, 25, 27, 28, 30]). Instrumentation is performed by inserting jumps in the original binary code to the instrumentation code. The instrumentation code then

jumps back to the original code after executing. The instrumentation code must make sure that the execution state is the same before and after the jump in order for the instrumentation to be transparent.

Although many binary instrumentation tools provide a limited amount of program analysis, the end-goal is instrumentation, not facilitating analyses. For example, Pin [20] calculates register liveness information. However, general static analysis is outside the scope of such tools. For instance, instrumentation tools generally do not expose the semantics of all instructions such as register status flag updates.

Other Program Analysis Platforms. BAP shares many of the same goals, such as modularity and ease of writing correct analyses, as other program analysis platforms. For example, CIL [26] and SUIF [3] are both excellent platforms for analyzing C code. However, using higher-level program analysis platforms is inappropriate for binary code because binary code is fundamentally different. While higher-level languages have types, functions, pointers, loops, and local variables, assembly has no types, no functions, one globally addressed memory region, and goto's. The BAP IL and surrounding platform is designed specifically to meet the challenges of faithfully analyzing assembly.

# **Credits**

BAP is currently maintained mostly by Ivan Jager. BAP is a complete rewrite of work David Brumley conducted for his thesis. That work was called Vine. Contributors to the original Vine code include: Juan Caballero, Cody Hartwig, Ivan Jager, Eric Li, James Newsome, Pongsin Poosankam, and Heng Yin. Dawn Song was David's advisor, and provided funds, food, support, and many other things that made Vine (thus BAP) possible.

## **Bibliography**

- [1] CVC Lite documentation. http://www.cs.nyu.edu/acsys/cvcl/doc/. URL checked 7/26/2008.
- [2] The DOT language. http://www.graphviz.org/doc/info/lang.html. URL checked 7/26/2008.
- [3] The SUIF 2 compiler system. http://suif.stanford.edu/suif/suif2/index.html. URL checked 7/27/2008.
- [4] TEMU. http://bitblaze.cs.berkeley.edu/temu.html. URL checked 7/27/2008.
- [5] Zgrviewer. http://zvtm.sourceforge.net/zgrviewer.html. URL Checked 3/6/09.
- [6] On the run building dynamic modifiers for optimization, detection, and security. Original DynamoRIO announcement via PLDI tutorial, June 2002.
- [7] ARM. ARM Architecture Reference Manual, 2005. Doc No. DDI-0100I.
- [8] Gogul Balakrishnan. WYSINWYX: What You See Is Not What You eXecute. PhD thesis, Computer Science Department, University of Wisconsin at Madison, August 2007.
- [9] Gogul Balakrishnan, Radu Gruian, Thomas Reps, and Tim Teitelbaum. Codesurfer/x86 a platform for analyzing x86 executables. In *Proceedings of the International Conference on Compiler Construction*, April 2005.
- [10] David Brumley, Cody Hartwig, Zhenkai Liang, James Newsome, Pongsin Poosankam, Dawn Song, and Heng Yin. Automatically identifying trigger-based behavior in malware. In Wenkee Lee, Cliff Wang, and David Dagon, editors, Botnet Detection, volume 36 of Countering the Largest Security Threat Series: Advances in Information Security. Springer-Verlag, 2008.
- [11] David Brumley and James Newsome. Alias analysis for assembly. Technical Report CMU-CS-06-180, Carnegie Mellon University School of Computer Science, 2006.
- [12] Cristina Cifuentes. *Reverse Compilation Techniques*. PhD thesis, Queensland University of Technology, July 1994.
- [13] DataRescue. IDA Pro. http://www.datarescue.com. URL checked 7/31/2008.
- [14] E.W. Dijkstra. A Discipline of Programming. Prentice Hall, Englewood Cliffs, NJ, 1976.

BIBLIOGRAPHY BIBLIOGRAPHY

[15] Michael James Van Emmerik. *Single Static Assignment for Decompilation*. PhD thesis, The University of Queensland School of Information Technology and Electrical Engineering, May 2007.

- [16] Vijay Ganesh and David L. Dill. A decision procedure for bit-vectors and arrays. In W. Damm and H. Hermanns, editors, *Proceedings on the Conference on Computer Aided Verification*, volume 4590 of *Lecture Notes in Computer Science*, pages 524–536, Berlin, Germany, July 2007. Springer-Verlag.
- [17] Intel Corporation. *Intel 64 and IA-32 Architectures Software Developer's Manual, Volumes 1-5*, April 2008.
- [18] Daniel Jackson and Eugene J. Rollins. Chopping: A generalization of slicing. Technical Report CS-94-169, Carnegie Mellon University School of Computer Science, 1994.
- [19] James Larus and Eric Schnarr. EEL: machine-independent executable editing. In *Proceedings of the ACM Conference on Programming Language Design and Implementation*, pages 291–300, 1995.
- [20] Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. Pin: Building customized program analysis tools with dynamic instrumentation. In *Proceedings of the ACM Conference on Programming Language Design and Implementation*, June 2005.
- [21] Microsoft. Phoenix framework. http://research.microsoft.com/phoenix/. URL checked 7/31/2008.
- [22] Microsoft. Phoenix project architect posting. http://forums.msdn.microsoft.com/en-US/phoenix/thread/90f5212c-f05a-4aea-9a8f-a5840a6d101d, July 2008. URL checked 7/31/2008.
- [23] Steven S. Muchnick. Advanced Compiler Design and Implementation. Academic Press, 1997.
- [24] Alan Mycroft. Type-based decompilation. In European Symposium on Programming, March 1999.
- [25] Susanta Nanda, Wei Li, Lap chung Lam, and Tzi cker Chiueh. BIRD: Binary interpretation using runtime disassembly. In *Proceedings of the IEEE/ACM Conference on Code Generation and Optimization*, March 2006.
- [26] George C. Necula, Scott McPeak, S.P. Rahul, and Westley Weimer. CIL: Intermediate language and tools for analysis and transformation of C programs. In *Proceedings of the International Conference on Compiler Construction*, 2002.
- [27] Nicholas Nethercote. *Dynamic Binary Analysis and Instrumentation or Building Tools is Easy*. PhD thesis, Trinity College, University of Cambridge, 2004.
- [28] Paradyn/Dyninst. Dyninst: An application program interface for runtime code generation. http://www.dyninst.org. URL checked 9/28/2008.
- [29] Loren Taylor Simpson. Value-Driven Redundancy Elimination. PhD thesis, Rice University Department of Computer Science, 1996.
- [30] Amitabh Srivastava and Alan Eustace. ATOM: A system for building customized program analysis tools. In *Proceedings of the ACM Conference on Programming Language Design and Implementation*, pages 196–205, 1994.