Primus - the Microexecution Framework #651
This PR brings lots of new stuff to BAP, but the main contribution is Primus, the microexecution framework. It also adds a few libraries that are independent of BAP but on which Primus depends, namely the Monads library and the Ogre library. The PR also reimplements Beagle and provides its internals as a library, as explained later.
Primus is a framework for microexecution. The documentation will be added soon, so I will provide only a short description. Primus consists of an Interpreter that can be extended with components. An interpreter together with its components (such as Memory, IO, Linker, Env) is called a Primus Machine. The interpreter itself is implemented as a multistate monad transformer. The multistate monad is implemented in the Monads library (described below) and is basically a State Monad that can have multiple states. In other words, the Primus Machine is a non-deterministic machine that can clone itself. The idea is to make it possible to solve NP-hard problems using a polynomial approach by separating the concerns: a polynomial-time kernel is applied non-deterministically (in the Turing sense) to all possible clones of a machine. Put more simply, the default use case is that an analyst implements the logic of an analysis assuming that the machine is deterministic and the execution is totally linear (thus totally ignoring the state explosion and path enumeration problems), and then chooses a particular mode of execution that will apply the analysis to some paths of execution.
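To make the "State Monad that can have multiple states" idea concrete, here is a minimal sketch in plain OCaml. It is not the Monads library API: the `machine` type and the `fork`, `get`, and `put` names are assumptions made for illustration, and the real implementation is a monad transformer with considerably more structure.

```ocaml
(* A toy "multistate" monad: a computation maps one machine state to a
   list of (result, state) pairs, one pair per live clone.
   Illustrative assumption only, not the Monads library interface. *)
type ('a, 's) machine = 's -> ('a * 's) list

let return (x : 'a) : ('a, 's) machine = fun s -> [ (x, s) ]

(* Run [m], then continue every resulting clone with [f]. *)
let ( >>= ) (m : ('a, 's) machine) (f : 'a -> ('b, 's) machine) :
    ('b, 's) machine =
 fun s -> List.concat (List.map (fun (x, s') -> f x s') (m s))

let get : ('s, 's) machine = fun s -> [ (s, s) ]
let put (s : 's) : (unit, 's) machine = fun _ -> [ ((), s) ]

(* Clone the machine: each alternative continues in its own copy of
   the state. *)
let fork (choices : 'a list) : ('a, 's) machine =
 fun s -> List.map (fun c -> (c, s)) choices

(* The analysis is written as if execution were linear: pick a branch
   direction, then update a counter kept in the state. *)
let step : (unit, int) machine =
  fork [ true; false ] >>= fun taken ->
  get >>= fun n ->
  put (if taken then n + 1 else n)

(* Two branches in sequence yield four clones, one per path. *)
let final_states : int list = List.map snd ((step >>= fun () -> step) 0)
```

Here `step` never mentions paths or clones, yet `final_states` evaluates to `[2; 1; 1; 0]`, one final counter per explored path. In this toy version every path is explored; a different mode of execution would correspond to a different policy for which clones survive.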
Currently, the only mode of execution that is implemented (i.e., available out of the box from the command line) is deterministic execution, which directly follows the small-step semantics of the language. Not particularly useful from the analysis perspective, it is mostly intended for verification and testing of the Primus framework, as well as of BAP as a whole, treated as a white box. In particular, we are able to execute a non-trivial binary that uses dynamic memory allocation, string parsing, and IO (see our test suite), and get the same output as the real execution would provide.
The Primus linker can dynamically link code into our program abstraction. The linked code can be loaded from a real binary, stubbed with an arbitrary OCaml (and later C or Python) program that interacts with the Machine components directly, or, most commonly, implemented in Primus Lisp. Primus Lisp is a dialect of Common Lisp that can basically be seen as a concrete syntax for BIL. Thus Primus Lisp is a sort of assembler for the Primus Machine. The easiest way to demonstrate Primus Lisp is to show an example:
(defun strcpy (dst src)
  (declare (external "strcpy"))
  (let ((dst dst))
    (while (not (points-to-null src))
      (copy-byte-shift dst src))
    (memory-write dst 0:8))
  dst)
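For readers more at home in OCaml than in Lisp, the same loop can be sketched over a toy memory model. Everything here is an assumption made for illustration: memory is a `Bytes.t` buffer, pointers are integer offsets, and `points-to-null` and `copy-byte-shift` are read as "the byte at the pointer is zero" and "copy one byte and advance both pointers". None of this is the Primus Memory component.

```ocaml
(* Hypothetical model: memory is a mutable byte buffer and pointers
   are integer offsets into it. Not BAP's Memory component. *)
let points_to_null (mem : Bytes.t) (p : int) : bool =
  Bytes.get mem p = '\000'

(* Assumed reading of the Lisp: copy bytes until the source points at
   a zero byte, write the terminator, and return the original [dst]. *)
let strcpy (mem : Bytes.t) (dst : int) (src : int) : int =
  let d = ref dst and s = ref src in
  while not (points_to_null mem !s) do
    Bytes.set mem !d (Bytes.get mem !s);
    incr d;
    incr s
  done;
  Bytes.set mem !d '\000';
  dst

(* Place "abc\000" at offset 8, copy it to offset 0, and read back the
   four bytes at the returned destination. *)
let demo () : string =
  let mem = Bytes.make 16 'x' in
  Bytes.blit_string "abc\000" 0 mem 8 4;
  let dst = strcpy mem 0 8 in
  Bytes.sub_string mem dst 4
```

The local references `d` and `s` play the role of the inner `(let ((dst dst)) ...)` binding in the Lisp version: the loop advances a private copy of the pointer, so the function can still return the destination it was originally given.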
Since the abstract machine of the language matches the Primus interpreter, which is a CPU emulator, Primus Lisp doesn't provide any abstract datatypes; instead, its type system includes only a family of scalar types indexed by their bitwidth. Thanks to the macro system, as well as to various technologies borrowed from elisp, such as function advisors, the language is still quite powerful.
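A family of scalar types indexed by bitwidth can be pictured with a small OCaml sketch. The `word` record and its functions are hypothetical and only model the idea; they are not how Primus Lisp or BAP represent values. The `value:width` printing mirrors the `0:8` literal in the example above, read here as "zero, eight bits wide".

```ocaml
(* A scalar is a value together with its width in bits. This record is
   an illustrative assumption, not the Primus Lisp representation. *)
type word = { value : int; width : int }

(* Truncate to [width] bits; widths are assumed to be small enough
   (below 62) for the value to fit in a native OCaml int. *)
let word (value : int) (width : int) : word =
  { value = value land ((1 lsl width) - 1); width }

(* Addition wraps around modulo 2^width; operands must agree on width. *)
let add (a : word) (b : word) : word =
  if a.width <> b.width then invalid_arg "add: width mismatch";
  word (a.value + b.value) a.width

(* Print in value:width form. *)
let to_string (w : word) : string = Printf.sprintf "%d:%d" w.value w.width
```

For instance, `to_string (add (word 255 8) (word 1 8))` gives `"0:8"`, because the sum wraps around at eight bits.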
A new monad library is implemented as a part of the Primus effort. This library doesn't depend on BAP and provides a set of monad transformers for most (if not all) common monads. The library provides a rich interface in the Core style; for example, it provides List and Seq modules that contain the corresponding container interfaces lifted into the monad. The old monads and monad transformers are deprecated and will be removed in BAP 2.0.0. To use the new library, add
Ogre is an Open Generic REpresentation - a library and a specification for a NoSQL-style database. It was originally designed as a representation that can consume Elves and Dwarves, as well as the information provided by COFF and Mach-O. Eventually, it evolved into a NoSQL database engine, akin to JSON, except that it uses the SEXP data representation, disallows recursive datatypes, and follows the ideas of the Third Manifesto, trying to be as close to first-order predicate logic as possible. Unlike other contributions, this library is documented rather thoroughly, so an interested reader may proceed directly to the
The Bap_strings library
The Bap_strings library provides several algorithms on strings that are useful in binary analysis and reverse engineering:
The strings plugin implements the functionality of the conventional strings tool
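As a reminder of what the conventional tool does, here is a minimal OCaml sketch that extracts runs of printable characters from raw data. It is not the plugin's code: the function names, the default minimum length of 4, and the choice of printable characters (space through tilde, plus tab) are assumptions made for illustration.

```ocaml
(* Assumed notion of "printable": ASCII space through tilde, plus tab. *)
let is_printable (c : char) : bool = (c >= ' ' && c <= '~') || c = '\t'

(* Return every maximal run of printable characters that is at least
   [min_len] characters long, in order of appearance. *)
let strings ?(min_len = 4) (data : string) : string list =
  let n = String.length data in
  let rec scan start i acc =
    if i < n && is_printable data.[i] then scan start (i + 1) acc
    else
      (* The run [start, i) has ended, at a non-printable byte or at
         the end of the input. *)
      let acc =
        if i - start >= min_len then String.sub data start (i - start) :: acc
        else acc
      in
      if i >= n then List.rev acc else scan (i + 1) (i + 1) acc
  in
  scan 0 0 []
```

With the default threshold, `strings "\000\001hello\000ab\000world!\255"` returns `["hello"; "world!"]`: the run `ab` is dropped because it is shorter than four characters.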
Beagle was totally rewritten in this PR. It now uses the Primus framework, and relies on the