A fast disassembler that is accurate enough for the resulting assembly code to be reassembled. The disassembler is implemented using the Datalog (Souffle) declarative logic programming language to compile disassembly rules and heuristics. The disassembler first parses ELF file information and decodes a superset of possible instructions to create an initial set of Datalog facts. These facts are analyzed to identify code locations, symbolization, and function boundaries. The results of this analysis, a refined set of Datalog facts, are then translated to the GTIRB intermediate representation for binary analysis and reverse engineering. The GTIRB pretty printer may then be used to print the GTIRB as reassemblable assembly code.
The analysis contains two parts:
The C++ files take care of reading an ELF file and generating facts that represent all the information contained in the binary.
src/datalog/*.dl contains the specification of the analyses in Datalog. It takes the basic facts and computes likely EAs, chunks of code, etc. The results are represented in GTIRB or can be printed to assembler code using the gtirb-pprinter.
The analysis depends on souffle being installed. Configure souffle with
For printing assembler code, the Datalog disassembler requires the gtirb-pprinter.
A C++17 compiler such as gcc 7 or clang 6 is required.
Boost (1.67 or later) and GTIRB are required.
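Before configuring the build, it can help to confirm that the command-line tools named above are on the PATH. The following is a minimal sketch, not part of ddisasm: `missing_tools` is a hypothetical helper, and the tool names in the usage comment are assumptions based on the dependency list. Boost and GTIRB are libraries, so this check does not cover them.

```shell
#!/bin/sh
# missing_tools TOOL...
# Hypothetical helper: prints, one per line, each named tool that is
# not found on PATH. Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage (tool names assumed from the dependency list above):
#   missing=$(missing_tools souffle gtirb-pprinter cmake make)
#   [ -z "$missing" ] || echo "Missing tools:" $missing
```

CMake still performs its own checks for the libraries and the compiler, so this only catches the most obvious omissions early.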
Use the following options to configure cmake:
You can tell CMake which compiler to use with
Normally CMake will find GTIRB automatically, but if it does not you can pass
Once the dependencies are installed, you can configure and build as follows:
    $ cmake ./ -Bbuild
    $ cd build
    $ make
Running the analysis
Once ddisasm is built, we can run the complete analysis on a file by
calling build/bin/ddisasm. For example, we can run the analysis on one
of the examples as follows:
    cd build/bin
    ./ddisasm ../../examples/ex1/ex --asm ex.s
Ddisasm accepts the following parameters:
: produce help message
--sect arg (=.plt.got,.fini,.init,.plt,.text,)
: code sections to decode
--data_sect arg (=.data,.rodata,.fini_array,.init_array,.data.rel.ro,.got.plt,.got,)
: data sections to consider
: GTIRB output file
: GTIRB json output file
--asm arg : ASM output file
: if the assembly code is printed, it is printed with debugging information
: location to write CSV files for debugging
-K [ --keep-functions ] arg : Print the given functions even if they are skipped by default (e.g. _start)
: This option is useful for debugging. It uses relocation information to emit a self-diagnosis
of the symbolization process. This option only works if the target
binary contains complete relocation information. You can enable
that in ld using the option
Rewriting a project
The directory tests/ contains a script to
rewrite and test a complete project.
The script builds a project using the compiler and compiler flags specified in the
environment variables CC and CFLAGS (
make -e), rewrites the binary,
and runs the project tests on the new binary.
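The steps described above can be sketched as a small shell function. This is an illustration under stated assumptions, not the script shipped in tests/: the function name `rewrite_project`, the `MAKE_CMD` and `DDISASM_CMD` override variables, and the `check` make target are all hypothetical. The only ddisasm option used is `--asm`. With `make -e`, variables from the environment (here CC and CFLAGS) take precedence over assignments in the makefile.

```shell
#!/bin/sh
# rewrite_project BINARY
# Hypothetical sketch: build the project, disassemble BINARY, reassemble
# it in place, then run the project's tests against the new binary.
rewrite_project() {
    binary=$1
    make_cmd=${MAKE_CMD:-make}          # assumed override, for testing
    ddisasm_cmd=${DDISASM_CMD:-ddisasm} # assumed override, for testing
    cc=${CC:-gcc}

    # Each step runs only if the previous one succeeded.
    # ${CFLAGS:-} is left unquoted on purpose so multiple flags split.
    $make_cmd -e &&
    $ddisasm_cmd "$binary" --asm "$binary.s" &&
    $cc ${CFLAGS:-} "$binary.s" -o "$binary" &&
    $make_cmd check                     # "check" is an assumed target name
}

# Usage, from the project directory:
#   CC=gcc CFLAGS=-O2 rewrite_project ex
```

Overwriting the binary in place keeps the project's test target unchanged, at the cost of losing the original build output; writing to a separate file is the safer choice when the two need to be compared.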
We can rewrite ex1 as follows:
    cd examples/ex1
    make
    ddisasm ex --asm ex.s
    gcc ex.s -o ex_rewritten
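One quick sanity check after rewriting is to run the original and the rewritten binary with the same arguments and compare what they print and how they exit. The sketch below is a hypothetical helper, not part of ddisasm. It compares only standard output and exit status, so it cannot show that two binaries are equivalent in general.

```shell
#!/bin/sh
# same_behavior ORIGINAL REWRITTEN [ARGS...]
# Hypothetical helper: runs both programs with the same arguments and
# succeeds only if their standard output and exit status are identical.
same_behavior() {
    original=$1
    rewritten=$2
    shift 2

    status_a=0
    out_a=$("$original" "$@" 2>/dev/null) || status_a=$?
    status_b=0
    out_b=$("$rewritten" "$@" 2>/dev/null) || status_b=$?

    [ "$status_a" -eq "$status_b" ] && [ "$out_a" = "$out_b" ]
}

# Usage, assuming ex and ex_rewritten exist in the current directory:
#   same_behavior ./ex ./ex_rewritten && echo "outputs match"
```

Command substitution strips trailing newlines, so differences that consist only of trailing newlines go unnoticed by this check.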
tests/ also contains a script for
rewriting the examples in
/examples with different compilers and
Please read the DDisasm Code of Conduct.
Please follow the Code Requirements in gtirb/CONTRIBUTING.