A fast and accurate disassembler

Datalog Disassembly

A fast disassembler that is accurate enough for the resulting assembly code to be reassembled. The disassembler is implemented using the datalog (souffle) declarative logic programming language to compile disassembly rules and heuristics. The disassembler first parses ELF file information and decodes a superset of possible instructions to create an initial set of datalog facts. These facts are analyzed to identify code locations, perform symbolization, and find function boundaries. The results of this analysis, a refined set of datalog facts, are then translated to the GTIRB intermediate representation for binary analysis and reverse engineering. The GTIRB pretty printer may then be used to print the GTIRB as reassemblable assembly code.

Introduction

The analysis contains two parts:

  • The C++ files take care of reading an ELF file and generating facts that represent all the information contained in the binary.

  • src/datalog/*.dl contains the specification of the analyses in datalog. It takes the basic facts and computes likely EAs, chunks of code, etc. The results are represented in GTIRB or can be printed to assembler code using the gtirb-pprinter.

Dependencies

  • GTIRB

  • The analysis depends on souffle being installed. Configure souffle with --enable-64bit-domain --disable-provenance.

  • For printing assembler code, the datalog disassembler requires the gtirb-pprinter.
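As a rough sketch of the souffle step, assuming a souffle source checkout that already contains a generated ./configure script (a fresh checkout may need an extra bootstrap step first). The helper name build_souffle and the checkout path are hypothetical; only the two configure flags come from the list above.

```shell
# Hypothetical helper: configure and build souffle with the flags ddisasm needs.
# Assumes an autotools-style source tree where ./configure already exists.
build_souffle() {
    souffle_src="$1"   # path to the souffle source checkout (placeholder)
    (
        cd "${souffle_src}" &&
        ./configure --enable-64bit-domain --disable-provenance &&
        make
    )
}
```

It could be called as `build_souffle "$HOME/souffle"`, where the path is only an example.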

Building ddisasm

A C++17 compiler such as gcc 7 or clang 6 is required.

Boost (1.67 or later) and GTIRB are required.

Use the following options to configure cmake:

  • You can tell CMake which compiler to use with -DCMAKE_CXX_COMPILER=<compiler>.

  • Normally CMake will find GTIRB automatically, but if it does not you can pass -Dgtirb_DIR=<path-to-gtirb-build>.

Once the dependencies are installed, you can configure and build as follows:

$ cmake ./ -Bbuild
$ cd build
$ make
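When CMake needs the options listed above, the configure step might look like the following sketch. The function name build_ddisasm and both argument values are placeholders, not part of the project; the two -D options are the ones described in this section.

```shell
# Hypothetical helper: configure and build ddisasm with an explicit compiler
# and an explicit GTIRB build directory. Run it from the repository root.
build_ddisasm() {
    cxx="$1"         # C++17 compiler, e.g. g++ or clang++
    gtirb_dir="$2"   # path to the GTIRB build tree (placeholder)
    cmake ./ -Bbuild \
        "-DCMAKE_CXX_COMPILER=${cxx}" \
        "-Dgtirb_DIR=${gtirb_dir}" &&
    make -C build    # same effect as "cd build" followed by "make"
}
```

For example, `build_ddisasm clang++ "$HOME/gtirb/build"`, where the path is only an illustration.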

Running the analysis

Once ddisasm is built, we can run a complete analysis on a file by calling bin/ddisasm from the build directory. For example, we can run the analysis on one of the examples as follows:

cd build/bin
./ddisasm ../../examples/ex1/ex --asm ex.s

ddisasm accepts the following parameters:

--help : produce help message

--sect arg (=.plt.got,.fini,.init,.plt,.text,) : code sections to decode

--data_sect arg (=.data,.rodata,.fini_array,.init_array,.data.rel.ro,.got.plt,.got,) : data sections to consider

--ir arg : GTIRB output file

--asm arg : ASM output file

--debug : if the assembly code is printed, it is printed with debugging information

--debug-dir arg : location to write CSV files for debugging
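The output options can be sketched together in one invocation. This assumes that --ir, --asm, --debug and --debug-dir may be combined in a single run and that ddisasm is on the PATH; the helper name disassemble_debug and the output file names are hypothetical.

```shell
# Hypothetical helper: disassemble one binary and keep the GTIRB file, the
# debug-annotated assembly, and the CSV debugging files in one directory.
disassemble_debug() {
    binary="$1"   # ELF file to analyze
    outdir="$2"   # directory that receives all outputs
    # Creating the directory up front is a precaution, not a documented requirement.
    mkdir -p "${outdir}/facts" &&
    ddisasm "${binary}" \
        --ir "${outdir}/out.gtirb" \
        --asm "${outdir}/out.s" \
        --debug \
        --debug-dir "${outdir}/facts"
}
```

A call such as `disassemble_debug examples/ex1/ex /tmp/ex1-out` would then leave everything under one directory for inspection.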

Rewriting a project

The directory tests/ contains the script reassemble_and_test.sh to rewrite and test a complete project. reassemble_and_test.sh rebuilds a project using the compiler and compiler flags specified in the environment variables CC and CFLAGS (make -e), rewrites the binary, and runs the project tests on the new binary.
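The rebuild step alone can be approximated by hand. The sketch below is not the script itself: the function name rebuild_with is hypothetical, and it assumes the project's Makefile reads CC and CFLAGS. With make -e, values from the environment take precedence over assignments inside the Makefile.

```shell
# Hypothetical helper: rebuild a project with a chosen compiler and flags.
rebuild_with() {
    project="$1"   # directory containing the project's Makefile
    cc="$2"        # compiler to use, e.g. gcc or clang
    cflags="$3"    # compiler flags, e.g. -O2
    # -e lets the environment values of CC and CFLAGS override the Makefile's.
    CC="${cc}" CFLAGS="${cflags}" make -e -C "${project}"
}
```

For instance, `rebuild_with examples/ex1 clang -O2` would rebuild ex1 with clang at -O2 under those assumptions.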

We can rewrite ex1 as follows:

cd examples/ex1
make
ddisasm ex --asm ex.s
gcc ex.s -o ex_rewritten

Testing

The directory tests/ also contains a script test_small.sh for rewriting the examples in examples/ with different compilers and optimization flags.

Contributing

Please read the DDisasm Code of Conduct.

Please follow the Code Requirements in gtirb/CONTRIBUTING.

Some References

  1. Souffle

  2. Capstone disassembler

  3. Control Flow Integrity for COTS Binaries

  4. Alias analysis for Assembly

  5. Reassembleable Disassembling

  6. Ramblr: Making disassembly great again

  7. An In-Depth Analysis of Disassembly on Full-Scale x86/x64 Binaries

  8. Binary Code is Not Easy
