Welcome to the libbeauty wiki!
libbeauty is an advanced open source tool with the following initial aim:
"Given an input .o file, it can create a .c file that compiles and has the same function as the original .o file. This will later expand to handling binary files."
When reverse engineering binary code by hand, the reverse engineer repeatedly carries out certain activities and asks certain questions.
The aim of this tool is to automate some of those tasks.
For each task, we have thought about how it is carried out logically, and how to implement it in software.
Currently, libbeauty can extract the entire instruction call flow and generate .c code from it.
1) Encourage more people to join the development.
2) Identify data types and generate .c code for them.
3) Create LLVM IR.
Some data types are already identified, but this is improving all the time as more complex ones are added.
1) Understand the entire instruction call flow.
2) Identify loops and if ... then ... else program control flows.
During analysis of data types, we ran into a problem: better data type analysis requires proper detection of loops and conditionals, e.g. for() and if ... then ... else. So, the task list above has been modified.
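As an illustration of what loop detection on a control flow graph can look like, here is a minimal Python sketch. It is not libbeauty's own algorithm: it uses the textbook approach of finding back edges with a depth-first search, and the graph representation (a dict mapping each block name to its successors) and all names in it are assumptions made for this example.

```python
def find_back_edges(cfg, entry):
    """Return edges (src, dst) whose target is still on the DFS stack.

    Each such back edge closes a cycle, i.e. a candidate loop whose
    header is dst. A plain if/else diamond produces no back edges.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    colour = {node: WHITE for node in cfg}
    back_edges = []

    def visit(node):
        colour[node] = GREY
        for succ in cfg[node]:
            if colour[succ] == GREY:
                # succ is an ancestor of node in the DFS tree: a loop.
                back_edges.append((node, succ))
            elif colour[succ] == WHITE:
                visit(succ)
        colour[node] = BLACK

    visit(entry)
    return back_edges


# Hypothetical CFG for: while (cond) { body }
loop_cfg = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}

# Hypothetical CFG for: if (cond) { then } else { else }
branch_cfg = {
    "entry": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": [],
}

print(find_back_edges(loop_cfg, "entry"))    # [('body', 'header')]
print(find_back_edges(branch_cfg, "entry"))  # []
```

The recursive search is only suitable for small graphs; a real implementation would likely use an explicit stack and dominator information to recover the full loop body.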
Another problem is identifying structure.
Not all CFGs (Control Flow Graphs) can be represented as an AST (Abstract Syntax Tree).
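One well-known class of such graphs is the irreducible CFG, for example a loop that can be entered at two different points. These cannot be expressed with nested loops and if ... then ... else alone unless gotos are used or code is duplicated. The sketch below is a hedged illustration, not code from libbeauty: it checks reducibility with the classic T1/T2 transformations, and assumes that every node is reachable from the entry and that the CFG is given as a dict of successor lists (a representation chosen only for this example).

```python
def is_reducible(cfg, entry):
    """Check reducibility by repeatedly applying T1 and T2.

    T1: remove a self loop (an edge n -> n).
    T2: merge a node that has exactly one predecessor into it.
    The CFG is reducible if it collapses to the single entry node.
    Assumes every node in cfg is reachable from entry.
    """
    succs = {node: set(targets) for node, targets in cfg.items()}
    changed = True
    while changed:
        changed = False
        # T1: drop self loops.
        for node in succs:
            if node in succs[node]:
                succs[node].discard(node)
                changed = True
        # T2: merge one single-predecessor node, then start over.
        for node in list(succs):
            if node == entry:
                continue
            preds = [p for p in succs if node in succs[p]]
            if len(preds) == 1:
                pred = preds[0]
                succs[pred].discard(node)
                succs[pred] |= succs.pop(node)
                changed = True
                break
    return len(succs) == 1


# Hypothetical CFG for a normal while loop: one entry point into the loop.
while_cfg = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}

# Hypothetical CFG where the cycle a <-> b can be entered at a or at b.
two_entry_cfg = {
    "entry": ["a", "b"],
    "a": ["b"],
    "b": ["a"],
}

print(is_reducible(while_cfg, "entry"))      # True
print(is_reducible(two_entry_cfg, "entry"))  # False
```

A decompiler that meets an irreducible region has to fall back on goto statements or duplicate some blocks to make the region structured.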
Binary -> RTL -> Execute in VM -> Instruction Log -> CFG -> AST -> .C code with loops and if...then...else structure.
It therefore makes sense to settle on a good CFG format to use and test with before going further and creating the AST. We are using LLVM IR as this CFG format. The creation of LLVM IR is almost complete. Once it is done, we can continue working on generating a good AST and discovering structure.
A Clang/LLVM compiler goes from .C -> AST -> CFG (LLVM IR) -> Binary, so it seemed sensible for a decompiler to go the other way: Binary -> CFG -> AST -> .C
2) [Control Flow Graphs](https://github.com/jcdutton/libbeauty/wiki/Control-Flow-Graphs)