Decompilation
I'll try to show here some specifics of Haxe to Hashlink compilation (structures and such) and how we can reverse the process for decompilation. I'll proceed step by step, covering each language construct I was able to decompile correctly. I assume the reader knows general imperative programming terminology such as expressions and statements, as well as terms specific to Haxe and object-oriented programming languages such as classes, methods and ADT enums.
This part is about the compilation of OOP structures (classes, inheritance, methods)
...
The Haxe source code AST is linearized into Hashlink instructions.
A branch compiles to a conditional jump instruction that skips to the end of the statement if the condition is not verified. For else clauses, a `JAlways` is inserted before the else block so that the main clause can jump over it when it finishes.
Example:

Haxe source | HL bytecode | Decompiler output |
---|---|---|
`var a = 0;`<br>`if (a > 1) {`<br>`a = 1;`<br>`} else {`<br>`a = 2;`<br>`}`<br>`a = 3;` |  | `var a = 0;`<br>`if (a > 1) {`<br>`a = 1;`<br>`} else {`<br>`a = 2;`<br>`}`<br>`a = 3;` |
Notice how the `if` condition gets inverted (and flipped) in the bytecode, from `a > 1` to `1 >= a`.
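The lowering described above can be sketched in a few lines of Python. This is a minimal illustration, not the Haxe compiler's actual code: `JumpIf` is a hypothetical stand-in for the family of conditional jump opcodes, instructions are plain `(opcode, text, offset)` tuples, integer operands are assumed (so negating a comparison is exact), and jump offsets are assumed to be relative to the instruction that follows the jump.

```python
# Negation of each comparison: the jump is taken when the source condition is false.
NEGATE = {"<": ">=", ">=": "<", ">": "<=", "<=": ">", "==": "!=", "!=": "=="}
# Swapping the two operands requires mirroring the operator.
FLIP = {"<": ">", ">": "<", "<=": ">=", ">=": "<=", "==": "==", "!=": "!="}


def invert_and_flip(lhs, op, rhs):
    """Negate a comparison, then swap its operands: a > 1 becomes 1 >= a."""
    return rhs, FLIP[NEGATE[op]], lhs


def lower_if(cond, then_body, else_body=None):
    """Linearize an if/else into a flat instruction list with relative jumps."""
    lhs, op, rhs = invert_and_flip(*cond)
    # When there is an else clause, the conditional jump must also skip the
    # JAlways that closes the main clause, hence the extra 1.
    skip = len(then_body) + (1 if else_body else 0)
    code = [("JumpIf", f"{lhs} {op} {rhs}", skip)]
    code.extend(then_body)
    if else_body:
        # End of the main clause: jump over the else clause.
        code.append(("JAlways", None, len(else_body)))
        code.extend(else_body)
    return code


# if (a > 1) { a = 1; } else { a = 2; }
lowered = lower_if(
    ("a", ">", "1"),
    [("Int", "a = 1", None)],
    [("Int", "a = 2", None)],
)
for index, instr in enumerate(lowered):
    print(index, instr)
```

Since `invert_and_flip` is its own inverse, a decompiler can apply the same function to the jump condition `1 >= a` to get back `a > 1`.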
Loops are distinguished from simple branches by the presence of a `Label` instruction at the start and a jump with a negative offset (a backward jump) at the end.
Example:

Haxe source | HL bytecode | Decompiler output |
---|---|---|
`var b = 69;`<br>`while (b > 5) {`<br>`b -= 2;`<br>`}` |  | `var b = 69;`<br>`while (b > 5) {`<br>`b = b - 2;`<br>`}` |
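The loop criterion above (a `Label` at the start, a backward jump at the end) can be sketched as a simple scan over the instruction list. This is an illustrative sketch only: the listing below is hand-written rather than real compiler output, the opcode names other than `Label` and `JAlways` are placeholders, and offsets are assumed to be relative to the instruction that follows the jump.

```python
from typing import List, Optional, Tuple

# (opcode name, relative jump offset or None for non-jump instructions)
Instr = Tuple[str, Optional[int]]


def find_loops(code: List[Instr]) -> List[Tuple[int, int]]:
    """Return (label_index, back_jump_index) pairs, one per detected loop."""
    loops = []
    for index, (_, offset) in enumerate(code):
        if offset is None or offset >= 0:
            continue  # not a jump, or a forward jump (a plain branch)
        # Assumption: offsets count from the instruction after the jump.
        target = index + 1 + offset
        if 0 <= target < len(code) and code[target][0] == "Label":
            loops.append((target, index))
    return loops


# Hand-written listing for: var b = 69; while (b > 5) { b -= 2; }
while_loop: List[Instr] = [
    ("Int", None),     # 0: b = 69
    ("Label", None),   # 1: loop header
    ("Int", None),     # 2: tmp = 5
    ("JSGte", 3),      # 3: if tmp >= b, jump to 7 (loop exit)
    ("Int", None),     # 4: two = 2
    ("Sub", None),     # 5: b = b - two
    ("JAlways", -6),   # 6: jump back to the Label at 1
    ("Ret", None),     # 7
]

print(find_loops(while_loop))  # [(1, 6)]
```

The forward jump at index 3 is ignored by the scan, which is what separates the loop from an ordinary `if`.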
The decompiler has a special post-processing step that runs directly on the output AST and tries to add back some syntactic sugar.
Before | After |
---|---|
`static function main() {`<br>`var a = __add__("hello", "world");`<br>`var c = 3;`<br>`var b = __add__("number : ", __alloc__(itos(c, c), c));`<br>`}` | `static function main() {`<br>`var a = "hello" + "world";`<br>`var c = 3;`<br>`var b = "number : " + c;`<br>`}` |
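A post-processing pass of this kind can be sketched as a bottom-up rewrite over a small expression tree. The two rules below are inferred only from the Before/After example above, not from the decompiler's actual rule set, and the tuple-based AST, `resugar` and `render` are hypothetical names chosen for illustration. The sketch also assumes that the first argument of `itos` is the value being converted.

```python
def resugar(expr):
    """Rewrite runtime helper calls back into Haxe-level syntax, bottom-up."""
    if isinstance(expr, str):
        return expr  # leaf: variable name or literal, kept as source text
    if expr[0] == "binop":
        _, op, left, right = expr
        return ("binop", op, resugar(left), resugar(right))
    _, name, args = expr  # ("call", name, [args])
    args = [resugar(arg) for arg in args]
    # Rule 1: __add__(x, y)  ->  x + y
    if name == "__add__" and len(args) == 2:
        return ("binop", "+", args[0], args[1])
    # Rule 2: __alloc__(itos(x, ...), ...)  ->  x  (implicit int-to-string)
    if (name == "__alloc__" and args
            and isinstance(args[0], tuple)
            and args[0][:2] == ("call", "itos")):
        return args[0][2][0]
    return ("call", name, args)


def render(expr):
    """Print an expression tree as source text (no precedence handling)."""
    if isinstance(expr, str):
        return expr
    if expr[0] == "binop":
        return f"{render(expr[2])} {expr[1]} {render(expr[3])}"
    _, name, args = expr
    return f"{name}({', '.join(render(arg) for arg in args)})"


before = ("call", "__add__", [
    '"number : "',
    ("call", "__alloc__", [("call", "itos", ["c", "c"]), "c"]),
])
print(render(before))           # __add__("number : ", __alloc__(itos(c, c), c))
print(render(resugar(before)))  # "number : " + c
```

Rewriting children before their parent lets the `__alloc__(itos(...))` pattern collapse first, so the outer `__add__` sees a plain `c` as its second operand.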