Decompilation

Guillaume Anthouard edited this page Sep 21, 2022 · 3 revisions

Haxe to Hashlink compilation and decompilation

I'll try to show here some specifics of Haxe to Hashlink compilation (structures and such) and how we can reverse the process for decompilation. I'll proceed step by step through each language construct I was able to decompile correctly. I am assuming the reader knows general imperative programming terminology, such as expressions and statements, as well as terms specific to Haxe and object-oriented programming languages, such as classes, methods and ADT enums.


OOP

This part is about the compilation of OOP structures (classes, inheritance, methods).

Classes

...

Code

The Haxe source code AST is linearized into a flat sequence of Hashlink instructions.

Branches

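An if statement compiles to a conditional jump to the end of the statement, taken when the condition does not hold. When there is an else clause, a JAlways is inserted just before it, at the end of the main clause, so that execution jumps over the else clause once the main clause has run.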

Example :

Haxe source:

    var a = 0;
    if (a > 1) {
        a = 1;
    } else {
        a = 2;
    }
    a = 3;

HL bytecode:

    0: Int             reg0 = 0
    1: Int             reg3 = 1
    2: JSGte           if reg3 >= reg0 jump to 5
    3: Int             reg2 = 1
    4: JAlways         jump to 7
    5: Int             reg2 = 2
    7: Int             reg2 = 3
    8: Ret             reg1

Decompiler output:

    var a = 0;
    if (a > 1) {
      a = 1;
    } else {
      a = 2;
    }
    a = 3;

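Notice how the if condition gets inverted (negated, with its operands swapped) in the bytecode: a > 1 becomes 1 >= a, and the jump to the else clause is taken when that inverted condition holds.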
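To make the reverse direction concrete, here is a minimal Haxe sketch of how an if / else could be rebuilt from such a listing. Everything in it is an assumption made for illustration: the Instr and Stmt enums, the IfRecovery class and the instruction listing are hypothetical and much simpler than what a real decompiler works with, and only JSGte is handled. The idea is to treat the conditional jump target as the end of the main clause and to check whether the instruction right before that target is a forward JAlways, which would indicate an else clause.

```haxe
// Simplified, hypothetical instruction model (not the decompiler's real data structures).
enum Instr {
    Other(text:String);               // any non-jump instruction, kept as text
    JSGte(a:Int, b:Int, target:Int);  // jump to `target` if reg a >= reg b
    JAlways(target:Int);              // unconditional jump to `target`
}

enum Stmt {
    Raw(text:String);
    If(cond:String, thenBody:Array<Stmt>, elseBody:Array<Stmt>);
}

class IfRecovery {
    // Rebuild statements from the instructions in [start, end).
    public static function decompile(code:Array<Instr>, start:Int, end:Int):Array<Stmt> {
        var out:Array<Stmt> = [];
        var i = start;
        while (i < end) {
            switch (code[i]) {
                case JSGte(a, b, target):
                    // The jump skips the main clause, so the source condition
                    // is its negation: !(a >= b), written here as b > a.
                    var cond = 'reg$b > reg$a';
                    var thenEnd = target;
                    var elseEnd = target;
                    // A forward JAlways right before the target closes the
                    // main clause and jumps over an else clause.
                    switch (code[target - 1]) {
                        case JAlways(after) if (after >= target):
                            thenEnd = target - 1;
                            elseEnd = after;
                        default:
                    }
                    out.push(If(cond, decompile(code, i + 1, thenEnd), decompile(code, target, elseEnd)));
                    i = elseEnd;
                case Other(text):
                    out.push(Raw(text));
                    i++;
                case JAlways(_):
                    i++; // stray jumps are ignored in this sketch
            }
        }
        return out;
    }

    // Pretty-print the recovered statements.
    public static function render(stmts:Array<Stmt>, indent:String):String {
        var buf = new StringBuf();
        for (s in stmts) {
            switch (s) {
                case Raw(text):
                    buf.add(indent + text + ";\n");
                case If(cond, t, e):
                    buf.add(indent + "if (" + cond + ") {\n" + render(t, indent + "  ") + indent + "}");
                    if (e.length > 0)
                        buf.add(" else {\n" + render(e, indent + "  ") + indent + "}");
                    buf.add("\n");
            }
        }
        return buf.toString();
    }

    static function main() {
        // Hypothetical listing with the same shape as the branch example.
        var code = [
            Other("reg0 = 0"), // 0
            Other("reg1 = 1"), // 1
            JSGte(1, 0, 5),    // 2: if reg1 >= reg0 jump to 5
            Other("reg0 = 1"), // 3
            JAlways(6),        // 4: jump over the else clause
            Other("reg0 = 2"), // 5
            Other("reg0 = 3")  // 6
        ];
        trace(render(decompile(code, 0, code.length), ""));
    }
}
```

The sketch works on register names only. Turning reg0 back into a source variable such as a, and folding constants like reg1 into the condition, would be separate steps.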

Loops

Loops are distinguished from simple branches by the presence of a Label instruction at the start and a backward jump (a jump with a negative offset) to that Label at the end.

Example :

Haxe source:

    var b = 69;
    while (b > 5) {
        b -= 2;
    }

HL bytecode:

    0: Int             reg0 = 69
    1: Label
    2: Int             reg3 = 5
    3: JSGte           if reg3 >= reg0 jump to 8
    4: Int             reg3 = 2
    5: Sub             reg2 = reg0 - reg3
    6: Mov             reg0 = reg2
    7: JAlways         jump to 1
    8: Ret             reg1

Decompiler output:

    var b = 69;
    while (b > 5) {
      b = b - 2;
    }
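
The loop criterion described above (a Label at the start, a backward jump at the end) can be sketched as a simple scan over the instruction list. This is only an illustration under stated assumptions: the Instr enum and the LoopFinder class are hypothetical, jump targets are stored as absolute indices like in the listing, and every instruction other than Label and JAlways is kept as plain text.

```haxe
// Simplified, hypothetical instruction model (not the decompiler's real data structures).
enum Instr {
    Label;               // marks a possible loop start
    JAlways(target:Int); // unconditional jump to an absolute index
    Other(text:String);  // everything else, kept as text
}

class LoopFinder {
    static function isLabel(instr:Instr):Bool {
        return switch (instr) {
            case Label: true;
            default: false;
        };
    }

    // Returns one {start, end} pair per detected loop:
    // `start` is the index of the Label, `end` the index of the backward jump.
    public static function findLoops(code:Array<Instr>):Array<{start:Int, end:Int}> {
        var loops:Array<{start:Int, end:Int}> = [];
        for (i in 0...code.length) {
            switch (code[i]) {
                // A jump is a loop back-edge if it goes backward and lands on a Label.
                case JAlways(target) if (target < i && isLabel(code[target])):
                    loops.push({start: target, end: i});
                default:
            }
        }
        return loops;
    }

    static function main() {
        // The while-loop listing from the example, with non-loop instructions as text.
        var code = [
            Other("reg0 = 69"),                      // 0
            Label,                                   // 1
            Other("reg3 = 5"),                       // 2
            Other("if reg3 >= reg0 jump to 8"),      // 3
            Other("reg3 = 2"),                       // 4
            Other("reg2 = reg0 - reg3"),             // 5
            Other("reg0 = reg2"),                    // 6
            JAlways(1),                              // 7
            Other("ret reg1")                        // 8
        ];
        for (l in findLoops(code))
            trace('loop from ${l.start} to ${l.end}');
    }
}
```

On this listing the scan reports a single loop spanning indices 1 to 7. The conditional jump at index 3, which leaves the loop, would then supply the while condition in the same inverted form as for branches.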

AST post processing

The decompiler has a special post-processing step done directly on the output AST where it tries to add back some syntactic sugar.

Before:

    static function main() {
      var a = __add__("hello", "world");
      var c = 3;
      var b = __add__("number : ", __alloc__(itos(c, c), c));
    }

After:

    static function main() {
      var a = "hello" + "world";
      var c = 3;
      var b = "number : " + c;
    }
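
As a rough illustration of such a post-processing pass, the sketch below rewrites the two patterns visible in the example using Haxe pattern matching on a tiny expression tree. The Expr enum and the Sugar class are hypothetical, and the two rewrite rules are inferred only from the before / after listing; the actual pass may recognise more patterns or check more conditions.

```haxe
// Tiny, hypothetical expression tree (not the decompiler's real AST).
enum Expr {
    Var(name:String);
    Str(value:String);
    Call(name:String, args:Array<Expr>);
    Add(left:Expr, right:Expr);
}

class Sugar {
    // Bottom-up rewrite that restores some syntactic sugar.
    public static function simplify(e:Expr):Expr {
        return switch (e) {
            // __add__(l, r)  ->  l + r
            case Call("__add__", [l, r]):
                Add(simplify(l), simplify(r));
            // __alloc__(itos(c, _), c)  ->  c  (implicit Int to String conversion)
            case Call("__alloc__", [Call("itos", [Var(a), _]), Var(b)]) if (a == b):
                Var(a);
            // Any other call: only simplify its arguments.
            case Call(name, args):
                Call(name, args.map(simplify));
            case Add(l, r):
                Add(simplify(l), simplify(r));
            case Var(_) | Str(_):
                e;
        };
    }

    // Render an expression back to source-like text.
    public static function show(e:Expr):String {
        return switch (e) {
            case Var(n): n;
            case Str(v): '"' + v + '"';
            case Call(n, args): n + "(" + args.map(show).join(", ") + ")";
            case Add(l, r): show(l) + " + " + show(r);
        };
    }

    static function main() {
        var before = Call("__add__", [
            Str("number : "),
            Call("__alloc__", [Call("itos", [Var("c"), Var("c")]), Var("c")])
        ]);
        trace(show(before));
        trace(show(simplify(before)));
    }
}
```

Working on the output AST rather than on the bytecode keeps each rule local: a rule only has to match one small subtree and does not need to know how the surrounding statements were recovered.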