ImanHosseini/AtX


Translating ARM to X86

I am trying to answer how hard it would be to translate an ARM32 ELF binary into an X86_64 ELF binary. This is a course project for The Art of Embedded Exploitation [Spring 2020], taught by Stephen A. Ridley at NYU. The idea was to be able to do this for simple ARM32 binaries: I started by porting one of the simple stack-smashing binaries by hand, and then tried to automate the process. I had previous experience with the ARM architecture from my undergrad thesis, JAA, which was something similar but translated JVM bytecode to ARM. The tools I picked are Capstone Engine for disassembly, LIEF for fiddling with ELF headers, and NASM (Netwide Assembler) for output: my translator generates assembly, which NASM then assembles into an X86_64 binary.

Some thoughts

Based on the function calling conventions for the ABIs, I decided on a mapping of registers:

ARM32   X86_64
R0      EDI
R1      ESI
R2      EDX
R3      ECX
R4      EAX
R5      EBX
R6      R8d
R7      R9d
R8      R10d
R9      R11d
R10     R12d
R11     R13d
R12     R14d
LR      R15
SP      RSP
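
As a minimal sketch, the table above can be written down as a lookup that a translator consults for every operand. The names `ARM_TO_X86` and `map_reg` are hypothetical and are not taken from the project's code; PC is deliberately left unmapped, since the table has no entry for it.

```python
# Register mapping copied from the table above: ARM32 name -> X86_64 name.
# ARM_TO_X86 and map_reg are illustrative names, not the project's own.
ARM_TO_X86 = {
    "r0": "edi", "r1": "esi", "r2": "edx", "r3": "ecx",
    "r4": "eax", "r5": "ebx", "r6": "r8d", "r7": "r9d",
    "r8": "r10d", "r9": "r11d", "r10": "r12d", "r11": "r13d",
    "r12": "r14d", "lr": "r15", "sp": "rsp",
}


def map_reg(arm_reg: str) -> str:
    """Return the X86_64 register for an ARM32 register name (case-insensitive)."""
    key = arm_reg.strip().lower()
    if key not in ARM_TO_X86:
        # PC and any register alias not in the table end up here.
        raise KeyError(f"no X86_64 mapping for ARM register {arm_reg!r}")
    return ARM_TO_X86[key]
```

For example, `map_reg("R0")` gives `"edi"`, matching the first row of the table.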

This way, function calls become seamless: the first argument, passed in R0 on ARM32, lands in EDI on X86_64. For the stack, let's just mimic the ARM stack: anything pushed in ARM gets pushed in x86. Issues arise in handling the data section and PC-relative addressing. In ARM, the return mechanism goes through the link register, and you can just pop values from the stack into LR; in x86 this is not the case with RIP, and CALL and RET are used instead (rather than BL and MOV PC, LR).
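
The stack mirroring and the BL/RET substitution can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: `translate_stack_or_branch` and `ARM_TO_X86_64BIT` are hypothetical names, branch targets are assumed to be labels already, and the 64-bit register names are used because PUSH and POP in 64-bit mode take 64-bit operands. Register lists containing LR or PC are rejected on purpose, because that is exactly the part the paragraph above flags as problematic.

```python
from typing import List

# Assumed 64-bit counterparts of the 32-bit registers in the mapping table.
ARM_TO_X86_64BIT = {
    "r0": "rdi", "r1": "rsi", "r2": "rdx", "r3": "rcx",
    "r4": "rax", "r5": "rbx", "r6": "r8", "r7": "r9",
    "r8": "r10", "r9": "r11", "r10": "r12", "r11": "r13",
    "r12": "r14",
}
ARM_ORDER = [f"r{i}" for i in range(13)]


def parse_reglist(op_str: str) -> List[str]:
    """Parse '{r4, r5}' into ['r4', 'r5'], sorted by register number."""
    names = [r.strip().lower() for r in op_str.strip().strip("{}").split(",")]
    for name in names:
        if name not in ARM_TO_X86_64BIT:
            raise NotImplementedError(f"register {name!r} in a list is not handled")
    return sorted(names, key=ARM_ORDER.index)


def translate_stack_or_branch(mnemonic: str, op_str: str) -> List[str]:
    """Translate a few ARM stack and call/return forms into NASM lines."""
    m = mnemonic.lower()
    ops = op_str.strip().lower()
    if m == "push":
        # ARM stores the lowest-numbered register at the lowest address,
        # so the x86 pushes are emitted highest register first.
        return [f"push {ARM_TO_X86_64BIT[r]}" for r in reversed(parse_reglist(ops))]
    if m == "pop":
        return [f"pop {ARM_TO_X86_64BIT[r]}" for r in parse_reglist(ops)]
    if m == "bl":
        return [f"call {op_str.strip()}"]
    if (m == "bx" and ops == "lr") or (m == "mov" and ops.replace(" ", "") == "pc,lr"):
        return ["ret"]
    raise NotImplementedError(f"{mnemonic} {op_str} is not handled in this sketch")
```

Note one caveat of this approach: each x86 push occupies 8 bytes where ARM uses 4, so SP-relative offsets in the ARM code would not carry over unchanged.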

There are small differences that are rather easy to handle. ARM arithmetic instructions have 3 operands, unlike x86, which has 2, with the first also acting as the destination: for "ADD Rd, Rx, Ry" generate "MOV Rd, Rx" followed by "ADD Rd, Ry". Also, in ARM almost any instruction can be executed conditionally: alongside ADD there is ADDNE (NE for Not Equal), which performs the ADD only if NE holds, and NE can be replaced by any condition code. Other issues are harder to handle, like PC-relative addressing.
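
Here is a small sketch of the ADD rewrite, extended to the conditional form by jumping over the translated body when the condition does not hold. The function name `translate_add`, the table `SKIP_JUMP`, and the label name are hypothetical. Only the conditions that depend on the Z, N, and V flags are listed, on the assumption that the flags come from a preceding translated CMP; carry-based conditions need more care. Also note that x86 ADD always updates the flags, while ARM ADD without the S suffix does not, which this sketch ignores.

```python
from typing import List, Optional

# Subset of the register mapping (general-purpose registers only).
ARM_TO_X86 = {
    "r0": "edi", "r1": "esi", "r2": "edx", "r3": "ecx",
    "r4": "eax", "r5": "ebx", "r6": "r8d", "r7": "r9d",
    "r8": "r10d", "r9": "r11d", "r10": "r12d", "r11": "r13d",
    "r12": "r14d",
}

# ARM condition -> x86 jump taken when the condition is FALSE (skip the body).
SKIP_JUMP = {
    "eq": "jne", "ne": "je",
    "lt": "jge", "ge": "jl",
    "gt": "jle", "le": "jg",
}


def translate_add(rd: str, rn: str, rm: str,
                  cond: Optional[str] = None,
                  skip_label: str = ".skip") -> List[str]:
    """Translate ARM 'ADD{cond} rd, rn, rm' into a list of NASM lines."""
    d, n, m = (ARM_TO_X86[r.lower()] for r in (rd, rn, rm))
    if d == n:
        body = [f"add {d}, {m}"]           # destination already holds rn
    elif d == m:
        body = [f"add {d}, {n}"]           # ADD commutes; avoid clobbering rm
    else:
        body = [f"mov {d}, {n}", f"add {d}, {m}"]
    if cond is None:
        return body
    # Conditional form: jump past the body when the condition is false.
    return [f"{SKIP_JUMP[cond.lower()]} {skip_label}"] + body + [f"{skip_label}:"]
```

A caller would need to pass a unique `skip_label` for each conditional instruction so the generated labels do not collide.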

I am not picky, though: this is a fun, limited project, and I am not thinking through everything, such as how to make the translation provably correct. There are many edge cases everywhere; even simple arithmetic is not quite the same, because of different immediate widths and the like. My simple model is that I am essentially setting up an equivalence between the states of the two architectures: call it f(S), which maps a state S of the ARM machine to a state of the x86 machine. I assume that for each ARM instruction taking the ARM system from state S to S', there exists a sequence of x86 instructions that takes f(S) to f(S').
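
To make the f(S) idea concrete, here is a toy check for a single instruction. It is a sketch under heavy assumptions: machine state is reduced to three 32-bit registers (no flags, no memory), and all function names are hypothetical. It verifies that running "ADD R0, R1, R2" on the ARM side and then mapping the result gives the same state as mapping first and running "MOV EDI, ESI; ADD EDI, EDX" on the x86 side.

```python
MASK32 = 0xFFFFFFFF

# Subset of the register mapping, enough for this one instruction.
ARM_TO_X86 = {"r0": "edi", "r1": "esi", "r2": "edx"}


def f(arm_state: dict) -> dict:
    """Map an ARM register state to the corresponding x86 register state."""
    return {ARM_TO_X86[reg]: value for reg, value in arm_state.items()}


def arm_add(state: dict, rd: str, rn: str, rm: str) -> dict:
    """ARM 'ADD rd, rn, rm' on 32-bit registers (flags not modelled)."""
    new = dict(state)
    new[rd] = (state[rn] + state[rm]) & MASK32
    return new


def x86_mov(state: dict, dst: str, src: str) -> dict:
    new = dict(state)
    new[dst] = state[src]
    return new


def x86_add(state: dict, dst: str, src: str) -> dict:
    """x86 'ADD dst, src' on 32-bit registers (flags not modelled)."""
    new = dict(state)
    new[dst] = (state[dst] + state[src]) & MASK32
    return new


def commutes(S: dict) -> bool:
    """Check f(S') == x86 sequence applied to f(S) for ADD R0, R1, R2."""
    S_prime = arm_add(S, "r0", "r1", "r2")
    T = x86_add(x86_mov(f(S), "edi", "esi"), "edi", "edx")
    return f(S_prime) == T
```

Calling `commutes({"r0": 7, "r1": 0xFFFFFFFF, "r2": 2})` exercises the 32-bit wraparound case. A check like this says nothing about flags or memory, which is where most of the edge cases mentioned above would show up.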
