<a href="https://colab.research.google.com/github/rdkdaniel/Engineering--Arduino-Raspberry-Pi-etc/blob/main/Assembly_%26_Microprocessor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Key links: https://tonybaloney.github.io/posts/extending-python-with-assembly.html

https://specbranch.com/posts/python-and-asm/

**Python extensions in assembly language**

**Python is similar to assembly language**


# **Assembly, Disassembly and Emulation using Python**

In [1]:
# Source: https://www.thepythoncode.com/article/arm-x86-64-assembly-disassembly-and-emulation-in-python

In [3]:
!pip3 install keystone-engine capstone unicorn

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting keystone-engine
  Downloading keystone_engine-0.9.2-py2.py3-none-manylinux1_x86_64.whl (1.8 MB)
Collecting capstone
  Downloading capstone-4.0.2-py2.py3-none-manylinux1_x86_64.whl (2.1 MB)
Collecting unicorn
  Downloading unicorn-2.0.0-py2.py3-none-manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.1 MB)
Installing collected packages: unicorn, keystone-engine, capstone
Successfully installed capstone-4.0.2 keystone-engine-0.9.2 unicorn-2.0.0


# **Assembling ARM**

In [4]:
# We need to emulate ARM
from unicorn import Uc, UC_ARCH_ARM, UC_MODE_ARM, UcError
# for accessing the R0 and R1 registers
from unicorn.arm_const import UC_ARM_REG_R0, UC_ARM_REG_R1
# We need to assemble ARM code
from keystone import Ks, KS_ARCH_ARM, KS_MODE_ARM, KsError

**ARM assembly code that calculates factorial(r0), where r0 is the input register**

In [5]:
ARM_CODE = """
// n is r0, we will pass it from python, ans is r1
mov r1, 1       	// ans = 1
loop:
cmp r0, 0       	// while n > 0:
mulgt r1, r1, r0	//   ans *= n
subgt r0, r0, 1 	//   n = n - 1
bgt loop        	// repeat while n was > 0
                	// answer is in r1
"""
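To make the control flow of the loop easier to follow, here is a rough pure-Python model of the same instructions. It is only a sketch: the name `arm_factorial_model` is hypothetical, and the 32-bit masking is an assumption about how to imitate register wraparound, not something taken from the notebook.

```python
MASK32 = 0xFFFFFFFF  # ARM general-purpose registers are 32 bits wide


def arm_factorial_model(n):
    """Hypothetical Python mirror of the ARM loop: r0 holds n, r1 the answer."""
    r0 = n & MASK32
    r1 = 1                                    # mov r1, 1
    while True:
        # cmp r0, 0 sets the flags; the GT condition is a signed comparison
        signed_r0 = r0 - (1 << 32) if r0 & 0x80000000 else r0
        if signed_r0 <= 0:
            break                             # mulgt/subgt/bgt are all skipped
        r1 = (r1 * r0) & MASK32               # mulgt r1, r1, r0
        r0 = (r0 - 1) & MASK32                # subgt r0, r0, 1
    return r1                                 # answer is in r1


print(arm_factorial_model(5))    # 120
print(arm_factorial_model(12))   # 479001600, the largest factorial that fits in 32 bits
```

Because the three conditional instructions all depend on the flags set by the single `cmp`, one iteration either executes all of them or none of them, which is what the `if`/`break` models.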

**Assemble the above assembly code (convert it into machine code bytes)**

In [6]:
print("Assembling the ARM code")
try:
    # initialize the keystone object with the ARM architecture
    ks = Ks(KS_ARCH_ARM, KS_MODE_ARM)
    # Assemble the ARM code
    ARM_BYTECODE, _ = ks.asm(ARM_CODE)
    # convert the list of integers into bytes
    ARM_BYTECODE = bytes(ARM_BYTECODE)
    print(f"Code successfully assembled (length = {len(ARM_BYTECODE)})")
    print("ARM bytecode:", ARM_BYTECODE)
except KsError as e:
    print("Keystone Error: %s" % e)
    exit(1)

Assembling the ARM code
Code successfully assembled (length = 20)
ARM bytecode: b'\x01\x10\xa0\xe3\x00\x00P\xe3\x91\x00\x01\xc0\x01\x00@\xc2\xfb\xff\xff\xca'


Ks(KS_ARCH_ARM, KS_MODE_ARM) creates an assembler object in ARM mode. Its asm() method assembles the source and returns a pair: the encoded machine code as a list of integers (one per byte) and the number of statements it assembled.
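As a way of reading the assembled output, the sketch below splits the 20 bytes printed above into 32-bit words and decodes the condition field of each one, using only the standard library. The names `ARM_BYTES`, `CONDITIONS` and `branch_target` are illustrative, and the condition table is deliberately limited to the two codes that occur in this program.

```python
import struct

# Bytes copied from the output of the assembling cell above.
ARM_BYTES = b"\x01\x10\xa0\xe3\x00\x00P\xe3\x91\x00\x01\xc0\x01\x00@\xc2\xfb\xff\xff\xca"

# Subset of the ARM condition codes (top 4 bits of every ARM-mode instruction).
CONDITIONS = {0xC: "gt", 0xE: "al"}

# ARM-mode instructions are fixed-size 32-bit little-endian words.
words = [w for (w,) in struct.iter_unpack("<I", ARM_BYTES)]
conds = [CONDITIONS[w >> 28] for w in words]

for index, (word, cond) in enumerate(zip(words, conds)):
    print(f"+{index * 4:02d}: 0x{word:08x}  cond={cond}")

# The last word is `bgt loop`: its low 24 bits are a signed word offset
# relative to PC, which reads as the instruction address + 8 in ARM mode.
imm24 = words[-1] & 0xFFFFFF
if imm24 & 0x800000:
    imm24 -= 1 << 24
branch_target = (len(words) - 1) * 4 + 8 + imm24 * 4
print("bgt jumps back to offset", branch_target)
```

The first two words carry the "always" condition (mov, cmp), the last three carry "gt", and the branch resolves to offset 4, the `cmp` at the `loop` label.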

The assembled bytes can now be written to a memory region and executed by an ARM processor (or emulated, as in our case):

In [7]:
# memory address where emulation starts
ADDRESS = 0x1000000

print("Emulating the ARM code")
try:
    # Initialize emulator in ARM mode
    mu = Uc(UC_ARCH_ARM, UC_MODE_ARM)
    # map 2MB memory for this emulation
    mu.mem_map(ADDRESS, 2 * 1024 * 1024)
    # write machine code to be emulated to memory
    mu.mem_write(ADDRESS, ARM_BYTECODE)
    # Set the r0 register in the code, let's calculate factorial(5)
    mu.reg_write(UC_ARM_REG_R0, 5)
    # emulate with no time limit and no instruction-count limit
    mu.emu_start(ADDRESS, ADDRESS + len(ARM_BYTECODE))
    # emulation has finished; report the result
    print("Emulation done. Below is the result")
    # retrieve the result from the R1 register
    r1 = mu.reg_read(UC_ARM_REG_R1)
    print(">>  R1 = %u" % r1)
except UcError as e:
    print("Unicorn Error: %s" % e)

Emulating the ARM code
Emulation done. Below is the result
>>  R1 = 120


# **Disassembling x86-64 code**

In [9]:
# We need to emulate x86-64 code
from unicorn import Uc, UC_ARCH_X86, UC_MODE_64, UcError
# for accessing the RAX and RDI registers
from unicorn.x86_const import UC_X86_REG_RDI, UC_X86_REG_RAX
# We need to disassemble x86_64 code
from capstone import Cs, CS_ARCH_X86, CS_MODE_64, CsError

X86_MACHINE_CODE = b"\x48\x31\xc0\x48\xff\xc0\x48\x85\xff\x0f\x84\x0d\x00\x00\x00\x48\x99\x48\xf7\xe7\x48\xff\xcf\xe9\xea\xff\xff\xff"
# memory address where emulation starts
ADDRESS = 0x1000000
try:
    # Initialize the disassembler in x86-64 mode
    md = Cs(CS_ARCH_X86, CS_MODE_64)
    # iterate over each instruction and print it
    for instruction in md.disasm(X86_MACHINE_CODE, 0x1000):
        print("0x%x:\t%s\t%s" % (instruction.address, instruction.mnemonic, instruction.op_str))
except CsError as e:
    print("Capstone Error: %s" % e)

0x1000:	xor	rax, rax
0x1003:	inc	rax
0x1006:	test	rdi, rdi
0x1009:	je	0x101c
0x100f:	cqo	
0x1011:	mul	rdi
0x1014:	dec	rdi
0x1017:	jmp	0x1006
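The two jump targets in the listing can be checked by hand, since both jumps store a signed 32-bit displacement relative to the address of the next instruction. The sketch below does this with the standard library only; `rel32_target` is a hypothetical helper, and the offsets and opcode lengths are read off the disassembly above.

```python
import struct

X86_BYTES = (b"\x48\x31\xc0\x48\xff\xc0\x48\x85\xff\x0f\x84\x0d\x00\x00\x00"
             b"\x48\x99\x48\xf7\xe7\x48\xff\xcf\xe9\xea\xff\xff\xff")
BASE = 0x1000  # same base address that was passed to md.disasm above


def rel32_target(code, offset, opcode_len, base=BASE):
    """Target of a jump whose rel32 operand follows `opcode_len` opcode bytes."""
    (rel,) = struct.unpack_from("<i", code, offset + opcode_len)
    next_instruction = base + offset + opcode_len + 4
    return next_instruction + rel


je_target = rel32_target(X86_BYTES, 0x09, 2)    # 0f 84 rel32 at 0x1009
jmp_target = rel32_target(X86_BYTES, 0x17, 1)   # e9 rel32 at 0x1017
print(hex(je_target), hex(jmp_target))          # 0x101c 0x1006
```

Note that `je` targets `BASE + len(X86_BYTES)`, the first address past the code, so taking that jump is how the loop exits.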


**Try to emulate it with the Unicorn engine**

In [10]:
try:
    # Initialize emulator in x86_64 mode
    mu = Uc(UC_ARCH_X86, UC_MODE_64)
    # map 2MB memory for this emulation
    mu.mem_map(ADDRESS, 2 * 1024 * 1024)
    # write machine code to be emulated to memory
    mu.mem_write(ADDRESS, X86_MACHINE_CODE)
    # Set the RDI register (the input) to 7
    mu.reg_write(UC_X86_REG_RDI, 7)
    # emulate with no time limit and no instruction-count limit
    mu.emu_start(ADDRESS, ADDRESS + len(X86_MACHINE_CODE))
    # emulation has finished; read the result from the RAX register
    print("Emulation done. Below is the result")
    rax = mu.reg_read(UC_X86_REG_RAX)
    print(">>> RAX = %u" % rax)
except UcError as e:
    print("Unicorn Error: %s" % e)

Emulation done. Below is the result
>>> RAX = 5040
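The result can be cross-checked without an emulator. Below is a minimal Python model of the disassembled loop; the name `x86_factorial_model` is hypothetical, and masking to 64 bits is an assumption used to imitate the width of RAX and RDI. It ignores RDX, which `cqo` and `mul` also write but which the code never reads.

```python
import math

MASK64 = (1 << 64) - 1  # RAX and RDI are 64-bit registers


def x86_factorial_model(rdi):
    """Hypothetical Python mirror of the x86-64 loop: RDI is n, RAX the answer."""
    rdi &= MASK64
    rax = 1                              # xor rax, rax ; inc rax
    while rdi != 0:                      # test rdi, rdi ; je <end>
        rax = (rax * rdi) & MASK64       # mul rdi (low 64 bits stay in RAX)
        rdi = (rdi - 1) & MASK64         # dec rdi
    return rax                           # jmp back to the test until RDI is 0


print(x86_factorial_model(7))                          # 5040
print(x86_factorial_model(7) == math.factorial(7))     # True
print(x86_factorial_model(21) == math.factorial(21))   # False: 21! exceeds 64 bits
```

With 64-bit registers the loop gives exact factorials up to 20!; from 21! onwards only the low 64 bits of the product remain in RAX.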
