# Approx. Concrete Execution (ACE) Fingerprinting Demo

This notebook demonstrates some of the basics of ACE by extracting and fingerprinting functions from a "toy" binary. As a refresher, here's a quick overview of the ACE process.

![ACE process flowchart](aceflow.png)

1. We start with a ELF binary executable file and **disassemble** it into its original assembly code representation. In practice, we'd use another tool, *Nucleus* to separate the assembly code into individual functions (not shown).
2. We then use the *PyREIL* library to **translate** that assembly code (which can be in x86, x86_64, ARMv7, or ARMv8 assembly, and can contain anything from a single function to an entire program) into REIL code, an intermediate representation. You can learn more about REIL on the [zynamics website](https://www.zynamics.com/binnavi/manual/html/reil_language.htm).
3. The resulting REIL code is then loaded into an *approximate VM (aVM)* and is **approximately executed** inside the aVM, changing the state of the aVM's registers
4. The final state of the aVM's registers (and memory, possibly) becomes the **fingerprint** of the original code 

## Sample Executable
We'll use a simple "helloworld" executable with two functions to demonstrate ACE. Here's the C source code.
```
#include <stdio.h>

void print_hello() {
   puts("Hello");
   return;
}

void print_world() {
   puts("World");
   return;
}

int main() {
   print_hello();
   print_world();
   return 0;
}
```
We'll compile it using GCC with all optimizations turned off and debugging symbols turned on. This produces the x86 assembly code shown below

In [1]:
!gcc -Wall -ansi -fno-inline-functions -O0 -c helloworld.c -o helloworld.o
!gcc -static helloworld.o -o helloworld
!objdump -d -M intel helloworld.o


helloworld.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <print_hello>:
   0:	55                   	push   rbp
   1:	48 89 e5             	mov    rbp,rsp
   4:	48 8d 3d 00 00 00 00 	lea    rdi,[rip+0x0]        # b <print_hello+0xb>
   b:	e8 00 00 00 00       	call   10 <print_hello+0x10>
  10:	90                   	nop
  11:	5d                   	pop    rbp
  12:	c3                   	ret    

0000000000000013 <print_world>:
  13:	55                   	push   rbp
  14:	48 89 e5             	mov    rbp,rsp
  17:	48 8d 3d 00 00 00 00 	lea    rdi,[rip+0x0]        # 1e <print_world+0xb>
  1e:	e8 00 00 00 00       	call   23 <print_world+0x10>
  23:	90                   	nop
  24:	5d                   	pop    rbp
  25:	c3                   	ret    

0000000000000026 <main>:
  26:	55                   	push   rbp
  27:	48 89 e5             	mov    rbp,rsp
  2a:	b8 00 00 00 00       	mov    eax,0x0
  2f:	e8 00 00 00 00       	ca

### Translation
In this section, we'll translate each function to REIL code.

Note: Normally at this point, we'd use *Nucleus* to separate the `.text` section of the `helloworld` executable into individual functions. For the purposes of this demonstration, we'll use the debugging symbol information embedded into the binary instead, which contains the exact start and end locations of functions.

In [1]:
#### Helper functions. Reader can ignore this cell

from elftools.elf.elffile import ELFFile
from elftools.elf.sections import SymbolTableSection, Symbol
from reil.x86.translator import translate

def get_func_addr_range(bin_path, func_name):
    """
    Get a function's address range in a binary using its name
    
    :param bin_path: a path to the executable being searched
    :param name: the name of the function you're looking for
    :returns: a tuple containing the start and end addr of the function, or None if not found
    """
    with open(bin_path, 'rb') as f:
        # Load the binary's symbol table into PyELFtools 
        symtab = ELFFile(f).get_section_by_name(".symtab")
        sym = symtab.get_symbol_by_name(func_name)
        if sym is not None and len(sym) == 1:
            # Return the address of the binary
            start_addr = sym[0]['st_value']
            size = sym[0]['st_size']
            return (start_addr, start_addr + size)
        else:
            print("Couldn't find symbol " + func_name)
            return None
        
def get_raw_bytes(path, start, stop):
    """
    Get the raw bytes between two MEMORY addresses in an ELF binary
    
    :param path: string path to the ELF file
    :param start: the starting address to extract
    :param stop: the last address to extract
    :returns: a raw bytes object
    """
    with open(path, 'rb') as f:
        start_addr = list(ELFFile(f).address_offsets(start))[0]
        f.seek(start_addr)
        return f.read(stop - start)
    
def x86_64_to_reil(raw_bytes):
    """
    Wrapper function. Returns output from REIL translator in human-readable format
    """
    return list(il_ins for nat_ins in translate(raw_bytes, 0x0, x86_64=False) for il_ins in nat_ins.il_instructions)

Let's see what the `print_hello` and `print_world` functions look like in REIL

In [2]:
%%capture --no-stdout
def print_reil_translation(bin_path, func_name):
    """
    Pretty-prints the REIL translation of an x86 function
    
    :param bin_path: string path to the ELF file
    :param func_name: string name of the function to translate
    """
    addr_range = get_func_addr_range(bin_path, func_name)
    func_bytes = get_raw_bytes(bin_path, addr_range[0], addr_range[1])
    reil_code = x86_64_to_reil(func_bytes)
    for ins in reil_code:
        print(str(ins))
    return reil_code
        
print("-- print_hello() --")
hello_reil = print_reil_translation("helloworld", "print_hello")

print("\n-- print_world() --")
world_reil = print_reil_translation("helloworld", "print_world")
%%capture --no-stdout --no-stderr

-- print_hello() --
sub esp, 4, esp
stm ebp, esp
sub eax, 1, t00
and eax, 2147483648, t02
xor 1, 2147483648, t01
and t01, 2147483648, t03
and t00, 2147483648, t04
xor t02, t04, t05
xor t03, t04, t06
and t05, t06, t07
bisnz t07, of
bisnz t04, sf
and t00, 4294967295, t08
bisz t08, zf
str t00, t10
lshr t10, 4, t11
xor t10, t11, t10
and t10, 15, t11
lshr 38505, t11, t12
and t12, 1, pf
str t00, t13
str t13, eax
str esp, ebp
sub eax, 1, t00
and eax, 2147483648, t02
xor 1, 2147483648, t01
and t01, 2147483648, t03
and t00, 2147483648, t04
xor t02, t04, t05
xor t03, t04, t06
and t05, t06, t07
bisnz t07, of
bisnz t04, sf
and t00, 4294967295, t08
bisz t08, zf
str t00, t10
lshr t10, 4, t11
xor t10, t11, t10
and t10, 15, t11
lshr 38505, t11, t12
and t12, 1, pf
str t00, t13
str t13, eax
str 595436, edi
sub esp, 4, esp
stm 16, esp
jcc 1, 63219

-- print_world() --
sub esp, 4, esp
stm ebp, esp
sub eax, 1, t00
and eax, 2147483648, t02
xor 1, 2147483648, t01
and t01, 2147483648, t03
and t00, 2147483648,

Notice that these translations are almost identical, which is to be expected considering the original functions were almost identical. 

The REIL translator in use makes a good effort to accurately model the original x86 code, but the translation is certainly not perfect. For our purposes, that is okay; as long as translation is consistent (as shown by the near identical translations above), we can use them during approximate execution.

### Approximate Execution

We'll now step through the approximate execution process. The REIL code shown above will be executed by the aVM, which is implemented in `acevm.py` (attached). The final state of the registers will become the function fingerprint.

We'll begin by setting up an instance of the aVM. The constructor for the `REILApproximateVM` object takes three parameters. The first two parameters, `reg_context` and `mem_context`, allow the user to specify the initial state of the VM's registers and memory using the `REILRegContext` and `REILMemContext` enumerated types. Specifying `REILRegContext.zeros` as the register context, for example, sets all registers in the aVM initially to zero, and specifying `REILMemContext.address` as the memory context would set all memory values to the address itself (i.e. reads of memory location 0xE would return 0xE). See `acevm.py` for other RegContext options. 

The third parameter, `dynamic_memory`, controls the behavior of the `STM` (Store to memory) instruction. When `False`, all `STM` instructions are ignored, effectively making the aVM's memory read-only. When `True`, the aVM's memory is writable.

At this time, the effect of these parameters on the effectiveness of ACE as a fingerprinting tool is unknown and will be investigated through further experiments.

In [4]:
from acevm import REILApproximateVM, REILRegContext, REILMemContext 

hello_avm = REILApproximateVM(REILRegContext.zeros, REILMemContext.address, True) 
world_avm = REILApproximateVM(REILRegContext.zeros, REILMemContext.address, True) 

We'll now step through the execution of `print_hello()`. If you haven't already, now would be a good time to read the REIL specification on the [zynamics website](https://www.zynamics.com/binnavi/manual/html/reil_language.htm). In summary, a reference REIL virtual machine has unlimited memory and an unlimited number of *temporary registers*, as well as the original *named registers* of the target architecture (i.e. x86's `rsp`, `rbp`, `rdi`, etc.). In this implementation, the aVM contains 32 temporary registers, all 69 of x86's named registers, and "unlimited memory" represented internally by a sparse array.

For readability, only the contents of the temporary registers will be shown below after each instruction.

In [5]:
print("-- Step-by-step execution of print_hello() --")
for ins in hello_reil:
    print(hello_avm.t_regs)
    print(ins)
    hello_avm.execute(ins)
print(hello_avm.t_regs)

-- Step-by-step execution of print_hello() --
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
sub rsp, 8, rsp
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
stm rbp, rsp
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
str rsp, rbp
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
str 11, t00
[11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
add t00, 595436, t01
[11, 595447, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
and t01, 18446744073709551615, t02
[11, 595447, 595447, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
str t02, rdi
[11, 595447, 595447, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
sub rsp, 8, rsp
[11, 59544

Note that jump and branch instructions like `JCC` are ignored by the aVM and have no effect.

We'll now repeat the same process with `print_world()`.

In [6]:
print("-- Step-by-step execution of print_world() --")
for ins in world_reil:
    print(world_avm.t_regs)
    print(ins)
    world_avm.execute(ins)
print(world_avm.t_regs)

-- Step-by-step execution of print_world() --
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
sub rsp, 8, rsp
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
stm rbp, rsp
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
str rsp, rbp
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
str 11, t00
[11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
add t00, 595423, t01
[11, 595434, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
and t01, 18446744073709551615, t02
[11, 595434, 595434, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
str t02, rdi
[11, 595434, 595434, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
sub rsp, 8, rsp
[11, 59543

### Code Fingerprinting

The final state of the aVM post-execution serves as the fingerprint of the original code. The simplest fingerprint could be derived from just the aVM's 32 temporary registers. If this was the case, `print_hello()`'s fingerprint would simply be:

In [7]:
print(hello_avm.t_regs)

[11, 595447, 595447, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


A more complex fingerprint could incorporate the achitecture-specific named registers, but this would defeat the theorized architecture-agnostic qualities of ACE. Nonetheless, concatenating the temporary and named registers of final state of the aVM that ran `print_world()` would produce a fingerprint like this:

In [8]:
print(world_avm.t_regs + list(world_avm.n_regs.values()))

[11, 595434, 595434, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 595434, -8, -16, 0, 0, 0, 0, 0, 0, 0, 0, 0]


Even further complexity could be introduced by incorporating data about memory accesses (like the number of times memory was read from or written to) or instruction counts, or by performing some statistical analysis or other mathematical transformations. All of these are avenues for future experimentation.

Finally, how to use these fingerprints is yet another area for exploration. Current strategies include populating a database with labeled function fingerprints and using cosine similarity to determine the identity of unlabeled function fingerprints, as well as machine-learning-based approaches of a similar nature.

## Appendix
### Source of acevm.py

In [None]:
import random
from enum import Enum
from reil.definitions import ImmediateOperand, TemporaryOperand, RegisterOperand, _opcode_to_string


class REILRegContext(Enum):
    """
    Enumerated type representing different possible startup register contexts
    """
    # Undefined/custom (used when user provides context)
    undefined = 0
    # All registers = 0 
    zeros = 1
    # All registers = 1
    ones = 2
    # All registers = 1,073,741,823 (0x3FFFFFFF, or half of the 32-bit signed max)
    halfmax_signed = 3
    # All registers = 2,147,483,647 (0x7FFFFFFF, or half of the 32-bit unsigned max)
    halfmax_unsigned = 4
    # All registers = -1,073,741,823 (0xC0000001, or half of the 32-bit signed min)
    halfmin_signed = 5
    # All registers = 0x55555555 (0b01010101010101010101010101010101)
    bitweave_one = 6
    # All registers = 0x0F0F0F0F (0b00001111000011110000111100001111)
    bitweave_four = 7
    # Register values count up from zero
    countup = 8
    # Register values count down from 31
    countdown = 9
    # Random permutations (see implementation)
    random1 = 10
    random2 = 11
    random3 = 12
    random4 = 13
    
class REILMemContext(Enum):
    """
    Enumerated type representing different possible startup memory contexts
    """
    # Undefined/custom (used when user provides context)
    undefined = 0
    # All addresses = 0
    zeros = 1
    # ...
    # TODO: determine if other memory contexts are helpful
    # ...
    # All memory cells have their address as their value
    address = 2


class REILApproximateVM():
    """
    "Approximate" virtual machine for REIL. Designed for fast collection of context changes, not correct execution
    """
    def __init__(self, reg_context, mem_context, dynamic_mem = True):
        """
        Constructor
        """
        # Memory setup
        self.mem_context = mem_context
        self.dynamic_mem = dynamic_mem
        self.memory = {}
        
        # Register setup
        X86_REG_NAMES = ['al', 'ah', 'bl', 'bh', 'cl', 'ch', 'dl', 'dh', 'sil', 'dil', 'bpl', 'spl', 'r8b', 'r9b', 'r10b', 'r11b', 'r12b', 'r13b', 'r14b', 'r15b', 'ax', 'bx', 'cx', 'dx', 'si', 'di', 'bp', 'sp', 'r8w', 'r9w', 'r10w', 'r11w', 'r12w', 'r13w', 'r14w', 'r15w', 'eax', 'ebx', 'ecx', 'edx', 'esi', 'edi', 'ebp', 'esp', 'r8d', 'r9d', 'r10d', 'r11d', 'r12d', 'r13d', 'r14d', 'r15d', 'rax', 'rbx', 'rcx', 'rdx', 'rsi', 'rdi', 'rbp', 'rsp', 'r8', 'r9', 'r10', 'r11', 'r12', 'r13', 'r14', 'r15', 'rip']
        NUM_T_REGS = 32
        NUM_N_REGS = len(X86_REG_NAMES)
        self.reg_context = reg_context
        n_regs = {}
        
        
        if not isinstance(reg_context, REILRegContext) and hasattr(reg_context, "__getitem__"):
            # Allows user to provide array of registers for custom contexts
            self.reg_context = REILRegContext.undefined
            self.t_regs = reg_context
            self.n_regs = dict(zip(X86_REG_NAMES, reg_context))
        elif reg_context == REILRegContext.zeros:
            self.t_regs = [0] * NUM_T_REGS
            self.n_regs = dict(zip(X86_REG_NAMES, [0] * NUM_N_REGS))
        elif reg_context == REILRegContext.ones:
            self.t_regs = [1] * NUM_T_REGS
            self.n_regs = dict(zip(X86_REG_NAMES, [1] * NUM_N_REGS))
        elif reg_context == REILRegContext.halfmax_signed:
            self.t_regs = [0x3FFFFFFF] * NUM_T_REGS
            self.n_regs = dict(zip(X86_REG_NAMES, [0x3FFFFFFF] * NUM_N_REGS))
        elif reg_context == REILRegContext.halfmax_unsigned:
            self.t_regs = [0x7FFFFFFF] * NUM_T_REGS
            self.n_regs = dict(zip(X86_REG_NAMES, [0x7FFFFFFF] * NUM_N_REGS))
        elif reg_context == REILRegContext.halfmin_signed:
            self.t_regs = [0xC0000001] * NUM_T_REGS
            self.n_regs = dict(zip(X86_REG_NAMES, [0xC0000001] * NUM_N_REGS))
        elif reg_context == REILRegContext.bitweave_one:
            self.t_regs = [0x55555555] * NUM_T_REGS
            self.n_regs = dict(zip(X86_REG_NAMES, [0x55555555] * NUM_N_REGS))
        elif reg_context == REILRegContext.bitweave_four:
            self.t_regs = [0x0F0F0F0F] * NUM_T_REGS
            self.n_regs = dict(zip(X86_REG_NAMES, [0x0F0F0F0F] * NUM_N_REGS))
        elif reg_context == REILRegContext.countup:
            self.t_regs = list(range(NUM_T_REGS))
            self.n_regs = dict(zip(X86_REG_NAMES, list(range(NUM_N_REGS))))
        elif reg_context == REILRegContext.countdown:
            self.t_regs = list(range(NUM_T_REGS-1, -1, -1))
            self.n_regs = dict(zip(X86_REG_NAMES, list(range(NUM_N_REGS-1, -1, -1))))
        elif reg_context == REILRegContext.random1:
            self.t_regs = REILApproximateVM.__random_context(NUM_T_REGS, 100)
            self.n_regs = dict(zip(X86_REG_NAMES, REILApproximateVM.__random_context(NUM_N_REGS, 101)))
        elif reg_context == REILRegContext.random2:
            self.t_regs = REILApproximateVM.__random_context(NUM_T_REGS, 200)
            self.n_regs = dict(zip(X86_REG_NAMES, REILApproximateVM.__random_context(NUM_N_REGS, 201)))
        elif reg_context == REILRegContext.random3:
            self.t_regs = REILApproximateVM.__random_context(NUM_T_REGS, 300)
            self.n_regs = dict(zip(X86_REG_NAMES, REILApproximateVM.__random_context(NUM_N_REGS, 301)))
        elif reg_context == REILRegContext.random4:
            self.t_regs = REILApproximateVM.__random_context(NUM_T_REGS, 400)
            self.n_regs = dict(zip(X86_REG_NAMES, REILApproximateVM.__random_context(NUM_N_REGS, 401)))
        else:
            raise ValueError("Invalid register context")
    
    @classmethod
    def __random_context(cls, num_regs, seed):
        """
        Deterministic random context generator
        """
        t_regs = [0] * num_regs
        random.seed(seed)
        for s in range(num_regs):
            t_regs[s] = random.randint(-1073741823, 1073741823)
        return t_regs
    
    def __set_reg(self, value, reg):
        """
        Stores a value in a register
        """
        if isinstance(reg, TemporaryOperand):
            self.t_regs[int(reg.name[1:])] = value
        elif isinstance(reg, RegisterOperand):
            self.n_regs[reg.name] = value
            pass
        else:
            raise ValueError("Invalid register provided")
            
    def __resolve_op(self, operand):
        """
        Given an operand, return its value
        """
        if isinstance(operand, ImmediateOperand):
            return operand.value
        elif isinstance(operand, TemporaryOperand):
            return self.t_regs[int(operand.name[1:])]
        elif isinstance(operand, RegisterOperand):
            return self.n_regs[operand.name]
        else:
            raise ValueError("Invalid register provided")
        
    #### BEGIN INSTRUCTION IMPLEMENTATION ####
    def __add(self, input0, input1, output):
        """
        Adds the two values given in the first and second operand and writes the result to the third operand. 
        The input operands can be literals and register values. The output operand must be a register.
        """
        self.__set_reg(self.__resolve_op(input0) + self.__resolve_op(input1), output)
        return
    
    def __and(self, input0, input1, output):
        """
        Binary AND operation that connects the first two operands and stores the result in the third operand. 
        The input operands can be literals and register values. The output operand must be a register.
        """
        self.__set_reg(self.__resolve_op(input0) & self.__resolve_op(input1), output)
        return
    
    def __bisz(self, input0, input1, output):
        """
        Sets a flag depending on whether another value is zero. 
        The input operand can be a literal or a register value. The output operand is a register.
        """
        self.__set_reg(1 if self.__resolve_op(input0) == 0 else 0, output)
        return
        
    def __bsh(self, input0, input1, output):
        """
        Performs a logical shift on a value. If the second operand is positive, the shift is a left-shift. 
        If the second operand is negative, the shift is a right-shift. The two input operands can be either 
        registers or literals while the output operand must be a register.
        """
        if self.__resolve_op(input1) > 0:
            self.__set_reg(self.__resolve_op(input0) << self.__resolve_op(input1), output)
        else:
            self.__set_reg(self.__resolve_op(input0) >> self.__resolve_op(input1), output)
        return
    
    def __div(self, input0, input1, output):
        """
        Performs an unsigned division on the two input operands. The first input operand is the dividend, 
        the second input operand is the divisor. The two input operands can be either registers or literals 
        while the output operand must be a register.
        
        Imp. Note: output is zero on divide-by-zero
        """
        try:
            self.__set_reg(self.__resolve_op(input0) // self.__resolve_op(input1), output)
        except ZeroDivisionError:
            self.__set_reg(0, output)
        return
    
    def __jcc(self, input0, input1, output):
        """
        Conditional jump. Not implemented.
        """
        pass
    
    def __ldm(self, input0, input1, output):
        """
        Loads a value from memory. The first operand specifies the address to read from. It can be either 
        a register or a literal. The third operand must be a register where the loaded value is stored. 
        The size of the third operand determines how many bytes are read from memory.
        
        Imp. Note: we only read the address specifed, nothing beyond that
        """
        # If dynamic memory is on, we check if that address has been written to yet
        if self.dynamic_mem:
            saddr = str(self.__resolve_op(input0))
            try:
                self.__set_reg(self.memory[saddr], output)
            except KeyError:
                # if it hasn't been written to, we just return the address as the value
                # TODO: IMPLEMENT OTHER MemContexts
                self.__set_reg(self.__resolve_op(input0), output)
        else:
            # if dynamic memory is off, we just return the address as the value
            self.__set_reg(self.__resolve_op(input0), output)
        return
    
    def __mod(self, input0, input1, output):
        """
        Performs a modulo operation on the first two operands. The two input operands can be either registers 
        or literals while the output operand must be a register.
        
        Imp. Note: output is zero on divide-by-zero
        """
        try:
            self.__set_reg(self.__resolve_op(input0) % self.__resolve_op(input1), output)
        except ZeroDivisionError:
            self.__set_reg(0, output)
        return
    
    def __mul(self, input0, input1, output):
        """
        Performs an unsigned multiplication on the two input operands. The two input operands can be either 
        registers or literals while the output operand must be a register.
        """
        self.__set_reg(self.__resolve_op(input0) * self.__resolve_op(input1), output)
        return
        
    def __nop(self, input0, input1, output):
        """
        Does nothing.
        """
        pass
    
    def __or(self, input0, input1, output):
        """
        Binary OR operation that connects the first two operands and stores the result in the third operand. 
        The input operands can be literals and register values. The output operand must be a register.
        """
        self.__set_reg(self.__resolve_op(input0) | self.__resolve_op(input1), output)
        return
    
    def __stm(self, input0, input1, output):
        """
        Stores a value to memory. The first operand is the register value or literal to be stored in memory. 
        The third operand is the register value or literal that contains the memory address where the value is stored. 
        The size of the first operand determines the number of bytes to be written to memory.
        """
        # Only does anything if dynamic memory is on
        if self.dynamic_mem:
            self.memory[str(self.__resolve_op(output))] = self.__resolve_op(input0)
        return
    
    def __str(self, input0, input1, output):
        """
        Copies a value to a register. The input operand can be either a literal or a register. The output operand must 
        be a register. If the output operand is of a larger size than the input operand, the input is zero-extended.
        """
        self.__set_reg(self.__resolve_op(input0), output)
        return
    
    def __sex(self, input0, input1, output):
        """
        Functionally identical to STR in this implementation
        """
        self.__set_reg(self.__resolve_op(input0), output)
        return
    
    def __sub(self, input0, input1, output):
        """
        Subtracts the second input operand from the first input operand and writes the result to the output operand.
        The input operands can be literals and register values. The output operand must be a register.
        """
        self.__set_reg(self.__resolve_op(input0) - self.__resolve_op(input1), output)
        return 
    
    def __undef(self, input0, input1, output):
        """
        Flags a register value as undefined. Not implemented.
        """
        pass
    
    def __unkn(self, input0, input1, output):
        """
        Untranslatable native instruction placeholder. Does nothing.
        """
        pass
    
    def __xor(self, input0, input1, output):
        """
        Binary XOR operation that connects the first two operands and stores the result in the third operand. 
        The input operands can be literals and register values. The output operand must be a register.
        """
        self.__set_reg(self.__resolve_op(input0) ^ self.__resolve_op(input1), output)
        return 
    
    def __bisnz(self, input0, input1, output):
        """
        Sets a flag depending on whether another value is nonzero. The input operand can be a literal or 
        a register value. The output operand is a register.
        """
        # Assuming flag = 1 if nonzero
        self.__set_reg(0 if self.__resolve_op(input0) == 0 else 1, output)
        return
    
    def __equ(self, input0, input1, output):
        """
        Sets a flag depending on whether another two values are equal. The input operands can be literal or 
        register values. The output operand is a register.
        """
        # Assuming flag = 1 if equal
        self.__set_reg(1 if self.__resolve_op(input0) == self.__resolve_op(input1) else 0, output)
        return
    
    def __lshl(self, input0, input1, output):
        """
        Performs a logical left shift on a value. The two input operands can be either registers or literals 
        while the output operand must be a register.
        """
        self.__set_reg(self.__resolve_op(input0) << self.__resolve_op(input1), output)
        return
    
    def __lshr(self, input0, input1, output):
        """
        Performs a logical right shift on a value. The two input operands can be either registers or literals 
        while the output operand must be a register.
        """
        self.__set_reg(self.__resolve_op(input0) >> self.__resolve_op(input1), output)
        return
    
    def __ashr(self, input0, input1, output):
        #TODO implement?
        self.__lshr(self, input0, input1, output)
        return
    
    #### END INSTRUCTION IMPLEMENTATION ####
    
    def execute(self, instruction):
        """
        Takes in a single REIL instruction and executes it in the approximate virtual machine
        """
        op = getattr(self, "_REILApproximateVM__" + _opcode_to_string(instruction.opcode))
        op(instruction.input0, instruction.input1, instruction.output)
        #print(op)
        return