# Fourth Step

In this fourth project, you will translate the optimized SSA intermediate representation, uCIR, into LLVM IR, the intermediate representation of LLVM, which is partially specified in the [LLVM Primer](./llvm_primer.ipynb). LLVM is a set of production-quality, reusable libraries for building compilers. It separates target-architecture concerns from source-language concerns, which simplifies the design and portability of new compilers. Before you start, carefully study this simplified specification of LLVM IR, tailored to our needs, to familiarize yourself with its data structures, addressing modes, and instructions. The full LLVM IR specification is available [here](https://llvm.org/docs/LangRef.html).

We will try to give you a hand by providing some of the pieces prewritten and by sprinkling the code with helpful comments. By the time you’re done, you’ll have a pretty thorough understanding of the executable code generated for uC programs and will even gain some reading familiarity with LLVM IR.

# llvmlite

To carry out this step, we will use the llvmlite library. llvmlite is a lightweight LLVM Python binding for writing JIT compilers. We do not need the full LLVM API; only the IR builder, optimizer, and JIT compiler APIs are necessary. llvmlite is therefore well suited to our needs. It takes the following approach: a small C wrapper around the parts of the LLVM C++ API we need that are not already exposed by the LLVM C API; a ctypes Python wrapper around the C API; and a pure Python implementation of the subset of the LLVM IR builder. You'll find the documentation and installation instructions for llvmlite at http://llvmlite.pydata.org. An example, extracted from the llvmlite documentation, is shown [here](./llvm_example.ipynb). Study it carefully.


# Generating Code

The basic idea in this step is exactly the same as for the interpreter in step 2. You'll write a class that walks through the instruction sequence and triggers a method for each kind of instruction. Instead of running the instruction, however, you'll generate LLVM instructions. The code below shows an example of how to interface with LLVM. Further instructions are contained in the comments.

In [None]:
from llvmlite import ir, binding
from ctypes import CFUNCTYPE, c_int

class CodeGen():
    def __init__(self):
        self.binding = binding
        self.binding.initialize()
        self.binding.initialize_native_target()
        self.binding.initialize_native_asmprinter()
        
        self.module = ir.Module(name=__file__)
        self.module.triple = self.binding.get_default_triple()
        
        self._create_execution_engine()
        
        # declare external functions
        self._declare_printf_function()
        self._declare_scanf_function()

    def _create_execution_engine(self):
        """
        Create an ExecutionEngine suitable for JIT code generation on
        the host CPU.  The engine is reusable for an arbitrary number of
        modules.
        """
        target = self.binding.Target.from_default_triple()
        target_machine = target.create_target_machine()
        # Create an execution engine with an empty backing module
        backing_mod = self.binding.parse_assembly("")
        self.engine = self.binding.create_mcjit_compiler(backing_mod, target_machine)

    def _declare_printf_function(self):
        voidptr_ty = ir.IntType(8).as_pointer()
        printf_ty = ir.FunctionType(ir.IntType(32), [voidptr_ty], var_arg=True)
        printf = ir.Function(self.module, printf_ty, name="printf")
        self.printf = printf

    def _declare_scanf_function(self):
        voidptr_ty = ir.IntType(8).as_pointer()
        scanf_ty = ir.FunctionType(ir.IntType(32), [voidptr_ty], var_arg=True)
        scanf = ir.Function(self.module, scanf_ty, name="scanf")
        self.scanf = scanf

    def _compile_ir(self):
        """
        Compile the LLVM IR string with the given engine.
        The compiled module object is returned.
        """
        # Create a LLVM module object from the IR
        llvm_ir = str(self.module)
        mod = self.binding.parse_assembly(llvm_ir)
        mod.verify()
        # Now add the module and make sure it is ready for execution
        self.engine.add_module(mod)
        self.engine.finalize_object()
        self.engine.run_static_constructors()
        return mod

    def save_ir(self, filename):
        with open(filename, 'w') as output_file:
            output_file.write(str(self.module))
            
    def execute_ir(self):
        self._compile_ir()
        # Obtain the address of the JIT-compiled 'main' function in memory.
        main_ptr = self.engine.get_function_address('main')
        # To convert an address to an actual callable thing we have to use
        # CFUNCTYPE, and specify the arguments & return type.
        main_function = CFUNCTYPE(c_int)(main_ptr)
        # Now 'main_function' is an actual callable we can invoke
        res = main_function()
        print(res)
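
The instruction-walking class described at the start of this section can be sketched in plain Python before any llvmlite calls are involved. This is only an illustration of the dispatch pattern: the tuple layout `(opcode_type, operands...)`, the class name `InstructionWalker`, the `build_*` method names, and the `LLVM_TYPES` table are assumptions made for this example, not part of the provided code. It emits text lines rather than real LLVM instructions, and `build_print` emits only a placeholder comment.

```python
# Hypothetical mapping from uC type suffixes to LLVM type names (an assumption).
LLVM_TYPES = {'int': 'i32', 'float': 'double', 'char': 'i8'}


class InstructionWalker:
    """Dispatch each instruction tuple to a method named after its opcode."""

    def __init__(self):
        self.lines = []  # emitted text, one entry per generated instruction
        self.loc = {}    # maps IR temporaries to the operand text to use

    def walk(self, instructions):
        for inst in instructions:
            opcode, *operands = inst
            # Split e.g. 'add_int' into the operation 'add' and the type 'int'.
            op, _, ty = opcode.partition('_')
            handler = getattr(self, 'build_' + op, None)
            if handler is None:
                raise NotImplementedError(f"no handler for {opcode!r}")
            handler(ty, *operands)
        return self.lines

    def build_literal(self, ty, value, target):
        # Literals emit nothing; they are remembered and used as operands.
        self.loc[target] = str(value)

    def build_add(self, ty, left, right, target):
        self.loc[target] = target
        self.lines.append(
            f"{target} = add {LLVM_TYPES[ty]} {self.loc[left]}, {self.loc[right]}"
        )

    def build_print(self, ty, source):
        # Placeholder only: a real generator would emit a call to printf here.
        self.lines.append(f"; print {ty} {self.loc[source]}")


# Assumed instruction format, for illustration only.
program = [
    ('literal_int', 2, '%1'),
    ('literal_int', 3, '%2'),
    ('add_int', '%1', '%2', '%3'),
    ('print_int', '%3'),
]
emitted = InstructionWalker().walk(program)
```

In the real generator, each `build_*` method would call the corresponding `ir.IRBuilder` method and store the resulting LLVM value in `self.loc`, instead of appending text.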

## printf Sample Code 

The code snippet below shows how you can map the print instruction from uCIR to LLVM. Part of the class structure that generates LLVM instructions for a function is shown. Use it as a guide. 

In [None]:
def make_bytearray(buf):
    # Make a byte array constant from *buf*.
    b = bytearray(buf)
    n = len(b)
    return ir.Constant(ir.ArrayType(ir.IntType(8), n), b)


class LLVMFunctionVisitor(BlockVisitor):

    def __init__(self, module):
        self.module = module
        self.func = None
        self.builder = None
        self.loc = {}

    def _global_constant(self, builder_or_module, name, value, linkage='internal'):
        # Create a module-level global constant named *name*, initialized to *value*.
        if isinstance(builder_or_module, ir.Module):
            mod = builder_or_module
        else:
            mod = builder_or_module.module
        data = ir.GlobalVariable(mod, value.type, name=name)
        data.linkage = linkage
        data.global_constant = True
        data.initializer = value
        data.align = 1
        return data

    def _cio(self, fname, format, *target):
        # Make global constant for string format
        mod = self.builder.module
        fmt_bytes = make_bytearray((format + '\00').encode('ascii'))
        global_fmt = self._global_constant(mod, mod.get_unique_name('.fmt'), fmt_bytes)
        fn = mod.get_global(fname)
        ptr_fmt = self.builder.bitcast(global_fmt, ir.IntType(8).as_pointer())
        return self.builder.call(fn, [ptr_fmt] + list(target))

    def _build_print(self, val_type, target):
        if target:
            # get the object assigned to target
            _value = self._get_loc(target)
            if val_type == 'int':
                self._cio('printf', '%d', _value)
            elif val_type == 'float':
                self._cio('printf', '%.2f', _value)
            elif val_type == 'char':
                self._cio('printf', '%c', _value)
            elif val_type == 'string':
                self._cio('printf', '%s', _value)
        else:
            self._cio('printf', '\n')
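
To see what `_cio` produces in isolation, the same steps can be reproduced in a tiny standalone module. The sketch below assumes llvmlite is installed and uses only the pure-Python `llvmlite.ir` layer, so nothing is compiled or executed. The module name `print_demo`, the global name `.fmt`, and the constant `42` are placeholders chosen for this example.

```python
from llvmlite import ir

i8p = ir.IntType(8).as_pointer()
i32 = ir.IntType(32)

module = ir.Module(name="print_demo")

# Declare the external, variadic printf: i32 printf(i8*, ...)
printf_ty = ir.FunctionType(i32, [i8p], var_arg=True)
printf = ir.Function(module, printf_ty, name="printf")

# Define i32 main() with a single basic block.
main = ir.Function(module, ir.FunctionType(i32, []), name="main")
builder = ir.IRBuilder(main.append_basic_block(name="entry"))

# Store the NUL-terminated format string as an internal global constant.
fmt = bytearray("%d\n\0".encode("ascii"))
fmt_ty = ir.ArrayType(ir.IntType(8), len(fmt))
global_fmt = ir.GlobalVariable(module, fmt_ty, name=".fmt")
global_fmt.linkage = "internal"
global_fmt.global_constant = True
global_fmt.initializer = ir.Constant(fmt_ty, fmt)

# Cast the array pointer to i8* and pass it, plus the value, to printf.
ptr_fmt = builder.bitcast(global_fmt, i8p)
builder.call(printf, [ptr_fmt, ir.Constant(i32, 42)])
builder.ret(ir.Constant(i32, 0))

llvm_ir = str(module)
print(llvm_ir)
```

Printing the module is a convenient way to check the generated IR against the LLVM primer before handing it to the execution engine.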