# Draft Report
### Group 24
### Gavin Johnson 1217930
### Gabriel Dalimonte 1208473

The core concept of our project is to add SIMD (single instruction, multiple data) instructions to P0. Briefly, SIMD allows the CPU to execute the same instruction on multiple data elements in parallel, which greatly speeds up operations such as vector arithmetic. Several changes were needed to support this. The largest was a change of target architecture: the MIPS target of P0 does not support SIMD instructions, so the code generation had to be rewritten entirely for the ARM64 architecture. This was the largest undertaking of the project. In addition, P0.py needed to be extended to allow these operations. That adjustment is much more forgiving and adaptable, so it was left until the end; once the backend works properly, the front end is an easy update. Unfortunately, the machines available to us for this project do not run on ARM64, so testing the output code requires an emulator. For this we attempted to use Unicorn (http://www.unicorn-engine.org). Using it involved a fair bit more work than anticipated, as discussed later.
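To make the idea concrete, the sketch below models SIMD in plain Python: an array is split into fixed-width groups of lanes, and one "instruction" is applied to every lane of a group in a single step. The lane count of 4 assumes 32-bit integers in a 128-bit ARM64 vector register. The helper names `split_into_lanes` and `simd_add` are hypothetical and are not part of P0 or the code generator.

```python
LANES = 4  # assumption: four 32-bit integers fit in one 128-bit vector register

def split_into_lanes(values, lanes=LANES):
    """Split a list into consecutive groups of at most `lanes` elements."""
    return [values[i:i + lanes] for i in range(0, len(values), lanes)]

def simd_add(a, b, lanes=LANES):
    """Model a vector add: one step per group of lanes instead of one per element."""
    result, steps = [], 0
    groups = zip(split_into_lanes(a, lanes), split_into_lanes(b, lanes))
    for group_a, group_b in groups:
        # every lane of the group is handled by the same modelled instruction
        result.extend(x + y for x, y in zip(group_a, group_b))
        steps += 1
    return result, steps

a = list(range(12))   # same length as the array [0..11] used in the test program
b = [10] * 12
total, steps = simd_add(a, b)
print(total)
print(steps)
```

With these assumptions a 12-element array needs three modelled vector steps rather than twelve scalar ones.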

## P0.py Front-End
First we discuss the front end. The changes made so far are minimal and incomplete, but they show the basic idea of what the full implementation will look like once the backend is fully developed.

In [1]:
import nbimporter
from sys import argv
import SC  #  used for SC.init, SC.sym, SC.val, SC.error
from SC import TIMES, DIV, MOD, AND, PLUS, MINUS, OR, EQ, NE, LT, GT, \
     LE, GE, PERIOD, COMMA, COLON, RPAREN, RBRAK, OF, THEN, DO, LPAREN, \
     LBRAK, NOT, BECOMES, NUMBER, IDENT, SEMICOLON, END, ELSE, IF, WHILE, \
     ARRAY, RECORD, CONST, TYPE, VAR, PROCEDURE, BEGIN, PROGRAM, EOF, \
     getSym, mark
import ST  #  used for ST.init
from ST import Var, Ref, Const, Type, Proc, StdProc, Int, Bool, Enum, \
     Record, Array, newObj, find, openScope, topScope, closeScope


# first and follow sets for recursive descent parsing

FIRSTFACTOR = {IDENT, NUMBER, LPAREN, NOT}
FOLLOWFACTOR = {TIMES, DIV, MOD, AND, OR, PLUS, MINUS, EQ, NE, LT, LE, GT, GE,
                COMMA, SEMICOLON, THEN, ELSE, RPAREN, RBRAK, DO, PERIOD, END}
FIRSTEXPRESSION = {PLUS, MINUS, IDENT, NUMBER, LPAREN, NOT}
FIRSTSTATEMENT = {IDENT, IF, WHILE, BEGIN}
FOLLOWSTATEMENT = {SEMICOLON, END, ELSE}
FIRSTTYPE = {IDENT, RECORD, ARRAY, LPAREN}
FOLLOWTYPE = {SEMICOLON}
FIRSTDECL = {CONST, TYPE, VAR, PROCEDURE}
FOLLOWDECL = {BEGIN}
FOLLOWPROCCALL = {SEMICOLON, END, ELSE}
STRONGSYMS = {CONST, TYPE, VAR, PROCEDURE, WHILE, IF, BEGIN, EOF}

from sys import stdout

indent = '  '

# parsing procedures

def selector(x):
    """
    Parses
        selector = {"." ident | "[" expression "]"}.
    Assumes x is the entry for the identifier in front of the selector;
    generates code for the selector if no error is reported
    """
    while SC.sym in {PERIOD, LBRAK}:
        if SC.sym == PERIOD:  #  x.f
            getSym()
            if SC.sym == IDENT:
                if type(x.tp) == Record:
                    for f in x.tp.fields:
                        if f.name == SC.val:
                            x = CG.genSelect(x, f); break
                    else: mark("not a field")
                    getSym()
                else: mark("not a record")
            else: mark("identifier expected")
        else:  #  x[y]
            getSym(); y = expression()
            if type(x.tp) == Array:
                if y.tp == Int:
                    if type(y) == Const and \
                       (y.val < x.tp.lower or y.val >= x.tp.lower + x.tp.length):
                        mark('index out of bounds')
                    else: x = CG.genIndex(x, y)
                else: mark('index not integer')
            else: mark('not an array')
            if SC.sym == RBRAK: getSym(); 
            else: mark("] expected")
    return x

def factor():
    """
    Parses
        factor = ident selector |
                 integer  |
                 "("  expression ")" |
                 "not" factor.
    Generates code for the factor if no error is reported
    """
    if SC.sym not in FIRSTFACTOR:
        mark("expression expected"); getSym()
        while SC.sym not in FIRSTFACTOR | STRONGSYMS | FOLLOWFACTOR:
            getSym()
    if SC.sym == IDENT:
        x = find(SC.val)
        if type(x) in {Var, Ref}: x = CG.genVar(x)
        elif type(x) == Const: x = Const(x.tp, x.val); x = CG.genConst(x)
        else: mark('expression expected')
        getSym(); x = selector(x)
    elif SC.sym == NUMBER:
        x = Const(Int, SC.val); x = CG.genConst(x); getSym()
    elif SC.sym == LPAREN:
        getSym(); x = expression()
        if SC.sym == RPAREN: getSym()
        else: mark(") expected")
    elif SC.sym == NOT:
        getSym(); x = factor()
        if x.tp != Bool: mark('not boolean')
        elif type(x) == Const: x.val = 1 - x.val # constant folding
        else: x = CG.genUnaryOp(NOT, x)
    else: x = Const(None, 0)
    return x

importing Jupyter notebook from SC.ipynb
importing Jupyter notebook from ST.ipynb


To extend P0 to handle vector operations, the operators TIMES, DIV, MOD, PLUS, and MINUS had to be altered to accept operands of type Array. In addition, the operands do not have to match in type. For example, a vector can be multiplied by a scalar: [1, 2, 3]*2 = [2, 4, 6].

Note that Vector*Vector does not produce the cross product of the two vectors but their pairwise (element-wise) product. For example, [1,2]*[2,3] = [2,6].
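The intended semantics can be written down as a small reference model in plain Python, which is useful for checking expected results by hand. This is only a sketch: `vector_op` is a hypothetical helper, Python lists stand in for P0 arrays, and rejecting vectors of different lengths is an assumption rather than documented behaviour.

```python
import operator

def vector_op(op, x, y):
    """Apply op pairwise; a scalar operand is paired with every element."""
    x_is_vec, y_is_vec = isinstance(x, list), isinstance(y, list)
    if x_is_vec and y_is_vec:
        if len(x) != len(y):  # assumption: lengths must agree
            raise ValueError("vector lengths differ")
        return [op(a, b) for a, b in zip(x, y)]
    if x_is_vec:
        return [op(a, y) for a in x]
    if y_is_vec:
        return [op(x, b) for b in y]
    return op(x, y)

print(vector_op(operator.mul, [1, 2, 3], 2))    # [2, 4, 6]
print(vector_op(operator.mul, [1, 2], [2, 3]))  # [2, 6]
```

The model keeps the operand order, which matters for the non-commutative operators (div, mod, and minus).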

Parses
    term = factor {("*" | "div" | "mod" | "and") factor}.
Generates code for the term if no error is reported  

In [2]:
def term():
    x = factor()
    while SC.sym in {TIMES, DIV, MOD, AND}:
        op = SC.sym; getSym();
        if op == AND and type(x) != Const: x = CG.genUnaryOp(AND, x)
        y = factor() # x op y
        if op in {TIMES, DIV, MOD}:
            if x.tp == Int == y.tp:
                if type(x) == Const == type(y): # constant folding
                    if op == TIMES: x.val = x.val * y.val
                    elif op == DIV: x.val = x.val // y.val
                    elif op == MOD: x.val = x.val % y.val
                else: x = CG.genBinaryOp(op, x, y)
            ###ADDED CODE###        
            elif type(x.tp) == Array == type(y.tp): # Added array operations
                x = CG.genArrayVectorOp(op,x,y) # new code gen method for array(*|/|%)array operations
            elif (x.tp == Int and type(y.tp) == Array):
                x = CG.genArrayScalarOp(op,y,x) # new code gen method for array(*|/|%)scalar operations
            elif (type(x.tp) == Array and y.tp == Int):
                x = CG.genArrayScalarOp(op,x,y)
            else:
                mark('bad type')
        elif x.tp == Bool == y.tp and op == AND:
            if type(x) == Const: # constant folding
                if x.val: x = y # if x is true, take y, else x
            else: x = CG.genBinaryOp(AND, x, y)
        else: mark('bad type')
    return x
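The four operand combinations that `term` distinguishes can be isolated in a small standalone sketch. The classes below are stand-ins, not the real `ST.Int` and `ST.Array`: the sketch assumes, as the parser code suggests, that `Int` is used directly as a class while array types are instances of `Array`, which is why the array tests compare `type(x.tp)` rather than `x.tp`. The function `classify` is hypothetical.

```python
class Int:
    """Stand-in for ST.Int, which is used as a class, not an instance."""

class Array:
    """Stand-in for ST.Array; each array type is an instance."""
    def __init__(self, base, lower, length):
        self.base, self.lower, self.length = base, lower, length

def classify(xtp, ytp):
    """Return which code generation path the operand types select."""
    if xtp == Int == ytp:
        return 'scalar'
    if type(xtp) == Array == type(ytp):
        return 'array-array'
    if xtp == Int and type(ytp) == Array:
        return 'scalar-array'
    if type(xtp) == Array and ytp == Int:
        return 'array-scalar'
    return 'bad type'

S = Array(Int, 0, 12)  # models: array [0..11] of integer
print(classify(Int, Int))
print(classify(S, S))
print(classify(Int, S))
print(classify(S, Int))
print(S == Array)  # an instance never equals its class
```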

In a very similar fashion, the addition and subtraction operations were extended to support arrays as well.

Parses
    simpleExpression = ["+" | "-"] term {("+" | "-" | "or") term}.
Generates code for the simpleExpression if no error is reported

In [3]:
def simpleExpression():
    if SC.sym == PLUS:
        getSym(); x = term()
    elif SC.sym == MINUS:
        getSym(); x = term()
        if x.tp != Int: mark('bad type')
        elif type(x) == Const: x.val = - x.val # constant folding
        else: x = CG.genUnaryOp(MINUS, x)
    else: x = term()
    while SC.sym in {PLUS, MINUS, OR}:
        op = SC.sym; getSym()
        if op == OR and type(x) != Const: x = CG.genUnaryOp(OR, x)
        y = term() # x op y
        if op in {PLUS, MINUS}:
            if x.tp == Int == y.tp:
                if type(x) == Const == type(y): # constant folding
                    if op == PLUS: x.val = x.val + y.val
                    elif op == MINUS: x.val = x.val - y.val
                else: x = CG.genBinaryOp(op, x, y)
            ###ADDED CODE###
            elif type(x.tp) == Array == type(y.tp): # type check for array-array operation
                x = CG.genArrayVectorOp(op,x,y) # same code gen method as introduced above
            elif (x.tp == Int and type(y.tp) == Array): # type check for scalar-array
                x = CG.genArrayScalarOp(op,y,x)
            elif (type(x.tp) == Array and y.tp == Int): # type check for array-scalar
                x = CG.genArrayScalarOp(op,x,y)
            else: mark('bad type')
        elif x.tp == Bool == y.tp and op == OR:
            if type(x) == Const: # constant folding
                if not x.val: x = y # if x is false, take y, else x
            else: x = CG.genBinaryOp(OR, x, y)
        else: mark('bad type')
    return x

In [5]:
def expression():
    """
    Parses
        expression = simpleExpression
                     {("=" | "<>" | "<" | "<=" | ">" | ">=") simpleExpression}.
    Generates code for the expression if no error is reported
    """
    x = simpleExpression()
    while SC.sym in {EQ, NE, LT, LE, GT, GE}:
        op = SC.sym
        getSym(); y = simpleExpression() # x op y
        if x.tp == Int == y.tp:
            x = CG.genRelation(op, x, y)
        else: mark('bad type')
    return x

def compoundStatement(l):
    """
    Parses
        compoundStatement(l) =
            "begin" 
            statement(l + 1) {";" statement(l + 1)}
            "end" 
    Generates code for the compoundStatement if no error is reported
    """
    if SC.sym == BEGIN: getSym()
    else: mark("'begin' expected")
    x = statement(l + 1)
    while SC.sym == SEMICOLON or SC.sym in FIRSTSTATEMENT:
        if SC.sym == SEMICOLON: getSym()
        else: mark("; missing")
        y = statement(l + 1); x = CG.genSeq(x, y)
    if SC.sym == END: getSym()
    else: mark("'end' expected")
    return x

def statement(l):
    """
    Parses
        statement =
            ident selector ":=" expression |
            ident "(" [expression
                {","  expression}] ")"  |
            compoundStatement(l) |
            "if"  expression
                "then" statement(l + 1)
                ["else" statement(l + 1)] |
            "while" expression "do" statement.
    Generates code for the statement if no error is reported
    """
    if SC.sym not in FIRSTSTATEMENT:
        mark("statement expected"); getSym()
        while SC.sym not in FIRSTSTATEMENT | STRONGSYMS | FOLLOWSTATEMENT:
            getSym()
    if SC.sym == IDENT:
        x = find(SC.val); getSym()
        x = CG.genVar(x)
        if type(x) in {Var, Ref}:
            x = selector(x)
            if SC.sym == BECOMES:
                getSym(); y = expression()
                if x.tp == y.tp: #in {Bool, Int}: # and not SC.error: type(y) could be Type 
                    #if type(x) == Var: ### and type(y) in {Var, Const}: incomplete, y may be Reg
                    x = CG.genAssign(x, y)
                    #else: mark('illegal assignment')
                elif type(x.tp) == type(y.tp) == Array:
                    x = CG.genDeferredAssign(x, y)
                else: mark('incompatible assignment')
            elif SC.sym == EQ:
                mark(':= expected'); getSym(); y = expression()
            else: mark(':= expected')
        elif type(x) in {Proc, StdProc}:
            fp, i = x.par, 0  #  list of formals, count of actuals
            if SC.sym == LPAREN:
                getSym()
                if SC.sym in FIRSTEXPRESSION:
                    y = expression()
                    if i < len(fp):
                        if (type(fp[i]) == Var or type(y) == Var) and \
                           fp[i].tp == y.tp:
                            if type(x) == Proc: CG.genActualPara(y, fp[i], i)
                            i = i + 1
                        else: mark('illegal parameter mode')
                    else: mark('extra parameter')
                    while SC.sym == COMMA:
                        getSym()
                        y = expression()
                        if i < len(fp):
                            if (type(fp[i]) == Var or type(y) == Var) and \
                               fp[i].tp == y.tp:
                                if type(x) == Proc: CG.genActualPara(y, fp[i], i)
                                i = i + 1
                            else: mark('illegal parameter mode')
                        else: mark('extra parameter')
                if SC.sym == RPAREN: getSym()
                else: mark("')' expected")
            if i < len(fp): mark('too few parameters')
            if type(x) == StdProc:
                if x.name == 'read': x = CG.genRead(y)
                elif x.name == 'write': x = CG.genWrite(y)
                elif x.name == 'writeln': x = CG.genWriteln()
            else: x = CG.genCall(x)
        else: mark("variable or procedure expected")
    elif SC.sym == BEGIN: x = compoundStatement(l + 1)
    elif SC.sym == IF:
        getSym(); x = expression();
        if x.tp == Bool: x = CG.genCond(x)
        else: mark('boolean expected')
        if SC.sym == THEN: getSym()
        else: mark("'then' expected")
        y = statement(l + 1)
        if SC.sym == ELSE:
            if x.tp == Bool: y = CG.genThen(x, y);
            getSym()
            z = statement(l + 1);
            if x.tp == Bool: x = CG.genIfElse(x, y, z)
        else:
            if x.tp == Bool: x = CG.genIfThen(x, y)
    elif SC.sym == WHILE:
        getSym(); t = CG.genTarget(); x = expression()
        if x.tp == Bool: x = CG.genCond(x)
        else: mark('boolean expected')
        if SC.sym == DO: getSym()
        else: mark("'do' expected")
        y = statement(l + 1)
        if x.tp == Bool: x = CG.genWhile(t, x, y)
    else: x = None
    return x

def typ():
    """
    Parses
        type = ident |
               "array" "[" expression ".." expression "]" "of" type |
               "record" typedIds {";" typedIds} "end".
    Returns a type descriptor 
    """
    if SC.sym not in FIRSTTYPE:
        getSym(); mark("type expected")
        while SC.sym not in FIRSTTYPE | STRONGSYMS | FOLLOWTYPE:
            getSym()
    if SC.sym == IDENT:
        ident = SC.val; x = find(ident); getSym()
        if type(x) == Type: x = Type(x.tp)
        else: mark('not a type'); x = Type(None)
    elif SC.sym == ARRAY:
        getSym()
        if SC.sym == LBRAK: getSym()
        else: mark("'[' expected")
        x = expression()
        if SC.sym == PERIOD: getSym()
        else: mark("'.' expected")
        if SC.sym == PERIOD: getSym()
        else: mark("'.' expected")
        y = expression()
        if SC.sym == RBRAK: getSym()
        else: mark("']' expected")
        if SC.sym == OF: getSym()
        else: mark("'of' expected")
        z = typ().tp;
        if type(x) != Const or x.val < 0:
            mark('bad lower bound'); x = Type(None)
        elif type(y) != Const or y.val < x.val:
            mark('bad upper bound'); y = Type(None)
        else: x = Type(CG.genArray(Array(z, x.val, y.val - x.val + 1)))
    elif SC.sym == RECORD:
        getSym(); openScope(); typedIds(Var)
        while SC.sym == SEMICOLON:
            getSym(); typedIds(Var)
        if SC.sym == END: getSym()
        else: mark("'end' expected")
        r = topScope(); closeScope()
        x = Type(CG.genRec(Record(r)))
    else: x = Type(None)
    return x

def typedIds(kind):
    """
    Parses
        typedIds = ident {"," ident} ":" type.
    Updates current scope of symbol table
    Assumes kind is Var or Ref and applies it to all identifiers
    Reports an error if an identifier is already defined in the current scope
    """
    if SC.sym == IDENT: tid = [SC.val]; getSym()
    else: mark("identifier expected"); tid = []
    while SC.sym == COMMA:
        getSym()
        if SC.sym == IDENT: tid.append(SC.val); getSym()
        else: mark('identifier expected')
    if SC.sym == COLON:
        getSym(); tp = typ().tp
        if tp != None:
            for i in tid: newObj(i, kind(tp))
    else: mark("':' expected")

def declarations(allocVar):
    """
    Parses
        declarations =
            {"const" ident "=" expression ";"}
            {"type" ident "=" type ";"}
            {"var" typedIds ";"}
            {"procedure" ident ["(" [["var"] typedIds {";" ["var"] typedIds}] ")"] ";"
                declarations compoundStatement ";"}.
    Updates current scope of symbol table.
    Reports an error if an identifier is already defined in the current scope.
    For each procedure, code is generated
    """
    if SC.sym not in FIRSTDECL | FOLLOWDECL:
        getSym(); mark("'begin' or declaration expected")
        while SC.sym not in FIRSTDECL | STRONGSYMS | FOLLOWDECL: getSym()
    while SC.sym == CONST:
        getSym()
        if SC.sym == IDENT:
            ident = SC.val; getSym()
            if SC.sym == EQ: getSym()
            else: mark("= expected")
            x = expression()
            if type(x) == Const: newObj(ident, x)
            else: mark('expression not constant')
        else: mark("constant name expected")
        if SC.sym == SEMICOLON: getSym()
        else: mark("; expected")
    while SC.sym == TYPE:
        getSym()
        if SC.sym == IDENT:
            ident = SC.val; getSym()
            if SC.sym == EQ: getSym()
            else: mark("= expected")
            x = typ(); newObj(ident, x)  #  x is of type ST.Type
            if SC.sym == SEMICOLON: getSym()
            else: mark("; expected")
        else: mark("type name expected")
    start = len(topScope())
    while SC.sym == VAR:
        getSym(); typedIds(Var)
        if SC.sym == SEMICOLON: getSym()
        else: mark("; expected")
    varsize = allocVar(topScope(), start)
    while SC.sym == PROCEDURE:
        getSym()
        if SC.sym == IDENT: getSym()
        else: mark("procedure name expected")
        ident = SC.val; newObj(ident, Proc([])) #  entered without parameters
        sc = topScope()
        CG.procStart(); openScope() # new scope for parameters and body
        if SC.sym == LPAREN:
            getSym()
            if SC.sym in {VAR, IDENT}:
                if SC.sym == VAR: getSym(); typedIds(Ref)
                else: typedIds(Var)
                while SC.sym == SEMICOLON:
                    getSym()
                    if SC.sym == VAR: getSym(); typedIds(Ref)
                    else: typedIds(Var)
            else: mark("formal parameters expected")
            fp = topScope()
            sc[-1].par = fp[:] #  procedure parameters updated
            if SC.sym == RPAREN: getSym()
            else: mark(") expected")
        else: fp = []
        parsize = CG.genFormalParams(fp)
        if SC.sym == SEMICOLON: getSym()
        else: mark("; expected")
        localsize = declarations(CG.genLocalVars)
        CG.genProcEntry(ident, parsize, localsize)
        x = compoundStatement(1); CG.genProcExit(x, parsize, localsize)
        closeScope() #  scope for parameters and body closed
        if SC.sym == SEMICOLON: getSym()
        else: mark("; expected")
    return varsize

def program():
    """
    Parses
        program = "program" ident ";" declarations compoundStatement(1).
    Generates code if no error is reported
    """
    newObj('boolean', Type(Bool)); Bool.size = 4
    newObj('integer', Type(Int)); Int.size = 4
    newObj('true', Const(Bool, 1))
    newObj('false', Const(Bool, 0))
    newObj('read', StdProc([Ref(Int)]))
    newObj('write', StdProc([Var(Int)]))
    newObj('writeln', StdProc([]))
    CG.progStart()
    if SC.sym == PROGRAM: getSym()
    else: mark("'program' expected")
    ident = SC.val
    if SC.sym == IDENT: getSym()
    else: mark('program name expected')
    if SC.sym == SEMICOLON: getSym()
    else: mark('; expected')
    declarations(CG.genGlobalVars); CG.progEntry(ident)
    x = compoundStatement(1)
    return CG.progExit(x)

def compileString(src, dstfn = None, target = 'armv8'):
    """Compiles string src; if dstfn is provided, the code is written to that
    file, otherwise printed on the screen"""
    global CG
    #  used for init, genRec, genArray, progStart, genGlobalVars, \
    #  progEntry, progExit, procStart, genFormalParams, genActualPara, \
    #  genLocalVars, genProcEntry, genProcExit, genSelect, genIndex, \
    #  genVar, genConst, genUnaryOp, genBinaryOp, genRelation, genSeq, \
    #  genAssign, genCall, genRead, genWrite, genWriteln, genCond, \
    #  genIfThen, genThen, genIfElse, genTarget, genWhile
    if target == 'mips': import CGmips as CG
    elif target == 'ast': import CGast as CG
    elif target == 'armv8': import CGARMv8 as CG
    else: print('unknown target'); return
    SC.init(src)
    ST.init()
    CG.init()
    p = program()
    if p != None and not SC.error:
        if dstfn == None: print(p)
        else:
            with open(dstfn, 'w') as f: f.write(p);

def compileFile(srcfn):
    if srcfn.endswith('.p'):
        with open(srcfn, 'r') as f: src = f.read()
        dstfn = srcfn[:-2] + '.s'
        compileString(src, dstfn)
    else: print("'.p' file extension expected")

Finally, because of the change in code generation, the default code generator was changed from CGmips.py to **CGARMv8.py**.

In [7]:
compileString("""
program p;
  type S = array [0..11] of integer;
  var y: integer;
  var x: S;
  var z: S;
  begin
    y := 7;
    x := z * y;
    x := x * x
  end""")

ERROR:root:An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (1, 5))



AssertionError: 

## Backend documentation can be seen in CGARMv8.ipynb

## Unicorn CPU Emulation
### Beyond this point code does not compile correctly

To set up a virtual testing environment with Unicorn, we downloaded the libraries available at http://www.unicorn-engine.org and installed the Python binding. We had partial success: everything works apart from the advanced SIMD instructions.

The Unicorn script reads binary code from standard input, so compiled binary output can be piped directly into the emulator with a terminal command such as "hex.out | python unicorn.py".

In [None]:
from __future__ import print_function
from unicorn import *
from unicorn.arm64_const import *
import sys

# read binary code on stdin as raw bytes
# (sys.stdin.buffer on Python 3, sys.stdin itself on Python 2)
code = getattr(sys.stdin, 'buffer', sys.stdin).read()

# code to be emulated
ARM64_CODE = code # our binary arm code from stdin

# memory address where emulation starts
ADDRESS = 0x1000000

In [None]:
# callback for tracing instructions
def hook_code(uc, address, size, user_data):
    print(">>> Tracing instruction at 0x%x, instruction size = 0x%x" %(address, size))

def hook_intr(uc, exception_index, user_data):

    print("exception index: %d"%exception_index)
    uc.emu_stop()

Address space is allocated for the emulation, the architecture is specified, and the code is run. Registers can then be read to verify the output.

In [None]:
print("Emulate arm code")
try:
    # Initialize emulator in ARM-64bit mode
    mu = Uc(UC_ARCH_ARM64, UC_MODE_ARM)

    # map 10MB memory for this emulation
    mu.mem_map(ADDRESS, 10 * 1024 * 1024)

    # write machine code to be emulated to memory
    mu.mem_write(ADDRESS, ARM64_CODE)

    # tracing all instructions with customized callback
    mu.hook_add(UC_HOOK_CODE, hook_code)

    mu.hook_add(UC_HOOK_INTR, hook_intr)

    # emulate code in infinite time & unlimited instructions
    mu.emu_start(ADDRESS, ADDRESS + len(ARM64_CODE))

    # now print out some registers
    print("Emulation done. Below is the CPU context")

    reg_XN = []
    reg_XN.append(mu.reg_read(UC_ARM64_REG_X9))
    reg_XN.append(mu.reg_read(UC_ARM64_REG_X10))
    reg_XN.append(mu.reg_read(UC_ARM64_REG_X11))
    reg_XN.append(mu.reg_read(UC_ARM64_REG_X12))
    reg_XN.append(mu.reg_read(UC_ARM64_REG_X13))
    reg_XN.append(mu.reg_read(UC_ARM64_REG_X14))
    reg_XN.append(mu.reg_read(UC_ARM64_REG_X15))
    reg_XN.append(mu.reg_read(UC_ARM64_REG_X16))

    for j, value in enumerate(reg_XN):
        print(">>> X%d = 0x%x" %(j+9, value))

 
except UcError as e:
    print("ERROR: %s" % e)

All of this works well except for SIMD instructions, which currently raise a generic error.

## Parsing ARM64 Output For Unicorn

The Unicorn script above accepts only binary instructions, which have to be extracted from our generated code. To test the generated code, it must first be run through a gcc-arm64 command, which produces a disassembly listing; the parser below reads this listing from out.dis.

All we need from the listing is the instruction encodings (e.g. "528000ed", "b000046a"), so we developed a simple parser that extracts them and emits a single stream of binary code for Unicorn. The listing shows each 4-byte instruction as eight hex digits with the bytes in the reverse of the order Unicorn accepts, so the parser starts at the end of each number and emits its four bytes from last to first. This can be seen below.

In [None]:
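import sys

code = bytearray()  # raw machine code, in the byte order Unicorn accepts

def find_main(f):
    """Process the lines after <main>, stopping at the first line containing "00000000"."""
    in_main = False
    for line in f:
        if in_main:
            if "00000000" not in line:
                get_ins_bytes(line)
            else: break
        elif "<main>" in line:
            in_main = True

def get_ins_bytes(line):
    """Emit the four bytes of the 8-digit hex word after the first tab, last byte first."""
    i = line.index('\t') + 1
    for j in range(8, 0, -2):
        printHexByte(line[i+j-2:i+j])

def printHexByte(hex_byte):
    """Append one byte, given as two hex digits, to the output buffer."""
    code.append(int(hex_byte, 16))

with open('out.dis', 'r') as f:
    find_main(f)
# write raw bytes (sys.stdout.buffer on Python 3, sys.stdout itself on Python 2)
getattr(sys.stdout, 'buffer', sys.stdout).write(bytes(code))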
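The byte reversal performed by the parser can be illustrated on the two example words mentioned above. This is a minimal Python 3 sketch; `word_to_le_bytes` is a hypothetical helper and is not part of the parser.

```python
def word_to_le_bytes(hex_word):
    """Convert an 8-digit hex instruction word to its 4 bytes, last byte first."""
    if len(hex_word) != 8:
        raise ValueError("expected 8 hex digits")
    return bytes.fromhex(hex_word)[::-1]

words = ["528000ed", "b000046a"]
machine_code = b"".join(word_to_le_bytes(w) for w in words)
print(machine_code.hex())  # ed0080526a0400b0
```

Each word is reversed on its own; the order of the instructions in the stream is unchanged.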

## References
