### P0 Code Generator for MIPS
#### Emil Sekerinski, March 2017

The generated code is kept in memory and all code generation procedures continuously append to that code: procedure `genProgStart` initializes the generator, then `gen`-prefixed procedures are to be called for P0 constructs in the order in which they are recognized by a recursive descent parser, and finally procedure `genProgExit` returns the generated code in assembly language as a string in a format that can be read in by the SPIM simulator. The generation procedures are:
- `genBool`, `genInt`, `genRec`, `genArray`
- `genProgStart`, `genGlobalVars`, `genProgEntry`, `genProgExit`
- `genProcStart`, `genFormalParams`, `genLocalVars`, `genProcEntry`, `genProcExit`
- `genSelect`, `genIndex`, `genVar`, `genConst`, `genUnaryOp`, `genBinaryOp`, `genRelation`
- `genAssign`, `genActualPara`, `genCall`, `genRead`, `genWrite`, `genWriteln`
- `genSeq`, `genCond`, `genIfThen`, `genThen`, `genIfElse`, `genTarget`, `genWhile`

Errors in the code generator are reported by calling `mark` of the scanner. The data types of the symbol table are used to specify the P0 constructs for which code is to be generated.

In [None]:
import nbimporter
from SC import TIMES, DIV, MOD, AND, PLUS, MINUS, OR, EQ, NE, LT, GT, LE, \
     GE, NOT, TILDE, AMP, BAR, mark  # TILDE, AMP, BAR are used below; assumed to be defined in SC
from ST import Var, Ref, Const, Type, Proc, StdProc, Int, Bool

The following variables determine the state of the code generator:

- `curlev` is the current level of nesting of P0 procedures
- `regs` is the set of available MIPS registers for expression evaluation
- `label` is a counter for generating new labels
- `declarations` is a list of lists of declared variables (destined for the `.data` section), with one inner list per scope. Each entry of an inner list is a pair of the name and size of the declared variable
- `instructions` is a list of lists of triples; each list is a list of triples for the current scope. Each triple consists of three (possibly empty) strings
   - a label
   - an instruction, possibly with operands
   - a target (for branch and jump instructions)
- `compiledProcedures` is a list of post-optimization procedure MIPS instructions; these will be inserted into the program after all of the optimizations have run
- `fieldVars` is a list of dicts, with the keys being the names of fields in records, and the values being their placeholder variable names. See [`genSelect()`](#gen-select) for more details.
- `usedGlobalVars` is a single map of names of global variables to a boolean flag which is `True` if they are used and `False` otherwise.

Procedure `genProgStart()` initializes these variables. Registers `$t0` to `$t8` are used as general-purpose registers.

In [9]:
def genProgStart():
    global declarations, instructions, compiledProcedures, procNames, fieldVars, usedGlobalVars
    global curlev, label, vname, regs, writtenText
    declarations = [ [] ]
    instructions = [ [] ]
    compiledProcedures = []
    procNames = []
    fieldVars = [ {} ]
    usedGlobalVars  = {}
    curlev = 0
    label = 0
    vname = 0
    regs = {'$t0', '$t1', '$t2', '$t3', '$t4', '$t5', '$t6', '$t7', '$t8'}
    writtenText = False
    putInstr('.data')

For each global variable, `genGlobalVars(sc, start)` adds the identifier to the list of global declarations and marks it as unused. The parameter `sc` contains the top scope with all declarations parsed so far; only variable declarations from index `start` on in the top scope are considered. As MIPS instructions are not allowed to be identifiers, all variables get `_` as suffix to avoid a name clash. We also insert an entry into the `usedGlobalVars` map, flagging the variable as unused for now.

In [None]:
def genGlobalVars(sc, start):
    for i in range(len(sc) - 1, start - 1, - 1):
        if type(sc[i]) == Var:
            sc[i].adr = sc[i].name + '_'
            declarations[curlev].append( (sc[i].adr, sc[i].tp.size) )
            usedGlobalVars[sc[i].adr] = False

Procedure `genProgEntry(ident)` takes the program's name as a parameter. Directives for marking the beginning of the main program are generated; the program's name is not used.

In [None]:
def genProgEntry(ident):
    putInstr('.globl main')
    putInstr('.ent main')
    putLab('main')

Procedure `genProgExit(x)` takes parameter `x` with the result of previous `gen-` calls, generates code for exiting the program and directives for marking the end of the main program, runs all available optimizations, consolidates the various arrays used to store instructions (`declarations`, `compiledProcedures`, and `instructions`), and returns the complete assembly code.

In [11]:
def assembly(l, i, t):
    """Convert label l, instruction i, target t to assembly format"""
    return (l + ':\t' if l else '\t') + i + (', ' + t if t else '')

def genProgExit(x):
    putInstr('li $v0, 10')
    putInstr('syscall')
    putInstr('.end main')
    runOptimizations()
    combineInstructions()
    return '\n'.join(assembly(l, i, t) for (l, i, t) in instructions[-1])
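
As a small illustration of the triple format, the sketch below repeats `assembly` and renders a hand-written list of triples. The sample instructions are made up for this example and are not output of the generator.

```python
def assembly(l, i, t):
    """Convert label l, instruction i, target t to assembly format"""
    return (l + ':\t' if l else '\t') + i + (', ' + t if t else '')

# Hypothetical triples, in the shape they have inside instructions[-1]
sample = [
    ('', '.text', ''),
    ('main', '', ''),               # label only
    ('', 'lw $t0, x_', ''),         # instruction without a target
    ('', 'beq $t0, $0', 'L1'),      # branch: the target is kept separately
    ('L1', 'li $v0, 10', ''),       # label and instruction on one line
]
text = '\n'.join(assembly(l, i, t) for (l, i, t) in sample)
print(text)
```

Keeping the branch target as a separate component is what lets the later passes look at targets (for example of `jal`) without parsing the instruction string.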

Procedure `combineInstructions` consolidates the various arrays used to store instructions (`declarations`, `compiledProcedures`, and `instructions`) into the main `instructions` array.

In [None]:
def combineInstructions():
    prelude = [instructions[curlev][0]] + declarations[curlev] + [('', '.text', '')]
    for proc in compiledProcedures:
        prelude += proc
    prelude += instructions[curlev][1:]
    instructions[curlev] = prelude

The available optimizations run in the following order:
1. First we eliminate unused procedures
1. Then we eliminate dead stores; these are `sw` instructions to memory addresses which are never read from
1. Then we eliminate variables that are never used

In [1]:
def runOptimizations():
    removeUnusedProcedures()
    deadStoreElimination()
    removeUnusedVariables(genDataDecl)

The `removeUnusedProcedures` function processes the global instruction block, looking for any subroutine calls. We do this by reading all of the instructions in order, looking for jump-and-link (`jal`) instructions. If we find a `jal`, we know that the linked procedure is called and therefore must be kept.

Because subroutines can call other subroutines, whenever we find a `jal` in the global instruction block we must also recursively search the instructions of the called procedure for additional `jal` instructions via the `recursiveProcedureSearch` function.

Once we have finished processing the global instruction block, we can eliminate all procedures that are never accessed.

In [2]:
def removeUnusedProcedures():
    global instructions, compiledProcedures
    currentInstructions = instructions[curlev]
    procedureNames = []
    temp = []
    for i in range(len(currentInstructions)):
        (l,ins,target) = currentInstructions[i]   
        if (ins == 'jal'):
            recursiveProcedureSearch(procedureNames, temp, target)
    compiledProcedures = temp

Similar to `removeUnusedProcedures`, the `recursiveProcedureSearch` function processes the instruction block of a procedure, looking for any additional subroutine calls. We do this by reading all of the instructions of the current procedure in order, looking for jump-and-link (`jal`) instructions. If we find a `jal`, we know that the linked procedure is also called and therefore must be kept.

Because subroutines can call other subroutines, whenever we find a `jal` in a procedure's instruction block we recursively search the instructions of the called procedure as well. The list `procedureNames` records the procedures already visited, so each procedure is kept only once and recursive calls do not cause an endless search.

In [3]:
def recursiveProcedureSearch(procedureNames, temp, target):
    for procedure in range(len(compiledProcedures)):
        searchCurrentProcedure = False
        for line in range(len(compiledProcedures[procedure])):
            if (line == 2 and compiledProcedures[procedure][line][0] == target):
                if compiledProcedures[procedure][line][0] not in procedureNames:
                    procedureNames.append(compiledProcedures[procedure][line][0])
                    temp.append(compiledProcedures[procedure])
                    searchCurrentProcedure = True
            if (searchCurrentProcedure and compiledProcedures[procedure][line][1] == 'jal'):
                recursiveProcedureSearch(procedureNames, temp, compiledProcedures[procedure][line][2])
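
The search amounts to a reachability computation over the call graph. The following is a simplified sketch of that idea, not the code above: the names `reachable_procedures` and `proc` are hypothetical, and it assumes, as the generator does, that the entry label of a procedure is the third triple of its block.

```python
def reachable_procedures(main_block, procedures):
    """Return the names of procedures reachable from main_block via jal."""
    by_name = {p[2][0]: p for p in procedures}   # entry label is the third triple
    reached = []

    def visit(block):
        for _, ins, target in block:
            if ins == 'jal' and target in by_name and target not in reached:
                reached.append(target)           # mark before descending
                visit(by_name[target])

    visit(main_block)
    return reached

def proc(name, *calls):
    """Hypothetical helper: a minimal procedure block that calls `calls`."""
    header = [('', '.globl ' + name, ''), ('', '.ent ' + name, ''), (name, '', '')]
    return header + [('', 'jal', c) for c in calls] + [('', 'jr $ra', '')]

procedures = [proc('p', 'q'), proc('q', 'q'), proc('r')]
main_block = [('main', '', ''), ('', 'jal', 'p')]
print(reachable_procedures(main_block, procedures))
```

Here `r` is never called and would be dropped, while the self-recursive `q` is visited only once.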

The `deadStoreElimination` function processes the global instruction block and the instructions of each procedure, looking for stores to memory locations that are unused. We do this by reading all of the instructions in reverse order, looking for writes (`sw` instructions) which are not followed by a read (`lw` instruction) of the same address. If, reading in reverse order, we find a `sw` without having found a matching `lw`, we know that the value being written is never read, so we can eliminate that `sw` instruction.

Because each block is analysed on its own, there is currently a significant problem with respect to procedure calls: we cannot detect when a global variable is modified inside of one procedure and then read inside of another (or in the global scope).

In [None]:
def deadStoreElimination():
    global compiledProcedures
    instructions[curlev] = deadStoreEliminationFromBlock(instructions[curlev])
    newCompiledProcedures = []
    for compiledProc in compiledProcedures:
        newCompiledProcedures.append(deadStoreEliminationFromBlock(compiledProc))
    compiledProcedures = newCompiledProcedures

Dead store elimination looks for `sw` instructions not followed by a `lw` instruction from the same address.

1. We go through all of the instructions in reverse order. 
1. If we have an instruction which is not a `sw` or `lw`, we keep it (add it to the `result` list). 
1. If it is a `lw` instruction, we determine the location being loaded from, record it in the `readAddresses` dict, and keep the instruction
1. If it is a `sw` instruction, we determine the location being stored to. 
    - Then we check to see if that location is in the `readAddresses` dict. 
    - If so, we have encountered a read from this address already, so we need to keep the instruction
    - Otherwise, we can drop the instruction
1. Finally, we reverse the `result` list and return it

In [None]:
def deadStoreEliminationFromBlock(block):
    readAddresses = {}
    result = []
    for elem in block[::-1]:
        (_,ins,_) = elem
        # assuming all targets for branch/jump instructions are labels
        if ins.startswith('lw'):
            # this is a read
            t,reg,loc = ins.split()
            readAddresses[loc] = True
            result.append(elem)
        elif ins.startswith('sw'):
            # this is a write
            t,reg,loc = ins.split()
            # if we're writing here and we've read from it before, keep the instruction
            # otherwise we discard it
            if loc in readAddresses and readAddresses[loc]:
                result.append(elem)
                readAddresses[loc] = False
        else:
            result.append(elem)
    return result[::-1]
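
To make the backward scan concrete, here is a self-contained sketch of the same idea applied to a small hand-written block. The function name `eliminate_dead_stores` and the sample block are assumptions for illustration; a set stands in for the `readAddresses` dict.

```python
def eliminate_dead_stores(block):
    """Backward scan over (label, instruction, target) triples."""
    pending_reads = set()    # addresses loaded later and not yet matched by a store
    kept = []
    for elem in reversed(block):
        ins = elem[1]
        if ins.startswith('lw'):
            pending_reads.add(ins.split()[2])
            kept.append(elem)
        elif ins.startswith('sw'):
            loc = ins.split()[2]
            if loc in pending_reads:         # a later load needs this value
                pending_reads.discard(loc)
                kept.append(elem)
        else:
            kept.append(elem)
    return kept[::-1]

block = [
    ('', 'addi $t0, $0, 1', ''),
    ('', 'sw $t0, x_', ''),      # overwritten before any load: dropped
    ('', 'addi $t0, $0, 2', ''),
    ('', 'sw $t0, x_', ''),      # loaded on the next line: kept
    ('', 'lw $t1, x_', ''),
    ('', 'sw $t1, y_', ''),      # never loaded in this block: dropped
]
for elem in eliminate_dead_stores(block):
    print(elem[1])
```

The last store to `y_` is removed because the scan only sees this one block, which is exactly the limitation with global variables described above.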

The `removeUnusedVariables` function detects unused variables and removes their declarations from the declarations list. Every time we encounter a variable being used, we check if it is a global variable. If so, we mark that global variable as being used. If not, we add it to the list of used variable declarations for the current scope.

In [None]:
def removeUnusedVariables(f):
    global declarations, usedGlobalVars
    currentInstructions = instructions[curlev]
    temp = []
    completed = []
    realAddrs = {}
    s = 0
    for i in range(len(currentInstructions)):
        (l,ins,target) = currentInstructions[i]    
        for decl in declarations[curlev]:
            name = decl[0]
            size = decl[1]
            if name in completed: continue
            if name in usedGlobalVars or name in ins or (type(target) == str and name in target):
                completed.append(name)
                s = s + size
                realAddrs[name] = -s - 8
                if name not in usedGlobalVars:
                    temp.append( f(decl) )
                elif not usedGlobalVars[name]:
                    usedGlobalVars[name] = True
    declarations[curlev] = temp
    return realAddrs,s
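
The address assignment inside the loop can be looked at in isolation. The sketch below (with the hypothetical name `assign_offsets`) assumes the list of used variables is already known and only reproduces the arithmetic `-s - 8`, where the 8 bytes below the frame pointer hold the saved frame pointer and return address.

```python
def assign_offsets(used_decls):
    """Map each used (name, size) declaration to an FP-relative offset."""
    s = 0
    real_addrs = {}
    for name, size in used_decls:
        s += size
        real_addrs[name] = -s - 8    # below saved $fp (FP-4) and $ra (FP-8)
    return real_addrs, s

addrs, total = assign_offsets([('__V1', 4), ('__V2', 8), ('__V3', 4)])
print(addrs, total)
```

With these sizes the variables land at offsets -12, -20 and -24, and the total of 16 bytes is what `processProcInstructions` later turns into the frame size.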

The `genDataDecl` function takes a placeholder variable declaration consisting of a tuple of `(name, size)` and translates it into the appropriate `.space` directive to precede the program.

The `genProcDecl` function accepts the same placeholder declaration but does nothing with it. We process procedure declarations later.

In [None]:
def genDataDecl(decl):
    return (decl[0], '.space ' + str(decl[1]), '')

def genProcDecl(decl):
    return decl

The `processProcInstructions` function accepts a dictionary with string keys and integer values. This dictionary contains an entry for each local variable used in the procedure, with the key being that variable's placeholder name and the value being its FP-relative address, as computed by `removeUnusedVariables`.

The function also accepts an integer argument with the total size used by the local variables.

With this information, we can replace the placeholder variable names with their exact stack addresses, and replace the `__LOCAL_`-prefixed procedure name with the amount of stack space required for the frame (the local variables plus 8 bytes for the saved frame pointer and return address). We do this by going through all of the instructions for the procedure and performing a string replacement.

In [None]:
def processProcInstructions(realAddrs, size):
    global instructions, procNames
    currentInstructions = instructions[curlev]
    procName = '__LOCAL_' + procNames[-1]
    localsize = str(size + 8)
    for i in range(len(currentInstructions)):
        (l,ins,target) = currentInstructions[i]
        if procName in ins:
            ins = ins.replace(procName, localsize)
        for varAddr in realAddrs:
            if type(varAddr) != str:
                continue
            varSize = str(realAddrs[varAddr])            
            if varAddr in ins:
                ins = ins.replace(varAddr, varSize)
            if type(target) == str and varAddr in target:
                target = target.replace(varAddr, varSize)
        currentInstructions[i] = (l, ins, target)
    instructions[curlev] = currentInstructions
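
One thing to watch with plain substring replacement is that one placeholder can be a prefix of another, for example `__V1` and `__V10`. The sketch below shows the effect and one possible way around it, matching whole placeholders with a regular expression. The function `substitute` is a hypothetical alternative, not part of the generator.

```python
import re

PLACEHOLDER = re.compile(r'__V\d+')   # assumes names produced by newVarName()

def substitute(ins, real_addrs):
    """Replace every whole placeholder in ins by its address, if known."""
    return PLACEHOLDER.sub(
        lambda m: str(real_addrs.get(m.group(0), m.group(0))), ins)

real_addrs = {'__V1': -12, '__V10': -48}
ins = 'lw $t0, __V10($fp)'

naive = ins.replace('__V1', str(real_addrs['__V1']))   # prefix is replaced first
print(naive)                          # the trailing 0 of __V10 is left behind
print(substitute(ins, real_addrs))
```

Processing the placeholders from longest to shortest would be another way to obtain the same result while keeping plain `str.replace`.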

Procedure `newVarName()` generates a new unique variable placeholder name on each call.

In [None]:
def newVarName():
    global vname
    vname += 1
    return '__V' + str(vname)

<a id="gen-select"></a>

Procedure `genSelect(x, f)` "generates code" for `x.f`, provided `f` is in `x.fields`. Only `x.adr` is updated, no code is generated. An updated item is returned.

In essence, we treat `x.f` as an entirely different variable than any other field in `x`. In practice, two fields in `x` will typically end up next to each other in memory. However with these "optimizations" we don't guarantee that all fields in a record will occupy a contiguous memory space.

We do this by giving every field a placeholder name (via `newVarName()`) and storing the field name -> placeholder name association in the `fieldVars` dict.

Every time we encounter a field of a record, we perform a lookup for that field variable. If we find the field variable in the dict, we have seen it before, and so we reuse the placeholder name that was generated for that field. Otherwise we generate a new placeholder and declare it in the current scope.

In [None]:
def genSelect(x, f):
    global fieldVars
    x.tp = f.tp
    if type(x.adr) == int:
        x.adr = x.adr + f.offset
    else:
        # lookup the field name to see if we've already seen it
        fieldName = '__' + str(x) + '.' + str(f)
        if fieldName not in fieldVars[curlev]:
            tempName = newVarName()
            declarations[curlev].append( (tempName, f.tp.size) )
            fieldVars[curlev][fieldName] = tempName
            x.adr = tempName
        else:
            x.adr = fieldVars[curlev][fieldName]
    return x
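
The lookup is a plain memoization on the field key. As a rough sketch, with strings standing in for the items `x` and `f` and with the hypothetical names `placeholder_for`, `field_vars` and `declared`:

```python
import itertools

_counter = itertools.count(1)
field_vars = {}    # field key -> placeholder name (one such dict per scope)
declared = []      # (placeholder, size) pairs for the current scope

def placeholder_for(record_name, field_name, size):
    """Return the placeholder for record.field, creating it on first use."""
    key = '__' + record_name + '.' + field_name
    if key not in field_vars:
        name = '__V' + str(next(_counter))
        declared.append((name, size))
        field_vars[key] = name
    return field_vars[key]

first = placeholder_for('r', 'x', 4)
other = placeholder_for('r', 'y', 4)
again = placeholder_for('r', 'x', 4)   # same field: same placeholder
print(first, other, again, declared)
```

Because each field is declared separately, a field that is never accessed never gets a declaration, which is what allows unused fields to be dropped.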

Procedure `genIndex(x, y)` generates code for `x[y]`, assuming `x` is `Var` or `Ref`, `x.tp` is `Array`, and `y.tp` is `Int`. If `y` is `Const`, only `x.adr` is updated and no code is generated, otherwise code for array index calculation is generated.

We do not do the same process as with the field variables, because with an array we have the guarantee that all items in an array are in a contiguous block of memory. If we violate this promise, then all pointer semantics with respect to arrays become meaningless, and we would risk breaking the input program. As such, the `genIndex` function has not been modified.

In [None]:
def genIndex(x, y):
    if type(y) == Const:
        offset = (y.val - x.tp.lower) * x.tp.base.size
        x.adr = x.adr + (offset if type(x.adr) == int else '+' + str(offset))
    else:
        if type(y) != Reg: y = loadItem(y)
        putOp('sub', y.reg, y.reg, x.tp.lower)
        putOp('mul', y.reg, y.reg, x.tp.base.size)
        if x.reg != R0:
            putOp('add', y.reg, x.reg, y.reg); releaseReg(x.reg)
        x.reg = y.reg
    x.tp = x.tp.base
    return x
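
For a constant index the whole calculation happens at compile time. A small sketch of that case follows; `const_index_offset` and `index_address` are hypothetical names, and the array bounds are made up for the example.

```python
def const_index_offset(index, lower, elem_size):
    """Byte offset of element `index` in an array starting at `lower`."""
    return (index - lower) * elem_size

def index_address(adr, offset):
    """Combine a numeric or symbolic base address with a constant offset."""
    return adr + offset if type(adr) == int else adr + '+' + str(offset)

# a: array [1 .. 5] of integer, element size 4, access a[3]
offset = const_index_offset(3, 1, 4)
print(offset)
print(index_address(-28, offset))     # FP-relative local: stays a number
print(index_address('a_', offset))    # global: symbolic, left to the assembler
```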

For each local variable, `genLocalVars(sc, start)` assigns the variable a placeholder name, which is replaced by the FP-relative address once the whole procedure has been processed, and returns the total size of the local variables. The parameter `sc` contains the top scope with all local declarations parsed so far; only variable declarations from index `start` on in the top scope are considered.

In [None]:
def genLocalVars(sc, start):
    s = 0 # local block size
    for i in range(start, len(sc)):
        if type(sc[i]) == Var:
            s = s + sc[i].tp.size
            sc[i].adr = newVarName() # replace the address with a random name
                                     # compute the real address at the end
            name = str(sc[i].adr)
            declarations[curlev].append( (name, sc[i].tp.size) )
    return s

Procedure `genProcStart()` adds a new scope to the declarations, instructions, and field variables, and increments `curlev`.

In [None]:
def genProcStart():
    global curlev, declarations, instructions
    declarations.append( [] )
    instructions.append( [] )
    fieldVars.append( {} )
    curlev = curlev + 1

For the procedure prologue, we are supposed to allocate enough space for all of the local variables, but we do not yet know which of those local variables are actually used. So, we insert a placeholder which we will replace with the actual size after we have processed the rest of the procedure.

In [None]:
def genProcEntry(ident, parsize, localsize):
    putInstr('.globl ' + ident)              # global declaration directive
    putInstr('.ent ' + ident)                # entry point directive
    putLab(ident)                            # procedure entry label
    putMemOp('sw', FP, SP, - parsize - 4)    # push frame pointer
    putMemOp('sw', LNK, SP, - parsize - 8)   # push return address
    putOp('sub', FP, SP, parsize)            # set frame pointer
    putOp('sub', SP, FP, '__LOCAL_' + ident) # set stack pointer to temp label, 
                                             # which we replace and compute at the end
    procNames.append(ident)

For the procedure epilogue, we: 
1. add the appropriate instructions to return from the function 
1. **TODO** remove dead stores
1. parse the procedure for unused variables
1. replace all placeholders with the actual addresses
1. pop off from the `declarations`, `instructions`, `procNames`, and `fieldVars` arrays
1. decrement `curlev`
1. append the instructions for this procedure to the `compiledProcedures` list

In [None]:
def genProcExit(x, parsize, localsize):
    global curlev, declarations, instructions, compiledProcedures
    
    putOp('add', SP, FP, parsize) # restore stack pointer
    putMemOp('lw', LNK, FP, - 8)  # pop return address
    putMemOp('lw', FP, FP, - 4)   # pop frame pointer
    putInstr('jr $ra')            # return
    
    # apply optimizations
    realAddresses,localsize = removeUnusedVariables(genProcDecl)
    processProcInstructions(realAddresses, localsize)

    # remove declarations, instructions for the procedure we're exiting
    procDecl = declarations.pop()
    procIns = instructions.pop()
    procNames.pop()
    fieldVars.pop()
    curlev = curlev - 1
    # store the compiled instructions in compiledProcedures
    compiledProcedures.append(procIns)

# `CGmips` code
The rest of the code is taken verbatim from `CGmips.py`

Reserved registers are `$0` for the constant `0`, `$fp` for the frame pointer, `$sp` for the stack pointer, and `$ra` for the return address (dynamic link).

In [10]:
R0 = '$0'; FP = '$fp'; SP = '$sp'; LNK = '$ra'

def obtainReg():
    if len(regs) == 0: mark('out of registers'); return R0
    else: return regs.pop()

def releaseReg(r):
    if r not in (R0, SP, FP, LNK): regs.add(r)

In [None]:
def putLab(lab, instr = ''):
    """Emit label lab with optional instruction; lab may be a single
    label or a list of labels"""
    if type(lab) == list:
        for l in lab[:-1]: instructions[-1].append((l, '', ''))
        instructions[-1].append((lab[-1], instr, ''))
    else: instructions[-1].append((lab, instr, ''))

def putInstr(instr, target = ''):
    """Emit an instruction"""
    instructions[-1].append(('', instr, target))

def putOp(op, a, b, c):
    """Emit instruction op with three operands, a, b, c; c can be register or immediate"""
    putInstr(op + ' ' + a + ', ' + b + ', ' + str(c))

def putBranchOp(op, a, b, c):
    putInstr(op + ' ' + a + ', ' + b, str(c))

def putMemOp(op, a, b, c):
    """Emit load/store instruction at location or register b + offset c"""
    if b == R0: putInstr(op + ' ' + a + ', ' + str(c))
    else: putInstr(op + ' ' + a + ', ' + str(c) + '(' + b + ')')
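
The two branches of `putMemOp` correspond to the two addressing forms used in the generated code. The sketch below returns the string instead of appending it, so it can be run on its own; `mem_instr` is a hypothetical name.

```python
R0 = '$0'

def mem_instr(op, a, b, c):
    """Format a load/store of register a at base register b plus offset c."""
    if b == R0:
        return op + ' ' + a + ', ' + str(c)            # absolute or symbolic address
    return op + ' ' + a + ', ' + str(c) + '(' + b + ')'  # base register plus offset

print(mem_instr('lw', '$t0', R0, 'x_'))       # global variable
print(mem_instr('sw', '$t1', '$fp', -12))     # local variable
print(mem_instr('lw', '$t2', '$t3', 0))       # address held in a register
```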

The following procedures "generate code" for all P0 types by determining the size 
of objects and storing it in the `size` field.
- Integers and booleans occupy 4 bytes
- The size of a record is the sum of the sizes of its fields; the offset of a 
  field is the sum of the sizes of the preceding fields
- The size of an array is its length times the size of the base type.

In [None]:
def genBool(b):
    b.size = 4; return b

def genInt(i):
    i.size = 4; return i

def genRec(r):
    # r is Record
    s = 0
    for f in r.fields:
        f.offset, s = s, s + f.tp.size
    r.size = s
    return r

def genArray(a):
    # a is Array
    a.size = a.length * a.base.size
    return a
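
As a worked example of these rules, the sketch below uses minimal stand-in classes (`Simple`, `Array`, `Field`, `Record` are assumptions here, not the symbol table classes from `ST`) and repeats `genArray` and `genRec` so that it runs on its own.

```python
class Simple:                 # stand-in for Int or Bool
    size = 4

class Array:
    def __init__(self, base, length):
        self.base, self.length = base, length

class Field:
    def __init__(self, name, tp):
        self.name, self.tp = name, tp

class Record:
    def __init__(self, fields):
        self.fields = fields

def genArray(a):
    a.size = a.length * a.base.size
    return a

def genRec(r):
    s = 0
    for f in r.fields:
        f.offset, s = s, s + f.tp.size   # offset = total size of preceding fields
    r.size = s
    return r

vec = genArray(Array(Simple, 3))         # 3 * 4 = 12 bytes
rec = genRec(Record([Field('a', Simple), Field('v', vec), Field('b', Simple)]))
print([(f.name, f.offset) for f in rec.fields], rec.size)
```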

Procedure `newLabel()` generates a new unique label on each call.

In [None]:
def newLabel():
    global label
    label += 1
    return 'L' + str(label)

The code generator _delays the generation of code_ until it is clear that no better code can be generated. For this, the not-yet-generated result of an expression and the location of a variable are stored in _items_. In addition to the symbol table types `Var`, `Ref`, `Const`, the generator uses two more item types:
- `Reg(tp, reg)` for integers or boolean values stored in a register; the register can be `$0` for constants `0` and `false`
- `Cond(cond, left, right)` for short-circuited Boolean expressions with two branch targets. The relation `cond` must be one of `'EQ'`, `'NE'`, `'LT'`, `'GT'`, `'LE'`, `'GE'`. The operands `left`, `right` are either registers or constants, but one has to be a register. The result of the comparison is represented by two branch targets, stored as fields, where the evaluation continues if the result of the comparison is true or false. The branch targets are lists of unique labels, with targets in each list denoting the same location. If `right` is `$0`, then `'EQ'` and `'NE'` for `cond` can be used for branching depending on whether `left` is `true` or `false`.

In [None]:
class Reg:
    def __init__(self, tp, reg):
        # tp is Bool or Int
        self.tp, self.reg = tp, reg

class Cond:
    # labA, labB are lists of branch targets for when the result is true or false
    def __init__(self, cond, left, right):
        self.tp, self.cond, self.left, self.right = Bool, cond, left, right
        self.labA, self.labB = [newLabel()], [newLabel()]

Procedure `loadItemReg(x, r)` generates code for loading item `x` to register `r`, assuming `x` is `Var`, `Const`, or `Reg`. If a constant is too large to fit in 16 bits immediate addressing, an error message is generated.

In [12]:
def testRange(x):
    if x.val >= 0x8000 or x.val < -0x8000: mark('value too large')

def loadItemReg(x, r):
    if type(x) == Var: 
        putMemOp('lw', r, x.reg, x.adr)
        releaseReg(x.reg)
    elif type(x) == Const:
        testRange(x); putOp('addi', r, R0, x.val)
    elif type(x) == Reg: # move to register r
        putOp('add', r, x.reg, R0)
    else: assert False
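
The bound checked by `testRange` is the range of a signed 16-bit immediate. A quick sketch of the boundary cases, with the hypothetical helper `fits_imm16`:

```python
def fits_imm16(value):
    """True if value can be encoded as a signed 16-bit immediate."""
    return -0x8000 <= value < 0x8000

for v in (0, 32767, 32768, -32768, -32769):
    print(v, fits_imm16(v))
```

Constants outside this range are reported as an error rather than being split into a `lui`/`ori` pair.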

Procedure `loadItem(x)` generates code for loading item `x`, which has to be `Var` or `Const`, into a new register and returns a `Reg` item; if `x` is `Const` and has value `0`, no code is generated and register `R0` is used instead. For procedure `loadBool(x)`, the type of item `x` has to be `Bool`; if `x` is not a constant, it is loaded into a register and a new `Cond` item is returned.

In [13]:
def loadItem(x):
    if type(x) == Const and x.val == 0: r = R0 # use R0 for "0"
    else: r = obtainReg(); loadItemReg(x, r)
    return Reg(x.tp, r)

def loadBool(x):
    if type(x) == Const and x.val == 0: r = R0 # use R0 for "false"
    else: r = obtainReg(); loadItemReg(x, r)
    return Cond(NE, r, R0)

Procedure `put(cd, x, y)` generates code for `x op y`, where `op` is an operation with mnemonic `cd`. Items `x`, `y` have to be `Var`, `Const`, `Reg`. An updated item `x` is returned.

In [None]:
def put(cd, x, y):
    if type(x) != Reg: x = loadItem(x)
    if x.reg == R0: x.reg, r = obtainReg(), R0
    else: r = x.reg # r is source, x.reg is destination
    if type(y) == Const:
        testRange(y); putOp(cd, x.reg, r, y.val)  # destination first, then source
    else:
        if type(y) != Reg: y = loadItem(y)
        putOp(cd, x.reg, r, y.reg); releaseReg(y.reg)
    return x

Procedures `genVar`, `genConst`, `genUnaryOp`, `genBinaryOp`, `genRelation`, `genSelect`, and `genIndex` generate code for expressions (e.g. right hand side of assignments) and for  locations (e.g. left hand side of assignments).

Procedure `genVar(x)` allows `x` to refer to a global variable, local variable, or procedure parameter: the assumption is that `x.reg` (which can be `R0`) and `x.adr` (which can be 0) refer to the variable. References to variables on intermediate levels are not supported. For global variables, the reference is kept symbolic, to be resolved later by the assembler. Item `x` is `Var` or `Ref`; if it is `Ref`, the reference is loaded into a new register. A new `Var` item with the location is returned.

In [None]:
def genVar(x):
    if x.lev == 0: s = R0 # global variable at x.adr
    elif x.lev == curlev: s = FP # local variable, FP relative
    else: mark('level!'); s = R0
    y = Var(x.tp); y.lev = x.lev
    if type(x) == Ref: # reference is loaded into register
        r = obtainReg(); putMemOp('lw', r, s, x.adr)
        y.reg, y.adr = r, 0 # variable at (y.reg)
    elif type(x) == Var:
        y.reg, y.adr = s, x.adr
    else: assert False
    return y

Procedure `genConst(x)` does not need to generate any code.

In [1]:
def genConst(x):
    # x is Const
    return x

Procedure `genUnaryOp(op, x)` generates code for `op x` if `op` is `MINUS`, `TILDE` (bitwise complement), `NOT` and `x` is `Int`, `Bool`; if `op` is `AND`, `OR`, item `x` is the first operand. If `x` is not already a `Cond` item, it is loaded into a register and turned into one. A branch instruction is generated for `OR` and a branch instruction with a negated condition for `AND`.

In [None]:
def negate(cd):
    return {EQ: NE, NE: EQ, LT: GE, LE: GT, GT: LE, GE: LT}[cd]

def condOp(cd):
    return {EQ: 'beq', NE: 'bne', LT: 'blt', LE: 'ble', GT: 'bgt', GE: 'bge'}[cd]

def genUnaryOp(op, x):
    if op == MINUS: # subtract from 0
        if type(x) == Var: x = loadItem(x)
        putOp('sub', x.reg, R0, x.reg)
    elif op == TILDE: # `r nor s` means ~(r | s) so we use `r nor r` to get ~r
        if type(x) == Var: x = loadItem(x)
        putOp('nor', x.reg, x.reg, x.reg)
    elif op == NOT: # switch condition and branch targets, no code
        if type(x) != Cond: x = loadBool(x)
        x.cond = negate(x.cond); x.labA, x.labB = x.labB, x.labA
    elif op == AND: # load first operand into register and branch
        if type(x) != Cond: x = loadBool(x)
        putBranchOp(condOp(negate(x.cond)), x.left, x.right, x.labA[0])
        releaseReg(x.left); releaseReg(x.right); putLab(x.labB)
    elif op == OR: # load first operand into register and branch
        if type(x) != Cond: x = loadBool(x)
        putBranchOp(condOp(x.cond), x.left, x.right, x.labB[0])
        releaseReg(x.left); releaseReg(x.right); putLab(x.labA)
    else: assert False
    return x

Procedure `genBinaryOp(op, x, y)` generates code for `x op y` if `op` is `PLUS`, `MINUS`, `TIMES`, `DIV`, `MOD`, or one of the bitwise operators `AMP`, `BAR`. If `op` is `AND`, `OR`, operand `y` is made a `Cond` item if it is not so already and the branch targets are merged.

In [None]:
def genBinaryOp(op, x, y):
    if op == PLUS: y = put('add', x, y)
    elif op == MINUS: y = put('sub', x, y)
    elif op == TIMES: y = put('mul', x, y)
    elif op == DIV: y = put('div', x, y)
    elif op == MOD: y = put('mod', x, y)
    elif op == AND: # load second operand into register 
        if type(y) != Cond: y = loadBool(y)
        y.labA += x.labA # update branch targets
    elif op == OR: # load second operand into register
        if type(y) != Cond: y = loadBool(y)
        y.labB += x.labB # update branch targets
    elif op == AMP: y = put('and', x, y)
    elif op == BAR: y = put('or', x, y)
    else: assert False
    return y

Procedure `genRelation(op, x, y)` generates code for `x op y` if `op` is `EQ`, `NE`, `LT`, `LE`, `GT`, `GE`. Items `x` and `y` cannot be both constants. A new `Cond` item is returned.

In [None]:
def genRelation(op, x, y):
    if type(x) != Reg: x = loadItem(x)
    if type(y) != Reg: y = loadItem(y)
    return Cond(op, x.reg, y.reg)

Procedure `genAssign(x, y)` generates code for `x := y`, provided `x` is `Var`. Item `y` is loaded into a register if it is not already there; if `y` is `Cond`, then either `0` or `1` is loaded into a register.

In [None]:
def genAssign(x, y):
    """Assume x is Var, generate x := y"""
    if type(y) == Cond:
        putBranchOp(condOp(negate(y.cond)), y.left, y.right, y.labA[0])
        releaseReg(y.left); releaseReg(y.right); r = obtainReg()
        putLab(y.labB); putOp('addi', r, R0, 1) # load true
        lab = newLabel()
        putInstr('b', lab)
        putLab(y.labA); putOp('addi', r, R0, 0) # load false 
        putLab(lab)
    elif type(y) != Reg: y = loadItem(y); r = y.reg
    else: r = y.reg
    putMemOp('sw', r, x.reg, x.adr); releaseReg(r)

Procedure `genFormalParams(sc)` determines the FP-relative address of all parameters in the list `sc` of procedure parameters. Each parameter must be of type `Int` or `Bool` or must be a reference parameter.

In [1]:
def genFormalParams(sc):
    s = 0 # parameter block size
    for p in reversed(sc):
        if p.tp == Int or p.tp == Bool or type(p) == Ref:
            p.adr, s = s, s + 4
        else: mark('no structured value parameters')
    return s
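
Because the parameters are traversed in reverse, the last parameter gets offset 0 and the first one the highest offset. A sketch with parameter names only, using the hypothetical function `param_offsets`:

```python
def param_offsets(names):
    """FP-relative offsets of 4-byte parameters, assigned from last to first."""
    s = 0
    adr = {}
    for name in reversed(names):
        adr[name] = s
        s += 4
    return adr, s

adr, parsize = param_offsets(['a', 'b', 'c'])
print(adr, parsize)

# The caller stores parameter n at SP - 4 * (n + 1); the callee sets
# FP = SP - parsize, so seen from FP that slot is parsize - 4 * (n + 1).
for n, name in enumerate(['a', 'b', 'c']):
    assert adr[name] == parsize - 4 * (n + 1)
```

The loop at the end checks that these offsets agree with the slots that `genActualPara` writes to, given the frame pointer set up in `genProcEntry`.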

Procedure `genActualPara(ap, fp, n)` assumes that `ap` is an item with the actual parameter, `fp` is the entry for the formal parameter, and `n` is the parameter number. The parameters are pushed SP-relative on the stack. The formal parameter is either `Var` or `Ref`.

In [None]:
def genActualPara(ap, fp, n):
    if type(fp) == Ref:  #  reference parameter, assume p is Var
        if ap.adr != 0:  #  load address in register
            r = obtainReg(); putMemOp('la', r, ap.reg, ap.adr)
        else: r = ap.reg  #  address already in register
        putMemOp('sw', r, SP, - 4 * (n + 1)); releaseReg(r)
    else:  #  value parameter
        if type(ap) != Cond:
            if type(ap) != Reg: ap = loadItem(ap)
            putMemOp('sw', ap.reg, SP, - 4 * (n + 1)); releaseReg(ap.reg)
        else: mark('unsupported parameter type')

Procedure `genCall(pr, ap)` assumes `pr` is `Proc` and `ap` is a list of actual parameters.

In [None]:
def genCall(pr, ap):
    putInstr('jal', pr.name)

Procedures `genRead(x)`, `genWrite(x)`, `genWriteln()` generate code for SPIM-defined "syscalls"; `genRead(x)` assumes that `x` is `Var` and `genWrite(x)` assumes that `x` is `Ref`, `Var`, `Reg`.

In [2]:
def genRead(x):
    putInstr('li $v0, 5'); putInstr('syscall')
    putMemOp('sw', '$v0', x.reg, x.adr)

def genWrite(x):
    loadItemReg(x, '$a0'); putInstr('li $v0, 1'); putInstr('syscall')

def genWriteln():
    putInstr('li $v0, 11'); putInstr("li $a0, '\\n'"); putInstr('syscall')

For control structures:
- `genSeq(x, y)` generates `x ; y`, assuming `x`, `y` are statements
- `genCond(x)` generates code for branching on `x`, assuming that `x` is of type `Bool`
- `genIfThen(x, y)` generates code for `y` in `if x then y`
- `genThen(x, y)` generates code for `y` in `if x then y else z`
- `genIfElse(x, y, z)` generates code for `z` in `if x then y else z`
- `genTarget()` generates and returns a target for backward branches
- `genWhile(lab, x, y)` generates code for `y` in `while x do y`, assuming that target `lab` was generated before `x`.

In [None]:
def genSeq(x, y):
    pass

def genCond(x):
    if type(x) != Cond: x = loadBool(x)
    putBranchOp(condOp(negate(x.cond)), x.left, x.right, x.labA[0])
    releaseReg(x.left); releaseReg(x.right); putLab(x.labB)
    return x

def genIfThen(x, y):
    putLab(x.labA)

def genThen(x, y):
    lab = newLabel()
    putInstr('b', lab)
    putLab(x.labA); 
    return lab

def genIfElse(x, y, z):
    putLab(y)

def genTarget():
    lab = newLabel()
    putLab(lab)
    return lab

def genWhile(lab, x, y):
    putInstr('b', lab)
    putLab(x.labA); 