# CS202: Compiler Construction

## In-class Exercises, Week of 01/24/2022

----
# Part I: Lmin Language & Interpreter; x86 ASTs

## Question 1

Write code to parse the following `Lmin` program and print out its abstract syntax tree.

In [1]:
from ast import *
from cs202_support.base_ast import print_ast

program = "print(42)"

In [2]:
ast = parse(program)
print(print_ast(ast))

Module(
 [
  Expr(
   Call(
    Name(
     "print",
     Load()),
    [
     Constant(
      42,
      None)
    ],
    []))
 ],
 [])


## Question 2

Write an interpreter `eval_lmin` for the `Lmin` language.

In [3]:
def eval_lmin(program):
  match program:
    case Module([Expr(Call(Name('print'), [Constant(number, None)]))]):
      print(number)
    case _:
      # Not a valid Lmin program: show the offending AST, then fail
      print(print_ast(program))
      raise Exception('eval_lmin', program)


eval_lmin(ast)

42


## Question 3

Write code to generate a *pseudo-x86 abstract syntax tree* for the `start` block for the program above.

Hint: reference the [pseudo-x86 AST class hierarchy](https://github.com/jnear/cs202-assignments/blob/master/cs202_support/x86exp.py). Debug your solution using the online compiler's output for the `select instructions` pass.

In [4]:
import cs202_support.x86exp as x86

ast = x86.Program(
  {
    'start':
      [
        x86.NamedInstr('movq', [x86.Immediate(42), x86.Reg('rdi')]),
        x86.Callq('print_int'),
        x86.Jmp('conclusion')
      ]
  }
)

print(print_ast(ast))

Program(
 {
  'start':
   [
    NamedInstr(
     "movq",
     [
      Immediate(42),
      Reg("rdi")
     ]),
    Callq("print_int"),
    Jmp("conclusion")
   ]
 })


# Part II: Passes of the Compiler

## Question 4

What is the purpose of the `select_instructions` pass of the compiler? How should it be implemented?

The `select_instructions` pass converts an Lmin program into an x86 program.

It should make this transformation by pattern matching on the *only* form of program allowed in Lmin (a single `print` of a constant) and returning an x86 AST that uses `movq` and `callq` instructions to achieve the same behavior.

## Question 5

What is the purpose of the `print_x86` pass of the compiler? How should it be implemented?

The `print_x86` pass converts an x86 AST into a string.

The pass should be implemented as a set of recursive functions that pattern match on different components of the x86 AST, one function per syntactic category:
- reg
- arg
- instr
- program

Each function will contain one pattern match case for each form in that category of the grammar.

# Part III: Lvar

## Question 6

Write an interpreter `eval_lvar` for the `Lvar` language. Reference the grammar: Figure 2.2 (page 14) in the textbook.

In [17]:
from typing import Dict


def eval_lvar(program: Module):
  def eval_exp(e: expr, env: Dict[str, int]) -> int:
    match e:
      case Name(var):
        return env[var]
      case Constant(i):
        return i
      case BinOp(e1, Add(), e2):
        v1 = eval_exp(e1, env)
        v2 = eval_exp(e2, env)
        return v1 + v2
      case _:
        # Fail loudly instead of silently returning None
        raise Exception('eval_exp: unsupported expression', print_ast(e))

  def eval_stmt(s: stmt, env: Dict[str, int]):
    match s:
      case Assign([Name(var)], exp):
        val = eval_exp(exp, env)
        env[var] = val
      case Expr(Call(Name('print'), [exp])):
        val = eval_exp(exp, env)
        print(val)
      case _:
        # Fail loudly on statements outside the Lvar grammar
        raise Exception('eval_stmt: unsupported statement', print_ast(s))

  env = {}
  match program:
    case Module(stmts):
      for s in stmts:
        eval_stmt(s, env)
    case _:
      print(print_ast(program))
      raise Exception('eval_lvar', program)

In [18]:
# TEST CASE
program = """
x = 5
y = 6
print(x + y)"""

eval_lvar(parse(program))

11


----
# Part IV: Remove Complex Operands

## Question 7

Consider this translation of an expression to assembly language. What is wrong with it?

In [44]:
python = """
x = 1 + 2 + 3
"""

asm = """
movq $2, %rax
addq $1, (addq $3, %rax)
"""

- The Python code contains a nested expression: `1 + 2 + 3` parses as `(1 + 2) + 3`.
- x86 assembly does not allow nested expressions (or expressions of any kind); it has only instructions.
- An instruction such as `addq $3, %rax` cannot appear as an argument to another instruction.

## Question 8

Which AST nodes in the language `Lvar` are **atomic**?

- Constants
- Variables

## Question 9

Why do we need this pass? What is the form of its output?

The remove-complex-operands (RCO) pass un-nests expressions so that every argument to an operation is atomic.

We need it because x86 instructions accept only atomic operands, never nested expressions.

The output of RCO is in administrative normal form (A-normal form, or ANF): the arguments to every operation are atomic.

## Question 10

Convert the program from earlier into A-normal form.

In [45]:
python = """
x = 1 + 2 + 3
"""

In [46]:
python_anf = """
<<YOUR ANSWER>>
"""

## Question 11

Describe a recursive procedure to perform the *remove-complex-operands* pass. Reference Section 2.4 in the textbook.

<<YOUR ANSWER>>