# CS202: Compiler Construction

## In-class Exercises, Week of 02/20/2023

----

# Conditionals

## Question 1

The following grammar defines the *concrete syntax* for a subset of $L_{if}$.

\begin{align*}
b &::= \texttt{True} \mid \texttt{False}\\
cmp &::= \texttt{==} \mid \texttt{!=} \mid \texttt{<} \mid \texttt{<=} \mid \texttt{>} \mid \texttt{>=} \\
exp &::= n \mid b \mid exp + exp \mid exp\; cmp\; exp \mid exp\; \texttt{and}\; exp \mid exp\; \texttt{or}\; exp \\
stmt &::= var = exp \mid \texttt{print}(exp) \mid \texttt{if}\; exp: stmt^+\; \texttt{else}: stmt^+ \\
\end{align*}

Write a program that prints 42 if 5 equals 6, and 0 otherwise.

```
if 5 == 6:
    print(42)
else:
    print(0)
```

## Question 2

Write the same program in x86 assembly language.

x86 doesn't have if statements; it has jumps.

main:
- this has the condition in it
```
movq $6, %r8
cmpq $5, %r8 # this is the comparison
je label1 # this is the jump if equal
jmp label2 # this is the jump if not equal
```
label1:
- this has the "then" branch
```
movq $42, %rdi
call print_int
jmp conclusion
```
label2:
- this has the "else" branch
```
movq $0, %rdi
call print_int
jmp conclusion
```

conclusion:
- this is the end of the program
```
retq
``` 

## Question 3

Convert the following program to pseudo-x86 assembly:

```
if 5 == 6:
  x = 0
else:
  x = 40
print(x+2)
```

```
main:
  movq $5, %r8
  cmpq $6, %r8
  je label1
  jmp label2
label1:
  movq $0, #x
  jmp label3
label2:
  movq $40, #x
  jmp label3
label3:
  addq $2, #x
  movq #x, %rdi
  call print_int
```

## Question 4

Describe a strategy for converting `if` expressions into x86 assembly.

- create a label for each of the "then" and "else" branches, and compile the statements of each branch under its label
- use a `cmpq` instruction to evaluate the condition
- use a conditional jump (such as `je`) followed by `jmp` to consume the result of the comparison and jump to the appropriate branch
- create a label for the code that follows the if statement, and end each branch with a jump to it
  - we'd better not copy the code for the rest of the program into each branch
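The strategy above can be sketched in Python as follows. This is only an illustration under stated assumptions: instructions are plain strings, the condition is always an equality test, and the names `fresh_label` and `compile_if_eq` are hypothetical helpers, not part of the course support code.

```python
from itertools import count
from typing import Dict, List

Blocks = Dict[str, List[str]]
_ids = count(1)

def fresh_label() -> str:
    """Return a label name that has not been used before (hypothetical helper)."""
    return f"label{next(_ids)}"

def compile_if_eq(lhs: str, rhs: str,
                  then_instrs: List[str], else_instrs: List[str],
                  rest_instrs: List[str], blocks: Blocks) -> List[str]:
    """Compile `if lhs == rhs: <then> else: <else>` followed by <rest>.

    Adds three labeled blocks to `blocks` and returns the instructions
    that end the current block (the comparison and the two jumps).
    """
    then_label, else_label, rest_label = fresh_label(), fresh_label(), fresh_label()
    # Each branch ends by jumping to the code that follows the if statement.
    blocks[then_label] = then_instrs + [f"jmp {rest_label}"]
    blocks[else_label] = else_instrs + [f"jmp {rest_label}"]
    # The rest of the program is emitted once, not copied into each branch.
    blocks[rest_label] = rest_instrs
    return [
        f"movq {lhs}, %r8",
        f"cmpq {rhs}, %r8",
        f"je {then_label}",
        f"jmp {else_label}",
    ]

# The program from Question 3, with its statements already compiled by hand.
blocks: Blocks = {}
blocks["main"] = compile_if_eq(
    "$5", "$6",
    ["movq $0, #x"],
    ["movq $40, #x"],
    ["addq $2, #x", "movq #x, %rdi", "call print_int"],
    blocks,
)
for label, instrs in blocks.items():
    print(f"{label}:")
    for instr in instrs:
        print(f"  {instr}")
```

With a fresh counter, the generated labels line up with the hand-written answer to Question 3: `label1` is the "then" branch, `label2` the "else" branch, and `label3` the code after the `if`.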

**Notes:**
- `if` is a structure for *control flow*
- A [control flow graph](https://en.wikipedia.org/wiki/Control-flow_graph) can express x86 programs with control flow
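
As a rough illustration of the control flow graph idea, the sketch below stores the blocks from Question 3 in a dictionary and derives the graph's edges from the jump instructions. The `successors` helper and the representation of instructions as strings are assumptions made for this example, not part of the course support code; treating every opcode that starts with `j` as a jump is a simplification.

```python
from typing import Dict, List, Set

def successors(blocks: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each block label to the set of labels it can jump to."""
    edges: Dict[str, Set[str]] = {label: set() for label in blocks}
    for label, instrs in blocks.items():
        for instr in instrs:
            op, _, target = instr.partition(" ")
            target = target.strip()
            # Simplification: any opcode starting with "j" is a jump (jmp, je, ...).
            if op.startswith("j") and target in blocks:
                edges[label].add(target)
    return edges

# Nodes of the graph: the labeled blocks from Question 3.
blocks = {
    "main": ["movq $5, %r8", "cmpq $6, %r8", "je label1", "jmp label2"],
    "label1": ["movq $0, #x", "jmp label3"],
    "label2": ["movq $40, #x", "jmp label3"],
    "label3": ["addq $2, #x", "movq #x, %rdi", "call print_int"],
}

cfg = successors(blocks)
for label in blocks:
    print(label, "->", sorted(cfg[label]))
```

Each block is a node and each jump is an edge, so `main` has two outgoing edges (one per branch) and both branches lead to `label3`.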

----
# Strategy

## Question 5

List the major differences between $\mathcal{L}_{var}$ and $\mathcal{L}_{if}$, and the required corresponding changes to the compiler.

Language differences:
- we have if statements
- we now have both int and bool values

Required changes:
- add a pass (explicate-control) to convert if statements into blocks in a control flow graph
- add a pass (typecheck) that typechecks the program

## Question 6

For each pass of the compiler, list major changes. Include new passes.

- typecheck : new 
- rco : no major changes
- explicate-control : new
- select-instructions : now compiles cif to x86
- allocate-registers : now needs to handle multiple blocks
- patch-instructions : no major changes
- prelude-and-conclusion : no major changes

## Question 7

List the major differences between our source language and that of the textbook.

- we won't handle if expressions
- we won't implement the shrink pass
- we will make a few simplifications to the compiler passes

----

# Typechecking

## Question 8

What does this program do? What is the type of `x`?

```
if 1:
  x = 2
else:
  x = 3
```

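This program runs one of its two branches depending on the condition. In Python, `1` is truthy, so the "then" branch runs and assigns `x = 2`.

`x` is an int: both branches assign it an integer.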

## Question 9

What is the type of `x`?

```
if 5 == 6:
  x = 7
else:
  x = True
```

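`x` could be either an int or a bool, depending on which branch runs, so it has no single static type.

**Notes:**

Benefits of typechecking:
- we can catch errors early, at compile time rather than when the program runs
- performance: when the compiler knows the type of every variable, the compiled code needs no run-time type checks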

## Question 10

Fill in the following definition of a typechecker for $L_{if}$ expressions.

```
from typing import Dict, List
from cs202_support.python import *

# A type environment maps variable names to their types
TEnv = Dict[str, type]

prim_input_types = {
    '+': [int, int],
}

prim_output_types = {
    '+': int,
}

def tc_exp(e: Expr, env: TEnv) -> type:
    match e:
        case Var(x):
            return env[x]
        case Constant(n):
            return type(n)
        case Prim('add', [e1, e2]):
            # Both operands of + must be ints
            assert tc_exp(e1, env) == int
            assert tc_exp(e2, env) == int
            return int
        case Prim('eq', [e1, e2]):
            # Both sides of == must have the same type; the result is a bool
            assert tc_exp(e1, env) == tc_exp(e2, env)
            return bool
```

## Question 11

Fill in the following definition of a typechecker for $L_{if}$ statements.

```
def tc_stmt(s: Stmt, env: TEnv):
    match s:
        case Assign(x, e):
            # A variable must keep the type of its first assignment
            if x in env:
                assert env[x] == tc_exp(e, env)
            else:
                env[x] = tc_exp(e, env)
        case Print(e):
            tc_exp(e, env)
        case If(e1, s1, s2):
            # The condition must be a bool; both branches share one environment
            assert tc_exp(e1, env) == bool
            for s in s1:
                tc_stmt(s, env)
            for s in s2:
                tc_stmt(s, env)

def tc_stmts(ss: List[Stmt]):
    env = {}
    for s in ss:
        tc_stmt(s, env)
    return "Successfully type checked program"


# TEST CASES
print('Test 1 result:', tc_stmts(parse('x=5').stmts))

error_prog = """
y = 5
y = True
"""

try:
    print(tc_stmts(parse(error_prog).stmts))
except AssertionError:
    print('Test 2 result: Successfully caught error')

good_if_prog = """
if 5 == 6:
    x = 0
else:
    x = 1
x = 2
"""

print('Test 3 result:', tc_stmts(parse(good_if_prog).stmts))

error_if_prog = """
if 5 == 6:
    y = 5
else:
    y = True
"""

try:
    print(tc_stmts(parse(error_if_prog).stmts))
except AssertionError:
    print('Test 4 result: Successfully caught error')
```

```
Test 1 result: Successfully type checked program
Test 2 result: Successfully caught error
Test 3 result: Successfully type checked program
Test 4 result: Successfully caught error
```


----

# RCO

## Question 12

How do we handle `if` statements in rco?

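In rco, an `if` statement is handled recursively. We run rco on the condition expression, which may introduce temporary variables that are assigned just before the `if`, and we run rco on the statements in each branch, so that temporaries created inside a branch stay inside that branch.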