# CS202: Compiler Construction

## In-class Exercises, Week of 02/15/2021

----
# Explicate-control pass

The explicate-control pass transforms a program written as an expression into a *sequence of statements* with explicit order.

## Question 1

In the following code, is the expression `1 + 2` in *tail position*? Is the expression `x + 3` in *tail position*?

```
let x = 1 + 2
in x + 3
```

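1. No, `1 + 2` is not in tail position: after it is evaluated, its value must still be bound to `x` and the body must still run.
2. Yes, `x + 3` is in tail position: its value is the result of the whole program.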

## Question 2

How do you determine if an expression is in tail position?

If there is nothing left to do in evaluating the program *after* you evaluate the expression in question, it is in tail position.
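
As a rough illustration of this rule, the sketch below walks a `let` expression down to the subexpression in tail position. The `Int`, `Var`, `Prim`, and `Let` classes are hypothetical stand-ins, not the course's AST classes, and the sketch assumes `let` is the only construct that has a body.

```python
from dataclasses import dataclass
from typing import Any, List


# Hypothetical, minimal AST classes (assumed for illustration only).
@dataclass
class Int:
    val: int


@dataclass
class Var:
    name: str


@dataclass
class Prim:
    op: str
    args: List[Any]


@dataclass
class Let:
    x: str
    e1: Any
    body: Any


def tail_expression(e: Any) -> Any:
    """Return the subexpression of e that is in tail position."""
    # The right-hand side of a let is never in tail position: after it is
    # evaluated, the body still has to run. Only the body can be.
    while isinstance(e, Let):
        e = e.body
    return e


program = Let("x", Prim("+", [Int(1), Int(2)]), Prim("+", [Var("x"), Int(3)]))
print(tail_expression(program))
```

For the program from Question 1, this prints the node for `x + 3` and never the node for `1 + 2`.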

## Question 3

Why do we need the explicate-control pass?

- It turns an expression with a value into a sequence of actions (statements)
- We need this because in x86 assembly, programs are not expressions

## Question 4

Compile the following ANF program into CVar (manually perform the explicate-control pass).

```
let x = 1 + 2
in x + 3
```

Solution:

```
x := 1 + 2
return x + 3

Seq(
    Assign(x, Prim('+', [Int(1), Int(2)])),
    Return(Prim('+',[Var(x), Int(3)]))
    )
```

## Question 5

Describe the structure of the explicate-control pass.

Two recursive functions:
- `ec_tail` processes RVar expressions in tail position, and produces CVar tail objects
    - When we find a let expression:
        - call `ec_assign` where x = e.x, e = e.e1, and k = `ec_tail(e.body)`
- `ec_assign` processes RVar assignments in non-tail positions, and produces CVar tail objects
    - When we find an atomic expression:
        - produce a sequence where we
            1. assign e to x
            2. do what k says
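
The two functions described above can be sketched as follows. This is a minimal sketch, not the course's reference solution: the AST classes are hypothetical stand-ins for the RVar and CVar classes, and the nested-`let` case in `ec_assign` is an extra case assumed here for completeness.

```python
from dataclasses import dataclass
from typing import Any, List


# Hypothetical stand-ins for the RVar and CVar AST classes.
@dataclass
class Int:
    val: int


@dataclass
class Var:
    name: str


@dataclass
class Prim:
    op: str
    args: List[Any]


@dataclass
class Let:
    x: str
    e1: Any
    body: Any


@dataclass
class Assign:
    x: str
    e: Any


@dataclass
class Return:
    e: Any


@dataclass
class Seq:
    stmt: Any
    tail: Any


def ec_tail(e: Any) -> Any:
    """Turn an expression in tail position into a CVar tail."""
    if isinstance(e, Let):
        # Bind e.x to e.e1, then continue with the tail built from the body.
        return ec_assign(e.x, e.e1, ec_tail(e.body))
    # Int, Var, Prim: nothing is left to do afterwards, so return the value.
    return Return(e)


def ec_assign(x: str, e: Any, k: Any) -> Any:
    """Build a tail that assigns the value of e to x and then runs k."""
    if isinstance(e, Let):
        # Assumed extra case: bind the inner variable first, then assign x.
        return ec_assign(e.x, e.e1, ec_assign(x, e.body, k))
    return Seq(Assign(x, e), k)


program = Let("x", Prim("+", [Int(1), Int(2)]), Prim("+", [Var("x"), Int(3)]))
result = ec_tail(program)
print(result)
```

Running it on the program from Question 4 produces the same `Seq(Assign(...), Return(...))` structure as the solution given there.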

----
# Select-instructions pass

The select-instructions pass transforms a sequence of statements into pseudo-x86 assembly instructions, which may still refer to program variables.

## Question 6

Convert the following CVar code into a pseudo-x86 assembly program.

```
Program({
 'start':
  Seq(
   Assign(
    y_1,
    AtmExp(Int(5))),
   Seq(
    Assign(
     x_2,
     AtmExp(Var(y_1))),
    Return(AtmExp(Var(x_2)))))
})
```

```
movq $5, #y_1
movq #y_1, #x_2
movq #x_2, %rax
jmp conclusion
```

## Question 7

Describe the structure of select-instructions.

- when you find an assignment, turn it into a movq
- when you find an assignment whose right-hand side is a + expression, turn it into a movq (first argument into the destination) followed by an addq (second argument into the destination)
- when you find a return, turn it into a movq/addq followed by jmp conclusion
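
The three cases above can be sketched as follows. This is a simplified sketch under several assumptions: the AST classes are hypothetical stand-ins (without the `AtmExp` wrapper shown in Question 6), instructions are produced as plain strings, variables are written `#x` as in the answer to Question 6, and the case where the destination also appears as the second argument of `+` is not handled.

```python
from dataclasses import dataclass
from typing import Any, List


# Hypothetical stand-ins for the CVar AST classes.
@dataclass
class Int:
    val: int


@dataclass
class Var:
    name: str


@dataclass
class Prim:
    op: str
    args: List[Any]


@dataclass
class Assign:
    x: str
    e: Any


@dataclass
class Return:
    e: Any


@dataclass
class Seq:
    stmt: Any
    tail: Any


def si_atm(a: Any) -> str:
    """Translate an atomic expression into a pseudo-x86 argument."""
    if isinstance(a, Int):
        return f"${a.val}"
    return f"#{a.name}"


def si_assign(dest: str, e: Any) -> List[str]:
    """Put the value of e into dest (a variable or a register)."""
    if isinstance(e, Prim) and e.op == "+":
        a1, a2 = e.args
        return [f"movq {si_atm(a1)}, {dest}", f"addq {si_atm(a2)}, {dest}"]
    return [f"movq {si_atm(e)}, {dest}"]


def si_tail(t: Any) -> List[str]:
    """Translate a CVar tail into a list of pseudo-x86 instructions."""
    if isinstance(t, Seq):
        return si_assign(f"#{t.stmt.x}", t.stmt.e) + si_tail(t.tail)
    # Return: compute the value into rax, then jump to the conclusion.
    return si_assign("%rax", t.e) + ["jmp conclusion"]


tail = Seq(
    Assign("x", Prim("+", [Int(1), Int(2)])),
    Return(Prim("+", [Var("x"), Int(3)])),
)
print("\n".join(si_tail(tail)))
```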

----
# Assign-homes pass

The assign-homes pass places each program variable in a *stack location* in memory, eliminating variables from the program.

See Section 2.2 for details; especially see Figure 2.7 for details on the memory layout of stack frames.

## Question 8

Write X86 assembly that prepares a stack frame for four variables and puts the values 1,2,3,4 in stack locations.

```
from cs202_support.eval_x86 import X86Emulator

asm = """
pushq %rbp
movq %rsp, %rbp
subq $32, %rsp
movq $1, -8(%rbp)
movq $2, -16(%rbp)
movq $3, -24(%rbp)
movq $4, -32(%rbp)
"""

X86Emulator(logging=False).eval_instructions(asm)
```

| | Location | Old | New |
|---|---|---|---|
| 0 | mem 992 | | 1000 |
| 1 | mem 984 | | 1 |
| 2 | mem 976 | | 2 |
| 3 | mem 968 | | 3 |
| 4 | mem 960 | | 4 |
| 5 | reg rbp | 1000.0 | 992 |
| 6 | reg rsp | 1000.0 | 960 |


## Question 9

Write X86 assembly that prepares a stack frame for three variables and puts the values 1,2,3 in stack locations. Why is this situation different from the one above?

```
asm = """
pushq %rbp
movq %rsp, %rbp
subq $32, %rsp
movq $1, -8(%rbp)
movq $2, -16(%rbp)
movq $3, -24(%rbp)
"""

X86Emulator(logging=False).eval_instructions(asm)
```

| | Location | Old | New |
|---|---|---|---|
| 0 | mem 992 | | 1000 |
| 1 | mem 984 | | 1 |
| 2 | mem 976 | | 2 |
| 3 | mem 968 | | 3 |
| 4 | reg rbp | 1000.0 | 992 |
| 5 | reg rsp | 1000.0 | 960 |


- This situation is different from the one above because we leave one stack location unused: three variables need only 24 bytes, but we still reserve 32. This keeps the value of `rsp` divisible by 16 (*16-byte alignment*), which the x86-64 calling convention requires when a function is called.
- Your assign-homes pass should *ensure 16-byte alignment* of the stack frame.

## Question 9 (continued)

Implement a function `align` to ensure 16-byte alignment.

```
def align(num_bytes: int) -> int:
    YOUR SOLUTION HERE

print(align(32))
print(align(8))
print(align(24))
```

```
32
16
32
```
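
One possible way to fill in `align` is to round the byte count up to the next multiple of 16. This is a sketch rather than the official solution; the `alignment` parameter is an addition assumed here for clarity.

```python
def align(num_bytes: int, alignment: int = 16) -> int:
    """Round num_bytes up to the nearest multiple of alignment."""
    remainder = num_bytes % alignment
    if remainder == 0:
        # Already aligned: no padding needed.
        return num_bytes
    # Add just enough padding to reach the next multiple.
    return num_bytes + (alignment - remainder)


print(align(32))
print(align(8))
print(align(24))
```

With this definition, the three calls reproduce the outputs 32, 16, and 32 shown above.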


## Question 10

Describe the assign-homes pass.

- Process each instruction
- When you find a reference to a variable: 
    - Look up the variable in the `homes` dictionary
        - If it's found, return whatever the `homes` dictionary says
        - Otherwise, allocate a new stack location and add it to the `homes` dictionary

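```
# Assumes the course's `x86` AST module is already imported.
def new_stack_location(homes, var):
    # Slots are numbered from 1 so the first variable lives at -8(%rbp);
    # 0(%rbp) holds the saved value of rbp and must not be overwritten.
    slot_number = len(homes) + 1
    new_stack_offset = slot_number * 8
    location = x86.Deref(-new_stack_offset, 'rbp')  # produce: -n(%rbp)
    homes[var] = location
    return location
```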
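
To show how the lookup fits into the whole pass, here is a self-contained sketch that rewrites a list of instructions. The `Imm`, `Var`, `Reg`, `Deref`, and `Instr` classes are hypothetical stand-ins for the course's `x86` AST classes, and stack slots are numbered from `-8(%rbp)` as in Question 8.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


# Hypothetical stand-ins for the course's x86 AST classes.
@dataclass
class Imm:
    val: int


@dataclass
class Var:
    name: str


@dataclass
class Reg:
    name: str


@dataclass
class Deref:
    offset: int
    reg: str


@dataclass
class Instr:
    op: str
    args: List[Any]


def assign_homes(instrs: List[Instr]) -> Tuple[List[Instr], Dict[str, Deref]]:
    """Replace every variable with a stack location relative to rbp."""
    homes: Dict[str, Deref] = {}

    def home_of(arg: Any) -> Any:
        if not isinstance(arg, Var):
            return arg  # immediates and registers are left alone
        if arg.name not in homes:
            # Allocate the next free 8-byte slot: -8, -16, -24, ...
            homes[arg.name] = Deref(-8 * (len(homes) + 1), "rbp")
        return homes[arg.name]

    new_instrs = [Instr(i.op, [home_of(a) for a in i.args]) for i in instrs]
    return new_instrs, homes


program = [
    Instr("movq", [Imm(5), Var("y_1")]),
    Instr("movq", [Var("y_1"), Var("x_2")]),
    Instr("movq", [Var("x_2"), Reg("rax")]),
]
new_program, homes = assign_homes(program)
for instr in new_program:
    print(instr)
```

The number of entries in `homes` times 8 gives the unaligned frame size, which would then be rounded up to a multiple of 16.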

----
# Patch-instructions pass

The patch-instructions pass fixes instructions with two in-memory arguments, by using the `rax` register as a temporary location.

## Question 11

What is wrong with the following instructions?

```
movq -8(%rbp), -16(%rbp)
addq -24(%rbp), -16(%rbp)
```

x86 assembly doesn't let you reference two memory locations in the same instruction; at least one of the two arguments must be a register or an immediate value.

## Question 12

Fix the instructions above.

```
movq -8(%rbp), %rax
movq %rax, -16(%rbp)
movq -24(%rbp), %rax
addq %rax, -16(%rbp)
```

## Question 13

Describe the patch-instructions pass.

When you find an instruction where both args are `Deref`s, generate two instructions: the first `movq`s the source memory location into `%rax`, and the second performs the original operation from `%rax` to the destination memory location.
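
The rule above can be sketched as follows. The `Reg`, `Deref`, and `Instr` classes are hypothetical stand-ins for the course's `x86` AST classes, and the sketch assumes every instruction it needs to patch has exactly two arguments.

```python
from dataclasses import dataclass
from typing import Any, List


# Hypothetical stand-ins for the course's x86 AST classes.
@dataclass
class Reg:
    name: str


@dataclass
class Deref:
    offset: int
    reg: str


@dataclass
class Instr:
    op: str
    args: List[Any]


def patch_instr(instr: Instr) -> List[Instr]:
    """Split an instruction with two memory arguments into two instructions."""
    if len(instr.args) == 2 and all(isinstance(a, Deref) for a in instr.args):
        src, dst = instr.args
        return [
            Instr("movq", [src, Reg("rax")]),      # load the source into rax
            Instr(instr.op, [Reg("rax"), dst]),    # original op, rax as source
        ]
    return [instr]


def patch_instructions(instrs: List[Instr]) -> List[Instr]:
    """Apply patch_instr to every instruction and flatten the result."""
    return [new for instr in instrs for new in patch_instr(instr)]


program = [
    Instr("movq", [Deref(-8, "rbp"), Deref(-16, "rbp")]),
    Instr("addq", [Deref(-24, "rbp"), Deref(-16, "rbp")]),
]
patched = patch_instructions(program)
for instr in patched:
    print(instr)
```

Applied to the two instructions from Question 11, this yields the four instructions given as the answer to Question 12.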