# CS202: Compiler Construction

## In-class Exercises, Week of 03/20/2023

----

# Part 1: While Loops

## Question 1

Compile the following program to $\mathcal{C}_{if}$:

```
i = 10
sum = 0
while i > 0:
    i = i - 1
    sum = sum + i
print(sum)
```

Output of RCO:
```
i = 10
sum = 0
tmp_1 = i > 0
while tmp_1:
    (compiled statements here)
```
(`tmp_1` is computed once, before the loop, so the condition is never re-evaluated; this means we'll need to do something special in RCO)

Cif version:
```
start:
    i = 10
    sum = 0
    goto while_test
while_test:
    if i > 0 then goto while_body else goto cont
while_body:
    i = i - 1
    sum = sum + i
    goto while_test
cont:
    print(sum)
```

## Question 2

Compile the program above into pseudo-x86 assembly.

```
start:
    movq $10, #i
    movq $0, #sum
    jmp while_test
while_test:
    cmpq $0, #i
    jgt while_body
    jmp cont
while_body:
    subq $1, #i
    addq #i, #sum
    jmp while_test
cont:
    movq #sum, %rdi
    call print_int
```

## Question 3

Describe the major changes to the compiler, up to *select instructions*.

No new passes.

- Typechecker
    - Add a case for while loops
    - Condition better be a boolean
    - Statements better be well-typed
- RCO
    - Add a case for while loops
    - Easy part: run rco_stmts on the body statements of the loop
    - Hard part: condition
        - Problem: tmp vars created by rco_exp end up outside of the loop
        - Solution:
            - Construct brand new bindings just for the tmp vars associated with the condition
            - Package up resulting Assign statements into a Begin expression
                - cond_bindings = {}
                - new_cond_exp = rco_exp(cond, cond_bindings)
                - create a Begin node with a list of assignment statements for everything in `cond_bindings`, and the expression `new_cond_exp`
- Explicate-control
    - new case: While(Begin(cond_stmts, cond_exp), body_stmts)
    - create a loop shaped control flow graph
        - make a new block for the continuation (use create_block)
        - make a new block for the body_stmts, with the continuation "goto test_label" (use create_block)
        - make a new block for the condition, using the label "test_label"
        - can't use create_block to construct all of them
        - two big differences from "if": we need an explicit test block, and we can't use create_block for every single block we create, because we need to create a cycle in the CFG
        - process:
            - cont_label = use create_block to add `cont` to the CFG
            - test_label = gensym(loop_label)
            - body_label = use create_block to add the result of compiling `body_stmts` to the CFG, using `[cif.Goto(test_label)]` as the continuation
            - compile the test:
                - let the cont be `[cif.If(explicate_exp(cond_exp), cif.Goto(body_label), cif.Goto(cont_label))]`
                - compile cond_stmts with this continuation
                - `basic_blocks[test_label] =` result of above
            - return new continuation `[cif.Goto(test_label)]`
- Select-instructions: no changes
- Allocate registers: liveness analysis needs to change (specifically, it becomes a dataflow analysis)
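
The explicate-control process above can be sketched in Python. This is a minimal illustration under assumptions, not the course compiler: the AST and Cif classes (`Assign`, `Begin`, `While`, `Print`, `Goto`, `If`) are simplified hypothetical stand-ins, expressions are plain strings, the condition is used directly instead of going through `explicate_exp`, and `gensym` and `create_block` are re-implemented here with guessed behavior.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical, simplified stand-ins for the AST and Cif node classes.
@dataclass
class Assign:
    var: str
    exp: str

@dataclass
class Print:
    exp: str

@dataclass
class Begin:
    stmts: list
    exp: str

@dataclass
class While:
    cond: Begin
    body: list

@dataclass
class Goto:
    label: str

@dataclass
class If:
    test: str
    then_goto: Goto
    else_goto: Goto

basic_blocks: Dict[str, list] = {}
_counter = 0

def gensym(prefix: str) -> str:
    """Return a fresh label (assumed behavior of the course helper)."""
    global _counter
    _counter += 1
    return f"{prefix}_{_counter}"

def create_block(stmts: list) -> str:
    """Store stmts under a fresh label and return that label."""
    label = gensym("label")
    basic_blocks[label] = stmts
    return label

def explicate_stmts(stmts: list, cont: list) -> list:
    # Work backwards so each statement receives the code that follows it.
    for stmt in reversed(stmts):
        cont = explicate_stmt(stmt, cont)
    return cont

def explicate_stmt(stmt, cont: list) -> list:
    if isinstance(stmt, While):
        cont_label = create_block(cont)
        # The test label must exist before the body is compiled, because
        # the body jumps back to it. This is what creates the cycle.
        test_label = gensym("loop_label")
        body_label = create_block(explicate_stmts(stmt.body, [Goto(test_label)]))
        test_cont = [If(stmt.cond.exp, Goto(body_label), Goto(cont_label))]
        # The test block is added by hand instead of with create_block.
        basic_blocks[test_label] = explicate_stmts(stmt.cond.stmts, test_cont)
        return [Goto(test_label)]
    return [stmt] + cont

program = [
    Assign("i", "10"),
    Assign("sum", "0"),
    While(Begin([Assign("tmp_1", "i > 0")], "tmp_1"),
          [Assign("i", "i - 1"), Assign("sum", "sum + i")]),
    Print("sum"),
]
basic_blocks["start"] = explicate_stmts(program, [])
```

The `start` block ends in a jump to the test block, the test block ends in an `If` that picks the body or the continuation, and the body ends in a jump back to the test block.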

# Part 2: Dataflow Analysis

## Question 4

Perform liveness analysis on the pseudo-x86 program from Question 2.

Attempt: use the approach from assignment 3. When we find a jmp, go do liveness analysis on the target block to get its live-before set.

```
start:
    movq $10, #i
    movq $0, #sum
    jmp while_test  {} <-- start
while_test:
    cmpq $0, #i
    jgt while_body  {sum}
    jmp cont        {}
while_body:
    subq $1, #i
    addq #i, #sum
    jmp while_test    {}
cont:                  {sum}
    movq #sum, %rdi   {}
    call print_int    {}
```

SPOILER ALERT: WE WILL GET STUCK in an infinite loop - uh oh

Problem: to compute the live-before set of while_body we need the live-before set of while_test, and to compute the live-before set of while_test we need the live-before set of while_body.
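
The circular dependency can be made visible with a small sketch. It assumes a hypothetical `jump_targets` table for the program from Question 2 and only mimics the "analyze the jump target first" strategy. Instead of recursing forever, it stops and reports the cycle as soon as it revisits a label that is still being analyzed.

```python
# Hypothetical CFG for the Question 2 program: label -> jump targets.
jump_targets = {
    "start": ["while_test"],
    "while_test": ["while_body", "cont"],
    "while_body": ["while_test"],
    "cont": [],
}

def naive_visit(label, in_progress):
    """Analyze every jump target before finishing the current block."""
    if label in in_progress:
        # We need this block's answer in order to compute this block's answer.
        cycle = in_progress[in_progress.index(label):] + [label]
        raise RuntimeError("cyclic dependency: " + " -> ".join(cycle))
    for target in jump_targets[label]:
        naive_visit(target, in_progress + [label])

try:
    naive_visit("start", [])
    outcome = "finished"
except RuntimeError as err:
    outcome = str(err)

print(outcome)  # cyclic dependency: while_test -> while_body -> while_test
```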

## Question 5

Describe the idea of dataflow analysis on cyclic control-flow graphs.

1. Compute the live-after sets of each block without following jmps: at a jump, use the current live-before set of the target (initially, assume all live-before sets are empty). This gives an **underapproximation** of the live-after sets: a variable might be live even though we said it isn't, so the result is not good enough yet.
2. Update the live-before sets based on the results of #1.
3. Repeat #1 and #2 until the live-after sets don't change at all. This is called a **fixed point**, and one provably exists.
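
One way to write down what these steps compute, as a sketch: the symbols $R$ (variables read), $W$ (variables written), $L_{\text{before}}$, and $L_{\text{after}}$ are notation assumed here, not taken from the exercise. For a block with instructions $I_1, \dots, I_n$:

$$
\begin{aligned}
L_{\text{after}}(I_n) &= \emptyset \\
L_{\text{after}}(I_k) &= L_{\text{before}}(I_{k+1}) && \text{for } k < n \\
L_{\text{before}}(I_k) &= \bigl(L_{\text{after}}(I_k) \setminus W(I_k)\bigr) \cup R(I_k) && \text{if } I_k \text{ is not a jump} \\
L_{\text{before}}(I_k) &= L_{\text{after}}(I_k) \cup L_{\text{before}}(\ell) && \text{if } I_k \text{ jumps to label } \ell
\end{aligned}
$$

The live-before set of a label is $L_{\text{before}}(I_1)$ of its block. A fixed point is an assignment of sets that satisfies all of these equations at once; the iteration starts with every $L_{\text{before}}(\ell) = \emptyset$ and re-evaluates the equations until nothing changes.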

## Question 6

Use the dataflow-based approach to perform liveness analysis on the pseudo-x86 program above.

```
start:                  {} {}       {}       {}       {}
    movq $10, #i           {}       {i}      {i}      {i}
    movq $0, #sum          {}       {i}      {i, sum} {i, sum}
    jmp while_test         {}       {}       {}       {}
while_test:             {} {i}      {i, sum} {i, sum} {i, sum}
    cmpq $0, #i            {}       {i, sum} {i, sum} {i, sum}
    jgt while_body         {}       {sum}    {sum}    {sum}
    jmp cont               {}       {}       {}       {}
while_body:             {} {i, sum} {i, sum} {i, sum} {i, sum}
    subq $1, #i            {i, sum} {i, sum} {i, sum} {i, sum}
    addq #i, #sum          {i}      {i, sum} {i, sum} {i, sum}
    jmp while_test         {}       {}       {}       {}
cont:                   {} {sum}    {sum}    {sum}    {sum}
    movq #sum, %rdi        {}       {}       {}       {}
    call print_int         {}       {}       {}       {}
```

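Each column is one iteration. Lines with a label show that block's live-before set (the leading `{}` is the initial value); instruction lines show live-after sets. The 4th iteration produces exactly the same sets as the 3rd, so nothing changes and we're done.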

Changes to liveness analysis in the compiler:
 - add a ul_fixedpoint function
 - while loop:
     - make a copy of the current live-after sets
     - run ul_block on each block of the program
     - exit the while loop if the live-after sets are the same as the copy
 - initialize live-before sets to be empty for *all* blocks (not just conclusion)
 - remove the call to ul_block in the jmp case
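
The changes listed above can be sketched as follows. The instruction encoding (tuples of strings), the `reads`/`writes` helpers, and the `ul_instr` name are assumptions made for this example; only `ul_block` and `ul_fixedpoint` are names from the notes, and their bodies here are a guess at the shape, not the course implementation. Registers are ignored, as in the table above.

```python
import copy

# Hypothetical encoding: (opcode, operands...). Variables start with '#'.
program = {
    "start": [("movq", "$10", "#i"), ("movq", "$0", "#sum"), ("jmp", "while_test")],
    "while_test": [("cmpq", "$0", "#i"), ("jgt", "while_body"), ("jmp", "cont")],
    "while_body": [("subq", "$1", "#i"), ("addq", "#i", "#sum"), ("jmp", "while_test")],
    "cont": [("movq", "#sum", "%rdi"), ("call", "print_int")],
}

def variables(*operands):
    return {op[1:] for op in operands if op.startswith("#")}

def reads(instr):
    if instr[0] == "movq":
        return variables(instr[1])
    if instr[0] in ("addq", "subq", "cmpq"):
        return variables(instr[1], instr[2])
    return set()

def writes(instr):
    if instr[0] in ("movq", "addq", "subq"):
        return variables(instr[2])
    return set()

# Live-before sets start out empty for *all* blocks.
live_before_sets = {label: set() for label in program}
live_after_sets = {}

def ul_instr(instr, live_after):
    if instr[0] in ("jmp", "jgt"):
        # Look up the target's current live-before set; do NOT recurse.
        return live_after | live_before_sets[instr[1]]
    return (live_after - writes(instr)) | reads(instr)

def ul_block(label):
    current = set()          # nothing is live after the last instruction
    afters = []
    for instr in reversed(program[label]):
        afters.append(current)
        current = ul_instr(instr, current)
    live_after_sets[label] = list(reversed(afters))
    live_before_sets[label] = current

def ul_fixedpoint():
    iterations = 0
    while True:
        previous = copy.deepcopy(live_after_sets)
        for label in program:
            ul_block(label)
        iterations += 1
        if previous == live_after_sets:
            return iterations

iterations = ul_fixedpoint()
print(iterations, live_before_sets)
```

Processing the blocks in the order `start`, `while_test`, `while_body`, `cont` reproduces the columns of the table: the loop stops after the 4th iteration, and the final live-after sets match the last column.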

## Question 7

How do we know the dataflow analysis will stop (i.e. not loop forever)?

Two big questions:
 - What if the live-after sets keep changing? Then ul_fixedpoint runs forever.
     - This can't happen. An iteration can only add variables to a set, never remove them, and there are finitely many variables and blocks in the program. In the worst case every variable ends up in every live-after set; at that point there is nothing left to add to any set, so the state cannot change when we run the analysis again.
 - We start from wrong information (all the sets start out empty). How do we know the final answer is correct?
     - Imagine we somehow knew the correct live-before set for each label
     - We could run liveness analysis on the blocks in any order, and get the right answer
     - Now imagine some live-before set is missing a variable that should be there
     - Because all the other live-before sets are correct, the next iteration fills in the missing variable in that live-before set
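
Both claims can be checked on the example program with a small experiment. This sketch uses a block-level formulation that is an assumption of this example, not the per-instruction `ul_block` from the notes: each block is summarized by the variables it reads before writing them (`use`), the variables it writes (`define`), and its successor labels. It runs the iteration for every possible block order, asserts that sets only ever grow, and collects the results.

```python
from itertools import permutations

# Hypothetical per-block summaries of the Question 2 program:
# label -> (use, define, successors)
blocks = {
    "start": (set(), {"i", "sum"}, ["while_test"]),
    "while_test": ({"i"}, set(), ["while_body", "cont"]),
    "while_body": ({"i", "sum"}, {"i", "sum"}, ["while_test"]),
    "cont": ({"sum"}, set(), []),
}

def fixed_point(order):
    """Iterate over the blocks in the given order until nothing changes."""
    live_before = {label: set() for label in blocks}
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for label in order:
            use, define, succs = blocks[label]
            live_out = set().union(*(live_before[s] for s in succs))
            new = use | (live_out - define)
            if new != live_before[label]:
                # Sets only grow; this is why the loop must terminate.
                assert live_before[label] <= new
                live_before[label] = new
                changed = True
    return live_before, passes

results = [fixed_point(order) for order in permutations(blocks)]
answers = [answer for answer, _ in results]
max_passes = max(passes for _, passes in results)
print(answers[0], max_passes)
```

Every order reaches the same live-before sets; only the number of passes differs. With 4 blocks and 2 variables the sets can grow at most 8 times in total, so no order can need more than 9 passes.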

## Question 8

What changes are required for the rest of the compiler?

None