based on various sources, including slides from: José Nelson Amaral – University of Alberta, David Walker – Princeton University



# Register Allocation

Compilers course

Masters in Informatics and Computing Engineering (MIEIC), 3rd Year

João M. P. Cardoso

Email: jmpc@fe.up.pt





#### **Outline**

- Introduction to Register Allocation
- Variables' Live Ranges
- Register Allocation by Graph Coloring
  - Heuristics
  - Spilling
- > Summary

- > Store as many variables as possible in registers
- Use each register to store as many variables as possible (registers are limited resources)
  - use live range (also known as "lifetime interval") of variables
- One the optimizations with highest impact (code size and performance)

# Variables' Live Ranges

- Duration in the code from a definition of a variable and a use of this variable reached by that definition
- See Liveness Analysis

#### Variables' Live Ranges

Variables' (\$t registers) live ranges in the following MIPS code



Based on the variables' (\$t registers) live ranges try to use each register to store more than one variable (\$t register)



Let's try to reduce the number of registers t in the following MIPS code



- > Determine the live range for each variable
  - Use liveness analysis
- > Allocate a register to one or more variables
- ➤ Hows
  - Graph Coloring (problem NP-complete)
    - Use heuristics

- Graph Coloring
  - Calculate the live range for each variable
  - Construct the Register-Interference Graph\* (there is interference when 2 variables have lifetimes with non-null intersection)
    - Edges represent interference
    - Nodes represent variables
  - Find the minimum colors or the k colors
  - Each color corresponds to a register
    - i.e., number of registers = number of colors
- \* Also known as Register-Conflict Graph

Variables' (\$t registers) Live Range



Register-Interference Graph (IG)



- Register-Interference Graph
  - Interference (edge)
     between two variables
     (nodes) indicates that the
     two variables could not be
     stored in the same register



- Register-Inference Graph
- > After Coloring:
  - Number of colors indicate the number of necessary registers



A graph is k-colorable if each node can be assigned one of k colors in such a way that no two adjacent nodes have the same color.



#### Steps:

- 1. Build the register interference graph,
- 2. Attempt to find a k-coloring for the interference graph.



- The problem of determining if an undirected graph is kcolorable is NP-hard for k ≥ 3
- It is also hard to find approximate solutions to the graph coloring problem



- > Key observation:
  - Let G be an undirected graph
  - Let x be a node of G such that degree(x) < k</li>



Then G is k-colorable if G' is k-colorable.

- Kempe's algorithm [1879] for finding a Kcoloring of a graph
- Step 1 (simplify): find a node n with degree(n)<k and cut it out of the graph (remember this node on a stack for later stages)



- Once a coloring is found for the simpler graph, we can always color the node we saved on the stack
- Step 2 (color): when the simplified subgraph has been colored, add back the node on the top of the stack and assign it a color not taken by one of the adjacent nodes



> Let's go back to the example

v0 Stack

➤ Consider k=3



- Let's go back to the example
- Consider k=3
- $\rightarrow$  Edges(v0) < 3



> Let's go back to the example

•

v0

<u>v1</u>

Stack

Consider k=3



- > Let's go back to the example
- Consider k=3
- $\rightarrow$  Edges(v1) < 3



Let's go back to the example

v0 \_\_\_\_\_v1

Stack

Consider k=3



- Let's go back to the example
- Consider k=3
- $\rightarrow$  Edges(v6) < 3



Let's go back to the example

Let's go back to the example

Consider k=3



- > Let's go back to the example
- Consider k=3
- > After some steps...



Let's go back to the example

v0

Stack

> Consider k=3 **v**10 **v**9 Now we start coloring using v6 the top of the stack 8٧ **v**5 v7 **v**7 **v**2 **v**2 ٧3 **v**4 **v**3 8v v4 **v**5 ٧6 v9 v10 **v**1 **V**0

> Let's go back to the example

Consider k=3

Now we start coloring using the

top of the stack

v10



> Let's go back to the example

Consider k=3

Now we start coloring using the

top of the stack

v9



Let's go back to the example

Consider k=3

Now we start coloring using the top of the stack

V8



> Let's go back to the example

Consider k=3

Now we start coloring using the top of the stack

v7



Let's go back to the example

Consider k=3

Now we start coloring using the top of the stack

v2



> Let's go back to the example

Consider k=3

Now we start coloring using the top of the stack

• ∨4



- Let's go back to the example
- Consider k=3
- Now we start coloring using the top of the stack
  - v3



Let's go back to the example

Consider k=3

Now we start coloring using the top of the stack

v5



> Let's go back to the example

Consider k=3

Now we start coloring using the top of the stack

• ٧6



> Let's go back to the example

Consider k=3

Now we start coloring using the

top of the stack

• \



Stack

> Let's go back to the example

Consider k=3

Now we start coloring using the top of the stack

v0



- Let's go back to the example
- Consider k=3
- > Done!
- > 3 colors imply 3 registers



Stack

#### Register Allocation

**Question:** What to do if a register-interference graph is <u>not</u> k-colorable? Or if the compiler cannot efficiently find a k-coloring even if the graph is k-colorable?

**Answer**: Repeatedly select less profitable variables for "spilling" (i.e. not to be assigned to registers) and remove them from the interference graph until the graph becomes k-colorable.

- > Example:
  - What if we only have 2 registers, i.e., k=2?



- > Example:
  - What if we only have 2 registers, i.e., k=2?



**v**0

- Step 3 (spilling): once all nodes have K or more neighbors, pick a node for spilling
  - Storage on the stack
- There are many heuristics that can be used to pick a node
  - E.g., not in an inner loop



- We need to generate extra instructions to load variables from stack and store them
- > These instructions use registers themselves. What to do?
  - Stupid approach: always keep extra registers handy for shuffling data in and out: what a waste!
  - Better approach: ?

- We need to generate extra instructions to load variables from stack and store them
- > These instructions use registers themselves. What to do?
  - Stupid approach: always keep extra registers handy for shuffling data in and out: what a waste!
  - Better approach: rewrite code introducing a new temporary; rerun liveness analysis and register allocation

- Consider: add t1, t2, t3
  - Suppose t3 is selected for spilling and assigned to stack location [8+\$sp]
  - Invented new temporary t35 for just this instruction and rewrite:
    - **lw \$t35**, **8(\$sp)**; add t1, t2, t35
  - Advantage: t35 has a very short live range and is much less likely to interfere
  - Rerun the algorithm
    - fewer variables will spill

- Variables selected to Spill?
- > The selection can be based on a number of properties:
  - frequencies of execution of uses/defs (based on the iteration count, profiling results)
  - number of uses/defs
  - number of adjacent nodes for the variable in the Interference Graph
  - Lifetime duration
  - etc.

#### **Precolored Nodes**

- Some variables are pre-assigned to registers
- > Treat these registers as special temporaries; before beginning, add them to the graph with their colors
- Can't simplify a graph by removing a precolored node
- Precolored nodes are the starting point of the coloring process
- Once simplified down to colored nodes start adding back the other nodes as before

> A 2-Phase Register Allocation Algorithm



#### Remarks

- > This register allocation algorithm, based on graph coloring, is both efficient (linear time) and effective (good assignment)
- It has been used in many industry-strength compilers to obtain significant improvements over simpler register allocation heuristics

#### **Optimizing Moves**

- > Code generation produces a lot of extra move instructions
  - mov t1, t2 ( $t1 \leftarrow t2$ )
  - If we can assign 11 and 12 to the same register, we do not have to execute the mov
  - Idea: if t1 and t2 are not connected in the interference graph, we coalesce into a single variable
  - First: Include in the register interference graph a move-related edge between two variables used in a move instruction

#### Coalescing

Problem: coalescing can increase the number of interference edges and make a graph uncolorable



- > Solution 1 (Briggs): avoid creation of high-degree (>= K) nodes
- > Solution 2 (George): a can be coalesced with b if every neighbor t of a:
  - already interferes with b, or
  - has low-degree (< K)</li>

#### Simplify and Coalesce

- Step 1 (simplify): simplify as much as possible without removing nodes that are the source or destination of a move (move-related nodes)
- Step 2 (coalesce): coalesce move-related nodes provided low-degree node results
- Step 3 (freeze): if neither steps 1 or 2 apply, freeze a move instruction: registers involved are marked not move-related and try step 1 again
- Step 4 (spill): if there are no low-degree nodes, select a node for potential spilling
- Step 5 (select): pop each element of the stack assigning colors and turning potential spill into actual spill if needed
- Step 6 (rewrite the program): rewrite the program based on the register allocation, remove move operations with coalesced variables, and inserting spilling code. If there is spill build a new register-inference graph and goto Step 1

## **Overall Algorithm**

From Tiger Book (by Appel)



# Example: Step 1: Compute Live Ranges















#### stack



# Example: Step 3: Coalesce (K=4)

#### stack



(m, no-spill)
(e, no-spill)
(f, no-spill)
(k, no-spill)
(g, no-spill)
(h, no-spill)

Why we cannot simplify?

Cannot simplify move-related nodes.

# Example: Step 3: Coalesce (K=4)

#### stack



```
(m, no-spill)
(e, no-spill)
(f, no-spill)
(k, no-spill)
(g, no-spill)
(h, no-spill)
```



#### stack

```
(c-d, no-spill)
(m, no-spill)
(e, no-spill)
(f, no-spill)
(k, no-spill)
(g, no-spill)
(h, no-spill)
```

# Example: Step 3: Coalesce (K=4)



#### stack

b-j

```
(b-j, no-spill)
(c-d, no-spill)
(m, no-spill)
(e, no-spill)
(f, no-spill)
(k, no-spill)
(g, no-spill)
(h, no-spill)
```

















# Could we do the allocation in the previous example with 3 registers?

#### Example: Step 3: Simplify (K=3)



#### Example: Step 3: Simplify (K=3)



#### Example: Step 5: Freeze (K=3)



#### Example: Step 3: Simplify (K=3)



#### Example: Step 6: Spill (K=3)



#### Example: Step 3: Simplify (K=3)



# Example: Step 3: Simplify (K=3)





```
stack
(m, no-spill)
(f, no-spill)
(e, may-spill)
(c, no-spill)
(g, no-spill)
(h, no-spill)
```



```
stack
(d, no-spill)
(m, no-spill)
(f, no-spill)
(e, may-spill)
(c, no-spill)
(g, no-spill)
(h, no-spill)
```



```
stack
(k, no-spill)
(d, no-spill)
(m, no-spill)
(f, no-spill)
(e, may-spill)
(c, no-spill)
(g, no-spill)
(h, no-spill)
```

## j-b

```
stack
(j-b, no-spill)
(k, no-spill)
(d, no-spill)
(m, no-spill)
(f, no-spill)
(e, may-spill)
(c, no-spill)
(g, no-spill)
(h, no-spill)
```





















R1={j,b} R2={k,h,m} R3={f,d,c,g}

**R3** 

```
LIVE-IN: r2(k) r1(j)
  r3 := mem[r1+12]
  r2 := r2 -1
  r3 := r3 + r2
  e := mem[r1+8] \Rightarrow t4 := mem[r1+8]; mem[$sp+4] := t4
  r2 := mem[r1+16]
  r1 := mem[r3]
  r3 := e + 8 \Rightarrow t5 := mem[$sp+4]; r3 := t5 + 8
  r3 := r3
                        A good optimizing compiler would recognize that
  r2 := r2 + 4
                        the assignment to "e" can be moved to just before
                        its use and no spilling would be needed!
  r1 := r1
LIVE-OUT: r3(d) r2(k) r1(j)
```

(José Nelson Amaral based on Tiger Book, Appel)



### **Example:** Step 3: Select (K=3) $R2=\{k,h,m\}$ R2= $\{f,d,c,g\}$ R2= $\{f,d,c,g\}$

R1  $R1=\{j,b\}$ 

**R3** 

```
LIVE-IN: r2(k) r1(j)
  r3 := mem[r1+12]
  r2 := r2 -1
  r3 := r3 + r2
  e := mem[r1+8] \Rightarrow t4 := mem[r1+8]; mem[$sp+4] := t4
  r2 := mem[r1+16]
  r1 := mem[r3]
  r3 := e + 8 \Rightarrow t5 := mem[\$sp+4]; r3 := t5 + 8
  r2 := r2 + 4
LIVE-OUT: r3(d) r2(k) r1(j)
```

99



R1

**R2** 

**R3** 

```
LIVE-IN: r2(k) r1(j)
  r3 := mem[r1+12]
  r2 := r2 -1
  r3 := r3 + r2
  t4 := mem[r1+8]
  mem[\$sp+4] := t4
  r2 := mem[r1+16]
  r1 := mem[r3]
  t5 := mem[\$sp+4]
  r3 := t5 + 8
  r2 := r2 + 4
LIVE-OUT: r3(d) r2(k) r1(j)
```

R1

**R2** 

R3



R1

**R2** 

**R3** 



After repeating Register Allocation

. . .

R1

**R2** 

R3



After repeating Register Allocation

. . .

#### Live Range Splitting

- The basic coloring algorithm does not consider cases in which a variable can be allocated to a register for part of its live range
  - Some compilers split live ranges within the iteration structure of the coloring algorithm
  - When a variable is split into two new variables, one of the new variables might be profitably assigned to a register while the other is not

#### Length of Live Ranges

- The interference graph does not contain information of where in the CFG variables interfere and what the length of a variable's live range is
- For example, if we only had few available registers in the following intermediate-code example, the right choice would be to spill variable w because it has the longest live range:

$$x = w + 1$$
 $c = a - 2$ 
 $y = x * 3$ 
 $z = w + y$ 

#### Summary

- > Register allocation has three major parts
  - Liveness analysis
  - Graph coloring
  - Program transformation (move coalescing and spilling)
- > See Sections 11.1-11.3 in the Tiger Book (Appel)

#### References

- Kempe, A. B. 1879. On the geographical problem of the four colors. Am. J. Math. 2, 193-200.
- G.J. Chaitin, M.A. Auslander, A.K. Chandra, J. Cocke, M.E. Hopkins, and P.W. Markstein. **Register Allocation via Coloring**. Computer Languages, 6:45-57, January 1981.
- Gregory Chaitin. 2004. **Register allocation and spilling via graph coloring**. SIGPLAN Not. 39, 4 (April 2004), 66–74. [1982] DOI: <a href="https://doi.org/10.1145/989393.989403">https://doi.org/10.1145/989393.989403</a>
- See also Patent US4571678A: "Register allocation and spilling via graph coloring," Inventor: Gregory J. Chaitin <a href="https://patents.google.com/patent/US4571678A/en">https://patents.google.com/patent/US4571678A/en</a>
- Preston Briggs, Keith D. Cooper, and Linda Torczon. Improvements to Graph Coloring Register Allocation. ACM Transactions on Programming Languages and Systems, 16(3):428-455, May 1994. https://doi.org/10.1145/177492.177575
- Lal George and Andrew W. Appel. **Iterated register coalescing**. ACM Trans. Program. Lang. Syst., 18(3):300-324, 1996. <a href="https://doi.org/10.1145/229542.229546">https://doi.org/10.1145/229542.229546</a>