# Chapter 9

# **Register Allocation**

### 9.1 Introduction

When generating intermediate code in chapter 7, we have freely used as many variables as we found convenient. In chapter 8, we have simply translated variables in the intermediate language one-to-one into registers in the machine language. Processors, however, do not have an unlimited number of registers, so we need *register allocation* to handle this conflict. The purpose of register allocation is to map a large number of variables into a small(ish) number of registers. This can often be done by letting several variables share a single register, but sometimes there are simply not enough registers in the processor. In this case, some of the variables must be temporarily stored in memory. This is called *spilling*.

Register allocation can be done in the intermediate language prior to machine-code generation, or it can be done in the machine language. In the latter case, the machine code initially uses symbolic names for registers, which the register allocation turns into register numbers. Doing register allocation in the intermediate language has the advantage that the same register allocator can easily be used for several target machines (it just needs to be parameterised with the set of available registers).

However, there may be advantages to postponing register allocation to after machine code has been generated. In chapter 8, we saw that several instructions may be combined to a single instruction, and in the process a variable may disappear. There is no need to allocate a register to this variable, but if we do register allocation in the intermediate language we will do so. Furthermore, when an intermediate-language instruction needs to be translated into a sequence of machine-code instructions, the machine code may need an extra register (or two) for storing temporary values (such as the register needed to store the result of the SLT instruction when translating a jump on < to MIPS code). Hence, the register allocator must make sure that there is always at least one spare register for temporary storage.

The techniques used for register allocation are more or less the same regardless of whether register allocation is done on intermediate code or on machine code. So, in this chapter, we will describe register allocation in terms of the intermediate language introduced in chapter 7.

As in chapter 7, we operate on the body of a single procedure or function, so when we below use the word "program", we mean it to be such a body. In chapter 10, we will look at how to handle programs consisting of several functions that can call each other.

## 9.2 Liveness

In order to answer the question "When can two variables share a register?", we must first define the concept of *liveness*:

**Definition 9.1** A variable is live at some point in the program if the value it contains at that point might conceivably be used in future computations. Conversely, it is dead if there is no way its value can be used in the future.

We have already hinted at this concept in chapter 8, when we talked about lastuses of variables.

Loosely speaking, two variables may share a register if there is no point in the program where they are both live. We will make a more precise definition later.

We can use some rules to determine when a variable is live:

- 1) If an instruction uses the contents of a variable, that variable is *live* at the start of that instruction.
- 2) If a variable is assigned a value in an instruction, and the same variable is not used as an operand in that instruction, then the variable is *dead* at the start of the instruction, as the value it has at this time is not used before it is overwritten.
- 3) If a variable is live at the end of an instruction and that instruction does not assign a value to the variable, then the variable is also live at the start of the instruction.
- 4) A variable is live at the end of an instruction if it is live at the start of any of the immediately succeeding instructions.

Rule 1 tells how liveness is *generated*, rule 2 how liveness is *killed*, and rules 3 and 4 how liveness is *propagated*.

# 9.3 Liveness analysis

We can formalise the above rules as equations over sets of variables. The process of solving these equations is called *liveness analysis*, and will at any given point in the program determine which variables are live at this point. To better speak of points in a program, we number all instructions as in figure 9.2.

For every instruction in the program, we have a set of *successors*, *i.e.*, instructions that may immediately follow the instruction during execution. We denote the set of successors to the instruction numbered i as succ[i]. We use the following rules to find succ[i]:

- 1) The instruction numbered j (if any) that is listed just after instruction number i is in succ[i], unless i is a GOTO or IF-THEN-ELSE instruction. If instructions are numbered consecutively, j = i + 1.
- 2) If instruction number i is of the form GOTO l, (the number of) the instruction LABEL l is in succ[i]. Note that there in a correct program will be exactly one LABEL instruction with the label used by the GOTO instruction.
- 3) If instruction i is IF p THEN  $l_t$  ELSE  $l_f$ , (the numbers of) the instructions LABEL  $l_t$  and LABEL  $l_f$  are in succ[i].

Note that we assume that both outcomes of an IF-THEN-ELSE instruction are possible. If this happens not to be the case (*i.e.*, if the condition is always true or always false), our liveness analysis may claim that a variable is live when it is in fact dead. This is no major problem, as the worst that can happen is that we use a register for a variable that is not going to be used after all. The converse (claiming a variable dead when it is, in fact, live) is worse, as we may overwrite a value that could be used later on, and hence get wrong results from the program. Precise liveness informatin depends on knowing exactly which paths a program may take through the code when executed, and this is not possible to compute exactly (it is a formally undecidable problem), so it is quite reasonable to allow imprecise results from a liveness analysis, as long as we err on the side of safety, *i.e.*, calling a variable live unless we can prove it to be dead

For every instruction i, we have a set gen[i], which lists the variables that may be read by instruction i and, hence, are live at the start of the instruction. In other words, gen[i] is the set of variables that instruction i generates liveness for. We also have a set kill[i] that lists the variables that may be assigned a value by the instruction. Figure 9.1 shows which variables are in gen[i] and kill[i] for the types of instruction found in intermediate code. x, y and z are (possibly identical) variables and k denotes a constant.

| Instruction i                                 | gen[i]       | kill[i]      |
|-----------------------------------------------|--------------|--------------|
| LABEL l                                       | 0            | 0            |
| x := y                                        | {y}          | { <i>x</i> } |
| x := k                                        | Ø            | { <i>x</i> } |
| $x := \mathbf{unop} \ y$                      | { <i>y</i> } | { <i>x</i> } |
| $x := \mathbf{unop} \ k$                      | Ø            | { <i>x</i> } |
| x := y binop $z$                              | $\{y,z\}$    | { <i>x</i> } |
| x := y binop $k$                              | { <i>y</i> } | { <i>x</i> } |
| x := M[y]                                     | { <i>y</i> } | { <i>x</i> } |
| x := M[k]                                     | Ø            | { <i>x</i> } |
| M[x] := y                                     | $\{x,y\}$    | 0            |
| M[k] := y                                     | { <i>y</i> } | 0            |
| GOTO l                                        | 0            | 0            |
| IF $x$ <b>relop</b> $y$ THEN $l_t$ ELSE $l_f$ | $\{x,y\}$    | 0            |
| $x := \mathtt{CALL}\ f(args)$                 | args         | { <i>x</i> } |

Figure 9.1: Gen and kill sets

For each instruction i, we use two sets to hold the actual liveness information: in[i] holds the variables that are live at the start of i, and out[i] holds the variables that are live at the end of i. We define these by the following equations:

$$in[i] = gen[i] \cup (out[i] \setminus kill[i])$$

$$out[i] = \bigcup_{j \in succ[i]} in[j]$$
(9.1)
(9.2)

These equations are recursive. We solve these by fixed-point iteration, as shown in appendix A: We initialise all in[i] and out[i] to be empty sets and repeatedly calculate new values for these until no changes occur. This will eventually happen, since we work with sets with finite support (*i.e.*, a finite number of possible values) and because adding elements to the sets out[i] or in[j] on the right-hand sides of the equations can not reduce the number of elements in the sets on the left-hand sides. Hence, each iteration will either add elements to some set (which we can do only a finite number of times) or leave all sets unchanged (in which case we are done). It is also easy to see that the resulting sets form a solution to the equation – the last iteration essentially verifies that all equations hold. This is a simple extension of the reasoning used in section 2.6.1.

The equations work under the assumption that all uses of a variable are visible in the code that is analysed. If a variable contains, e.g., the output of the program,

```
1: a := 0
2: b := 1
3: z := 0
4: LABEL loop
5: IF n = z THEN end ELSE body
6: LABEL body
7: t := a + b
8: a := b
9: b := t
10: n := n - 1
11: z := 0
12: GOTO loop
13: LABEL end
```

Figure 9.2: Example program for liveness analysis and register allocation

it will be used after the program finishes, even if this is not visible in the code of the program itself. So we must ensure that the analysis makes this variable live at the end of the program.

Equation 9.2, similarly, is ill-defined if succ[i] is the empty set (which is, typically, the case for any instruction that ends the program), so we make a special case: out[i], where i has no successor, is defined to be the set of all variables that are live at the end of the program. This definition replaces (for these instructions only) equation 9.2.

Figure 9.2 shows a small program that we will calculate liveness for. Figure 9.3 shows *succ*, *gen* and *kill* sets for the instructions in the program.

The program in figure 9.2 calculates the Nth Fibonacci number (where N is given as input by initialising n to N prior to execution). When the program ends (by reaching instruction 13), a will hold the Nth fibonacci number, so a is live at the end of the program. Instruction 13 has no successors ( $succ[13] = \emptyset$ ), so we set  $out[13] = \{a\}$ . The other out sets are defined by equation 9.2 and all in sets are defined by equation 9.1. We initialise all in and out sets to the empty set and iterate until we reach a fixed point.

The order in which we treat the instructions does not matter for the final result of the iteration, but it may influence how quickly we reach the fixed-point. Since the information in equations 9.1 and 9.2 flow backwards through the program, it is a good idea to do the evaluation in reverse instruction order and to calculate out[i] before in[i]. In the example, this means that we will in each iteration calculate the sets in the order

| i  | succ[i] | gen[i] | kill[i] |
|----|---------|--------|---------|
| 1  | 2       |        | а       |
| 2  | 3       |        | b       |
| 3  | 4       |        | z       |
| 4  | 5       |        |         |
| 5  | 6,13    | n,z    |         |
| 6  | 7       |        |         |
| 7  | 8       | a,b    | t       |
| 8  | 9       | b      | а       |
| 9  | 10      | t      | b       |
| 10 | 11      | n      | n       |
| 11 | 12      |        | z       |
| 12 | 4       |        |         |
| 13 |         |        |         |

Figure 9.3: succ, gen and kill for the program in figure 9.2

$$out[13], in[13], out[12], in[12], \dots, out[1], in[1]$$

Figure 9.4 shows the fixed-point iteration using this backwards evaluation order. Note that the most recent values are used when calculating the right-hand sides of equations 9.1 and 9.2, so, when a value comes from a higher instruction number, the value from the same column in figure 9.4 is used.

We see that the result after iteration 3 is the same as after iteration 2, so we have reached a fixed point. We note that n is live at the start of the program, which is to be expected, as n is expected to hold the input to the program. If a variable that is not expected to hold input is live at the start of a program, it might in some executions of the program be used before it is initialised, which is generally considered an error (since it can lead to unpredictable results and even security holes). Some compilers issue warnings about uninitialised variables and some compilers add instructions to initialise such variables to a default value (usually 0).

**Suggested exercises:** 9.1(a,b).

## 9.4 Interference

We can now define precisely the condition needed for two variables to share a register. We first define *interference*:

197

|    | Init   | ial   | Iterat     | ion 1      | Iterat     | ion 2      | Iterat     | ion 3      |
|----|--------|-------|------------|------------|------------|------------|------------|------------|
| i  | out[i] | in[i] | out[i]     | in[i]      | out[i]     | in[i]      | out[i]     | in[i]      |
| 1  |        |       | n,a        | n          | n,a        | n          | n,a        | n          |
| 2  |        |       | n, a, b    | n,a        | n, a, b    | n,a        | n, a, b    | n,a        |
| 3  |        |       | n, z, a, b | n, a, b    | n, z, a, b | n, a, b    | n, z, a, b | n, a, b    |
| 4  |        |       | n, z, a, b |
| 5  |        |       | a,b,n      | n, z, a, b | a,b,n      | n, z, a, b | a,b,n      | n, z, a, b |
| 6  |        |       | a,b,n      | a,b,n      | a,b,n      | a,b,n      | a,b,n      | a,b,n      |
| 7  |        |       | b,t,n      | a,b,n      | b,t,n      | a,b,n      | b,t,n      | a,b,n      |
| 8  |        |       | t, n       | b,t,n      | t, n, a    | b,t,n      | t, n, a    | b,t,n      |
| 9  |        |       | n          | t, n       | n, a, b    | t, n, a    | n, a, b    | t, n, a    |
| 10 |        |       |            | n          | n, a, b    | n, a, b    | n, a, b    | n, a, b    |
| 11 |        |       |            |            | n, z, a, b | n,a,b      | n, z, a, b | n, a, b    |
| 12 |        |       |            |            | n, z, a, b |
| 13 |        |       | а          | а          | а          | a          | a          | а          |

Figure 9.4: Fixed-point iteration for liveness analysis

**Definition 9.2** A variable x interferes with a variable y if  $x \neq y$  and there is an instruction i such that  $x \in kill[i]$ ,  $y \in out[i]$  and instruction i is not x := y.

Two different variables can share a register precisely if neither interferes with the other. This is almost the same as saying that they should not be live at the same time, but there are small differences:

- After x := y, x and y may be live simultaneously, but as they contain the same value, they can still share a register.
- It may happen that x is not in out[i] even if x is in kill[i], which means that we have assigned to x a value that is definitely not read from x later on. In this case, x is not technically live after instruction i, but it still interferes with any y in out[i]. This interference prevents an assignment to x overwriting a live variable y.

The first of these differences is essentially an optimisation that allows more sharing than otherwise, but the latter is important for preserving correctness. In some cases, assignments to dead variables can be eliminated, but in other cases the instruction may have another visible effect (*e.g.*, setting condition flags or accessing memory) and hence can not be eliminated without changing program behaviour.



Figure 9.5: Interference graph for the program in figure 9.2

We can use definition 9.2 to generate interference for each assignment statement in the program in figure 9.2:

| Instruction | Left-hand side | Interferes with |
|-------------|----------------|-----------------|
| 1           | а              | n               |
| 2           | b              | n,a             |
| 3           | z              | n, a, b         |
| 7           | t              | b,n             |
| 8           | а              | t,n             |
| 9           | b              | n, a            |
| 10          | n              | a,b             |
| 11          | z              | n, a, b         |

We will do *global register allocation*, *i.e.*, find for each variable a register that it can stay in at all points in the program (procedure, actually, since a "program" in terms of our intermediate language corresponds to a procedure in a high-level language). This means that, for the purpose of register allocation, two variables interfere if they do so at *any* point in the program. Also, even though interference is defined in an assymetric way in definition 9.2, the conclusion that the two involved variables cannot share a register is symmetric, so interference defines a symmetric relation between variables. A variable can never interfere with itself, so the relation is not reflective.

We can draw interference as an undirected graph, where each node in the graph is a variable, and there is an edge between nodes x and y if x interferes with y (or *vice versa*, as the relation is symmetric). The *interference graph* for the program in figure 9.2 is shown in figure 9.5.

# 9.5 Register allocation by graph colouring

Two variables can share a register if they are not connected by an edge in the interference graph. Hence, we must assign to each node in the interference graph a register number such that:

- 1) Two nodes that share an edge have different register numbers.
- 2) The total number of different register numbers is no higher than the number of available registers.

This problem is well-known in graph theory, where it is called *graph colouring* (in this context a "colour" is a register number). It is known to be NP-complete, which means that no effective (*i.e.*, polynomial-time) method for doing this optimally is known. In practice, this means that we need to use a heuristic method, which will often find a solution but may give up in some cases even when a solution does exist. This is no great disaster, as we must deal with non-colourable graphs anyway (by moving some variables to memory), so at worst we get slightly slower programs than we would get if we could colour the interference graphs optimally.

The basic idea of the heuristic method we use is simple: If a node in the graph has strictly fewer than N edges, where N is the number of available colours (*i.e.*, registers), we can set this node aside and colour the rest of the graph. When this is done, the (at most N-1) nodes connected by edges to the selected node can not possibly use all N colours, so we can always pick a colour for the selected node from the remaining colours.

We can use this method to four-colour the interference graph from figure 9.5:

- 1) z has three edges, which is strictly less than four. Hence, we remove z from the graph.
- 2) Now, a has less than four edges, so we also remove this.
- 3) Only three nodes are now left (*b*, *t* and *n*), so we can give each of these a number, *e.g.*, 1, 2 and 3 respectively for nodes *b*, *t* and *n*.
- 4) Since three nodes (*b*, *t* and *n*) are connected to *a*, and these use colours 1, 2 and 3, we must choose a fourth colour for *a*, *e*.*g*., 4.
- 5) z is connected to a, b and n, so we choose a colour that is different from 4, 1 and 3. Giving z colour 2 works.

The problem comes if there are no nodes that have less than *N* edges. This in itself does not imply that the graph is uncolourable. As an example, a graph with four nodes arranged and connected as the corners of a square can, even though all nodes

have two neighbours, be coloured with two colours by giving opposite corners the same colour. This leads to the following so-called "optimistic" colouring heuristics:

## Algorithm 9.3

initialise: Start with an empty stack.

**simplify:** If there is a node with less than N edges, put this on the stack along with a list of the nodes it is connected to, and remove it and its edges from the graph.

If there is no node with less than N edges, pick any node and do as above.

If there are more nodes left in the graph, continue with **simplify**, otherwise go to **select**.

**select:** Take a node and its list of connected nodes from the stack. If possible, give the node a colour that is different from the colours of the connected nodes (which are all coloured at this point). If this is not possible, colouring fails and we mark the node for spilling (see below).

If there are more nodes on the stack, continue with select.

The idea in this algorithm is that, even though a node has N or more edges, some of the nodes it is connected to may have been given identical colours, so the total number of colours used for these nodes is less than N. If this is the case, we can use one of the unused colours. If not, we must mark the node for spill.

There are several things left unspecified by algorithm 9.3:

- Which node to choose in **simplify** when none have less than N edges, and
- Which colour to choose in **select** if there are several choices.

If we choose perfectly in both cases, algorithm 9.3 will do optimal colouring. But perfect choices are costly to compute so, in practice, we will sometimes have to guess. We will, in section 9.7, look at some ideas for making qualified guesses. For now, we just make arbitrary choices.

**Suggested exercises:** 9.1(c,d).

# 9.6 Spilling

If the **select** phase is unable to find a colour for a node, algorithm 9.3 cannot colour the graph. This means we must give up on keeping all variables in registers throughout the program. We must, hence, select some variables that will reside in memory

9.6. SPILLING 201

(except for brief periods). This process is called *spilling*. Obvious candidates for spilling are variables at nodes that are not given colours by **select**. We simply mark these as *spilled* and continue **select** with the rest of the stack, ignoring spilled nodes when selecting colours for the remaining nodes. When we finish algorithm 9.3, several variables may be marked as spilled.

When we have chosen one or more variables for spilling, we change the program so these are kept in memory. To be precise, for each spilled variable *x* we:

- 1) Choose a memory address *address* $_x$  where the value of x is stored.
- 2) In every instruction i that reads or assigns x, we locally in this instruction rename x to  $x_i$ .
- 3) Before an instruction *i* that reads  $x_i$ , insert the instruction  $x_i := M[address_x]$ .
- 4) After an instruction *i* that assigns  $x_i$ , insert the instruction  $M[address_x] := x_i$ .
- 5) If x is live at the start of the program, add an instruction  $M[address_x] := x$  to the start of the program. Note that we use the original name for x here.
- 6) If x is live at the end of the program, add an instruction  $x := M[address_x]$  to the end of the program. Note that we use the original name for x here.

After this rewrite of the program, we do register allocation again. This includes re-doing the liveness analysis, since we have added new variables  $x_i$  and changed the liveness of x. We may optimise this a bit by repeating the liveness analysis only for the affected variables ( $x_i$  and x), as the results will not change for the other variables.

It may happen that the subsequent new register allocation will generate additional spilled variables. There are several reasons why this may be:

- We have ignored spilled variables when selecting colours for a node in the
  select phase. When the spilled variables are replaced by new variables, these
  may use colours that would otherwise be available, so we may end up with
  no choices where we originally had one or more colours available.
- The choices of nodes to remove from the graph in the **simplify** phase and the colours to assign in the **select phase** can change, and we might be less lucky in our choices, so we get more spills.

If we have at least as many registers as the number of variables used in a single instruction, all variables can be loaded just before the instruction, and the result can be saved immediately afterwards, so we will eventually be able to find a colouring

| Node | Neighbours | Colour |
|------|------------|--------|
| n    |            | 1      |
| t    | n          | 2      |
| b    | t,n        | 3      |
| a    | b, n, t    | spill  |
| z    | a,b,n      | 2      |

Figure 9.6: Algorithm 9.3 applied to the graph in figure 9.5

by repeated spilling. If we ignore the CALL instruction, no instruction in the intermediate language uses more than two variables, so this is the minimum number of registers that we need. A CALL instruction can use an unbounded number of variables as arguments, possibly even more than the total number of registers available, so it is unrealistic to expect all arguments to function calls to be in registers. We will look at this issue in chapter 10.

If we take our example from figure 9.2, we can attempt to colour its interference graph (figure 9.5) with only three colours. The stack built by the **simplify** phase of algorithm 9.3 and the colours chosen for these nodes in the **select** phase are shown in figure 9.6. The stack grows upwards, so the first node chosen by **simplify** is at the bottom. The colours (numbers) are, conversely, chosen top-down as the stack is popped. We can choose no colour for a, as all three available colours are in use by the neighbours b, n and t. Hence, we mark a as spilled. Figure 9.7 shows the program after spill code has been inserted. Note that, since a is live at the end of the program, we have inserted a load instruction at the end of the program. Figure 9.8 shows the interference graph for the program in figure 9.7 and figure 9.9 shows the stack used by algorithm 9.3 for colouring this graph, showing that colouring with three colours is now possible.

**Suggested exercises:** 9.1(e).

#### 9.7 Heuristics

When the **simplify** phase of algorithm 9.3 is unable to find a node with less than *N* edges, some other node is chosen. So far, we have chosen arbitrarily, but we may apply some heuristics (qualified guessing) to the choice in order to make colouring more likely or reduce the number of spilled variables:

• We may choose a node with close to *N* neighbours, as this is likely to be colourable in the **select** phase anyway. For example, if a node has exactly *N* 

9.7. HEURISTICS 203

```
1: a_1 := 0
   M[address_a] := a_1
2: b := 1
3: z := 0
4: LABEL loop
5: If n = z THEN end ELSE body
6: LABEL body
   a_7 := M[address_a]
7: t := a_7 + b
8: a_8 := b
   M[address_a] := a_8
9: b := t
10: n := n - 1
11: z := 0
12: GOTO loop
13: LABEL end
   a := M[address_a]
```

Figure 9.7: Program from figure 9.2 after spilling variable a



Figure 9.8: Interference graph for the program in figure 9.7

| Node  | Neighbours | Colour |
|-------|------------|--------|
| n     |            | 1      |
| t     | n          | 2      |
| $a_8$ | t, n       | 3      |
| b     | t, n       | 3      |
| $a_7$ | b,n        | 2      |
| z     | b,n        | 2      |
| $a_1$ | n          | 2      |
| a     |            | 1      |

Figure 9.9: Colouring of the graph in figure 9.8

neighbours, it will be colourable if just two of its neighbours get the same colour.

- We may choose a node with many neighbours that have close to *N* neighbours of their own, as spilling this node may allow many of these neighbours to be coloured.
- We may look at the program and select a variable that does not cost so much to spill, *e.g.*, a variable that is not used inside a loop.

These criteria (and maybe others as well) may be combined into a single heuristic by giving numeric values describing how well a variable fits each criterion, multiplying each with a weight and then adding the results to give a weighted sum.

We have also made arbitrary choices when we pick colours for nodes in the **select** phase. We can try to make it more likely that the rest of the graph can be coloured by choosing a colour that is already used elsewhere in the graph instead of picking a colour that is used nowhere else. This will make it less likely that the nodes connected to an as yet uncoloured node will use all the available colours. A simple instance of this idea is to always use the lowest-numbered available colour.

A more advanced variant of this idea is to look at the uncoloured nodes connected to the node we are about to colour. If we have several choices of colour for the current node, we would like to choose a colour that makes it more likely that its uncoloured neighbours can later be coloured. If an uncoloured neighbour has neighbours of its own that are already coloured, we would like to use one of the colours used among these, as this will not increase the number of colours for nodes that neighbour the uncoloured neighbour, so we will not make it any harder to colour this later on. If the current node has several uncoloured neighbours, we can find the set of neighbour-colours for each of these and select a colour that occurs in as many of these sets as possible.

9.7. HEURISTICS 205

## 9.7.1 Removing redundant moves

An assignment of the form x := y can be removed from the code if x and y use the same register (as the instruction will have no effect). Most register allocators remove such redundant move instructions, and some even try to increase the number of assignments that can be removed by trying to allocate x and y in the same register whenever possible.

If x has already been given a colour by the time we need to select a colour for y, we can choose the same colour for y, as long as it is not used by any variable that y interferes with (including, possibly, x). Similarly, if x is uncoloured, we can give it the same colour as y if this colour is not used for a variable that interferes with x (including y itself). This is called *biased colouring*.

Another method of achieving the same goal is to combine *x* and *y* (if they do not interfere) into a single node before colouring the graph, and only split the combined node if the **simplify** phase can not otherwise find a node with less than *N* edges. This is called *coalescing*.

The converse of coalescing (called *live-range splitting*) can be used as well: Instead of spilling a variable, we can split its node by giving each occurrence of the variable a different name and inserting assignments between these when necessary. This is not quite as effective at increasing the chance of colouring as spilling, but the cost of the extra assignments is likely to be less than the cost of the loads and stores inserted by spilling.

## 9.7.2 Using explicit register numbers

Some operations may require their arguments or results to be in specific registers. For example, the integer multiplication instruction in Intel's IA-32 (x86) processors require the first argument to be in the eax register and puts the 64-bit result in the eax and edx registers. Also, as we shall see in chapter 10, function calls can require arguments and results to be in specific registers.

Variables used as arguments results to such operations must, hence, be assigned to these registers *a priori*, before the register allocation begins. We call these precoloured nodes in the interference graph. If two nodes that are precoloured to the same register interfere, we can not make a legal colouring of the graph. One solution would be to spill one or both so they no longer interfere, but that is rather costly.

A better solution is to insert move instructions that move the variables to and from the required registers before and after an instruction that requires specific registers. The specific registers must still be included as precoloured nodes in the interference graph, but are not removed from it in the **simplify** phase. Once only precoloured nodes remain in the graph, the **select** phase starts. When the **select** phase needs to colour a node, it must avoid colours used by all neighbours to the

node – whether they are precoloured or just coloured earlier in the **select** phase. The register allocator can try to remove some of the inserted moves by using the techniques described in section 9.7.1.

# 9.8 Further reading

Preston Briggs' Ph.D. thesis [12] shows several variants of the register-allocation algorithm shown here, including many optimisations and heuristics as well as considerations about how the various phases can be implemented efficiently. The compiler textbooks [35] and [9] show some other variants and a few newer developments. A completely different approach to register allocation that exploits the structure of a program is suggested in [42].

## **Exercises**

#### Exercise 9.1

Given the following program:

```
    LABEL start
    IF a < b THEN next ELSE swap</li>
    LABEL swap
    t := a
    a := b
    b := t
    LABEL next
    z := 0
    b := b mod a
    IF b = z THEN end ELSE start
    LABEL end
```

- a) Show *succ*, *gen* and *kill* for every instruction in the program.
- b) Assuming a is live at the end of the program, *i.e.*,  $out[11] = \{a\}$ , calculate in and out for every instruction in the program. Show the iteration as in figure 9.4.
- c) Draw the interference graph for a, b, t and z.
- d) Make a three-colouring of the interference graph. Show the stack as in figure 9.6.

EXERCISES 207

e) Attempt, instead, a two-colouring of the graph. Select variables for spill, do the spill-transformation as shown in section 9.6 and redo the complete register allocation process on the transformed program. If necessary, repeat the process until register allocation is successful.

#### Exercise 9.2

Three-colour the following graph. Show the stack as in figure 9.6. The graph *is* three-colour-able, so try making different choices if you get spill.



#### Exercise 9.3

Combine the heuristics suggested in section 9.7 for selecting nodes in the **simplify** phase of algorithm 9.3 into a formula that gives a single numerical score for each node, such that a higher score implies a stronger candidate for spill.

#### Exercise 9.4

Some processors (such as Motorola 68000) have two types of registers: data registers and address registers. Some instructions (such as load and store) expect their arguments or put their results in address registers while other instructions (such as multiplication and division) expect their arguments or put their results in data registers. Some operations (like addition and subtraction) can use either type of register. There are instructions for moving between address and data registers.

By adding the registers as nodes in the interference graph, a variable can be prevented from being allocated in a specific register by making it interfere with it.

- a) Describe how instructions that require argument or result variables to be in a specific type of register can ensure this by adding interference for its argument and result variables.
- b) The answer above is likely to cause spilling of variables that are used as both address and data (as they interfere with all registers). Describe how this can be avoided by taking care of this situation in the spill phase. Hint: Add register-to-register move instructions to the program.

c) If there are not enough registers of one type, but there are still available registers of the other type, describe how you can spill a variable to a register of the other type instead of to memory.

#### Exercise 9.5

Some processors have instructions that operate on values that require two registers to hold. Such processors usually require these values to be held in pairs of adjacent registers, so the instructions only need specify one register number per value (as the other part of the value is implicitly stored in the following register).

We will now look at register allocation where some values must be alloacetd in register pairs. We note that live-ness analysis is unaffected, so only colouring and spill is affected. Hence, we start with an interference graph where some nodes are marked as requiring register pairs.

- a) Modify algorithm 9.3 to take register pairs into account. Focus on correctness, not efficiency. You can assume "colours" are numbers, so you can talk about adjacent colours, the next colour, *etc*.
- b) Describe for the **simplify** phase of algorithm 9.3 heuristics that take into account that some nodes require two registers.
- c) Describe for the **select** phase of algorithm 9.3 heuristics that take into account that some nodes require two registers.