# A Divide-and-Conquer Approach for Cell Routing using Litho-friendly Layouts

Author:
Alexandre Vidal Obiols

Supervisor: Jordi Cortadella Fortuny

Dept. de Llenguatges i Sistemes Informàtics

Facultat d'Informàtica de Barcelona



June 4, 2013

#### DADES DEL PROJECTE

Títol del projecte: A Divide-and-Conquer Approach for Cell

Routing using Litho-friendly Layouts

Nom de l'estudiant: Alexandre Vidal Obiols

Titulació: Enginyeria Informàtica

Crèdits: 37,5

Director: Jordi Cortadella Fortuny

Departament: Llenguatges i Sistemes Informàtics

## MEMBRES DEL TRIBUNAL (nom i signatura)

President: Jordi Petit Silvestre

Vocal: Roser Rius Carrasco

Secretari: Jordi Cortadella Fortuny

## QUALIFICACIÓ

 $Qualificaci\'o\ num\`erica:$ 

Qualificaci'o~descriptiva:

Data:

# **Preface**

The aim of this project is to develop divide-and-conquer algorithms for the routing of signals in nanoelectric standard cells. We work on an already existing cell synthesis framework based on solving a boolean satisfiability formulation of the routing problem. This approach takes too much computational time when dealing with big and complex cells. Our goal is to make such cells become tractable. Different ways to route a part of a cell and to use this partial routings to solve the routing of a cell are explored. We evaluate different strategies and try to find the ones with the best solution for a given cell and design rules on terms of computational time. [Some conclusions]

Chapter 1 provides background on logic synthesis, the routing process, circuit physical design considerations and the applications of the satisfiability problem. The following chapter is about CellRouter, the cell routing framework this project aims to improve, including a brief description of what it is based on and some experimental results. Chapter 3 is devoted to explain what the initial considerations of the project were with respect to the original tool and the implementation decisions that were made. It includes an overview of some of the dividing decisions that have been used as a divide-and-conquer strategy. Chapter 4 includes several experiments conducted using the Nangate Open Cell Library and another cell library created for testing purposes. Finally, conclusions and future work directions are given in chapter 5.

# Contents

| 1        | Bac                                                  | kground                               | 1                                                  |
|----------|------------------------------------------------------|---------------------------------------|----------------------------------------------------|
|          | 1.1                                                  | VLSI and EDA                          | 1                                                  |
|          | 1.2                                                  | Cell Routing                          | 5                                                  |
|          |                                                      | 1.2.1 General Considerations          | 5                                                  |
|          |                                                      | 1.2.2 General-purpose Routing         | 6                                                  |
|          |                                                      | 1.2.3 Global Routing                  | 8                                                  |
|          |                                                      | 1.2.4 Detailed Routing                | 9                                                  |
|          |                                                      |                                       | 10                                                 |
|          | 1.3                                                  | Design Considerations                 | 1                                                  |
|          |                                                      | 1.3.1 Standard Cell Design            | 1                                                  |
|          |                                                      | 1.3.2 Manufacturability-Aware Design  | 4                                                  |
|          | 1.4                                                  | · · · · · · · · · · · · · · · · · · · | 15                                                 |
|          |                                                      | 1.4.1 SAT problem                     | 15                                                 |
|          |                                                      | 1.4.2 Using SAT                       | 18                                                 |
|          |                                                      |                                       |                                                    |
| <b>2</b> | Cel                                                  | lRouter 2                             | 1                                                  |
| 2        | <b>Cel</b> 2.1                                       |                                       | 2 <b>1</b><br>21                                   |
| 2        |                                                      | The Routing Problem                   |                                                    |
| 2        | 2.1                                                  | The Routing Problem                   | 21                                                 |
| 2        | 2.1<br>2.2                                           | The Routing Problem                   | 21<br>23                                           |
| 3        | 2.1<br>2.2<br>2.3<br>2.4                             | The Routing Problem                   | 21<br>23<br>24                                     |
|          | 2.1<br>2.2<br>2.3<br>2.4                             | The Routing Problem                   | 21<br>23<br>24<br>26                               |
|          | 2.1<br>2.2<br>2.3<br>2.4<br>Dev                      | The Routing Problem                   | 21<br>23<br>24<br>26<br>29                         |
|          | 2.1<br>2.2<br>2.3<br>2.4<br>Dev<br>3.1               | The Routing Problem                   | 21<br>23<br>24<br>26<br>29<br>29                   |
|          | 2.1<br>2.2<br>2.3<br>2.4<br>Dev<br>3.1<br>3.2        | The Routing Problem                   | 21<br>23<br>24<br>26<br>29<br>32<br>35             |
|          | 2.1<br>2.2<br>2.3<br>2.4<br>Dev<br>3.1<br>3.2        | The Routing Problem                   | 21<br>23<br>24<br>26<br>29<br>29<br>35<br>36       |
|          | 2.1<br>2.2<br>2.3<br>2.4<br>Dev<br>3.1<br>3.2<br>3.3 | The Routing Problem                   | 21<br>23<br>24<br>26<br>29<br>29<br>35<br>36<br>39 |
|          | 2.1<br>2.2<br>2.3<br>2.4<br>Dev<br>3.1<br>3.2        | The Routing Problem                   | 21<br>23<br>24<br>26<br>29<br>29<br>35<br>36       |

| 4                                                                                                                                                                                     | $\operatorname{Res}$ | ults                          | <b>49</b> |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------|-------------------------------|-----------|
|                                                                                                                                                                                       | 4.1                  | Nangate Open Cell Library     | 50        |
|                                                                                                                                                                                       | 4.2                  | CatLib Cell Library           | 50        |
| 5                                                                                                                                                                                     | Con                  | aclusions                     | <b>53</b> |
|                                                                                                                                                                                       | 5.1                  | Cost study                    | 53        |
|                                                                                                                                                                                       | 5.2                  | Future work                   | 54        |
|                                                                                                                                                                                       |                      | 5.2.1 Multiple Rows           | 54        |
|                                                                                                                                                                                       |                      | 5.2.2 Many More Things        | 54        |
| Bi                                                                                                                                                                                    | bliog                | graphy                        | 55        |
| $\mathbf{G}$                                                                                                                                                                          | lossa                | ry                            | 58        |
| 4.1 Nangate Open Cell Library 4.2 CatLib Cell Library  5 Conclusions 5.1 Cost study 5.2 Future work 5.2.1 Multiple Rows 5.2.2 Many More Things  Bibliography  Glossary  A Grid Format | 61                   |                               |           |
| В                                                                                                                                                                                     | Cell                 | Router Command Line Interface | 65        |

# Chapter 1

# Background

### 1.1 VLSI and EDA

The complexity of Integrated Circuits (ICs) has been growing year after year since they were first introduced. According to Moore's Law [1], the density of transistors on chips doubles approximately every 2 years. This tendency has been followed for the last 40 years, but of course such a fast-paced evolution of the number of transistors comes with a lot of challenges at many levels, such as technology, design and tools. Very Large System Integration (VLSI) is the technology that allows combining thousands of transistors in a single chip such as a microprocessor. This field has been constantly evolving, trying to make faster chips and integrate more transistors generation after generation. As the number of transistors has dramatically increased over the years, the complexity of circuits has also increased enormously; and with it, the challenges associated to the design of such circuits.

The design of VLSI circuits is therefore a very complex process that requires automation. Electronic Design Automation (EDA) is a category of software tools for designing electronic systems such as ICs. This aid has been evolving together with the needs of VLSI design since the mid-70s. Nowadays, given the level of complexity that VLSI design has reached, EDA tools play a very important role in the fabrication of ICs.

Current workflows for the fabrication of chips are very modular. The Register-Transfer Level (RTL) design abstraction, which models synchronous digital circuits and focuses on the flow of digital signals between hardware registers, is used to describe a circuit. Hardware Description Languages (HDLs) such as Verilog and VHDL can be used to create such high-level

representation of circuits. EDA tools are used to design and implement technology-dependent circuits from a higher level of abstraction. The automated synthesis goes through a lot of steps to transform an RTL design into a geometrical layout, creating a physical design for the given RTL specification. Given such a high-level specification multiple final circuits can be considered valid, so there is room for a lot of decisions and optimization to be done in order to generate a good final design of a circuit.

Given an RTL circuit, these are some of the steps it goes through during the physical design phase. Note that all of these processes are automated by using the above mentioned EDA tools and we can consider that the output of one step is the input for the next one.

### Logic Synthesis

Given a circuit abstraction, such as an RTL circuit, logic synthesis returns an implementation of the circuit in terms of logic gates. The RTL design is translated into boolean expressions. These formulas can be optimized using exact methods such as the Quine-McCluskey algorithm [2, 3], heuristic methods such as Espresso [4] or kernel and boolean factorization. After these technology-independent steps, the formulas are mapped onto a given cell library resulting on a netlist in that technology library, using data structures such as Directed Acyclic Graphs (DAGs) and dynamic programming algorithms. To show a little example, this is a very simple module written in Verilog.

```
module some_logic(a, b, c, out);
  input a, b, c, out;
  output out;

assign out = (a & b) | c;
endmodule
```

Figure 1.1 represents the output of the logic synthesis phase receiving this code as an input.

#### Floorplanning

This is the first step in the physical design flow. Floorplanning consists in identifying structures that should be placed close together, capturing relative positions rather than fixed coordinates. It can be considered a generalization of placement, a first draft of how things will be al-



Figure 1.1: Logic gate level representation

located in the chip allowing transformations of the components such as rotations and modifying their shapes. It allows for later hierarchical approaches and enables global wiring as a preparation for detailed routing. Simulated annealing, trees and slicing structures, as well as dynamic programming for floorplanning optimization[5], are widely used in this area. They can be optimized for metrics such as area, wirelength, routability and others. Figure 1.2, taken from [6], shows two different floorplans for a given set of components. The floorplan on the left is optimal in area while the one in the right introduces white spaces.



Figure 1.2: (a) Optimal area floorplan (b) Non-optimal area floorplan

#### **Partitioning**

The netlist of the functions to implement can easily be very large. Partitioning is the process of dividing the chip into smaller blocks so that later partitioning and routing are easier, using a divide and conquer strategy to tackle design complexity. It is also a necessary step in the case of synthetisation on Field Programmable Gate Arrays (FP-GAs), where a mapping from the netlist to hardware is needed. Many variants of partitioning exist, such as two-way partitioning (one of the first approaches, [7]), multi-way partitioning, which can be seen as an extension of the min-cut for two-way partitioning, and multi-level partitioning, where the result is represented by a tree structure. Many more partitioning approaches can be found in [8].

### Placement

This step consists in assigning cells to positions in the chip according to some cost functions while preserving legality (for example, with no overlapping). The inputs are the netlists and the goal is to find the best position for each module considering wirelength, routability density, power and other metrics. Many placement styles exist depending on the design methodology it is integrated with (such as building blocks, standard cells or gate arrays). This step is tightly related to the next phase, routing. Some placement paradigms are:

- Constructive algorithms, such that when the position of a cell is fixed, it is not anymore modified. Some examples are cluster growth, min-cut [9], or quadratic-placement algorithms (such as Hall placement [10], the first analytical placer).
- Iterative algorithms, where intermediate placements are modified in order to improve some cost function. This would include analytical methods such as force-directed placement. Figure 1.3, taken from [6], shows several phases of the placement in a force-driven algorithm. The elements approach their final position iteration after iteration.
- Nondeterministic approaches, including metaheuristics like simulated annealing and genetic algorithms.

All these methods can be combined to obtain a more accurate result. Additionally other methods can be considered, for example a flow consisting of a global placement and legalization phase followed then by detailed placement step. There are many interesting research directions in placement such as manufacturability-aware placement, but probably the most interesting for our project would be routability-driven placement [11]. More information about placement algorithms can be found in [12].

#### Routing

The routing process determines the precise paths for nets on the chip layout respecting a set of design rules ensuring that the chip can be correctly manufactured. It requires a physical placement of the layout, the netslists and the design rules required by the fabrication process. The main aim then is to complete all required connections on the layout, although other objectives such as reducing total wirelength or meeting timing requirements have become of essential relevance in modern chip design. The routing phase represents a very complex combinatorial



Figure 1.3: Placement of a chip

problem. Usually, a two-step approach consisting of a global routing followed by a detailed routing is used. The first considers the connection between different regions of the chip, while the second focuses on obtaining a definite geometric layout for the wire connections.

These are only some examples of steps where algorithms have become indispensable for the design of ICs. As we can see, EDA tools have become a basic component of digital circuit design. All of these steps have an important algorithmic load and much effort has been invested in such crucial synthesis tools.

## 1.2 Cell Routing

Routing is one of the multiple steps that take place in the physical design process. During the last years many algorithmic techniques have been explored to address the complex problem of determining how the pins of circuits should be interconnected. As the number of transistors per chip grows, the increasing complexity of the design becomes a challenge for the routing stage. It is typically a very complex combinatorial problem that, as mentioned before, is usually solved using a two-stage approach: global routing and detailed routing. In this section we will overview both, as well as algorithms fitted for general routing.

#### 1.2.1 General Considerations

The main aim of the routing problem is to find a valid interconnection of terminals that honors a set of design rules. Typically, most routing algo-

rithms are based on graph-search techniques guided by parameters such as congestion and timing information, trying to find a balance of the net distribution among routing regions. For example, a chip might be partitioned into an array of tiles. Then the global router would find tile-to-tile paths for all nets on such a graph and use this information to guide the detailed router.

For the detailed routing step, two kinds of models exist: the grid-based and the gridless-based models. In the first, a grid is superimposed on the routing region and the detailed router finds routing paths in the grid. Gridless-based models, on the other hand, can use different wire widths and spacing. They have greater flexibility and can handle variable widths and spacing; however, grid-based routing is usually more efficient and easier for implementation given its lower complexity when compared to the other model.

When routing, two kinds of constraints appear: performance constraints and design-rule constraints. The objective of the performance constraints is to make connections meet the performance specifications provided by the chip designers. Design rules, on the other hand, are a set of additional constraints imposed by a given technology node that will have to be honored if we want the chip to be correctly manufactured. They impose restrictions on, for example, the minimum width of the wires or the wire-to-wire spacing. Another example of design rules is related to the layer models, which can be either reserved or unreserved. In the first case each layer is allowed only one routing direction, whereas the placement of wires with any direction is permitted in the other case. Most of the routers, however, use the reserved model because it has lower complexity and is much easier for implementation; as we will see, manufacturability has a great impact on many of the decisions taken during the physical design flow.

## 1.2.2 General-purpose Routing

As mentioned before, graph-based algorithms have been extensively used for both global and detailed routing. In this section we will introduce the *maze routing algorithm*, probably one of the most basic graph-search based algorithms.

Maze routing is based on a breadth-first-search. Consider the grid on figure 1.4. We want to connect the node marked with an S to the one marked with a T. The grayed zones represent obstacles where the wire cannot be placed.



Figure 1.4: Maze Routing - Wave Propagation

The maze router is composed of two basic steps: wave propagation and retracing. During the first step, starting from the source S, all adjacent nodes get labeled with a 1, which is the distance from such nodes to S. Later, all cells adjacent to the nodes labeled 1 get labeled with a 2. This continues until the node T is reached. We can see the wave propagation phase in figure 1.4 with the waves corresponding to 2 and 3. Once the node T is reached, as seen in figure 1.5(a), a shortest path from T to S can be retraced by following any path such that the labels of the nodes decreases as shown in figure 1.5(b). Often, the preferred path is the one with the least number of detours. Notice that this algorithm guarantees to find a path between two points if such a path exists and this path is the shortest one. Both figures illustrating the example have been taken from [6].

This algorithm was proposed by Lee in [13] and is also widely known as Lee's Algorithm. In practice, it is slow, memory consuming and difficult to apply to large-scale dense designs. Many methods have been proposed to reduce its running time and memory requirements. For example, alternative coding schemes for the nodes have managed to reduce the number of needed bits to merely two. Other variants include using DFS in combination with the BFS, starting point selection or double fan-out.



Figure 1.5: Maze Routing - Retracing

After this algorithm, many other graph-based search algorithm appeared. The most significant ones probably are Line-search routing [14] and algorithms based in the well known A\*-search proposed in [15], which are widely used in modern routers.

## 1.2.3 Global Routing

As exposed before, global routing can be considered the most coarse grain type of routing. It consists of defining routing regions, generating tentative routings for nets and associating them to routing regions, but without specifying the actual layout of the wires. There exist two kind of global routing algorithms which differ on the basic routing strategy: sequential and concurrent algorithms. Whereas the first ones try to route signals one by one, concurrent algorithms try to find a valid solution for all signals at once. We will see a small introduction to each strategy.

### Sequential global routing

In this schema we select a specific net order and the route nets sequentially according to that order. The quality of the solution greatly depends on the ordering given that an already routed net might block the routing of subsequent nets. Finding the optimal net ordering has been proven NP-hard. Often, a *rip-up and reroute* heuristic is used to refine the solution. It basically consists of ripping-up some already connected nets and then re-route the ripped-up connections. It usually

performs iteratively until all nets are routed, a time limit is exceeded or no gain is obtained. As we will see, CellRouter uses a similar heuristic to obtain better solutions once an initial legal routing has been found. The main drawback of this method is that, because of the dependence on net-ordering, if no feasible solution is obtained, it is not clear whether it does not exist or the chosen ordering was not good enough.

### Concurrent global routing

Concurrent global routing tries to establish all connections at the same time. Therefore, whether or not a solution is found does not depend on any net ordering. One of the most popular approaches is to model the layout as a graph and then use 0-1 integer linear programming. However, given it is NP-complete, another approach would be to solve the continuous linear programming relaxation and the transform the fractional solution to integer solutions through a rounding scheme such as randomized rounding. In practice, such techniques are embedded into larger global routing frameworks which use a hierarchical, divide-and-conquer strategy. For routing multi-pin nets instead of two-pin nets, those multi-pin nets are usually decomposed using minimum rectilinear Steiner trees.

## 1.2.4 Detailed Routing

Given the output of the global routing stage, they are used by the detailed router to determine the exact geometry of the nets in the chip. A popular type of detailed routing related to this project is full-chip routing. To cope with the scalability problem, routing frameworks use hierarchical and multilevel frameworks for large-scale designs. They use a divide-and-conquer approach by transforming large routing instances into smaller subproblems and later proceeding with a top-down, bottom-up or hybrid manner. In the first approach, the algorithm recursively divides the routing regions into smaller regions, routes the current level and refines the result in the next level. On the other hand, a bottom-up approach consists on initially partitioning the region into multiple small cells and, at each step, routing each region individually and merging it with its neighbors to form a larger supercell until the whole initial region is routed. The routing decisions made at any of the intermediate routing levels might be suboptimal, so hybrid approaches using both methods have also been explored. However, since routing decisions at a given level are irreversible, the quality of the solution is limited. This is a problem we will also face later in the project.

The same classification that we exposed for global routing algorithms can be applied to detailed routing algorithms. Sequential algorithms may not guarantee a solution even if it exists. For this reason, the most recent approaches tend to use concurrent routing. When doing detailed routing from a concurrent approach, two kind of algorithms can be considered, depending on the objects used to take routing decisions. Tree-based algorithms first generate a set of candidate routing trees using algorithms that generate multiple  $minimum\ rectilinear\ Steiner\ trees$ . Next, the problem is formulated as a multicommodity flow problem and solved as a 0-1 integer linear programming problem. On the other hand, segment-based algorithms take decisions at the level of individual metal segments. This version has a finer granularity, at the expense of more computational complexity. However, manufacturing constraints are more easily included in this scheme. Usually, SAT-based formulations are used to solve the routing problem under the segment-based approach.

## 1.2.5 Modern Routing

It is interesting to consider the previous work directly related to the tool this project is based on. The closest is proposed in [16], a segment-based approach inspired by the satisfiability formulation presented in [17]. However, [16] only managed to route small gates given the high computational complexity of the resulting formulation, as will be shown in chapter 2.

As we have seen, many kind of routing strategies and approaches exist. It is not by sticking to one of the methodologies that routing can be easily solved. Given the complexity of the problem, many kinds of routers exist that combine several of the ideas briefly exposed in this short introduction to routing. For example, BoxRouter 2.0 [18] is an academic global router that uses A\*-search considering the congestion history of edges, wire rip-up and rerouting and finally a progressive integer linear programming approach to do layer assignment. The routing problem is constantly evolving and new algorithms and techniques will for sure continue to arise in order to meet with the new requirements. Fabrication considerations are day after day becoming more determining during the physical design process.

The figures for this section where taken from *Electronic Design Automation: Synthesis, Verification, and Test*[6]. For more information on routing or VLSI algorithms in general take a look at that book or at *Handbook of Algorithms for Physical Design Automation*[19].

## 1.3 Design Considerations

When building digital circuits there exist multiple design styles, and the one used is chosen depending on the needs of the target design. Full-custom design is based on specifying the layout of each individual transistor and the interconnections between them. It potentially maximizes the performance and minimizes the area, but is very laborious to implement. Full-custom design is typically used in a situation where there are area limitations or special application needs. Some examples are the design of cells within a standard cell library, memory cells or datapaths for high performance designs. However, other design methodologies exist.

## 1.3.1 Standard Cell Design

In the standard cell methodology, low-level VLSI layouts are encapsulated in abstract logic representations (for example, as a NAND gate). This way, logical level becomes independent of physical level design. Using this design methodology, high-level design time becomes shorter as designs can be reused. Standard cell design relies on so-called standard cell libraries which contain primitive cells (such as an AND or an inverter) required for cell design. Additionally, more complex optimized cells can also be included. The cells in the library have a fixed height and variable width. This way they can be easily placed in rows easing the synthesis process. All cells on the row will be constructed according to a certain structure, for example the one shown in figure 1.6. They are normally optimized full-custom layouts so that area and delay are minimized.

Each one of these cells is described by the following views. Additionally, metrics such as timing, power and noise for each cell are provided.

#### Logical View

Cell's boolean logic formula, captured in a truth-table or boolean algebra equation, in the case of combinational logic, and a state transition table, in the case of sequential logic. Figure 1.7 is an example of a state transition table for an AND gate.

#### Schematic View

Description of the transistors, their connections to each other and the terminals to the external environment. Multiple possible netlists for a given logical view exist. Figure 1.8 is a schematic representation of an AND gate.



Figure 1.6: Possible physical structure

| State Table |    |    |  |  |
|-------------|----|----|--|--|
| A1          | A2 | ZN |  |  |
| L           | -  | L  |  |  |
| Н           | Н  | Н  |  |  |
| -           | L  | L  |  |  |

Figure 1.7: AND gate, logical view



Figure 1.8: AND gate, schematic view

13

#### Layout View

Physical representation of the cell. The most important from the manufacturing point of view, as it is the closest to the actual final design. Again, many possible layouts exist for a given schematic description of a cell. Figure 1.9 represents the layout of an AND gate.



Figure 1.9: AND gate, layout for schematic in figure 1.8

Given the project works on the routing of standard cells, a simple method of representation of such cells has been chosen to illustrate important concepts in this report. It does not intend to be formal or exact, but useful as an abstraction to show situations that happen in real routing problem instances. Figure 1.10 shows an example of how such a grid would look like. In such a 2D representation of a cell, all components sharing the same color are the ones that must be connected. When a terminal has no color, it means it is not important for what the figure intends to show so it is ignored. The voltage and ground signals are always on the top and the bottom of the cell. In this simple representation we will consider that wire segments can run along the columns and the indicated rows in the figure.



Figure 1.10: An example of a cell representation

## 1.3.2 Manufacturability-Aware Design

As we have seen, standard cell design provides help by encapsulating primitives so that design becomes more modular and regular. However, in the last years additional problems have been adding up to the challenges of IC design. As the size of transistors decrease, the manufacturing process has been becoming increasingly complex.

One of the crucial processes involved in chip manufacturing is photolithography. During this process, chemicals which either harden of soften when exposed to ultraviolet light are used. Such products are applied onto the surface of the die and are exposed to light trough a mask with the desired pattern. After this step, the softened parts are removed and another mask with the next pattern is used. Layers grow one above the other until all masks have been applied. The patterns on the masks are the final product that the physical design produces for a given circuit abstraction.

This whole lithographic process is crucial to the fabrication of ICs. Given the constant reduction in the size of the transistors as fabrication technology changes, the lithographic gap between such small component sizes and the light wavelength is having an increasingly important impact on the patterns used during the manufacturing process. Currently, Resolution Enhancement Techniques are applied to obtain transistor sizes much smaller than the light wavelength [20]. Such approach is limited to a certain amount of geometrical configurations, contributing to an enormous increase of layout design rules at each technology generation. The design costs increases for two reasons. On

one hand, the enormous effort required to verify such layouts. On the other hand, masks must be pre-distorted in order to compensate the distortions later introduced during the photolithography phase.

Litho-friendly layout techniques must be considered to deal with all this increasing complexity. These techniques exploit the use of one-dimensional features and gridded locations for layout elements. This complicates the design of standard cell libraries, that must now cope with such restrictions trying to provide the best possible area, performance and power consumption. However, using such techniques, manufacturability-aware physical layouts are produced, which are friendly to the fabrication process.

The aim of this project is to help in the synthesis of such manufacturability-aware standard cells. Today, it is a process where complex design rules are imposed. Given so, regularity has been introduced as a means to make design automation tractable. Such regularity can be seen for example in figure 1.6, where a gridded layout is used. A clear example of such methodology can be found in [21], which proposes a model where all layout elements are located on a grid of evenly spaced points, with the grid unit begin a fraction of the wavelength of the light which will later be used during the fabrication process. As it will be later explained, the tools used in this project use a similar grid approach and try to be as technology-independent as possible, so that any desired design rule can be specified and the output of the tools will be a physical design adapted to the desired technology constraints.

## 1.4 Boolean Satisfiability Problem

As explained before, EDA tools rely on algorithms to tackle a variety of computational problems which arise during IC design. One of the algorithmical problems that has attracted much attention during the last years has been the Boolean Satisfiability Problem (SAT). As we will see, much progress has been done and it can be used to face real industrial problems. For example, the routing tool used in this project is based on SAT.

## 1.4.1 SAT problem

First of all we will formalize the SAT problem and give some nomenclature that will be used in the later chapters. Suppose we have a set of *variables*,

$$P = p, q, r, \dots$$

We will define a formula as follows.

- 1. A variable is a formula.
- 2. If F is a formula,  $\neg F$  is a formula.
- 3. If F and G are formulas,  $(F \wedge G)$  is a formula.
- 4. If F and G are formulas,  $(F \vee G)$  is a formula.

We will now define an  $interpretation\ I$  over P as a map

$$I: P \mapsto \{0, 1\}$$

It can be thought of as defining a value, either 0 or 1 (which can be read as false or true), to every variable in P.

We will also define the function

$$eval_I: F \mapsto \{0,1\}$$

as follows.

- 1.  $eval_I(p) = I(p)$
- 2.  $eval_I(\neg F) = 1 eval_I(F)$
- 3.  $eval_I(F \wedge G) = min(eval_I(F), eval_I(G))$
- 4.  $eval_I(F \vee G) = max(eval_I(F), eval_I(G))$

We will also consider that I satisfies  $F(I \models F)$  if  $eval_I(F) = 1$ .

For example, over the variable set P that has been defined before, these would be well-constructed formulas.

- $\bullet \ F_1 = p$
- $\bullet \ F_2 = r$
- $F_3 = p \wedge q$
- $F_4 = p \land \neg p$
- $F_5 = (p \lor q) \land r$

Let's define an interpretation  $I_1$  for P.

$$I_1(p) = 1, I_1(q) = 1, I_1(r) = 0$$

Under that interpretation, if we evaluate all five formulas we obtain

- $eval_{I_1}(F_1) = eval_{I_1}(p) = 1$
- $eval_{I_1}(F_2) = eval_{I_1}(r) = 0$
- $eval_{I_1}(F_3) = eval_{I_1}(p \wedge q) = 1$
- $eval_{I_1}(F_4) = eval_{I_1}(p \wedge \neg p) = 0$
- $eval_{I_1}(F_5) = eval_{I_1}((p \vee q) \wedge r) = 1$

Now, let's define a model to be an interpretation for which a given formula evaluates to 1. For example,  $I_1$  is a model of  $F_1$ , whereas it is not a model of  $F_2$ . We will say that a formula is satisfiable if it has at least a model, and we will say it is unsatisfiable if there is no I such that  $eval_I(F) = 1$ . In the case above,  $F_3$  is clearly satisfiable, for  $I_1$  is a model for it. On the other hand,  $F_4$  is clearly unsatisfiable since, for any interpretation I,  $p \land \neg p$  evaluates to 0.

The SAT problem now is straightforward to define. Given a formula, is there any interpretation that satisfies it? Or, in other words, is there any variable assignment such that the formula evaluates to 1? From the examples above, we can very easily see that all formulas except  $F_4$  are satisfiable.

We will consider that any formula used as an input to SAT is in Conjunctive Normal Form (CNF). First, let's define a *literal* as a variable or the negation of a variable  $(l_1, \neg l_1, l_3, \ldots)$ . Second, let's define a *clause* as a disjunction of literals  $(l_1 \lor l_2, \neg l_1 \lor l_3, \ldots)$ . Now, we will say a formula is a CNF if it is a conjunction of zero or more clauses, such as

$$(l_1 \vee l_2) \wedge (\neg l_1 \vee l_3) \wedge (\neg l_2 \vee \neg l_3)$$

Note that for a given F there always exists a CNF G such that  $G \equiv F$ . CNFs are used because the Tseitin Transformation [22] allows to obtain a CNF from any arbitrary formula such that it is satisfied only by the interpretations that satisfied the original formula, with only a linear growth compared to the original one. Solving the SAT problem for a formula in Disjunctive

Normal Form (DNF), defined as the CNF but changing disjunctions for conjunctions and viceversa, would be achievable in linear time by scanning the clauses until a satisfiable clause appeared. However, no transformation such that an arbitrary formula is transformed into a DNF and avoids an exponential growth has been found yet.

It is important to note that SAT is NP-Complete, in fact the first one to be known [23]. Some restricted versions are known to be solvable in polynomial time, such as 2SAT and HORN-SAT. However, even if it is NP-Complete, many practical instances can be solved in affordable time. Efficient and scalable algorithms for SAT developed in the last years have contributed to the use of SAT-solving engines as an essential tool in EDA.

## 1.4.2 Using SAT

The SAT problem is of central importance in many areas of computer science and industry. How can the SAT problem help in problems apparently as unrelated as industrial planning, scheduling of football leagues or the routing of standard cells? It is done by reducing a problem to SAT.

Consider a black-box SAT-solver, such that it receives a CNF F as an input and it returns "YES" if satisfiable, with a model that satisfies it, or "NO" otherwise. Reducing a problem to SAT consists on encoding our problem in a formula that we can give as an input to a SAT-solver in such a way that we can, in return, construct a solution to our problem from the answer the SAT-solver has provided.

Let's see a simple reduction. We will use SAT to solve the k-CLIQUE problem. Given a graph of size N and an integer k, k-CLIQUE returns "YES" if there is a totally connected subgraph of size k, "NO" otherwise. We will use the following variables.

 $p_{i,j}$  ="The *i*-th node in the graph is the *j*-th node in the clique"

Now we will explain how to construct a CNF such that, if it is satisfiable we can get a k-clique from the graph, and there is no k-clique otherwise. We will have four groups of clauses.

1. For every node in the clique, it must be at least one of the nodes of the graph. We can encode this clause as

$$p_{1,j} \vee p_{2,j} \vee \ldots \vee p_{N,j}$$

$$\forall j 1 \leq j \leq k$$

2. For every node in the clique, it must be at most one of the nodes of the graph. We can encode this clause as

$$\neg p_{i,j} \lor \neg p_{i',j}$$

$$\forall i \forall i', i \neq i'$$

$$\forall j, 1 \leq j \leq k$$

3. For every node in the graph, it can't occupy two nodes in the clique.

$$\neg p_{i,j} \vee \neg p_{i,j'}$$

$$\forall j \forall j', j \neq j'$$

4. For every two positions in the clique, if there is no edge connecting their nodes, they cannot both be in the clique.

$$\neg p_{i,j} \lor \neg p_{i',j'}$$

 $\forall i \forall i', i \neq i'$  and no edge between nodes i and i'

$$\forall j \forall j', j \neq j'$$

Now, given an instance of k-CLIQUE, we would encode it in CNF form. We would use it as an input to SAT-solver and examine its output. If it returned "unsat", it would mean that no assignment of values to the variables renders the formula true, thus implying that no clique of k nodes exists. However, if it returned a model for the formula, observing the i index of the variables assigned to one we would be able to know which vertices are on the k-clique and which are not.

As we have seen, SAT can be used as a black-box tool to solve any problem that we can encode in a boolean formula. This approach is used by CellRouter. To solve the problem we must describe it in terms of a CNF and use a SAT-solver to obtain a solution. There are many ways of creating such a formula and the success of this approach depends to a great extent on how variables are picked and restrictions are codified. However, as stated before, SAT is a problem supposedly difficult to solve. If this approach is getting more attention is because a lot of work is being done on the field, not only on SAT solving but in constraint programming (with examples such as Satisfiability Modulo Theories [24]). Most modern SAT solvers are based in the DPLL algorithm, a systematic backtracking looking for a satisfiable assignment of the variables. However, a lot of additions and optimization have been added, such as conflict analysis, clause learning, backjumping, random restarts and heuristics. These methods have been proven empirically to be essential to solve large instances of SAT. For some more information on the use of SAT in EDA take a look at [25].

# Chapter 2

# CellRouter

As explained on the abstract, the aim of the project is to develop divideand-conquer strategies on a previously existing framework to route standard cells. This router, called CellRouter, uses a technology-independent and parametrizable approach which can be adapted to different fabrics and rules. It uses a boolean formulation of the problem to find a legal detailed routing of a cell represented by a gridded layout. However, as cells become larger, approaches such as the one this project explores become mandatory to keep SAT formulas tractable. In this section, basic insight on how the CellRouter tool works and necessary vocabulary that will be later extensively used will be given.

## 2.1 The Routing Problem

As explained in section 1.1, routing and manufacturability-aware design have attracted lots of attention in the last years. This routing tool addresses both issues by considering geometrical regularity for the routing process. As mentioned before, it is not the first time that a boolean formulation of the problem is presented [16, 17]. However, the complexity of the problem restricted the applicability to small cells. Additionally, algorithms based on using regular layout fabrics had already been proposed in [26], but they were specially customized for that fabric and a specific set of design rules.

The router we are working on proposes an algorithmic approach for a generic problem of cell routing, which has the following characteristics.

• Should be independent from the layout templates and the interconnect resources, so that it can be configured with the resources available at

any technology generation.

- Attributes should be allowed for every wire segment.
- Should allow the router to select the best terminal locations.
- Should be independent from the set of design rules.
- In the case of unroutable cells, externally connected pints should be allowed.
- A set of recommended design rules to improve yield should be specifiable.
- Wirelength should be a parameter for optimization.

To do so, the CellRouter tool uses an encoding scheme for SAT-based formulas that makes large cells tractable by applying windowing heuristics. A formalism to specify gridded design rules and multiple-patterning constraints is provided. CellRouter also uses heuristics for quality improvement (wirelength and recommended design rules) and allows the connection of external pins in case of unroutability.

Graphs are used to represent a gridded routing problem. Every net has a set of terminals that must be connected. Each terminal is represented by a set of vertices. Edges represent wire segments that can be used to connect pairs of vertices. The routing problem is defined as follows.

Find a set of edges that define routes connecting the terminals of each net. The routes must be disjoint (cannot have common vertices) and satisfy a set of design rules.

It is important to realize that the number of possible solutions is finite. It can be reduced to a SAT formula in which a variable is associated to every edge representing the presence or absence of a given signal in that position. To find such a solution with the maximum quality, CellRouter uses two steps.

- 1. Finding a legal solution that honors the design rules.
- 2. Improving the solution by iteratively re-routing nets and using quality terms in the cost function.

## 2.2 Routing Problem Representation

The routing region is represented by a 3D undirected grid graph G(V, E) as depicted in figure 2.1(a).



Figure 2.1: (a) Grid model for routing. (b) Physical and logical grid

The vertices have associated integer coordinates in  $\{1, ..., W\} \times \{1, ..., L\} \times \{1, ..., H\}$ , where W, L and H represent width, length and height. The edges of the graph connect grid points. Notice that the represented physical grid may not have a uniform distribution such as the one shown in the grid as can be seen in figure 2.1(b).

Every vertex v is denoted by its coordinates v = (x(v), y(v), z(v)). In our context, z(v) represents the layer of the layout, thus  $z(v) \in \{pd, m1, m2\}$  as shown in figure 2.1(a). Every edge will be denoted by its endpoints, as in e(v, u). We will also define a  $net\ n \subset V$  as a set of grid points, called terminals, that must be connected. A *subnet* will be a pair of terminals of the same net.

A Viewer program is provided in order to see how a grid looks like. Given the description of a grid it shows a 3D representation of it using OpenGL, allowing interaction with the model including zoom and rotation. Figure 2.2 represents an instance of a real routing problem. Each color represents a different net or signal. We can see in the lowest layer some terminals that need to be connected. On the top and bottom of the cell we can see the VDD (voltage, red) and VSS (ground, grey) lines crossing the second layer. On the bottom left side of the screen, a complete list of the signals that appear on the cell is provided.



Figure 2.2: Routing grid problem instance

# 2.3 SAT to Solve the Routing Problem

Once we have some vocabulary we can make a broad overview at how Cell-Router uses SAT to solve the routing problem. To do so, CellRouter codifies the routing problem to a CNF formula that can be given as an input to a SAT solver. First, some variables to model the problem are needed.

- $\rho(e)$ : A variable that represents when e is occupied by a wire.
- $\rho(e, n)$ : A set of variables that represent the associated net in case e is occupied by a wire.
- $\rho(e, n, s)$ : A set of variables that represent the sub-nets associated to every wire.

Given these variables, the Boolean formula  ${\cal F}$  that represents the problem is as follows.

$$F \equiv C \wedge R \wedge DR$$

Here we will have a brief description of the elements in each of the components of F.

### C, Consistency constraints

These clauses ensure the consistency of the formula. For example, make sure that if an edge is associated with a net, such net is occupied by a wire, or that if an edge is associated to some subnet of a net, it is also associated to that net.

#### R, Routability constraints

This clause set represents the constraints for the grid given its routability nature. For example, we must impose that each edge is assigned to at most one net and two adjacent wires are assigned to the same net. Additionally, windowing is considered. That is because, empirically, it has been shown that the route of a two-terminal subnet rarely spans beyond the bounding box determined by the two terminals. Thus, clauses that enforce the variables outside the region to be falsified can also be added. This might imply that no solution is found even if one exists, but it greatly improves the tractability of the problem.

#### DR, Design-rules contraints

Finally, these clauses represent constraints imposed by the user-defined set of design rules. Such design rules might impose, for example, that no adjacent vias can be connected to different nets. Extra clauses modeling wire attributes are included among this constrains.

CellRouter allows for any discrete set of attributes to be binary-encoded and incorporated to the formula. For example, wires might have two different widths (thin and thick) or could be assigned to different masks to comply with some patterning lithography rules. Additional variables with the form  $\rho(e,x)$  which represent the presence of attribute x in edge e are then added, as do the necessary clauses to deal with such attributes.

The formula F thus generated is given as an input to the *picosat* SAT solver which will return a satisfying model, if such exists. Given this model, a solution for the original routing problem can be obtained: this is the main goal of the router. In figure 2.3 we can see a solution to the routing problem in 2.2.

As stated before, among all valid solutions, some have better quality than others. For example, cells with smaller wirelength are preferred. We can at first glance notice that probably the solution proposed by the SAT solver in 2.3 has many elements that can be gotten rid of. CellRouter proposes a heuristic method to make this problem tractable using a 0-1 linear programming engine, *gurobi*. However, this model becomes intractable when



Figure 2.3: First obtained solution for 2.3

dealing with large cells. Large Neighborhood Search is then used to reduce complexity of the problem in combination with Integer Linear Programming (ILP). The algorithm consists on ripping-up and re-routing nets of the basic solution obtained using the SAT solver until no significant improvement is observed, which in practice takes two rounds of re-routing for each net. This strategy admits variants such as ripping and re-routing more than one net simultaneously; additionally, other aspects such as the ordering of the nets could be considered to search for even better local minima. Figure 2.4 shows an optimized version of the first obtained solution.

For more information on the grid data structure and what the command line interface of CellRouter is, refer to appendixes A and B.

## 2.4 Results

The CellRouter tool was used to synthesize the Nangate 45nm Open Cell Library, which contains 127 cells. All layouts were checked for DRC correctness and can be found in

http://layout.potipoti.org



Figure 2.4: Optimized solution for 2.2

The encoding scheme of CellRouter is compared to another SAT formulation of the routing problem presented in [27]. As we can see in table 2.1, CellRouters' sparse encoding outperforms the dense encoding that was used in the previous work. Additionally, it is interesting to see how the windowing heuristic has a big impact on the CPU time for routing cells, managing to divide the computation time at the expenses of losing some solutions. All library cells were routed in 1 hour and 5 minutes.

|           |      | Sparse | (w=2) | Sparse | (w=5) | Der  | nse [27] |
|-----------|------|--------|-------|--------|-------|------|----------|
| Cell      | Area | Size   | CPU   | Size   | CPU   | Size | CPU      |
| OAI221_X1 | 5    | 102    | 0.1   | 141    | 0.1   | 96   | 0.1      |
| HA_X1     | 9    | 198    | 0.1   | 249    | 0.1   | 239  | 29.4     |
| FA_X1     | 15   | 521    | 1.2   | 624    | 8.7   | 486  | 2959.0   |
| DFFS_X1   | 21   | 731    | 2.0   | 893    | 8.1   | 903  | 205.4    |
| SDFF_X1   | 25   | 657    | 2.3   | 1073   | 7.2   | 1380 | 1424.0   |
| OAI221_X1 | 33   | 1626   | 15.4  | 1944   | 98.1  | 2679 | 40 hours |

Table 2.1: Results for SAT solving (Size in 10<sup>3</sup> literals, CPU in secs.)

However, we must take into account that CellRouter has been applied to cells of a limited size. What happens when it has to deal with bigger, more complex cells? Given that the tool is based on a SAT-solver, and SAT is a hard problem, as soon as the complexity of the problem scales, it becomes intractable. This project aims to find a way for such hard cells to be routed and, to do so, it uses a technique that is not so new in the field of routing algorithms as we have seen in chapter 1: The divide-and-conquer approach.

# Chapter 3

# Development

In this section, the development of the project will be described. Following some preliminary considerations, multiple iterations were developed in order to look for the best possible solution to the proposed problem, which will be explained briefly. Finally, special care will be put in the explanation of the last version.

As we have seen in the previous section, CellRouter is a tool based on SAT that finds valid routings on grids. However, given the nature of SAT, when the problem grows on complexity or size it becomes intractable. The main aim of this project is help CellRouter dealing with such cases when normal brute-force SAT-solving is not enough. In order to do so, the chosen approach is one that has already shown up in the routing world: the Divide-and-conquer strategy.

## 3.1 Preliminary Study

When interacting with an already existing tool, knowing its ecosystem becomes of great relevance, as the project will need a very close interaction, if not modifying the tools themselves. Figure 3.1 shows the basic workflow in the placement and routing of a cell.

The input to the flow is a .pla file. It contains a linear placement of the nets that must be routed. It is given to the *Gridder*, which is a C++ binary that, given such placement and a .tpl template file, will output a .grd file. The template file is needed because it contains information related to the geometrical structure of the problem, for example the length of both p and n transistors. Figure 3.2 shows five possible templates that a given netlist



Figure 3.1: CellRouter data flow

could be adapted to. In fact, the decision of what template to use is crucial, since a routing problem solvable for a given template is not necessary solvable for another one. Figure 3.3 shows the result of gridding an AND gate using three different templates. The same placement is respected but the final grid may even vary on vertical width.

The outputted .grd file represents the grid that will be routed by the Cell-Router. More details about it can be found in appendix A. CellRouter has been implemented with C++ and it uses external software, mainly picosat for SAT solving and gurobi for optimization. Both processes are called internally from the CellRouter when needed. In addition to the .grd, a file specifying the design rules to be followed by the router is required. It consists of a .so library that is loaded dynamically, allowing different rule sets to be used for different instances of routing. Given such file and a .grd, CellRouter proceeds to find a valid assignment for the SAT formula and to optimize it using gurobi. More information on the interface of CellRouter can be found in appendix B.

It is important to note that the problem of finding a routing for a given



Figure 3.2: Five possible templates



Figure 3.3: Three different griddings for the same placement

grid is a combinatorial optimization problem. Not only a solution can be found (or not), but also the quality of the solution is relevant. The output, however, doesn't depend exclusively on the .grd input and the specified design rules. Given all the parameters that can be modified for the CellRouter, either when looking for a solution or during optimization (halo size, number of escapes, heuristic rounds...), the space search for even only a cell becomes big. As we will see later, the use of divide-and-conquer techniques will make such search space even greater. This is a problem since the best configuration to route a given cell won't necessarily be the best for the following to come, and it won't in many cases. This inherent difficulty must be taken

into account when designing the tool and the experiments.

Another important aspect to take into consideration is how to apply the divide-and-conquer scheme. As exposed in section 1.2, two approaches can be considered: bottom-up and top-down. The last one would imply to first create some general routing connections and then work at a lower level, whereas the first one implies first routing little zones and then expand the routing to bigger ones, such as an entire cell. In this project, the bottom-up methodology seems as the more natural approach. A top-down scheme would imply first routing the connections of parts far away in the cell, which would potentially occupy spaces needed for the detailed local routing. Additionally, a top-down approach should allow already decided routes to be modified and adapted when routing smaller regions. Given the characteristics of the tool, a bottom-up approach seems a more natural and easily implementable option. The idea would be to solve parts of a given cell so that finally the whole routing problem becomes easier than it was originally.

## 3.2 Implementation

Given that the original tool was developed using C++, the first implementation that was made was a C++ binary called CellDivider. Its aim was to begin interacting with the problem, getting used to the data structures and overall flow. CellDivider received the same input that was before given to CellRouter and directly interacted with the infrastructure that called picosat and gurobi. It used directly many of the classes that were used by the original tool. The grid data structures already existed in C++, so CellDivider focused on interacting with the problem through them. It did the most basic partition one could think of; given a cell as an input such as the one on figure 3.4, the program generated a new cell that was exactly the left half of the original one (figure 3.5, left side). The router was called from inside CellDivider, solved the left-hand part (figure 3.5, right side). When solving a half of the problem, the program did not consider any information of the remainder of the cell. Finally, CellDivider copied the solution onto the original cell, as shown in figure 3.6, and the partially routed problem was sent to the router to obtain a valid solution for the whole cell.

When the router receives a grid as an input, all the variables are fixed, including those which come from an earlier partial solution. This first version showed how a poorly chosen distribution on a partial routing could lead the total cell to become insatisfiable. It was tested against a subset of cells



Figure 3.4: Input for CellDivider



Figure 3.5: Partial problem and solution



Figure 3.6: Final routing problem

obtaining very different results, from cells that were insatisfiable to cells that were routed in a half of the original time. From this moment on, the focus was on obtaining partial solutions that were conscious of the rest of the cell, not leading to insatisfiable global problems.

Following this first version, it was decided to use python to continue with the project and to leave CellRouter as it was, interacting with it through the command line and external files instead of modifying the router itself. The idea was to work at a grid level, separating CellDivider from the other parts of the project. Python seemed a good implementation language given that the work CellDivider itself does is not computationally intensive and that much prototyping and testing had to be done. Ipython was used to develop the project; it is a framework that allows using Python in a more interactive fashion, with improved debugging and a enriched web-based editor.

Given that CellDivider was going to be developed using python and work at a grid level, with the grid being a C++ data structure, SWIG was used to interface the class and use it from python. However, many problems arose when trying to create the interface for such a complex class and the grid data structure was replicated in python. Complete separation from the C++ was achieved, so the development of CellDivider became independent of the internals of CellRouter. This involved some extra work as the original data structures were modified halfway through the project, which would have been easier if python accessed directly the C++ classes.

The final flow using CellDivider is as shown in figure 3.7. CellDivider reads the file cell.grd, containing the grid to be routed, and a configuration file, which includes the routes of the cells, binaries and the design rules file. Several smaller grid routing problems are created. For each one, CellDivider creates a temp.grd file and writes in it the grid that needs to be routed. Then it creates a subprocess and calls the CellRouter, which uses temp.grd as an input and routes it independently. When the router ends, the routed grid is saved in a temp.rte file and CellRouter returns the output to CellDivider, including information such as if the cell was routable or not, computation time and other metrics. Finally, CellDivider creates a cell.rte file where the original routed cell is saved, if a routing has been found, or announces the opposite otherwise.



Figure 3.7: CellDivider data flow

During the development of the project, many versions of CellDivider have

existed due to the experimental nature of the problem. All of them share the basic functions, which include reading the configuration file, routing a set of cells specified on a given file and calling the Gridder, Cellrouter or Viewer, extract information from their results. Aside from this common features, every version of CellDivider can be described in terms of their partition algorithm, how to prepare a partial problem, and their meta-algorithm, how to use the partial problems to obtain a global solution. The main task during the project has been coming up with several algorithms for both cases.

### 3.3 Partition algorithms

A partition algorithm refers to how to prepare a part of the cell to be routed. As explained before, the most basic approach would be to absolutely ignore what lies on the other sides of the cell when doing a partial routing. This, however, is dangerous because, as stated before, a bad partial solution could lead the global problem to be insatisfiable. For example, consider routing the cell in figure 3.8 by first routing the left half. Figure 3.9 shows what the partition would look like and a possible solution the that partial problem. Note that, on column 4, all horizontal positions are occupied. This is entirely valid on the partial solution as the green node does not need to be connected to any component in the right side. But when the partial solution is incorporated to the total cell and we try to route it we can not find a valid routing because the green net can not be connected as shown in figure 3.10.



Figure 3.8: Input for CellDivider

This example perfectly ilustrates the situation despite being a little unrealistic. However it must be kept in mind that when CellRouter returns a satisfing model it will probably not be optimal, so high-congested zones may appear. It is also important to notice that the design rules may be very strict, so finding a situation where a partial solution renders the whole



Figure 3.9: Partial problem and solution



Figure 3.10: Final routing problem

problem insatisfiable is quite usual. To cope with this problem, the following partitioning approaches try to keep some information on what lies on the parts of the cell that are not being routed in the moment.

### 3.3.1 Realistical Dividing

Consider a more general form of routing a part of the cell, where we want to solve the routing problem from column n to column m. The realistical dividing algorithm considers the central part of the cell, from columns n to m, and certain regions at their right and left. The left zone ranges from column ini to column n, whereas the right zone ranges from column m to end. The main idea is that any signal which has a terminal on the central zone and another one on any of the side zones should be granted an exit to the terminal outside the central zone. Additionally, if a signal has a terminal in the right and left zone but none in the central zone, it should be imposed that a crossing path exists.

In order to do so, first the ini and end columns are calculated. ini is the leftmost column where there is a terminal that must be connected to another terminal on either the central and right zone. end is defined the other way around. A temporal cell is created, where the width is end - ini. The zone

from n to m receives an exact copy of the original cell from n to m. The zone from ini to n and from m to end, however, will only contain the closer terminals to n and m such that they are required to be connected with the central zone or the opposite side. Finally, a remap of the positions where the iopins are considered valid is done and the cell can be routed.

Consider the cell in figure 3.11. We want to route the zone defined by n and m. However, we want to grant that both signals on column 6 have an exit to the other parts of the cell where they are required, which are columns 3 and 9. Additionally, on these two columns there is a terminal of the same signal that should cross the central zone. We need to set ini to column 3 and end to column 9. Now, the as explained earlier, the zone between n and m will have all the terminals included and the rest of the partial cell will only have the required terminals as shown in figure 3.12.



Figure 3.11: Input for CellDivider



Figure 3.12: Partial problem and solution

After the partial routing is completed, only the signals routed in the range from column n to column m are copied back to the original cell. All wires that are not in a path from one terminal to another are also removed. This would be the case of a wire that simply crossed the central region but did not

have any terminal in it. Since we only wanted to impose that such a path existed, the routing tool will occupy again those positions if needed. Figure 3.13 shows how the final cell looks like after the partial solution has been added. Notice that all wires that did not begin and end in the central zone have disappeared, but the path they occupied is available.



Figure 3.13: Final routing problem

Now let's take the example that was showed before where a cell was rendered insatisfiable. If the partial routing is done using realistical dividing, figure 3.14 shows how the partial routing problem and solution looks like. Given that there is a signal in column 2 whose closer terminal is in column 11 on the original cell, the partial cell will have to include up until that column. However, since it is the only signal shared between the n-m zone and the rest of the cell, it will be the only terminal from column 6 to column 10 in the partial cell. Finally, figure 3.15 shows how the final routing problem is. Notice that now a path for the signal that before rendered the cell unsatisfiable exists. This does not mean that now the total cell will have a valid routing, but it gives more chances that it will exist.



Figure 3.14: Partial problem and solution



Figure 3.15: Final routing problem

Notice that a number of variations exist for this partitioning algorithm. For instance, when copying the partial solution to the original cell, strictly all wires from n to m can be copied, not only the ones both beginning and ending in that range. One could also consider copying all wire of the partial solution, regardless of what zone they are in. Another option would be, when preparing the partial cell, including all terminals outside the central zone and not only those strictly required to be routed. It is specially useful when multiple partial routings are performed on a single cell, given that zones that might look empty for a partial solution could have already been occupied by an earlier routing. This option will be explored later in this chapter.

### 3.3.2 Approximation Dividing

This dividing approach is similar to the one described in the previous subsection. Once again we intend to route a cell from a column n to a column m. However, the idea is to approximate all terminals outside the routing zone that should be connected with signals that are between the nth and mth columns or that should simply cross that zone.

Consider the case of the right part of the zone we want to route. The algorithm will find, for every signal that is both in the right part and outside it, the terminal that is closer to the boundary between the zones. The partial cell can be created when the number of signals to approximate in both sides is known. Like in the realistical dividing approach, the zone between n and m is entirely copied. As for the other parts, respecting the closeness of every terminal to the boundary, fake terminals are added to represent that they must be connected with a terminal in that zone.

The following example illustrates how approximation dividing works. Figure 3.16 shows the cell we want to route. We decide to partially route the

cell between n and m. The signal on the top part of column 6 must be connected to the one on the top part of column 3 and both parts of column 10. In the partial problem, we approximate those terminals to the boundary of the zone we want to route. On the left part, the signals on columns 2 and 3 will approach column 4. On the right part we will proceed respecting which signal is closer to the mth column. First comes the signal in column 8, which already is close to the boundary. Following come the terminals on column 10. However, two of them exist at the same distance. In this case, we approach only one of them, the one in the zone which has a lower number of approached terminals - which is the top zone in this case. Finally, the terminal in column 11 also moves. The partial problem and a possible solution are shown in figure 3.17.



Figure 3.16: Input for CellDivider



Figure 3.17: Partial Problem and Solution

Figure 3.18 shows how the final routing problem looks like once the partial solution is copied. Notice that, since this method alters the geometry of the original problem even more than when using realistical dividing, all wire routed outside the n-m range need to be discarded. As we did before, only wires which both end and begin in between the nth and mth column are kept.



Figure 3.18: Final routing problem

Approximation dividing is useful when we are partially routing big cells that have signals which are potentially far away. We can come back to our example in figure 3.8. As we saw in figure 3.14, a lot of empty columns were added in the partial solution. Using approximation dividing, all that empty zone can be spared as shown in figure 3.19. The result when copying the information back to the original cell is the same as the one shown in figure 3.15. Avoiding all that white cell space simplifies the boolean formula, thus reducing the amount of memory and computation time that SAT will need to obtain a partial solution.



Figure 3.19: Partial problem and solution

As in the case of realistical dividing, many variations exist. For example, we could decide to add terminals in both the high and the low part of the cell, not only on one of them, when they appear on both areas in the original cell. This partitioning algorithm admits a pad parameter. It specifies the number of columns that will be left empty among the approximated terminals. It can be used to reduce congestion in the cell at the expenses of making it bigger. The examples of this section have considered a value of 0, but figure 3.20 show how the partial problem on figure ?? would look like with a different pad value.



Figure 3.20: Partial problem with pad = 1 and pad = 2

## 3.4 Meta-algorithms

Given a complete cell we want to route, meta-algorithms describe how to use partitioning algorithms, the ones described before, variations of completely different ones, in order to find a solution which is valid for the whole cell. Most CellDivider use a bottom-up approach consisting on routing parts of the cell and finally trying to route the entire cell with the information provided by the partial solutions. However, other approaches were a final global routing is not needed have also been explored.

It is important to remark that the number of possible meta-algorithms is enormous. Not only can they differ in the partitioning algorithms they use, but also in where and when they use them. Additionally, in every partial solution to the problem, parameters such as the halo have to be chosen. Any wrong decision could lead to an insatisfiable solution and, given the case, how the meta-algorithm reacts is also of great relevance.

This difficulties can be seen by considering a very simple example. Let us consider a cell which can be routed with a halo value of 5. Now we want to apply a meta-algorithm which is as simple as possible: dividing the cell in two. However, which parts should be partially routed, the left one, the right one or both? Using realistic dividing, approximation dividing or a variation of any of them? What halo should be used when routing the partial solutions? And when looking for a global routing?

The example above is one of the many meta-algorithms that have been tested. A variation is dividing the cell in an arbitrary number of parts, not

just two. The meta-algorithms that will be described in the following subsections try to find a more accurate way of deciding what to do with the cell they receive as an input. Many variations of them are also possible but have not been explored.

### 3.4.1 Congestion-Driven Routing

Metaalgorisme amb histograma n-m, longitud fixada.

### 3.4.2 Scan Routing

The aim of scan routing is to route the cell without the need of a final step where the whole cell is codified into a sat formula. In order to do so, the cell is divided in a given number of contiguous chunks which are then routed from left to right. After routing the ith part of the cell, all subnets which should both begin and end in the ith chunk or before are already routed. This way, after routing the final division of the cell, the whole cell has been routed. In order to do so, considering we want to route from column n to column m, the partitioning algorithm used is as follows.

For the part at the left of this zone we will use the approach of the realistic division algorithm with the variant where all the terminals and wires of the already routed zone will be included on the partial problem. In scan routing we consider that the left part of a partial routing has already been routed and incorporated to the final solution. If this is not regarded when routing the next part, maybe the new partial solution would need to assign certain signals to positions were a different wire has already been defined. The approximation divising approach can not be used in this part given that it does not respect the geometry of the cell and does easily lead to unsatisfiable solutions. When obtaining a partial result, all wires at the left of n will be considered as fixed and all wires at the right of m will be discarded, given that we only wanted to impose that a path from the zone we already routed and some signals at the right of column m existed.

On the right part of the zone we want to route we will use the approximation routing approach. This is because this way we can potentially save a lot of blank space when looking for the partial solution. It is not this important to keep the original geometry given that the part in the right will not be fixed after the partial routing.

The partial solutions used in scan routing, thus, take advantage of both of the approaches introduced in section 3.3. After all the parts of the cell have been routed from left to right a solution for the routing of the whole cell is obtained. Partial routings are optimized in order to raise the chances of getting a satisfying assignment and, at the same time, give better final solutions. However, as happens with the other methods, scan routing not finding a solution does not imply it does not exist.

The following example will illustrate how the algorithm works. Consider the cell in figure 3.21.



Figure 3.21: Input for CellDivider

We decide to use the scan algorithm on that cell in three parts. The cell is divided in the following ranges: (1,8), (9,15) and (16,22). Figure 3.22 shows the routing of the first part. In this case, the closest terminal of all signals which are to be connected to a net in the (1,8) range get approximated to the boundary. Figure 3.23 shows the state of the whole cell after the partial routing has been added.



Figure 3.22: Partial routing, first part



Figure 3.23: Cell after first partial routing

Now it is time to route the second range as seen in figure 3.24. This time only one signal will be come closer by the right, but a big part of the already routed cell will be included in the partial routing. This is because some of the terminals in (9,15) have to be routed with terminals in the previous part. Figure 3.25 shows the state of the cell after this second routing.



Figure 3.24: Partial routing, second part



Figure 3.25: Cell after second partial routing

The last step is to route the third part of the cell. After this partial solution is computed, no more routings will be needed. Figure 3.26 shows the solution for the last partial routing and how it is all integrated in the cell.



Figure 3.26: Third partial routing and final solution

However, the scan algorithm needs an extra tweaking on the partitioning algorithm. Notice how the terminals on column 6 have been routed twice. This is because when doing the second partial routing, as seen in figure 3.24, the connection of those terminals disappears beyond the left boundary. In order to prevent so, the following addition is included. When the left border is known, the algorithm checks if in that column more than one wire for the same signal exists. If it is the case and they are not connected, the shortest path between those positions is added to the partial problem. Given this implies expanding the cell by the left side, all positions on the left that are not part of the shortest path are blocked. Figure 3.27 shows how the second partial routing problem is. The black zone represents positions that can no be modified but have been added in order to let the router know that both terminals on the original column 6 are already connected. Notice that avoiding duplicate connections is important since it could mean the difference between a satisfiable and an insatisfiable cell.

The scan approach has an important befit over the others: it does not need a total routing over the cell. This is very interesting because, when cells get big and the number of signals increases, the generation of the formula can become expensive. Besides, never having the whole formula at once allows to route cells using less maximum RAM memory. However, this method also turns out to produce many insatisfiable cells, specially when they are very congested.



Figure 3.27: Second partial routing with blocks and solution

Mes possibles: Metaalgorisme jerrquic, histograma sense finestra fixada, scan amb lentitud.

# Chapter 4

## Results

Much testing was done during the development of the project. Each time a new version or variant was produced, it was tested with a set of representative cells of the Nangate library, about which we will explain more in section 4.1. Those tests were generally done in local, given that the size of the Nangate instances is manageable using a normal laptop. The experiments that will be shown in this chapter have been conducted on the LSI Cluster, using two of the nodes of the Nozomi queue. They are two Dell PowerEdge R410 using an Intel Xeon X5660 processor, with 12 cores and 2,8 Ghz frequency, and equipped with 96GB of RAM memory each. Using the cluster was advantageous over running the tests on local for several reasons. Both nodes are equipped with high-end processors, so the computation time is lower. The main advantage however is that, given the number of cores, many routing instances were solved in parallel. Besides, the memory of the system also mattered; there were routing instances that required more than 4GB of memory, which the laptop used for development could not provide.

This chapter will show some tests that were done using two different cells sets. The first section is devoted to the use of the cells provided by the Nangate Open Cell Library, whereas the second section relates to the *CatLib Cell Library*. In both of them some tables regarding different experiments are presented and finally some conclusions are drawn. A summary of the project's general conclusions will be presented in the next chapter.

### 4.1 Nangate Open Cell Library

The first set is the Nangate Open Cell Library, which is the same that was used to test the CellRouter. It is an open-source standard cell library provided for research purposes. It includes several different functions, including buffers, combinational gates and flip-flops, all of which come in different drive strength variants. When referring to the name of a cell, first will come the name of the function and after the drive strength (for example, an AND4\_X1 gate is a 4-input AND with the first level of drive strength).

A subset of the cells in the Nangate library has been selected for the purpose of making these experiments. The aim of this project is to help with big and complex cells. The cells that will be used are those that fall into one of such categories. The cells are AND4\_X4, AOI22\_X4, BUF\_X32, CLKGATETST\_X8, DFF\_X1, DFFR\_X1, DFFS\_X1, DFFRS\_X1, FA\_X1, INV\_X32, NOR4\_X4, OAI22\_X4, OR4\_X4 and SDFF\_X1, SDFFR\_X1, SDFFS\_X1, SDFFRS\_X1.

Some conclusions on given results.

### 4.2 CatLib Cell Library

The second set of cells consists of the concatenation of Nangate cells which have been grouped under the name of *CatLib*. They have been created on purpose to test several properties of the divide and conquer algorithms given that we wanted to test the tool on cells bigger than those offered by the Nangate library.

In order to do so, *CellCat*, a basic cell concatenator, has been developed. It requires a little tuning depending on what cell to multiply and outputs the concatenation of a given number of instances of the cell. When choosing which cells should be on CatLib we wanted to focus on two kind of cells, depending on whether they had a big congestion or not. An analysis of the whole Nangate library was conducted to determine which cells would be useful to generate bigger cells with the lowest congestion possible. To select these cells, both the mean number of subnets that crossed each column and the total size of the cell were considered. Finally, NOR4\_X4 and OAI22\_X4 where chosen. In the case of the congested cells, given that the time of solving each cell by the original tool was known, those which proved to be hard

were selected: The FA\_X1 full adder cell and various flip-flops.

Finally, these three kinds of cells were used for concatenation: Combinational cells, full adders and flip-flops. The first case simply considers the concatenation of some combinational cells (*COMB* in figure 4.1) which share no signal. These should prove to be the easiest to route given that no congestion should exist in the region between cells.



Figure 4.1: Concatenation of combinational cells

In the second case, where 1-bit full adders are concatenated, the carry out signal of a given full adder has to be connected with the carry in of the next full adder, with the exception of the first carry in and the last carry out which will be terminals of the final cell. This issue is addressed by CellCat, which assigns the same signal number when appropriate. In figure 4.2 it can be seen how a routing of a *n*-bit full adder might look like.



Figure 4.2: Concatenation of full adders cells

The last case is the 1-bit flip-flop concatenation generating a n-bit flip-flop cell. This time we don't need to carry signals from one region to the next one, but we need to share signals across the whole cell. The most basic

| Cell     | 1     | 2      | 3       | 4       |
|----------|-------|--------|---------|---------|
| NOR4_X4  | ?     | ?      | ?       | ?       |
| OAI22_X4 | ?     | ?      | ?       | ?       |
| FA_X1    | 17,05 | 101,95 | 316,96  | 1440,18 |
| DFF_X2   | 10,4  | 74,14  | 221,16  | 726,14  |
| DFFS_X1  | 22,25 | 289,59 | 2644,04 | ?       |
| DFFS_X2  | 14,38 | 398,74 | 2018,35 | ?       |

Table 4.1: Results for initial solving

flip-flop only needs to share the clock signal, but up to five signals need to cross the whole cell in the case of scan flip-flop with set and reset. In figure 4.3, the concatenation with a possible routing of reset flip-flops is shown.



Figure 4.3: Concatenation of reset flip-flops

These cells were routed in the cluster using various halo sizes. Table 4.1 shows the routing time in seconds using a halo of 2. The number on the column indicates how many times each cell was concatenated. No more than the concatenation of 4 cells was used given the enormous time it took to solve smaller cases.

Now, some results using the library.

Some conclusions on given results.

# Chapter 5

# Conclusions

Que s'ha aconseguit? Que no? Dificultats previstes, no previstes?

## 5.1 Cost study

To calculate the cost of the project we will consider two aspects. One will be the cost of the work done. The second is the cost of the equipment that has been used to develop the project. On the work costs:

| Month         | Work done                             | Man-hours |
|---------------|---------------------------------------|-----------|
| October 2012  | Knowing problem. Environment.         | 30 h.     |
| November 2012 | Knowing problem. Environment.         | 40 h.     |
| December 2012 | C++ Porting.                          | 40 h.     |
| January 2013  | Python Porting.                       | 40 h.     |
| February 2013 |                                       | 60 h.     |
| March 2013    | Cluster. Part Routing. Concatenation. | 80 h.     |
| April 2012    |                                       | 90 h.     |
| May 2013      | Meta-algorithms. Experiments.         | 120 h.    |
| June 2013     | Experiments. Project Report.          | 100 h.    |
|               |                                       | 600 h.    |

Table 5.1: Time study

Considering a salary of about 25  $\in$  per man hour, the total human cost of the project would be of 15000 $\in$ .

As for the tools used during the project:

| Tool              | $\mathbf{Cost}$ |
|-------------------|-----------------|
| Laptop            | 1000€           |
| LSI Cluster       | 1000€           |
| Ipython, texmaker | Free            |
| Total tool cost   | 2000€           |

Table 5.2: Tool cost study

When considering all costs together:

|                    | $\mathbf{Cost}$ |
|--------------------|-----------------|
| Engineering costs  | 15000€          |
| Tool costs         | 2000€           |
| Total project cost | 17000€          |

Table 5.3: Total costs

### 5.2 Future work

Now, here comes some future work.

## 5.2.1 Multiple Rows

## 5.2.2 Many More Things

# **Bibliography**

- [1] Gordon E. Moore. Cramming more components onto integrated circuits. Electronics (magazine), 1965.
- [2] Willard V. Quine. The problem of simplyfing truth functions. *American Mathematical Monthly*, 59(8):521–531, 1952.
- [3] Edward J. McCluskey. Minimization of boolean functions. *The Bell Systems Technical Journal*, 35(5):1417–1444, 1956.
- [4] Richard L. Rudell. Multiple-valued logic minimization for pla synthesis. *Memorandum No. UCB/ERL M86/65*, 1986.
- [5] Ralph H.J.M. Otten. Efficient floor plan optimization. In *Proceedings* of International Conference on Computer Design, pages 499–503, 1983.
- [6] Yao-Wen Chang and Kwang-Ting Cheng, editors. *Electronic Design Automation: Synthesis, Verification, and Test.* Morgan Kaufmann, 2009.
- [7] Brian W. Kernighan and Shen Lin. An efficient heuristic procedure for partitioning graphs. *Bell System Technical Journal*, 49(2):291–307, 1979.
- [8] Sao-Jie Chen and Chung-Kuan Cheng. Tutorial on vlsi partitioning. *VLSI Design*, 2000.
- [9] A. E. Dunlop and Brian W. Kernighan. A procedure for placement of standard-cell vlsi circuit. *IEEE Transactions of Circuits and Systems*, 4(1):92–98, 1985.
- [10] K. M. Hall. An r-dimensional quadratic placement algorithm. Management Science, 17(3):219–229, 1970.
- [11] Meng-Kai Hsu. Routability-driven analytical placement for mixed-size circuit designs. *IEEE/ACM International Conference on Computer-Aided Design (ICCAD)*, pages 80–84, 2011.

56 BIBLIOGRAPHY

[12] Chris Chu. Electronic Design Automation: Synthesis, Verification, and Test, chapter 11. Morgan Kaufmann, 2009.

- [13] C. Y. Lee. An algorithm for path connection and its application. *IRE*. Trans. on Electronic Computer, 10:346–365, 1961.
- [14] K. Mikami and K. Tabuchi. A computer program for optimal routing of printed circuit connectors. In *Proc. Int. Federation for Information Processing*, pages 1475–1478, 1986.
- [15] P. E. Hart, N. J. Nilsson, , and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. *IEEE Trans. on Systems Science and Cybernetics*, 4(2):100–107, 1968.
- [16] B. Taylor and L. Pielaggi. Exact combinatorial optimization methods for physical design of regular logic bricks. In *Proc. ACM/IEEE Design* Automation Conference, pages 344–349, 2007.
- [17] W. Hung, X. Song, T. Kam, L. Cheng, and G. Yang. Routability checking for three-dimensional architectures. *IEEE Transactions on VLSI Systems*, 12(12):1371–1374, 2004.
- [18] M. Cho, K. Lu, K. Yuan, and D. Z. Pan. Boxrouter 2.0: Architecture and implementation of a hybrid and robust global router. *IEEE/ACM International Conference on Computer-Aided Design (ICCAD)*, pages 503–508, 2007.
- [19] Charles J. Alpert, Dinesh P. Mehta, and Sachin S. Sapatnekar, editors. *Handbook of Algorithms for Physical Design Automation*. CRC Press, 2009.
- [20] R. F. Pease and S. Y. Chou. Lithography and other patterning techniques for future electronics. In *Proceedings of the IEEE*, volume 96, pages 248–270, Feb 2008.
- [21] W. Maly, Y.-W. Li, and M. Marek-Sadowska. Opc-free and minimally irregular ic design style. In *Proc. ACM/IEEE Design Automation Con*ference, pages 954–957, 2007.
- [22] G.S. Tseitin. On the complexity of derivation in propositional calculus. Structures in Constructive Mathematics and Mathematical Logic, Part II, Seminars in Mathematics, pages 115–125, 1968.

BIBLIOGRAPHY 57

[23] Stephen Cook. The complexity of theorem-proving procedures. In *Proceedings of the Third Annual ACM Symposium on Theory of Computing*, pages 151–158, 1971.

- [24] Roberto Nieuwenhuis, Albert Oliveras, and Cesare Tinelli. Solving sat and sat modulo theories: from an abstract davis-putnam-logemann-loveland procedure to dpll(t). *Journal of the ACM*, 53(6):937–977, 2006.
- [25] Joo P. Marques-Silva and Karem A. Sakallah. Boolean satisfiability in electronic design automation. In *Proceedings of the 37th Annual Design Automation Conference*, pages 175–180, 2000.
- [26] N. Ryzhenko and S. Burns. Physical synthesis onto layout fabric with regular diffusion and polysilicon geometries. In *Proc. ACM/IEEE Design Automation Conference*, pages 83–88, 2011.
- [27] B. Taylor. Automated layout of regular fabric bricks. Master's thesis, Carneige Mellon University, 2005.

# Glossary

 $\mathbf{CNF}$  Conjunctive Normal Form. 18–21, 25

**DAG** Directed Acyclic Graph. 5

**DNF** Disjunctive Normal Form. 19

**EDA** Electronic Design Automation. 4, 5, 7, 16, 19

FPGA Field Programmable Gate Array. 6

HDL Hardware Description Language. 4

IC Integrated Circuit. 4, 7, 15, 16

**ILP** Integer Linear Programming. 27

RTL Register-Transfer Level. 4, 5

SAT Boolean Satisfiability Problem. 16–19

VLSI Very Large System Integration. 4, 13

# Appendix A

# **Grid Format**

In this annex the format of the grid files will be described. They are kept in a plain text file with the .grd or the .rte extension, depending on whether or not the cell has already been routed. The file consists of a series of headers followed by the information indicated in such header. The sections of the file are as follows.

#### Title

Title of the grid.

#### **Sizes**

One line with several sizes of the grid, including

- Width.
- Length.
- Height.
- Number of signals.
- Number of properties.
- Number of attributes.
- Number of iopins.

#### **Signals**

The list with the name of the signals included in the grid, on for each line.

#### **Terminals**

A boolean indicating if such signal is a terminal or not, on for each line.

#### **Iopins**

One line with the coordinates of all positions where iopins are considered legal.

#### Attributes

The list with the name of the attributes included in the grid, on for each line.

#### **Properties**

A list of properties of the grid, each line including its name and value.

#### Grid

The actual values of the grid points. Every line represents a vertex in the grid. For a given vertex, the signal present in said vertex and all edges of that vertex is represented with the index of the signal in the signal list. -1 indicates the position is free and -2 indicates the position is locked. In the case of attributes being present on the grid, they will also be expressed for every vertex and edge right after the corresponding signal.

Below comes a reduced example of a .grd file corresponding to an AND4 gate.

```
TITLE AND4_X1
SIZES
13 11 3 11 3 0 78
SIGNALS
VSS
VDD
A1
A2
A3
A4
ZN
ZN_{-}neg
n\,e\,t\,{}_-0
net_{-}1
n\,e\,t\,{}_-2
TERMINALS
0
0
1
1
1
1
```

```
0
0
0
0
IOPINS
0 \ 0 \ 2 \quad 1 \ 0 \ 2 \quad 2 \ 0 \ 2 \quad 3 \ 0 \ 2 \quad 4 \ 0 \ 2 \quad 5 \ 0 \ 2 \quad 6 \ 0 \ 2 \quad 7 \ 0 \ 2
    2 2 2 3 2 2
                   [\ldots]
ATTRIBUTES
PROPERTIES
PLACEMENT / some_path / some_name.pla
TEMPLATE /some_path/some_name.tpl
TIME 1979 - 01 - 00@12:00:00
GRID
-2 -2 -2 -2
1 \ -1 \ 1 \ -1
-1 -1 -1
-2 -2 -2 -2
1 -1 1 -1
-1 -1 -1
[...]
0 - 1
-2 -2 -2
0 \ 0 \ -1
-1 -1
-2 -2 -2
0 \ 0 \ -1
-1 \ -1
-2 \ -2
0 - 1
-1
END
```

# Appendix B

# CellRouter Command Line Interface

In this appendix we will explain what the interface of CellRouter is. Cell-Router admits the following command line arguments. All grid files follow the .grd structure exposed in appendix A.

### Input

Path of the input grid file.

#### Output

Path of the output grid file.

#### Result

Path of the file where execution data such as partial times is stored.

#### Rules

Path of the file where the design rules are stored.

#### Rules set

Name of the rules set that will be used, located into the file mentioned above.

#### Halo

As explained before, given a subnet, all variables not included to in certain routing region defined by the subnet elements get a direct value of false. The halo metric, which is a positive integer, allows to expand such region. Sometimes, when the halo is too little, no solution is found because some subnet becomes unroutable. However, when the halo is big, the problem might become computationally hard.

#### Escapes

When no valid routing is found, if the escapes argument is given, the router will allow for some pins to be connected externally. The argument is the number of pins which are allowed to be left unconnected; it should be minimum.

#### Rounds

Number of rip-up and rerouting the iterations the optimization heuristic will make. More rounds usually means a better result at the expense of more computation time.

#### **Packs**

Number of signals that the optimization phase will rip-up and reroute at once.