# Complementary Synthesis for Encoders with Pipeline and Flow Control Mechanism

Abstract—Complementary synthesis automatically generates an encoder's decoder that recovers the encoder's inputs from its output. This paper proposes the first complementary synthesis algorithm that can handle flow control and pipeline mechanism widely employed in modern encoders. First, it identifies the flow control variables and infers the flow control predicate. Second, it identifies all pipeline stages in the encoder with the inferred flow control predicate. Finally, it applies Craig interpolant to characterize the decoder's Boolean functions that recover each pipeline stage and input. Experimental results indicate that this algorithm can always generate significantly faster pipelined decoders for flow controlled encoders.

#### I. INTRODUCTION

One of the most difficult jobs in designing communication chips is to design and verify complex encoder and decoder pairs. The encoder maps its input  $\vec{i}$  to its output  $\vec{o}$ , while the decoder recovers  $\vec{i}$  from  $\vec{o}$ . Complementary synthesis [1]–[6] eases this job by automatically generating a decoder from an encoder, with the assumption that  $\vec{i}$  can always be uniquely determined by and recovered from a bounded sequence of  $\vec{o}$ .

However, the flow control mechanism [7] in many encoders fails this assumption. Fig. 1a) shows a communication system with such mechanism that can prevent faster transmitter from overwhelming slower receiver. When receiver can keep up with the transmitter, the transmitter sends the data bit d to the encoder with  $f \equiv 1$ . According to the encoder's source code in Fig. 1b), it maps  $\vec{i} = \{d, f \equiv 1\}$  to  $\vec{o} = D_d$ , which can uniquely determine the value of both d and f. However, when receiver can NOT keep up with the transmitter, the transmitter sends  $f \equiv 0$  and meaningless d to the encoder, which maps them to the idle symbol I that can uniquely determine only f but not d

Qin [8] proposed the first complementary synthesis algorithm that can handle flow control mechanism. For the case in Fig. 1, it first identifies f to be the only variable that can always be uniquely determined, and then infers a predicate  $f \equiv 1$  that enable d to be recovered from o, and rules out the case  $f \equiv 0$  in which recovering d is impossible.

However, Qin et al. [8]'s algorithm ignored the encoder's internal pipeline stages  $s\vec{t}g^0$  in Fig. 2a), which is used to cut the encoder's datapath and boosts its frequency. So, as shown in Fig. 2b), the non-pipelined decoder generated by [8] directly recover input  $\vec{i}$  from output  $\vec{o}$  with a huge logic block  $C^0*C^1$ . While a proper and faster decoder, as shown in Fig. 2c), should cut this huge logic block  $C^0*C^1$  into two small pieces with the pipeline stage  $s\vec{t}g^0$ , just like its encoder.

Thus, this paper proposes a novel algorithm to generate such pipelined decoder for flow controlled encoder. It first applies



- a) The faster transmitter and slower receiver connected by the encoder and decoder
- b) The encoder's source code mapping *d* and *f* to *o*

Fig. 1. Encoder with flow control mechanism

Qin et al. [8]'s algorithm to identify  $\vec{f}$ , the subset of all input variables  $\vec{i}$  that can always be uniquely determined by output  $\vec{o}$ , and infer  $valid(\vec{f})$ , the predicate that enables the set of all other input variables  $\vec{d}$  to be uniquely determined by  $\vec{o}$ . It further identifies all state variables in each pipeline stage  $s\vec{t}g^j$ , and partitions them into data vector  $\vec{d}^j$  and flow control vector  $\vec{f}^j$ . It finally characterize the Boolean functions that recover each  $s\vec{t}q^j$  and  $\vec{i}$  with Craig interpolant [9].

Experimental result indicates that this algorithm can always generate pipelined decoder with flow control mechanism.

This paper is organized as follows. Section II introduces the background material; Section III presents our algorithm framework. Section IV identifies  $\vec{f}^j$  and  $\vec{d}^j$  in each pipeline stages  $s\vec{t}g^j$ , while Section V characterizes the decoder's Boolean functions that recover each  $s\vec{t}g^j$  and the input  $\vec{i}$ . Sections VI and VII present the experimental results and related works; Finally, Section VIII sums up the conclusion.

#### II. PRELIMINARIES

#### A. Propositional satisfiability

The Boolean value set is denoted as  $\mathbb{B} = \{0,1\}$ . A vector of variables is represented as  $\vec{v} = (v, \dots)$ .  $|\vec{v}|$  is the number of variables in  $\vec{v}$ . If a variable v is a member of  $\vec{v}$ , then we say  $v \in \vec{v}$ ; otherwise  $v \notin \vec{v}$ .  $v \cup \vec{v}$  is a vector that contains



Fig. 2. The pipelined encoder and its decoders

**Algorithm 1:** Determine whether i can be uniquely determined by a bounded sequence of  $\vec{o}$ 

```
Input: The input variable i \in \vec{i}.

1 p:=0; l:=0; r:=0;

2 while true do

3 | p++; l++; r++;

4 | if Sound(i) then return (true, p, l, r);

5 | else if Complete(i) then return (false, 0, 0, 0)
```

both v and all members of  $\vec{v}$ .  $\vec{v} - v$  is a vector that contains all members of  $\vec{v}$  except v.  $\vec{a} \cup \vec{b}$  is a vector that contains all members of  $\vec{a}$  and  $\vec{b}$ .  $\vec{a} - \vec{b}$  is a vector that contains all members of  $\vec{a}$  but no member of  $\vec{b}$ .

For formula F over variable set V, SAT solvers try to find a satisfying assignment  $A:V\to\mathbb{B}$ , so that F can be evaluated to 1. If A exists, then F is satisfiable; otherwise unsatisfiable.

For formulas  $\phi_A$  and  $\phi_B$ , with  $\phi_A \wedge \phi_B$  unsatisfiable, there exists a formula  $\phi_I$  referring only to the common variables of  $\phi_A$  and  $\phi_B$  such that  $\phi_I \wedge \phi_B$  is unsatisfiable and  $\phi_A \Rightarrow \phi_I$ .  $\phi_I$  is the **interpolant** [10] of  $\phi_A$  with respect to  $\phi_B$ .

#### B. Finite state machine(FSM)

The encoder is modeled by a FSM  $M=(\vec{s},\vec{i},\vec{o},T)$ , consisting of a state variable vector  $\vec{s}$ , an input variable vector  $\vec{i}$ , an output variable vector  $\vec{o}$ , and a transition function  $T: \vec{s} \times \vec{i} \to \vec{s} \times \vec{o}$  that computes the next state and output variable vector from the current state and input variable vector. When unrolling transition function  $T, s \in \vec{s}, i \in \vec{i}$  and  $o \in \vec{o}$  at the n-th step are respectively denoted as  $s_n, i_n$  and  $o_n$ .  $\vec{s}, \vec{i}$  and  $\vec{o}$  at the n-th step are respectively denoted as  $\vec{s}_n, i_n$  and  $\vec{o}_n$ . A **path** is a state sequence  $(\vec{s}_n, \dots, \vec{s}_m)$  with  $\vec{s}_i \vec{j}_j \vec{o}_j (\vec{s}_{j+1}, \vec{o}_j) \vec{b}_j \vec{o}_j (\vec{s}_{j+1}, \vec{o}_j) \vec{b}_j \vec{o}_j (\vec{s}_{j+1}, \vec{o}_j)$  for all  $n \leq j < m$ . A **loop** is a path  $(\vec{s}_i, \dots, \vec{s}_m)$  with  $\vec{s}_n \equiv \vec{s}_m$ .

#### C. The halting algorithm that determines whether $i \in \vec{f}$

Qin et al. [8] proposed Algorithm 1 to determine whether an input variable  $i \in \vec{i}$  can be uniquely determined by a bounded sequence of  $\vec{o}$ , by iteratively calling a sound and a complete approaches until they converge.

1) The sound approach in Fig. 3a) shows how to check whether an input variable  $i \in \vec{i}$  can be uniquely determined by a bounded sequence of  $\vec{o}$ : if there exists p, l and r, such that for every output sequence  $<\vec{o}_p, \ldots, \vec{o}_{p+l+r}>, i_{p+l}$  cannot be different. This is equal to the unsatisfiability of  $F_{PC}(p, l, r)$  in Equation (1), in which Line 1 and 2 of correspond to the two paths in Fig. 3a), Line 3 forces these two paths' output to be the same, while Line 4 forces their  $i_{p+l}$  to be different.

$$F_{PC}(p, l, r) := \begin{cases} \bigwedge_{m=0}^{p+l+r} \{ (\vec{s}_{m+1}, \vec{o}_m) \equiv T(\vec{s}_m, \vec{i}_m) \} \\ \bigwedge \bigwedge_{m=0}^{p+l+r} \{ (\vec{s'}_{m+1}, \vec{o'}_m) \equiv T(\vec{s'}_m, \vec{i'}_m) \} \\ \bigwedge \bigwedge_{m=p}^{p+l+r} \vec{o}_m \equiv \vec{o'}_m \\ \bigwedge i_{p+l} \equiv 1 \wedge i'_{p+l} \equiv 0 \end{cases}$$
(1)



Fig. 3. The sound and complete approximative approaches

2) The complete approach in Fig. 3b) shows how to check whether an input variable  $i \in \vec{i}$  can NOT be uniquely determined by a bounded sequence of  $\vec{o}$ : It is similar to Fig. 3a) but with three new constraints to detect loops on the three state sequences  $\langle \vec{s}_0, \ldots, \vec{s}_p \rangle, \langle \vec{s}_{p+1}, \ldots, \vec{s}_{p+l} \rangle$  and  $\langle \vec{s}_{p+l+1}, \ldots, \vec{s}_{p+l+r} \rangle$ . It is formally defined as  $F_{LN}(p,l,r)$  in Equation (2) with the last three lines corresponding to the three new constraints. If  $F_{LN}(p,l,r)$  is satisfiable, then by unrolling the three loops, we can be sure that any larger p,l and r can also make  $F_{LN}(p,l,r)$  and  $F_{PC}(p,l,r)$  satisfiable.

$$F_{LN}(p, l, r) := \begin{cases} F_{PC}(p, l, r) \\ \wedge \bigvee_{x=0}^{p-1} \bigvee_{y=x+1}^{p} \{\vec{s}_{x} \equiv \vec{s}_{y} \wedge \vec{s'}_{x} \equiv \vec{s'}_{y}\} \\ \wedge \bigvee_{x=p+1}^{p+l-1} \bigvee_{y=x+1}^{p+l} \{\vec{s}_{x} \equiv \vec{s}_{y} \wedge \vec{s'}_{x} \equiv \vec{s'}_{y}\} \\ \wedge \bigvee_{x=p+l+1}^{p+l+r-1} \bigvee_{y=x+1}^{p+l+r} \{\vec{s}_{x} \equiv \vec{s}_{y} \wedge \vec{s'}_{x} \equiv \vec{s'}_{y}\} \end{cases}$$

$$(2)$$

Algorithm 1 is a halting algorithm because if there indeed exists such p, l and r that make  $F_{PC}(p,l,r)$  unsatisfiable, then they can eventually be found in Line 4; Otherwise, p, l and r will eventually be larger than the length of the encoder's longest non-loop path, which makes  $F_{LN}(p,l,r)$  satisfiable. Both cases will terminate the loop.

## D. Inferring $valid(\vec{f})$ that makes $\vec{d}$ to be uniquely determined

For Boolean relation  $R(\vec{a}, \vec{b}, t)$  with  $R(\vec{a}, \vec{b}, 0) \wedge R(\vec{a}, \vec{b}, 1)$  unsatisfiable, Subsection 4.1 of [8] proposes an algorithm to characterize a Boolean function that covers exactly all the valuations of  $\vec{a}$  that can make  $R(\vec{a}, \vec{b}, 1)$  satisfiable:

CharacterizingFormulaSAT
$$(R, \vec{a}, \vec{b}, t) := \{\vec{a} | \exists \vec{b}, R(\vec{a}, \vec{b}, 1) \text{ is satisfiable} \}$$
 (3)

Its implementation will not be presented here.

### **Algorithm 2:** Inferring $valid(\vec{f}_{p+l})$

- 1 while  $Under(p, l, r) \neq Over(p, l, r)$  do p ++ ; l ++ ; r ++ ;
- 3 return Under(p, l, r)

Subsection 4.2 of [8] proposed Algorithm 2 to infer valid(f) by iteratively increasesing p, l and r, and calling CharacterizingFormulaSAT to characterize Under(p, l, r), a monotonically growing under-approximation of  $valid(\bar{f})$ , and Over(p, l, r), a monotonically shrinking over-approximation of  $valid(\vec{f})$ . They will eventually converge to valid(f). Please refer to [8] for details.

#### III. ALGORITHM FRAMEWORK

As shown in Fig. 4, we assume that the encoder has npipeline stages  $s\vec{t}g^{j}$ , where  $0 \le j \le n-1$ . And each pipeline stage  $s\vec{t}g^{j}$  can be further partitioned into flow control vector  $\vec{f}^j$  and data vector  $\vec{d}^j$ . The input vector  $\vec{i}$ , as in [8], can also be partitioned into flow control vector  $\vec{f}$  and data vector  $\vec{d}$ . If we take the combinational logic block  $C^{j}$  as a function, then this encoder can be represented by the following equations.

$$\begin{array}{rcl}
s\vec{t}g^{0} & := & C^{0}(\vec{i}) \\
s\vec{t}g^{j} & := & C^{j}(s\vec{t}g^{j-1}) & 1 \le j \le n-1 \\
\vec{o} & := & C^{n}(s\vec{t}g^{n-1})
\end{array} \tag{4}$$

With this encoder model, our algorithm framework is:

- 1) Calling Algorithm 1 for each  $i \in \vec{i}$  to partition  $\vec{i}$  into  $\vec{f}$  and  $\vec{d}$ . And defining (p, l, r) to be the maximal one returned from all calls to Algorithm 1.
- 2) Calling Algorithm 2 to infer  $valid(\bar{f})$  that enables  $\bar{d}$  to be uniquely determined with parameters p, l and r.
- 3) In Section IV, identifying  $\vec{f}^j$  and  $\vec{d}^j$  in each pipeline stage  $s\vec{t}q^{\jmath}$ .
- 4) In Section V, characterizing the decoder's Boolean functions that recover each pipeline stages  $s\vec{t}g^{j}$  and input vector  $\vec{i}$ .

#### IV. INFERRING THE ENCODER'S PIPELINE STRUCTURE

#### A. Minimizing r

In the remainder of this paper, superscript always means the pipeline stage, while the subscript, as mentioned in Subsection II-B, always means the step index in the unrolled transition function. For example,  $s\vec{t}g^{j}$  is the j-th pipeline stage. While



Fig. 4. The encoder's general structure with pipeline and flow control

#### **Algorithm 3:** Minimizing r

1 for  $r' := r \rightarrow 0$  do if  $r' \equiv 0$  or  $F'_{PC}(p, l, r' - 1) \wedge valid(\vec{f}_{p+l})$  is satisfiable for some  $i \in \vec{i}$  then break 4 return r'

 $s\vec{t}g_i^j$  is the value of this j-th pipeline stage at the i-th step in the unrolled state transition sequence.

With this notation, we can substitute Equation (4) into Fig. 3a). This leads to the following observations, which is also shown intuitively in Fig. 5:

- 1)  $\vec{i}_{p+l}$ , the value of input  $\vec{i}$  at step p+l, can be uniquely determined by  $s\vec{t}g_{p+l}^0$ , the value of the 1st pipeline stage
- $s\vec{t}g^{0}_{0}$  at the same step p+l; 2)  $s\vec{t}g^{0}_{p+l}$ , the value of the 1st pipeline stage  $s\vec{t}g^{0}$  at step p+l, can be uniquely determine by  $s\vec{t}g^{1}_{p+l+1}$ , the value of the 2nd pipeline stage  $s\vec{t}g^1$  at the next step p+l+1.
- 4)  $s \vec{t} g_{p+l+j}^{j}$ , the value of  $s \vec{t} g^{j}$  at step p+l+j, can be uniquely determine by  $s \vec{t} g_{p+l+j+1}^{j+1}$ , the value of next pipeline stage  $s \vec{t} g^{j+1}$  at the next step p+l+j+1.
- 6)  $s\vec{t}g_{p+l+r'}^{r'}$ , the value of the last pipeline stage  $s\vec{t}g^{r'}$  at step p+l+r', can be uniquely determined by  $\vec{o}_{p+l+r'}$ , the value of output  $\vec{o}$  at the same step p + l + r'.

With these observations, it is obvious that  $\vec{i}_{p+l}$  can be uniquely determined by  $\vec{o}_{p+l+r'}$ . By comparing this conclusion to Fig. 3 and Algorithm 1, we can be sure that  $r' \leq r$ . To find out r', we define the following new formula:

Compared to Equation (1), this new formula tries to use only  $\vec{o}_{p+l+r'}$  instead of sequence  $\langle \vec{o}_{p+l}, \dots, \vec{o}_{p+l+r} \rangle$  to



Fig. 5. Unrolled transition function with pipeline stages

uniquely determine  $\vec{i}_{p+l}$ .

Algorithm 3 is proposed to use this new formula to find out r'. In Line 2, it conjugates the inferred flow control predicate  $valid(\vec{f})$  with  $F'_{PC}(p,l,r'-1)$ . If it is satisfiable, then r' is the last one that makes  $F'_{PC}(p,l,r') \wedge valid(\vec{f}_{p+l})$  unsatisfiable, we return it directly. On the other hand, when  $r' \equiv 0$ ,  $F'_{PC}(p,l,0)$  must have been tested in last iteration, and the result must be unsatisfiable. In this case we return 0.

#### B. Identifying pipeline stages

Now, with p and l inferred by Algorithm 2 and r' founded by Algorithm 3, we need to generalize  $F'_{PC}$  in Equation (5) to the following new formula that can determine whether  $v_j$ , the value of variable v at step j can be uniquely determined by  $\vec{w}_k$ , the value of a vector  $\vec{w}$  at step k. Now  $v_j$  and  $\vec{w}_k$  can be either input, state or output variables (vectors).

Obviously, when  $F_{PC}''(p, l, r', v_j, \vec{w}_k)$  is unsatisfiable,  $\vec{w}_k$  can uniquely determine  $v_j$ .

According to Fig. 5, for  $0 \le j \le r'$ , the flow control vector  $\vec{f}^j$  in the j-th pipeline stage is exactly the set of state variables  $s \in \vec{s}$  that can be uniquely determined at the p+l+j-th step by  $\vec{o}$  at the p+l+r'-th step without enforcing  $valid(\vec{f}_{p+l})$ . It can be formally defined as:

$$\vec{f}^{j} := \left\{ s \in \vec{s} \mid \begin{array}{c} F_{PC}^{"}(p, l, r', s_{p+l+j}, \vec{o}_{p+l+r'}) \\ is \ unsatisfiable \end{array} \right\}$$
 (7)

While the data vector  $\vec{d^j}$  in the j-th pipeline stage is the set of state variables  $s \in \vec{s}$  that can be uniquely determined at the same p+l+j-th step by  $\vec{o}$  at the p+l+r'-th step with enforcing  $valid(\vec{f_{p+l}})$ . It can be formally defined as:

$$\vec{d}^{j} := \begin{cases} s \in \vec{s} \mid F''_{PC}(p, l, r', s_{p+l+j}, \vec{o}_{p+l+r'}) \land valid(\vec{f}_{p+l}) \\ \land valid(\vec{f'}_{p+l}) \text{ is unsatisfiable} \end{cases}$$
(8)

With Equation (7) and (8), pipeline stages can all be identified:

$$s\vec{t}q^j := \vec{d}^j \cup \vec{f}^j \tag{9}$$

V. CHARACTERIZING THE BOOLEAN FUNCTIONS RECOVERING INPUT VARIABLES AND PIPELINE STAGES

A. Characterizing the Boolean functions recovering the last pipeline stage  $\vec{stg}^{r'}$ 

According to Equation (7), every state variable  $s \in \vec{f}^{r'}$  at the p+l+r'-th step can be uniquely determined by  $\vec{o}_{p+l+r'}$ .

That is,  $F_{PC}''(p,l,r',s_{p+l+r'},\vec{o}_{p+l+r'})$  is unsatisfiable and can be partitioned into :

$$\phi_{A} := \left\{ \begin{array}{cc} & \bigwedge_{m=0}^{p+l+r'} \{ (\vec{s}_{m+1}, \vec{o}_{m}) \equiv T(\vec{s}_{m}, \vec{i}_{m}) \} \\ & S_{p+l+r'} \equiv 1 \end{array} \right\}$$
 (10)

$$\phi_{B} := \left\{ \begin{array}{c} \bigwedge_{m=0}^{p+l+r'} \{ (\vec{s'}_{m+1}, \vec{o'}_{m}) \equiv T(\vec{s'}_{m}, \vec{i'}_{m}) \} \\ \bigwedge & \vec{o}_{p+l+r'} \equiv \vec{o'}_{p+l+r'} \\ \bigwedge & s'_{p+l+r'} \equiv 0 \end{array} \right\}$$
(11)

According to [9], a Craig interpolant  $\phi_I$  of  $\phi_A$  with respect to  $\phi_B$  can be constructed, which refer only to  $\vec{o}_{p+l+r'}$ , the common variables of  $\phi_A$  and  $\phi_B$ . And  $\phi_I$  covers all the valuations of  $\vec{o}_{p+l+r'}$  that can make  $s_{p+l+r'} \equiv 1$ . At the same time,  $\phi_I \wedge \phi_B$  is unsatisfiable, which means  $\phi_I$  covers nothing that can make  $s_{p+l+r'} \equiv 0$ .

Thus,  $\phi_I$  can be used as the decoder's Boolean function that recovers  $s \in \vec{f}^{r'}$  from  $\vec{o}$ .

According to Algorithm 2, the inferred flow control predicate  $valid(\vec{f}_{p+l})$  can enable  $\vec{d}_{p+l}$  to be uniquely determined by  $\vec{o}_{p+l+r'}$ . By further referring to Fig. 5, this predicate  $valid(\vec{f}_{p+l})$  can also enable  $\vec{d}_{p+l+r'}^{r'}$  in the last pipeline stage to be uniquely determined by output  $\vec{o}_{p+l+r'}$ . So, for each  $s \in \vec{d}^{r'}$ , the formula  $F_{PC}''(p,l,r',s_{p+l+r'},\vec{o}_{p+l+r'}) \wedge valid(\vec{f}_{p+l}) \wedge valid(\vec{f}_{p+l})$  is unsatisfiable and can be partitioned into :

$$\phi_{A} := \left\{ \begin{array}{c} \bigwedge_{m=0}^{p+l+r'} \{ (\vec{s}_{m+1}, \vec{o}_{m}) \equiv T(\vec{s}_{m}, \vec{i}_{m}) \} \\ \bigwedge \qquad \qquad s_{p+l+r'} \equiv 1 \\ \bigwedge \qquad \qquad valid(\vec{f}_{p+l}) \end{array} \right\}$$
(12)

$$\phi_{B} := \left\{ \begin{array}{c} \bigwedge_{m=0}^{p+l+r'} \{ (\vec{s'}_{m+1}, \vec{o'}_{m}) \equiv T(\vec{s'}_{m}, \vec{i'}_{m}) \} \\ \bigwedge & \vec{o}_{p+l+r'} \equiv \vec{o'}_{p+l+r'} \\ \bigwedge & s'_{p+l+r'} \equiv 0 \\ \bigwedge & valid(\vec{f'}_{p+l}) \end{array} \right\}$$
(13)

B. Characterizing the Boolean functions recovering other pipeline stages  $s \vec{t} g^j$ 

According to Fig. 5,  $\vec{f}_{p+l+j}^{j}$  can be uniquely determined by  $s \vec{t} g_{p+l+j+1}^{j+1}$ . So for every  $s \in \vec{f^j}$ , we can partition the unsatisfiable formula  $F_{PC}''(p,l,r',s_{p+l+j},s \vec{t} g_{p+l+j+1}^{j+1})$  into:

$$\phi_{A} := \left\{ \bigwedge_{m=0}^{p+l+r'} \{ (\vec{s}_{m+1}, \vec{o}_{m}) \equiv T(\vec{s}_{m}, \vec{i}_{m}) \} \\ s_{p+l+j} \equiv 1 \right\}$$
 (14)

$$\phi_{B} := \left\{ \begin{array}{c} \bigwedge_{m=0}^{p+l+r'} \{ (\vec{s'}_{m+1}, \vec{o'}_{m}) \equiv T(\vec{s'}_{m}, \vec{i'}_{m}) \} \\ \wedge \qquad \qquad s\vec{t}g_{p+l+j+1}^{j+1} \equiv s\vec{t}g'_{p+l+j+1}^{j+1} \\ \wedge \qquad \qquad s'_{p+l+j} \equiv 0 \end{array} \right\}$$
(15)

Again, a Craig interpolant  $\phi_I$  of  $\phi_A$  with respect to  $\phi_B$  can be constructed, and used as the decoder's Boolean function that recovers  $s \in \vec{f}^j$  from  $s\vec{t}g^{j+1}$ .

Similar to Equation (12) and (13), by replacing  $F_{PC}''(p,l,r',s_{p+l+j},s\vec{t}g_{p+l+j+1}^{j+1}) \qquad \text{with } F_{PC}''(p,l,r',s_{p+l+j},s\vec{t}g_{p+l+j+1}^{j+1}) \wedge valid(\vec{f}_{p+l}) \wedge valid(\vec{f'}_{p+l}) ,$  we can characterize the Boolean function that recovers each  $s \in \vec{d^j}$  from  $s\vec{t}g^{j+1}$ .

# C. Characterizing the Boolean functions recovering the encoder's input variables

According to Fig. 5,  $\vec{f}_{p+l}$  can be uniquely determined by the 1st pipeline stage  $s \vec{t} g_{p+l}$ . So for every flow control input  $i \in \vec{f}$ ,  $F_{PC}''(p,l,r',i_{p+l},s \vec{t} g_{p+l}^{\,\,0})$  is unsatisfiable and can be partitioned into :

$$\phi_A := \left\{ \bigwedge_{m=0}^{p+l+r'} \{ (\vec{s}_{m+1}, \vec{o}_m) \equiv T(\vec{s}_m, \vec{i}_m) \} \right\}$$
 (16)

$$\phi_{B} := \left\{ \begin{array}{c} \bigwedge_{m=0}^{p+l+r'} \{ (\vec{s'}_{m+1}, \vec{o'}_{m}) \equiv T(\vec{s'}_{m}, \vec{i'}_{m}) \} \\ \bigwedge & s\vec{t}g_{p+l}^{\ 0} \equiv s\vec{t}g'_{p+l} \\ \bigwedge & i'_{p+l} \equiv 0 \end{array} \right\}$$
(17)

Again, the Craig interpolant  $\phi_I$  of  $\phi_A$  with respect to  $\phi_B$  can be used as the decoder's Boolean function that recovers  $i \in \vec{f}$  from  $s\vec{t}g^0$ .

Similar to Equation (12) and (13), by replacing  $F_{PC}''(p,l,r',i_{p+l},\vec{stg}_{p+l}^0)$  with  $F_{PC}''(p,l,r',i_{p+l},\vec{stg}_{p+l}^0) \wedge valid(\vec{f}_{p+l}) \wedge valid(\vec{f}_{p+l}')$ , we can characterize the Boolean function that recovers each  $i \in \vec{d}$  from  $\vec{stg}^0$ .

#### VI. EXPERIMENTAL RESULTS

We have implemented these algorithms in OCaml language, and solved the generated CNF formulas with MiniSat 1.14 [11]. All experiments have been run on a server with 16 Intel Xeon E5648 processors at 2.67GHz, 192GB memory, and CentOS 5.4 Linux.

#### A. Comparing timing and area

We use the benchmarks from Qin et al. [8], whose experimental results are shown in Table I. pcie is a PCI Express [12] encoder, while xgxs and t2eth are two Ethernet [13] encoders.

The 2nd and 3rd column of Table I show respectively the number of inputs, outputs and registers of each benchmark. The 4th column shows the area of the encoder when mapped to LSI10K library with Design Compiler. In this paper, all area and delay are obtained in the same setting.

TABLE I
BENCHMARKS AND EXPERIMENTAL RESULTS

| Names | The encoders |     | decoder gener-<br>ated by [2] |      |       | decoder gener-<br>ated by this paper |       |       |      |
|-------|--------------|-----|-------------------------------|------|-------|--------------------------------------|-------|-------|------|
|       | #            | #   | area                          | run  | delay | area                                 | run   | delay | area |
|       | in/out       | reg |                               | time | (ns)  |                                      | time  | (ns)  |      |
| pcie  | 10/11        | 23  | 326                           | 0.37 | 7.20  | 624                                  | 8.08  | 5.89  | 652  |
| xgxs  | 10/10        | 16  | 453                           | 0.21 | 7.02  | 540                                  | 4.25  | 5.93  | 829  |
| t2eth | 14/14        | 49  | 2252                          | 12.7 | 6.54  | 434                                  | 430.4 | 6.12  | 877  |

TABLE II Inferred pipeline stages of pcie

|              | input            | pipeline stage 0   | pipeline stage 1   |
|--------------|------------------|--------------------|--------------------|
| $\vec{f}$ or | CNTL_TXEnable_P0 | InputDataEnable_P0 | OutputData_P0[9:0] |
| $ec{f}^{j}$  |                  |                    | OutputElecIdle_P0  |
| $\vec{d}$ or | TXDATA[7:0]      | InputData_P0[7:0]  |                    |
| $ec{d}^{j}$  | TXDATAK          | InputDataK_P0      |                    |

The 5th to 7th columns show respectively the run time of [2]'s algorithm to generate the decoder without pipeline, and the delay and area of the generated decoder. While the 8th to 10th columns show respectively the run time of this paper's algorithm to generate the pipelined decoder, and the delay and area of the generated decoder.

Comparing the 6th and the 9th column indicates that the decoders' delay have been significantly improved.

#### B. Inferred pipeline stages of pcie

The benchmark pcie has two pipeline stages, whose flow control vector and data vector are respectively shown in Table II. The inferred  $valid(\vec{f})$  is  $CNTL\_TXEnable\_P0$ .

One issue to be noticed that is the data vector at pipeline stage 1 is empty, while all registers in that stages are recognized as flow control vector. We inspect the encoder's source code and find that these registers are directly fed to output. So they can actually be uniquely determined by  $\vec{o}$ . This doesn't affect the correctness of the generated decoder, because the functionality of flow control vector never depend on the inferred flow control predicate.

#### C. Inferred pipeline stages of xgxs

The benchmark xgxs has only 1 pipeline stage, whose flow control vector and data vector are respectively shown in Table III. The inferred  $valid(\vec{f})$  is  $!bad\_code$ .

#### D. Inferred pipeline stages of t2ether

The benchmark t2ether has four pipeline stages shown in Table IV. The inferred  $valid(\vec{f})$  is:

$$\begin{array}{l} (tx\_enc\_ctrl\_sel[2] \ \& \ tx\_enc\_ctrl\_sel[3] \ ) \ | \\ (tx\_enc\_ctrl\_sel[2] \ \& \ !tx\_enc\_ctrl\_sel[3] \ \& \\ !tx\_enc\_ctrl\_sel[0] \ \& \ tx\_enc\_ctrl\_sel[1] \ ) \ | \\ (!tx\_enc\_ctrl\_sel[2] \ \& \ tx\_enc\_ctrl\_sel[3] \ ) \ | \\ (!tx\_enc\_ctrl\_sel[2] \ \& \ !tx\_enc\_ctrl\_sel[3] \ \& \\ tx\_enc\_ctrl\_sel[0]) \end{array}$$

TABLE III
INFERRED PIPELINE STAGES OF XGXS

|                          | input               | pipeline stage 0    |  |
|--------------------------|---------------------|---------------------|--|
| $\vec{f}$ or $\vec{f}^j$ | bad_code            | bad_code_reg        |  |
| $\vec{d}$ or $\vec{d}^j$ | encode_data_in[7:0] | ip_data_latch[2:0]  |  |
|                          | konstant            | plus34_latch        |  |
|                          |                     | data_out_latch[5:0] |  |
|                          |                     | konstant_latch      |  |
|                          |                     | kx_latch            |  |
|                          |                     | minus34b latch      |  |

TABLE IV
INFERRED PIPELINE STAGES OF T2ETHER

|                          | input                | pipeline        | pipeline        |  |
|--------------------------|----------------------|-----------------|-----------------|--|
|                          |                      | stage 0         | stage 1         |  |
| $ec{f}$                  | tx_enc_ctrl_sel[3:0] | qout_reg_0_8    | qout_reg_0_9    |  |
| or                       |                      | qout_reg_2_4    | qout_reg_1_5    |  |
| $ec{f}^{j}$              |                      | qout_reg_1_4    | qout_reg_2_5    |  |
|                          |                      |                 | qout_reg_0_10   |  |
| $\vec{d}$ or $\vec{d}^j$ | txd[7:0]             | qout_reg[7:0]   | qout_reg[7:0]_1 |  |
|                          | pipeline             | pipeline        |                 |  |
|                          | stage 2              | stage 3         |                 |  |
| $ec{f}$                  | qout_reg[9:0]_2      | qout_reg[7:1]_3 | qout_reg_0_7    |  |
| or                       |                      | qout_reg_8_1    | sync1_reg1      |  |
| $\vec{f}^{j}$            |                      | qout_reg_9_1    | sync1_reg       |  |
|                          |                      | qout_reg_3_4    | Q_reg1          |  |
|                          |                      | qout_reg_0_4    | Q_reg           |  |
|                          |                      | qout_reg_3_5    |                 |  |
| $\vec{d}$ or $\vec{d}^j$ |                      |                 |                 |  |

#### VII. RELATED PUBLICATIONS

Shen et al. [1] proposed the first complementary synthesis algorithm. It checks the decoder's existence by iteratively increasing the length of unrolled transition function sequence, and generates the decoder's Boolean functions by enumerating all satisfying assignments of the decoder's output. Its major shortcomings are that it may not halt and it is too slow in building the decoder.

Shen et al. [2] and Liu et al. [4] tackled the halting problem independently by searching for loops in the state sequence, while the runtime overhead problem was addressed in [3], [4] by Craig interpolant [10].

Shen et al. [3] automatically inferred an assertion for configuration pins, which can lead to the decoder's existence.

Tu and Jiang [6] proposed a break-through algorithm that recover the encoder's input by considering its initial and reachable states.

Qin et al. [8] proposed the first algorithm that can handle encoder with flow control mechanism. But it can not handle pipeline stages.

#### VIII. CONCLUSIONS

This paper proposes the first complementary synthesis algorithm that can handle encoders with pipeline stages and flow control mechanism. Experimental result indicates that the proposed algorithm can always correctly generate pipelined decoder with flow control mechanism.

#### REFERENCES

- [1] S. Shen, J. Zhang, Y. Qin, and S. Li, "Synthesizing complementary circuits automatically," in ICCAD '09, pp. 381–388.
- [2] S. Shen, Y. Qin, L. Xiao, K. Wang, J. Zhang, and S. Li., "A halting algorithm to determine the existence of the decoder," *IEEE Transactions* on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 10, pp. 30:1556–30:1563.
- [3] S. Shen, Y. Qin, K. Wang, Z. Pang, J. Zhang, and S. Li., "Inferring assertion for complementary synthesis," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 31, no. 8, pp. 31:1288–31:1292.
- [4] H.-Y. Liu, Y.-C. Chou, C.-H. Lin, and J.-H. R. Jiang, "Towards completely automatic decoder synthesis," in ICCAD '11, pp. 389–395.

- [5] H.-Y. Liu, Y.-C. Chou, C.-H. Lin, and J.-H. R. Jiang., "Automatic decoder synthesis: Methods and case studies," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 31, no. 9, pp. 31:1319–31:1331.
- [6] K.-H. Tu and J.-H. R. Jiang, "Synthesis of feedback decoders for initialized encoders," in DAC '13, pp. 1-6.
- [7] D. Abts and J. Kim, High Performance Datacenter Networks, 1st ed., ser. Synthesis Lectures on Computer Architecture. Morgan and Claypool, 2011, vol. 14, ch. 1.6, pp. 7–9.
- [8] Y. Qin, S. Shen, Q. Wu, H. Dai, and Y. Jia., "Complementary synthesis for encoder with flow control mechanism," accepted by ACM Transactions on Design Automation of Electronic Systems, unpublished. https://github.com/shengyushen/dualsyn\_paper/blob/master/todaes14\_final/files/v2-acmsmall-sample.pdf.
- [9] J. R. Jiang, H. Lin, and W. Hung, "Interpolating functions from large boolean relations," in ICCAD'09, pp. 779–784.
- [10] W. Craig, "Linear reasoning: A new form of the herbrand-gentzen theorem," *The Journal of Symbolic Logic*, vol. 22, no. 3, pp. 250–268, Sep. 1957.
- [11] N. Eén and N. Sörensson, "An extensible sat-solver," in SAT'03., pp. 502–518.
- [12] M. Jackson, R. Budruk, J. Winkles, and D. Anderson, PCI Express Technology 3.0. Mindshare Press, 2012.
- [13] IEEE, "Ieee standard for ethernet section fourth," 2012. [Online]. Available: http://standards.ieee.org/getieee802/download/802.3-2012\_section4.pdf