# A 10 bit 130Msample/s 1.43mW 84.9fJ/MHz asynchronous StrongARM SAR ADC in 90nm CMOS

Thomas Flores and Samuel Lenius

Abstract—An asynchronous ten bit analog-to-digital converter based on a successive approximation architecture is presented. This device features a StrongARM based comparator latch, minimum-size unit capacitors (2.5fF) and a bootstrapped track and hold circuit. This design achieves 59.31dB SNR with a 6.1MHz full-scale input. At 130MS/s this converter achieves an 84.9 fJ/MHz figure of merit.

### I. INTRODUCTION

The architecture of the proposed device is based around a top-plate sampling, constant common mode 9-bit capacitive DAC, a bootstrapped track-and-hold, a StrongARM latch comparator and asynchronous processing logic to achieve high throughput and low power consumption. The optimized top plate sampling DAC achieves good settling and transient characteristics. The bootstrapped track-and-hold circuit achieves a high tracking bandwidth and good linearity across the full scale input range. The highly biased StrongARM comparator achieves both a fast regeneration time constant and low dynamic power consumption. The asynchronous logic allows this device to allocate more time to bits that are difficult to resolve and less time to bits that are easy to resolve, achieving a high conversion rate together with a low probability of metastability.

This paper is organized as follows: section two discusses the design modules in detail, section three details the results and section four concludes.

### II. BLOCK DESIGN

### A. Comparator

Fig. 1. Top level design for a fully differential asynchronous SAR ADC.

Fig. 2. Circuit topology for the strong arm latch comparator.

Proper optimization of the comparator element in the SAR ADC will provide the greatest improvement in our figure of merit, as it will ultimately set the maximum speed of our design. Given that our figure of merit is proportional to the power consumed for each decision, the ideal design includes a dynamic comparator which only draws current during the decision time. For this reason, we choose the Strong Arm Latch (Figure 2) for its energy efficiency and simple design [1].

To size our comparator to first order, we begin by considering the required noise specification of the total SAR ADC. We can derive the maximum noise power for our full-scale signal power as

$$\sigma_{n,in}^2 = \frac{P_{sig}}{10^{SNR_{spec}/10}} = \frac{\frac{(0.5V_{FS})^2}{2}}{10^{\frac{SNR_{spec}}{10}}} \tag{1}$$

Because we are designing to first order and want to budget a large amount of noise to the comparator function to improve speed, we impose a slightly higher noise spec of  $58\,\mathrm{dB}$ , thereby allocating half of our noise budget to the comparator. This gives an input referred noise requirement of  $\sigma_{n,in}=445\,\mathrm{\mu V_{RMS}}$  for  $V_{FS}=1\,\mathrm{V}$ .
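As a quick numerical check of Equation 1, the following minimal sketch evaluates the noise budget for the 58 dB target and a 1 V full scale. The function name is illustrative and not part of the original design flow.

```python
import math

def max_input_noise_rms(v_fs: float, snr_spec_db: float) -> float:
    """Eq. (1): allowed RMS noise for a full-scale sine of peak amplitude 0.5 * v_fs."""
    p_sig = (0.5 * v_fs) ** 2 / 2.0            # signal power of the full-scale sine
    return math.sqrt(p_sig / 10 ** (snr_spec_db / 10.0))

sigma_n_in = max_input_noise_rms(v_fs=1.0, snr_spec_db=58.0)
print(f"sigma_n,in = {sigma_n_in * 1e6:.0f} uV RMS")  # prints "sigma_n,in = 445 uV RMS"
```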

We approximate the input referred noise for the strong arm latch as

$$\sigma_{n,in}^2 \approx \frac{8\gamma}{A_v} \frac{kT}{C_{P,Q}} \tag{2}$$

where the gain  $A_v$  during the amplification phase can be approximated as

$$A_v \approx \frac{g_{m1,2}}{I_D} V_{tn} \tag{3}$$

We select  $\frac{g_m}{I_D}=15$  as this provides a reasonable tradeoff between speed and power of the transistor. From simulations, we find that the approximate threshold voltage is  $V_{tn}\approx 300\,\mathrm{mV}$  for our nmos devices. We can then solve for the minimum capacitance necessary at nodes P and Q using Equations 2, 3, and  $\sigma_{n,in}=500\,\mathrm{\mu V_{RMS}}$:

$$C_{P,Q} \ge \frac{8\gamma}{A_v} \frac{kT}{\sigma_{n,in}^2} \to C_{P,Q} \ge 31.11 \,\text{fF} \tag{4}$$
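Equations 2 to 4 can be evaluated with a few lines of code. The sketch below is only illustrative: the excess noise factor $\gamma$ and the temperature are not stated in the text, so $\gamma = 1$ and $T = 300\,\mathrm{K}$ are assumptions, and the result lands in the same range as the quoted bound rather than reproducing it exactly.

```python
K_BOLTZMANN = 1.380649e-23  # J/K

def min_node_capacitance(sigma_n_in, gm_over_id, v_tn, gamma=1.0, temp_k=300.0):
    """Eq. (4): minimum capacitance at nodes P and Q for a target input-referred noise.

    gamma and temp_k are assumed values, not taken from the paper.
    """
    a_v = gm_over_id * v_tn                                   # Eq. (3)
    return (8.0 * gamma / a_v) * K_BOLTZMANN * temp_k / sigma_n_in ** 2

c_pq_min = min_node_capacitance(sigma_n_in=500e-6, gm_over_id=15.0, v_tn=0.3)
print(f"C_PQ >= {c_pq_min * 1e15:.1f} fF")
```

Because the bound scales linearly with $\gamma$ and $T$ and inversely with $\sigma_{n,in}^2$, small changes in those assumptions move the result by several femtofarads.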

To begin our design, we create a unit comparator constructed from some reasonable design choices. To maximize speed in the cross coupled inverters, we assume that pmos elements M5,6 should be twice the width of the nmos elements M3,4. To size M3,4, we must keep the widths large enough that they do not significantly contribute to the mismatch at the input. For our design with  $A_v \approx 4$ , this means that  $W_{3,4} \geq \frac{1}{4}W_{1,2}$ , since the offset of M3,4 is divided by the gain when referred to the input. M7 must be able to bias the complete latch structure, and so we set  $W_7 = 2W_{1,2}$ . Finally, we size S1-S4 such that they are equal in width to M1,2.

Our design methodology involves sweeping this unit cell over  $W_1$  until  $C_{P,Q}$  is above spec. A transient noise simulation is then conducted to verify the input referred noise voltage against the spec. The first order design resulted in much less input referred noise than required, and so the unit cell width  $W_1$  was decreased to approach the noise spec. The resulting sizing is shown in Table I.

TABLE I WIDTH SIZING OF STRONG ARM LATCH COMPARATOR TOPOLOGY (µm). All transistors minimum length  $L=90\,\mathrm{nm}$ 

| Transistor | Unit                  | Optimized |
|------------|-----------------------|-----------|
| M1,2       | 1 µm                  | 5.3 µm    |
| M3,4       | $\frac{1}{4}W_{M1,2}$ | 1.3 µm    |
| M5,6       | $2W_{M3,4}$           | 2.6 µm    |
| M7         | $2W_{M1,2}$           | 10.6 µm   |
| S1-4       | $W_{M1,2}$            | 5.3 µm    |
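The unit-cell ratios in Table I can be written as a small scaling helper. This is a sketch of the ratio rules only; the function name is hypothetical, and the widths in the Optimized column appear to be rounded to 0.1 µm, which the code does not attempt to reproduce.

```python
def scale_comparator(w_input_um: float) -> dict[str, float]:
    """Apply the unit-cell width ratios of Table I to a chosen input-pair width (in um)."""
    w_latch_n = w_input_um / 4.0          # M3,4 = W_M1,2 / 4
    return {
        "M1,2": w_input_um,
        "M3,4": w_latch_n,
        "M5,6": 2.0 * w_latch_n,          # pmos latch devices twice the nmos width
        "M7": 2.0 * w_input_um,           # tail device biases the whole latch
        "S1-4": w_input_um,               # reset switches match the input pair
    }

sizes = scale_comparator(5.3)
for name, width in sizes.items():
    print(f"{name}: {width:.3f} um")
```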

### B. Track and Hold



Fig. 3. Track and hold switch designs. A) Simple nmos switch, B) transmission gate, and C) bootstrapped nmos.

For the SAR ADC to accurately reproduce the input signal in digital code, the signal must be relayed to the comparator input capacitors with minimal distortion. The initial sample acquisition, or "track" mode, must accurately follow the input and later hold the signal in "hold mode" on the comparator input capacitors for the SAR logic to determine the bit level. This requires the use of a switch with highly linear channel resistance across the input full-scale voltage. For a simple nmos switch (Figure 3A), we can write the channel on resistance as

$$R_{on} = \frac{1}{\mu_n C_{ox} \frac{W}{L} \left( V_{DD} - V_{in} - V_{th} \right)} \tag{5}$$

where  $V_g = V_{DD}$  and  $V_s = V_{in}$ . The on-resistance clearly shows a signal dependence through the overdrive term  $V_{DD} - V_{in} - V_{th}$ , with the channel becoming infinitely resistive at  $V_{in} = V_{DD} - V_{th}$ . To alleviate this issue, a transmission gate can be used, in which nmos and pmos devices with tied source and drain are driven out of phase (Figure 3B), resulting in finite on-resistance across all signal inputs due to the parallel combination of the channels. However, there remains a strong signal dependence, maximal around half of the full-scale, which will distort the signal.

The transmission gate is also problematic due to input signal dependent charge injection incident upon the hold capacitor. The channel charge for the nmos device can be written as

$$Q_{ch} = WLC_{ox} \left( V_{DD} - V_{in} - V_{th} \right) \tag{6}$$

so that  $Q_{ch}$  varies linearly with  $V_{in}$ .

For our track and hold circuit, we choose the bootstrapped nmos switch (Figure 3C) for its highly linear channel resistance. Equations 5 and 6 indicate that signal dependence will be greatly reduced when the gate-source voltage, which is  $V_{gs} = V_{DD} - V_{in}$  for the simple switch, is held constant. The bootstrapped nmos accomplishes this by maintaining a constant voltage drop of  $V_{gs} = V_{DD}$ , connecting a precharged capacitor  $C_{boot}$  between the gate and source terminals. The complete circuit topology for the bootstrapped switch is shown in Figure 4. Here, M10 acts as the main nmos switch, with  $C_{boot}$  connected across the gate and source terminals. The remaining switches work to precharge  $C_{boot}$ , connect or disconnect it from M10, and discharge the M10 gate capacitance.

Fig. 4. Circuit topology for the bootstrapped switch for improved on-resistance linearity and reduced charge injection. Switch M10 (red) denotes the main switch.

When  $\phi = 0$ ,  $C_{boot}$  is disconnected from the switch M10 by turning off M7 and M8, and is charged to  $V_{DD}$  through M3. During this time, the gate of M10 is tied to ground through M6 and M9. When  $\phi=1$ , M3 and M11 turn off, and  $C_{boot}$  is connected across M10 by turning on M7 and turning off M9. This allows the signal to pass through M10 while maintaining  $V_{GS}=V_{DD}$ . When  $\phi$  returns to 0,  $C_{boot}$  disconnects and recharges back to  $V_{DD}$ , while the gate capacitance of M10 is discharged to ground.

Sizing the bootstrapped switch is straightforward. The main design consideration involves the tradeoff between minimizing the on resistance by increasing width and minimizing charge injection by reducing width. We then consider the on resistance seen through the main switch M10 as

$$R_{on} = \frac{1}{\mu_n C_{ox} \frac{W}{L} \left( \frac{C_{boot}}{C_{boot} + C_{par}} V_{DD} - \frac{C_{par}}{C_{boot} + C_{par}} V_{in} - V_{th} \right)} \tag{7}$$

To limit the signal dependence, we need to ensure  $\frac{C_{par}}{C_{boot}+C_{par}} \ll 1 \rightarrow C_{boot} \gg C_{par}$ , where  $C_{par}$  is the parasitic capacitance as seen from the top plate of  $C_{boot}$ . As a design choice, we choose  $C_{boot} > 10C_{gM10}$ , as the major capacitive element in this path is the gate capacitance of the M10 switch.
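The benefit of bootstrapping can be illustrated by comparing how much the on-resistance varies over the input range for a simple switch and for a bootstrapped one. The sketch below assumes a charge-sharing model in which the gate-source voltage is $\frac{C_{boot}}{C_{boot}+C_{par}}V_{DD} - \frac{C_{par}}{C_{boot}+C_{par}}V_{in}$. The supply voltage, transconductance parameter `BETA` ($\mu_n C_{ox} W/L$), capacitor values and input range are hypothetical and chosen only for illustration.

```python
def r_on(v_gs, v_th, beta):
    """Triode on-resistance 1 / (beta * (V_gs - V_th)); infinite once the channel is off."""
    v_ov = v_gs - v_th
    return float("inf") if v_ov <= 0 else 1.0 / (beta * v_ov)

def v_gs_simple(v_in, v_dd):
    """Gate tied to V_DD, source at V_in."""
    return v_dd - v_in

def v_gs_bootstrapped(v_in, v_dd, c_boot, c_par):
    """C_boot precharged to V_DD, then charge-shared with the parasitic C_par."""
    k = c_boot / (c_boot + c_par)
    return k * v_dd - (1.0 - k) * v_in

def r_on_spread(v_gs_values, v_th, beta):
    """Ratio of the largest to the smallest on-resistance over the input sweep."""
    r = [r_on(v, v_th, beta) for v in v_gs_values]
    return max(r) / min(r)

# Hypothetical values for illustration only.
V_DD, V_TH, BETA = 1.2, 0.3, 5e-3          # V, V, A/V^2
C_BOOT, C_PAR = 300e-15, 30e-15            # C_boot = 10 * C_par
v_ins = [0.1 * i for i in range(7)]        # 0 V to 0.6 V

spread_simple = r_on_spread([v_gs_simple(v, V_DD) for v in v_ins], V_TH, BETA)
spread_boot = r_on_spread(
    [v_gs_bootstrapped(v, V_DD, C_BOOT, C_PAR) for v in v_ins], V_TH, BETA
)
print(f"simple nmos:  R_on varies by {spread_simple:.2f}x")
print(f"bootstrapped: R_on varies by {spread_boot:.2f}x")
```

Under these assumptions the simple switch resistance changes by about a factor of three across the sweep, while the bootstrapped switch changes by less than ten percent.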

Sizing M3 and M11 remains the only critical choice. Here, the transistors must be large enough to ensure the capacitor charges completely to  $V_{DD}$  within the sampling window.

The remainder of the sizing is to ensure that the switches function as intended. M7 is sized large to ensure minimum resistance for the cap connection to the gate of M10. M4 and M5 are sized with reasonable widths such that  $W_p=2W_n$ , and M1 and M2 are equivalent. Finally, M6 and M9 are moderately sized to ensure fast discharge of the M10 gate capacitance. The resulting sizing of the bootstrapped topology is shown in Table II.

TABLE II SIZING OF BOOTSTRAPPED SWITCH TOPOLOGY. ALL TRANSISTORS MINIMUM LENGTH  $L=90\,\mathrm{nm}$ 

| Optimized |
|-----------|
| 1 μm      |
| 1 μm      |
| 2 μm      |
| 2 μm      |
| 1 μm      |
| 5 µm      |
| 10 µm     |
| 5 µm      |
| 5 µm      |
| 25 µm     |
| 2 μm      |
| 1 μm      |
| 300 fF    |
| 200 fF    |
| 200 fF    |

### C. 9 Bit Capacitive DAC

Here we implement a 9 bit capacitive constant common mode top plate sampling DAC as in [2].

Each bit of the DAC is binary weighted with a single dummy value in the sequence 1, 1, 2, 4, 8, 16, 32, 64, 128. For each bit there is both the top plate sampled capacitor and the corresponding inverter cell.
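The capacitor sequence can be generated and totalled programmatically. This sketch assumes the 2.5 fF unit capacitor quoted in the abstract and that one side of the differential array uses the sequence once; the variable names are illustrative.

```python
C_UNIT_F = 2.5e-15  # unit capacitor from the abstract

# One dummy unit followed by binary weights 1, 2, 4, ..., 128.
weights = [1] + [2 ** k for k in range(8)]
capacitors_f = [w * C_UNIT_F for w in weights]
total_units = sum(weights)
total_cap_f = sum(capacitors_f)

for w, c in zip(weights, capacitors_f):
    print(f"{w:4d} units -> {c * 1e15:6.1f} fF ({w / total_units:.4f} of the array)")
print(f"total: {total_units} units, {total_cap_f * 1e15:.0f} fF")
```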



Fig. 5. Optimization of settling time vs device unit width.



Fig. 6. Capacitive DAC with top-plate sampling.

In order to have equal time constants between capacitors and inverter cells, equal scaling parameters are applied to each, hence the device widths of the cells follow the same sequence as the capacitors.

One complication of this approach is that the electrical effort to drive the inverters scales with this same sequence. Hence as the driver inverters scale up, the sequence of gate drive buffers must as well for optimal delay and power consumption.

This leads to each DAC inverter cell being designed individually for the expected range of electrical effort of that cell across optimization.

Our initial implementation of the inverter cell followed [2]; however, we found that by using an inverter and a pair of N-channel devices we were able to achieve lower total delay and settling time.

The specific issue with a standard inverter cell was that as the DAC switched its MSB or MSB-1 bits, the dV/dt induced currents through the P-channel devices of the MSB-1 and MSB-2 cells caused a disturbance that reduced the gate overvoltage, hence increasing channel resistance and lengthening the duration of the glitch.

By using a pair of N-channel devices we have applied a much higher gate overvoltage, and additionally the switching glitches now increase the gate overvoltage, improving the recovery time for glitches. The use of two N-channel devices here reduces the total gate capacitance to drive by a factor of two while reducing the output node parasitic capacitance, hence reducing DAC power overall.

Fig. 7. DAC inverter unit cells.

Using a criterion of settling to within 1/2 LSB, our optimized design has a delay of 370ps. Accounting for the 1% parasitic capacitance to ground on the top plate of the DAC capacitors, our DAC reference voltage is 606mV in this design.

### D. Asynchronous Reset Logic



Fig. 8. State transition diagram for asynchronous reset logic.

In this ADC we implement asynchronous reset logic as in [3]. The state machine that our asynchronous logic implements is as follows.

First, the sampling clock rises. This opens the sampling gates and resets the internal state of the logic.

Second, the sampling clock drops and the asynchronous clock rises, enabling the dynamic comparator.

Third, the comparator makes a decision and one of its two output lines drops. The outputs of the comparator are tied to the inputs of a NAND gate as well as a buffer chain to drive the logic. The reset state of the comparator has both lines high, hence the output of the NAND is low. Once the comparator has made a decision, the output of the NAND goes high; we call this signal 'Ready'. When Ready rises, we sample and shift one bit, and drop the asynchronous clock.

Fourth, after the bit is sampled, we reset the comparator. Here again we use the Ready signal to indicate that this step is finished. Once both of the comparator's output lines rise, the NAND's output goes low and we are prepared to convert the next bit. The falling Ready signal causes the asynchronous clock to rise again, re-enabling the comparator.

Fifth, once all 10 bits are shifted out, we assert the done signal internally to the logic and the comparator is held in low power reset until the next sample arrives.
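The handshake described in these five steps can be sketched as a small behavioral model. This is not the implemented logic: the comparator is replaced by a list of precomputed decisions, the mapping of a decision to a particular output line is arbitrary, and the function names are hypothetical.

```python
def ready(out_p: bool, out_n: bool) -> bool:
    """NAND of the comparator outputs: low in reset (both high), high after a decision."""
    return not (out_p and out_n)

def run_conversion(decisions):
    """Walk the asynchronous loop once per bit and log the handshake events.

    decisions: sequence of bools standing in for the comparator outcome of each bit.
    """
    events = ["sample: sampling gates open, logic reset"]
    shift_register = []
    out_p = out_n = True                       # comparator reset state
    for decision in decisions:
        # Asynchronous clock is high: the comparator evaluates and one line drops.
        out_p, out_n = (True, False) if decision else (False, True)
        if ready(out_p, out_n):
            shift_register.append(int(decision))
            events.append(f"ready rises: shift {int(decision)}, async clock drops")
        # Comparator resets: both lines return high.
        out_p = out_n = True
        if not ready(out_p, out_n):
            events.append("ready falls: async clock rises")
    events.append("done: comparator held in reset")
    return shift_register, events

bits, events = run_conversion([True, False] * 5)
print(bits)
print(len(events), "events")
```

Each bit contributes one rising and one falling Ready event, so the time spent per bit in the real circuit depends only on how long the comparator takes to resolve and reset.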

We compute the maximum synchronous sampling speed given our specifications for  $P_{meta}$  and the measured performance characteristics of the circuit. Our comparator has a measured regeneration time of 16.7ps and a slewing time of 64ps. We require two FO4 buffers to connect our comparator to our SAR logic block, each with a delay of 40ps. Assuming a 50% duty cycle clock and 12 clocks per conversion, the maximum synchronous sampling speed can be computed as follows:

$$N_{taus} = \ln\left(\frac{N_{codes}}{P_{meta}}\right) = 20.7 \tag{8}$$

$$T_{clock,high} = N_{taus} \cdot \tau_{comp} + 2 \cdot T_{FO4} + T_{slew} = 490\,\text{ps} \tag{9}$$

$$T_{clock} = 2 \cdot T_{clock,high} \tag{10}$$

$$T_{sample,sync} = 12 \cdot T_{clock} \tag{11}$$

$$F_{sample,sync} = 85.1 \,\mathrm{MS \,s^{-1}} \tag{12}$$

Equation 5 of [3] gives the speedup achievable by using asynchronous logic. There, Chen shows that the expected speedup for an N bit converter approaches 2 for a large number of bits; for our 10 bit converter, the expected speedup is 1.56. This means that, for no additional energy per bit, a converter using asynchronous logic can run over 1.5 times as fast as its synchronous counterpart. The decision to use asynchronous logic here is obvious.

$$F_{sample,async} = F_{sample,sync} \cdot AsyncSpeedup = 133.2 \,\mathrm{MS}\,\mathrm{s}^{-1} \tag{13}$$

This establishes the maximum achievable speed of our device while maintaining the  $P_{meta}$  specification.
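The timing budget of Equations 8 to 13 can be recomputed in a few lines. The values of $N_{codes}$ and $P_{meta}$ are not stated explicitly in the text; the sketch assumes $N_{codes} = 1024$ and $P_{meta} = 10^{-6}$, which together give $N_{taus} \approx 20.7$. Because of rounding in the intermediate values, the results agree with the quoted figures only to within about one percent.

```python
import math

def sync_sample_rate(n_codes, p_meta, tau_comp, t_fo4, t_slew, clocks_per_conv=12):
    """Evaluate Eqs. (8)-(12) for a 50% duty cycle clock."""
    n_taus = math.log(n_codes / p_meta)                 # Eq. (8)
    t_high = n_taus * tau_comp + 2 * t_fo4 + t_slew     # Eq. (9)
    t_clock = 2 * t_high                                # Eq. (10)
    t_sample = clocks_per_conv * t_clock                # Eq. (11)
    return n_taus, t_high, 1.0 / t_sample               # Eq. (12)

# n_codes and p_meta are assumed; the delays are the values quoted in the text.
n_taus, t_high, f_sync = sync_sample_rate(
    n_codes=1024, p_meta=1e-6, tau_comp=16.7e-12, t_fo4=40e-12, t_slew=64e-12
)
f_async = f_sync * 1.56                                 # Eq. (13), speedup from [3]

print(f"N_taus       = {n_taus:.1f}")
print(f"T_clock,high = {t_high * 1e12:.0f} ps")
print(f"F_sync       = {f_sync / 1e6:.1f} MS/s")
print(f"F_async      = {f_async / 1e6:.1f} MS/s")
```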



Fig. 9. Comparator switching (left) and regeneration time (right).



Fig. 10. Transient noise simulation with varying input differential voltage. Error function fit extracts the input referred noise voltage  $\sigma_n=411\,\mu\mathrm{V}_{\mathrm{RMS}}$ 

### III. RESULTS

### A. Comparator

The proposed device achieves 59.31dB SNR at 6.1MHz input and 64.6dB SNR at Nyquist with 64 point FFTs. The transient time domain results, along with their corresponding Fourier transforms, are presented here at  $f_s = 130\,\mathrm{MS/s}$ . In this operating condition the design consumes 1.43mW of power and hence achieves a figure of merit of 84.9 fJ/MHz. We achieve these results through judicious optimization of each module individually, followed by careful integration paying close attention to delays caused by suboptimal logic fanouts.

### IV. CONCLUSIONS


### REFERENCES

- [1] B. Razavi, "The StrongARM Latch [A Circuit for All Seasons]," *IEEE Solid-State Circuits Magazine*, vol. 7, no. 2, pp. 12–17, 2015. [Online]. Available: http://ieeexplore.ieee.org/document/7130773/
- [2] V. Tripathi and B. Murmann, "An 8-bit 450-MS/s single-bit/cycle SAR ADC in 65-nm CMOS," in *2013 Proceedings of the ESSCIRC (ESSCIRC)*, Sept. 2013, pp. 117–120.
- [3] S. W. M. Chen and R. W. Brodersen, "A 6-bit 600-MS/s 5.3-mW asynchronous ADC in 0.13-µm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 12, pp. 2669–2680, Dec. 2006.

Fig. 11. Transient noise simulations and 64 point FFTs.