# An Efficient and Implementation-friendly Method of Precision-adaptive ADC Design for More Intelligent Edge Computing

## 1st Given Name Surname

dept. name of organization (of Aff.)
name of organization (of Aff.)
City, Country
email address or ORCID

## 4<sup>th</sup> Given Name Surname

dept. name of organization (of Aff.)
name of organization (of Aff.)
City, Country
email address or ORCID

## 2<sup>nd</sup> Given Name Surname

dept. name of organization (of Aff.)
name of organization (of Aff.)
City, Country
email address or ORCID

## 5<sup>th</sup> Given Name Surname

dept. name of organization (of Aff.)
name of organization (of Aff.)
City, Country
email address or ORCID

## 3<sup>rd</sup> Given Name Surname

dept. name of organization (of Aff.)
name of organization (of Aff.)
City, Country
email address or ORCID

## 6<sup>th</sup> Given Name Surname

dept. name of organization (of Aff.)
name of organization (of Aff.)
City, Country
email address or ORCID

Abstract—Deploying Neural Networks (NNs) on edge devices is an emerging trend that has attracted considerable research effort, and improving the system's energy efficiency is critical. While NNs are now able to process data of varying precision for more efficient multi-task analysis, Analog-to-Digital Converters (ADCs), which dominate the power consumption of traditional sensing systems, can also be designed more intelligently for edge computing. In this work, an efficient and implementation-friendly method of precision-adaptive ADC design is proposed with fine-grained power gating strategies. We present two case study ADC designs applied in CMOS Image Sensors (CISs) and demonstrate the effectiveness of the proposed method in detail. Results show that almost half of the ADCs' power consumption can be saved for low-precision conversion, while only a small amount of extra control circuitry is required.

Index Terms—ADC, precision-adaptive, energy-efficient, edge computing

## I. INTRODUCTION

With the development of the Internet-of-Things (IoT), edge devices such as mobile phones, smart watches, and other portable products play an important role in people's daily lives by collecting and processing data on site and in real time. Meanwhile, Neural Networks have shown broad prospects for sensing applications such as computer vision, speech recognition, and robotics. Deploying NNs on edge devices is therefore an emerging trend that has attracted considerable research effort.

To integrate large and computationally intensive NN models on edge devices whose power supply and computing resources are quite limited, improving the system's energy efficiency is critical. While compression and optimization of NN models have been extensively studied [xxxx], Analog-to-Digital Converters, which dominate the power consumption of traditional sensing systems, have generally been treated as black boxes.

However, as more and more intelligent algorithms are able to process data of varying precision for efficient multi-task analysis [xxxx], there is an opportunity to further improve energy efficiency by making algorithm-aware adjustments inside the ADCs. Since the precision (i.e., the number of quantization bits) is at the heart of an ADC's energy constraints, making it dynamically adaptive is promising.

There have been related works on configurable ADC design [xxx]. However, these works mainly focus on a single Successive Approximation Register (SAR) ADC with complex extra configuration circuits, which is not suitable for vision applications: it is difficult for a single SAR ADC to achieve the throughput required by image processing, where column-parallel ADCs are widely adopted. On the other hand, for more efficient image processing, the camera's resolution has already been made dynamically adaptive [xxxx], but the precision of the ADCs has not been treated as a variable. In addition, there are efforts to relax the design specifications of ADCs by first applying low-precision analog computing close to the sensor [xxxx]. However, high-precision ADCs and NN processors are still required for complex tasks. Therefore, dynamic adjustment inside the ADCs remains competitive.

Motivated by these facts, we make the following contributions in this paper:

- 1) Two case study ADC designs are presented. The first is a set of column-parallel Single-Slope (SS) ADCs and the second is a set of column-parallel SAR/SS ADCs. Both designs can be applied in CISs and are suitable for different design specifications.
- 2) A method combining adaptive precision and fine-grained power gating strategies is proposed for smarter ADC design. We present the power distribution across all related modules of the ADCs and demonstrate the effectiveness of the proposed method with specific simulation results. Results show that the ADCs' power consumption can be reduced to nearly half for low-precision conversion.
- 3) Although the details differ, we apply this method successfully to the two different ADC designs with similar logic and little extra control circuitry. Therefore, we argue that the proposed method is general, applying to ADCs of not only different design specifications but also different architectures.

The remainder of this paper is organized as follows. Sect. II presents the architecture overview of the two case study ADC designs. Sect. III describes the adaptive precision and power gating strategies for the two case study ADC designs, specifically. Sect. IV reports the evaluation results and Sect. V develops discussions. Finally, Sect. VI concludes this paper.

## II. ARCHITECTURE OVERVIEW

## A. Architecture of the SS ADCs

The overall architecture of the SS ADCs is presented in Fig. 1. The main modules include column-parallel Correlated Double Sampling (CDS) circuits, comparators, and a column-shared ramp generator. Fig. 2 shows the basic operational waveform of the SS ADCs. When the ramp signal exceeds the output of the CDS circuit in a given column, the corresponding comparator flips and latches the time information $\Delta t$ into the 8-bit registers of that column as the conversion result. The conversions across all columns are complete as soon as the ramp signal reaches $V_{refh}$.



Fig. 1. Overall Architecture of the SS ADCs.



Fig. 2. Operational waveform of the SS ADCs.

The structures of the three main modules are described in more detail below, which also explains the three waveforms in Fig. 2.

1) CDS Circuits: The CDS circuits are the interface between the pixel array and the ADCs. They are responsible for subtracting the pixels' signal voltages from the reference voltages and amplifying the difference by a certain coefficient. The difference (i.e., $\Delta V$ in Fig. 2) is physically related to the exposure time of the pixels, and the subtraction helps cancel the noise caused by the varying reference voltages.

Switched-capacitor operational amplifiers are commonly used in CDS circuits, as presented in Fig. 3. According to the law of charge conservation, the output voltage of the CDS circuits (in PHASE2 of Fig. 2) can be calculated as (1). Note that Input Offset Cancelation (IOS) is realized, which is necessary because the amplifiers in different columns may have different offset voltages.



Fig. 3. Structure of the CDS Circuits.

$$
\begin{aligned}
V_{out} ={}& \left[ V_{ref} + \frac{C_1}{C_2} (V_{rst} - V_{sig}) \right] \cdot \frac{\beta A}{1 + \beta A} \\
&+ (V_{refl} + V_{os}) \cdot \frac{A}{1 + A} \cdot \frac{1}{\beta A}, \qquad \text{where } \beta = \frac{C_2}{C_1 + C_2}
\end{aligned}
\tag{1}
$$
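To get a feel for how the finite amplifier gain $A$ enters (1), the expression can be evaluated numerically. The following is a minimal sketch: the function name `cds_output` and every numeric value in it are hypothetical choices for illustration and are not taken from the designs in this paper.

```python
def cds_output(v_ref, v_refl, v_rst, v_sig, v_os, c1, c2, gain):
    """Evaluate Eq. (1) for one column's CDS stage with open-loop gain A."""
    beta = c2 / (c1 + c2)            # feedback factor defined in Eq. (1)
    loop_gain = beta * gain          # the product beta * A
    signal = (v_ref + (c1 / c2) * (v_rst - v_sig)) * loop_gain / (1 + loop_gain)
    offset = (v_refl + v_os) * gain / (1 + gain) / loop_gain
    return signal + offset


# Hypothetical operating point (volts, farads); none of these values come from the paper.
ideal = 0.5 + 2.0 * 0.2              # V_ref + (C1/C2) * (V_rst - V_sig)
for gain in (1e2, 1e3, 1e4):
    v_out = cds_output(0.5, 0.5, 1.2, 1.0, 0.005, 2e-12, 1e-12, gain)
    print(f"A = {gain:.0e}: V_out = {v_out:.5f} V, error = {v_out - ideal:+.5f} V")
```

Under these assumed values, the output approaches $V_{ref} + (C_1/C_2)(V_{rst} - V_{sig})$ as $A$ grows, and the offset term shrinks with $1/(\beta A)$.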

2) Ramp Generator: As presented in Fig. 4, the ramp generator consists of a thermometer counter and a Capacitor Digital-to-Analog Converter (CDAC). While the capacitors in the CDAC are switched one by one from $V_{refl}$ to $V_{refh}$, the output voltage of the ramp generator is given by (2) according to the law of charge conservation. In this equation, $N$ is the number of switched capacitors and $M$ is the total number of capacitors of the same size (for 8-bit precision, the total number is 255). Therefore, as in PHASE3 of Fig. 2, the ramp signal is a staircase from $V_{refl}$ to $V_{refh}$, whose range is consistent with the output of the CDS circuits. The height of each stair is the Least Significant Bit (LSB) converted by the ADCs.

The three buffers in Fig. 4 ensure that the reference voltages and the ramp signal have enough driving capability. The output buffer is the largest because it has to drive hundreds of column-parallel comparators.



Fig. 4. Structure of the Ramp Generator in the SS ADCs.

$$V_{ramp} = V_{refl} + \frac{N}{M} (V_{refh} - V_{refl}) \tag{2}$$
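The staircase in (2) and the latch-on-first-flip behavior described for the SS ADCs can be sketched as a small behavioral model. This is an idealized illustration only: the function names and the 0 V to 1 V reference range are assumptions, and the comparator is treated as ideal with no offset or delay.

```python
def ramp_voltage(n, m, v_refl, v_refh):
    """Eq. (2): ramp level after n of the m unit capacitors have been switched."""
    return v_refl + (n / m) * (v_refh - v_refl)


def single_slope_convert(v_cds, v_refl, v_refh, bits=8):
    """Return the count at which the staircase ramp first exceeds v_cds."""
    m = 2 ** bits - 1                      # 255 unit capacitors for 8 bits
    for n in range(m + 1):
        if ramp_voltage(n, m, v_refl, v_refh) > v_cds:
            return n                       # comparator flips and the count is latched
    return m                               # ramp never exceeded v_cds: saturate


# Hypothetical reference range of 0 V to 1 V.
for v in (0.1, 0.5, 0.9):
    print(v, single_slope_convert(v, 0.0, 1.0))
```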

3) Comparators: The comparators compare the CDS circuits' output with the ramp signal. In the SS ADCs, two-stage open-loop comparators can be applied, as presented in Fig. 5. Again according to the law of charge conservation, the comparators' output (in PHASE3 of Fig. 2) can be calculated as (3). The comparison is dominated by $V_{ramp} - V_{cds}$ as long as the amplifiers' open-loop gain is large enough and IOS is realized. In addition, the comparators' speed depends on the amplifiers' bandwidth and slew rate.



Fig. 5. Structure of the Comparators in the SS ADCs.

$$V_{out} = A^2 (V_{ramp} - V_{cds}) + (V_{refl} + V_{os}) \cdot \frac{A}{1+A} \tag{3}$$

## B. Architecture of the SAR/SS ADCs

The overall architecture of the SAR/SS ADCs is almost the same as that of the SS ADCs. The only two differences are that the comparators are replaced by low-precision (4-bit) SAR sub-ADCs and the ramp generator is replaced by a one-hot counter plus a Resistance Digital-to-Analog Converter (RDAC).

As presented in Fig. 6, while the SAR sub-ADCs generate the upper 4-bit results, $V_X$ in Fig. 6 changes according to the SAR logic. That is, after 4 comparisons with the reference voltage, $V_X$ is given by (4), where $D_U[i]$ is the $i$-th bit of the upper 4 bits. We assign 14 steps, i.e., 28 clocks in the SAR/SS ADCs, to the 4 comparisons of the SAR logic. The ramp generator then starts working, making $V_X$ increase gradually as in (5), where $D_L[i]$ is the $i$-th bit of the lower 6 bits. When $V_{X.2}$ reaches $V_{ref}$, the corresponding $V_{cds}$ is given by (6), which is exactly what the 10-bit conversion result represents.

Fig. 7 shows the structure of the ramp generator in the SAR/SS ADCs, which consists of an R-string made up of 68 unit resistors. $V_{ramp}$ has a total of 68 steps, of which 64 steps (128 clocks) with a step size of $(V_{refh} - V_{refl})/64$ are used to generate the lower 6-bit results and 4 steps (4 clocks) are used to make sure that the comparators always flip so that the results are latched. During operation, the R-string outputs $V_0$ to $V_{67}$ are selected sequentially, so that $V_{ramp}$ changes from $V_{refl}$ to $V_{refl} + 17/16(V_{refh} - V_{refl})$. Compared to a CDAC, an RDAC is able to generate the ramp signal without the gain error caused by the input capacitance of the output buffer, which is necessary for achieving 10-bit precision in the SAR/SS mixed architecture.

The related operational waveform is presented in Fig. 8. The upper 4-bit results (the second-to-last term of (6)) are generated by the SAR logic, and the lower 6-bit results (the last term of (6)) are counted according to the time between the start of the ramp signal and the comparators' last flip.



Fig. 6. Structure of the SAR Sub-ADCs in the SAR/SS ADCs.

$$V_{X.1} = V_{cds} + \sum_{i=1}^{4} \frac{V_{ref}}{2^i} D_U[i] \tag{4}$$



Fig. 7. Structure of the Ramp Generator in the SAR/SS ADCs.

$$V_{X.2} = V_{X.1} + \frac{V_{ramp}}{2^4}, \qquad \text{where } V_{ramp} = \frac{V_{ref}}{2^6 - 1} \sum_{i=1}^{6} 2^{6-i} D_L[i] \tag{5}$$

$$
\begin{aligned}
V_{cds} &= k (V_{rst} - V_{sig}) \\
&\approx V_{ref} - \sum_{i=1}^{4} \frac{V_{ref}}{2^{i}} D_{U}[i] - \sum_{i=1}^{6} \frac{V_{ref}}{2^{4+i}} D_{L}[i]
\end{aligned}
\tag{6}
$$
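The two-phase conversion described by (4) to (6) can be illustrated with an idealized behavioral model. This is a sketch under stated assumptions rather than the circuit's actual control logic: the function names are hypothetical, the SAR decision rule (keep a binary weight only if $V_X$ stays at or below $V_{ref}$) is an assumption consistent with (4), the ramp step seen at $V_X$ is taken as $V_{ref}/2^{10}$ following the approximation in (6), and the extra ramp steps used for latching are omitted.

```python
def sar_ss_convert(v_cds, v_ref, upper_bits=4, lower_bits=6):
    """Return (upper, lower) codes for one sample, following Eqs. (4)-(6)."""
    v_x = v_cds
    upper = 0
    for i in range(1, upper_bits + 1):
        trial = v_x + v_ref / 2 ** i          # add the i-th binary weight, Eq. (4)
        bit = 1 if trial <= v_ref else 0      # assumed rule: keep it if V_X stays <= V_ref
        if bit:
            v_x = trial
        upper = (upper << 1) | bit
    lsb = v_ref / 2 ** (upper_bits + lower_bits)  # ramp step at V_X, per the approximation in Eq. (6)
    lower = 0
    max_lower = 2 ** lower_bits - 1
    while lower < max_lower and v_x + (lower + 1) * lsb <= v_ref:
        lower += 1                            # one more ramp step before the comparator flips
    return upper, lower


def reconstruct(upper, lower, v_ref, upper_bits=4, lower_bits=6):
    """Right-hand side of Eq. (6) for the given codes."""
    code = (upper << lower_bits) | lower
    return v_ref - code * v_ref / 2 ** (upper_bits + lower_bits)


# Hypothetical sample with V_ref = 1 V.
v_ref = 1.0
v_cds = v_ref - 300.5 / 1024
upper, lower = sar_ss_convert(v_cds, v_ref)
print(upper, lower, abs(reconstruct(upper, lower, v_ref) - v_cds))
```

In this model the reconstruction error stays below one 10-bit LSB for inputs between 0 and $V_{ref}$.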



Fig. 8. Operational waveform of the SAR/SS ADCs.

As for the comparators inside the low-precision SAR sub-ADCs, the traditional strong-arm plus pre-amp structure is adopted as presented in Fig. 9. Such comparators are suitable for multiple comparisons because high speed is easy to achieve. Besides, the pre-amps' offset voltages can be canceled effectively through the Output Offset Cancelation (OOS) technique [x].

The total number of clocks for conversion in the SAR/SS ADCs is 82 (14+68), which is far fewer than the 1024 clocks (for 10-bit conversion) required if the SS architecture is adopted. However, the SAR/SS ADCs trade area for fewer conversion clocks. Therefore, a different architecture can be chosen for different design specifications.
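The clock budget comparison above is simple arithmetic and can be checked directly. The sketch below takes the counts exactly as stated in the text (14 for the SAR phase, 68 for the ramp phase); the function names are hypothetical.

```python
def ss_clocks(bits):
    """Clocks for a pure single-slope conversion: one clock per ramp level."""
    return 2 ** bits


def sar_ss_clocks(sar_phase=14, ramp_phase=68):
    """Clocks for the SAR/SS conversion, using the counts stated in the text."""
    return sar_phase + ramp_phase


ratio = ss_clocks(10) / sar_ss_clocks()
print(ss_clocks(10), sar_ss_clocks(), round(ratio, 1))
```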



Fig. 9. Structure of the Comparators in the SAR Sub-ADCs.

## III. ADAPTIVE PRECISION AND POWER GATING STRATEGIES

## A. Power Gating Implementation

Power gating can be implemented simply by adding PMOS switches between the functional blocks and the supply voltage [x]. When the switches are turned off, the current paths of the corresponding blocks are cut off and energy is saved. The more current is placed under control, the more effective the power gating is. However, to avoid an unacceptable IR drop, the total size of the switches must be large, so inverters should be inserted between the control signal and the switches' gates to provide adequate driving capability. In addition, the functional blocks' recovery speed from power-off should be taken into consideration, because the slower the recovery, the less time the blocks can remain power gated.

## B. Strategy for the SS ADCs

As evaluated in Sect. IV, the SS ADCs' power consumption is dominated by the column-parallel comparators, the bias circuits, and the output buffer of the ramp generator. Considering that all bias circuits settle only once (tens of microseconds after the whole system powers up), after which the other circuits can be settled quickly by the distributed vcom bias circuits, we apply power gating only to the comparators and the output buffer.

The related waveforms are presented in Fig. 10. For low-precision conversion, the thermometer counter is extended to support switching the capacitors in the CDAC 16 at a time rather than one by one, so the ramp signal reaches $V_{refh}$ in 16 clocks (for 4 bits) rather than 256 clocks (for 8 bits). After the 16 clocks, the comparators and the output buffer can be power gated for the rest of the conversion period, during which their output signals drop gradually.
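The two counter modes can be sketched as follows. This is a behavioral illustration with hypothetical function names: the final step is clamped to all 255 unit capacitors so that both modes end at $V_{refh}$, which is an assumption about how the extended counter handles the last step, and the comparator is ideal.

```python
def ramp_schedule(stride, total_clocks=256, m=255):
    """Number of switched capacitors at each clock for a given counter stride."""
    n_clocks = total_clocks // stride
    # The last step is clamped to m so that the ramp ends at V_refh in both modes.
    return [min(k * stride, m) for k in range(1, n_clocks + 1)]


def convert(v_cds, schedule, v_refl, v_refh, m=255):
    """Index of the first clock at which the ramp exceeds v_cds."""
    for clock, n in enumerate(schedule):
        if v_refl + n / m * (v_refh - v_refl) > v_cds:
            return clock
    return len(schedule) - 1


high = ramp_schedule(stride=1)     # 8-bit mode: 256 clocks
low = ramp_schedule(stride=16)     # 4-bit mode: 16 clocks
print(len(high), len(low))
print(convert(0.5, high, 0.0, 1.0), convert(0.5, low, 0.0, 1.0))
```

With the assumed 0 V to 1 V range, the 4-bit code for a mid-scale input equals the upper four bits of the 8-bit code, as expected from the 16-fold stride.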

## C. Strategy for the SAR/SS ADCs

As evaluated in Sect. IV, the SAR/SS ADCs' power consumption is dominated by the column-parallel buffers of the SAR logic, which can be power gated as presented in Fig. 11. For low-precision conversion, the ramp signal is generated as usual but the buffers of the SAR logic are powered off at that point, leaving the 4-bit results to be converted entirely by the SAR logic.

Compared with the SS ADCs, the one-hot counter in the SAR/SS ADCs does not need to support two modes for adaptive precision. Moreover, the start signal of the ramp generator and the power-off signal of the buffers can be the same.



Fig. 10. Power Gating and Adaptive Precision Strategies for the SS ADCs.

Therefore, the SAR/SS ADCs require comparatively less extra control circuitry. As for the ratio of power-gated time to conversion time, 68/82 is achieved in the SAR/SS ADCs, which is comparable to 240/256 in the SS ADCs.
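The two gating ratios quoted above can be put side by side numerically. The helper name `gated_fraction` is hypothetical; the clock counts are those stated in the text.

```python
from fractions import Fraction


def gated_fraction(gated_clocks, total_clocks):
    """Fraction of the conversion time during which the gated blocks are off."""
    return Fraction(gated_clocks, total_clocks)


ss = gated_fraction(240, 256)       # SS ADCs in 4-bit mode
sar_ss = gated_fraction(68, 82)     # SAR/SS ADCs in 4-bit mode
print(f"SS: {float(ss):.3f}, SAR/SS: {float(sar_ss):.3f}")
```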



Fig. 11. Power Gating and Adaptive Precision Strategies for the SAR/SS ADCs.

## IV. EVALUATION RESULTS

This section provides quantitative results on the fundamental characteristics and energy-saving performance of the two case study ADC designs. The data are obtained from simulation in Virtuoso's AMS Environment. Although some digital modules are modeled at the behavioral level in Verilog, we argue that the power consumption of these parts can be ignored because these digital modules do not have static current paths.

## A. Evaluation of the SS ADCs

The fundamental characteristics of the SS ADCs are summarized in Table I. Assuming a 512×512 pixel array, the frame rate will be xx fps. The SNDR of the SS ADCs is xx dB/ xx dB, which means the ENOB is 3.x bits/ 7.x bits. The power consumption of the SS ADCs is measured and divided by the number of columns. Compared to xx uW/column for high-precision conversion, only xx uW/column is needed for low-precision conversion, a reduction of almost half. A more specific energy analysis is presented in Fig. 12. Most of the power consumption is taken up by the column-parallel comparators and the output buffer of the ramp generator, both of which can be power gated effectively for low-precision conversion. The related quantitative results are presented in Fig. 13, where the peripheral circuits include a bandgap and voltage divider, some level-shift circuits, and global buffers.

TABLE I PERFORMANCE OF THE SS ADCS

| Parameter                  | Value           |
|----------------------------|-----------------|
| Process                    | 65 nm           |
| Supply voltage             | 2.5 V/1.2 V     |
| Clock frequency            | 25 MHz          |
| Architecture               | SS              |
| Quantization bits          | 4/8             |
| Conversion time            | 2.4 us/12 us    |
| Number of parallel columns | 512             |
| Throughput                 | 42.5 MSPS       |
| Power (per column)         | 40.8 uW/76.2 uW |
| SNDR                       |                 |
| FoM                        |                 |
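As a quick consistency check on Table I, the power saving and the aggregate throughput can be recomputed from the listed values. This sketch assumes that the two power and conversion-time entries are ordered as low-precision/high-precision and that throughput is the number of columns divided by the high-precision conversion time; the function names are hypothetical.

```python
def power_saving(p_low_uw, p_high_uw):
    """Relative per-column power reduction in low-precision mode."""
    return 1.0 - p_low_uw / p_high_uw


def throughput_sps(columns, conversion_time_us):
    """Aggregate samples per second for column-parallel conversion."""
    return columns / (conversion_time_us * 1e-6)


saving = power_saving(40.8, 76.2)
rate = throughput_sps(512, 12.0)
print(f"saving = {saving:.1%}, throughput = {rate / 1e6:.1f} MSPS")
```

Under these assumptions the saving comes out slightly under one half, in line with the "almost half" stated in the text, and the recomputed throughput is close to the tabulated 42.5 MSPS.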



Fig. 12. Power Saving Results of the SS ADCs.



Fig. 13. Power Saving Results of the SS ADCs.

## B. Evaluation of the SAR/SS ADCs

The fundamental characteristics of the SAR/SS ADCs are summarized in Table II. While the throughput is set similar to that of the SS ADCs, a lower clock frequency is required. The SNDR is xx dB/ xx dB and the ENOB is 3.x bits/ 9.x bits, consistent with the design specifications. The power consumption is xx uW/column for high-precision conversion and xx uW/column for low-precision conversion. According to the energy analysis presented in Fig. 14, most of the SAR/SS ADCs' power consumption is taken up by the column-parallel comparators and the buffers of the SAR logic. For low-precision conversion, the energy of these parts can also be reduced to nearly half. The related quantitative results are presented in Fig. 15.

TABLE II PERFORMANCE OF THE SAR/SS ADCS

| Parameter                  | Value           |
|----------------------------|-----------------|
| Process                    | 65 nm           |
| Supply voltage             | 2.5 V/1.2 V     |
| Clock frequency            | 25 MHz          |
| Architecture               | SAR/SS          |
| Quantization bits          | 4/10            |
| Conversion time            | 2.4 us/12 us    |
| Number of parallel columns | 512             |
| Throughput                 | 42.5 MSPS       |
| Power (per column)         | 40.8 uW/76.2 uW |
| SNDR                       |                 |
| FoM                        |                 |



Fig. 14. Power Saving Results of the SAR/SS ADCs (panels: high-precision mode and low-precision mode).



Fig. 15. Power Saving Results of the SAR/SS ADCs.

## V. DISCUSSIONS

By combining adaptive precision and power gating strategies, both the SS ADCs and the SAR/SS ADCs can save a significant amount of energy dynamically.

In comparison, the SAR/SS ADCs can support more quantization bits and require less extra control circuitry for the adaptive precision, while the SS ADCs inherently require less area and can be applied effectively in the 4/8-bit situation. Therefore, different structures can be chosen according to the specific design specifications.

For other precision configurations (e.g., 2/8 adaptive precision) and numbers of parallel columns, the corresponding power consumption and energy-saving performance can also be estimated by extending the evaluation results in Sect. IV.
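One possible way to carry out such an estimate is a first-order model that splits the per-column power into an always-on part and a gateable part. The sketch below is purely illustrative and is not a result of this paper: it assumes a fixed conversion slot of $2^8$ clocks, that the gated blocks draw no power while gated, that recovery overhead is negligible, and that the SS power figures in Table I are ordered low-precision/high-precision. The function names are hypothetical.

```python
def calibrate(p_high, p_low, high_bits=8, low_bits=4):
    """Split per-column power into always-on and gateable parts (first-order model)."""
    slot = 2 ** high_bits                      # clocks in one conversion slot
    active_low = 2 ** low_bits / slot          # fraction of the slot with gated blocks on
    gateable = (p_high - p_low) / (1 - active_low)
    always_on = p_high - gateable
    return always_on, gateable


def estimate_power(bits, always_on, gateable, high_bits=8):
    """Estimated per-column power when the ramp runs for 2**bits clocks per slot."""
    active = 2 ** bits / 2 ** high_bits
    return always_on + gateable * active


# Calibrated on the SS ADC figures in Table I (uW per column).
always_on, gateable = calibrate(p_high=76.2, p_low=40.8)
for bits in (2, 4, 6, 8):
    print(bits, round(estimate_power(bits, always_on, gateable), 1))
```

By construction the model reproduces the two calibration points, so any other configuration is an extrapolation whose accuracy depends on how well the stated assumptions hold.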

## VI. CONCLUSIONS

In this work, a method combining adaptive precision and fine-grained power gating strategies is proposed for smarter ADC design. According to the evaluation results of two CIS-applied ADC designs, almost half of the ADCs' power consumption can be saved for low-precision conversion, while only a small amount of extra control circuitry is required.

The method in this paper can work together with downstream NNs of varying precision for efficient multi-task analysis. Since such an integrated system is promising for more intelligent edge computing, further work on the co-design of ADCs and NNs remains to be done.
