# A 12 pJ/Pixel Analog-to-Information Converter Based 816 × 640 Pixel CMOS Image Sensor

Denis Guangyin Chen, Member, IEEE, Fang Tang, Member, IEEE, Man-Kay Law, Member, IEEE, and Amine Bermak, Fellow, IEEE

Abstract—Analog-to-information converters (AICs) take advantage of the limited information bandwidth in high-frequency signals to improve the energy efficiency of front-end data converters. High-resolution image sensors often convey limited information due to the spatial redundancy between neighboring pixels. This paper proposes a mixed-signal AIC which compresses each nonoverlapping  $4 \times 4$  pixel block in a  $816 \times 640$  pixel prototype active-pixel sensor (APS) imager. It combines an energy-efficient charge-pump bit-image processor (BIP) with an area-efficient successive-approximation-register-single-slope (SAR-SS) hybrid analog-to-digital converter (ADC) via a charge-transfer-amplifier (CTA). The AIC is fully dynamic and consumes no static power. The ADC's capacitor array doubles as a computational device for parts of the compression algorithm which reduces its sampling rate by a factor of four. The compressed data contains direct edge information and can be decoded by a very simple receiver. The fabricated prototype consumes 12 pJ per pixel at 111 fps in the image compression mode and 48 pJ per pixel at 28.7 fps in raw data mode (9 b per pixel) under the same clock rate. To the best of our knowledge, this is the most energy-efficient compressive CMOS image sensor ever reported in the literature, thanks to the proposed AIC.

Index Terms—Analog-to-information converter (AIC), charge transfer amplifier, CMOS image sensor, compressive sensing, image compression, SAR ADC.

# I. INTRODUCTION

EMERGING imaging applications such as wireless multimedia sensor networks (WMSNs) [1], [2], retinal prosthesis [3], [4], and disposable sensors [5], [6] are presenting new demands on CMOS image sensor (CIS) design in terms of power consumption, bandwidth, memory size, cost, and processing capability [7]. This is especially acute in modern high-resolution sensors. The bandwidth problem can be partly solved by adding a dedicated digital signal processor (DSP) to compress the visual data, but this approach is computationally

Manuscript received July 20, 2013; revised November 29, 2013; accepted February 02, 2014. Date of publication March 10, 2014; date of current version April 21, 2014. This paper was approved by Associate Editor Boris Murmann. This work was supported by the Hong Kong Research Grant Council under Grant 610412.

D. G. Chen and A. Bermak are with the Smart Sensory Integrated Systems (S2IS) Lab, Electronic and Computer Engineering department, Hong Kong University of Science and Technology, Hong Kong (e-mail: denischn@gmail.com; eebermak@ust.hk).

F. Tang is with the College of Communication Engineering, Chongqing University (CQU), Chongqing 400050, China (e-mail: frankfangtang@gmail.com).

M.-K. Law is with the State Key Laboratory of Analog and Mixed-Signal VLSI, Macau University, Macau (e-mail: mklaw@umac.mo).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2014.2307063



Fig. 1. Information bandwidth for different signal pipeline designs. (a) Compress after storage. (b) Compress before storage. (c) Compress before ADC.

intense and therefore introduces significant power, area, and thermal overhead [8]-[10]. State-of-the-art DSPs using commercial algorithms such as the industry standard JPEG codecs [11] based on discrete cosine transform (DCT) or discrete wavelet transform (DWT) can have power consumptions of the same order as the image sensor itself [12]-[14]. The inefficiency of this conventional approach is largely due to the way data are compressed after quantization and storage [Fig. 1(a)]. The image data are sampled and written to memory at the Nyquist rate; therefore all circuit blocks, digital and analog, must operate at the full rate with high power consumption up to the DSP where data compression occurs. This architecture has been ubiquitously favored in the past because it allows for the greatest level of modularity [15], [16].

Some recent efforts have investigated the possibility of compressing the image before it is stored [Fig. 1(b)], thereby reducing the memory size. While the image is still sampled at full rate, the quantizer only presents compressed representations of the data to the memory module, so the raw data are never stored digitally. In practice, this architecture is limited to linear prediction algorithms [17]–[20] and applications with temporal redundancy [21]–[23]. The analog-to-digital converter (ADC) needs to either be arbitrated faster than the Nyquist rate or be implemented in a pixel parallel fashion leading to low spatial resolution.

The most aggressive approach to designing an efficient image sensor is to integrate image compression at or before the point of quantization [Fig. 1(c)], so the sensor output can be sampled at sub-Nyquist rates. Earlier purely analog attempts [24], [25] required either special devices [26] or large circuits [27], which lead to high cost and low spatial resolution. They are also difficult to scale in advanced CMOS technologies. Consequently, more recent implementations have favored integrating image compression with the ADC, which resembles an analog-to-information converter (AIC). An AIC enables a high-frequency signal to be sampled at sub-Nyquist rates [28] when the intrinsic information rate is lower than the signal bandwidth. A number of AICs have been demonstrated in biomedical, radio-frequency, and signal processing applications [29]–[31]. In image sensors, the AIC concept can be realized by algorithmic $\Delta\Sigma$ modulators performing analog matrix operations [32]. Systems based on the Haar wavelet [32] and compressive sensing [33]–[35] have been demonstrated with good compression ratio and peak signal-to-noise ratio (PSNR). However, these transform-domain methods require complex decoders, especially for compressive sensing. This additional power consumption in the receiver is an important consideration for WMSN and biomedical implant applications.

In this paper, a column-parallel AIC based on visual pattern image coding (VPIC) [36] is proposed for the first time [37]. When compared with DCT, DWT, and vector quantization (VQ) algorithm classes, VPIC has the advantage of not requiring any matrix operations. This is important when considering mixed-signal implementations and receiver design. For example, the VPIC decoder can be implemented by a simple  $3 \times 3$ averaging filter. The cost of this simplicity comes in VPIC's higher bit rate (1 bit-per-pixel) and limited image quality [38], but these drawbacks may not matter in applications intended for image interpretation. Unlike transform-domain algorithms such as DCT, DWT, and VQ, where spatial information is abstracted into transform-domain basis, VPIC output contains direct edge and intensity information of the image. In this sense, VPIC can be thought of as an intermediate processing step between a raster image and its symbolic interpretation [39].

The proposed AIC circuit is prototyped in a high-resolution image sensor fabricated using a cost-competitive standard CMOS process. It has a new type of charge-transfer-amplifier (CTA) which integrates a charge-pump bit-image processor (BIP) with a successive-approximation-register-single-slope (SAR-SS) hybrid ADC, making the prototype sensor fully dynamic. The energy efficient charge-pump BIP is made robust against parasitic capacitance by the proposed circuit techniques. The SAR-SS ADC's capacitor array is exploited to provide additional functionalities both as a load for the CTA and as a computational device to perform parts of the compression algorithm. The proposed algorithm reduces the ADC sampling rate by a factor of four, leading to an approximately fourfold improvement in both energy consumption per pixel and pixel throughput. These figures are among the best in the literature,



Fig. 2. Components used in the image compression algorithm.



Fig. 3. Block diagram of the image sensor chip.

and the measured PSNR is comparable to designs based on transform-domain methods [38].

The remainder of this paper is structured as follows. The revised VPIC algorithm and the overall image sensor architecture are introduced in Section II. The circuit implementation of the key building blocks in the AIC is elaborated on in Section III. Measurement results are discussed in Section IV before making final remarks in Section V.

## II. ALGORITHM AND SYSTEM

## A. Revised Visual Pattern Image Coding

The proposed algorithm is a circuit-friendly implementation of the VPIC [36]. Modifications are made to the original VPIC algorithm to facilitate ADC integration. A variant of the proposed algorithm is presented in [38]. It had good image quality (approximately 30 dB PSNR) at very low computational complexity. The basic components of this algorithm are illustrated in Fig. 2. A version suitable for active-pixel sensor (APS) implementation is described below.



Fig. 4. Column circuit: from pixel to AIC.

For an  $M \times N$  pixel image, where a pixel at position (m, n) has value P(m, n), the following quadrant values are computed for  $4 \times 4$  nonoverlapping blocks:

$$L1 = \frac{1}{4} \sum_{i=1}^{2} \sum_{j=1}^{2} P(4k+i, 4l+j) \tag{1}$$

$$L0 = \frac{1}{4} \sum_{i=3}^{4} \sum_{j=1}^{2} P(4k+i, 4l+j) \tag{2}$$

$$R1 = \frac{1}{4} \sum_{i=1}^{2} \sum_{j=3}^{4} P(4k+i, 4l+j) \tag{3}$$

$$R0 = \frac{1}{4} \sum_{i=3}^{4} \sum_{j=3}^{4} P(4k+i, 4l+j). \tag{4}$$

A larger block size would yield greater improvements in circuit power consumption and speed because the compression overhead will be shared between more pixels, but the block size of  $4 \times 4$  has been shown to yield the optimal tradeoff between compression ratio and image quality [36]. The quadrant values L1, L0, R1, and R0 from (1)–(4) are well suited for mixed-signal implementation because they can be easily computed via capacitor charge-sharing.

The mean u and gradient G of this $4 \times 4$ pixel block (illustrated in Fig. 2) are defined as

$$u(4k, 4l) = \frac{1}{4} \times (L1 + L0 + R1 + R0) \tag{5}$$

$$G(4k, 4l) = \frac{1}{4} \times (|L0 - R1| + |L1 - R0|) \tag{6}$$

and the bit-image  $\boldsymbol{B}$  is quantized by comparing each pixel to the mean

$$B(4k+i,4l+j) = (P(4k+i,4l+j) > u), \quad i \in \{1,2,3,4\},\ j \in \{1,2,3,4\}. \tag{7}$$

If the gradient G is less than a certain threshold, this $4 \times 4$ pixel block is regarded as a uniform pattern (UP) and only the mean u is sent in $n_{uu}$ bit resolution; otherwise, it is an edge block (EP) and the mean u (in $n_{ue}$ bit resolution), the gradient G (in $n_g$ bit resolution), and the bit-image B are all sent. This compressed representation contains both edge information and decimated image intensity information. The edge orientation is readily obtained from B and its steepness is given by G. The block mean u can be passed through further levels of image processing or compression. The data rate is already decimated by a factor of 16 at this point, so subsequent digital processors can operate with greatly reduced throughput and power consumption. This inherent feature extraction property of VPIC resembles the first layer of a hierarchical neural network [39]. It can be very useful as a starting point for future image interpretation systems. The percentage of EP blocks in a natural image is found to be typically around 10%. The threshold value for G is tuned to satisfy this heuristic observation [38].

if $G <$ threshold then

UP, send: u in $n_{uu}$ bit resolution.

else

EP, send: u in $n_{ue}$ bit resolution, G in $n_g$ bit resolution, and B in $4 \times 4$ bits.

end if

During image capture, only four quadrant values (L1, L0, R1, and R0) are quantized by the ADC as opposed to 16 distinctive pixel values.
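As an illustration only (a software sketch, not the mixed-signal implementation described later), the encoding steps in (1)–(7) can be written in a few lines of Python. The function name `encode_block`, the dictionary output format, and the convention that the first index of P is the row are assumptions made for this example; quantization of u and G to $n_{uu}$, $n_{ue}$, and $n_g$ bits is omitted.

```python
def encode_block(block, threshold):
    """Encode one 4x4 block (list of 4 rows of 4 pixel values) following (1)-(7).

    Hypothetical helper for illustration; bit-width quantization is not modeled.
    """
    def quad_mean(rows, cols):
        # Average of the four pixels in one 2x2 quadrant.
        return sum(block[i][j] for i in rows for j in cols) / 4.0

    top, bottom = (0, 1), (2, 3)   # i = 1..2 and i = 3..4 in the paper's indexing
    left, right = (0, 1), (2, 3)   # j = 1..2 and j = 3..4

    L1 = quad_mean(top, left)      # (1)
    L0 = quad_mean(bottom, left)   # (2)
    R1 = quad_mean(top, right)     # (3)
    R0 = quad_mean(bottom, right)  # (4)

    u = (L1 + L0 + R1 + R0) / 4.0             # block mean, (5)
    G = (abs(L0 - R1) + abs(L1 - R0)) / 4.0   # block gradient, (6)

    if G < threshold:
        return {"type": "UP", "u": u}          # uniform pattern: mean only
    B = [[1 if p > u else 0 for p in row] for row in block]  # bit-image, (7)
    return {"type": "EP", "u": u, "G": G, "B": B}
```

For a block whose left half is 0 and right half is 100, this sketch yields u = 50, G = 50, and a bit-image with ones in the two right-hand columns.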

At the receiver, a UP block is simply treated as a uniform image. An EP block is reconstructed by superimposing the product of G and B on the $4 \times 4$ mean block. The reconstructed image is then passed through a $3 \times 3$ low-pass filter to remove blocking artifacts.

if UP then

$$I(4k+i,4l+j) = u(4k,4l), \quad i \in \{1,2,3,4\},\ j \in \{1,2,3,4\}$$

else

$$I(4k+i,4l+j) = u(4k,4l) + B(4k+i,4l+j) \times G(4k,4l), \quad i \in \{1,2,3,4\},\ j \in \{1,2,3,4\}$$

end if
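The receiver-side reconstruction can be sketched in the same way. The code below applies the two reconstruction rules literally (with B taking values 0 or 1) and then a $3 \times 3$ averaging filter. The names `decode_blocks` and `box_filter_3x3`, the block dictionary format, and the border handling (averaging only over in-bounds neighbors) are assumptions for illustration and are not specified in the text.

```python
def decode_blocks(blocks, rows, cols):
    """Rebuild a rows x cols image from a 2-D grid of block records.

    blocks[k][l] is a dict with keys "type" and "u", plus "G" and "B" for EP blocks
    (hypothetical format chosen for this sketch).
    """
    img = [[0.0] * cols for _ in range(rows)]
    for k, block_row in enumerate(blocks):
        for l, blk in enumerate(block_row):
            for i in range(4):
                for j in range(4):
                    val = blk["u"]                        # UP rule: I = u
                    if blk["type"] == "EP":
                        val += blk["B"][i][j] * blk["G"]  # EP rule: I = u + B * G
                    img[4 * k + i][4 * l + j] = val
    return img


def box_filter_3x3(img):
    """3x3 averaging low-pass filter; border pixels average their in-bounds neighbors."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc, n = 0.0, 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += img[rr][cc]
                        n += 1
            out[r][c] = acc / n
    return out
```

Neither step needs a matrix transform, which is the property the text highlights when comparing VPIC with transform-domain decoders.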

## B. Image Sensor System

An $816 \times 640$ pixel CMOS image sensor (CIS) is designed to demonstrate the idea of an image compression AIC. Its block diagram is shown in Fig. 3. Each AIC can simultaneously process four pixel columns, and it is multiplexed across two neighboring groups of four pixel columns ($COL\_MUX$ in Fig. 4). This avoids an overly large aspect ratio in the circuit layout; each AIC can be pitched to the width of eight pixels. The data output from each column is bit-serial. Once a bit becomes available, the column outputs are multiplexed onto the 16-bit output data bus (16 columns at a time) by $COL\_INDEX[2:0]$ before processing the next bit.

## III. CIRCUIT IMPLEMENTATION

## A. Pixel and Column Circuit Interface

The entire column circuit, originating from the pixel, is depicted in Fig. 4. The 3T pixel structure is chosen for compactness. An n-diffusion-p-substrate diode under 1.8-V supply is used as the sensing device. The bit-image processor (BIP) in the AIC requires the offset error between pixel inputs to be small, so a charge-transfer amplifier (CTA) is used to implement correlated double sampling (CDS). A by-product of configuring the CTA in the proposed manner is that the ADC sample also enjoys CDS. Furthermore, the reset switch COL\_CL is synchronized with the CTA, so the pixel source-followers do not require any static current sources and thereby consume no static power. In practice, the pixel has a usable integration range of approximately 0.7 V under a 1.8 V supply due to the threshold voltage $(V_{\rm th})$ drop across the in-pixel reset and source follower transistors. In simulation, the dynamic sampling method introduces less than 1.7 mV of nonlinearity error over this integration range.

The SAR ADC is combined with a single-slope reference to implement a compact hybrid SAR-SS ADC. The SAR-SS ADC does not quantize individual pixels when it is operated in compression mode. Instead, it quantizes the quadrant values L1, L0, R1, and R0 (1)–(4) two at a time after sampling each two rows (2 × 4) of pixels. All four quadrant values from the 4 × 4 pixel block are quantized by the ADC before the BIP starts to quantize the bit-image B from (7). The ADC and BIP data output is multiplexed by the  $BIT\_MUX$  switch. The mean u and gradient G are computed off-chip using the quadrant values and (5)–(6).

## B. Charge Transfer Amplifier

The CTA in Fig. 5(a) is an amplifier with no static bias current [40]. Its main advantage when compared with a conventional



Fig. 5. Operation of the (a) CTA schematic during: (b) reset, (c) first sample  $(V_1)$  of the pixel value, and (d) second sample  $(V_2)$  of the pixel reset and amplifying their difference.

source-follower circuit is its energy efficiency due to on-demand current consumption which is especially useful for low-power applications with variable frame-rate. Its gain is defined by the ratio A between the source capacitor  $(C_S)$  and the output load capacitor  $(C_L)$ . Consequently, the main disadvantage of the CTA is that it needs a very large source capacitor to achieve reasonable gain. The proposed CTA configuration in Fig. 5 exploits the fact that a large capacitor already exists in the SAR ADC. It stacks the BIP capacitor on top of the SAR ADC capacitor and charges both through the same current path. This sharing of the



Fig. 6. Timing control of the AIC in compression mode.

CTA between the BIP and the ADC leads to significant power savings.

In the reset phase, the pixel output is cleared by the COL\_CL and CTA\_SINK switches [Fig. 5(b)], and the load capacitors selected by $B\_SEL[x]$ are reset by $B\_RST$. During the first sampling phase [Fig. 5(c)], the pixel voltage $V_1$ is sampled onto the sampling capacitor $C_D$. The proposed CTA differs from the conventional design by allowing the CTA transistor ($M_{\rm CTA}$) to be configured in diode connection using the CTA\_RST switch and biasing it with current $i_{\rm CTA}$. The gate voltage $V_{\rm gs}$ is sampled onto the bootstrap capacitor $C_B$; the CTA speed in subsequent phases is set by the size of $i_{\rm CTA}$. A large $i_{\rm CTA}$ will accommodate a large $C_L$. The SAR ADC's capacitor array is reset by the CTA\_RST switch. Since CTA\_RST can be operated independently of the sampling process of $V_1$, it only needs to turn on long enough for $V_{\rm gs}$ to settle on $C_B$. To save power, its pulse is very short (2% of the total time taken to sample and quantize each 4 × 4 pixel block), as shown in Fig. 6. Therefore, the power consumption due to $i_{\rm CTA}$ scales with frame rate and the CTA can be considered dynamic. In the second sampling phase [Fig. 5(d)], the pixel is reset, and the difference between the reset voltage $V_2$ and $V_1$ is sampled onto the SAR capacitor at CTout. The SAR capacitor is charged by $C_L$, so the voltage at colSamp is amplified by the capacitor ratio $C_S/C_L \approx 16C/4C$. In post-layout simulation, this gain is reduced from 4.07 to 3.23, implying a 26% increase in $C_L$. The value of $C_L$ in a group of capacitor cells sharing the same CTA (Fig. 7) is well matched because they share the same parasitic capacitance. Variations between CTAs can be minimized by matching their layout.
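The arithmetic behind the quoted gain figures can be checked with a short calculation. The sketch below assumes that the gain is exactly the ratio $C_S/C_L$ and that only $C_L$ changes between schematic and post-layout simulation (for example through added parasitics); both function names are hypothetical.

```python
def cta_gain(c_s_units, c_l_units):
    """Ideal CTA gain as the capacitor ratio C_S / C_L (unit capacitors)."""
    return c_s_units / c_l_units


def implied_load_increase(gain_before, gain_after):
    """Fractional growth of C_L implied by a gain drop, assuming C_S is unchanged.

    With gain proportional to 1 / C_L, C_L_after / C_L_before = gain_before / gain_after.
    """
    return gain_before / gain_after - 1.0


ideal = cta_gain(16, 4)                       # nominal 16C / 4C ratio
growth = implied_load_increase(4.07, 3.23)    # gain figures quoted in the text
print(f"ideal gain = {ideal:.2f}, implied C_L increase = {100 * growth:.0f}%")
```

Under these assumptions the drop from 4.07 to 3.23 corresponds to a load capacitance about 1.26 times its original value, in line with the 26% figure stated above.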



Fig. 7. Schematic of the AIC: bit-image processor (BIP) and reconfigurable SAR-SS ADC.

#### C. Analog-to-Information Converter

The complete AIC is shown in Fig. 7. The SAR-SS ADC in the left half of Fig. 7 can be configured to either quantize one 9-bit sample (in raw data mode) or two 8-bit samples (in compression mode) using the same capacitor array. The right half of Fig. 7 is a capacitive charge-pump circuit which quantizes the bit-image, B.

In the raw data mode, the SAR\_U switch in Fig. 7 is closed and the pixels are multiplexed by the SAR\_MUX[3:0] switches. The SAR-SS ADC quantizes each pixel to 9 b resolution. In the compression mode, the SAR\_U switch is opened and all SAR\_MUX[3:0] switches are closed. Each $2 \times 4$ pixel block is sampled onto the SAR-SS ADC in a row-by-row fashion. One half of the capacitors is connected during each row readout to sample the two rows onto independent capacitors. After the two rows are sampled, each quadrant array will hold four pixel values. When the capacitor bottom-plates in a quadrant are connected together, the four pixel values are averaged into a single quadrant mean by charge sharing. This quadrant mean is then quantized to 8 bits by the SAR-SS ADC. After both quadrant means from the $2 \times 4$ pixel block are quantized, the next $2 \times 4$ pixel block is sampled and processed likewise to complete the $4 \times 4$ pixel block. This produces the L1, L0, R1, and R0 quadrant values. The mean u and gradient G are computed in a field-programmable gate array (FPGA) during readout. The control sequence of the AIC in compression mode is given in Fig. 6.
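The averaging-by-charge-sharing step can be illustrated with an idealized charge-conservation model. This is a sketch under simplifying assumptions (ideal switches, no parasitic capacitance, capacitances in arbitrary units), and `charge_share` is a hypothetical helper rather than a model of the actual array.

```python
def charge_share(voltages, capacitances):
    """Voltage after ideal capacitors holding `voltages` are connected together.

    Total charge sum(C_i * V_i) is conserved and redistributes over sum(C_i).
    """
    total_charge = sum(v * c for v, c in zip(voltages, capacitances))
    return total_charge / sum(capacitances)


pixels = [0.10, 0.20, 0.30, 0.40]               # four sampled pixel voltages (V)
equal = charge_share(pixels, [1, 1, 1, 1])      # matched unit capacitors
skewed = charge_share(pixels, [1, 1, 1, 1.02])  # one capacitor 2% larger
print(equal, skewed)
```

With matched capacitors the result is the arithmetic mean required by (1)–(4); with mismatch it becomes a capacitance-weighted mean, which is one reason capacitor matching matters for this scheme.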

The bit-image B is quantized in the BIP's 16 capacitor cells (Fig. 7). They are sampled by the $B\_S[x]$ switch one row (four pixels) at a time, concurrently with the SAR-SS ADC [Fig. 8(a)]. Once the $4\times 4$ pixel block is sampled, the capacitor cells are connected in a loop via $B\_M$ to calculate the block mean, as shown in Fig. 8(b). The advantage of connecting the 16 capacitor cells in a loop as opposed to a common bus is that the parasitics are evenly distributed across the cells and the charge-sharing error caused by stray capacitance will be minimized. To obtain the bit-image, each cell is connected to the comparator in turn by the $B\_SEL[x]$ switch, and the difference between the stored pixel value and the block mean is charge-pumped onto the comparator [Fig. 8(c)]. This differential readout ensures the circuit's robustness against noise and parasitics.

#### D. Hybrid SAR-SS ADC

A standard 8-bit SAR ADC needs 256 unit capacitors in its digital-to-analog converter (DAC). The resulting circuit area would make the AIC impractically large. A SAR-single-slope (SS) hybrid ADC architecture is proposed in this section to address this issue. It resolves the MSBs in a small SAR ADC section and quantizes the remaining LSBs using a single-slope (SS) reference. Not only do the SAR MSB capacitors shorten the conversion time (5+8 clocks as opposed to 256 clocks) when compared to a pure SS ADC, they also, as seen earlier, serve as a computational device for calculating the $2\times2$ pixel quadrant value. The SAR ADC's simplicity, robustness, and inherent



Fig. 8. (a) Sample the pixel value. (b) Calculate the 4 × 4 block mean by charge-sharing. (c) Quantize the pixels to 1 bit against the block mean.

capacitor array make it a natural fit for the proposed image compression design. To balance robustness against compact circuit area, the SS LSB ADC is used to scale the capacitor array down to a reasonable size. The SS ADC uses a global reference, so it introduces less overhead than a split-capacitor LSB DAC. The operation of this SAR-SS ADC is illustrated in Fig. 9. During sampling, $C_{SS}$ in Fig. 7 is reset to $V_{\rm sup}$. It retains this voltage during the SAR phase. Once the SAR bits are determined in the conventional manner, the SS reference generator slowly discharges the REF\_OUT node of $C_{\rm SS}$ towards ground while the counter value ${\rm CTR}_{[2:0]}$ is decremented from 7 down to 0 over eight clock cycles. This discharge range is designed to match the step size of the SAR section. The kick-back compensation capacitor $C_K$ is sized to be the same as $C_{\rm SS}$ to minimize the differential component of the comparator's kick-back noise. The DAC output $V_{DAC}$ in Fig. 7 always settles to a value lower than $V_{\mathrm{sup}}$ after the last SAR bit, so a bit decision can be made by the comparator on each falling clock edge to determine whether REF\_OUT has crossed this DAC residual voltage (see $T_{\rm cross}$ in Fig. 9). The SS registers are continuously refreshed with the new counter value until this occurs. At the end of the eight clock cycles, each SS register will hold a 3-bit timer value corresponding to the quantized residual voltage. The hybrid SAR-SS ADC output can be calculated by

$$D = D_{\text{SAR}} \times 2^{b_{\text{SS}}} + \sum_{i=0}^{b_{\text{SS}}-1} 2^{i} S_{i} \tag{8}$$

where  $b_{\rm SS}=3$  in this implementation. One single-slope reference generator is shared across all AICs.
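Equation (8) simply concatenates the SAR code with the SS bits. A minimal sketch follows; the function name `sar_ss_code` and the LSB-first ordering of the SS bits are choices made for this example.

```python
def sar_ss_code(d_sar, ss_bits):
    """Combine the SAR result with the single-slope bits as in (8).

    d_sar   : integer code resolved by the SAR section (the MSBs)
    ss_bits : tuple (S0, S1, ..., S_{b_SS-1}), least significant bit first
    """
    b_ss = len(ss_bits)
    return d_sar * 2 ** b_ss + sum(s << i for i, s in enumerate(ss_bits))


# 5-bit SAR code 0b10101 (21) with SS bits S2 S1 S0 = 1 0 1 (5): 21 * 8 + 5
print(sar_ss_code(0b10101, (1, 0, 1)))
```

With $b_{\rm SS}=3$, a 5-bit SAR section spans codes 0 to 255 and a 6-bit SAR section spans codes 0 to 511, matching the 8-bit and 9-bit resolutions quoted for the two operating modes.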

The SS ADC bits $S_{[2:0]}$ would normally be read out in parallel at the end of the conversion; however, for circuit simplicity, it is preferable to output the counter's content in a bit-serial fashion consistent with the SAR part. After examining the sequence of timer values, it is observed that $S_2$ cannot have any value other than "0" after the fourth clock cycle. After the sixth clock cycle, $S_1$ can only be "0"; similarly, after the seventh clock cycle, $S_0$ can only be "0". In other words, if the registers are reset to "0" at the beginning, the last clock cycle in which a "1" can be written to the $S_2$ register is the fourth clock cycle. Similarly, $S_1$ and $S_0$ cannot change to "1" after the sixth



Fig. 9. Timing diagram of the SAR-SS controller.

and seventh clock cycles, respectively. A multiplexer can take advantage of this and serially read out the valid $S_{[2:0]}$ bits while the counter is still counting, as shown in Fig. 9.
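The observation about when each SS bit becomes final can be reproduced by enumerating the counter sequence. The sketch assumes that the counter holds 7 during the first clock cycle and decrements by one per cycle down to 0 in the eighth; `last_cycle_with_one` is a hypothetical helper.

```python
def last_cycle_with_one(n_bits=3):
    """For a down-counter running 2**n_bits - 1 ... 0, return for each bit
    position the last clock cycle (1-based) in which that bit equals 1."""
    n_cycles = 2 ** n_bits
    last = {}
    for cycle in range(1, n_cycles + 1):
        count = n_cycles - cycle          # 7, 6, ..., 0 for n_bits = 3
        for bit in range(n_bits):
            if (count >> bit) & 1:
                last[bit] = cycle         # later cycles overwrite earlier ones
    return last


print(last_cycle_with_one())  # keys are bit positions: 2 -> S2, 1 -> S1, 0 -> S0
```

Under this counting assumption the result is cycle 4 for $S_2$, cycle 6 for $S_1$, and cycle 7 for $S_0$, so each bit can be shifted out as soon as its last possible update has passed.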

#### E. Dynamic Comparator

The two comparators in Fig. 7 make a significant contribution to the AIC's power consumption. Furthermore, offset errors in these comparators can affect the image quality. The SAR-SS ADC's comparator can tolerate some offset mismatch if it is compensated by post-readout calibration. The BIP, however, does not have this luxury because it only generates one bit of information for each pixel. Its comparator must therefore have minimal offset error during quantization. The energy efficient dynamic comparator from [41] is used in the SAR-SS ADC. The calibrated dynamic comparator from [42] with an offset error of 0.86 mVrms is used in the BIP. This calibrated comparator is



Fig. 10. Schematic of the dynamic comparator from [42] and its calibration circuit.



Fig. 11. Schematic of the programmable SS ADC reference current source.

shown in Fig. 10. The SAR-SS comparator is identical to the one in Fig. 10 but without the $AD1\langle x \rangle$ and $AD2\langle x \rangle$ calibration cells.

## F. SS ADC Current Source

The single-slope reference is generated by discharging a big capacitor (aggregated from a large number of small capacitors from each column) with the current source in Fig. 11. Since the SS LSB ADC quantizes the 3 LSBs, the SAR MSB ADC in 5-bit or 6-bit mode has a step size of 56.3 or 28.1 mV, respectively. This step is the output range (REF\_OUT) that the SS reference must span. Because this range is small, the current variation induced by the change in drain-to-source voltage, $V_{\rm ds}$, at REF\_OUT is negligible, and a simple single-NMOS current source suffices. The current value is set by a binary-weighted register, D[5:0], which is configured to compensate for different ADC clock speeds. A redundant 2W weight is included to avoid missing codes in the programmable current output. All weights are referenced from a single poly-resistor-based current reference.
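The quoted step sizes are consistent with a SAR reference equal to the 1.8 V supply divided by $2^5$ or $2^6$. The short check below makes that assumption explicit (the text identifies $V_{\rm sup}$ as the SAR reference voltage but does not restate its value here); `sar_step_mv` is a hypothetical helper.

```python
def sar_step_mv(v_ref, sar_bits):
    """Step size of an ideal binary SAR section, in millivolts."""
    return 1e3 * v_ref / 2 ** sar_bits


V_REF = 1.8  # assumed SAR reference (V), taken to be the 1.8 V supply
for bits in (5, 6):
    step = sar_step_mv(V_REF, bits)
    ss_lsb = step / 2 ** 3        # three SS bits subdivide one SAR step
    print(f"{bits}-bit SAR: step = {step:.3f} mV, SS LSB = {ss_lsb:.3f} mV")
```

Under this assumption the steps are 56.25 mV and 28.125 mV, which round to the stated 56.3 and 28.1 mV; the per-LSB values printed by the sketch are derived here and are not quoted in the text.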

The SINK node is held at $V_{\rm sup}$ (the SAR reference voltage) until RAMP is asserted during the SS quantization phase, upon which the SAR reference node (REF\_OUT) is slowly discharged towards ground. Holding SINK at $V_{\rm sup}$ in this way helps to minimize the charge redistribution error between REF\_OUT and SINK during the turn-on of RAMP.

#### G. Layout and Floor-Planning Considerations

The proposed AIC column circuit measures 642 $\mu$m × 14.8 $\mu$m. Half of this area is occupied by the BIP. Its size is constrained by the 16 capacitor cells used to store and quantize the bit-image. The MIM capacitors are placed on top of active circuits to save circuit area. The unit capacitor used in both the BIP and the ADC is a $5 \mu m \times 5 \mu m$ (minimum design rule) MIM capacitor with a typical capacitance of 28 fF. A similar 20-fF MIM capacitor in TSMC 0.18 $\mu$m technology has approximately 1% mismatch standard deviation [43]. A smaller column circuit means more of them can be placed in parallel to achieve higher frame-rates. The BIP is sensitive to parasitic capacitance because of its floating capacitors. The SAR-SS ADC, on the other hand, is less sensitive to MIM capacitor layout parasitics because the bottom-plates of its capacitor array are driven by active signals.

## IV. MEASUREMENT RESULTS

The prototype image sensor with image compression AIC is fabricated in Global Foundry 0.18  $\mu$ m mixed-signal CMOS technology. Its microphotograph is shown in Fig. 12, and its specifications are summarized in Table I. The image data is



Fig. 12. Prototype image sensor with image compression AIC.

TABLE I
SUMMARY OF THE PROTOTYPE IMAGE CHIP WITH ADC INTEGRATED IMAGE COMPRESSION

|                  | Compressed                                   | Raw                  |
|------------------|----------------------------------------------|----------------------|
| Process          | 0.18 μm 1P6M Mixed-signal CMOS               |                      |
| Supply voltage   | 3.3 V, 1.8 V                                 |                      |
| Chip clock       | 4 MHz                                        |                      |
| AIC area         | $642~\mu\mathrm{m} \times 14.8~\mu\mathrm{m}$ |                      |
| Imager size      | 816 × 640                                    |                      |
| Frame rate       | 111 fps                                      | 28.7 fps             |
| Pixel size       | $1.85~\mu\mathrm{m} \times 1.85~\mu\mathrm{m}$ |                      |
| Fill factor      | 13 %                                         |                      |
| Dark current     | $<4307~e^{-}/{\rm sec}$                      |                      |
| Saturation level | 7718 e<sup>-</sup>                           |                      |
| Conv. gain       | $35.76~\mu\mathrm{V}/e^-$                    |                      |
| Sensitivity      | 309 e<sup>-</sup>/Lux.sec @ 1167 Lux         |                      |
| Dynamic range    | 46 dB                                        | 34 dB                |
| ADC resolution   | 8b                                           | 9b                   |
| Temporal noise   | 2 LSB<sub>rms</sub>                          | 4 LSB<sub>rms</sub>  |
| Standby          | $1.1~\mu\mathrm{W}$                          |                      |
| Power            | 0.69 mW                                      | 0.72 mW              |
| Energy           | 12 pJ/pixel                                  | 48 pJ/pixel          |
| Data rate        | 3 bpp (1 bpp after FPGA)                     | 9 bpp                |
| PSNR             | 20 dB                                        | 25 dB                |

captured by a field-programmable-gate-array (FPGA) chip. The reported maximum frame-rate is limited by the FPGA's clock speed and universal-serial-bus (USB) throughput capacity. The sensor's stand-by power consumption is  $1.1~\mu W$  when its frame-rate is zero. This confirms its fully dynamic operation.
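The energy-per-pixel entries in Table I follow from the measured power, the array size, and the frame rate. The sketch below reproduces that arithmetic; the function name is hypothetical, and the calculation assumes every pixel of the $816 \times 640$ array is read out in each frame.

```python
def energy_per_pixel_pj(power_w, cols, rows, fps):
    """Energy per pixel in picojoules: power / (pixels per frame * frames per second)."""
    return power_w / (cols * rows * fps) * 1e12


compressed = energy_per_pixel_pj(0.69e-3, 816, 640, 111)   # compression mode
raw = energy_per_pixel_pj(0.72e-3, 816, 640, 28.7)         # raw data mode
print(f"compressed: {compressed:.1f} pJ/pixel, raw: {raw:.1f} pJ/pixel")
print(f"frame-rate ratio: {111 / 28.7:.2f}")
```

The results round to the 12 and 48 pJ per pixel listed in Table I, and the frame-rate ratio of about 3.9 reflects the approximately fourfold throughput gain attributed to the reduced ADC sampling rate.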

The proposed SAR-SS ADC cannot be directly accessed at the column level without affecting the bit-image processor



Fig. 13. (a) DNL and (b) INL of the proposed SAR-SS ADC in 8-bit image compression mode.

(BIP) because the CTA is sensitive to the input capacitance of the SAR ADC. The measured differential non-linearity (DNL) and integral non-linearity (INL) presented in Fig. 13 are obtained from an independent test structure. The DNL and INL in compressive imaging mode (8-bit resolution) are -1.0/+9.9 and -9.8/+4.7 LSBs, respectively. The SS reference is able to resolve resolutions beyond what is available from the capacitive DAC alone. The large mid-input DNL spike in Fig. 13(a) is caused by mismatch in the MSB capacitor. When the MSB capacitor is bigger than the sum of the remaining capacitors, the DAC output becomes saturated until the input signal is large enough to cause the MSB capacitor to switch. This step change is evident in the INL error profile shown in Fig. 13(b). It can be addressed by adding redundancy and error-correction to the DAC [44]. Another factor contributing to the DNL is kick-back noise from the comparator being coupled into the SS reference capacitor during SS discharge. The ADC's ENOB is 6.3 bits when sampling an 11 Hz sinusoid at 13.2 kSa/s. The ENOB is also limited by the noise across the $C_K$ compensation capacitor.

The mean and gradient values in (5) and (6) are computed in an off-chip FPGA device. These computations, along with additional frame control logic, are expected to add a further 115 $\mu$W (at 111 fps) of power consumption when implemented in a 0.18-$\mu$m standard cell library. Their total area is 24571 $\mu$m<sup>2</sup>. If this is taken into account, the total power consumption in image compression mode will become 806 $\mu$W, out of which 30% (238 $\mu$W) is consumed in the ADC (predominantly) and pixel array, 27% (218 $\mu$W) in the CTA, 3% (24 $\mu$W) in the bit-image generator, and 40% in digital circuits. If the entire image compression is performed digitally, the additional





Fig. 14. Image captured from prototype chip (a) in compression mode and (b) in raw data mode. Column offset is calibrated off-chip.

digital circuit power consumption will be 198  $\mu$ W instead of 115  $\mu$ W. Furthermore, the ADC power consumption will have to quadruple, and a large memory array of 16  $\times$  8 b must store the pixel values in each column circuit. If a single transistor source-follower (SF) is used instead of the proposed CTA, the SF would need to consume 258  $\mu$ W of power in order to meet the slew-rate and settling requirements imposed by the ADC capacitor at the maximum frame-rate, and, even more importantly, the BIP will require additional buffer circuits which will have a power consumption on the same order as the SF.
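The power budget quoted above can be cross-checked with a few lines of arithmetic. This is a sketch that uses only the figures stated in the text; the absolute digital power is not given explicitly, so it is inferred here as the remainder of the 806 $\mu$W total (an assumption), and the dictionary keys are illustrative labels.

```python
# Figures quoted in the text, in microwatts (image compression mode, 111 fps).
total_uw = 806.0
blocks_uw = {
    "ADC and pixel array": 238.0,
    "CTA": 218.0,
    "bit-image generator": 24.0,
}
# Assumption: the digital share is whatever remains of the quoted total.
blocks_uw["digital circuits"] = total_uw - sum(blocks_uw.values())

# Percentage share of each block, rounded to the nearest percent.
shares = {name: round(100 * p / total_uw) for name, p in blocks_uw.items()}
for name, pct in shares.items():
    print(f"{name:22s} {blocks_uw[name]:6.0f} uW  {pct:3d}%")

# Extra digital power if the whole compression were done digitally (198 uW vs 115 uW).
extra_digital_uw = 198.0 - 115.0
print(f"additional digital power in an all-digital design: {extra_digital_uw:.0f} uW")
```

Under this assumption the rounded shares reproduce the 30%, 27%, 3%, and 40% split given in the text.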

Fig. 15. Image of boats parked by lagoon captured from prototype chip (a) in compression mode and (b) in raw data mode. Column offset is calibrated off-chip.

The test images in Figs. 14 and 15 are captured by illuminating a projector film in front of an integrating sphere. The light intensity, after attenuation by the film, is adjusted to be approximately 2000 lux. Images are captured in both compression mode and raw data mode. Off-chip column offset calibration is applied in both cases to remove the comparator offset in the SAR ADC. The compression mode has worse column fixed-pattern noise (FPN) artifacts due to gain errors in the CTA. The overall image quality is limited by noisy pixels with low SNR. A larger pixel area in a dedicated CIS process where a PIN diode is available would help to mitigate this problem. The degradation in PSNR from raw mode to compression mode is 5 dB. It is interesting to note that the pepper noise in Fig. 14(b) is absent in Fig. 14(a) due to the inherent statistical filtering of the compression process. The compression mode also enjoys a wider dynamic range (DR) because its higher available frame rate extends the DR under bright illumination. DR is defined as the ratio between the maximum and minimum rates of photodiode (PD) voltage drop, measured under bright and dark illumination, respectively:

$$DR = \frac{\left.\dfrac{\Delta V_{\rm PD}}{\Delta t}\right|_{\rm MAX}}{\left.\dfrac{\Delta V_{\rm PD}}{\Delta t}\right|_{\rm MIN}}. \tag{9}$$

In this implementation, the rolling shutter readout speed is approximately four times higher in compression mode. For a fixed available $\Delta V_{\rm PD}$, this shorter $\Delta t$ results in an increase in DR of 12 dB. The lower end of the reported DR in Table I is limited by the dark current in the photodiode. This is a device limitation of the standard process. A more expensive CIS process with a dedicated PIN diode can significantly remedy this situation [14]. On the other hand, certain cost-sensitive applications such as disposable sensors [5], [6] may not require such a high DR, so a standard process device may be acceptable. In Fig. 15, it can be seen that, while global DR across the image is good (the boats in the foreground versus the black mountains in the background), some details are lost in the local $4 \times 4$ image blocks due to compression and FPN.
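The 12 dB figure follows directly from (9). As a brief worked check, assume that DR is expressed in decibels as $20\log_{10}$ of the ratio in (9), and that the minimum (dark-limited) slope is the same in both modes; both are assumptions, and the subscripts "comp" and "raw" are labels introduced here for illustration. With the readout time shortened by a factor of four for the same $\Delta V_{\rm PD}$:

$$\frac{DR_{\rm comp}}{DR_{\rm raw}} = \frac{\Delta V_{\rm PD}/(\Delta t_{\rm raw}/4)}{\Delta V_{\rm PD}/\Delta t_{\rm raw}} = 4 \quad\Rightarrow\quad 20\log_{10} 4 \approx 12.04\ \text{dB}.$$

This is consistent with the DR entries of 34 dB and 46 dB listed for this work in Table II.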

Table II compares the proposed design against other image sensors in the literature with on-chip image compression. The scalability of the proposed image sensor is evident from its having the largest pixel array and the second highest pixel throughput rate (limited by the testing setup) among its peers. It also has the lowest energy consumption per pixel. This is the benefit of combining an information-efficient AIC architecture with energy-efficient circuit techniques. The AIC image sensor concept is therefore very suitable for ultra-low-power applications. The pixel fill factor reported in Table II is relatively low compared with other designs because the chosen pixel pitch is more than one order of magnitude smaller than in most of the earlier works. This is a compromise made in exchange for much higher image resolution.

TABLE II
COMPARISON OF CMOS IMAGE SENSORS WITH ON-CHIP IMAGE COMPRESSION

| Reference | [33] | [46] | [45] | [19] | [27] | [26] | This work (raw / compression) |
|---|---|---|---|---|---|---|---|
| Year | 2012 | 2007 | 2009 | 2011 | 2008 | 2006 | 2013 |
| Algorithm | CS | Lossless | Haar wavelet | QTD | SPIHT | DCT | VPIC |
| Architecture | Column level | Column level | Column level | Chip level | Pixel level | Pixel level | Column level |
| ADC | $\Delta\Sigma$ | Single slope | $\Delta\Sigma$ | Single slope | none | none | SAR |
| ADC resolution | 12b | 8b | 8b | 8b | - | - | 9b |
| Technology (µm) | 0.15 | 0.35 | 0.35 | 0.35 | 0.5 | 0.5 | 0.18 |
| Supply (V) | 3.3, 2.0, 1.8 | 3.3 | 3.3 | 3.3 | - | 3.3 | 3.3, 1.8 |
| Area (mm<sup>2</sup>) | 2.9×3.5 | 2.6×6.0 | 4.4×2.9 | 3.3×3.2 | 2.3×2.3 | 2.4×1.8 | 2.16×1.36 |
| Resolution | 256×256 | 80×44 | 128×128 | 64×64 | 33×25 | 104×128 | 816×640 |
| Pixel pitch (µm) | 5.5 | 32 | 15.4 | 39 | 69 | 13.5 | 1.85 |
| Fill factor (%) | - | 18 | 28 | 12 | 21 | 46 | 13 |
| Pixel circuit | 4T pinned PD | 8T | 7T | PWM DPS | Heterogeneous | Floating gate | 3T APS |
| DR (dB) | 78 | - | - | >100 | - | - | 34 / 46 |
| Frame rate (fps) | 120 / 1920 | 435 | 30 | - | 10000 | 25 | 28.7 / 111 |
| Throughput (Mp/s) | 7.9 / 125.8 | 1.5 | 0.5 | - | 8.3 | 0.3 | 15.0 / 58.0 |
| Power (mW) | 93.1 / 96.2 | 150 | 26.2 | 17 | 0.25 @ 30 fps | 2 | 0.72 / 0.69 |
| Energy (pJ/pixel) | 11838 / 765 | 21973 | 53304 | - | 10101 | 6010 | 48 / 12 |
| Compression ratio | 1 / 16 | <1.5 | 3.5 / 8 | 9.1 | 80 / 8 | 1.3 / 13.7 | 1 / 8 |
| PSNR (dB) | - / 32.5 | - | 32 / 15 | 23 | 24.5 / 40 | 47.1 / 25.7 | 25 / 20 |
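The throughput and energy-per-pixel entries for this work in Table II can be reproduced from the array size, frame rate, and power. The sketch below assumes that energy per pixel is simply total power divided by pixel throughput (an assumption, although it matches the tabulated values); the function names are illustrative.

```python
def throughput_mpixels(width, height, fps):
    """Pixel throughput in megapixels per second."""
    return width * height * fps / 1e6

def energy_per_pixel_pj(power_mw, width, height, fps):
    """Energy per pixel in picojoules, assuming energy = power / pixel rate."""
    pixels_per_second = width * height * fps
    return power_mw * 1e-3 / pixels_per_second * 1e12

# Values for this work taken from Table II (816 x 640 array).
raw_pj = energy_per_pixel_pj(0.72, 816, 640, 28.7)    # raw data mode
comp_pj = energy_per_pixel_pj(0.69, 816, 640, 111)    # image compression mode

print(f"raw:         {throughput_mpixels(816, 640, 28.7):5.1f} Mp/s, {raw_pj:5.1f} pJ/pixel")
print(f"compression: {throughput_mpixels(816, 640, 111):5.1f} Mp/s, {comp_pj:5.1f} pJ/pixel")
```

The two results round to 48 pJ and 12 pJ per pixel, which also shows the roughly fourfold energy saving of compression mode at nearly the same total power.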

Oike and Gamal's design in [33] uses first-order algorithmic $\Delta\Sigma$ ADCs similar to [45] to perform compressive sensing by block-wise random sampling. Its good image quality is attributable to three major factors: the availability of a high-performance PIN diode at the device level, a pixel that is nine times bigger, and better gain matching due to the sharing of the same $\Delta\Sigma$ ADC within a compression block. When compared with [33], the proposed design is more prone to column FPN between the CTAs, and it experiences more noise due to the smaller transistor sizes in the pixel and AIC, all in exchange for an energy consumption that is more than 18 dB lower. Furthermore, the proposed design uses a much simpler decoder, and the compressed data contains direct spatial information, which makes compressed-domain image processing possible. In general, the imager in [33] and the proposed sensor offer different tradeoffs and are suitable for different applications.
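The 18 dB figure can be checked against Table II. Assuming it compares the compressed-mode energy of [33] (765 pJ/pixel) with that of this work (12 pJ/pixel), and that the ratio is expressed as $10\log_{10}$ because it is an energy ratio (both are assumptions about how the figure was derived):

$$10\log_{10}\!\left(\frac{765\ \text{pJ/pixel}}{12\ \text{pJ/pixel}}\right) = 10\log_{10}(63.75) \approx 18.0\ \text{dB}.$$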

## V. CONCLUSION

This paper detailed a prototype $816 \times 640$ pixel AIC image sensor with real-time image compression. The proposed AIC utilizes a number of low-power circuit techniques such as fully dynamic CTA, SAR-SS ADC, and capacitive charge-pump to achieve best-in-class energy consumption figures. The measured results show that the proposed image compression technique can lead to a fourfold reduction in energy consumption per pixel and a fourfold improvement in frame rate at the same time, while the PSNR is only degraded by 5 dB. This makes the proposed design highly suitable for compact high-resolution imaging applications which require low power consumption and bandwidth.

#### REFERENCES

- [1] S. Misra *et al.*, "A survey of multimedia streaming in wireless sensor networks," *IEEE Commun. Surveys Tutorials*, vol. 10, no. 4, pp. 18–39, 2008.
- [2] I. Akyildiz et al., "Wireless multimedia sensor networks: Applications and testbeds," Proc. IEEE, vol. 96, no. 10, pp. 1588–1605, Oct. 2008.
- [3] J. Ohta, Smart CMOS Image Sensors and Applications, Ser. Optical Science and Engineering. Boca Raton, FL, USA: CRC, 2007.
- [4] J. Ohta *et al.*, "Implantable CMOS biomedical devices," *Sensors*, vol. 9, no. 11, pp. 9073–9093, Nov. 2009.
- [5] J. Burghartz et al., "CMOS imager technologies for biomedical applications," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2008, pp. 142–602.
- [6] K. Kagawa et al., "A 3.6 pW/frame·pixel 1.35 V PWM CMOS imager with dynamic pixel readout and no static bias current," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2008, pp. 54–595.
- [7] A. Olyaei and R. Genov, "Mixed-signal CMOS wavelet compression imager architecture," in *Proc. 48th Midwest Symp. Circuits Syst.*, Aug. 2005, vol. 2, pp. 1267–1270.
- [8] J. Karlsson, "Wireless video sensor network and its applications in digital zoo," Ph.D. dissertation, Dept. Appl. Phys. Electron., Umea Univ., Umea, Sweden, 2010.
- [9] T. Teixeira et al., "An address-event image sensor network," in Proc. IEEE Int. Symp. Circuits and Syst., May 2006, pp. 4467–4470.
- [10] Axis Communications, "AXIS 211W Network Camera," Sep. 27, 2010 [Online]. Available: http://www.axis.com/products/cam\_211w/
- [11] W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard, 1st ed. Norwell, MA, USA: Kluwer Academic, 1992.
- [12] J.-H. Woo et al., "A 152-mW mobile multimedia SoC with fully programmable 3-D graphics and MPEG4/H.264/JPEG," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 9, pp. 1260–1266, Mar. 2009.
- [13] H. Yamauchi *et al.*, "1440 × 1080 pixel, 30 frames per second motion-JPEG 2000 codec for HD-movie transmission," *IEEE J. Solid-State Circuits*, vol. 40, no. 1, pp. 331–341, Jan. 2005.
- [14] F. Tang et al., "Low-power CMOS image sensor based on column-parallel single-slope/SAR quantization scheme," *IEEE Trans. Electron Devices*, vol. 60, no. 8, pp. 2561–2566, Aug. 2013.

- [15] K.-Y. Min and J.-W. Chong, "A real-time JPEG encoder for 1.3 mega pixel CMOS image sensor SoC," in *Proc. 30th Annu. Conf. IEEE Ind. Electron. Soc.*, Nov. 2004, vol. 3, pp. 2633–2636.
- [16] Y. Nishikawa et al., "A high-speed CMOS image sensor with on-chip parallel image compression circuits," in Proc. IEEE Custom Integr. Circuits Conf., Sep. 2007, pp. 833–836.
- [17] M. Zhang and A. Bermak, "Compressive acquisition CMOS image sensor: From the algorithm to hardware implementation," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 18, no. 3, pp. 490–500, Mar. 2010.
- [18] D. Climie et al., "Frame based adaptation for CMOS imager," in Proc. 5th Int. Conf. Inf., Commun. Signal Process., Dec. 2005, pp. 301–304.
- [19] S. Chen et al., "A CMOS image sensor with on-chip image compression based on predictive boundary adaptation and memoryless QTD algorithm," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 4, pp. 538–547, Apr. 2011.
- [20] Y. G. Lee et al., "Low-complexity near-lossless image coder for efficient bus traffic in very large size multimedia SoC," in Proc. 16th IEEE Int. Conf. Image Process., Nov. 2009, pp. 2329–2332.
- [21] K. Aizawa et al., "On sensor image compression," IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 3, pp. 543–548, Jun. 1997.
- [22] M. Zhang and A. Bermak, "Quadrant-based online spatial and temporal compressive acquisition for CMOS image sensor," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 9, pp. 1525–1534, Jul. 2011.
- [23] C. Posch et al., "A QVGA 143 dB dynamic range asynchronous address-event PWM dynamic image sensor with lossless pixel-level video compression," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2010, pp. 400–403.
- [24] S. Kawahito et al., "A compressed digital output CMOS image sensor with analog 2-D DCT processors and ADC/quantizer," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 1997, vol. 453, pp. 184–185.
- [25] E. Artyomov and O. Yadid-Pecht, "Adaptive multiple-resolution CMOS active pixel sensor," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 53, no. 10, pp. 2178–2186, Oct. 2006.
- [26] A. Bandyopadhyay et al., "MATIA: A programmable 80 μW/frame CMOS block matrix transform imager architecture," *IEEE J. Solid-State Circuits*, vol. 41, no. 3, pp. 663–672, Mar. 2006.
- [27] Z. Lin et al., "A CMOS image sensor for multi-level focal plane image decomposition," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 55, no. 9, pp. 2561–2572, Oct. 2008.
- [28] D. Healy, "Analog-to-information (A-to-I) federal business opportunities," DARPA Broad Area Announcement DARPA BAA05-35, Jul. 2005.
- [29] R. Agarwal and S. Sonkusale, "Input-feature correlated asynchronous analog to information converter for ECG monitoring," *IEEE Trans. Biomed. Circuits Syst.*, vol. 5, no. 5, pp. 459–467, Apr. 2011.
- [30] R. Maleh et al., "Analog-to-information and the Nyquist folding receiver," *IEEE J. Emerging Select. Topics Circuits Syst.*, vol. 2, no. 3, pp. 564–578, Nov. 2012.
- [31] M. Trakimas *et al.*, "A compressed sensing analog-to-information converter with edge-triggered SAR ADC core," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 60, no. 4, pp. 1135–1148, Apr. 2013.
- [32] A. Olyaei and R. Genov, "Focal-plane spatially oversampling CMOS image compression sensor," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 1, pp. 26–34, Jan. 2007.
- [33] Y. Oike and A. Gamal, "A 256  $\times$  256 CMOS image sensor with  $\Delta\Sigma$ -based single-shot compressed sensing," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2012, pp. 386–388.
- [34] M. Dadkhah *et al.*, "Block-based compressive sensing in a CMOS image sensor," *IEEE Sensors J.*, vol. PP, no. 99, pp. 1–1, Sep. 2012.
- [35] M. Dadkhah et al., "Compressive sensing image sensors-hardware implementation," Sensors, Apr. 2013.
- [36] D. Chen and A. Bovik, "Visual pattern image coding," *IEEE Trans. Commun.*, vol. 38, no. 12, pp. 2137–2146, Dec. 1990.
- [37] Y.-C. Liu et al., "VLSI implementation of visual block pattern truncation coding," *IEEE Trans. Consumer Electron.*, vol. 44, no. 3, pp. 490–499, Aug. 1998.
- [38] D. G. Chen et al., "A low-complexity image compression algorithm for address-event representation (AER) PWM image sensors," in Proc. IEEE Int. Symp. Circuits Syst., May 2011, pp. 2825–2828.
- [39] S. Behnke, "Hierarchical neural networks for image interpretation," in Ser. Lecture Notes in Comput. Science. Springer, 2003, vol. 2766.

- [40] K. Kotani et al., "CMOS charge-transfer preamplifier for offset-fluctuation cancellation in low-power A/D converters," *IEEE J. Solid-State Circuits*, vol. 33, no. 5, pp. 762–769, May 1998.
- [41] M. van Elzakker *et al.*, "A 10-bit charge-redistribution ADC consuming 1.9 μW at 1 MS/s," *IEEE J. Solid-State Circuits*, vol. 45, no. 5, pp. 1007–1015, May 2010.
- [42] D. G. Chen and A. Bermak, "A low-power dynamic comparator with digital calibration for reduced offset mismatch," in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2012, pp. 2825–2828.
- [43] R. Xu et al., "Digitally calibrated 768-kS/s 10-b minimum-size SAR ADC array with dithering," *IEEE J. Solid-State Circuits*, vol. 47, no. 9, pp. 2129–2140, Sep. 2012.
- [44] D. G. Chen et al., "A low-power pilot-DAC based column parallel 8b SAR ADC with forward error correction for CMOS image sensors," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 60, no. 10, pp. 2572–2583, Sep. 2013.
- [45] A. Nilchi et al., "Focal-plane algorithmically-multiplying CMOS computational image sensor," *IEEE J. Solid-State Circuits*, vol. 44, no. 6, pp. 1829–1839, Jun. 2009.
- [46] W. Leon-Salas et al., "A CMOS imager with focal plane compression using predictive coding," *IEEE J. Solid-State Circuits*, vol. 42, no. 11, pp. 2555–2572, Nov. 2007.



**Denis Guangyin Chen** (S'07–M'13) received the B.Eng. (1st Hons.) degree in electrical and electronic engineering from the University of Canterbury, Christchurch, New Zealand, in 2008, and the Ph.D. degree in electronic and computer engineering from the Hong Kong University of Science and Technology, Hong Kong, in 2013.

He is currently a Research Associate with the Smart Sensory Integrated System (S2IS) Laboratory, Hong Kong University of Science and Technology, Hong Kong. His research interests include CMOS image sensors, compressive imaging, low-power SAR ADC with error-correction, biomedical implants, smart infrastructures, and laser Doppler imaging.



Fang Tang (S'07–M'13) received the B.S. degree from Beijing Jiaotong University, Beijing, China, in 2006, and the M.Phil. and Ph.D. degrees from Hong Kong University of Science and Technology, Hong Kong, in 2009 and 2013, respectively.

He was a Research Associate with Hong Kong University of Science and Technology, Hong Kong, until November 2013. Currently, he is a tenure-track Assistant Professor with the College of Communication Engineering, Chongqing University, Chongqing, China. His research interests include mixed-signal circuit design for advanced smart sensor integrated circuits and systems.



Man-Kay Law (M'11) received the B.Sc. degree in computer engineering and Ph.D. degree in electronic and computer engineering from Hong Kong University of Science and Technology, Hong Kong, in 2006 and 2011, respectively.

During his Ph.D. work, he performed research on ultra-low power/energy harvesting CMOS sensor designs for wireless sensing platforms. In February 2011, he joined Hong Kong University of Science and Technology, Hong Kong, as a Visiting Assistant Professor. He is currently an Assistant Professor with the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macao. His research interests are in the area of ultra-low power energy harvesting and sensing circuits for wireless sensing and biomedical systems, with special emphasis on smart CMOS temperature sensors, CMOS image sensors, ultra-low power analog design techniques, and integrated energy harvesting techniques.

Dr. Law currently serves as a member of the IEEE Circuits and Systems Society (CASS) technical committee on Sensory Systems as well as Biomedical Circuits and Systems.



Amine Bermak (M'99–SM'04–F'13) received the M.Sc. and Ph.D. degrees in electronic engineering from Paul Sabatier University, Toulouse, France, in 1994 and 1998, respectively.

During his Ph.D. work, he was part of the Microsystems and Microstructures Research Group, French National Research Center LAAS-CNRS, where he developed a 3-D VLSI chip for artificial neural network classification and detection applications in a project funded by Motorola. While completing his doctoral work, he joined the Advanced Computer Architecture research group at York University, York, U.K., where he was a Post-doctoral Researcher working on VLSI implementation of CMM neural network for vision applications in a project funded by British Aerospace. In 1998, he joined Edith Cowan University, Perth, Australia, first as a Research Fellow working on smart vision sensors, then as a Lecturer and a Senior Lecturer with the School of Engineering and Mathematics. In 2002, he joined Hong Kong University of Science and Technology, Hong Kong, where he is currently a Full Professor leading the Smart Sensory Integrated Systems (S2IS) Lab and also serving as the director of the M.Sc. degree in Integrated Circuit Design (ICDE). He has published over 230 papers on the above topics in various journals, book chapters and conferences. His research interests are related to VLSI circuits and systems for signal, image processing, sensors and microsystems applications.

Dr. Bermak was the recipient of the 2004 "IEEE Chester Sall Award", the "Best Paper Award" at the 2005 International Workshop on System-On-Chip for Real-Time Applications, the Student Best Paper Award in IEEE ISCAS 2010, the Michael Gale Medal in 2011, and the "Engineering Teaching Excellence Award" from the School of Engineering, Hong Kong University of Science and Technology, in 2004 and 2009. He has served on the technical program committees of a number of international conferences such as the IEEE Custom Integrated Circuit Conference and Design Automation and Test in Europe. He was the general co-chair of the IEEE International Symposium on Electronic Design, Test and Applications, Hong Kong, 2008, and the general co-chair of the IEEE Conference on Biomedical Circuits and Systems, Beijing, 2009. He has served and is currently serving on the editorial boards of many international journals such as the IEEE TRANSACTIONS ON VERY LARGE-SCALE INTEGRATION (VLSI) SYSTEMS, the IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, and the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II—EXPRESS BRIEFS. He is a member of the IEEE Circuits and Systems (CAS) Society committee on sensory systems and the CAS committee on biomedical circuits and systems, as well as an IEEE Distinguished Lecturer.