# An Adaptive Pre-Distortion Technique to Mitigate the DTC Nonlinearity in Digital PLLs

Salvatore Levantino, Member, IEEE, Giovanni Marzin, Member, IEEE, and Carlo Samori, Senior Member, IEEE

Abstract-Digital fractional-N phase-locked loops (PLLs) are an attractive alternative to analog PLLs in the design of frequency synthesizers for wireless applications. However, the main obstacle to their full acceptance in the wireless-systems arena is their higher content of output spurious tones, whose level is ultimately set by the nonlinearity of the time-to-digital converter (TDC). The known methods to improve the linearity of the TDC either increase its dissipation and phase noise or require slow foreground calibrations. By contrast, the class of digital PLLs based on a one-bit TDC driven by a multibit digital-to-time converter (DTC) substantially reduces power dissipation and eliminates the TDC nonlinearity issues. Although its spur performance depends on DTC linearity, the modified architecture enables the application of a background adaptive pre-distortion which does not compromise the PLL phase-noise level and power consumption and is much faster than other calibration techniques. This paper presents a 3.6-GHz digital PLL in 65-nm CMOS, with in-band fractional spurs dropping from -39 to -52 dBc when the pre-distortion is enabled, in-band phase noise of -103 dBc/Hz and power consumption of 4.2 mW.

Index Terms—Adaptive signal processing, all-digital PLL (ADPLL), bang-bang, digital PLL (DPLL), digital-to-time converter (DTC), frequency synthesis, jitter, lead-lag, MOS integrated circuits, mixed analog-digital integrated circuits, noise cancellation, nonlinear distortion, phase-locked loop (PLL), TDC-less, radio-frequency integrated circuits.

## I. INTRODUCTION

DIGITAL phase-locked loops (DPLLs) are establishing themselves as a potential alternative to their analog counterpart for the implementation of fractional-N frequency synthesizers for wireless systems. One of the main reasons driving this trend is the fact that area occupation and power consumption of analog PLLs do not scale down with the technology process. Furthermore, the adoption of the so-called "digiphase" scheme, that is widely used to suppress fractional spurs and  $\Delta\Sigma$  quantization noise, requires analog correlators and high-resolution digital-to-analog converters (DACs) [1]. As a result, the cancellation of noise and spurs may be incomplete and power consuming. By contrast, DPLLs eliminate the charge pump and its power consumption. They use a digital loop filter whose area and power scale down in new technology nodes. The digiphase, as well as other calibration algorithms [2], is easily implemented with less design complexity, better accuracy and limited area occupation and power consumption [3].

Manuscript received December 03, 2013; revised February 04, 2014; accepted March 05, 2014. Date of publication April 17, 2014; date of current version July 21, 2014. This paper was recommended by Guest Editor Andrea Mazzanti.

- S. Levantino and C. Samori are with the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan 20133, Italy (e-mail: salvatore.levantino@polimi.it).
- G. Marzin was with the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan 20133, Italy. He is now with Blue Danube Labs Inc., Warren, NJ 07059 USA.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2014.2314436

Despite the clear advantages of DPLLs over analog PLLs, there is still some hesitancy in adopting them in high-performance wireless systems. In fact, it is often believed that a DPLL may be more prone to the generation of spurious tones, which represent a serious limitation in modern radios, both when several communication standards must coexist and when carrier aggregation is implemented. The time-to-digital converter (TDC) is typically the main source of such spurs, whose level is a function of the resolution and nonlinearity of its conversion characteristic. Several techniques have been proposed both to refine resolution and improve linearity [4]-[12]. Despite having achieved good results, most of these solutions consume more power than is desired or necessary. In other cases, the reduction of the spur level is either paid with a growth of random phase noise or with the adoption of time-consuming foreground calibrations, such as the code density test.

A recently proposed DPLL architecture based on the use of a digital-to-time converter (DTC) breaks these tradeoffs. The adoption of the DTC (essentially a digital-to-analog converter) at the output of the feedback integer-N divider allows a lower number of bits of the TDC [4], even down to the limit case of a single-bit TDC [13]. The single-bit approach has several advantages: 1) the finite resolution and linearity of the TDC are no longer an issue, as the latter operates as a threshold circuit in the time domain; 2) a DTC requires in principle less power than a TDC with the same number of bits, substantially reducing the overall power consumption of the synthesizer, and it naturally takes advantage of over-sampling and subranging; and 3) in contrast to a TDC, the effects of the nonlinearity of the DTC can be corrected relying on an adaptive pre-distortion technique in the digital domain, without increasing the level of phase noise and the overall power consumption. This paper describes how the DTC-based architecture of DPLL has evolved from the first version presented in [14] to the one presented here and discloses a new adaptive pre-distortion algorithm, which improves the spur performance over channel frequency, runs continuously in the background and is orders of magnitude faster to converge than calibrations based on code density tests.

In Section II, the main sources of spurious tones are briefly recalled. Section III introduces the concept of adaptive pre-distortion for the linearization of the DTC characteristic and describes how to achieve a practical implementation of such an algorithm with minimum hardware requirements. The simulated performance of the proposed algorithm is assessed in Section IV, whereas the experimental results are presented in Section V. Finally, Section VI draws the conclusions.

Fig. 1. Conventional digital PLL architecture with digiphase scheme.

## II. GENERATION OF FRACTIONAL SPURS

### A. Conventional Digital PLLs

A conventional DPLL architecture is shown in Fig. 1 [3]. The digital  $\Delta\Sigma$  modulator driving the modulus control of the frequency divider interpolates the frequency control word (FCW) and the quantization error inserted by the  $\Delta\Sigma$  is subtracted from the TDC output. However, because of the TDC finite resolution, a periodic residue is produced after the cancellation and a number of concentrated tones, also known as fractional spurs, appear in the output spectrum. Such spurs are more powerful as the synthesized channels are closer to the boundaries of the integer-N channels (typically referred to as near-integer channels).

To estimate the power of these spurs, let us denote as n the number of fractional bits of the frequency control word (FCW) and as m < n the number of TDC bits. For the sake of simplicity, we assume a first-order  $\Delta\Sigma$  modulator that dithers the divider modulus, but the same conclusions can be drawn for higher-order modulators. As long as the FCW has the LSB equal to 1 and the other (n-m) LSBs equal to 0, the  $\Delta\Sigma$  toggles the division factor from N to N+1 in one reference period every  $2^n$ periods. Thus, it injects a quantization error into the loop, whose integral q[k] is a sawtooth signal (shown in Fig. 1). Being deterministic, the sawtooth can be cancelled out by estimating its amplitude  $\hat{g}$  and subtracting the scaled sequence  $\hat{g}q[k]$  at the TDC output. Unfortunately, because of the limited resolution, the TDC produces a periodic staircase p[k], and the cancellation of p[k] and  $\hat{g}q[k]$  is incomplete. The residue e[k] is a sawtooth signal with peak-to-peak amplitude  $\Delta t$  (where  $\Delta t$  is the TDC time resolution), which produces tones at  $2^{(m-n)}f_r$  and its multiples in the PLL output spectrum. If this tone frequency is within the PLL bandwidth, the fundamental harmonic is unfiltered, and its power in the double-sided phase-noise spectrum is  $\mathcal{L}_{q,\mathrm{spur}} = (\Delta t/T_{\mathrm{dco}})^2$ , where  $T_{\rm dco}$  is the DCO output period [15]. This means that a DPLL with a 4 bit TDC has a theoretical in-band fractional spur



Fig. 2. Main signal waveforms of the conventional DPLL with a nonlinear TDC.

of about -24 dBc, while in a DPLL with 10 bit TDC it decreases to about -60 dBc. The above estimates match simulation results very well as long as the variance of the random noise at the TDC input is much lower than the fractional-spur power. Instead, if the two terms become comparable, the random noise dithers the periodic quantization error, and the simulated level of spurs is slightly lower than the theoretical value. Nevertheless, this analysis confirms that the level of fractional spurs in a DPLL strongly depends on the resolution of the TDC [4], [16], and this fact justifies the need for a high-resolution TDC in a fractional-N DPLL [17]–[20].
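As a quick numerical check of the two figures quoted above, the following minimal sketch evaluates $\mathcal{L}_{q,\mathrm{spur}} = (\Delta t/T_{\rm dco})^2$ under the assumption that the TDC range spans exactly one DCO period, so that $\Delta t = T_{\rm dco}/2^m$. The function name `quantization_spur_dbc` is illustrative only.

```python
import math


def quantization_spur_dbc(tdc_bits: int) -> float:
    """In-band fractional spur (dBc) set by the TDC resolution alone.

    Assumption: the TDC range covers one DCO period, so that
    dt / T_dco = 2**-tdc_bits and the spur power is (dt / T_dco)**2.
    """
    ratio = 2.0 ** (-tdc_bits)
    return 10.0 * math.log10(ratio ** 2)


for bits in (4, 10):
    print(f"{bits} bit TDC: {quantization_spur_dbc(bits):.1f} dBc")
```

Each additional TDC bit lowers this theoretical spur level by about 6 dB, which is the reason why resolution alone is an expensive way to reduce fractional spurs.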

Unfortunately, the finite resolution of the TDC is not the only source of spurs. The nonlinearity of the TDC characteristic is very often the dominant one [16], [21]. Fig. 2 shows a simplified situation in which the TDC has an ideally infinite resolution but it is nonlinear. In this case, the TDC output p[k] differs from the ideal sawtooth signal and the residue e[k] is a non-zero periodic signal which produces fractional spurs at the output. It is not straightforward to calculate analytically the level of such spurious tones, but, for a first-order estimate, we can assume a TDC characteristic with sinusoidal shape and integral nonlinearity  $\Delta t_{\rm inl}$ . In this case, the residual time error is a sinusoid with amplitude  $\Delta t_{\rm inl}/2$  and the fundamental spur is:  $\mathcal{L}_{\rm inl,spur} = (\pi^2/4)(\Delta t_{\rm inl}/T_{\rm dco})^2$ . In practice, to keep the spur due to nonlinearity below the spur due to finite resolution, the integral nonlinearity of the TDC must be kept below one LSB (i.e.,  $\Delta t_{\rm inl}$  lower than  $\Delta t$ ). Simulations and measurements confirm this result to a good extent [15].
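The factor $\pi^2/4$ in the expression above can be recovered with a short narrowband phase-modulation argument. The steps below are a sketch under the stated assumption of a sinusoidal time error of amplitude $\Delta t_{\rm inl}/2$ and a small peak phase deviation $\phi_p$ (a symbol introduced here only for illustration): the time error is first converted into phase at the DCO frequency, and each sideband of a small-index phase modulation then has a power of $(\phi_p/2)^2$ relative to the carrier.

$$\begin{aligned}
\phi_p &= 2\pi\,\frac{\Delta t_{\rm inl}/2}{T_{\rm dco}} = \pi\,\frac{\Delta t_{\rm inl}}{T_{\rm dco}}, \\
\mathcal{L}_{\rm inl,spur} &\approx \left(\frac{\phi_p}{2}\right)^2 = \frac{\pi^2}{4}\left(\frac{\Delta t_{\rm inl}}{T_{\rm dco}}\right)^2 .
\end{aligned}$$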

Such a linearity requirement is extremely stringent if combined with the requirement of a dynamic range larger than 10 bits. Thus, several solutions have been recently proposed to mitigate the issue of TDC nonlinearity. Calibration methods such as those based on the *code density test* proved to be very effective in eliminating the adverse effects of nonlinearity [5]–[8], but they have long convergence times and, in most of the cases, require to be operated in the foreground. Thus, they need to be run again, to track environmental variations or to adapt the calibration after channel frequency steps. Other methods are based on the concept of *dynamic element matching* (DEM) technique, which can be applied to the TDC to trade linearity against noise. The delay elements of the TDC are shuffled, so that the effect of their mismatches is no longer periodic and the energy of fractional spurs is spread out in the spectrum [4]. A similar result

is achieved by means of the sliding scale technique, which consists of adding a dithering sequence to the input of the TDC and subtracting it at its output [9]. A more effective solution consists of shaping the spectral components, which arise from the mismatches of the TDC. Doing so, their energy is concentrated at high frequency and is filtered out by the loop filter. This behavior is obtained using the gated-ring-oscillator (GRO), proposed in [10]. Remarkable steps forward in this direction are the recent work in [11], which discusses and mitigates the issues related to the GRO design, and that in [12], which introduces a recirculating TDC based on a single delay cell. All of these solutions are effective in reducing the level of spurs induced by nonlinearity. However, they are power-consuming and substantially complicate the design. They reduce the spur level at the expense of either increased noise or long convergence times.

### B. Digital PLLs Based on DTCs

Instead of trading PLL performance with TDC's complexity, power, and area occupation, an alternative approach is to rethink the architecture of the DPLL in directions to relax the performance needed from the TDC. A step in this direction is taken in [4], where a 4 bit DTC, introduced in the loop after the integer-N divider and implemented as a controlled delay line, allows the TDC dynamic range to reduce to 4 equivalent bits and to enable a fully digital background correction of the DTC nonlinearity [22]. The idea is further developed in [13], reducing the number of TDC bits to one. Fig. 3 shows the basic block schematic of that circuit, in which the 1 bit TDC is essentially a bang-bang phase detector (BBPD) or, in practice, a coarse TDC with midrise quantization. That architecture solves the issues of TDC quantization and nonlinearity, since it replaces the multibit TDC with a simple threshold. Although at a first glance a step function may appear as a nonlinear one, in practice it behaves as a linear detector, as long as the input of this detector is dominated by random noise. This property is well known in the field of integer-N PLL, and it is conventionally exploited in clock-and-data recovery applications to avoid the insurgence of limit cycles, which degrade output jitter [23]. This regime is often referred to as random-noise regime, as opposed to the limit-cycle regime [24]. However, the application of this concept to fractional-N PLLs is not straightforward. The quantization error introduced by the  $\Delta\Sigma$  dithering of the divider modulus is as large as the DCO period  $T_{dco}$  (in terms of time delay), while the random jitter at the TDC input may range between 0.1 and 1 ps rms in high-performance frequency synthesizers. Thus, in a plain implementation, the BBPD would be overloaded by the deterministic component of jitter, the PLL would converge to a limit cycle, and large spurious tones would appear in the output spectrum. 

To recover the random-noise regime without increasing the level of noise, the amplitude of the deterministic jitter introduced by the  $\Delta\Sigma$  must be reduced below the random one. The DTC block at the output of the divider has this function. It is essentially a digitally controllable delay stage, which realigns the feedback signal to the reference clock by adding the correct delay to the divider's output. In practice, it cancels out the  $\Delta\Sigma$  quantization error in a fashion similar to the digiphase scheme in Fig. 1. The accumulated quantization error q[k] that is proportional to the deterministic jitter is first scaled by the gain



Fig. 3. Digital PLL based on 1 bit TDC and multibit DTC.

 $\hat{g}$  and then used to drive the DTC control word. The gain  $\hat{g}$  is obtained by accumulating the product of q[k] and e[k], which approximates the correlation between the two sequences. The resulting loop is a real-time sign-LMS estimation of  $\hat{g}$ , performed continuously in the background of PLL operation. To improve the equivalent resolution of the DTC, the DTC control word is not applied directly but is driven by means of a secondary  $\Delta\Sigma$ , which exploits oversampling.<sup>1</sup>
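The gain estimation described above can be illustrated with a heavily simplified behavioral sketch. It is not the circuit of Fig. 3: the PLL loop, the DCO and the secondary $\Delta\Sigma$ are omitted, the DTC is assumed linear, and the time error seen by the 1 bit TDC is modeled (as an assumption) as $(g - \hat{g})\,q[k]$ plus Gaussian jitter, with all quantities normalized to $T_{\rm dco}$. The names `estimate_gain_sign_lms`, `g_true`, `mu`, `frac` and `jitter_rms` are hypothetical.

```python
import random


def estimate_gain_sign_lms(g_true=0.8, mu=1e-3, n_steps=20000,
                           frac=0.013, jitter_rms=0.002, seed=1):
    """Sign-LMS estimate of the DTC scaling gain (toy model).

    Assumption: the residual time error at the 1 bit TDC input is
    (g_true - g_hat) * q[k] plus Gaussian jitter, normalized to T_dco.
    """
    rng = random.Random(seed)
    g_hat = 0.0
    acc = 0.0                                # first-order accumulator state
    for _ in range(n_steps):
        acc = (acc + frac) % 1.0
        q = acc - 0.5                        # zero-mean sawtooth q[k]
        err = (g_true - g_hat) * q + rng.gauss(0.0, jitter_rms)
        e = 1 if err >= 0 else -1            # bang-bang detector output e[k]
        g_hat += mu * q * e                  # accumulate the product q[k] * e[k]
    return g_hat


g_hat = estimate_gain_sign_lms()
print(f"estimated gain: {g_hat:.3f}")
```

In this toy model the estimate first ramps at a rate set by `mu` while the detector is overloaded by the deterministic error, and then dithers around the true gain once the random jitter dominates the detector input.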

The peak amplitude of the residual deterministic jitter at the TDC input is as large as the equivalent resolution of the  $\Delta\Sigma$ -DTC. Thus, the latter must be kept below the random component of jitter, to remain in the random-noise regime. In practice, a 10 bit DTC (i.e., resolution of about 300 fs and range of about 300 ps) is sufficient to achieve a theoretical level of fractional spurs below -65 dBc (even for near-integer channels) and output jitter of 300 fs rms in a 3-4 GHz PLL [13]. Although the issue of large dynamic range and linearity may seem just to be shifted from the multibit TDC of the original DPLL to the DTC of the alternative architecture, design is greatly relaxed and power consumption significantly reduced [27]. A 10 bit flash-type TDC would require more than one thousand delay stages and flip-flops, while the DTC may require just one variable-delay stage, that is, a digital inverter loaded by a bank of switched capacitors.<sup>2</sup> With this topology, time resolutions below 100 fs can be easily achieved in the 65 nm node, at a power consumption that is, in principle, a factor of one thousand lower than that of a TDC with the same resolution. Obviously, if the comparison were carried out against other types of TDCs, such as the oversampled TDC in [10], the Vernier-ring TDC in [29] or the pipelined TDC in [30], a lower number of delay stages and

<sup>1</sup>This scheme for the cancellation of the quantization error has been recently exploited in analog PLLs [25], [26].

<sup>2</sup>Another implementation of the DTC is the phase interpolator in [28], which has inherently better linearity than the delay stage with variable load capacitor but requires higher power consumption at the same operating frequency.



Fig. 4. Transformation of the DTC characteristic f(x) after the multiplication by  $\hat{g}$  of the DTC control word: (a) linear f(x) and (b) nonlinear f(x).

flip-flops is required at the cost of higher design complexity and lower conversion speed.

To analyze the impact of nonlinearity of the DTC characteristic, it is useful to recall the effect of the scaling factor  $\hat{g}$  in the ideal linear case in Fig. 4(a). The conversion characteristic of the DTC is a function f(x), where x is the DTC control word in the domain [-M, M]. To cover PVT spreads, the output range of the DTC is intentionally designed to be larger than the required  $T_{\text{dco}}$ . The multiplication of x by  $\hat{g}$  (where  $\hat{g} < 1$ ) sets the maximum output delay of the DTC to the desired value. More rigorously, the multiplication by  $\hat{g}$  modifies the DTC characteristic from f(x) to the function composition  $f(\hat{g}x)$  sketched on the right side of Fig. 4(a). As a result, when the PLL synthesizes a fractional-N channel, the DTC provides the correct linear increase of the excess phase in the loop, after every increment of the divider modulus from N to (N+1). Thus, the waveform of the unwrapped excess time shift at the DTC output is the ideal ramp in Fig. 5(a). By contrast, if the DTC has a nonlinear characteristic f(x), as in Fig. 4(b), the multiplication by  $\hat{g}$  produces a function composition that is still nonlinear. Hence, the excess time shift at the DTC output is no longer linear, and it even shows periodic discontinuities at the time instants in which the modulus control is changed [Fig. 5(b)]. The sign-LMS algorithm minimizes the power of the residue e[k], which is the difference between the linear and the actual ramps, and thus reduces the output spurs with respect to the case of an unregulated characteristic, but it cannot null them.

## III. DIGITAL PRE-DISTORTION OF DTC

The nonlinear relationship between the DTC control word and the produced delay shift causes generation of unwanted fractional spurs in the PLL and, in general, folding of the spectrum of the  $\Delta\Sigma$  modulator. The conversion characteristic of the DTC block can be in principle linearized by adopting a pre-distortion algorithm. The pre-distortion circuit inversely models the DTC characteristic and, when combined with the DTC, produces an overall block that is more linear. In essence, inverse



Fig. 5. Excess time shift at the DTC output: (a) linear f(x) and (b) nonlinear f(x).



Fig. 6. Linearization of the DTC characteristic f(x) after mapping the DTC control word in the pre-distortion function  $\hat{c}(x)$ , that is the inverse function of f(x).

distortion is introduced into the DTC control word, thus cancelling any nonlinearity the DTC might have. In contrast to dithering techniques aiming to spread out the power of the spurs, this method produces no increase of phase noise. To adapt automatically to any environmental variations or channel frequency changes, this pre-distortion operation should be updated continuously in the background of the PLL operation. To illustrate the operating principle of the algorithm, we will first introduce an ideal approach which aims to correct every allowed level of q[k]. Though this approach is impractical in terms of hardware requirements, its illustration will be helpful to derive the piecewise-linear pre-distortion. The latter technique reduces the required hardware, while it does not compromise performance.

### A. Ideal Pre-Distortion

The conversion characteristic of the DTC is mathematically defined by the function f(x) shown in Fig. 6. The domain of f(x), which is the set of all of its permitted values, is the set of



Fig. 7. Block schematic of the DPLL implementing the adaptive estimation of the inverse function of the DTC.

integer values between -M and +M:  $X=\{-M,\ldots,+M\}$ . The method of pre-distortion consists in mapping the domain X into a new set  $C=\{\hat{c}_{-M},\ldots,\hat{c}_{+M}\}$  by means of a function  $\hat{c}(x)$ , so that the composition  $f[\hat{c}(x)]$  gives the linear relationship  $\xi x+\delta$ , where  $\xi$  is the gain and  $\delta$  is the intercept point for x=0. In mathematics, such function  $\hat{c}(x):X\mapsto C$  exists, if f is monotonic, and it is given by  $f^{-1}(\xi x+\delta)$ , where  $f^{-1}$  is the inverse function of f. The function  $\hat{c}(x)$  along with the result of the pre-distortion operation is shown in Fig. 6.
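The mapping $\hat{c}(x) = f^{-1}(\xi x+\delta)$ can be made concrete with a small numerical sketch. Everything specific below is an assumption made for illustration: the quadratic characteristic `dtc_delay` is hypothetical, the inverse is found by bisection (which only requires f to be monotonic), and $\xi$ and $\delta$ are chosen from an endpoint fit so that the target line always stays within the range of f.

```python
def dtc_delay(c):
    """Hypothetical monotonic, mildly nonlinear DTC characteristic f(c)."""
    return c + 0.01 * c * c


def inverse_predistortion(f, M, iters=60):
    """Return (c_hat, xi, delta) with f(c_hat[k]) = xi * x + delta, x = -M..M.

    Assumptions: f is continuous and strictly increasing on [-M, M];
    xi and delta come from the line through the end points of f.
    """
    f_lo, f_hi = f(-M), f(M)
    xi = (f_hi - f_lo) / (2 * M)          # target gain
    delta = (f_hi + f_lo) / 2             # target value at x = 0
    c_hat = []
    for x in range(-M, M + 1):
        target = xi * x + delta
        lo, hi = float(-M), float(M)
        for _ in range(iters):            # bisection on a monotonic function
            mid = 0.5 * (lo + hi)
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        c_hat.append(0.5 * (lo + hi))
    return c_hat, xi, delta


M = 8
c_hat, xi, delta = inverse_predistortion(dtc_delay, M)
residual = max(abs(dtc_delay(c) - (xi * x + delta))
               for x, c in zip(range(-M, M + 1), c_hat))
print(f"largest deviation from the straight line: {residual:.2e}")
```

The endpoint fit is only one possible choice of $\xi$ and $\delta$; in the adaptive scheme described next, these quantities are set implicitly by the loop rather than computed explicitly.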

The values of the set C can be automatically determined by means of the scheme in Fig. 7. In this case, the domain of f is given by the set of all of the permitted values of the quantization error q[k]. The scheme is based on a bank of digital integrators with gain  $\gamma < 1$ . At every occurrence of a certain integer value i of q[k], the error e[k] detected by the phase detector is accumulated in the i-th integrator. If the phase detector is a single-bit TDC (as in the case shown in Fig. 7), only the ith integrator is selected by the digital multiplexer and de-multiplexer, and integrates +1 or -1 depending on whether the reference signal (ref) leads or lags the divider signal (div). In the meantime, all of the other integrators that are not selected integrate a zero value. Doing so, the output  $\hat{c}_i$  of the ith integrator that drives the  $\Delta\Sigma$ -DTC input converges to the value which minimizes the error e[k]. This minimization



Fig. 8. Linearization of the DTC characteristic f(x) after mapping the DTC control word in the pre-distortion function  $\hat{c}(x)$ , that is a piecewise-linear approximation of the inverse function.

occurs as all the integrators converge to the values of the inverse function of the DTC characteristic. In order not to interfere with



Fig. 9. Block schematic of the DPLL implementing the adaptive estimation of the piecewise-linear pre-distortion function of the DTC.

the PLL, one of the (2M + 1) coefficients of set C is deliberately forced to 0 ( $\hat{c}_0 = 0$ , in the figure). In this way, the linearization algorithm does not alter the mean delay introduced by the DTC block and the mean delay between ref and div signals, which is instead set by the PLL loop.
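The bank of integrators of Fig. 7 can be sketched behaviorally as follows. This is a toy model under several assumptions that are not in the text: the values of q[k] are drawn at random instead of being generated by a $\Delta\Sigma$ modulator, the PLL loop is replaced by a fixed target line $\xi i + f(0)$, and the DTC characteristic `dtc_delay` is hypothetical. Only the selected integrator is updated, by $+\gamma$ or $-\gamma$ according to the sign of the detected error, and $\hat{c}_0$ is held at 0.

```python
import random


def dtc_delay(c):
    """Hypothetical monotonic, mildly nonlinear DTC characteristic f(c)."""
    return c + 0.01 * c * c


def adapt_integrator_bank(f, M, xi, gamma=0.005, n_steps=40000,
                          jitter_rms=0.05, seed=7):
    """Adapt one coefficient per permitted value of q[k] (toy model)."""
    rng = random.Random(seed)
    c_hat = {i: float(i) for i in range(-M, M + 1)}  # nominal start values
    for _ in range(n_steps):
        i = rng.randint(-M, M)        # value of q[k] in this reference cycle
        if i == 0:
            continue                  # c_hat[0] is forced to 0, never updated
        wanted = xi * i + f(0.0)      # delay that would null the time error
        err = wanted - f(c_hat[i]) + rng.gauss(0.0, jitter_rms)
        c_hat[i] += gamma if err >= 0 else -gamma    # integrate +1 or -1
    return c_hat


c_hat = adapt_integrator_bank(dtc_delay, M=4, xi=0.9)
worst = max(abs(dtc_delay(c_hat[i]) - 0.9 * i) for i in c_hat)
print(f"worst residual delay error: {worst:.3f}")
```

Because each integrator only sees the error sign, its steady-state value dithers around the point where the error is equally likely to be positive or negative, which in this model is the inverse-function value.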

Unfortunately, in a practical implementation of a fractional-N PLL, the FCW requires a high number of fractional bits. In wireless applications and, in general, when a fine regulation of the synthesized clock is required, 18 or more fractional bits may be required. As a result, the quantization error q[k] has 18 or more bits and the scheme in Fig. 7 would demand an impractically large number of digital integrators (i.e.,  $2^{18}$  or more integrators). So, a technique to reduce the hardware requirement has to be found.

### B. Piecewise-Linear Pre-Distortion

Although the method illustrated in the previous subsection is impractical, it allows us to envision a modified version, which requires a much lower number of integrators without compromising performance. The concept is illustrated in Fig. 8. Essentially, instead of estimating all of the values of the inverse function, we approximate the inverse function with a piecewise-linear curve. In this way, we estimate the inverse function at only a limited number of points by means of a limited number of digital integrators, and approximate it elsewhere by linearly interpolating between those points.

To get a mathematical representation of the algorithm, it is useful to separate the DTC control word x into a coarse and a fine component,  $x = x_c + x_f$ . The coarse component is given by

$$x_c = \left\lfloor \frac{P}{M} x \right\rfloor$$

where  $\lfloor\cdot\rfloor$  is the floor function. Thus, as x is an integer number between -M and +M, its coarse component  $x_c$  is an integer number between -P and +P, with P < M.

The automatic estimation of the pre-distortion function  $\hat{c}(x)$  is performed only at the (2P+1) permitted values of  $x_c$ . Let us denote as  $\hat{c}_{x_c}$  the estimated values of  $\hat{c}(x)$  for  $x = x_c$ . In this way, the number of digital integrators used for the estimation of the  $\hat{c}_{x_c}$  coefficients is reduced from 2M to 2P. Then, the expression of the function  $\hat{c}(x)$  for all x values is a piecewise-linear function composed of straight lines connecting the  $\hat{c}_{x_c}$  coefficients

$$\hat{c}(x) = \hat{c}_{x_c} + \hat{g}_{x_c} \cdot x_f$$



Fig. 10. Block schematic of the fabricated DPLL, with sub-ranging DTC and pre-distortion block.



Fig. 11. Schematic of the sub-ranging DTC circuit.

where the gains  $\hat{g}_{x_c}$  are obtained from the vector of  $\hat{c}_{x_c}$  by means of an algebraic expression

$$\hat{g}_{x_c} = \frac{P}{M} (\hat{c}_{x_c+1} - \hat{c}_{x_c}).$$

In this way, the pre-distortion function  $\hat{c}(x)$  which is used to linearize the DTC characteristic is a piecewise-linear approximation of the inverse function  $f^{-1}(\xi x + \delta)$ , as illustrated in Fig. 8 for the case of P = 3. The condition for the existence of  $\hat{c}(x)$  is just the monotonicity of the function  $f(x_c)$  in the domain  $X_c = \{-P, \ldots, +P\}$ .
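The evaluation of the piecewise-linear function can be sketched in a few lines. Two points are assumptions made here for the sake of a runnable example: the fine component is taken as the offset of x from the start of its coarse segment, $x_f = x - (M/P)\,x_c$, expressed in DTC-code units, and the end point $x = M$ reuses the last segment so that $\hat{c}_{x_c+1}$ always exists. The function name `predistort` and the dictionary layout of `c_hat` are illustrative.

```python
def predistort(x, c_hat, M, P):
    """Piecewise-linear pre-distortion c_hat(x) for an integer x in [-M, M].

    c_hat maps each coarse index in -P..P to its estimated coefficient.
    Assumption: x_f = x - (M / P) * x_c, the offset inside the segment.
    """
    x_c = (P * x) // M                 # floor((P / M) * x), also for x < 0
    if x_c >= P:                       # x == M: reuse the last segment
        x_c = P - 1
    x_f = x - (M / P) * x_c
    g = (P / M) * (c_hat[x_c + 1] - c_hat[x_c])   # slope of this segment
    return c_hat[x_c] + g * x_f


M, P = 512, 8
c_lin = {i: 64.0 * i for i in range(-P, P + 1)}   # identity mapping
c_bent = dict(c_lin)
c_bent[1], c_bent[2] = 60.0, 130.0                # hypothetical estimates
print(predistort(96, c_lin, M, P), predistort(96, c_bent, M, P))
```

With equally spaced coefficients the function reduces to the identity, while a change in two neighboring coefficients only affects the segments that touch them.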

The above-described algorithm is implemented in the scheme in Fig. 9. The quantization error q[k] is separated into a coarse and a fine component by means of the quantizer Q. The coarse component  $q_c[k]$ , which has (2P+1) permitted values, selects the corresponding  $\hat{c}_{x_c}$ . The  $\hat{c}_{x_c}$  coefficients are estimated by 2P digital accumulators. The fine component  $q_f[k]$  of the quantization error is multiplied by the gain  $\hat{g}_{x_c}$ , which is calculated from the vector of the  $\hat{c}_{x_c}$  coefficients, and then added to  $\hat{c}_{x_c}$  to get  $\hat{c}(x)$ .

Fig. 10 shows the final scheme in which the pre-distortion technique is combined with the subranging DTC introduced in [13], which significantly simplifies its implementation. The DTC, whose schematic is drawn in Fig. 11, is a delay stage with two digitally controlled capacitor banks: a coarse thermometric-coded one with larger unit capacitance and a fine one with smaller unit capacitance. The thermometric coding guarantees the monotonicity of both the fine and the coarse characteristics of the DTC, whereas the subranging approach greatly reduces the number of capacitors and control wires of the DTC. The fine delay range is intentionally designed to be much larger than the coarse delay resolution (say, by a factor of  $2\times$ ) to cover process variations. To exactly match the resolution of the coarse word of the DTC, the fine control word is multiplied by a gain h (Fig. 10). This eliminates any bank-to-bank discontinuity [13].



Fig. 12. Integral nonlinearity (INL, with respect to full scale) of (a) the DTC converter obtained from circuit simulations and (b) the DTC with pre-distortion obtained from VHDL simulations.



Fig. 13. Phase-noise spectrum from VHDL simulations with: (a) no pre-distortion and (b) with pre-distortion.



Fig. 14. Pre-distortion function estimated by the scheme in Fig. 9, obtained from VHDL simulations.

## IV. SIMULATION RESULTS

To assess the performance of this adaptive pre-distortion algorithm, both the DPLL in Fig. 3 and the one in Fig. 10 are simulated and their performances are compared. The DTC characteristic f(x) is obtained from transistor-level simulations of the circuit in Fig. 11, taking into account a normal distribution of the capacitance values with a pessimistic standard deviation equal to 10% of the unit capacitance. Of course, the statistical



Fig. 15. Transient behavior of the  $\hat{c}_i$  coefficients.



Fig. 16. Die photograph.

distribution of the coarse unit capacitance dominates over the fine one. However, the nonlinearity is mainly dominated by the systematic component, which gives a more-than-linear relationship versus the control word. This is not surprising, since the delay of the following stage increases slightly as the slope of its input voltage decreases. The INL resulting from simulations is sketched as curve (a) in Fig. 12, which has a maximum value of about 1.2% with respect to the full-scale range given by the DCO period  $T_{\rm dco}$ , or equivalently about 12 times the LSB of the DTC.

Curve (a) in Fig. 13 is the simulated output spectrum of the DPLL in Fig. 3 (described in VHDL language) when a near-integer channel (290 kHz off the integer-N channel at 3.6 GHz) is synthesized from a 40 MHz reference. A fundamental fractional spur appears at about 290 kHz (falling within PLL bandwidth) together with its higher order harmonics. The level of such spur is -43 dBc. Moreover, an unexpected noise component is also visible in the 2-10 MHz range, which may be ascribed to folding of the  $\Delta\Sigma$  noise spectrum. Both spur and noise originate from the nonlinearity of the DTC. In fact, when the f(x) function



Fig. 17. Measured phase noise.

used in simulation is assumed to be ideally linear, the fundamental spur reduces to less than -65 dBc and the unexpected noise component disappears.

The DPLL in Fig. 10, employing the adaptive pre-distortion with 16 estimators (P = 8), is simulated with the same DTC nonlinearity f(x) used before. The  $\hat{c}_i$  coefficients settle to their steady-state value, shown in Fig. 14. The piecewise linear function  $\hat{c}(x)$ , which is composed of straight lines connecting the estimated  $\hat{c}_i$  coefficients and is also plotted in Fig. 14, shows a compression. This compression compensates the super-linear characteristic f(x) of the DTC. The estimation of the coefficients  $\hat{c}_i$  runs in the background and adapts the nonlinearity correction automatically to frequency steps or environmental variations. The settling of the 16 coefficients  $\hat{c}_i$  starting from the PLL start-up is sketched in Fig. 15. While  $\hat{c}_0$  is constant, the other 16 coefficients start from their nominal value and settle to lower values, because the DTC delay range is intentionally designed larger than the nominal one. As a result, the last coefficient,  $\hat{c}_8[k]$ , has the longest settling, which is limited by the overload of the 1 bit TDC. By contrast, after channel switching, the settling time reduces to less than 100  $\mu$ s.

Fig. 18. Measured fractional spur without pre-distortion.

To quantify the ability of the pre-distortion algorithm to linearize the DTC, we plot the INL of the function composition $f[\hat{c}(x)]$ in Fig. 12 as curve (b). The residual nonlinearity is as low as 0.1% of the full scale, or equivalently about one LSB of the DTC. This means that estimating only 16 coefficients out of the $2^{18}$ (or, in general, more) permitted values of q[k] is sufficient to reduce the INL of the DTC from 1.2% to 0.1%. Curve (b) in Fig. 13 shows the output spectrum of the PLL with the adaptive pre-distortion, for the same near-integer channel used above. The fundamental spur drops by 24 dB to about -67 dBc and the noise component ascribed to folding of $\Delta\Sigma$ noise disappears. This result demonstrates that the piecewise-linear approximation of the inverse function with 16 estimated points is effective in cancelling the dominant terms of the DTC nonlinearity, which are the main source of fractional spurs.
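The effect of a 16-segment piecewise-linear inverse on the INL can be reproduced qualitatively with the sketch below. It assumes a hypothetical convex characteristic f(x) on a normalized [0, 1] range with 1.2% peak INL, and it computes the knots directly from the known inverse, whereas in the DPLL the coefficients are estimated in the background. The residual INL it reports is therefore not expected to match curve (b) in Fig. 12.

```python
def dtc(x, eps=0.048):
    """Hypothetical super-linear DTC characteristic, 1.2% peak INL."""
    return x - eps * x * (1.0 - x)

def dtc_inverse(u, iters=60):
    """Invert the monotonic dtc() on [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dtc(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def piecewise_linear(u, knots_u, knots_c):
    """Straight lines connecting (knots_u[i], knots_c[i]), uniform knots."""
    n_seg = len(knots_u) - 1
    i = min(int(u * n_seg), n_seg - 1)         # segment containing u
    t = (u - knots_u[i]) / (knots_u[i + 1] - knots_u[i])
    return knots_c[i] + t * (knots_c[i + 1] - knots_c[i])

N_SEG = 16                                      # 16 segments, 17 knots
knots_u = [i / N_SEG for i in range(N_SEG + 1)]
knots_c = [dtc_inverse(u) for u in knots_u]     # ideal inverse at the knots

grid = [k / 4000 for k in range(4001)]
inl_before = max(abs(dtc(u) - u) for u in grid)
inl_after = max(abs(dtc(piecewise_linear(u, knots_u, knots_c)) - u)
                for u in grid)
print(f"INL without pre-distortion: {100 * inl_before:.3f}% of full scale")
print(f"INL with pre-distortion:    {100 * inl_after:.4f}% of full scale")
```

The pre-distortion curve built this way lies above the identity line and is concave, that is, compressive, which is the qualitative behavior described for $\hat{c}(x)$ in Fig. 14.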

#### V. MEASUREMENT RESULTS

The DPLL in Fig. 10 has been fabricated in a 65 nm CMOS process. It synthesizes frequencies in the 3–4 GHz range from a 40 MHz quartz oscillator with a resolution of approximately 76 Hz. Fig. 16 shows the die photograph, where the core circuit occupies 0.2 mm<sup>2</sup>. The total power dissipation (excluding crystal oscillator and pad driver) is 4.2 mW from a 1.2 V voltage supply without voltage regulators. The measured phase noise in Fig. 17 shows an in-band phase noise level of about -103 dBc/Hz (at 100 kHz), which is dominated by the thermal noise of the DTC and the 1 bit TDC.

Fig. 19. Measured fractional spur with pre-distortion.

Fig. 20. Measured fractional spur versus synthesized channel frequency: (a) without pre-distortion and (b) with pre-distortion.

To evaluate the proposed linearization technique, the test chip includes the possibility of disabling the pre-distortion block in Fig. 10 and replacing it with a multiplication of q[k] by a single gain $\hat{g}$, which is estimated via the LMS loop in Fig. 3. When a channel 3 kHz off the integer-N one at 3.6 GHz is synthesized and the pre-distortion is disabled, the measured fractional spur at about 3 kHz has a level of about -39 dBc (Fig. 18). This spur drops to -52 dBc when the same channel is synthesized and the pre-distortion is enabled (Fig. 19). These measurements have been repeated at several channel frequencies between two integer-N channels, and the level of the fundamental fractional spur is plotted versus the offset frequency from the integer-N channel in Fig. 20. When the pre-distortion is disabled, the in-band fractional spur is about -39 dBc [plot (a) in Fig. 20], which is close to the simulation result in Fig. 13(a). Moreover, the shape of plot (a) follows the low-pass transfer function with $1/f^4$ roll-off typical of the input-output transfer function of a third-order PLL. This result agrees with the picture that the DTC systematic nonlinearity causes inaccurate cancellation of q[k] and the residue of this cancellation is injected into the feedback path. When the pre-distortion is enabled, the in-band spurs go down to about -52 dBc [plot (b) in Fig. 20], a value that is higher than the simulated level of -67 dBc in Fig. 13(b). Although this result suggests the existence of other mechanisms that are responsible for the residual spurs (such as electromagnetic coupling), it proves the usefulness of the proposed technique. The slope of plot (b) for out-of-band offsets is close to $1/f^2$, which departs from the ideal input-output transfer function of the PLL observed in the case without pre-distortion. This means that the spurs at higher offsets are not related to disturbances injected into the feedback path. Instead, they may be caused by a coupled disturbance (at a frequency equal to the offset frequency of the fractional spur), which modulates the DCO.
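To tell the two out-of-band behaviors apart in a spur-versus-offset plot, one can fit the slope in dB per decade. The sketch below assumes that the exponents in $1/f^4$ and $1/f^2$ refer to spur power, so that they correspond to about -40 and -20 dB/decade; the offsets and levels are synthetic, normalized values chosen for illustration and are not the measured data of Fig. 20.

```python
import math

def rolloff_db_per_decade(offsets_hz, levels_db):
    """Least-squares slope of level (dB) versus log10(offset frequency)."""
    xs = [math.log10(f) for f in offsets_hz]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(levels_db) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, levels_db))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Synthetic out-of-band points, normalized to 0 dB at 1 MHz (hypothetical)
offsets = [1e6, 2e6, 5e6, 1e7]
levels_a = [-10 * math.log10((f / 1e6) ** 4) for f in offsets]  # 1/f^4 power
levels_b = [-10 * math.log10((f / 1e6) ** 2) for f in offsets]  # 1/f^2 power

slope_a = rolloff_db_per_decade(offsets, levels_a)
slope_b = rolloff_db_per_decade(offsets, levels_b)
print(f"slope (a): {slope_a:.1f} dB/decade, slope (b): {slope_b:.1f} dB/decade")
```

Applied to measured points, a fitted slope near -20 dB/decade rather than -40 dB/decade would point to a disturbance that does not pass through the input-output transfer function of the loop.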

#### VI. CONCLUSION

DPLLs based on a one-bit TDC and a multibit DTC have significant advantages over conventional DPLLs based on a multibit TDC. Although the tight resolution and linearity requirements simply shift from the multibit TDC to the multibit DTC, the latter: 1) can be implemented as a simple digitally controlled delay stage with lower power consumption; 2) can exploit techniques such as sub-ranging and oversampling with little added design complexity; and 3) inherently enables a dither-less linearization of the DTC characteristic with fast convergence. A novel pre-distortion algorithm that compensates for the DTC nonlinearity and operates continuously in the background during normal operation of the PLL has been presented here and demonstrated in silicon. The technique proves effective in cancelling fractional spurs arising from DTC nonlinearity at very low power and phase-noise level.

#### REFERENCES

- [1] A. L. Lacaita, S. Levantino, and C. Samori, *Integrated Frequency Synthesizers for Wireless Systems*. Cambridge, U.K.: Cambridge Univ. Press, 2007.
- [2] G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, "A 20 Mb/s phase modulator based on a 3.6 GHz digital PLL with -36 dB EVM at 5 mW power," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 2974–2988, Dec. 2012.
- [3] C.-M. Hsu, M. Z. Straayer, and M. H. Perrott, "A low-noise wide-BW 3.6-GHz digital ΔΣ fractional-N frequency synthesizer with a noise-shaping time-to-digital converter and quantization noise cancellation," *IEEE J. Solid-State Circuits*, vol. 43, no. 12, pp. 2776–2786, Dec. 2008.
- [4] M. Zanuso, S. Levantino, C. Samori, and A. L. Lacaita, "A wideband 3.6 GHz digital ΔΣ fractional-N PLL with phase interpolation divider and digital spur cancellation," *IEEE J. Solid-State Circuits*, vol. 46, no. 3, pp. 627–638, Mar. 2011.
- [5] E. Temporiti, C. Weltin-Wu, D. Baldi, R. Tonietto, and F. Svelto, "A 3 GHz fractional all-digital PLL with a 1.8 MHz bandwidth implementing spur reduction techniques," *IEEE J. Solid-State Circuits*, vol. 44, no. 3, pp. 824–834, Mar. 2009.
- [6] L. Vercesi, L. Fanori, F. De Bernardinis, A. Liscidini, and R. Castello, "A dither-less all digital PLL for cellular transmitters," *IEEE J. Solid-State Circuits*, vol. 47, no. 8, pp. 1908–1920, Aug. 2012.
- [7] J. Borremans, K. Vengattaramane, V. Giannini, B. Debaillie, W. Van Thillo, and J. Craninckx, "A 86 MHz-12 GHz digital-intensive PLL for software-defined radios, using a 6 fJ/step TDC in 40 nm digital CMOS," *IEEE J. Solid-State Circuits*, vol. 45, no. 10, pp. 2116–2129, Oct. 2010.
- [8] A. Samarah and A. C. Carusone, "A digital phase-locked loop with calibrated coarse and stochastic fine TDC," *IEEE J. Solid-State Circuits*, vol. 48, no. 8, pp. 1829–1841, Aug. 2013.
- [9] E. Temporiti, C. Weltin-Wu, D. Baldi, M. Cusmai, and F. Svelto, "A 3.5 GHz wideband ADPLL with fractional spur suppression through TDC dithering and feedforward compensation," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2723–2736, Dec. 2010.
- [10] M. Z. Straayer and M. H. Perrott, "A multi-path gated ring oscillator TDC with first-order noise shaping," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1089–1098, Apr. 2009.
- [11] C.-W. Yao and A. N. Willson, "A 2.8–3.2-GHz fractional-N digital PLL with ADC-assisted TDC and inductively coupled fine-tuning DCO," *IEEE J. Solid-State Circuits*, vol. 48, no. 3, pp. 698–710, Mar. 2013.
- [12] H. S. Kim, C. Ornelas, K. Chandrashekar, D. Shi, P. Su, P. Madoglio, W. Y. Li, and A. Ravi, "A digital fractional-N PLL with a PVT and mismatch insensitive TDC utilizing equivalent time sampling technique," *IEEE J. Solid-State Circuits*, vol. 48, no. 7, pp. 1721–1729, Jul. 2013.
- [13] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, "A 2.9-to-4.0 GHz fractional-N digital PLL with bang-bang phase detector and 560 fsrms integrated jitter at 4.5 mW power," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2745–2758, Dec. 2011.
- [14] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, "A 2.9-to-4.0 GHz fractional-N digital PLL with bang-bang phase detector and 560 fsrms integrated jitter at 4.5 mW power," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2011, pp. 88–90.
- [15] S. Levantino and C. Samori, "Nonlinearity cancellation in digital PLLs," in *Proc. Custom Integr. Circuits Conf.*, San Jose, CA, USA, Sep. 22–25, 2013, pp. 1–8.
- [16] R. Tonietto, E. Zuffetti, R. Castello, and I. Bietti, "A 3 MHz bandwidth low noise RF all digital PLL with 12 ps resolution time to digital converter," in *Proc. Eur. Solid-State Circuits Conf.*, 2006, pp. 150–153.
- [17] T.-K. Jang, X. Nan, F. Liu, J. Shin, H. Ryu, J. Kim, T. Kim, J. Park, and H. Park, "A 0.026 mm<sup>2</sup> 5.3 mW 32-to-2000 MHz digital fractional-N phase locked-loop using a phase-interpolating phase-to-digital converter," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2013, pp. 254–255.
- [18] D.-W. Jee, Y.-H. Seo, H.-J. Park, and J.-Y. Sim, "A 2 GHz fractional-N digital PLL with 1 b noise shaping  $\Delta\Sigma$  TDC," *IEEE J. Solid-State Circuits*, vol. 47, no. 4, pp. 875–883, Apr. 2012.
- [19] Y. Han, D. Lin, S. Geng, N. Xu, W. Rhee, T.-Y. Oh, and Z. Wang, "All-digital PLL with  $\Delta\Sigma$  DLL embedded TDC," *Electron. Lett.*, vol. 49, no. 2, pp. 93–94, Jan. 2013.
- [20] T. Tokairin, M. Okada, M. Kitsunezuka, M. Tadashi, and M. Fukaishi, "A 2.1-to-2.8-GHz low-phase-noise all-digital frequency synthesizer with a time-windowed time-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2582–2590, Dec. 2010.
- [21] C. Weltin-Wu, E. Temporiti, M. Cusmai, D. Baldi, and F. Svelto, "Insights into wideband fractional ADPLLs: Modeling and calibration of nonlinearity induced fractional spurs," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 9, pp. 2259–2268, Sep. 2010.
- [22] C. Samori, M. Zanuso, S. Levantino, and A. L. Lacaita, "Multipath adaptive cancellation of divider non-linearity in fractional-N PLLs," in *Proc. Int. Symp. Circuits Syst.*, 2011, pp. 418–421.
- [23] N. Da Dalt, "Linearized analysis of a digital bang-bang PLL and its validity limits applied to jitter transfer and jitter generation," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 55, no. 11, pp. 3663–3675, Nov. 2008.
- [24] M. Zanuso, D. Tasca, S. Levantino, A. Donadel, C. Samori, and A. L. Lacaita, "Noise analysis and minimization in bang-bang digital PLLs," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 56, no. 11, pp. 835–839, Nov. 2009.
- [25] T.-K. Kao, C.-F. Liang, H.-H. Chiu, and M. Ashburn, "A wideband fractional-N ring PLL with fractional-spur suppression using spectrally shaped segmentation," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 17–21, 2013, pp. 416–417.
- [26] S. Levantino, G. Marzin, C. Samori, and A. L. Lacaita, "A wideband fractional-N PLL with suppressed charge-pump noise and automatic loop filter calibration," *IEEE J. Solid-State Circuits*, vol. 48, no. 10, pp. 2419–2429, Oct. 2013.
- [27] N. Pavlovic and J. Bergervoet, "A 5.3 GHz digital-to-time-converter-based fractional-N all-digital PLL," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2011, pp. 54–56.
- [28] R. Nonis, W. Grollitsch, T. Santa, D. Cherniak, and N. Da Dalt, "digPLL-Lite: A low-complexity, low-jitter fractional-N digital PLL architecture," *IEEE J. Solid-State Circuits*, vol. 48, no. 12, pp. 3134–3145, Dec. 2013.
- [29] J. Yu, F. F. Dai, and R. C. Jaeger, "A 12-bit Vernier ring time-to-digital converter in 0.13 μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 830–842, Apr. 2010.

- [30] K. Kim, W. Yu, and S. Cho, "A 9 bit, 1.12 ps resolution 2.5 b/stage pipelined time-to-digital converter in 65 nm CMOS using time-register," *IEEE J. Solid-State Circuits*, vol. 49, no. 4, pp. 1009–1017, Apr. 2014.



**Salvatore Levantino** (S'99–M'02) received the Laurea degree (*cum laude*) and Ph.D. degree in electrical engineering from the Politecnico di Milano, Milan, Italy, in 1998 and 2001, respectively.

From 2000 to 2002, he was a Consultant with Bell Labs, Lucent Technologies, Murray Hill, NJ, USA. Since 2005, he has been an Assistant Professor and subsequently Associate Professor of electrical engineering with Politecnico di Milano, Milan, Italy. In past years, he has contributed to the understanding of phase noise generation mechanisms in oscillators and frequency dividers and to the development of new design methodologies for radio-frequency front-ends and frequency synthesizers. He is the coauthor of *Integrated Frequency Synthesizers for Wireless Systems* (Cambridge University Press, 2007). His current research includes wireless transceivers, frequency synthesizers, and data converters. He is a coauthor of approximately 80 papers and holds five patents.

Dr. Levantino served as an associate editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II (2012–2013). He is currently an associate editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I. He has been on the Technical Program Committee (since 2011) and on the Steering Committee (since 2014) for the IEEE Radio Frequency Integrated Circuits Symposium.



**Giovanni Marzin** (M'14) was born in 1983 in Latisana, Italy. He received the M.Sc. and Ph.D. degrees in electrical engineering from the Politecnico di Milano, Milan, Italy, in 2009 and 2013, respectively.

During his Ph.D. studies, he worked on all-digital RF transmitters for 4G applications. Since November 2014, he has been with Blue Danube Labs Inc., Warren, NJ, USA.

Dr. Marzin was the corecipient of the Dimitris N. Chorafas Foundation 2013 Award.



**Carlo Samori** (M'98–SM'08) was born in 1966. He received the Laurea degree in electrical engineering and the Ph.D. degree in electronics and communications from the Politecnico di Milano, Milan, Italy, in 1992 and 1995, respectively.

In 1996, he was appointed an Assistant Professor and, since 2002, he has been an Associate Professor of electrical engineering with Politecnico di Milano, Milan, Italy. He initially worked on high-speed low-noise front-end circuits for photodetectors, and then in the area of design and analysis of integrated circuits for communications in both bipolar and CMOS technology. Among other works, he contributed to a time-variant theory of phase noise generation in LC-tuned VCOs and collaborated on the design of several low-phase-noise VCOs in bipolar and CMOS technology. He has contributed to the design of fractional- and integer-N PLLs for multistandard WLAN applications. From 1997 to 2002, he was a Consultant with Bell Labs, Murray Hill, NJ, USA, and then with Agere Systems. Currently, his research interests are frequency synthesizers and data converters. He is the coauthor of approximately 100 papers in international journals and conferences. In 2007 he published, as a coauthor, the book *Integrated Frequency Synthesizers for Wireless Systems* (Cambridge University Press).

Dr. Samori is a member of the Technical Program Committee for the IEEE International Solid-State Circuits Conference and the European Solid-State Circuits Conference.