# Wireless Time Transfer With Subpicosecond Accuracy Based on a Fully Integrated Injection-Locked Picosecond Pulse Detector

Babak Jamali, Student Member, IEEE, and Aydin Babakhani, Member, IEEE

Abstract—An injection-locked picosecond pulse detector is implemented in 65-nm CMOS technology. An on-chip slot planar inverted cone antenna receives picosecond pulses with a center frequency of 77 GHz and feeds the signal to a low-noise amplifier. A three-stage injection-locked frequency divider is used to lock the output signal to the 9.6-GHz repetition rate with an effective locking range of 142 MHz and a timing jitter of 0.29 ps<sub>rms</sub>. The injection-locked detector is utilized in a wireless time transfer setup to demonstrate its application in widely spaced synchronized distributed arrays. This fully integrated system consumes 42 mW from a 1.3-V supply and occupies a total area of 0.9 mm<sup>2</sup>, including the on-chip antenna and the pads.

*Index Terms*—CMOS, coherent combining, impulse receiver, injection-locking, jitter, millimeter-wave, on-chip antenna, pulse, synchronization, wireless time transfer.

## I. INTRODUCTION

MILLIMETER-WAVE (mm-wave) pulses and wireless systems utilizing them have demonstrated their potential in key applications such as high-resolution imaging radars [1]. Prior to the development of CMOS-based mm-wave pulse systems, ultra-wideband (UWB) systems based on nanosecond pulses were explored extensively, mostly for their application in low-power communication systems [2]–[5] as well as in radars and localization systems [6]–[8]. The frequency-domain representation of a time-domain pulse with a duration of a few picoseconds is a spectrum with a wide bandwidth centered at mm-wave frequencies. Several methods have been introduced to radiate picosecond pulses from silicon-based integrated circuits. These methods include using nonlinear transmission lines [9], [10], switching high-frequency oscillators [11], [12], using multiple oscillators for waveform shaping [13], and current-switching direct digital-to-impulse radiators [14], [15].

Manuscript received May 2, 2019; revised July 17, 2019; accepted July 30, 2019. Date of publication August 22, 2019; date of current version January 13, 2020. This work was supported in part by the National Science Foundation under Grant ECCS-1642929. (Corresponding author: Babak Jamali.)

B. Jamali is with the Department of Electrical and Computer Engineering, University of California at Los Angeles, Los Angeles, CA 90095 USA, and also with the Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005 USA (e-mail: babakjamali@ucla.edu).

A. Babakhani is with the Department of Electrical and Computer Engineering, University of California at Los Angeles, Los Angeles, CA 90095 USA (e-mail: aydinbabakhani@ucla.edu).

Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TMTT.2019.2934452

An ideal impulse has a flat frequency spectrum that contains all frequencies. In practice, due to bandwidth limitations, a rectangular or Gaussian pulse may be generated with a certain cutoff frequency. The frequency-domain representation of such a pulse is centered at zero frequency but it can be shifted to high frequencies by implementing a switching or multiplication stage. For example, the pulses in [11] are produced by switching an oscillator and, therefore, multiplying a rectangle by a sinusoidal wave, which determines the center frequency of the spectrum. Monocycle pulses can be generated by differentiating a square-wave signal and, thus, multiplying its spectrum by  $j\omega$ , which shifts its center frequency to high frequencies and removes the dc component. In [14], a resonance frequency of the output stage sets the center frequency of the radiated picosecond pulses. Since a convolution operation in time domain is equivalent to multiplication in frequency domain, by repeating a single pulse with a fixed repetition rate  $(f_{rep})$ , the spectrum of a single pulse is sampled by an impulse train with the same rate. As a result, pulses with tunable repetition rates can generate precise frequency tones at intervals equal to the repetition rate over the bandwidth of

Systems based on ultrashort pulses exhibit several advantages over continuous-wave (CW) systems. For instance, an issue arising in time-varying wireless channels is phase instability of the received non-line-of-sight (NLOS) signals, which is more critical in CW links. To resolve this issue, an impulse-based synchronization link can be implemented. The narrow width of an impulse makes it possible to distinguish between the LOS and NLOS signals since they arrive at the receiver at different times. For instance, a receiver with two antennas at different locations can be used to differentiate the LOS signal from NLOS reflections [16]. Impulse-based systems could also facilitate novel applications in communication, imaging, and spectroscopy. For example, their tunable broadband spectra can be passed through chambers of trace gases to detect their absorption lines in the mm-wave/terahertz regime. In addition to broadband spectroscopy, high-frequency pulses can be used to achieve multi-Gb/s data rates in wireless communication [17], and they can be time-interleaved to increase the data rate up to terabits per second [18]. A high-resolution 3-D imaging radar is another example of performance enhancement using mm-wave signals [1]. Higher bandwidth improves the depth resolution in 3-D imaging but

0018-9480 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.



Fig. 1. Various pulse detection techniques using (a) high-speed ADC, (b) downconversion mixer with a high-frequency LO, (c) self-mixing, (d) direct injection-locking, and (e) multiple stages of ILFDs (this work).

larger apertures are required to improve the lateral resolution. On-chip arrays [19] and synthetic-aperture radars [20] have been presented to increase the effective aperture size, but they suffer from large size and long acquisition time, respectively. Wireless time transfer among widely spaced chips is presented in this article as a solution to build larger and more flexible apertures that are not limited to the chip size.

Wireless time transfer among multiple chips ensures synchronized operation in transmitting and receiving signals and reduces the acquisition time. Eliminating wires in the synchronization scheme facilitates the expansion of the array and increases flexibility in placing the array elements. Various interchip synchronization techniques have been investigated theoretically and experimentally [21]–[26]. Fig. 1 summarizes different techniques for pulse detection. An ultrahigh-speed ADC can sample and digitize the pulse waveform, but state-of-the-art ADCs are not fast enough to sample picosecond pulses [27]. Another detection method is to use mixers to downconvert high-frequency pulses to low-frequency pulses [11]. This method requires a synchronized high-frequency local oscillator (LO) signal, which does not meet our wireless synchronization objective. Two methods utilized in [26] and [28] are capable of wirelessly synchronizing the receiver by eliminating the need for an external LO or clock signal. These methods are based on using the received signal as the LO (self-mixing) and on pulse-based injection-locking. In this work, a multistage pulse-based injection-locking technique is used to reduce the clock jitter and achieve the best-reported accuracy in wireless time transfer. A fully integrated pulse detector is proposed for this purpose to receive a picosecond pulse train and recover its clock rate. The objective of this system is to recover the repetition rate (clock) of these pulses from the signal itself. Therefore, this detector is designed at the center frequency of the targeted pulses (80 GHz) to receive the frequency components of the pulses with the highest possible power. This solution allows the implementation of widely spaced synchronized arrays where multiple chips generate and detect picosecond pulses with tight synchronization, as shown in Fig. 2.
Wireless time transfer using high-frequency pulses is the first step toward implementing large-aperture imaging arrays with enhanced depth and angular resolutions.

In this article, a picosecond pulse detector based on an injection-locking scheme is presented as a solution for precision wireless time transfer. A proof-of-concept circuit, which was originally reported in [29], is fabricated using 65-nm bulk CMOS technology. This chip consists of an on-chip slot planar inverted cone antenna (PICA) to receive radiated signals. The received signal passes through a low-noise amplifier (LNA) centered at 80 GHz. Three stages of injection-locked frequency dividers (ILFDs) divide a high-power tone near the center frequency of the pulse by eight to produce the synchronized repetition rate at the output. In addition to the measurement results reported in [29], this article discusses pulse-based wireless time transfer and the requirements considered in system design. In-depth simulation results of the on-chip antenna and circuit blocks are presented in this article. New measurement results on the performance of the injection-locked divider and a demonstration of wireless inter-chip time transfer are also included in this article.

The remainder of the article is organized as follows. Section II describes the architecture of the pulse detector and the design details of the system and its building blocks. In Section III, three test setups are presented. This section reports the measurement results and characterizes the chip performance. Wireless time transfer with subpicosecond accuracy is demonstrated in this section, and picosecond pulse detection measurements are reported. Finally, Section IV concludes this article.

## II. INJECTION-LOCKED PULSE DETECTOR

## A. System Design

A pulse train with a repetition rate of $f_{\text{rep}}$ and a pulse duration of a few picoseconds has a broad frequency spectrum consisting of equally spaced tones with a spacing of $f_{\text{rep}}$. The center frequency and bandwidth of the spectrum depend on the duration and the shape of the pulse in the time domain. As the signal becomes more broadband, the ringing around the main pulses diminishes and the shape of the pulse gets closer to an ideal impulse with an infinite bandwidth. Fig. 2 summarizes the time-domain and frequency-domain shapes of the pulses and a wireless system composed of pulse-radiating and receiving nodes. Ideally, pulses with the smallest possible duration are preferred due to their larger bandwidth, which results in an imaging system with better depth resolution. Current silicon-based integrated circuit technologies enable generation and radiation of pulses as short as 2 ps [14], and the design of this receiver is targeted at detecting pulses radiated by the chip that was reported in [14].
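The comb structure can be made explicit with a short derivation. The following is a standard Fourier-analysis sketch rather than a result taken from the article; $p(t)$ and $P(f)$ are notation introduced here for a single pulse and its Fourier transform, and $T = 1/f_{\text{rep}}$ is the repetition period.

$$x(t) = \sum_{k=-\infty}^{\infty} p(t - kT) \quad\Longleftrightarrow\quad X(f) = f_{\text{rep}} \sum_{n=-\infty}^{\infty} P(n f_{\text{rep}})\,\delta(f - n f_{\text{rep}})$$

The tones therefore sit at integer multiples of $f_{\text{rep}}$, and the single-pulse spectrum $P(f)$ sets their envelope, so a shorter pulse spreads usable tone power over a wider band.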



Fig. 2. Master and slave nodes of a widely spaced distributed array synchronized with picosecond pulses and the architecture of the receiver used for pulse-based wireless time transfer.

To detect the mm-wave pulses radiated by the silicon chip in [14], an ILFD is presented. Since the frequency tones in the spectrum of radiated monocycle pulses are integer harmonics of the repetition rate ($f_{\text{rep}}$), they can be used to extract $f_{\text{rep}}$ with low jitter. Conventional ultrashort pulse systems based on optical lasers are limited to repetition rates below the gigahertz level [30]. The electronic method proposed in [14] enables generation and radiation of picosecond pulses with repetition rates of up to 10 GHz. Since higher clock frequencies result in shorter acquisition times, the current system is designed for the highest feasible repetition rate so that the entire pulse-based wireless synchronization system is able to operate with a 10-GHz clock rate. Lower clock frequencies increase the acquisition time of the imaging system, which becomes more critical in scanning a large object that requires numerous measurements. Higher clock frequencies are not feasible due to the bandwidth limit of the bondwire, which transfers the synchronized clock from the output of the chip to the circuit board. Therefore, the clock frequency is selected to be close to 10 GHz and is extracted from the eighth harmonic of the frequency comb by implementing a divide-by-eight frequency divider. It should be noted that, due to the frequency response of the pulse radiator and the receiver on-chip antenna, the lowest harmonics have very low power and cannot be used to extract the clock rate. The 80-GHz tone is close to the center frequency of the received pulses and has sufficient power for wireless time transfer.
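The choice of harmonic and division ratio described above can be sketched numerically. This is a minimal illustration under the stated 10-GHz repetition rate and 80-GHz tone, not the authors' design flow; the function names `nearest_harmonic` and `divide_by_two_stages` are hypothetical.

```python
def nearest_harmonic(center_hz: float, f_rep_hz: float) -> int:
    """Index n of the comb tone n * f_rep closest to the pulse center frequency."""
    return max(1, round(center_hz / f_rep_hz))

def divide_by_two_stages(ratio: int) -> int:
    """Number of cascaded divide-by-two stages needed for a power-of-two ratio."""
    if ratio < 1 or ratio & (ratio - 1):
        raise ValueError("ratio must be a power of two")
    return ratio.bit_length() - 1

f_rep = 10e9     # repetition rate, from the text
f_center = 80e9  # tone near the pulse center frequency, from the text

n = nearest_harmonic(f_center, f_rep)
stages = divide_by_two_stages(n)
print(f"harmonic index = {n}, divide-by-two stages = {stages}")
```

With these inputs the selected tone is the eighth harmonic, which a cascade of three divide-by-two stages brings back to the repetition rate.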

An on-chip slot PICA with a back-side silicon lens is implemented to capture mm-wave pulses with high efficiency. The received signal is fed to an LNA centered at 80 GHz through a coplanar waveguide. Three stages of divide-by-two ILFDs are implemented after the LNA. This architecture ensures the functionality of the divider with lower input powers, in comparison with a single-stage divide-by-eight divider. A matching circuit delivers the extracted clock to the output. This output clock, which is locked to the input clock of the pulse radiator, can be used to operate another pulse radiator in tight synchronization with the master pulse radiator. It can also be used to operate a high-speed sampling circuit to

TABLE I
ASSUMPTIONS FOR LINK BUDGET CALCULATIONS

| Parameter                 | Value   |
|---------------------------|---------|
| EIRP<sub>t</sub>          | -30 dBm |
| Wavelength                | 4 mm    |
| RX Antenna Gain           | 9.5 dBi |
| LNA Noise Figure          | 7 dB    |
| LNA Gain                  | 20 dB   |
| Divider Input Sensitivity | -50 dBm |
| Input Bandwidth           | 1 GHz   |
| Clock Frequency           | 10 GHz  |
| Clock Frequency Range     | 125 MHz |

recover the waveform of the received pulses for applications such as spectroscopy.

To develop specifications for the building blocks of this receiver, link budget calculations are performed using the values in Table I. The receiver is designed to detect picosecond pulses radiated from the silicon-based transmitter presented in [14], which has an EIRP ($P_t + G_t$) of -30 dBm at 75–80 GHz. The detector is intended to detect pulses with a repetition rate of 10 GHz, which is the highest end of the clock range in [14]. It is assumed that the center frequency of received pulses lies at 75 GHz, so this frequency is chosen as the center frequency of the input stages. An on-chip planar inverted-cone antenna, discussed in Section II-B, is implemented for pulse reception. For the following link budget calculation, the gain of this antenna is assumed to be 9.5 dBi, which is based on the antenna simulations.

Using the EIRP value of the transmitter chip, simulated gain of the receiver antenna, and the path loss of the operating wavelength, the relationship between the received power and the distance can be derived from the Friis transmission equation

$$P_{\rm r} = P_{\rm t} + G_{\rm t} + G_{\rm r} + 20\log\left(\frac{\lambda}{4\pi\,R}\right) \tag{1}$$



Fig. 3. Illustration of a widely spaced array made of multiple receiver chips in order to estimate the maximum number of array elements.

where $P_{\rm r}$, $P_{\rm t}$, $G_{\rm t}$, $G_{\rm r}$, $\lambda$, and $R$ denote received power, transmitted power, TX antenna gain, RX antenna gain, wavelength, and distance, respectively. Hence, with the values in Table I, the estimated received power at the receiver antenna port is equal to $-90.5~\text{dBm} - 20\log R$, with $R$ in meters.
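As a quick numerical check of (1), the sketch below evaluates the Friis equation in decibel form with the Table I values. It is an illustrative calculation only; the function name `received_power_dbm` is hypothetical.

```python
import math

def received_power_dbm(eirp_dbm, rx_gain_dbi, wavelength_m, distance_m):
    """Friis equation (1) in decibel form, with EIRP = P_t + G_t."""
    path_term_db = 20 * math.log10(wavelength_m / (4 * math.pi * distance_m))
    return eirp_dbm + rx_gain_dbi + path_term_db

# Table I values
EIRP_DBM = -30.0
RX_GAIN_DBI = 9.5
WAVELENGTH_M = 4e-3

p_1m = received_power_dbm(EIRP_DBM, RX_GAIN_DBI, WAVELENGTH_M, 1.0)
p_10cm = received_power_dbm(EIRP_DBM, RX_GAIN_DBI, WAVELENGTH_M, 0.1)
print(f"{p_1m:.1f} dBm at 1 m, {p_10cm:.1f} dBm at 10 cm")
```

The value at 1 m is the distance-independent constant of the link budget (about -90.4 dBm, in line with the rounded -90.5 dBm above), and every tenfold reduction in distance adds 20 dB.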

Assuming that the frequency divider chain requires a minimum input power of -50 dBm to lock to the received signal, the desired sensitivity of the receiver, considering an LNA gain of 20 dB, would be -70 dBm. Based on the Friis calculation results, this sensitivity translates to a maximum TX/RX distance of 10 cm. This distance can be increased by using signal sources with higher radiated power levels. The signal-to-noise ratio (SNR) at the input of the divider chain can be calculated from the following equation:

$$P_{\rm r} = -174 \text{ dBm/Hz} + \text{NF} + 10 \log \text{BW} + \text{SNR}_{\text{min}} \quad (2)$$

where  $P_r$ , NF, and BW represent the received power, the LNA noise figure, and the input bandwidth, respectively. Considering a 1-GHz effective input locking range, which gives us 125-MHz clock frequency range at the output of the injection-locked divider, minimum received power of -70 dBm, and a typical LNA noise figure of 7 dB, the SNR at the divider input is estimated as 7 dB. A summary of the assumed parameters for link budget calculations is tabulated in Table I.
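The arithmetic behind the 7-dB estimate can be reproduced by solving (2) for the SNR. The sketch below uses the Table I values; the function name `snr_db` is hypothetical.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0  # thermal noise density used in (2)

def snr_db(received_dbm, noise_figure_db, bandwidth_hz):
    """Solve (2) for the SNR available at a given received power."""
    noise_floor_dbm = (THERMAL_NOISE_DBM_PER_HZ + noise_figure_db
                       + 10 * math.log10(bandwidth_hz))
    return received_dbm - noise_floor_dbm

input_bandwidth_hz = 1e9                     # effective input locking range
output_range_hz = input_bandwidth_hz / 8     # after the divide-by-eight chain

snr = snr_db(received_dbm=-70.0, noise_figure_db=7.0, bandwidth_hz=input_bandwidth_hz)
print(f"SNR = {snr:.1f} dB, output clock range = {output_range_hz / 1e6:.0f} MHz")
```

The 1-GHz bandwidth contributes 90 dB, so the noise floor referred to the input is -77 dBm and a -70-dBm signal sits 7 dB above it.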

In implementing a wirelessly synchronized array, the maximum aperture size of an array, such as Fig. 3, is determined by the distance between the transmitter and the array (D), the beamwidth of the receiving antennas ( $\theta$ ), and the sensitivity of the receivers ( $P_{\rm sen}$ ). For a distance of D=10 cm and f=80 GHz, the received power at the receiver antenna port is estimated to be -70.5 dBm from the Friis transmission equation. Therefore, the furthest element of the array should have a realized gain of at least +0.5 dBi in order to detect the received signal. As a result, based on the antenna gain simulations in Section II-B, the maximum angle that the furthest element can be placed to achieve a +0.5-dBi gain is  $34^{\circ}$ . This means that for D=10 cm, the maximum aperture size of the array is  $13.5 \times 13.5$  cm<sup>2</sup>, and for an element spacing of 6.75 cm with the current chip specifications, a total number of nine elements



Fig. 4. Slot PICA and the chip assembly with silicon wafer and silicon lens.



Fig. 5. (a) Simulated impedance and (b) simulated VSWR of the on-chip antenna.

can be synchronized in this array, which would consume 378-mW dc power. Larger synchronized arrays can be implemented by increasing the radiated power from the transmitter.
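The aperture and element-count figures can be reproduced with simple geometry. The sketch below assumes that the edge elements of a square array sit at the maximum angle off boresight, so the side length is $2D\tan\theta$; this assumption reproduces the 13.5-cm figure but is not stated explicitly in the text. The function names are hypothetical, and the element count rounds to the nearest whole spacing because the quoted aperture size is itself rounded.

```python
import math

def aperture_side_cm(distance_cm, max_angle_deg):
    """Side of a square aperture whose edge elements sit at max_angle_deg off boresight."""
    return 2 * distance_cm * math.tan(math.radians(max_angle_deg))

def element_count(side_cm, spacing_cm):
    """Elements in a square grid spanning side_cm, rounded to the nearest whole spacing."""
    per_side = round(side_cm / spacing_cm) + 1
    return per_side ** 2

side = aperture_side_cm(distance_cm=10.0, max_angle_deg=34.0)
n_elements = element_count(side, spacing_cm=6.75)
total_power_mw = n_elements * 42.0  # 42 mW per chip, from the abstract

print(f"side = {side:.1f} cm, elements = {n_elements}, dc power = {total_power_mw:.0f} mW")
```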

## B. On-Chip Antenna

A slot PICA has been studied as a wideband antenna for UWB applications [31]. A mm-wave on-chip version of this antenna, shown in Fig. 4, is implemented in this work. The slot PICA consists of a leaf-shaped monopole structure implemented inside a larger slot with the same shape and a ground plane surrounding them. The shape of this structure is designed to achieve wideband operation through modes above its first resonance, which makes it suitable for broadband usage. The structure and dimensions of this on-chip antenna are illustrated in Fig. 4. The lengths of the metallic piece and the slot are selected as 250 and 400 $\mu$m, respectively, to ensure sufficient efficiency at 80 GHz and satisfy the area restrictions.

The substrate modes propagated in the p-doped 250-$\mu$m-wide silicon substrate of the chip reduce the radiation efficiency of the antenna. One solution to increase the efficiency of the antenna in such conditions is to use a silicon lens to radiate the signal from the back of the chip with higher efficiency [32]. In this work, an undoped silicon wafer and a silicon lens with a diameter of 12 mm are mounted on the backside of the chip to increase the radiation efficiency by reducing the substrate modes.

This antenna structure is simulated in CST Studio Suite using a finite integration technique solver [33]. Fig. 5 shows the simulated impedance and VSWR of the antenna. The simulated radiation efficiency and total efficiency of the antenna are plotted in Fig. 6, in which the mismatch losses reduce the total efficiency of the antenna. It is shown that the antenna operates in the 80–100-GHz range with sufficient efficiency and small reactance at the port. Fluctuations in the simulated impedance are due to the lens-air boundary in simulations. Reflections caused by the different dielectric constants of the lens and air at the edge of the lens can create a rippled impedance pattern over frequency. A matching layer between the lens and the air



Fig. 6. Simulated (a) radiation efficiency and (b) total efficiency of the on-chip antenna for different frequencies.



Fig. 7. Simulated (a) *E*-plane realized gain, (b) *H*-plane realized gain, and (c) directivity radiation pattern of the on-chip antenna at 80 GHz.



Fig. 8. Circuit schematic of the 80-GHz LNA.

can reduce this effect and increase the radiated power [34]. The simulated realized gain and 3-D radiation pattern of the antenna directivity at 80 GHz are reported in Fig. 7. When it is radiating from the backside ($\theta = 180°$), the antenna has a maximum directivity of +13.3 dBi and a realized gain of +6 dBi at 80 GHz. Based on Fig. 7, the simulated 3-dB beamwidth of the antenna is 46°, which is suitable for a distributed wireless time transfer array in which multiple chips need to receive a reference pulse from a master node even if they are not placed directly in front of the transmitter.

## C. LNA

The schematic of the designed LNA is presented in Fig. 8. Three common-source amplifier stages with optimized W/L sizes and transmission line loading amplify the received signal around the center frequency with a small noise figure. Shielded microstrip lines with widths of 3 $\mu$m and spacing of 6 $\mu$m are used. The circuit has been simulated with gate bias voltages of 1 V, and the gain and noise figure results are reported in Fig. 9. The amplifier achieves a minimum reflection coefficient of -5.5 dB at 75 GHz and demonstrates a maximum voltage gain of 15 dB and a minimum noise figure of 7.1 dB. All three stages of this amplifier draw a total dc current of 23 mA in measurements.



Fig. 9. Simulated performance of the LNA. (a) Voltage gain with antenna as the input source. (b) Reflection coefficient. (c) Noise figure.

## D. Three-Stage ILFD

Pulse-driven injection-locking has been implemented for UWB pulses with megahertz repetition rates in [28], where numerous tones injection-lock a voltage-controlled oscillator (VCO). Fig. 10 illustrates the schematic of the three-stage ILFD that is used to divide one high-power tone of the mm-wave pulse with a 10 GHz $\pm f_{LR}$ repetition rate by eight to extract the repetition rate. A divide-by-eight frequency divider can be realized with a CML architecture [35], but the injection frequency in these architectures is limited to frequencies below the mm-wave range. In this work, cross-coupled VCOs are designed as ILFDs since the operating frequency can be scaled up to mm-wave frequencies. Each of the three VCOs divides its input frequency by two if it is within the locking range. The first VCO is centered at 40 GHz using 430-μm transmission lines to tune the frequency. The other two VCOs are centered at 20 and 10 GHz using on-chip symmetric inductors as their loads. MOS-based varactors are used in each of the VCOs to fine-tune the oscillation frequencies. The tail currents of the cross-coupled oscillators are biased at 3.3, 4.3, and 4.3 mA, respectively.

As discussed in [36], the input locking range in a cross-coupled oscillator is calculated with some approximation by

$$\omega_{\rm L} = \frac{\omega_0}{2Q} \cdot \frac{K \cdot I_{\rm inj}}{I_{\rm osc}} \tag{3}$$

where  $\omega_L$ ,  $\omega_0$ , Q,  $I_{\rm inj}$ , and  $I_{\rm osc}$  are the locking range, resonance frequency, quality factor, injected current to the tail, and oscillation current of the cross-coupled oscillator, respectively. Nonlinearity of the two cross-coupled transistors forms a mixer that converts the injected frequency to the oscillation frequency. K denotes the conversion gain of the mixer and it determines the minimum required  $I_{\rm inj}$  to achieve a certain  $\omega_L$  in the cross-coupled VCO. K is equal to  $2/\pi$  when  $\omega_{\rm inj} = 2 \cdot \omega_0$  assuming that the transistors are instantly switched. When  $\omega_{\rm inj} = 8 \cdot \omega_0$ , this value becomes smaller as the conversion gain gets smaller since we are relying on the eighth-order nonlinearity of the cross-coupled transistors. A mathematical analysis of ILFDs and detailed derivation of their locking range were done in [37], which confirms our conclusion that a smaller



Fig. 10. Circuit schematic of the three-stage ILFD.



Fig. 11. Simulated sensitivity of the 10-GHz cross-coupled VCO when it is used as divide-by-2 and divide-by-8 ILFDs.



Fig. 12. Simulated locking bandwidths of (a) designed 40-GHz ILFD and (b) 20-GHz ILFD.

injected power is required to lock a divide-by-two ILFD than a divide-by-eight ILFD with the same architecture. To verify this claim, the 10-GHz cross-coupled oscillator shown in Fig. 10 is simulated both as a divide-by-two and a divide-by-eight ILFD. The minimum required power to lock the VCO in these two cases is plotted and compared in Fig. 11, which confirms that larger power is required to lock the output signal when the structure is used as a divide-by-eight frequency divider. For example, an 83.84-GHz input signal needs to be 32 dB larger than a 20.96-GHz input signal in order to lock the VCO output to 10.48 GHz. Therefore, three stages of divide-by-two cross-coupled ILFDs were implemented in this system to reduce the required input power.
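To make the dependence in (3) concrete, the sketch below evaluates the locking range in hertz for a divide-by-two stage with $K = 2/\pi$. The quality factor and the injected current are hypothetical values chosen only for illustration, and reusing the 4.3-mA tail current as the oscillation current is likewise an assumption; none of these come from the article's simulations.

```python
import math

def locking_range_hz(f0_hz, q, k, i_inj_a, i_osc_a):
    """Approximate locking range from (3), expressed in hertz instead of rad/s."""
    return f0_hz / (2 * q) * k * i_inj_a / i_osc_a

K_DIV2 = 2 / math.pi  # ideal-switching conversion gain for divide-by-two, from the text

# Hypothetical values for illustration only (not from the article)
Q_TANK = 10.0
I_OSC_A = 4.3e-3   # assumes the oscillation current equals the 4.3-mA tail current
I_INJ_A = 0.43e-3  # assumed injected current, 10% of I_OSC_A

f_lock = locking_range_hz(10e9, Q_TANK, K_DIV2, I_INJ_A, I_OSC_A)
f_lock_half_k = locking_range_hz(10e9, Q_TANK, K_DIV2 / 2, I_INJ_A, I_OSC_A)
print(f"locking range ~ {f_lock / 1e6:.1f} MHz, with K halved ~ {f_lock_half_k / 1e6:.1f} MHz")
```

Because the locking range scales linearly with $K \cdot I_{\rm inj}$, any reduction in conversion gain at a higher division order has to be compensated by a proportionally larger injected current to keep the same range.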

Simulated sensitivity results of 40- and 20-GHz ILFD blocks for a fixed set of tuning voltages are reported in Fig. 12. The main required condition to operate this system is to have sufficient power at the input in order to lock the 40-GHz VCO. Due to the large output power of the 40- and 20-GHz VCOs in comparison with the input signal, the locking requirements for the next two VCOs are easily satisfied.



Fig. 13. Chip micrograph of the fabricated injection-locked pulse receiver.

The desired clock frequency range can be achieved by tuning the VCOs simultaneously. The free-running frequencies of the three VCOs are designed to be higher than 40, 20, and 10 GHz, respectively, to maintain a margin for possible resonance frequency reduction due to parasitic effects.

## III. MEASUREMENT RESULTS

The injection-locked pulse detector is fabricated in the GlobalFoundries 65-nm CMOS low-power enhanced (LPe) process, and the die photograph is shown in Fig. 13. The total chip area is 0.9 mm<sup>2</sup>, and the active area of the circuitry is 0.46 mm<sup>2</sup>. A silicon lens with a diameter of 12 mm is attached to the backside of the circuit board and centered at the chip to enhance the radiation efficiency, with an undoped silicon wafer placed in between. The pads on the chip are bonded to a PCB with a thickness of 250 µm and Rogers 4350B as the dielectric material. The dc power consumption of the receiver chip is 42.6 mW. A detailed breakdown of the power consumption of this receiver is illustrated in Fig. 14. In this section, three test setups are used to characterize the performance of the chip, demonstrate its operation in a wirelessly synchronized distributed array, and detect a radiated pulse train with picosecond pulsewidth.

## A. Receiver Characterizations

In order to characterize the performance of the receiver, the chip is tested with a CW signal source, as shown in Fig. 15. An Active Multiplier Chain (AMC) in the W-band generates CW signals at the receiver frequency range by multiplying the PSG output frequency by six. The multiplier output, with a typical power of +11 dBm, is radiated by a standard



Fig. 14. Power consumption breakdown of the receiver.



Fig. 15. CW test setup for characterizing the receiver performance.



Fig. 16. Effect of tuning  $V_2$  and  $V_3$  bias voltages on the measured output frequency of the receiver.

pyramidal horn antenna with a gain of +24 dBi. The transmitter power value is based on the datasheet of the Millitech multiplier and it cannot be tuned. Based on the Friis transmission equation, the transmitter parameters, and a simulated receiver antenna gain of +9 dBi, the received power of a 76-GHz signal at a 10-cm distance is 250 $\mu$W. The output of the receiver chip is inspected by a Keysight signal analyzer.
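The 250-$\mu$W figure can be checked by evaluating (1) for the CW setup and converting the result from dBm to microwatts. The sketch below is an illustrative calculation with the values quoted in this paragraph; the function names are hypothetical.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light

def friis_received_dbm(p_tx_dbm, g_tx_dbi, g_rx_dbi, freq_hz, distance_m):
    """Friis equation (1), with the wavelength derived from the carrier frequency."""
    wavelength_m = C_M_PER_S / freq_hz
    path_term_db = 20 * math.log10(wavelength_m / (4 * math.pi * distance_m))
    return p_tx_dbm + g_tx_dbi + g_rx_dbi + path_term_db

def dbm_to_microwatts(p_dbm):
    """Convert a power level in dBm to microwatts."""
    return 10 ** (p_dbm / 10) * 1e3

# +11-dBm multiplier output, +24-dBi horn, +9-dBi receiver antenna, 76 GHz, 10 cm
p_dbm = friis_received_dbm(11.0, 24.0, 9.0, 76e9, 0.10)
p_uw = dbm_to_microwatts(p_dbm)
print(f"received power = {p_dbm:.1f} dBm = {p_uw:.0f} uW")
```

The result is close to -6 dBm, which agrees with the quoted value of roughly 250 $\mu$W.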

The free-running output of the receiver is measured when no signal is being radiated from the source. The output frequency of the 10-GHz VCO and its variations due to two tuning voltages are plotted in Fig. 16. This plot shows that when the output of the second-stage (20-GHz) VCO is within the locking range of the 10-GHz VCO and the last oscillator stage is locked to the previous one, changing $V_3$ does not affect the output frequency. However, if $V_2$ and $V_3$ change substantially so that the last stage is no longer locked to the previous one, $V_3$ becomes the only tuning voltage that affects the output frequency. The rest of the measurements in this section are done when all the oscillators are internally locked to each other, and the radiated signal is used to externally lock the receiver. With the output frequency tuned at 9.5 GHz, the input center frequency to injection-lock the system is 76 GHz. This setting is used to measure the maximum source/detector distance for radiated frequencies around 76 GHz, and the results are shown in Fig. 17. Fig. 18 demonstrates the input



Fig. 17. Maximum measurement distance to lock the receiver for different received frequencies.



Fig. 18. Measured locking range of the receiver input for different TX-RX distances.

locking range of the receiver for different distances between the source and the detector. When the tuning voltages are constant, the locking range of the system is small and is limited to the overlap of the locking ranges of all three VCOs. For several reasons, the overall locking range of the system with fixed tuning voltages is smaller than expected. PVT variation effects, among other issues, change the center frequency and locking range of the VCOs and reduce the overall locking range by reducing the overlap of these three ranges. Other reasons for changes in the VCO frequencies include inaccuracies in the models of the inductors and transmission line bends used in the oscillator tanks. The locking bandwidths and sensitivities of the VCOs could be altered by parasitic effects, which modify the quality factor of the oscillator tanks. Unexpected attenuation in the received power could also influence the overall locking range for fixed tuning voltages. However, by adjusting the tuning voltages, the receiver is able to lock to a wider range of frequencies, as shown in Fig. 19. This effective locking range depends on the tuning ranges of the VCOs and their locking ranges. For each measured point in Fig. 19, $V_1$, $V_2$, and $V_3$ are set in a way that the output frequency is locked to an eighth of the input frequency. Based on the results in Fig. 19, the effective input locking range of the receiver is 1.13 GHz, which means that the output locking range is 142 MHz around 9.6 GHz.
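The statement that the fixed-voltage locking range is limited to the overlap of the three VCO ranges can be illustrated with a small interval calculation. In the sketch below, each stage's output-frequency locking window is referred to the chip input by its cumulative division ratio, and the windows are intersected. The window values are hypothetical and chosen only to show the mechanism; the model also ignores the dependence of each window on the injected power.

```python
def input_referred_window(output_window_hz, total_division):
    """Map a stage's output-frequency locking window back to the chip input."""
    lo, hi = output_window_hz
    return lo * total_division, hi * total_division

def overall_locking_window(stages):
    """Intersect the input-referred windows of all stages; None if they do not overlap."""
    windows = [input_referred_window(w, n) for w, n in stages]
    lo = max(w[0] for w in windows)
    hi = min(w[1] for w in windows)
    return (lo, hi) if lo < hi else None

# Hypothetical per-stage output locking windows (Hz) and cumulative division ratios
stages = [
    ((37.90e9, 38.20e9), 2),  # 40-GHz stage
    ((18.95e9, 19.08e9), 4),  # 20-GHz stage
    ((9.49e9, 9.52e9), 8),    # 10-GHz stage
]

window = overall_locking_window(stages)
if window is not None:
    width_mhz = (window[1] - window[0]) / 1e6
    print(f"input window: {window[0] / 1e9:.2f} to {window[1] / 1e9:.2f} GHz ({width_mhz:.0f} MHz)")
```

In this example the narrowest, last stage sets both edges of the overall window; shifting any stage's center frequency, as PVT variation would, shrinks or removes the overlap, which is why retuning $V_1$, $V_2$, and $V_3$ recovers a wider effective range.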

## B. Wireless Interchip Time Transfer

As discussed in Section I, one of the key applications of the presented injection-locked pulse detector



Fig. 19. Measured effective locking range of the receiver input by adjusting the tuning voltages [29].



Fig. 20. Test setup to demonstrate high-accuracy wireless time transfer among widely spaced chips.

is in large-aperture distributed arrays where multiple widely spaced elements need to be synchronized with a reference signal with minimum jitter. The test setup in Fig. 20 is used to demonstrate such an application in a two-element prototype of this system. The same signal source configuration as in Section III-A radiates a 76-GHz signal to two similar injection-locked receiver chips. Two mm-wave lenses made of polyethylene are placed to focus the radiated beam on the receiver chips. The outputs of these two chips are fed to a power combiner (Mini-Circuits ZFRSC-183-S+), and the output of the power combiner is analyzed with a Keysight N9030A signal analyzer. The measurement results of this wireless time transfer test are reported in Fig. 21. Fig. 21(a) and (b) are obtained when only one of the receiver chips is turned on, and they show the output of each chip without combining it with the other one. When both chips are turned on, the spectrum in Fig. 21(c) is measured, which is the coherent combination of the two receiver outputs. The measured phase noise of this combined signal is plotted in Fig. 21(d), which indicates a phase noise of -109 dBc/Hz at 1-MHz frequency offset, verifying the accuracy of the wireless time transfer scheme.
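To give a sense of what subpicosecond timing accuracy means for coherent combining, the sketch below converts a timing error into a phase error at the clock frequency and evaluates the combined power of two equal-amplitude tones with that phase offset. It uses the 0.29-ps rms jitter and 9.6-GHz rate quoted in the abstract purely as illustrative inputs and treats the rms jitter as a static offset, which is a simplifying assumption rather than the article's measurement method. The function names are hypothetical.

```python
import math

def jitter_to_phase_deg(jitter_s, clock_hz):
    """Phase error in degrees corresponding to a timing error at clock_hz."""
    return 360.0 * clock_hz * jitter_s

def combining_efficiency(phase_error_deg):
    """Combined power of two equal-amplitude tones relative to perfect alignment."""
    return math.cos(math.radians(phase_error_deg) / 2) ** 2

phase_deg = jitter_to_phase_deg(jitter_s=0.29e-12, clock_hz=9.6e9)
efficiency = combining_efficiency(phase_deg)
print(f"phase error = {phase_deg:.2f} deg, combining efficiency = {efficiency:.5f}")
```

A timing error of this size corresponds to about one degree of phase at the clock frequency, so the loss relative to ideal coherent combining is negligible under these assumptions.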

## C. Picosecond Pulse Detection

Finally, the injection-locked pulse detector is tested with a picosecond pulse-radiating silicon chip, which was reported in [14]. In this test setup, shown in Fig. 22, a Keysight RF signal source provides the repetition rate to the pulse generator input. The receiver chip is placed in front of the pulse radiator at an adjustable distance, and its output is measured by a signal analyzer as well as a sampling oscilloscope. In the measurement results, presented in Fig. 23, the spectrum of the receiver output is measured with and without radiated picosecond pulses. Fig. 23(a) shows that the radiated picosecond pulses, with a measured EIRP of -35 dBm at 80 GHz, can successfully lock the receiver output. The locking behavior can also be observed in Fig. 23(b), where the measured output of the receiver has a lower phase noise in the presence of radiated picosecond pulses, shown in red, than when no input signal is injected into the receiver, shown in green. The time-domain waveform of the receiver output and its measured rms jitter are shown in Fig. 23(c) and (d), respectively. The mean jitter of the output with 64 averaged waveforms is 0.29 ps<sub>rms</sub>, which is only 30 fs larger than the jitter of the radiating source input signal measured under the same conditions. The added jitter is caused by the combination of the transmitter chip [14] and the receiver chip, which verifies the accuracy of the proposed picosecond-pulse-based wireless system for clock synchronization.

Fig. 21. Measured results of the wireless time transfer test. (a) Spectrum of the first receiver output. (b) Spectrum of the second receiver output. (c) Spectrum of the power combiner output. (d) Phase noise of the combined signal showing the accuracy of the coherently combined clock.

Fig. 22. Test setup for detection of picosecond pulses radiated from a silicon chip.

Fig. 23. Measured (a) spectrum, (b) phase noise, (c) time-domain waveform, and (d) rms jitter of the receiver output in the picosecond pulse test [29].

TABLE II
PERFORMANCE COMPARISON OF WIRELESS SYNCHRONIZATION RECEIVERS

|                                        | This Work                    | [22]                              | [24]                | [28]          | [26]                         |
|----------------------------------------|------------------------------|-----------------------------------|---------------------|---------------|------------------------------|
| Technology                             | 65-nm CMOS                   | 0.13-μm CMOS                      | 0.18-μm CMOS        | 90-nm CMOS    | 0.13-μm SiGe BiCMOS          |
| Wireless Source                        | Picosecond Pulse (80 GHz)    | Continuous wave (17–18.7 GHz)     | Pulse (0.2–7 GHz)   | Pulse (4 GHz) | Picosecond Pulse (50 GHz)    |
| Output Jitter (ps)                     | 0.29 (rms)                   | 5 (pk–pk)                         | 4.6 (rms)           | 7.6 (rms)     | 0.38 (rms)                   |
| Clock Frequency (GHz)                  | 9.6                          | 2.25                              | 3                   | 0.5           | 3.1                          |
| Clock Frequency Range (GHz)            | 0.14                         | 0.202                             | Not Tunable         | 0.07          | 8                            |
| Multi-Chip Wireless Synchronization    | Yes                          | No                                | No                  | No            | Yes                          |
| On-chip Antenna                        | Yes                          | Yes                               | Yes                 | No            | Yes                          |
| Die Area (mm<sup>2</sup>)              | 0.46                         | 1.14 (with antenna, without pads) | 2.34 (with antenna) | 2 (with pads) | 1.89 (with pads and antenna) |
| Power Consumption (mW)                 | 42                           | 80 (RX)                           | 43                  | 45            | 146                          |
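The jitter figures quoted for the picosecond pulse test can be put in perspective with a small Python sketch. The text reports only the arithmetic difference of 30 fs between the output and source jitter. If one additionally assumes that the source jitter and the jitter added by the wireless link are uncorrelated and add in quadrature (an assumption that is not stated in the measurements), the implied added jitter can be estimated as follows.

```python
import math


def source_jitter_ps(output_rms_ps: float, excess_ps: float) -> float:
    """Source jitter implied by the output jitter and the reported excess."""
    return output_rms_ps - excess_ps


def added_jitter_rss_ps(output_rms_ps: float, source_rms_ps: float) -> float:
    """Added rms jitter, assuming uncorrelated contributions (root-sum-square)."""
    if output_rms_ps < source_rms_ps:
        raise ValueError("output jitter must not be smaller than source jitter")
    return math.sqrt(output_rms_ps ** 2 - source_rms_ps ** 2)


if __name__ == "__main__":
    output_ps = 0.29   # measured output jitter from the text
    excess_ps = 0.03   # 30 fs above the source jitter, from the text
    src_ps = source_jitter_ps(output_ps, excess_ps)
    added_ps = added_jitter_rss_ps(output_ps, src_ps)
    print(f"source jitter: {src_ps:.2f} ps rms")
    print(f"added jitter (uncorrelated assumption): {added_ps:.3f} ps rms")
```

Under the uncorrelated assumption, the transmitter and receiver chips together would contribute roughly 0.13 ps<sub>rms</sub>, which remains below the source jitter itself.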

The performance of this injection-locked pulse detector is compared with other silicon-based receivers used for wireless time transfer in Table II. This work offers a new fully integrated solution based on a mm-wave pulse detector that consumes the smallest dc power among the references. The short pulsewidth of the time transfer signal has the potential to be used in imaging systems with enhanced depth resolution, while the multi-chip wireless time transfer capability can potentially increase the effective aperture size and, thus, the lateral resolution of such systems. The repetition rate of the detected pulses can vary over a 140-MHz range, which indicates the frequency range of the wirelessly transferred clock. The small jitter and phase noise of the receiver output, in comparison with the references, verify the accuracy of the synchronization scheme among multiple elements in an array.
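The resolution argument can be made concrete with the standard first-order radar relations: depth resolution of about c·τ/2 for a pulse of width τ, and lateral resolution of about λR/D for an aperture of size D at range R. The Python sketch below applies them; the pulse width, range, and aperture sizes are illustrative assumptions and are not taken from the measurements.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def wavelength_m(freq_hz: float) -> float:
    """Free-space wavelength of a carrier at freq_hz."""
    return SPEED_OF_LIGHT / freq_hz


def depth_resolution_m(pulse_width_s: float) -> float:
    """First-order depth (range) resolution of a pulse of the given width."""
    return SPEED_OF_LIGHT * pulse_width_s / 2.0


def lateral_resolution_m(freq_hz: float, range_m: float, aperture_m: float) -> float:
    """First-order lateral resolution (lambda * R / D) of an aperture."""
    return wavelength_m(freq_hz) * range_m / aperture_m


if __name__ == "__main__":
    # Illustrative values only: a 10-ps pulse at a 77-GHz center frequency,
    # observed at 1 m with a 1-cm and then a 10-cm effective aperture.
    print(f"depth resolution: {depth_resolution_m(10e-12) * 1e3:.2f} mm")
    for aperture in (0.01, 0.10):
        res = lateral_resolution_m(77e9, 1.0, aperture)
        print(f"aperture {aperture * 100:.0f} cm -> lateral resolution {res * 1e3:.1f} mm")
```

In this simplified picture, synchronizing widely spaced chips enlarges D and improves the lateral resolution in proportion, while the depth resolution is set by the pulse width alone.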

#### V. CONCLUSION

A CMOS impulse detector with a center frequency of 77 GHz is presented to achieve low-jitter interchip wireless time transfer. The impulse detector, which includes an on-chip slot PICA, is based on a three-stage divide-by-8 ILFD. It is shown that a three-stage divider has better input sensitivity than a single-stage divide-by-8 divider. The output of the receiver is locked to the input repetition rate with an rms jitter of 0.29 ps. A wireless time transfer test with two impulse detector chips demonstrates that a low-jitter 9.5-GHz clock is distributed among widely spaced nodes in a large-aperture array.

## ACKNOWLEDGMENT

The authors would like to thank Dr. M. Assefzadeh for providing the pulse-radiating chip for measurements.

#### REFERENCES

- [1] K. B. Cooper *et al.*, "Penetrating 3-D imaging at 4- and 25-m range using a submillimeter-wave radar," *IEEE Trans. Microw. Theory Techn.*, vol. 56, no. 12, pp. 2771–2778, Dec. 2008.
- [2] T. Terada, S. Yoshizumi, M. Muqsith, Y. Sanada, and T. Kuroda, "A CMOS ultra-wideband impulse radio transceiver for 1-Mb/s data communications and ± 2.5-cm range finding," *IEEE J. Solid-State Circuits*, vol. 41, no. 4, pp. 891–898, Apr. 2006.
- [3] V. V. Kulkarni, M. Muqsith, K. Niitsu, H. Ishikuro, and T. Kuroda, "A 750 Mb/s, 12 pJ/b, 6-to-10 GHz CMOS IR-UWB transmitter with embedded on-chip antenna," *IEEE J. Solid-State Circuits*, vol. 44, no. 2, pp. 394–403, Feb. 2009.
- [4] L. Zhou, Z. Chen, C. C. Wang, F. Tzeng, V. Jain, and P. Heydari, "A 2-Gb/s 130-nm CMOS RF-correlation-based IR-UWB transceiver front-end," *IEEE Trans. Microw. Theory Techn.*, vol. 59, no. 4, pp. 1117–1130, Apr. 2011.
- [5] B. Vigraham and P. R. Kinget, "A self-duty-cycled and synchronized UWB pulse-radio receiver SoC with automatic threshold-recovery based demodulation," *IEEE J. Solid-State Circuits*, vol. 49, no. 3, pp. 581–594, Mar. 2014.
- [6] C. Zhang, M. J. Kuhn, B. C. Merkl, A. E. Fathy, and M. R. Mahfouz, "Real-time noncoherent UWB positioning radar with millimeter range accuracy: Theory and experiment," *IEEE Trans. Microw. Theory Techn.*, vol. 58, no. 1, pp. 9–20, Jan. 2010.
- [7] D. C. Daly *et al.*, "A pulsed UWB receiver SoC for insect motion control," *IEEE J. Solid-State Circuits*, vol. 45, no. 1, pp. 153–166, Jan. 2010.
- [8] D. Morche *et al.*, "Double-quadrature UWB receiver for wide-range localization applications with sub-cm ranging precision," *IEEE J. Solid-State Circuits*, vol. 48, no. 10, pp. 2351–2362, Oct. 2013.
- [9] E. Afshari and A. Hajimiri, "Nonlinear transmission lines for pulse shaping in silicon," *IEEE J. Solid-State Circuits*, vol. 40, no. 3, pp. 744–752, Mar. 2005.
- [10] L. Zou, S. Gupta, and C. Caloz, "A simple picosecond pulse generator based on a pair of step recovery diodes," *IEEE Microw. Wireless Compon. Lett.*, vol. 27, no. 5, pp. 467–469, May 2017.
- [11] A. Arbabian, S. Callender, S. Kang, M. Rangwala, and A. M. Niknejad, "A 94 GHz mm-wave-to-baseband pulsed-radar transceiver with applications in imaging and gesture recognition," *IEEE J. Solid-State Circuits*, vol. 48, no. 4, pp. 1055–1071, Apr. 2013.

- [12] B. P. Ginsburg, S. M. Ramaswamy, V. Rentala, E. Seok, S. Sankaran, and B. Haroun, "A 160 GHz pulsed radar transceiver in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 49, no. 4, pp. 984–995, Apr. 2014.
- [13] X. Wu and K. Sengupta, "Dynamic waveform shaping with picosecond time widths," *IEEE J. Solid-State Circuits*, vol. 52, no. 2, pp. 389–405, Feb. 2017.
- [14] M. M. Assefzadeh and A. Babakhani, "Broadband oscillator-free THz pulse generation and radiation based on direct digital-to-impulse architecture," *IEEE J. Solid-State Circuits*, vol. 52, no. 11, pp. 2905–2919, Nov. 2017.
- [15] P. Chen, M. M. Assefzadeh, and A. Babakhani, "A nonlinear Q-switching impedance technique for picosecond pulse radiation in silicon," *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 12, pp. 4685–4700, Dec. 2016.
- [16] H. Aggrawal, R. Puhl, C. Studer, and A. Babakhani, "Ultra-wideband joint spatial coding for secure communication and high-resolution imaging," *IEEE Trans. Microw. Theory Techn.*, vol. 65, no. 7, pp. 2525–2535, Jul. 2017.
- [17] A. Oncu and M. Fujishima, "19.2 mW 2 Gbps CMOS pulse receiver for 60 GHz band wireless communication," in *Proc. IEEE Symp. VLSI Circuits*, Jun. 2008, pp. 158–159.
- [18] S. Koenig *et al.*, "Wireless sub-THz communication system with high data rate," *Nature Photon.*, vol. 7, p. 977, Oct. 2013.
- [19] P. N. Chen, P. J. Peng, C. Kao, Y. L. Chen, and J. Lee, "A 94 GHz 3D image radar engine with 4TX/4RX beamforming scan technique in 65 nm CMOS technology," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2013, pp. 146–147.
- [20] V. Krozer *et al.*, "Terahertz imaging systems with aperture synthesis techniques," *IEEE Trans. Microw. Theory Techn.*, vol. 58, no. 7, pp. 2027–2039, Jul. 2010.
- [21] B. A. Floyd, C.-M. Hung, and K. K. O, "Intra-chip wireless interconnect for clock distribution implemented with integrated antennas, receivers, and transmitters," *IEEE J. Solid-State Circuits*, vol. 37, no. 5, pp. 543–552, May 2002.
- [22] X. Guo, D.-J. Yang, and R. Li, "A receiver with start-up initialization and programmable delays for wireless clock distribution," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2006, pp. 1530–1539.
- [23] C. N. M. Marins *et al.*, "Precision clock and time transfer on a wireless telecommunication link," *IEEE Trans. Instrum. Meas.*, vol. 59, no. 3, pp. 512–518, Mar. 2010.
- [24] N. Sasaki, K. Kimoto, W. Moriyama, and T. Kikkawa, "A single-chip ultra-wideband receiver with silicon integrated antennas for inter-chip wireless interconnection," *IEEE J. Solid-State Circuits*, vol. 44, no. 2, pp. 382–393, Feb. 2009.
- [25] J. A. Nanzer, R. L. Schmid, T. M. Comberiate, and J. E. Hodkin, "Open-loop coherent distributed arrays," *IEEE Trans. Microw. Theory Techn.*, vol. 65, no. 5, pp. 1662–1672, May 2017.
- [26] B. Jamali and A. Babakhani, "A self-mixing picosecond impulse receiver with an on-chip antenna for high-speed wireless clock synchronization," *IEEE Trans. Microw. Theory Techn.*, vol. 66, no. 5, pp. 2313–2324, May 2018.
- [27] L. Kull *et al.*, "A 90GS/s 8b 667mW 64 interleaved SAR ADC in 32 nm digital SOI CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 378–379.
- [28] C. Hu, R. Khanna, J. Nejedlo, K. Hu, H. Liu, and P. Y. Chiang, "A 90 nm-CMOS, 500 Mbps, 3–5 GHz fully-integrated IR-UWB transceiver with multipath equalization using pulse injection-locking for receiver phase synchronization," *IEEE J. Solid-State Circuits*, vol. 46, no. 5, pp. 1076–1088, May 2011.
- [29] B. Jamali and A. Babakhani, "A fully integrated injection-locked picosecond pulse receiver for 0.29 ps<sub>rms</sub>-jitter wireless clock synchronization in 65 nm CMOS," in *IEEE MTT-S Int. Microw. Symp. Dig.*, Jun. 2017, pp. 1210–1213.
- [30] Y. Sasaki, H. Yokoyama, and H. Ito, "Dual-wavelength optical-pulse source based on diode lasers for high-repetition-rate, narrow-bandwidth terahertz-wave generation," *Opt. Express*, vol. 12, no. 14, pp. 3066–3071, Jul. 2004.
- [31] S. Cheng, P. Hallbjorner, and A. Rydberg, "Printed slot planar inverted cone antenna for ultrawideband applications," *IEEE Antennas Wireless Propag. Lett.*, vol. 7, pp. 18–21, 2008.
- [32] A. Babakhani, G. Xiang, A. Komijani, A. Natarajan, and A. Hajimiri, "A 77-GHz phased-array transceiver with on-chip antennas in silicon: Receiver and antennas," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2795–2806, Dec. 2006.

- [33] CST Microwave Studio, CST-Comput. Simul. Technol., Darmstadt, Germany, 2016.
- [34] D. F. Filipovic, G. P. Gauthier, S. Raman, and G. M. Rebeiz, "Off-axis properties of silicon and quartz dielectric lens antennas," *IEEE Trans. Antennas Propag.*, vol. 45, no. 5, pp. 760–766, May 1997.
- [35] S. Cheng, H. Tong, J. Silva-Martinez, and A. L. Karsilayan, "A fully differential low-power divide-by-8 injection-locked frequency divider up to 18 GHz," *IEEE J. Solid-State Circuits*, vol. 42, no. 3, pp. 583–591, Mar. 2007.
- [36] B. Razavi, "A study of injection locking and pulling in oscillators," *IEEE J. Solid-State Circuits*, vol. 39, no. 9, pp. 1415–1424, Sep. 2004.
- [37] H. R. Rategh and T. H. Lee, "Superharmonic injection-locked frequency dividers," *IEEE J. Solid-State Circuits*, vol. 34, no. 6, pp. 813–821, Jun. 1999.



Babak Jamali (S'13) received the B.S. degree in electrical engineering from the Sharif University of Technology, Tehran, Iran, in 2013, and the M.S. degree in electrical and computer engineering from Rice University, Houston, TX, USA, in 2016, where he is currently pursuing the Ph.D. degree.

He was an RFIC Design Intern with Qualcomm, Inc., San Diego, CA, USA, in 2018. Since 2018, he has been a Visiting Graduate Researcher with the University of California at Los Angeles (UCLA), Los Angeles, CA, USA. His current research interests include millimeter-wave and terahertz integrated circuits, on-chip antennas, and integrated sensors for broadband spectroscopy and sensing.

Mr. Jamali was a recipient of the IEEE MTT-S Graduate Fellowship Award in 2018 and the Texas Instruments Distinguished Graduate Fellowship in 2013.



**Aydin Babakhani** (M'08) received the B.S. degree from the Sharif University of Technology, Tehran, Iran, in 2003, and the M.S. and Ph.D. degrees in electrical engineering from the California Institute of Technology, Pasadena, CA, USA, in 2005 and 2008, respectively.

From 2011 to 2016, he was an Assistant Professor of electrical and computer engineering, and from 2016 to 2017, he was a Louis Owen Junior Chair Assistant Professor with Rice University, Houston, TX, USA. He was a Post-Doctoral Scholar with the California Institute of Technology in 2009. He was a Research Scientist with the IBM Thomas J. Watson Research Center, Ossining, NY, USA, in 2010. He was an Associate Professor with the Department of Electrical and Computer Engineering, Rice University, where he is currently the Director of the Rice Integrated Systems and Circuits Laboratory. He is also a member of DARPA Microsystems Exploratory Council and a co-founder of MicroSilicon, Inc. He is currently an Associate Professor with the Department of Electrical and Computer Engineering, University of California at Los Angeles (UCLA), Los Angeles, CA, USA, where he is also the Director of the Integrated Sensors Laboratory. He has authored or coauthored more than 85 articles in peer-reviewed journals and conference proceedings. He holds 21 issued or pending patents.

Dr. Babakhani was a recipient of multiple Best Paper Awards, including the Best Paper Award at the IEEE SiRF Conference in 2016, the Best Paper Award at the IEEE RWS Symposium in 2015, the Best Paper Award at the IEEE MTT-S IMS Symposium in 2014, and Second Place in the Best Paper Awards at the IEEE APS Symposium 2016 and IEEE MTT-S IMS Symposium 2016. His research is supported by NSF, DARPA, AFOSR, ONR, the W. M. Keck Foundation, SRC, and more than ten companies. He was also a recipient of the prestigious NSF CAREER Award in 2015, an Innovation Award from Northrop Grumman in 2014, a DARPA Young Faculty Award in 2012, the California Institute of Technology Electrical Engineering Department's Charles Wilts Best Ph.D. Thesis Prize for his work entitled "Near-Field Direct Antenna Modulation," the Microwave Graduate Fellowship in 2007, the Grand Prize in the Stanford-Berkeley-Caltech Innovators Challenge in 2006, the Analog Devices, Inc., Outstanding Student Designer Award in 2005, the California Institute of Technology Special Institute Fellowship, and an Atwood Fellowship in 2003. He was also a Gold Medal winner at both the National Physics Competition in 1998 and the 30th International Physics Olympiad, Padua, Italy, in 1999.