

# POWER OPTIMISATION FOR ADAPTIVE EMBEDDED WIDEBAND RADIOS

A Thesis submitted to the School of Computer Engineering of Nanyang Technological University

by

#### **Pham Hung Thinh**

in fulfillment of the requirement for the Degree of Doctor of Philosophy in Computer Engineering

March 5, 2015

# **List of Figures**

| Figure | Caption |
|--------|---------|
| 2.1 | The spectrum of subcarriers in OFDM [61] |
| 2.2 | OFDM transmission without cyclic prefix results in ISI between adjacent symbols |
| 2.3 | OFDM transmission with cyclic prefix avoids ISI between adjacent symbols |
| 2.4 | Inserting Cyclic Prefix in the OFDM symbol |
| 2.5 | The OFDM system model |
| 2.6 | The block diagram of OFDM-based RF systems |
| 2.7 | OFDM received symbol with timing offsets of -1, 1, -5 and 5 in (a), (b), (c) and (d), respectively |
| 2.8 | Inter-carrier interference (ICI) caused by frequency offset $\Delta f$ |
| 2.9 | The constellations of an OFDM received symbol with frequency offsets of 0.025, 0.5, 0.1 and 0.25 sub-carrier spacing in (a), (b), (c) and (d), respectively |
| 2.10 | The constellations of 5 consecutive OFDM received symbols with frequency offsets of 0.025 and 0.05 in (a) and (b), respectively |
| 2.11 | The constellations of an OFDM received symbol and 5 consecutive OFDM received symbols with phase noise variance of $0.25\,\mathrm{rad}^2$ in (a) and (b), respectively |

# **List of Tables**

# **Research Introduction**

### **Background Literature**

#### 2.1 Software/Hardware Radio Platforms

Software Defined Radio (SDR) is a radio communication system whose components (e.g. mixers, filters, modulators/demodulators) are realised in software on a personal computer or embedded system instead of in dedicated hardware circuits. SDR provides a flexible platform on which application-specific radio systems can be implemented. The rapidly increasing flexibility and functionality of software platforms has led to an increase in the radio applications realisable with SDR. Cognitive radio (CR) is one such application that has become more practical and flexible through the use of SDR. Early work in CR, led by Mitola [1], focused primarily on upper layer adaptation, in which the radio platform can be changed to respond directly to anticipated user or application requirements. The radio finds out the required information and provides the needed services to the user or the application.

According to Mitola [2], the evolution of SDR to CR can be illustrated through three main capabilities [3]: awareness, adaptation, and cognition. Awareness can be used to predict or enhance information about the known environment. For example, in RF-location awareness, a wireless terminal correlates different sensing categories so that RF spectrum occupation can be associated with its location. Adaptation may be performed once a terminal is aware of the environment. When a terminal moves to a given location, adaptation could help the terminal prioritise its search for free spectrum in those bands that have historically achieved the best performance at that location. Cognition learns from the environment and decides adaptation rules based on its experience against a general set of objectives, for example obtaining the best quality of service or the lowest communication cost at a baseline service quality.

The work at Virginia Tech [4] presented a motivation opposite to that of Mitola's work. Instead of seeking to respond to users' goals, the CR acts on the desire to exploit available technology. Rieser et al. [5,6] investigated the concept of a cognitive engine. Instead of endowing a specific radio with intelligence, the research developed a software package able to intelligently control any radio. The work was the first to separate the cognitive function from the radio platform and to study cognition at the physical layer. While CRs do not strictly need to be built on an SDR platform, the flexibility of an SDR may be one motivation for studying CR algorithms. Intelligent algorithms and capable radio platforms are needed for a CR to be able to change its functionality to adapt to dynamic conditions [4]. With a conventional fixed-function platform, CRs are limited to supporting simple tasks like spectrum sensing, where a set of hardware parameters can be adjusted. An SDR platform allows cognitive systems to radically change configurations and roles according to a variety of stimuli. These types of systems are especially important in situations where resources are finite or where the desired behaviour can be redefined at deployment time.

More recently, CR has also been motivated by spectrum scarcity and inefficient spectrum usage [7,8]. CR allows the spectrum allocated to licensed users, known as Primary Users (PUs), to be reused by unlicensed users, referred to as Secondary Users (SUs), when the PU is not using it. Use of locally unoccupied spectrum by SUs thus increases the overall utilisation efficiency of the licensed spectrum. The Federal Communications Commission (FCC) has also given a definition for CR as "a radio or system that senses its operational electromagnetic environment and can dynamically and autonomously adjust its radio operating parameters to modify system operation, such as maximize throughput, mitigate interference, facilitate interoperability, access secondary markets." [9].

#### 2.1.1 Multiple Standard Cognitive Radios

The spectral resource demands of wireless telecommunication systems continue to increase. Cognitive radios, which are able to adapt to channel conditions and so ensure effective spectrum usage, are an important technology for meeting these demands. They are designed to transmit in currently unused spectrum without causing harmful interference to PUs or incumbent users (IUs). Apart from the critical issues of spectrum sensing and band allocation, the lower priority of SUs raises a challenge in terms of transmission capability and quality of service in cognitive radios. When the spectrum allowed for a CR system is fully occupied by PUs and IUs, the transmission of CRs can be blocked. Multiple Standard Cognitive Radios (MSCRs) are able to operate in multiple frequency bands with different specified standards. MSCRs are hence a more flexible generalisation of CRs as they can operate across different bands and standards.

Multi-carrier modulation techniques offer an ideal opportunity for such systems due to their regularity and parameterisation. OFDM and Filter Bank Multi Carrier (FBMC) are two types of multi-carrier modulation. OFDM has been the dominant technique adopted for many standards and has been investigated in terms of spectral sensing and carrier allocation for CRs. Furthermore, compared to FBMC systems, OFDM system implementation is simple, low cost, and can be effectively parameterised. OFDM is therefore a suitable candidate for an MSCR system. The advantages of coupling OFDM modulation with an FPGA platform are investigated to assess the feasibility of implementing the proposed MSCR system. The flexibility requirement of an MSCR is to support existing standards like 802.11 [10], 802.16 [11], and 802.22 [12], as well as future OFDM-based standards.

A feasible OFDM-based MSCR requires the ability to switch baseband processing from one standard to another. This means that the system needs to rapidly adapt its functionality to perform FFT/IFFT operations of variable size, insert a CP of configurable length, and handle different pilot vectors as well as different preambles. The hardware implementation of an OFDM-based MSCR can be based on the original architecture of an OFDM system discussed in the following section. However, each sub-module of the system needs to be designed to work with all parameters of the supported standards. This significantly increases system complexity as well as reconfiguration time.

Besides the challenge of long configuration time for MSCRs, there are two key baseband challenges, related to the synchronizer and the spectral shaper. OFDM systems typically tolerate only a small carrier frequency offset (CFO), leading to strict constraints on the design of the RF front-end. In an MSCR system, the RF front-end should access a wide range of frequencies depending on the standard in operation. Such a precise and yet wide-ranging frequency requirement makes the RF front-end design difficult, if not impossible, or requires very expensive components. CRs also demand small spectral leakage for both in-band and out-of-band transmitted signals to avoid causing harmful interference to primary users, while OFDM signals have intrinsically large side lobes, leading to a potentially large degree of spectral leakage. Flexibility in the baseband can allow a frequency guard extending technique to meet spectral leakage requirements.

The interface to the higher layer processing is another important factor. Many hardware radio platforms are extremely difficult to design for or to modify. Hence, only hardware experts can use them. While detailed optimisation of low level blocks is important, providing a general interface for implementing higher layer processing is also important. This ensures that radio experts may use the system to investigate cognitive radio techniques without the need for specific advanced low-level FPGA expertise.

#### 2.1.2 Existing Radio Platforms

There are many comprehensive efforts at institutions and research labs in the areas of SDR and CR. The Information and Telecommunication Technology Center at the University of Kansas has built the Kansas University Agile Radio (KUAR) hardware platform [13]. The platform is a mature radio platform built around a fully-featured Pentium PC with a Xilinx Virtex II FPGA. In addition, in collaboration with Rutgers University and Carnegie Mellon University, they have worked on the creation of an experimental protocol stack for CR networks known as CogNet [14]. Moreover, the Wireless Information Network Laboratory at Rutgers University is also involved in CogNet through the development of its CR hardware platform, referred to as WiNC2R [15]. Georgia Tech also has a research team whose work on CR spans from developing a radio-frequency (RF) chip [16] and spectrum sensing [17] to higher layer issues such as cognitive mesh networks [18]. The Berkeley Wireless Research Center at the University of California Berkeley has engaged in similarly broad efforts on the development of an FPGA-based CR network emulator [19] as well as DSA system design [20]. The Centre for Telecommunications Value-Chain Research in Ireland has conducted similar research, including both radio platform development and the development of reconfigurable radio and network architectures for cognitive systems, known as IRIS [21]. IRIS [22] did receive some limited support for FPGAs and partial reconfiguration [23, 24] but is not suited to truly embedded implementations with high performance baseband processing. Van de Belt et al. [25] recently presented hardware acceleration for IRIS on the Zynq platform.

WARP [26] is an FPGA-based platform developed at the CMC lab, Rice University, with a recent revision (v3) hosting a Xilinx Virtex 6 FPGA and up to 2 RF interfaces. The baseband can be implemented in the FPGA fabric, while the higher layers are coded in C as a standalone application on an embedded Microblaze processor. ZCluster, developed by Lin and Chow [27] at the Department of Electrical and Computer Engineering, University of Toronto, is a cluster of Zynq ZC702 boards. The cluster employs the Hadoop framework to distribute MapReduce applications across the boards. The ARM cores run Xillinux, an Ubuntu 12.04 based operating system released by Xillybus. Microsoft Research Asia launched SORA [28], a PCI card designed to allow powerful radios to be implemented in Windows desktop computers. A Xilinx Virtex 5 FPGA is used to provide high bandwidth deterministic communication between the PC and the radio front end, but the intent is for the physical layer to be implemented in software on high performance desktop class processors.

GNU Radio [29] is a widely used platform in academia, and is typically coupled with the Ettus USRP radio front end. It is a software application designed to run on general purpose processors. Embedded implementation is possible, e.g. on the Ettus USRP E100, where GNU Radio is executed on the embedded ARM processor. Radio systems are modeled by flowgraphs that represent the flow of data through the components of the system. Flowgraphs are designed by connecting processing blocks, which are the basic elements of GNU Radio. GNU Radio Companion (GRC) is a graphical tool for building flowgraphs by dragging and dropping blocks onto a canvas and drawing their connections [30]. GReasy is an extension of GNU Radio that brings the computational benefits of FPGA acceleration while aiming to reduce FPGA compile times [31]. Within the GReasy environment, FPGA-based processing units are added to the GNU Radio library, allowing a user to arbitrarily insert optimized hardware and software modules into a given design. For back-end bitstream generation, GReasy employs TFlow, a university-developed software toolset that rapidly assembles FPGA accelerator modules from a pre-compiled hardware library, placing and routing parameterized pre-compiled modules into a final FPGA bitstream [32].

The application of FPGAs in SDR and CR frameworks is becoming increasingly widespread. In recent years, FPGA capabilities have increased tremendously, such that recent devices can process at rates of over 5000 GMACs (integer multiply accumulates) per second. This enables highly advanced baseband systems to be implemented at a low power budget compared with other programmable architectures. The addition of rapid, dynamic reconfiguration of FPGA processing components has provided performance and energy efficiency levels that used to be obtainable only with ASICs or fully analog circuits. Embedded systems built around FPGA devices are an ideal candidate for SDR and CR thanks to their low power consumption and portability. Embedded systems also give the reliability and consistent performance that are a top priority for communications applications.

#### 2.2 Field Programmable Gate Arrays

Field programmable gate arrays (FPGAs) are silicon devices with an architecture designed to be flexible enough to implement any type of circuit [33]. They consist of programmable logic components coupled with configurable routing, and flexible I/O for interfacing with external components. How these components and connections are set up is determined according to the contents of a configuration memory. To implement a circuit, the designer describes it typically using a hardware description language like Verilog or VHDL. Alternatively, tools exist that can map higher level languages like C or MATLAB Simulink directly [34]. This description is then processed through synthesis and implementation tools, to determine a suitable mapping of the circuit to the components on the FPGA, and the connections between them. This results in a bitstream that can be loaded into the configuration memory to set up the circuit described [35].

#### 2.2.1 FPGAs for radio platforms

On recent FPGAs, additional components like DSP blocks for high-performance signal processing tasks are available. Parts of a design can also be mapped to these blocks during the implementation phase, and the tools will generally determine when it is best to do so [36].

The most exciting recent development in FPGA architecture has been the emergence of new platforms that couple high-performance processors with a flexible FPGA fabric. For example, the Xilinx Zynq-7000 programmable SoC family is a new generation of reconfigurable devices that tightly couples a hardened 1 GHz dual-core ARM Cortex-A9 MPCore microprocessor with a reconfigurable fabric, along with several built-in peripherals [37]. In such systems, the ARM can host a fully functioning software stack while the baseband can be implemented in the reconfigurable fabric. The connectivity between the two is very tight and offers high bandwidth and low latency. This represents an ideal platform for cognitive radio systems, offering both high computational performance and flexibility, while also facilitating a simple software programmable interface to higher layers of the radio.

Partial reconfiguration (PR) is an advanced technique available in recent FPGAs which provides further flexibility [38, 39]. By modifying only a portion of the configuration memory, it is possible to change some system functionality while the remainder continues to operate without interruption. This allows portions of a circuit to be changed at runtime, and is ideally suited to applications like cognitive radio [40, 41]. PR thus has significant benefits for dynamic systems like cognitive radios. However, the design process remains suitable only for FPGA experts. Recent efforts have begun to bridge this gap, allowing a modular approach to PR design, with automated partitioning and floorplanning of the FPGA [42, 43]. Combined with the improving efficiency of high-level synthesis, this will enable radio designers with minimal FPGA experience to leverage the power and flexibility of FPGAs in the design of cognitive radios. Reduced power consumption is another benefit of PR that is important in radio systems. FPGA-based systems consume power in proportion to the size of the device, resource utilisation and operating frequency. PR-based designs save resources, and may even allow the use of a smaller FPGA, hence saving power. The dynamic reconfiguration ability of FPGA processing components has provided performance and energy efficiency levels that used to be obtainable only with ASICs or fully analog circuits [44].

#### 2.2.2 Power Dissipation on FPGA

The increasing computational requirements, and growing requirements for deployment in portable devices have prompted serious attention to reducing the power consumption of wireless communication systems, especially in the physical layer. High power dissipation by a system introduces high operating temperatures and follow-on effects that increase the cost of packaging as well as requiring larger batteries in the case of portable devices, or other power supply solutions. Power dissipation has thus come to be considered as one of the most important design metrics, along with area and performance. Reducing power consumption has been investigated at all levels of the digital design flow: from the system design stage to the layout stage.

At every stage, the designer can act to optimise the power consumption of the design. However, small changes at the lower levels may require more work and time to update the higher level optimisations, thus increasing the complexity of the optimization process. Moreover, at the system design stage, the designer has more options for reducing power dissipation, and by a greater degree. At the lower levels, the architecture of the system is already fixed in terms of data flow between registers, leaving few options to optimize the system. Thus, most power saving opportunities exist at the highest level of design abstraction [45]. FPGAs, with their highly parallel architecture, are suitable for the increasing computational requirements of signal processing in wireless communication systems, such as modulation and channel coding. In order to optimise the power dissipation of systems on FPGAs, the power consumption of FPGAs needs to be clearly understood, and accurate yet flexible power estimation tools are needed to provide power profiles of the system that help the designer make the best decisions. In addition, low power design techniques must be investigated to optimise power consumption when a system design is implemented. Total FPGA power is calculated as follows:

$$P_{total} = P_{devicestatic} + P_{designstatic} + P_{dynamic}, \tag{2.1}$$

where $P_{total}$ represents the total power consumption of the FPGA, $P_{devicestatic}$ is the leakage power dissipated whenever the device is powered, which depends on the size of the device and its resource utilisation, and $P_{designstatic}$ is the additional power dissipated when the design is configured on the device but has no switching activity. Leakage current is the only source of this static power dissipation. $P_{dynamic}$ represents the additional average power consumption from user logic utilisation and switching activity caused by signal transitions in the design. Increasing the operating speed raises switching activity and thus dynamic power consumption. The most significant source of dynamic power dissipation is the charging and discharging of parasitic capacitance during signal transitions.

Efficient power-aware designs for FPGA-based systems require estimation tools that gauge power consumption at an early stage in the design flow. These tools allow design tradeoffs to be considered at a high level of abstraction, thus reducing design effort and cost. Significant research on FPGA power consumption has appeared in the literature [46–50]. These papers have shown that the power consumption of FPGA devices is dominated by the programmable interconnect. In the Xilinx Virtex-II family, for instance, it was reported that between 50% and 70% of total power is dissipated in the interconnection network [46]. The majority of power dissipation in FPGAs is dynamic power dissipation [46], characterized by:

$$P_{dynamic} = \frac{1}{2} \sum_{i \in allnets} C_i \cdot f_i \cdot V^2, \tag{2.2}$$

where $P_{dynamic}$ represents average power consumption, $C_i$ is the capacitance of net $i$, $f_i$ is the average transition rate (switching activity) of that net, and $V$ is the supply voltage. Estimating dynamic power with (2.2) therefore requires two parameters for each net: switching activity and capacitance. A net's capacitance arises from parasitic effects on the interconnection wires and depends on the interconnection resources used. The switching activity represents the signal transitions on the net, and its value depends on how circuit delays are accounted for. Zero delay activity is calculated assuming logic and routing delays are zero. Logic delay activity considers only logic delays. Routed delay activity uses complete logic and routing delays. When delays are accounted for, glitches, which are spurious logic transitions due to unequal path delays to the net's driving gate, increase the switching activity. This additional activity causes a significant increase in dynamic power [47]. Path delays in FPGAs are also dominated by the interconnect, whose delays differ from those of primitive logic. Therefore, the effect of glitching on total power consumption may be more severe in FPGAs than in ASICs.
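As a minimal sketch of how (2.2) is evaluated, the following Python fragment sums the contribution of each net. The function name `dynamic_power` and the three example nets are hypothetical and chosen only for illustration; they are not measurements from any device or values used in this work.

```python
from typing import Iterable, Tuple


def dynamic_power(nets: Iterable[Tuple[float, float]], vdd: float) -> float:
    """Evaluate Eq. (2.2): P = 1/2 * sum_i(C_i * f_i) * V^2.

    nets: one (capacitance in farads, average transition rate in
          transitions per second) pair per net.
    vdd:  supply voltage in volts.
    """
    return 0.5 * sum(c * f for c, f in nets) * vdd ** 2


# Hypothetical nets (illustrative values only, not measured data).
example_nets = [
    (2e-12, 50e6),   # short local route
    (8e-12, 50e6),   # long route: same activity, four times the capacitance
    (2e-12, 150e6),  # short route whose activity is tripled, e.g. by glitching
]

p_nominal = dynamic_power(example_nets, vdd=1.0)   # 0.4 mW for these values
p_half_v = dynamic_power(example_nets, vdd=0.5)    # quadratic in V: one quarter
```

In this toy case the long route and the glitch-prone net each contribute more than the short, quiet net, which mirrors the two levers named above: interconnect capacitance and switching activity.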

It is well known that the most effective way to optimise power dissipation on FPGAs is to optimise dynamic power dissipation, which is proportional to switching activity and the equivalent capacitance of the circuit. Many techniques and strategies to reduce power dissipation in FPGAs have been presented in the literature [51–55]. However, only the strategies that appear suitable for wireless systems are investigated here. These strategies will be discussed in detail for specific implementations in the following chapters, but the general approach is discussed here.

Firstly, in signal processing dominated systems, storing and transferring data between functional modules consumes a large proportion of total energy [54]. The power consumption thus depends heavily on the way a system is partitioned and modularised, and hence on the use of data buffers to synchronously transfer data between functional modules. In order to optimise power dissipation at the system level, such systems are often realised to support stream processing, in which buffering data between functional modules requires much less memory. The energy for storing and transferring data is thus reduced significantly.

Secondly, registers with enable signals should be used to latch the inputs of functional modules. They reduce the switching activity inside functional modules while the modules wait for synchronisation in a streaming process. Apart from the above strategies, in low power design, power estimation tools must be used to evaluate the power dissipation of the system at each stage of the flow. This provides the information needed for power dissipation optimisation.

#### 2.2.3 Power Estimation

Several studies investigating power estimation techniques at different levels, from the circuit level up to the system level, have appeared in the literature. At the circuit level, a circuit simulator such as SPICE [56] provides one of the most accurate power estimation methods, but its computational overhead makes it unsuitable for highly integrated and dense FPGA-based systems and complex designs.

Several studies have shown that the efficiency of power optimization is significantly better at the higher levels [50, 57–59]. High-level power estimation is needed to conveniently validate power budgets for the different parts of the design, identify the most power hungry part of the design, and quickly evaluate the effect of various optimization techniques on the power budget of the system. Furthermore, the long run time and the complexity of synthesizing and validating a gate-level netlist make lower level estimation approaches highly inefficient for exploring high-level design tradeoffs.

The contemporary Xilinx FPGA design suite supports two tools, XPower Estimator (XPE) and XPower Analyzer (XPA), to estimate and analyse the power dissipation of systems implemented on Xilinx FPGA devices. These tools provide the power profile of a system in the early stages, while the RTL description is being implemented, which helps the designer optimise power when implementing the system. XPE is a Microsoft Excel spreadsheet used to estimate power distribution, typically in the pre-design and pre-implementation phases of the design flow [60]. XPE supports selecting the relevant power supply and thermal management components based on the evaluation of the system's architecture and the device selection. XPE takes the resource usage of the implemented system, toggle rates, I/O loading, and many other factors from the designer's input and combines these with the appropriate device models to estimate power distribution. The device models are obtained from measurements, simulation, and/or extrapolation. Two primary components contribute to the accuracy of XPE. One is the designer's input estimates, such as toggle rates and I/O loading. The other is the device data models integrated into the spreadsheet, which are selected based on the device selection. Realistic input must be provided to obtain an accurate estimation.

XPE is used in the early stages, while the RTL description is being implemented. However, XPE depends heavily on the designer's input estimates, which limits the accuracy of its estimation. After the system is synthesised and implemented, the XPA tool can be used for more accurate estimation and power analysis.

XPA analyses the design using real design data from the system, such as the NCD file output from Place & Route (PAR) after the system is implemented. The latest version of XPA employs a vectorless estimation algorithm in which the switching activity of nodes is assigned appropriate values even if it is not defined in the input file. However, simulation activity files, such as a SAIF or VCD file from simulation in timing mode, are required for accurate power analysis.

XPA is used at the stage when a system design has successfully completed PAR, generated an NCD output file, and been simulated to verify its functionality. The power profile of the system, such as resource usage and the capacitance of nodes and nets, can be extracted from the NCD file, while switching activities are computed with high accuracy based on timing simulation from SAIF or VCD files. Much more realistic power information is obtained at this stage, so XPA can achieve more accurate results in terms of power estimation. XPA generates a text-based power report showing the power distribution in the system, which can be used for optimisation.

#### 2.3 Orthogonal Frequency Division Multiplexing

OFDM is a multicarrier modulation scheme used in both wireline and wireless communication in which a high-rate data stream is split into multiple parallel low-rate streams that are modulated onto multiple sub-carriers. The adjacent modulated sub-carriers are theoretically orthogonal, with zero mutual interference. OFDM signals are modulated using sub-carriers across the frequency range, similar to frequency division multiplexing (FDM). The main difference is that FDM conventionally multiplexes the signals into separate small bands, in which the signal in each band is modulated using a specific sinusoidal carrier, while OFDM signals are modulated using orthogonal sub-carriers. Each sub-carrier is mathematically represented by a sinc pulse, which overlaps with other sub-carriers in the frequency domain as shown in Fig. 2.1. Note that the *sinc* pulse of each sub-carrier has nulls at the centre points where the other sub-carriers are located. Ideally, there is thus zero inter-carrier interference (ICI) in an OFDM signal.

Figure 2.1: The spectrum of subcarriers in OFDM [61]

An OFDM symbol signal can be expressed at baseband by a sum of modulated complex exponentials as:

$$s(t) = \sum_{k=0}^{N-1} X_k e^{i2\pi k\Delta f t}, \tag{2.3}$$

where $X_k$ represents a data modulated symbol, such as a BPSK, QPSK, or QAM symbol, and is a complex number modulated onto the $k$th of $N$ subcarriers, and $\Delta f$ is the subcarrier spacing. Sampling this OFDM symbol signal with a sampling period of $T_S$ gives:

$$s(nT_S) = \sum_{k=0}^{N-1} X_k e^{i2\pi k\Delta f nT_S}, \tag{2.4}$$

When the sampling period is $T_S = 1/(N\Delta f)$, the phase of the $k$th subcarrier advances by $2\pi k/N$ per sample, so the sampled OFDM signal is equivalent, up to the scaling factor $1/N$, to an inverse $N$-point discrete Fourier transform (IDFT), taking $X_k$ as a discrete point in the frequency domain. Inversely, the sampled OFDM symbol signal can be demodulated using the discrete Fourier transform (DFT). OFDM modulation and demodulation are hence performed by computing the IDFT and DFT, respectively, expressed as:

$$s[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] e^{i2\pi \frac{k}{N}n}, \tag{2.5}$$

$$X[k] = \sum_{n=0}^{N-1} s[n]e^{-i2\pi \frac{k}{N}n}. \tag{2.6}$$
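The transform pair in (2.5) and (2.6) can be checked numerically. The sketch below is a direct, unoptimised Python transcription of the two sums using only the standard library; the function names `idft` and `dft` and the eight QPSK symbols are illustrative assumptions rather than part of any system described here.

```python
import cmath


def idft(X):
    """OFDM modulation, Eq. (2.5): s[n] = (1/N) * sum_k X[k] e^{+i 2 pi k n / N}."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


def dft(s):
    """OFDM demodulation, Eq. (2.6): X[k] = sum_n s[n] e^{-i 2 pi k n / N}."""
    N = len(s)
    return [sum(s[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


# Illustrative QPSK symbols on N = 8 subcarriers (arbitrary example data).
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]

tx_samples = idft(qpsk)        # time-domain OFDM symbol
recovered = dft(tx_samples)    # demodulated subcarrier symbols

# Largest deviation between transmitted and recovered symbols.
max_err = max(abs(a - b) for a, b in zip(qpsk, recovered))
```

Over an ideal channel the recovered symbols match the transmitted ones to within floating-point rounding, which is the orthogonality property stated above.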

In order to achieve efficient computation, the inverse Fast Fourier Transform (IFFT) and Fast Fourier Transform (FFT) are implemented in an OFDM system to modulate and demodulate the signal instead of the IDFT and DFT, respectively. These optimised algorithms generally rely on the number of points, and hence the number of subcarriers, being a power of 2.
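To illustrate why a power-of-2 length is convenient, the following is a minimal recursive radix-2 decimation-in-time FFT in Python. It is a textbook sketch, not the FFT architecture used in this work, and the names `fft_radix2` and `dft_direct` are hypothetical. Each call splits the input into even- and odd-indexed halves, which only terminates cleanly when the length is a power of 2.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of 2)."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    if N == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # N/2-point FFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # N/2-point FFT of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]   # twiddle factor
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out


def dft_direct(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


# Arbitrary 16-point test vector.
x = [complex(n % 5, -(n % 3)) for n in range(16)]
max_diff = max(abs(a - b) for a, b in zip(fft_radix2(x), dft_direct(x)))
```

The recursion performs on the order of $N \log_2 N$ complex multiplications instead of the $N^2$ of the direct sum, while producing the same result to within rounding.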

#### 2.3.1 Cyclic Prefix

When transmitting OFDM symbols over a delay-dispersive multi-path channel, the received signal is the linear convolution of the transmitted symbol with the channel impulse response (CIR)

$$y[n] = h * s[n], \tag{2.7}$$

where h, assumed to have a length of L, denotes the equivalent impulse response of the channel, and \* is the convolution operation. The received symbol y[n] is the result of convolving the CIR h with the transmitted symbol s[n], which has a length of N, so y[n] has a length of N+L-1. The received signal is obtained by concatenating the received OFDM symbols. Because each received symbol, having a length of N+L-1, overlaps with the adjacent received symbols, the overlapping samples add together and introduce Inter Symbol Interference (ISI) into the received signal, as shown in Fig. 2.2.

In order to avoid ISI, a guard interval of length  $L_{CP}$  has to be added before each OFDM symbol, as demonstrated in Fig. 2.3. If the length of the CIR, L, is smaller than that of the guard interval,  $L_{CP}$ , the tail of each received symbol falls within the guard interval of the next one and does not interfere with the succeeding received OFDM symbol. ISI is hence absent from the received symbol. The guard interval adopted in standards is commonly implemented as a copy of the last  $L_{CP}$  samples of the symbol, as shown in Fig. 2.4, and is called a cyclic prefix (CP).

Figure 2.2: OFDM transmission without cyclic prefix results in ISI among adjacent symbols

Figure 2.3: OFDM transmission with cyclic prefix avoids ISI among adjacent symbols

In addition, the use of a CP preserves the orthogonality of the subcarriers, avoiding ICI. Performing a DFT followed by a single-tap equalizer per subcarrier then allows recovery of the transmitted symbols [61].

Figure 2.4: Inserting Cyclic Prefix in the OFDM symbol

Figure 2.5: The OFDM system model
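The role of the CP can be illustrated with a small noiseless simulation. The sketch below is a simplified example under stated assumptions: the three-tap channel `h`, the sizes N = 8 and L_CP = 3, perfect knowledge of the channel, and all helper names are hypothetical and chosen only for illustration. It compares recovery of the data symbols with and without a cyclic prefix when a single-tap equalizer is used.

```python
import cmath

def dft(x):
    """N-point DFT, matching Equ. (2.6)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """N-point IDFT, matching Equ. (2.5)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def convolve(h, s):
    """Linear convolution, Equ. (2.7); the output length is len(s) + len(h) - 1."""
    y = [0j] * (len(s) + len(h) - 1)
    for i, hi in enumerate(h):
        for n, sn in enumerate(s):
            y[i + n] += hi * sn
    return y

N, L_CP = 8, 3                      # illustrative sizes
h = [1.0, 0.5, 0.25j]               # hypothetical CIR with L = 3 taps (L <= L_CP)
X = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]

s = idft(X)
H = dft(h + [0.0] * (N - len(h)))   # channel response on each subcarrier

def demodulate(rx_block):
    """DFT followed by a single-tap equalizer per subcarrier."""
    R = dft(rx_block)
    return [R[k] / H[k] for k in range(N)]

def max_error(X_hat):
    return max(abs(a - b) for a, b in zip(X, X_hat))

# With CP: prepend the last L_CP samples, pass through the channel, drop the CP.
rx_cp = convolve(h, s[-L_CP:] + s)
error_with_cp = max_error(demodulate(rx_cp[L_CP:L_CP + N]))

# Without CP: the convolution is no longer circular over the DFT window.
rx_no_cp = convolve(h, s)
error_without_cp = max_error(demodulate(rx_no_cp[:N]))

print(error_with_cp, error_without_cp)
```

With the CP in place the channel acts as a circular convolution over the DFT window, so a single complex division per subcarrier restores the data; without it, the same equalizer leaves a clearly visible residual error.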

#### 2.3.2 OFDM-based system

In the above section, the OFDM modulation technique was presented, and CP insertion was introduced to improve the performance of OFDM in multi-path channels. An OFDM-based system can be modelled as shown in Fig. 2.5.

In the transmitter, the data modulated symbols are grouped into blocks of N subcarrier symbols, each known as an OFDM symbol and expressed by a vector  $X = (X[0], X[1], ..., X[N-1])^T$ . Next, the OFDM symbol signal in the time domain is generated by performing an IDFT on each OFDM symbol, and a cyclic prefix of length  $L_{CP}$  is inserted at the beginning of the OFDM signal. The complex baseband signal of the m-th OFDM symbol in discrete time can thus be expressed as

$$s_m[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] e^{i2\pi k \frac{n-L_{CP}}{N}},$$
(2.8)

where n is the discrete time index within the symbol,  $0 \le n < N + L_{CP}$ , and m denotes the index of the OFDM symbol. The complete transmitted signal in the discrete time domain, s[n], is given by the concatenation of all OFDM symbols,  $s_m[n]$ ,

$$s[n] = \sum_{m=0}^{\infty} s_m [n - m(N + L_{CP})], \qquad (2.9)$$
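Equ. (2.8) and Equ. (2.9) can be sketched directly in Python. The function names `ofdm_symbol` and `transmit`, the block size N = 4 and the data values below are illustrative assumptions. The example also checks that the first  $L_{CP}$  samples generated by Equ. (2.8) are identical to the last  $L_{CP}$  samples of the same symbol, i.e. that the time shift in the exponent produces the cyclic prefix.

```python
import cmath

def ofdm_symbol(X, L_CP):
    """Equ. (2.8): N + L_CP samples; the first L_CP samples form the cyclic prefix."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * (n - L_CP) / N)
                for k in range(N)) / N
            for n in range(N + L_CP)]

def transmit(blocks, L_CP):
    """Equ. (2.9): concatenate consecutive OFDM symbols into one sample stream."""
    stream = []
    for X in blocks:
        stream.extend(ofdm_symbol(X, L_CP))
    return stream

N, L_CP = 4, 2
blocks = [[1, -1, 1j, -1j], [1j, 1, -1, -1j]]   # two OFDM symbols (arbitrary data)

s = transmit(blocks, L_CP)
first = s[:N + L_CP]
prefix_matches_tail = all(abs(first[i] - first[N + i]) < 1e-12 for i in range(L_CP))
print(len(s), prefix_matches_tail)
```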



Figure 2.6: The block diagram of OFDM-based RF systems

When transmitting OFDM signals over a multi-path channel, the received signal is obtained through the linear convolution of the transmitted signal with the CIR h[i], plus additive white Gaussian noise (AWGN) n[n]. Assuming that synchronisation between the transmitter and receiver is perfectly achieved, that the channel fading is slow enough for the channel to be considered time invariant during one OFDM symbol interval, and that the cyclic prefix is longer than the CIR (h[i] = 0 for  $i < 0$  and  $i > L_{CP} - 1$ ), the received signal is

$$y[n] = \sum_{i=0}^{L_{CP}-1} h[i]s[n-i] + n[n],$$
(2.10)

In the receiver, the incoming samples y[n] are synchronously grouped into blocks of OFDM symbols, and then the cyclic prefix of each OFDM symbol is removed. The received samples of the  $m^{th}$  OFDM symbol can be expressed as a vector  $y_m = (y_m[0], y_m[1], ..., y_m[N-1])$ , with  $y_m[n] = y[m(N + L_{CP}) + L_{CP} + n]$ . The received data symbols associated with the  $m^{th}$  OFDM symbol,  $R_m[k]$ , are retrieved by performing an N-point DFT:

$$R_m[k] = \sum_{n=0}^{N-1} y_m[n] e^{-i2\pi \frac{nk}{N}},$$
(2.11)

Fig. 2.6 presents the common block diagram of an OFDM-based RF system. OFDM is performed in the baseband, where IFFT and FFT blocks are used to compute the IDFT and DFT for OFDM modulation and demodulation, respectively. In the transmitter, the channel coded data from higher abstraction layers is modulated to data symbols (BPSK, QPSK, QAM, ...). The data symbols are then grouped together with the pilots to form N FFT points in parallel. After performing the IFFT, the CP is inserted, and then the OFDM samples are serialised and split into in-phase (I) and quadrature (Q) channels corresponding to the real and imaginary parts of the OFDM sample. The digital to analogue converted signals of the I and Q channels are modulated by an intermediate frequency,  $f_{IF}$ , and the signal is then up-converted to high frequency by an RF carrier,  $f_C$ . Before transmission, the signal is amplified by a power amplifier (PA).

In the receiver, after down-conversion and IQ demodulation, the signal is sampled. The samples are formed from the I and Q channels, corresponding to the real and imaginary parts of the OFDM sample. The timing and frequency synchronisation blocks detect the frame, recover the frame timing and estimate the frequency offset. The received samples are then compensated in the discrete time domain using the estimated frequency offset. After an OFDM symbol is demodulated using the FFT, channel estimation, channel equalisation and phase error compensation are performed to improve performance. The block of parallel samples in an OFDM symbol is serialised into data symbol sequences that are then demodulated, and the coded data is sent to higher abstraction layers for decoding.

#### 2.3.3 Evaluating OFDM

The main advantages of OFDM are its spectral efficiency and robustness against multi-path propagation. This makes OFDM suitable for high performance wireless applications. OFDM uses multiple subcarriers that overlap in the frequency domain, resulting in greater spectral efficiency than FDM. Performing OFDM is equivalent to splitting a data stream into several parallel low-rate streams before transmission. This makes the OFDM signal more robust against fading when transmitted through the channel. Thanks to the cyclic prefix, the ISI and ICI caused by the multi-path channel can be eliminated. The CP creates a guard period for an OFDM symbol, which should be longer than the CIR to ensure no ISI. By repeating samples of the OFDM symbol in the guard period, the CP also maintains the orthogonality of the subcarriers, avoiding ICI. Thus, performing the DFT and a single-tap equalizer per subcarrier allows recovery of the transmitted symbols.

On the other hand, OFDM has some disadvantages. Firstly, an OFDM signal is the sum of multiple modulated subcarriers, and thus suffers from a high peak-to-average power ratio (PAPR). This demands amplifiers with high power and a wide linear range, increasing the cost of OFDM-based systems. Secondly, the use of a guard period reduces the bandwidth efficiency of OFDM. Last but not least, OFDM performance is sensitive to receiver synchronisation. Frequency offset causes inter-carrier interference, and errors in timing synchronisation can lead to inter-symbol interference. Much effort is needed to improve the accuracy of both frequency and time synchronisers for OFDM.
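The PAPR penalty of summing subcarriers can be illustrated with two extreme cases. In the sketch below, the helper `papr_db` and the value N = 64 are assumptions for illustration. A single active subcarrier has a constant envelope (0 dB PAPR), whereas N subcarriers that add in phase produce a single peak and reach the upper bound of  $10\log_{10} N$  dB. Practical OFDM symbols with random data fall between these extremes.

```python
import cmath
import math

def idft(X):
    """N-point IDFT, matching Equ. (2.5)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def papr_db(samples):
    """Peak-to-average power ratio of a block of samples, in dB."""
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

N = 64
single = idft([0] * 5 + [1] + [0] * (N - 6))  # one active subcarrier
worst = idft([1] * N)                         # all subcarriers in phase

papr_single = papr_db(single)  # constant envelope
papr_worst = papr_db(worst)    # upper bound, 10*log10(N)
print(round(papr_single, 2), round(papr_worst, 2))
```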

#### 2.3.4 OFDM Synchronisation

OFDM performance is sensitive to receiver synchronisation. Frequency offset causes inter-carrier interference, and errors in timing synchronisation can lead to inter-symbol interference. Therefore, synchronisation is critical for good performance in OFDM systems. There are two main sources of synchronisation error: sample clock timing offsets and carrier frequency offsets. In order to obtain good synchronisation performance, timing offsets and frequency offsets must be studied in terms of their causes and their effects on the degradation of OFDM received data symbols. Additionally, common phase error (CPE), generated by clock jitter and phase noise, causes a random rotation of the entire signal constellation. This must also be taken into account and compensated for in order to achieve good performance.

#### **Timing Offsets**

When sampling a signal at the receiver, the difference between the sampling instants of the receiver and those of the transmitter is referred to as timing error. In a single carrier system, the symbol clock of the transmitter can be recovered at the receiver using a phase-locked loop (PLL) [61]. This can correct a timing error in the receiver relatively easily.

In OFDM, however, timing errors fall into two categories: fractional and integer. Fractional timing errors, that is, errors smaller than one sample period, are caused by a phase difference between the sampling clock of the analogue to digital converter (ADC) in the receiver and the transmitted signal. Integer timing errors are those of one or more whole sample periods, and cause an index shift, or offset, in the sample sequence.

The timing error in the time domain is equivalent to a phase rotation in the frequency domain expressed in Equ. (2.12),

$$s(t-\tau) \Leftrightarrow e^{-i2\pi f \tau} R(f),$$
 (2.12)

where  $\tau$  denotes the timing error, which results in a phase shift of  $e^{-i2\pi f\tau}$ , s(t) is the received signal in the time domain, and R(f) is the spectrum of s(t). The phase shift is proportional to both the timing error and the carrier frequency. In a multi-carrier signal, the phase shift therefore grows with the subcarrier frequency, leading to a progressive phase rotation across the subcarriers. Subcarrier rotations caused by a fractional timing error  $\Delta t$ , like those caused by fading, can be estimated by a channel estimator and compensated for after performing the DFT,

$$\widehat{R[n]} = R[n]e^{\frac{i2\pi n\Delta t}{N}},\tag{2.13}$$

where R[n] and  $\widehat{R[n]}$  denote the received data symbols before and after compensation, respectively,  $\Delta t$  is the estimated fractional timing error, and N is the number of subcarriers.

Moreover, the received samples in the receiver are synchronously grouped into blocks of OFDM symbols. Integer timing errors lead to a symbol timing offset (STO), which is the difference between the correct sample index and the actual sample index of the received samples, and which causes a misaligned window for DFT demodulation in the receiver. If the timing is late, samples of the following symbol are used for the current symbol, resulting in ISI and hence degrading the performance of the OFDM system. The effect of ISI caused by late timing offsets on the OFDM received symbol is illustrated in Fig. 2.7(a) and Fig. 2.7(c). If the timing is early, some samples in the CP of the current symbol are used to calculate the DFT, leading to a subcarrier rotation in the frequency domain, as expressed in Equ. (2.14). The effect of subcarrier rotation caused by early timing offsets on the OFDM received symbol is illustrated in Fig. 2.7(b) and Fig. 2.7(d).

$$s[n-t_{off}] \Leftrightarrow R[n]e^{\frac{-i2\pi nt_{off}}{N}},$$
 (2.14)

where s[n] and R[n] denote the received data symbols in the time domain and the frequency domain, respectively,  $t_{off}$  is the timing offset, and N is the number of subcarriers.
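The effect of an early timing offset can be reproduced with a short sketch. The sizes N = 8 and L_CP = 3, the offset of 2 samples and the helper names are assumptions for illustration. The DFT window is opened `t_off` samples early, so that it includes part of the CP. Under the DFT convention of Equ. (2.6), each subcarrier k is then rotated by  $e^{-i2\pi k t_{off}/N}$  while its magnitude is preserved; the sign of the rotation depends on how the offset is defined.

```python
import cmath

def dft(x):
    """N-point DFT, matching Equ. (2.6)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """N-point IDFT, matching Equ. (2.5)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

N, L_CP, t_off = 8, 3, 2           # illustrative sizes, with t_off <= L_CP
X = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]

s = idft(X)
tx = s[-L_CP:] + s                 # OFDM symbol with cyclic prefix

start = L_CP - t_off               # window opened t_off samples too early
R_early = dft(tx[start:start + N])

# Expected result: a phase ramp across the subcarriers and no ISI.
expected = [X[k] * cmath.exp(-2j * cmath.pi * k * t_off / N) for k in range(N)]
max_error = max(abs(a - b) for a, b in zip(R_early, expected))
magnitudes_preserved = all(abs(abs(r) - abs(x)) < 1e-9 for r, x in zip(R_early, X))
print(max_error, magnitudes_preserved)
```

Because only a phase ramp is introduced, this rotation can be absorbed by the channel estimator, in contrast to a late window, which would include samples of the following symbol.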



Figure 2.7: OFDM received symbol with timing offsets of -1, 1, -5 and 5 in a, b, c, d, respectively.

Fig. 2.7 illustrates the effect of timing offset on a single 256-subcarrier OFDM symbol with QPSK-modulated subcarriers (based on IEEE 802.16). As can be seen, early timing, for instance timing offsets of 1 and 5, shown respectively in Figs. 2.7(b) and 2.7(d), causes a subcarrier rotation similar to that of fractional timing errors and fading, which can be estimated by a channel estimator. However, late timing, for instance timing offsets of -1 and -5, as shown in Figs. 2.7(a) and 2.7(c), leads to ISI that prevents the OFDM constellation from being recovered. Therefore, timing synchronisation is required to correct the timing offset and avoid ISI. Correlation is commonly used to estimate the timing offset, and several related works [62–64] investigate correlator implementations for this purpose. One of the contributions of this thesis is a novel low-power correlator for OFDM synchronisation; the related work and the proposed correlator are discussed in Chapter 3. The accuracy of timing synchronisation also depends on the timing metrics and the algorithms applied to these metrics. A novel, efficient and robust timing synchronisation method based on the low-power correlator is proposed in this thesis. The related works [65–69] and the proposed method for timing synchronisation are investigated in Chapter 4.

#### **Frequency Offset**

Carrier frequency offset (CFO) refers to the difference between the carrier frequency at the receiver and the 'correct' carrier frequency of a transmitted OFDM symbol. CFO is introduced by an imperfect clock in the RF front-end, as well as by the frequency variation caused by the Doppler effect when a signal is transmitted through a frequency selective channel. This misaligns the points at which the subcarriers are sampled in the frequency domain and causes a loss of orthogonality, because at the offset sampling point of a subcarrier the other subcarriers are no longer null as expected (shown in Fig. 2.8). The CFO is normalised by the subcarrier spacing and usually divided into an integer part (IFO), a multiple of the subcarrier spacing, and a fractional part (FFO). IFO causes a circular shift of the subcarriers in the frequency domain, while FFO results in ICI because of the lost orthogonality between subcarriers.

With no frequency offset, each frequency bin of the DFT samples the peak of the sinc(x) pulse of one subcarrier, where the pulses of all other subcarriers are null. However, if a frequency offset is introduced, each frequency bin of the DFT also collects energy from the other subcarriers. This means that the adjacent subcarriers introduce an interference component, resulting in ICI.



Figure 2.8: Inter carrier interference (ICI) caused by frequency offset  $\Delta f$ .

As can be seen, the adjacent subcarrier introduces an interference component that is about half the amplitude of the subcarrier of interest. All other subcarriers introduce an interference component of much lower amplitude. This is known as a loss of orthogonality, and must be compensated for in order to properly demodulate the OFDM symbol. The effect of CFO can be easily considered in the time domain by taking an inverse Fourier transform expressed as follows:

$$R(f - \Delta f) \Leftrightarrow e^{i2\pi\Delta f t} s(t),$$
 (2.15)

In the discrete time domain, the signal sample sequence can be expressed as:

$$s'[n] = s[n]e^{\frac{i2\pi\Delta fn}{N}},\tag{2.16}$$

where s'[n] and s[n] are the frequency offset samples and the original samples, respectively,  $\Delta f$  denotes the frequency offset normalised by the subcarrier spacing, and N is the number of subcarriers.
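The distinct effects of the integer and fractional parts of the CFO can be seen by applying Equ. (2.16) to a symbol with a single active subcarrier. In the sketch below, the function `apply_cfo`, the size N = 16 and the offset values are assumptions for illustration; the offset is expressed in units of the subcarrier spacing. An integer offset moves the energy to another bin without loss, whereas a fractional offset attenuates the wanted bin and leaks energy into all other bins.

```python
import cmath

def dft(x):
    """N-point DFT, matching Equ. (2.6)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """N-point IDFT, matching Equ. (2.5)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def apply_cfo(s, eps):
    """Equ. (2.16): rotate sample n by exp(i 2 pi eps n / N), eps in subcarrier spacings."""
    N = len(s)
    return [s[n] * cmath.exp(2j * cmath.pi * eps * n / N) for n in range(N)]

N = 16
X = [0j] * N
X[3] = 1 + 0j                       # single active subcarrier for clarity
s = idft(X)

R_int = dft(apply_cfo(s, 2))        # IFO of 2: energy moves from bin 3 to bin 5
R_frac = dft(apply_cfo(s, 0.25))    # FFO of 0.25: wanted bin attenuated, ICI appears

ici_power = sum(abs(R_frac[k]) ** 2 for k in range(N) if k != 3)
print(abs(R_int[5]), abs(R_frac[3]), ici_power)
```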

The effect of frequency offset is shown in Fig. 2.9. Each plot illustrates the constellation of QPSK symbols demodulated from one 256-subcarrier OFDM symbol based on the IEEE 802.16 standard.

As can be seen, OFDM performance is sensitive to even small frequency offsets. CFO causes dispersion, similar to AWGN, as well as phase rotation in the QPSK constellation demodulated from an OFDM symbol. If multiple OFDM symbols are transmitted in a packet, the phase rotation accumulates from one OFDM symbol to the next, and even a small CFO leads to a large drift of the constellation points, as shown in Fig. 2.10, degrading demodulation performance. CFO must therefore be estimated and compensated for in order to properly demodulate the OFDM symbol.

Figure 2.9: The constellations of an OFDM received symbol with frequency offsets of 0.025, 0.5, 0.1 and 0.25 subcarrier spacings in a, b, c, d, respectively.

Figure 2.10: The constellations of 5 consecutive OFDM received symbols with frequency offsets of 0.025 and 0.05 in a, b, respectively.

Most published OFDM implementations restrict synchronisation to small CFOs that contain only an FFO. Although IFO estimation methods have been investigated theoretically in many related works [70–74], their hardware implementation has not yet been published in the research literature because of their computational complexity. In Chapter 5, a low-cost, low-power IFO estimation architecture is presented to enhance OFDM synchronisation.

#### **Phase Noise**

The intrinsic imperfection of the local clock at the RF front-end of a receiver or clock jitter of an ADC may introduce a parasitic phase noise which can affect the performance of baseband data symbols, such as QPSK, QAM, ..., during demodulation. Phase noise can be considered in two different parts: common phase error and inter-carrier interference [75]. The effect of phase noise can be expressed in the discrete time domain as:

$$s'[n] = s[n]e^{i\phi[n]},$$
 (2.17)

where s'[n] and s[n] are the samples with and without phase noise, respectively, and  $\phi[n]$  denotes the phase noise.



Figure 2.11: The constellations of an OFDM received symbol and 5 consecutive OFDM received symbols with phase noise variance of  $0.25\ \mathrm{rad}^2$  in (a), (b) respectively.

If the phase noise varies slowly relative to the OFDM symbol interval, it can be considered a constant phase term added to each sample, resulting in CPE [75]. However, if the phase noise varies much faster than the OFDM symbol interval, a different phase is added to each sample, causing a loss of orthogonality and thus inter-carrier interference. Fig. 2.11 illustrates the effect of phase noise on baseband data symbol demodulation for a single OFDM received symbol and for 5 consecutive OFDM received symbols.
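The two regimes can be contrasted by applying Equ. (2.17) with a constant phase and with an independent random phase per sample. The sketch below is a simplified model under stated assumptions: the Gaussian per-sample phase with a standard deviation of 0.5 rad, the fixed seed, the least-squares CPE estimate and all names are chosen only for illustration and do not represent a realistic oscillator model.

```python
import cmath
import random

def dft(x):
    """N-point DFT, matching Equ. (2.6)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """N-point IDFT, matching Equ. (2.5)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def apply_phase(s, phi):
    """Equ. (2.17): multiply sample n by exp(i phi[n])."""
    return [x * cmath.exp(1j * p) for x, p in zip(s, phi)]

X = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]
N = len(X)
s = idft(X)

# Slow phase noise: the same phase on every sample gives a pure common rotation.
phi_const = 0.3
R_slow = dft(apply_phase(s, [phi_const] * N))
cpe_error = max(abs(R_slow[k] - X[k] * cmath.exp(1j * phi_const)) for k in range(N))

# Fast phase noise: an independent phase per sample (hypothetical Gaussian model).
rng = random.Random(0)
phi_fast = [rng.gauss(0.0, 0.5) for _ in range(N)]
R_fast = dft(apply_phase(s, phi_fast))

# Remove the best-fitting common rotation; what remains is ICI-like dispersion.
cpe = sum(r * x.conjugate() for r, x in zip(R_fast, X)) / sum(abs(x) ** 2 for x in X)
residual = max(abs(R_fast[k] - cpe * X[k]) for k in range(N))
print(cpe_error, residual)
```

In the slow case the whole constellation is rotated by the same angle and can be corrected with a single complex multiplication, whereas in the fast case a residual remains after the common rotation has been removed.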

As can be seen in Fig. 2.11, phase noise degrades OFDM demodulation in two different ways. First, the constellation of the data symbols is rotated, similar to the effect of fractional timing error or residual frequency offset. Second, the constellation points are dispersed, similar to the effect of AWGN, because of the loss of orthogonality between the subcarriers. However, unlike frequency offset, which rotates the constellation by a different angle in each OFDM symbol, this rotation is the same for all OFDM symbols.

#### 2.3.5 Shaping OFDM Spectral Leakage

OFDM modulation techniques have been adopted in many wireless standards and are an enabling technology for cognitive radios. However, one of their main disadvantages is spectral leakage, which arises because the signal is a sum of sinusoidal subcarriers windowed by a rectangular function. In addition, some recent OFDM-based standards place very strict requirements on spectral leakage to avoid inter-channel interference. This raises a significant challenge in how to shape the spectrum of the OFDM signal.

#### **Spectrum Leakage Specification of Recent Standards**

In 2009, the FCC issued regulatory rules for reusing the television white space (TVWS) spectrum. IEEE 802.11af was developed under the 802.11 Working Group as a standard that enables a Wi-Fi service in the TVWS spectral regions [76]. The scope of the standard is to define amendments to the high throughput 802.11 PHY and MAC layers to meet the requirements for channel access and coexistence in the TVWS regions. One of the main challenges is the stringent spectral emission mask (SEM) requirements that are mandated by the FCC for these services. There is a large gap between the scaled SEM of high throughput 802.11 and the spectral emission shape required for TVWS [77]. For instance, the 802.11 scaled SEM requires an attenuation of 20 dB at the edge of the channel, whereas the equivalent requirement for portable TV band devices (TVBD) is 55 dB.

In 2010, the IEEE defined a standard for the PHY and MAC layers [78], named IEEE 802.11p, for Dedicated Short-Range Communications (DSRC), the wireless channel for new vehicular safety applications through vehicle-to-vehicle (V2V) and Road to Vehicle (RTV) communications. The PHY in 802.11p is largely inherited from the well-established IEEE 802.11a OFDM PHY, with several changes aimed at improving performance in vehicular environments. The advantage of building on 802.11a is a potentially significant reduction in the cost and effort necessary to develop the new 802.11p hardware and software. It also plays an important role in allowing backwards compatibility from 802.11p to 802.11a [79, 80]. Essentially, three changes are made in IEEE 802.11p [81]. First, 802.11p defines a 10 MHz channel width instead of the 20 MHz used by 802.11a. This extends the guard interval to address the effects of Doppler spread and inter-symbol interference in a VC channel. Secondly, 802.11p defines several improvements in receiver adjacent channel rejection performance to reduce the effect of cross channel interference, which is especially important in dense vehicle communication channels. Finally, 802.11p defines four SEMs corresponding to class A to D operations that are specified in FCC CFR47 Sections 90.377 and 95.1509. These are more stringent than for current 802.11 radios, in order to improve performance in urban vehicle scenarios. In addition, 802.11p operates in the 5.9 GHz DSRC spectrum, which is divided into seven 10 MHz bands. This channelisation allows the MAC upper layer to perform multi-channel operations [82]. The mechanism allows safety and other applications to occupy separate channels to reduce interference. The four strict 802.11p SEMs are defined to reduce the effect of inter-channel interference (ICI) between the channels. Wu et al. [83] showed that transmitters on adjacent service channels still cause ICI in the safety channel, even if they satisfy the class C requirement. Shaping 802.11p spectral leakage is thus potentially important in helping to eliminate ICI.

#### **Dynamic Channel Requirements**

For static wireless devices operating in licensed spectral regions, the characteristics of communication systems that are licensed to occupy adjacent bands may be known. Hence, spectral leakage masks for ICI avoidance in neighbouring systems can be statically specified. However, in the case of shared and reused spectrum, the authors of [84] suggest that SEMs should be defined in a more general and flexible way. In other words, cognitive radios (CRs) operating in dynamic spectrum access (DSA) environments must adapt their transmission SEMs to their current operating region, which is another argument for baseband digital filtering. More sophisticated examples of SEMs are studied in [85], which deals with the broader concept of dynamic SEMs.

Time-varying SEMs may also need to consider that neighbouring systems are themselves able to change their SEMs, and hence, through negotiation with each other, change their masks for in-band and out-of-band emission levels separately in accordance with their mutual temporal variations (for example, to adapt to communications traffic density or spatial deployment density). In such future systems, an SEM defined by the regulator may simply be a starting point in a collaborative process in which neighbouring communication systems negotiate and renegotiate new SEMs as their status changes (e.g. to optimise computational power or increase throughput).

Unfortunately, the authors of [85] did not present a complete solution for dynamic filtering of spectral emissions; however, they discussed deactivating subcarriers or changing transmission power to satisfy the requirements of a dynamic SEM. The former solution leads to a reduction in throughput due to reduced spectral band occupancy, whereas the latter impacts range. A combined approach was presented in [86], where some subcarriers are reduced in power instead of being deactivated, in order to reduce spectral leakage to adjacent channels.

#### **Filtering in OFDM Implementation**

Modern OFDM implementations tend to favour subsuming as much processing as possible within the baseband digital components, in order to simplify the front-end RF hardware. Although many alternative transmitter designs exist, direct up-conversion (DUC) architectures are commonly selected due to inherent implementation, cost and performance advantages [87]. Within the transmitter, orthogonal intermediate frequency (IF) signals are generated directly by digital baseband hardware, high-pass filtered and then quadrature up-converted to RF for transmission. This contrasts with more traditional digital radio implementations in which the digital hardware generates baseband signals which are up-converted to IF in one or more analogue steps before conversion to RF. Those systems would perform channel filtering predominantly with analogue filters, which require discrete precision components, and which tend to be inflexible in terms of carrier frequency and other characteristics. For cognitive radio (CR) systems, where frequency agility is a requirement, and in SDR (software defined radio), both up-conversion to IF as well as channelisation filtering are performed in the digital domain [88, 89]. Typically, this enables a relaxation of stringent IF and RF filtering requirements, which in turn allows a reduction in system cost, though requiring more complex signal processing. A further advantage of baseband filtering is agility and flexibility. In a CR context in particular, both channel and time agility are required, and this can be best achieved in the digital domain.

Within the baseband, OFDM symbols are constructed in the frequency domain and then transformed to a complex time domain representation through the IFFT. A critically sampled construction process requires a sample rate of double the signal bandwidth. This signal is upconverted to IF using an interpolation process, during which images of the original OFDM frequency response are created at integer multiples of the original sampling rate. Image rejection filtering must then be performed on this signal prior to being output by the digital-to-analogue converters (DACs) and subsequent transmission, since the images lie out-of-band (OOB) and hence are a cause of ICI.
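The creation of images during interpolation can be demonstrated with the zero-insertion stage of an interpolator, before any image rejection filter is applied. In the sketch below, the function `zero_stuff`, the interpolation factor of 4 and the two-tone test signal are assumptions for illustration. The spectrum of the zero-stuffed signal is the baseband spectrum repeated every N bins, i.e. at integer multiples of the original sampling rate.

```python
import cmath

def dft(x):
    """N-point DFT, matching Equ. (2.6)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def zero_stuff(x, factor):
    """Insert factor - 1 zeros after every sample (first stage of an interpolator)."""
    out = [0j] * (len(x) * factor)
    out[::factor] = x
    return out

N, factor = 8, 4
# Test signal with energy on subcarriers 1 and 3 only.
x = [cmath.exp(2j * cmath.pi * 1 * n / N) + 0.5 * cmath.exp(2j * cmath.pi * 3 * n / N)
     for n in range(N)]

X = dft(x)                          # baseband spectrum
X_up = dft(zero_stuff(x, factor))   # spectrum at the higher sample rate

images_match = all(abs(X_up[k] - X[k % N]) < 1e-9 for k in range(N * factor))
occupied_bins = [k for k in range(N * factor) if abs(X_up[k]) > 1e-6]
print(images_match, occupied_bins)
```

Only the copy in the lowest N bins is wanted; the remaining occupied bins are the images that the image rejection filter has to suppress.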

However, any such filtering induces time-domain smearing of the transmitted signals [90], which adds to the similar effects caused by the channel impulse response (CIR) between transmitter and receiver, all of which potentially induce inter-symbol interference (ISI).

OFDM systems combat ISI by dividing information finely in the frequency domain across sub-channels to implement narrow (in frequency) but long (in time) transmitted symbols, and then providing a guard interval between successive symbol blocks. The guard interval is determined by the duration of the expected channel and filter impulse responses that are traversed by each symbol on the path from transmitter baseband to receiver baseband. The guard interval typically contains a cyclic prefix (CP) which is inserted to combat another cause of ISI: received frequency components ringing during the CIR due to the abrupt onset of modulation at the beginning of each symbol. The nearly rectangular OFDM symbols in the time domain naturally have a frequency domain response consisting of overlapping sinc shapes, complete with large side lobes that lie outside the main frequency channel. These are another source of OOB interference which contributes to ICI. As noted previously, both 802.11p and 802.11af (in common with most OFDM-based standards) specify an SEM which requires that ICI is controlled.
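The size of the side lobes of a rectangular symbol can be estimated by evaluating the spectrum of a length-N rectangular window at fractional subcarrier offsets. In the sketch below, the function `window_response_db`, the value N = 64 and the chosen offsets are assumptions for illustration, and the CP and any pulse shaping are ignored. The response has deep nulls at integer offsets, which is the orthogonality property, but between the nulls it decays slowly.

```python
import cmath
import math

def window_response_db(f_bins, N=64):
    """Spectrum magnitude of a length-N rectangular window, in dB relative to its
    peak, evaluated f_bins subcarrier spacings away from the subcarrier centre."""
    acc = sum(cmath.exp(-2j * cmath.pi * f_bins * n / N) for n in range(N))
    mag = abs(acc) / N
    return 20 * math.log10(max(mag, 1e-15))  # clamp to avoid log of zero

first_sidelobe = window_response_db(1.5)   # roughly -13 dB
far_sidelobe = window_response_db(10.5)    # roughly -30 dB
null = window_response_db(2.0)             # deep null at a subcarrier centre
print(round(first_sidelobe, 1), round(far_sidelobe, 1), null < -100)
```

Even ten subcarrier spacings away from a subcarrier, the leakage of that subcarrier is only about 30 dB below its peak, which gives an indication of why additional spectral shaping is needed to meet masks that require attenuation of the order of 55 dB at the channel edge.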

#### 2.4 Summary

Reducing power dissipation has become a crucial issue in wireless communication systems, especially for portable devices. In this chapter, the power consumption of FPGA systems is discussed. The Xilinx power estimation and analysis tools are also studied, and they are employed in this research to evaluate the power dissipation of the systems under study for power optimisation. Some low-power design strategies are suggested. This chapter also provides the background of OFDM in terms of its mathematical representation and functionality, and the advantages and limitations of OFDM are discussed. The concept of an MSCR based on OFDM techniques is presented, and the challenges of implementing the MSCR system are introduced with regard to its architecture and performance. The effects of synchronisation on OFDM performance are also considered. Last but not least, the challenges of OFDM spectral leakage are discussed in light of the strict requirements imposed by recent wireless standards.

# Multiplierless Correlator Design for low-power systems

# A Method for OFDM Timing Synchronisation

# A CFO Estimation Method for OFDM Synchronisation

## Chapter 6

### **A Spectrum Efficient Shaping Method**

#### **Chapter 7**

## A Novel Architecture for Multiple Standard Cognitive Radios

# **Chapter 8**

#### **Conclusion and Future Work**

#### References

- [1] J. Mitola and G. Q. Maguire Jr., "Cognitive radio: making software radios more personal," *IEEE Personal Communications*, vol. 6, pp. 13–18, Aug 1999.
- [2] J. Mitola, Cognitive Radio Architecture: The Engineering Foundations of Radio XML. Wiley, 2006.
- [3] A. L. Recio, *Spectrum-Aware Orthogonal Frequency Division Multiplexing*. PhD thesis, Virginia Polytechnic Institute and State University, 2010.
- [4] A. MacKenzie, J. Reed, P. Athanas, C. Bostian, R. M. Buehrer, L. DaSilva, S. Ellingson, Y. Hou, M. Hsiao, J.-M. Park, C. Patterson, S. Raman, and C. da Silva, "Cognitive radio and networking research at Virginia Tech," *Proc. of the IEEE*, vol. 97, pp. 660–688, April 2009.
- [5] C. J. Rieser, T. W. Rondeau, C. Bostian, W. R. Cyre, and T. M. Gallagher, "Cognitive radio engine based on genetic algorithms in a network," 2007.
- [6] T. W. Rondeau, B. Le, C. J. Rieser, and C. W. Bostian, "Cognitive radios with genetic algorithms: Intelligent control of software defined radios," in *Proceeding Software Defined Radio Technical Conference*, 2004.
- [7] Federal Communications Commission, "Spectrum policy task force report," Tech. Rep. No. 02135, 2002.
- [8] A. Ghasemi and E. S. Sousa, "Spectrum sensing in cognitive radio networks: requirements, challenges and design trade-offs," *IEEE Communications Magazine*, vol. 46, no. 4, pp. 32–39, 2008.

- [9] Federal Communications Commission, "Notice of proposed rule making and order: facilitating opportunities for flexible, efficient, and reliable spectrum use employing cognitive radio technologies," Tech. Rep. 03-108, 2005.
- [10] IEEE std.802.11-2005, IEEE Standard 802 Part11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications.
- [11] IEEE std.802.16-2009, IEEE Standard for Local and Metropolitan Area Networks Part16: Air Interface for Fixed Broadband Wireless Access Systems.
- [12] IEEE std.802.22-2011, IEEE Standard for Wireless Regional Area Networks Part22:Cognitive Wireless RAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications: Policies and Procedures for Operation in the TV Bands.
- [13] G. J. Minden, J. B. Evans, L. Searl, D. DePardo, V. R. Petty, R. Rajbanshi, T. Newman, Q. Chen, F. Weidling, J. Guffey, D. Datla, B. Barker, M. Peck, B. Cordill, A. M. Wyglinski, and A. Agah, "KUAR: A flexible software-defined radio development platform," in *Symposium on New Frontiers in Dynamic Spectrum Access Networks (DySPAN)*, (Dublin, Ireland), 17-22 Apr. 2007.
- [14] D. Raychaudhuri, N. B. Mandayam, J. B. Evans, B. J. Ewy, S. Seshan, and P. Steenkiste, "Cognet: An architectural foundation for experimental cognitive radio networks within the future internet," in *Proceeding 1st ACM/IEEE Int. Workshop Mobility Evolv. Internet Architect*, pp. 11–16, 2006.
- [15] Z. Miljanic, I. Seskar, K. Le, and D. Raychaudhuri, "The WINLAB network centric cognitive radio hardware platform," in *Proc. Int. Conf. Cogn. Radio Oriented Wireless Netw. Commun. (CrownCom)*, 2007.
- [16] R. Mukhopadhyay, Y. Park, P. Sen, N. Srirattana, J. Lee, C.-H. Lee, S. Nuttinck, A. Joseph, J. D. Cressler, and J. Laskar, "Reconfigurable RFICs in Si-based technologies for a compact intelligent RF front-end," *IEEE Transactions on Microwave Theory and Techniques*, vol. 53, pp. 81–93, 2005.
- [17] G. Ganesan and Y. G. Li, "Cooperative spectrum sensing in cognitive radio networks," in *Proceedings of IEEE Dynamic Spectrum Access Networks (DySPAN)*, 2005.

- [18] K. R. Chowdhury and I. F. Akyildiz, "Cognitive wireless mesh networks with dynamic spectrum access," *IEEE Journal on Selected Areas in Communications*, vol. 26, pp. 168–181, 2008.
- [19] H. So, A. Tkachenko, and R. W. Brodersen, "A unified hardware/software runtime environment for FPGA based reconfigurable computers using BORPH," *ACM Transactions on Embedded Computing Systems*, vol. 7, no. 2, 2008.
- [20] S. M. Mishra, A. Sahai, and R. W. Brodersen, "Cooperative Sensing among Cognitive Radios," in *IEEE International Conference on Communications (ICC '06)*, 2006.
- [21] P. Sutton, L. Doyle, and K. Nolan, "A reconfigurable platform for cognitive networks," in *Proceedings of the 1st International Conference on Cognitive Radio Oriented Wireless Networks and Communications (CROWNCOM 2006)*, 2006.
- [22] P. Sutton, J. Lotze, H. Lahlou, S. Fahmy, K. Nolan, B. Özgül, T. Rondeau, J. Noguera, and L. Doyle, "Iris: an architecture for cognitive radio networking testbeds," *IEEE Communications Magazine*, vol. 48, pp. 114–122, Sep 2010.
- [23] S. Fahmy, J. Lotze, J. Noguera, L. Doyle, and R. Esser, "Generic software framework for adaptive applications on FPGAs," in *Proceedings of IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM)*, pp. 55–62, 2009.
- [24] J. Lotze, S. A. Fahmy, J. Noguera, B. Özgül, L. Doyle, and R. Esser, "Development framework for implementing FPGA-based cognitive network nodes," in *Proceedings of the IEEE Global Communications Conference (GLOBECOM)*, 2009.
- [25] J. van de Belt, P. D. Sutton, and L. E. Doyle, "Accelerating software radio: Iris on the Zynq SoC," in *IEEE 21st International Conference on Very Large Scale Integration (VLSI-SoC)*, 2013.
- [26] K. Amiri *et al.*, "WARP, a unified wireless network testbed for education and research," in *IEEE International Conference on Microelectronic Systems Education (MSE)*, (San Diego, CA, USA), 3–4 June 2007.

- [27] Z. Lin and P. Chow, "ZCluster: A Zynq-based Hadoop cluster," in *International Conference on Field-Programmable Technology (FPT)*, pp. 450–453, Dec 2013.
- [28] K. Tan, H. Liu, J. Zhang, Y. Zhang, J. Fang, and G. M. Voelker, "Sora: high-performance software radio using general-purpose multi-core processors," *Communications of the ACM*, vol. 54, no. 1, pp. 99–107, 2011.
- [29] Free Software Foundation, Inc., "GNU Radio The GNU Software Radio," 2009.
- [30] Free Software Foundation, Inc., "GNU Radio Companion," 2009.
- [31] R. Marlow, C. Dobson, and P. Athanas, "An Enhanced and Embedded GNU Radio Flow," in *24th International Conference on Field Programmable Logic and Applications (FPL)*, 2014.
- [32] A. Love, W. Zha, and P. Athanas, "In pursuit of instant gratification for FPGA design," in *23rd International Conference on Field Programmable Logic and Applications (FPL)*, 2013.
- [33] I. Kuon, R. Tessier, and J. Rose, "FPGA architecture: Survey and challenges," *Found. Trends Electron. Des. Autom.*, vol. 2, pp. 135–253, Feb. 2008.
- [34] P. Coussy and A. Morawiec, eds., *High-Level Synthesis: From Algorithm to Digital Circuit*. Springer, 2008 ed., Oct. 2008.
- [35] Xilinx Inc., UG470: 7 Series FPGAs Configuration User Guide, Jan. 2013.
- [36] Xilinx Inc., DS190: Zynq-7000 All Programmable SoC Overview, Mar. 2013.
- [37] Xilinx Inc., UG585: Zynq-7000 All Programmable SoC Technical Reference Manual, Mar. 2013.
- [38] P. Sedcole, B. Blodget, T. Becker, J. Anderson, and P. Lysaght, "Modular dynamic reconfiguration in Virtex FPGAs," *IEE Proceedings Computers and Digital Techniques*, vol. 153, pp. 157–164, May 2006.
- [39] E. McDonald, "Runtime FPGA partial reconfiguration," in *IEEE Aerospace Conference*, pp. 1–7, (Big Sky, MT), Mar. 2008.

- [40] J.-P. Delahaye, J. Palicot, C. Moy, and P. Leray, "Partial reconfiguration of FPGAs for dynamical reconfiguration of a software radio platform," in *Proceedings of IST Mobile and Wireless Communications Summit*, (Budapest), July 2007.
- [41] J. Delorme, J. Martin, A. Nafkha, C. Moy, F. Clermidy, P. Leray, and J. Palicot, "A FPGA partial reconfiguration design approach for cognitive radio based on NoC architecture," in *Joint 6th International IEEE Northeast Workshop on Circuits and Systems and TAISA Conference (NEWCAS-TAISA)*, pp. 355–358, June 2008.
- [42] K. Vipin and S. Fahmy, "A high speed open source controller for FPGA Partial Reconfiguration," in *International Conference on Field-Programmable Technology (FPT)*, pp. 61–66, Dec 2012.
- [43] K. Vipin and S. A. Fahmy, "Automated partitioning for partial reconfiguration design of adaptive systems," in *Proceedings of the Reconfigurable Architectures Workshop (RAW)*, (Boston, MA), May 2013.
- [44] C. Dobson, K. Rooks, and P. Athanas, "A Power-Efficient FPGA-Based Self-Adaptive Software Defined Radio," in *24th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS)*, 2014.
- [45] A. Raghunathan, N. K. Jha, and S. Dey, *High-level power analysis and optimization*. Boston: Kluwer Academic, 1998.
- [46] L. Shang, A. S. Kaviani, and B. Kusuma, "Dynamic power consumption in Virtex-II FPGA family," in *ACM Int. Symposium on FPGAs*, pp. 157–164, ACM Press, 2002.
- [47] J. Anderson and F. Najm, "Interconnect capacitance estimation for FPGAs," in *Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC 2004)*, pp. 713–718, 2004.
- [48] J. Anderson and F. Najm, "Power estimation techniques for FPGAs," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 12, no. 10, pp. 1015–1027, 2004.
- [49] E. Todorovich, E. Boemo, F. Angarita, and J. Valls, "Statistical power estimation for FPGAs," in *International Conference on Field Programmable Logic and Applications*, pp. 515–518, 2005.

- [50] A. Reimer, A. Schulz, and W. Nebel, "Modelling macromodules for high-level dynamic power estimation of FPGA-based digital designs," in *Proceedings of the 2006 International Symposium on Low Power Electronics and Design (ISLPED '06)*, pp. 151–154, 2006.
- [51] K. Danckaert, K. Masselos, F. Catthoor, H. De Man, and C. Goutis, "Strategy for power-efficient design of parallel systems," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 7, no. 2, pp. 258–265, Jun 1999.
- [52] L. Varga, G. Hosszu, and F. Kovacs, "A low-power design technique for digital signal processing applications," in *10th Mediterranean Electrotechnical Conference (MELECON 2000)*, vol. II, 2000.
- [53] P. P. Czapski and A. Sluzek, "Power optimization techniques in FPGA devices: A combination of system and low levels," *International Journal of Electrical, Computer and Systems Engineering*, vol. 1, no. 3, pp. 148–154, 2007.
- [54] Q. Liu, G. Constantinides, K. Masselos, and P. Cheung, "Data-reuse exploration under an on-chip memory constraint for low-power FPGA-based systems," *IET Computers & Digital Techniques*, vol. 3, no. 3, pp. 235–246, 2009.
- [55] S. Ahuja, *High Level Power Estimation and Reduction Techniques for Power Aware Hardware Design*. PhD thesis, Virginia Polytechnic Institute and State University, 2010.
- [56] C. Deng, "Power analysis for CMOS/BiCMOS circuits," in *International Symposium on Low Power Electronics and Design*, 1994.
- [57] P. Landman and J. Rabaey, "Activity-sensitive architectural power analysis," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 15, no. 6, pp. 571–587, 1996.
- [58] S. Gupta and F. Najm, "Analytical models for RTL power estimation of combinational and sequential circuits," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 19, pp. 808–814, 2000.

- [59] A. Raghunathan, S. Dey, and N. K. Jha, "High-level macro-modeling and estimation techniques for switching activity and power consumption," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 11, no. 4, pp. 538–557, 2003.
- [60] "XPower Estimator User Guide." Xilinx, San Jose, CA. http://www.xilinx.com/support/documentation/sw\_manuals/xilinx13\_2/ug440.pdf, 2012.
- [61] B. Farhang-Boroujeny, *Signal Processing Techniques for Software Radios*. Lulu Publishing House, 2008.
- [62] C. Dick and F. Harris, "FPGA implementation of an OFDM PHY," in *Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers*, vol. 1, pp. 905–909, Nov. 2003.
- [63] A. Fort, J.-W. Weijers, V. Derudder, W. Eberle, and A. Bourdoux, "A performance and complexity comparison of auto-correlation and cross-correlation for OFDM burst synchronization," in *IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03)*, vol. 2, pp. II-341–4, Apr. 2003.
- [64] K. Wang, J. Singh, and M. Faulkner, "FPGA implementation of an OFDM-WLAN synchronizer," in *Second IEEE International Workshop on Electronic Design, Test and Applications (DELTA 2004)*, pp. 89–94, Jan. 2004.
- [65] T. Schmidl and D. Cox, "Robust frequency and timing synchronization for OFDM," *IEEE Transactions on Communications*, vol. 45, pp. 1613–1621, Dec. 1997.
- [66] C. N. Kishore and V. U. Reddy, "A frame synchronization and frequency offset estimation algorithm for OFDM system and its analysis," *EURASIP Journal on Wireless Communications and Networking*, vol. 2006, pp. 1–16, 2006.
- [67] J. Guffey, A. Wyglinski, and G. Minden, "Agile radio implementation of OFDM physical layer for dynamic spectrum access research," in *IEEE Global Telecommunications Conference (GLOBECOM '07)*, pp. 4051–4055, Nov. 2007.

- [68] Z. Huang, B. Li, and M. Liu, "A proposed timing synchronization method for 802.16e downlink," in *International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)*, pp. 1–4, Dec. 2010.
- [69] A. Recio and P. Athanas, "Physical layer for spectrum-aware reconfigurable OFDM on an FPGA," in *13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools (DSD)*, pp. 321–327, Sept. 2010.
- [70] E.-S. Shim and Y.-H. You, "OFDM integer frequency offset estimator in rapidly time-varying channels," in *Asia-Pacific Conf. on Commun.*, pp. 1–4, 2006.
- [71] M. Morelli and M. Moretti, "Integer frequency offset recovery in OFDM transmissions over selective channels," *IEEE Transactions on Wireless Communications*, vol. 7, pp. 5220–5226, Dec. 2008.
- [72] Y.-H. You and K.-W. Kwon, "Multiplication-Free Estimation of Integer Frequency Offset for OFDM-Based DRM Systems," *IEEE Signal Processing Letters*, vol. 17, pp. 851–854, Oct 2010.
- [73] K. Lee, S.-H. Moon, S. Kim, and I. Lee, "Sequence Designs for Robust Consistent Frequency-Offset Estimation in OFDM Systems," *IEEE Transactions on Vehicular Technology*, vol. 62, pp. 1389–1394, March 2013.
- [74] M. Morelli, L. Marchetti, and M. Moretti, "Integer frequency offset estimation and preamble identification in WiMAX systems," in *IEEE International Conference on Communications (ICC)*, pp. 5610–5615, June 2014.
- [75] A. G. Armada and M. Calvo, "Phase noise and sub-carrier spacing effects on the performance of an OFDM communication system," *IEEE Communications Letters*, vol. 2, 1998.
- [76] "IEEE 802 Standard; Part 11; Amendment 5: Television White Spaces (TVWS) Operation," Dec. 2013.
- [77] S. Shellhammer, A. Sadek, and W. Zhang, "Technical challenges for cognitive radio in the TV white space spectrum," in *Information Theory and Applications Workshop*, pp. 323–333, Feb 2009.

- [78] "IEEE 802 Standard; Part 11; Amendment 6: Wireless Access in Vehicular Environments," Jul. 2010.
- [79] W. Vandenberghe, I. Moerman, and P. Demeester, "Approximation of the IEEE 802.11p standard using commercial off-the-shelf IEEE 802.11a hardware," in *International Conference on ITS Telecommunications (ITST)*, pp. 21–26, 2011.
- [80] J. Fernandez, K. Borries, L. Cheng, B. Kumar, D. Stancil, and F. Bai, "Performance of the 802.11p Physical Layer in Vehicle-to-Vehicle Environments," *IEEE Transactions on Vehicular Technology*, vol. 61, no. 1, pp. 3–14, 2012.
- [81] D. Jiang and L. Delgrossi, "IEEE 802.11p: Towards an International Standard for Wireless Access in Vehicular Environments," in *IEEE Vehicular Technology Conference (VTC)*, pp. 2036–2040, 2008.
- [82] "IEEE Standard for Wireless Access in Vehicular Environments-Multi-Channel Operations," Dec. 2010.
- [83] X. Wu, S. Subramanian, R. Guha, R. White, J. Li, K. Lu, A. Bucceri, and T. Zhang, "Vehicular Communications Using DSRC: Challenges, Enhancements, and Evolution," *IEEE Journal on Selected Areas in Communications*, vol. 31, no. 9, pp. 399–408, 2013.
- [84] I. Macaluso, B. Ozgul, T. Forde, P. Sutton, and L. Doyle, "Spectrum and Energy Efficient Block Edge Mask-Compliant Waveforms for Dynamic Environments," *IEEE Journal on Selected Areas in Communications*, vol. 32, pp. 307–321, February 2014.
- [85] T. Forde, L. Doyle, and B. Ozgul, "Dynamic Block-Edge Masks (BEMs) for Dynamic Spectrum Emission Masks (SEMs)," in *IEEE Symposium on New Frontiers in Dynamic Spectrum*, pp. 1–10, April 2010.
- [86] P. Kryszkiewicz and H. Bogucka, "Dynamic determination of spectrum emission masks in the varying cognitive radio environment," in *IEEE International Conference on Communications (ICC)*, pp. 2733–2737, June 2013.
- [87] C. Masse, "A direct-conversion transmitter for WiMAX and WiBro applications," *RF Design*, vol. 29, no. 1, pp. 42–46, 2006.

- [88] Y.-M. Chen and I.-Y. Kuo, "Design of lowpass filter for digital down converter in OFDM receivers," in *International Conference on Wireless Networks, Communications and Mobile Computing*, vol. 2, pp. 1094–1099, June 2005.
- [89] J. Dowle, S. H. Kuo, K. Mehrotra, and I. V. McLoughlin, "FPGA-based MIMO and space-time processing platform," in *EURASIP Journal of Applied Signal Processing*, no. 34653, Jan. 2006.
- [90] M. Faulkner, "The effect of filtering on the performance of OFDM systems," *IEEE Transactions on Vehicular Technology*, vol. 49, pp. 1877–1884, Sep 2000.