

WP458 (v2.0) October 29, 2015

# Leveraging UltraScale Architecture Transceivers for High-Speed Serial I/O Connectivity

By: Brandon Jiao

Xilinx® UltraScale™ architecture-based transceivers deliver real value to the designer through their unprecedented synergy of leading-edge hardware and interconnect IP.

#### **ABSTRACT**

Encompassing both UltraScale FPGAs (20nm) and UltraScale+ MPSoCs and FPGAs (16nm), the UltraScale architecture is a revolutionary approach to creating programmable devices capable of addressing the massive I/O and memory bandwidth requirements of next-generation applications, while efficiently routing and processing the data brought on-chip. UltraScale and UltraScale+™ devices address a vast spectrum of high-bandwidth, high-utilization system requirements through industry-leading technical innovations. UltraScale and UltraScale+ devices share many building blocks to provide optimized scalability across the product range, as well as numerous new power reduction features for low total power consumption.

For years, programmable devices with high-speed serial I/O have been a clear choice for systems requiring bandwidth, density, performance, flexibility, and cost effectiveness. Designed to scale from 20nm to 16nm, UltraScale and UltraScale+ FPGAs and MPSoCs are equipped with a portfolio of transceivers suitable for a range of applications, from commercial video displays to ultrahigh bandwidth wired telecommunications backplanes and optical interfaces.

This white paper gives an overview of the UltraScale architecture's transceivers and elaborates on the industry-leading features that help system designers to:

- Increase system performance
- Improve system margin and robustness
- Reduce BOM cost
- Reduce total power consumption
- Accelerate productivity

<sup>©</sup> Copyright 2014–2015 Xilinx, Inc. Xilinx, the Xilinx logo, Artix, ISE, Kintex, Spartan, Virtex, Vivado, Zynq, and other designated brands included herein are trademarks of Xilinx in the United States and other countries. PCI, PCIe, and PCI Express are trademarks of PCI-SIG and used under license. CPRI is a trademark of Siemens AG. All other trademarks are the property of their respective owners.



## **Bandwidth Growth and High-Speed Serial Interfaces**

Global IP traffic has increased eightfold over the past five years, and is likely to increase threefold over the next five years. The number of devices connected to IP networks is estimated to be nearly three times as high as the global population in 2018. The Cisco Visual Network Index (VNI) tracks and forecasts bandwidth growth. Global IP traffic is projected by VNI to reach 132 Exabytes (EB) per month in 2018, as shown in Figure 1.



Figure 1: Global IP Traffic Forecast from Cisco VNI for 2013–2018

Electronic systems must scale with this bandwidth demand. The challenge faced by system designers is not only processing the incoming data but managing the data flow in and out of semiconductor devices. Because the number of pins on these devices does not scale proportionally with transistor count and logic capacity, each pin must allow more traffic to accommodate more total traffic. With the advantages of fewer pins, flexible system clocking, lower cost, and lower EMI, high-speed serial interfaces are ideal for managing traffic on- and off-chip.

For systems that need high bandwidth, high density, high performance, design flexibility, and low cost, programmable devices have been a clear choice for decades. Accordingly, high-speed serial interfaces are considered a must-have feature for these devices.

## **Serial Connectivity Application Classifications**

In reference to connectivity, applications for serial transceivers can be categorized as the following:

- Serial backplanes
- Optical interfaces
- Host and peer communication
- Chip-to-chip communication



Applications face different design challenges depending on transceiver connectivity and the performance target, resulting in different criteria for transceiver selection. For example, for a low-performance chip-to-chip application, the most important criteria are protocol efficiency and ease of design and integration. For a high-performance backplane system, signal integrity, system robustness, and IP support are the most critical.

## UltraScale Architecture Transceiver Portfolio

The UltraScale architecture is scalable and optimized to allow all programmable logic elements to target applications with IP portability from family to family. The transceiver portfolio extends this concept of scalability and portability. With line rates of 500Mb/s to 32.75Gb/s, the portfolio delivers a performance range that more than doubles that of any FPGA solution from previous generations. The UltraScale architecture transceiver portfolio is described in Table 1.

Table 1: UltraScale Architecture Transceiver Portfolio

| Transceiver | Devices | Description |
|-------------|---------|-------------|
| GTY | Virtex® UltraScale<br>Virtex UltraScale+<sup>(1)</sup><br>Kintex® UltraScale<sup>(2)</sup><br>Kintex UltraScale+<br>Zynq® UltraScale+ | The GTY transceiver provides unprecedented levels of performance. It is ideal for a wide range of applications, such as 400+Gb/s systems, large-scale emulation, and high-performance computing.<br>The GTY transceiver in UltraScale devices (20nm) supports line rates from 500Mb/s to 30.5Gb/s. It provides high performance for optical and backplane applications, and 30G transceivers for chip-to-chip, chip-to-optics, and 28G backplanes.<br>The GTY transceiver in UltraScale+ devices (16nm) supports line rates from 500Mb/s to 32.75Gb/s. It provides maximum performance for the fastest optical and backplane applications, and 33G transceivers for chip-to-chip, chip-to-optics, and 28G backplanes. |
| GTH | Virtex UltraScale<br>Kintex UltraScale<br>Kintex UltraScale+<br>Zynq UltraScale+ | Supporting line rates from 500Mb/s to 16.375Gb/s, the GTH transceiver is optimized for low power and high performance for high-loss backplane channels. |
| GTR | Zynq UltraScale+ | The GTR transceiver supports integration of five common protocols to the Processing System (PS) in Zynq UltraScale+ MPSoCs. |

#### Notes:

- 1. GTY up to 16.375Gb/s. Refer to data sheet for details.
- 2. GTY up to 16.375Gb/s. Refer to data sheet for details.



The transceiver architecture in UltraScale and UltraScale+ devices comprises:

- Physical Medium Attachment Sublayer (PMA)
  - The PMA includes a serial/parallel interface (PISO, SIPO), phase-locked loop (PLL), clock data recovery (CDR), pre-emphasis, and equalization blocks.
- Physical Coding Sublayer (PCS)
  - The PCS contains logic to process parallel data and includes FIFO, coding/decoding, and gearbox functionality.
- Package
- FPGA logic interface

Figure 2 illustrates an abstract diagram of a transceiver channel.



Figure 2: GTH Transceiver Channel Architecture in DFE Mode

The implementation of the blocks in each channel is different based on connectivity and performance requirements. <u>UG576</u>, the *UltraScale Architecture GTH Transceiver User Guide*, contains recommended use modes for the protocols listed in data sheets <u>DS892</u>, *Kintex UltraScale FPGAs Data Sheet: DC and AC Switching Characteristics* and <u>DS893</u>, *Virtex UltraScale FPGAs Data Sheet: DC and AC Switching Characteristics*. <u>UG578</u>, the *UltraScale Architecture GTY Transceiver User Guide*, contains recommended use modes for the protocols also listed in <u>DS892</u> and <u>DS893</u>. The transceiver wizard provides the recommended settings for optimal usage of protocol-specific characteristics.

The PS-GTR is a new type of transceiver offered in Zynq UltraScale+ MPSoCs. The use modes supported by PS-GTR are restricted to those shown in Table 2.

| Standard              | Data Rate (Gb/s) | Configuration |
|-----------------------|------------------|---------------|
| PCIe® 2.0             | 2.5, 5.0         | x1, x2, x4    |
| SATA 3.1              | 1.5, 3.0, 6.0    | x2            |
| DisplayPort (TX only) | 1.62, 2.7, 5.4   | x1, x2        |
| USB 3.0               | 5.0              | x2            |
| SGMII                 | 1.25             | x4            |

Table 2: Zynq UltraScale+ MPSoC PS-GTR Transceiver Protocol List

## Key Enablers in UltraScale Architecture Transceivers

When choosing a device for a serial-based design, developers must consider a diverse set of criteria. The UltraScale architecture silicon leads the industry in line rates, aggregate bandwidth, signal integrity, and easy link tuning. Together with the intuitive design tools, broad IP support, and plentiful resources, the UltraScale architecture transceivers provide an unprecedented programmable device transceiver solution.

#### Line Rate, Density, and System Bandwidth

The transceiver offerings cover the gamut of today's high-speed protocols. The GTH and GTY transceivers provide the low jitter required for demanding optical interconnects and feature world-class auto-adaptive equalization with PCS features required for difficult backplane operation. Table 3 shows the range of support within each device family.

| Device Family      | Type           | Max<br>Performance     | Max<br>Transceivers <sup>(1)</sup> | Peak<br>Bandwidth <sup>(2)</sup> |
|--------------------|----------------|------------------------|------------------------------------|----------------------------------|
| Virtex UltraScale+ | GTY            | 32.75Gb/s              | 128                                | 8,384Gb/s                        |
| Kintex UltraScale+ | GTH/GTY        | 16.3 / 32.75Gb/s       | 44/32                              | 3,530Gb/s                        |
| Zynq UltraScale+   | PS-GTR/GTH/GTY | 6.0 / 16.3 / 32.75Gb/s | 4/44/28                            | 3,316Gb/s                        |
| Virtex UltraScale  | GTH/GTY        | 16.3 / 30.5Gb/s        | 60/60                              | 5,616Gb/s                        |
| Kintex UltraScale  | GTH            | 16.3Gb/s               | 64                                 | 2,086Gb/s                        |

Table 3: Range of Support within UltraScale and UltraScale+ Families

#### Notes:

- 1. Max transceiver count found across multiple device families
- 2. Combined transmit and receive
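
The Peak Bandwidth column in Table 3 appears to be each transceiver type's maximum line rate multiplied by its maximum count, summed over the types in the family and doubled for combined transmit and receive. The following Python sketch reproduces that arithmetic. The formula is inferred from the table values rather than quoted from a data sheet, and the dictionary layout and function name are illustrative only.

```python
# Per-family transceiver mix transcribed from Table 3:
# each entry is (max line rate in Gb/s, max transceiver count).
FAMILIES = {
    "Virtex UltraScale+": [(32.75, 128)],
    "Kintex UltraScale+": [(16.3, 44), (32.75, 32)],
    "Zynq UltraScale+": [(6.0, 4), (16.3, 44), (32.75, 28)],
    "Virtex UltraScale": [(16.3, 60), (30.5, 60)],
    "Kintex UltraScale": [(16.3, 64)],
}


def peak_bandwidth_gbps(transceivers):
    """Sum rate x count over all transceiver types, doubled for TX plus RX."""
    return 2 * sum(rate * count for rate, count in transceivers)


for family, mix in FAMILIES.items():
    print(f"{family}: {peak_bandwidth_gbps(mix):,.1f} Gb/s")
```

Under this assumption, the results agree with the table once the fractional part is dropped (for example, 3,530.4Gb/s for Kintex UltraScale+ against the listed 3,530Gb/s).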

The protocol configurations in the PS-GTR transceiver are restricted to those shown in Table 4.

| Protocol | Lane0  | Lane1  | Lane2  | Lane3  |
|----------|--------|--------|--------|--------|
| PCIe     | PCIe.0 | PCIe.1 | PCIe.2 | PCIe.3 |
| SATA     | SATA0  | SATA1  | SATA0  | SATA1  |
| USB0     | USB0   | USB0   | USB0   |        |
| USB1     |        |        |        | USB1   |
| DP       | DP.1   | DP.0   | DP.1   | DP.0   |
| SGMII0   | SGMII0 |        |        |        |
| SGMII1   |        | SGMII1 |        |        |
| SGMII2   |        |        | SGMII2 |        |
| SGMII3   |        |        |        | SGMII3 |

Table 4: PS-GTR Transceiver Protocol Configuration
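
As a minimal sketch, the lane restrictions in Table 4 can be captured as a lookup and used to screen a proposed lane assignment. The code below checks only whether a protocol is offered on a given lane, as the table shows. Any further constraints (such as lane ordering for multi-lane links) are not modelled, and the names `LANE_OPTIONS` and `invalid_lanes` are hypothetical.

```python
# Lane availability transcribed from Table 4 (protocol -> lanes it may occupy).
LANE_OPTIONS = {
    "PCIe": {0, 1, 2, 3},
    "SATA": {0, 1, 2, 3},
    "USB0": {0, 1, 2},
    "USB1": {3},
    "DP": {0, 1, 2, 3},
    "SGMII0": {0},
    "SGMII1": {1},
    "SGMII2": {2},
    "SGMII3": {3},
}


def invalid_lanes(assignment):
    """Return the lanes whose requested protocol is not offered on that lane."""
    return sorted(
        lane
        for lane, protocol in assignment.items()
        if lane not in LANE_OPTIONS.get(protocol, set())
    )


# USB1 is only offered on Lane3, so placing it on Lane0 is flagged.
print(invalid_lanes({0: "PCIe", 1: "SATA", 2: "USB0", 3: "USB1"}))
print(invalid_lanes({0: "USB1", 3: "SGMII3"}))
```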

## Flexible PLL Selection and Optimal Jitter Performance

The phase-locked loop (PLL) is a fundamental part of the transceiver because it largely contributes to overall transceiver quality:

- The PLL frequency range determines transceiver line rate
- The PLL jitter determines overall jitter performance and affects system margin

The type of oscillator used in the PLL is particularly important. There are two common oscillator choices in the transceiver industry: the ring oscillator and the LC tank oscillator.

- A ring oscillator has a wide frequency range and simple topology; it requires a smaller physical area than an LC tank oscillator. However, its jitter performance at higher line rates makes it a less than optimal choice for high-speed protocols and channels.
- A wide-band LC tank oscillator has excellent jitter performance. However, its topology requires more physical space.

Transceivers in UltraScale architecture-based devices offer a mix of LC tank and ring oscillators to cover a wide variety of applications and connectivity. The GTH and GTY transceivers are implemented as *Quads*. Figure 3 describes the Quad architecture of GTH and GTY transceivers.



Figure 3: UltraScale Architecture GTH and GTY Transceiver Clocking

Each Quad consists of four transceiver channels and contains two LC tank-based Quad PLLs (QPLL) and four ring-based Channel PLLs (CPLL). The clock for each transmitter or receiver can be selected from either a QPLL or the channel's dedicated CPLL. For more flexibility, the reference clock for the PLLs can come from the following clock sources:

- The two dedicated reference clocks within the Quad. Both of the reference clock pins can be used for recovered clock output.
- Two North paths and two South paths to ±2 Quads. This structure provides great flexibility and allows efficient use of all available transceiver channels, meeting various system performance requirements.

Either QPLL can be shared by the serial transceiver channels within the same Quad, but cannot be shared by channels in other Quads. The QPLL0/1 VCO operates within two different frequency bands. Table 5 through Table 7 describe the nominal operating ranges for these bands in GTH and GTY transceivers. Overlapping between 9.8GHz and 13.0GHz enables the GTH and GTY transceivers to support close, though not equal, frequencies within the same Quad.

Table 5: UltraScale Architecture GTH Transceiver QPLL0/1, Nominal Operation Range

| QPLL  | Frequency (GHz) |
|-------|-----------------|
| QPLL0 | 9.8 – 16.375    |
| QPLL1 | 8.0 – 13.0      |

Table 6: UltraScale Device GTY Transceiver QPLL0/1, Nominal Operation Range

| QPLL  | Frequency (GHz) |
|-------|-----------------|
| QPLL0 | 9.8 – 15.25     |
| QPLL1 | 8.0 – 13.0      |

Table 7: UltraScale+ Device GTY Transceiver QPLL0/1, Nominal Operation Range

| QPLL  | Frequency (GHz) |
|-------|-----------------|
| QPLL0 | 9.8 – 16.375    |
| QPLL1 | 8.0 – 13.0      |
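
The overlap between the two bands can be illustrated with a small lookup built from Tables 5 through 7. This is a sketch only: the function takes a VCO frequency, not a line rate (the divider chain between the two is not modelled here), and the dictionary keys and function name are illustrative.

```python
# Nominal VCO ranges in GHz, transcribed from Tables 5 through 7.
QPLL_RANGES = {
    "GTH": {"QPLL0": (9.8, 16.375), "QPLL1": (8.0, 13.0)},
    "GTY (UltraScale)": {"QPLL0": (9.8, 15.25), "QPLL1": (8.0, 13.0)},
    "GTY (UltraScale+)": {"QPLL0": (9.8, 16.375), "QPLL1": (8.0, 13.0)},
}


def qplls_covering(transceiver, vco_ghz):
    """List the QPLLs whose nominal range includes the given VCO frequency."""
    return [
        name
        for name, (low, high) in QPLL_RANGES[transceiver].items()
        if low <= vco_ghz <= high
    ]


# A frequency inside the 9.8GHz to 13.0GHz overlap can be served by either QPLL.
print(qplls_covering("GTH", 10.3125))
print(qplls_covering("GTH", 8.5))
```

Because both QPLLs cover the overlap region, two nearby but unequal VCO frequencies in that region can be generated in the same Quad, one on each QPLL.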

## **Line Rate Coverage: GTH and GTY Transceivers**

The dual-LC based PLL architecture enables the QPLL to provide continuous line rates from 500Mb/s to 16.3Gb/s using UltraScale devices (20nm) with GTH transceivers, and from 500Mb/s to either 30.5Gb/s or 32.75Gb/s using UltraScale (20nm) or UltraScale+ (16nm) devices with GTY transceivers, respectively. Figure 4 and Figure 5 summarize the line rate coverage of UltraScale and UltraScale+ families' GTH and GTY transceivers.



Note 1: /16 for LC is not shown for simplicity


Figure 4: UltraScale Architecture GTH Line Rate Coverage



Note 1: /32 for LC is not shown for simplicity


Figure 5: UltraScale Architecture GTY Line Rate Coverage

### **PS-GTR Transceiver Clocking Architecture**

The reference clock receiver (RCR) has a built-in termination resistance that can be enabled or disabled by a control bit. A total of four reference clock receivers are present. Figure 6 illustrates the PS-GTR transceiver reference clock distribution.



Figure 6: PS-GTR Transceiver Reference Clock Distribution

The reference clock frequencies required to support various protocols are listed in Table 8.

Table 8: Reference Clock Configuration

| Protocol               | Reference Clock Frequency (MHz)                   |
|------------------------|---------------------------------------------------|
| PCIe (multilane)       | 100.0                                             |
| SATA (multicore)       | 125.0                                             |
| USB 3.0                | 19.2/24.0/25.0/26.0/38.4/48.0/50.0/52.0/100.0/104.0 |
| DP (harmonic of 27MHz) | 27.0/54.0/108.0                                   |
| SGMII                  | 125.0                                             |

## Fractional-N PLL Architecture and Application

A fractional-N PLL (fPLL) allows frequency resolution that is a fractional portion of the reference frequency. All LC PLLs in UltraScale architecture-based transceivers use fractional-N architecture (with the exception of the GTH transceiver in 20nm devices). Figure 7 represents the GTY LC PLL architecture.



Figure 7: Transceiver QPLL0/1 Details

The input clock can be divided by a factor of M before it is fed into the phase frequency detector. The feedback divider N determines the VCO multiplication ratio. For line rates below 16.375Gb/s, a fractional-N divider is supported. The effective divide ratio is a combination of the N factor plus a fractional part capable of a fine divide ratio producing up to 24-bit resolution (i.e., frequency resolution of 0.06 ppm). For example, in Figure 7:

- 1. The QPLLn\_FBDIV is set for the generation of the integer part and the SDMnDATA and SDMnWIDTH registers for the fractional part.
- 2. The charge pump converts the logic outputs of the phase frequency detector into analog current signals.
- 3. The loop filter converts the current coming from the charge pump to control voltage and smooths it before feeding to voltage-controlled oscillator (VCO).
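
The quoted resolution can be checked with a short calculation, assuming the fractional part is a 24-bit word added to the integer divider. The symbols $F$ and $N_{\text{eff}}$ are introduced here for illustration and do not appear in Figure 7.

$$N_{\text{eff}} = N + \frac{F}{2^{24}}, \qquad 0 \le F < 2^{24}$$

$$\Delta N_{\text{eff}} = 2^{-24} \approx 5.96 \times 10^{-8} \approx 0.06 \times 10^{-6}$$

Under this assumption, the smallest step in the effective divide ratio is about 0.06 parts per million of one reference-clock multiple, which matches the figure given above.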

As described in the previous section, one reference clock source can be shared by five Quads in total (its own Quad plus two North paths and two South paths to $\pm 2$ Quads), and there are two QPLLs per Quad. Consequently, the fPLL architecture in the QPLLs enables a maximum of ten simultaneous PLL rates from one clock.

Figure 8 depicts an fPLL usage example that supports multiple protocols across five Quads sharing one reference clock. In this example, the use of fPLLs saves four onboard voltage-controlled crystal oscillators (VCXOs) and helps to reduce the bill of materials (BOM) cost and onboard power consumption.



Figure 8: Example fPLL Usage: Multiple Protocols Supported across Five Quads with Single Reference Clock

Figure 9 depicts an fPLL usage example of supporting four line rates with a single reference clock. Using a 155.5625MHz reference clock, the required feedback divider settings for OC-192/STM-64, 10GE, OTU2, and OTU2e protocols can be generated:

- OC-192/STM-64: 9.956Gb/s using feedback divider of 64
- 10GE: 10.3125Gb/s using feedback divider of 66.29168
- OTU2: 10.7Gb/s using feedback divider of 68.78264
- OTU2e: 11.05Gb/s using feedback divider of 71.03254
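
The divider values listed above can be reproduced with a few lines of Python. This sketch assumes the simplified relationship implied by the list, namely that the line rate equals the reference clock multiplied by the effective feedback divider. Splitting the result into an integer part and a 24-bit fractional word is also an assumption for illustration; the function names are hypothetical and do not correspond to wizard or register APIs.

```python
REF_CLK_MHZ = 155.5625  # reference clock from the example above

# Target line rates in Mb/s, as listed in the example above.
LINE_RATES_MBPS = {
    "OC-192/STM-64": 9956.0,
    "10GE": 10312.5,
    "OTU2": 10700.0,
    "OTU2e": 11050.0,
}

FRACTION_BITS = 24  # assumed width of the fractional word


def feedback_divider(line_rate_mbps, ref_clk_mhz=REF_CLK_MHZ):
    """Effective divide ratio, assuming line rate = reference clock x divider."""
    return line_rate_mbps / ref_clk_mhz


def split_divider(ratio, bits=FRACTION_BITS):
    """Split a ratio into an integer part and the nearest fractional word."""
    integer = int(ratio)
    word = round((ratio - integer) * 2**bits)
    if word == 2**bits:  # fraction rounded up to a whole step
        integer, word = integer + 1, 0
    return integer, word


for name, rate in LINE_RATES_MBPS.items():
    ratio = feedback_divider(rate)
    integer, word = split_divider(ratio)
    print(f"{name}: divider {ratio:.5f} -> integer {integer}, fraction word {word}")
```

Only OC-192/STM-64 lands on an integer divider; the other three rates need the fractional part, which is why a single reference clock suffices only with a fractional-N PLL.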



Figure 9: Example fPLL Usage in GTY Transceiver: Four Line Rate Support with Single Reference Clock

## **Equalization for Backplanes and Optics**

When traversing serial links with optics or backplanes, high-speed signals are degraded by impairments in the link, such as insertion loss, reflections, crosstalk, and optical dispersion. As the serial signaling rate increases, channel-driven signal degradation increases. The techniques used to compensate for this, therefore, become critical elements in the design. This compensation is usually called de-emphasis in the transmitter and equalization in the receiver.



To enable creation of a robust high-speed serial link system with these transceivers running at low power, equalization techniques are built into each transceiver based on its targeted line rate and targeted applications. See Table 9.

Table 9: UltraScale Architecture Transceiver Equalization Capabilities

| Feature                        | GTH            | GTY            |
|--------------------------------|----------------|----------------|
| Transmit/Receive Equalization: |                |                |
| TX FIR                         | 3 Taps         | 3 Taps         |
| RX CTLE                        | 3 Stages       | 3 Stages       |
| RX DFE                         | 11 Taps        | 15 Taps        |
| RX EQ Adaptation               | Fully Adaptive | Fully Adaptive |

#### **Transmit De-emphasis**

TX de-emphasis has been used for many years as a simple solution to combat insertion loss. In the UltraScale architecture, all transceivers have a three-tap FIR to implement TX de-emphasis. The three taps include the precursor, main cursor, and post-cursor. Because insertion loss causes high-frequency signal components to lose more energy than low-frequency components, the TX post-cursor de-emphasis suppresses low-frequency components at the transmitter. When a 0-to-1 or 1-to-0 transition occurs, regular (nominal) voltage swing is applied to the symbols that have a new value after the transition. The symbols that keep the same value after the transition, however, use a reduced voltage swing. The precursor de-emphasis is similar to post-cursor. The difference is that the symbol with the high swing is the one *before* the transition rather than *after* the transition. The amount of swing voltage reduction is referred to as the "tap weight" of the de-emphasis.
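
The behavior described above can be sketched as a three-tap FIR applied to NRZ symbols. This is a behavioral illustration only: the tap weights are arbitrary example values rather than device settings, the normalization (main cursor takes whatever swing the other taps leave) is an assumption, and the function name is hypothetical.

```python
def tx_fir(bits, pre=0.0, post=0.25):
    """Apply a three-tap FIR (precursor, main cursor, post-cursor) to NRZ bits.

    `pre` and `post` are tap weights expressed as fractions of full swing.
    The main cursor takes the remainder so the peak output stays at +/-1.
    """
    main = 1.0 - pre - post
    symbols = [1.0 if b else -1.0 for b in bits]
    out = []
    for i, cur in enumerate(symbols):
        # At the sequence edges, treat the missing neighbor as a repeat
        # of the current symbol (no transition).
        prev = symbols[i - 1] if i > 0 else cur
        nxt = symbols[i + 1] if i + 1 < len(symbols) else cur
        out.append(main * cur - post * prev - pre * nxt)
    return out


# Post-cursor only: full swing right after a transition, reduced swing on repeats.
print(tx_fir([0, 1, 1, 1, 0], pre=0.0, post=0.25))
# Precursor only: the full swing moves to the symbol just before a transition.
print(tx_fir([1, 1, 0], pre=0.25, post=0.0))
```

With a post-cursor weight of 0.25, the first symbol after each transition is driven at full swing while repeated symbols drop to half swing, which is the low-frequency suppression the text describes.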

#### **Receive Equalization**

In the GTY and GTH receivers in UltraScale and UltraScale+ devices, a similar equalization technique is used to attenuate the low-frequency component of the signal while boosting the high-frequency component. Illustrated in Figure 10, Continuous Time Linear Equalization (CTLE) at the receiver end is one of the most popular linear equalization techniques.



Figure 10: GTY/GTH Transceivers Three-Stage CTLE

It is well known that linear equalization techniques such as RX CTLE have one major limitation: noise amplification. When noise (such as reflections or crosstalk) is present on the channel, CTLE amplifies the high-frequency noise right along with the data.

Decision Feedback Equalization (DFE) is a proven technique to mitigate inter-symbol interference (ISI) without amplifying noise. It works by directly removing the ISI from previous bits, allowing the current bit to be correctly sampled. The DFE starts with a "decision slicer" to determine whether the current symbol is High or Low. The resulting symbol goes through unit delays and multiplies with the tap weights.

The weighted delayed signals are added together and then subtracted from the input analog signals, as shown in Figure 11.



Figure 11: Simplified DFE Block Diagram

If the tap weights are well selected to cancel the ISI at each following symbol, the result of this feedback loop is able to compensate for as many taps of ISI as the DFE has, as illustrated in Figure 12. Also, the DFE only compensates post-cursor ISI and cannot remove precursor-induced ISI.



Figure 12: Single Bit Response with DFE+CTLE/AGC
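
The feedback loop described above can be illustrated with a small behavioral model. The sketch below assumes an idealized, noise-free channel with two post-cursor ISI terms and DFE tap weights set equal to them; the channel values and function names are made up for illustration and do not represent the GTH or GTY implementation.

```python
def slicer(value):
    """Decide whether a sample is High (+1) or Low (-1)."""
    return 1 if value >= 0 else -1


def channel(symbols, post_cursors):
    """Add post-cursor ISI: each symbol leaks into the symbols that follow it."""
    received = []
    for n, s in enumerate(symbols):
        isi = sum(
            h * symbols[n - k]
            for k, h in enumerate(post_cursors, start=1)
            if n - k >= 0
        )
        received.append(s + isi)
    return received


def dfe(received, tap_weights):
    """Subtract weighted past decisions from each sample before slicing."""
    decisions = []
    for n, r in enumerate(received):
        feedback = sum(
            w * decisions[n - k]
            for k, w in enumerate(tap_weights, start=1)
            if n - k >= 0
        )
        decisions.append(slicer(r - feedback))
    return decisions


symbols = [1, 1, -1, 1, -1, -1, 1, 1, 1, -1]
post_cursors = [0.7, 0.4]  # illustrative ISI, large enough to close the eye

received = channel(symbols, post_cursors)
without_dfe = [slicer(r) for r in received]
with_dfe = dfe(received, post_cursors)

print("errors without DFE:", sum(a != b for a, b in zip(without_dfe, symbols)))
print("errors with DFE:   ", sum(a != b for a, b in zip(with_dfe, symbols)))
```

In this idealized case the slicer alone mis-detects some symbols, while the DFE recovers the sequence because every post-cursor term is cancelled by a matching tap. Precursor ISI would not be removed by this structure.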

#### **Equalization Capabilities for Optics**

The bulk of the optical market continues to focus on 10Gb/s, while the most cutting-edge applications are moving to 100G, 200G, or even 400G options. The most popular SFP+ and XFP optics are supported directly by the UltraScale architecture GTH and GTY transceivers. The latest 100G CFP4 is supported by the UltraScale architecture GTY transceiver. As shown in Table 10, GTY and GTH transceivers in UltraScale and UltraScale+ devices support these cutting-edge optics by providing the lowest transmit jitter and a large receive margin to overcome optical dispersion.



| Optics | Optical Standard | Electrical Standard | Re-timer | UltraScale and<br>UltraScale+<br>Device<br>Support |
|--------|------------------|---------------------|----------|----------------------------------------------------|
| 10G    |                  |                     |          |                                                    |
| SFP+   | SR/LR/ER/misc    | SFF-8431 and others | No       | ✓                                                  |
| XFP    | OTU/OC           | INF8077i and others | Yes      | ✓                                                  |
| 40G    |                  |                     |          |                                                    |
| QSFP   | LR4              | XLPPI               | No       | ✓                                                  |
| 100G   |                  |                     |          |                                                    |
| CFP    | LR4              | CAUI-10             | Gearbox  | ✓                                                  |
| CFP2   | LR4/SR10         | CAUI-10/4           | Yes/No   | ✓                                                  |
| CFP4   | LR4              | CAUI-4              | Yes      | ✓                                                  |
| QSFP28 | LR4              | CAUI-4              | Yes      | ✓                                                  |
| 400G   |                  |                     |          |                                                    |
| CDFP   | N/A              | OIF-CEI-28G-VSR     | No       | ✓                                                  |
Table 10: GTY/GTH Transceiver Equalization Capabilities for Optics

## Adaptive Equalization and On-Chip Eye Scan

For serial systems beyond 3Gb/s, multiple equalization techniques are needed to compensate for signal distortion in the serial link. Each technique offers multiple settings to handle different channel conditions. The matrix of TX de-emphasis, RX DC gain, RX CTLE, and RX DFE settings can easily go to thousands or millions of combinations. For example, for a simple receiver with 4 DC gain settings, 16 CTLE settings, and 3-tap DFE (where each tap has 4 settings), there are 4,096 combinations of settings. For high-quality 10Gb/s transceivers, such as the GTH/GTX transceivers, there are millions of settings for the receiver alone. To assure the highest possible performance, Xilinx FPGAs provide unique *auto-adaptive receiver equalization* built directly in hard logic.
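
The 4,096 figure in the example above follows from multiplying the independent choices, with each of the three DFE taps contributing its own factor of four:

$$\underbrace{4}_{\text{DC gain}} \times \underbrace{16}_{\text{CTLE}} \times \underbrace{4^{3}}_{\text{3-tap DFE}} = 4 \times 16 \times 64 = 4096$$

Each additional tap or setting multiplies the total, so the search space grows exponentially with the number of DFE taps.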

All equalization techniques in the receiver (AGC, CTLE, and DFE) are available in three modes: (1) manual, (2) one-time calibration, and (3) continuous adaptation. Xilinx recommends continuous adaptation. A backplane with multiple slots for line cards is an example where continuous adaptation is critical. Figure 13 shows such a backplane. The transmitters and receivers connect to two line cards respectively. The channel condition varies when the line cards plug into different slots. Usually, different equalization settings must be applied for different slots to ensure a robust serial link. In manual mode, the system designer has to set the receiver equalization settings by hand. With one-time calibration, every time the line card plugs into a different slot, the system designer has to recalibrate. With continuous adaptation, the same transceiver setting can build a robust serial link regardless of the slot because the adaptation adjusts the gain control, CTLE peaking, and DFE, and thus automates what had previously been a manual task for the designer.

Additionally, voltage and temperature changes impact transceiver performance and channel characteristics. Continuous adaptation offers protection for link margin, which manual mode and one-time calibration mode cannot provide.



Figure 13: Multiple Line Card Slots with Backplane

In addition to equalization, all UltraScale architecture GTY and GTH transceivers incorporate non-destructive 2-D eye scan, which captures the post-equalization signal immediately before data recovery takes place at the receiver. It visualizes the effects of equalization and helps determine link margin, thereby accelerating the debug process. Figure 14 shows an example of a 2-D eye scan result for a GTH receiver. The wide-open eye proves that the link under test (16.3Gb/s through a 30-inch TE StradaWhisper Backplane with more than 25dB insertion loss) has sufficient margin.



Figure 14: UltraScale Architecture GTH Transceiver 2-D Eye Scan Sample

The 2-D eye scan can be used in board validation as part of a system debug methodology. The auto-adaptive equalization and the on-chip non-destructive 2-D eye scan can dramatically reduce the workload for system bring-up, maintenance, and debug.



## PS-GTR Equalization and On-Chip Eye Scan

The PS-GTR RX module in UltraScale+ devices supports multiple protocols: USB 3.0, PCIe 2.0, SATA 3.1, and SGMII. The RX data rate ranges from 1.25Gb/s to 6Gb/s. The module supports both DC-coupled and AC-coupled operation, as well as 5000ppm spread-spectrum modulation at a rate of 30–33kHz.

In the RX module, the equalization is implemented as a CTLE, which is fully adaptive.

For on-chip eye scan, the PS-GTR receivers have a replica phase interpolator (PI), which enables the on-chip, non-destructive 2-D eye scan.

## **Design Tools and Resources**

A complete transceiver solution requires more than just silicon to create a user-friendly, reliable, efficient environment. Equally important are simulation model kits, evaluation boards, design tools, and reference designs. Xilinx continues to provide such a platform for UltraScale and UltraScale+ devices.

#### Simulation Model Kits and Evaluation Boards for Signal Integrity Analysis

Because signal integrity is critical, link margin analysis early in the system design phase is usually necessary before selecting a device. Xilinx provides IBIS-AMI model kits for all UltraScale architecture-based transceivers. The IBIS-AMI model kits provide both fast simulation speed (minutes for 1 million bits) and HSPICE-correlated accuracy. They are also fully IBIS 5.0 compatible, making them highly transportable, flexible, and usable. Figure 15 shows an example of a time-domain IBIS-AMI simulation result with Keysight ADS 2015.01.



Figure 15: UltraScale Architecture GTY Transceiver IBIS-AMI Time-Domain Simulation Sample Result



Designers can also validate link margin and perform analysis on real hardware with a Xilinx transceiver evaluation board, leveraging the IBERT GUI with on-chip 2-D eye scan.

## **User-Friendly Design Suite for Productivity**

When signal integrity analysis shows sufficient link margin, an easy-to-use design suite with great flexibility is needed to develop a specific design. Xilinx provides configuration wizards for all UltraScale architecture transceivers to serve both the mainstream user and advanced transceiver expert. A designer can easily select protocols, line rates, and the number of channels from drop-down menus. Within a few clicks, the wizard can complete the rest of the configuration process automatically with the default implementation. Advanced users can customize the design in the configuration wizard for specific system requirements. See Figure 16.



Figure 16: UltraScale Architecture Transceiver IBERT Wizard

## Targeted Design Platforms and Targeted Reference Designs for Fast Development Time

Xilinx development kits provide out-of-the-box design solutions that significantly cut development time and enhance productivity. Targeted Design Platforms (TDPs) go one step further by providing an evaluation board, Vivado® Design Suite, IP cores, reference designs, and FPGA Mezzanine Card (FMC) support, so designers can begin application development immediately. Both development kits and TDPs are available across the Virtex UltraScale and UltraScale+ and Kintex UltraScale and UltraScale+ product families.



Figure 17 illustrates the Virtex® UltraScale™ FPGA VCU108 Evaluation Kit, which is optimized for quickly prototyping applications using Virtex UltraScale FPGAs and provides access to the following features:

- 4 MB RLD3 Component Memory Interface
- Two 4 GB DDR4 Component Memory Interfaces
- 4 x 28 Gb/s CFP2 & QSFP28 Optical Interfaces
- PCIe Gen3 x8
- 2X FPGA Mezzanine Card (FMC) interface for I/O expansion

For more development kits, go to <a href="https://www.xilinx.com/products/boards-and-kits/">www.xilinx.com/products/boards-and-kits/</a>.



Figure 17: Virtex UltraScale FPGA VCU108 Evaluation Kit

## **Broadest Range of Serial IP Protocol Support**

The UltraScale architecture transceivers support a wide range of line rates and the numerous serial protocols that fall within this range. Xilinx and its ecosystem partners, including hundreds of IP design houses, provide a comprehensive serial connectivity IP portfolio for the most popular design protocols, such as Ethernet, PCIe, Interlaken, and OTU. Table 2, page 5 and Table 3, page 5 show examples of Xilinx IP support. For a comprehensive list of protocols addressed by UltraScale architecture transceivers, visit <a href="https://www.xilinx.com/products/technology/high-speed-serial/">www.xilinx.com/products/technology/high-speed-serial/</a>.



## **UltraScale Architecture Transceiver Design Examples**

Figure 18 depicts an example of a 400GE line card with a 4x100GE MAC-to-Interlaken Bridge design showing three different uses of the transceivers on a Virtex UltraScale device to achieve 400G throughput on a single device.



Figure 18: Transceiver Use Case Example: 400GE MAC-to-Interlaken Bridge

- 1. GTY transceivers (16 x 25.78Gb/s) connect directly to CFP4 optics via an OIF VSR channel
- 2. GTY transceivers (20 x 25.78Gb/s) connect to a backplane or a chip-to-chip Interlaken interface
- 3. GTH transceivers (32 x 15Gb/s) connect to HMC
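The lane counts above can be checked against the 400G target with simple arithmetic. The following Python sketch is an illustration only: the `LaneGroup` class is hypothetical, the exact lane rate of 25.78125Gb/s and the 64b/66b coding efficiency are the standard 100GE values assumed here for the "25.78Gb/s" optical lanes, and the HMC group is shown as raw line rate only because its coding overhead is not described in this white paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaneGroup:
    """A group of identical transceiver lanes (hypothetical helper)."""
    name: str
    lanes: int
    line_rate_gbps: float
    coding_efficiency: float = 1.0  # 1.0 means "raw rate only, overhead unknown"

    @property
    def raw_gbps(self) -> float:
        return self.lanes * self.line_rate_gbps

    @property
    def payload_gbps(self) -> float:
        return self.raw_gbps * self.coding_efficiency


# Assumption: 100GE lanes run at 25.78125 Gb/s with 64b/66b line coding.
cfp4 = LaneGroup("GTY to CFP4 optics", lanes=16,
                 line_rate_gbps=25.78125, coding_efficiency=64 / 66)
hmc = LaneGroup("GTH to HMC", lanes=32, line_rate_gbps=15.0)

print(f"{cfp4.name}: raw {cfp4.raw_gbps:.1f} Gb/s, "
      f"payload {cfp4.payload_gbps:.1f} Gb/s")
print(f"{hmc.name}: raw {hmc.raw_gbps:.1f} Gb/s")
```

Under these assumptions, the 16 optical lanes carry 400Gb/s of Ethernet payload, which matches the 4x100GE configuration in Figure 18.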

Figure 19 illustrates a design example for an 8x8 100MHz TD-LTE Remote Radio Unit with a Zynq UltraScale+ MPSoC. The ZU15EG device has four connections running 10.1Gb/s Common Public Radio Interface (CPRI) through optical modules. It also talks to Analog-to-Digital Converters (ADCs) and Digital-to-Analog Converters (DACs) through JESD204B at 12.5Gb/s.



Figure 19: Zynq UltraScale+ MPSoC Wireless Radio Example
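As a rough sanity check on the four CPRI links in this example, the Python sketch below compares the I/Q capacity of the links with the I/Q demand of the radio. Several values are assumptions that do not appear in this white paper: the "10.1Gb/s" rate is taken to be CPRI line rate option 8 (10137.6Mb/s with 64b/66b coding), one of every 16 words in a CPRI basic frame is reserved for control, the 100MHz bandwidth is modeled as five 20MHz LTE carriers sampled at 30.72Msps, and each I and Q sample is 15 bits wide. The function names are hypothetical.

```python
CPRI_LINE_RATE_MBPS = 10137.6  # assumed: CPRI line rate option 8
LINE_CODING = 64 / 66          # assumed: 64b/66b line coding
IQ_WORD_FRACTION = 15 / 16     # assumed: 1 control word per 16-word basic frame


def cpri_iq_capacity_mbps(links: int) -> float:
    """I/Q payload capacity of `links` CPRI links, per direction, in Mb/s."""
    return links * CPRI_LINE_RATE_MBPS * LINE_CODING * IQ_WORD_FRACTION


def antenna_iq_demand_mbps(antennas: int, carriers: int,
                           sample_rate_msps: float, bits_per_component: int) -> float:
    """I/Q bit rate for all antenna-carriers, per direction, in Mb/s."""
    # Factor of 2: each complex sample has an I and a Q component.
    return antennas * carriers * sample_rate_msps * 2 * bits_per_component


capacity = cpri_iq_capacity_mbps(links=4)
# Assumed radio configuration: 8 antennas, 5 x 20MHz carriers, 15-bit I and Q.
demand = antenna_iq_demand_mbps(antennas=8, carriers=5,
                                sample_rate_msps=30.72, bits_per_component=15)

print(f"CPRI I/Q capacity: {capacity:.1f} Mb/s")
print(f"Antenna I/Q demand: {demand:.1f} Mb/s")
```

With these particular assumptions, demand and capacity come out equal at about 36.9Gb/s per direction; a different sample width or carrier plan would change the balance.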

## Summary

The UltraScale architecture offers a portfolio of transceivers to deliver exceptional value to customers with these powerful new features:

- PS-GTR, a new transceiver type in Zynq UltraScale+ MPSoCs
- PCIe 3.0 hardened PCS and PIPE interface for GTY and GTH transceivers in UltraScale devices
- PCIe 4.0 hardened PCS and PIPE interface for GTY and GTH transceivers in UltraScale+ devices
- Async 64B/66B gearbox with built-in latency measurement in UltraScale architecture GTY and GTH transceivers
- Ethernet IEEE Std 1588 and CPRI applications supported by GTY and GTH transceivers
- Two QPLLs (LC Tank) per Quad in UltraScale architecture GTH transceivers
- Two fractional QPLLs per Quad in GTY and GTH transceivers
- AGC, CTLE, and DFE with auto-adaptation
- 16+Gb/s backplane support for GTH transceivers
- 28+Gb/s backplane support for GTY transceivers
- Dedicated output for recovered clock for GTY and GTH transceivers
- New programmable divider in GTY and GTH transceivers simplifies system clocking design
- Power reduction

For more information on UltraScale architecture transceivers, go to: <a href="https://www.xilinx.com/products/technology/high-speed-serial/">www.xilinx.com/products/technology/high-speed-serial/</a>



## **Revision History**

The following table shows the revision history for this document:

| Date       | Version | Description of Revisions                                                                              |
|------------|---------|-------------------------------------------------------------------------------------------------------|
| 10/29/2015 | 2.0     | Major update (data, text, table, and figures) to include UltraScale+ devices (both FPGAs and MPSoCs). |
| 04/21/2015 | 1.1     | Updated Figure 3, Figure 4, Fractional-N PLL Architecture and Application, Figure 10, and Figure 20.  |
| 02/08/2015 | 1.0.1   | Updated Figure 10.                                                                                    |
| 12/17/2014 | 1.0     | Initial Xilinx release.                                                                               |

## Disclaimer

The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use of Xilinx products. To the maximum extent permitted by applicable law: (1) Materials are made available "AS IS" and with all faults, Xilinx hereby DISCLAIMS ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related to, arising under, or in connection with, the Materials (including your use of the Materials), including for any direct, indirect, special, incidental, or consequential loss or damage (including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same. Xilinx assumes no obligation to correct any errors contained in the Materials or to notify you of updates to the Materials or to product specifications. You may not reproduce, modify, distribute, or publicly display the Materials without prior written consent. Certain products are subject to the terms and conditions of Xilinx's limited warranty, please refer to Xilinx's Terms of Sale which can be viewed at http://www.xilinx.com/legal.htm#tos; IP cores may be subject to warranty and support terms contained in a license issued to you by Xilinx. Xilinx products are not designed or intended to be fail-safe or for use in any application requiring fail-safe performance; you assume sole risk and liability for use of Xilinx products in such critical applications, please refer to Xilinx's Terms of Sale which can be viewed at http://www.xilinx.com/legal.htm#tos.

## **Automotive Applications Disclaimer**

XILINX PRODUCTS ARE NOT DESIGNED OR INTENDED TO BE FAIL-SAFE, OR FOR USE IN ANY APPLICATION REQUIRING FAIL-SAFE PERFORMANCE, SUCH AS APPLICATIONS RELATED TO: (I) THE DEPLOYMENT OF AIRBAGS, (II) CONTROL OF A VEHICLE, UNLESS THERE IS A FAIL-SAFE OR REDUNDANCY FEATURE (WHICH DOES NOT INCLUDE USE OF SOFTWARE IN THE XILINX DEVICE TO IMPLEMENT THE REDUNDANCY) AND A WARNING SIGNAL UPON FAILURE TO THE OPERATOR, OR (III) USES THAT COULD LEAD TO DEATH OR PERSONAL INJURY. CUSTOMER ASSUMES THE SOLE RISK AND LIABILITY OF ANY USE OF XILINX PRODUCTS IN SUCH APPLICATIONS.