

# ZeBu-Server3 Overview



October 2013

# ZeBu Today

Highest Performance - Lowest cost of ownership

- Scalable architecture & best roadmap
- First emulator to break the billion ASIC-gate barrier (on actual design)
- Parallel & incremental compiler
- Fastest execution speed, up to 5MHz
- Comprehensive debug capabilities with major focus on:
  - System-level Debugging (HW-SW)
  - Fast Debugging (ICE)
  - Off-Line Debugging (Limited debug on line)
  - Tight Integration with Verdi
- Small footprint
- Highest reliability (MTBF > 10 Years for 1 fully loaded unit)







### The Roadmap of ZeBu

- ZeBu-Server3 is based on 28nm Virtex7-2000-T FPGAs
- ZeBu-Server2 was based on 40nm Virtex6-LX760
- ZeBu-Server1 was based on 65nm Virtex5-LX330



<sup>\*</sup> For small configurations – Up to 20-25 V7



## **ZeBu Server-3 Performance Advantages**

# Faster Performance

- Larger FPGAs with fewer hops on critical paths
- Higher bandwidth at module and unit level
- Faster Mux Clocking

# Increased Capacity

- Larger FPGAs
- Higher FPGA filling rates
- More routing resources



#### **ZeBu Server-3 Customer Benchmarks**

### Early Data

| Design/Application | # V7<br>FPGA | ZS-1 user clock frequency | ZS-3 user clock frequency |
|--------------------|--------------|---------------------------|---------------------------|
| Graphics           | 5            | 0.6 MHz                   | 3.35 MHz                  |
| Graphics           | 9            | 0.6 MHz                   | 3.45 MHz                  |
| Mobile processor   | 8            | 0.65 MHz                  | 1.7 MHz                   |
| Mobile processor   | 3            | 1.25 MHz                  | 3 MHz                     |
| Mobile processor   | 5            | 0.85 MHz                  | 2.0 MHz                   |
| Processor          | 9            | 0.75 MHz                  | 1.65 MHz                  |

Some of the ZS3 data above has yet to be fine-tuned for performance.
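To make the early data easier to compare, the sketch below computes the ZS-3 to ZS-1 user clock ratio for each row. The values are copied from the table; the names `BENCHMARKS` and `speedups` are illustrative only. On these six rows the ratio ranges from about 2.2x to 5.75x.

```python
# Benchmark rows copied from the table above:
# (design, number of V7 FPGAs, ZS-1 clock in MHz, ZS-3 clock in MHz)
BENCHMARKS = [
    ("Graphics", 5, 0.6, 3.35),
    ("Graphics", 9, 0.6, 3.45),
    ("Mobile processor", 8, 0.65, 1.7),
    ("Mobile processor", 3, 1.25, 3.0),
    ("Mobile processor", 5, 0.85, 2.0),
    ("Processor", 9, 0.75, 1.65),
]


def speedups(rows):
    """Return the ZS-3 / ZS-1 user clock ratio for each benchmark row."""
    return [zs3 / zs1 for _, _, zs1, zs3 in rows]


if __name__ == "__main__":
    for (design, fpgas, zs1, zs3), ratio in zip(BENCHMARKS, speedups(BENCHMARKS)):
        print(f"{design:<17} {fpgas} V7  {zs1:.2f} -> {zs3:.2f} MHz  x{ratio:.2f}")
```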



# Comparison of Logic Resources for DUT Mapping

- Comparison of Virtex7, 6, 5 resources
  - 1x Virtex7 ~ 5x Virtex5
- Actual capacity values based on filling rates used by FPGA P&R in the ZeBu compiler

|         | REGs       | LUTs       | RAMB<br>(36kbits) | DSPs<br>(25x18mul, add, acc) |
|---------|------------|------------|-------------------|------------------------------|
| VIRTEX7 | 600k (25%) | 720k (60%) | 1,030 (80%)       | 2,160 (80%)                  |
| VIRTEX6 | 230k (23%) | 270k (55%) | 580 (80%)         | 864 (80%)                    |
| VIRTEX5 | 120k (60%) | 140k (70%) | 286 (100%)        | 192 (80%)                    |

NOTE: Above percentages are recommended values but users can try higher values
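The "1x Virtex7 ~ 5x Virtex5" rule of thumb can be checked against the table with a short calculation. This is a sketch using the capacity values listed above; the names `CAPACITY` and `ratio` are illustrative. The factor of about 5 holds for REGs and LUTs, while RAMB and DSP ratios differ.

```python
# Capacity per FPGA as listed in the table above.
CAPACITY = {
    "VIRTEX7": {"REGs": 600_000, "LUTs": 720_000, "RAMB": 1030, "DSPs": 2160},
    "VIRTEX6": {"REGs": 230_000, "LUTs": 270_000, "RAMB": 580, "DSPs": 864},
    "VIRTEX5": {"REGs": 120_000, "LUTs": 140_000, "RAMB": 286, "DSPs": 192},
}


def ratio(newer, older):
    """Return the per-resource capacity ratio newer / older."""
    return {res: CAPACITY[newer][res] / CAPACITY[older][res]
            for res in CAPACITY[newer]}


if __name__ == "__main__":
    for res, value in ratio("VIRTEX7", "VIRTEX5").items():
        print(f"{res}: {value:.2f}x")
```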



# ZeBu-Server3: Hardware Configurations – Single-Chassis Systems

- Same chassis size as ZeBu-Server1, but with:
  - New backplane, new power supply, new power-board, new fans





#### ZeBu-Server3: Hardware Configurations – Multi-Chassis System



- Up to 10 chassis, for a maximum system capacity of 3BG
  - Up to 0.9TB DDR3 DRAM for DUT
  - Up to 49 PCs/Concurrent Users
  - Up to 450 V7 FPGAs for 1 design
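The headline numbers are consistent with the per-module figures in the module comparison table (9 FPGAs, 60MGates, 4x 512MB local and 2x 8GB design memory per module). The sketch below reproduces them under two assumptions that the slides do not state: that "DRAM for DUT" counts local plus design memory but not system memory, and that GB and TB are decimal units.

```python
FPGAS_PER_MODULE = 9        # 9F module
MODULE_CAPACITY_MG = 60     # max capacity per module, in million gates
LOCAL_MEM_GB = 4 * 0.5      # 4x 512MB DDR3 (assumed to count as DUT memory)
DESIGN_MEM_GB = 2 * 8       # 2x 8GB DDR3


def system_totals(total_fpgas=450, chassis=10):
    """Derive module count, capacity and DUT memory for a full system."""
    modules = total_fpgas // FPGAS_PER_MODULE
    return {
        "modules": modules,
        "modules_per_chassis": modules / chassis,
        "capacity_bg": modules * MODULE_CAPACITY_MG / 1000,
        "dut_memory_tb": modules * (LOCAL_MEM_GB + DESIGN_MEM_GB) / 1000,
    }


if __name__ == "__main__":
    print(system_totals())
```

With the defaults this gives 50 modules, 5 per chassis, 3 BG and 0.9 TB.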





## Comparison of Modules in ZS1, ZS2, ZS3

|                             | Max Capacity | Local<br>Mem     | Design<br>Mem  | System<br>Mem  | High<br>Connectivity | ICE |
|-----------------------------|--------------|------------------|----------------|----------------|----------------------|-----|
| ZS3 9F High<br>Connectivity | 60MGates     | 4x 512MB<br>DDR3 | 2x 8GB<br>DDR3 | 2x 8GB<br>DDR3 | Yes                  | No  |
| ZS3 9F-DICE                 | 60MGates     | 4x 512MB<br>DDR3 | 2x 8GB<br>DDR3 | 2x 8GB<br>DDR3 | No                   | Yes |
|                             |              |                  |                |                |                      |     |
| ZS2 17F                     | 40MGates     | N/A              | 2x 2GB<br>DDR2 | 2x 2GB<br>DDR2 | N/A                  | No  |
| ZS1 16C                     | 20MGates     | N/A              | 2x 2GB<br>DDR2 | 2x 2GB<br>DDR2 | N/A                  | No  |

#### ZeBu-Server3 9F Module w/ High Connectivity

- 9 Design FPGAs
- Larger inter-FPGA routing resources for higher performance



| Module<br>Name | DUT<br>LX-2000 FPGA | Max<br>Capacity | Local<br>Mem    | Design<br>Mem | System<br>Mem (*) | ICE |
|----------------|---------------------|-----------------|-----------------|---------------|-------------------|-----|
| 9F             | 9                   | 60MG            | 4x512MB<br>DDR3 | 2x8GB<br>DDR3 | 2x8GB<br>DDR3     | NO  |

(\*) System memory is shared between trace and cache memory






# Inter-FPGA Routing Resources ZS3 9F Module w/ High Connectivity

|    | IF | F0 | F1  | F2 | F3  | F4 | F5  | F6 | F7  | F8 |
|----|----|----|-----|----|-----|----|-----|----|-----|----|
| IF |    | 24 | 24  | 24 | 24  | 24 | 24  | 24 | 24  | 24 |
| F0 | 24 |    | 84  | 84 | 84  |    |     | 36 |     | 24 |
| F1 | 24 | 84 |     | 84 | 132 |    |     |    | 84  | 24 |
| F2 | 24 | 84 | 84  |    | 84  | 84 |     |    |     | 24 |
| F3 | 24 | 84 | 132 | 84 |     |    | 132 |    |     | 24 |
| F4 | 24 |    |     | 84 |     |    | 84  | 84 | 84  | 24 |
| F5 | 24 |    |     |    | 132 | 84 |     | 84 | 132 | 24 |
| F6 | 24 | 36 |     |    |     | 84 | 84  |    | 84  | 24 |
| F7 | 24 |    | 84  |    |     | 84 | 132 | 84 |     | 24 |
| F8 | 24 | 24 | 24  | 24 | 24  | 24 | 24  | 24 | 24  |    |

- LVDS pairs at 1 GBps
- 256 LVDS Pairs on the backplane
- 7 primary low-skew clocks per FPGA
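The routing table can be treated as a symmetric adjacency matrix of LVDS pair counts. The following sketch transcribes it (blank cells and the diagonal become 0), checks that it is symmetric, and sums the pairs available to each FPGA. The names `NAMES`, `HC` and the helper functions are illustrative only.

```python
NAMES = ["IF", "F0", "F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8"]

# LVDS pair counts from the table above; blank cells and the diagonal are 0.
HC = [
    [0, 24, 24, 24, 24, 24, 24, 24, 24, 24],   # IF
    [24, 0, 84, 84, 84, 0, 0, 36, 0, 24],      # F0
    [24, 84, 0, 84, 132, 0, 0, 0, 84, 24],     # F1
    [24, 84, 84, 0, 84, 84, 0, 0, 0, 24],      # F2
    [24, 84, 132, 84, 0, 0, 132, 0, 0, 24],    # F3
    [24, 0, 0, 84, 0, 0, 84, 84, 84, 24],      # F4
    [24, 0, 0, 0, 132, 84, 0, 84, 132, 24],    # F5
    [24, 36, 0, 0, 0, 84, 84, 0, 84, 24],      # F6
    [24, 0, 84, 0, 0, 84, 132, 84, 0, 24],     # F7
    [24, 24, 24, 24, 24, 24, 24, 24, 24, 0],   # F8
]


def is_symmetric(m):
    """True if every link has the same width in both directions."""
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))


def pairs_per_fpga(m):
    """Total LVDS pairs attached to each FPGA (row sums)."""
    return {name: sum(row) for name, row in zip(NAMES, m)}


def total_pairs(m):
    """Total on-module pairs; each link appears twice in the matrix."""
    return sum(map(sum, m)) // 2


if __name__ == "__main__":
    print(is_symmetric(HC), total_pairs(HC))
    print(pairs_per_fpga(HC))
```

On this table F3 and F5 are the best-connected design FPGAs (480 pairs each), which may be useful when deciding where to place the most heavily connected partitions.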



#### ZeBu-Server3 9F-DICE Module

- 9 Design FPGAs + DICE
- ICE is partially compatible with ZSE:
  - Same DirectICE report board, but with a subset of ERNI connectors
  - Pin-out is different for some connectors



| Module<br>Name | DUT<br>LX-2000 FPGA | Max<br>Capacity | Local<br>Mem    | Design<br>Mem | System<br>Mem (*) | ICE           |
|----------------|---------------------|-----------------|-----------------|---------------|-------------------|---------------|
| 9F             | 9                   | 60MG            | 4x512MB<br>DDR3 | 2x8GB<br>DDR3 | 2x8GB<br>DDR3     | 2x284<br>I/Os |

(\*) System memory is shared between trace and cache memory



*[Figure: ZeBu-Server3 9F-DICE module block diagram. Recoverable labels: design FPGAs F0 to F7; MegArray DICE connectors; 2x256MB DDR3 local memories; FC (control FPGA); FS (system FPGA); backplane interface (256 LVDS pairs); 8GB DDR3 system memory (x2, 2 PCB sides); 8GB DDR3 design memory (x2, 2 PCB sides); FM; IF (memory controller + DUT); RTB.]*

#### **ZeBu-Server3 DirectICE Interface**

- Two modules (top) with DirectICE
  - 94 IOs for F3 & F5, 190 IOs for F1 & F7
  - 284 IOs per module (or 568 IOs for 1 module)
  - 25 IOs per 50-pin ERNI connector (as in ZeBu-Server1 and ZeBu-Server2)



# Inter-FPGA Routing Resources ZS3 9F-DICE Module

|    | IF | F0 | F1 | F2 | F3  | F4 | F5  | F6 | F7 | F8 |
|----|----|----|----|----|-----|----|-----|----|----|----|
| IF |    | 24 | 24 | 24 | 24  | 24 | 24  | 24 | 24 | 24 |
| F0 | 24 |    | 84 | 84 | 84  |    |     | 24 |    | 24 |
| F1 | 24 | 84 |    | 84 | 84  |    |     |    | 24 | 24 |
| F2 | 24 | 84 | 84 |    | 84  | 72 |     |    |    | 24 |
| F3 | 24 | 84 | 84 | 84 |     |    | 120 |    |    | 24 |
| F4 | 24 |    |    | 72 |     |    | 84  | 84 | 84 | 24 |
| F5 | 24 |    |    |    | 120 | 84 |     | 84 | 84 | 24 |
| F6 | 24 | 24 |    |    |     | 84 | 84  |    | 84 | 24 |
| F7 | 24 |    | 24 |    |     | 84 | 84  | 84 |    | 24 |
| F8 | 24 | 24 | 24 | 24 | 24  | 24 | 24  | 24 | 24 |    |

- LVDS pairs at 1 GBps
- 256 LVDS Pairs on the backplane
- 7 primary low-skew clocks per FPGA
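The 9F-DICE routing table differs from the High Connectivity table on a handful of links between design FPGAs. The sketch below lists those links and how many LVDS pairs narrower they are on the DICE module. Only the F0 to F7 links are encoded, since the IF and F8 rows are 24 pairs in both tables; the dictionary and function names are illustrative.

```python
# Links between F0..F7 only (upper triangle of each routing table).
HC_LINKS = {
    ("F0", "F1"): 84, ("F0", "F2"): 84, ("F0", "F3"): 84, ("F0", "F6"): 36,
    ("F1", "F2"): 84, ("F1", "F3"): 132, ("F1", "F7"): 84,
    ("F2", "F3"): 84, ("F2", "F4"): 84,
    ("F3", "F5"): 132,
    ("F4", "F5"): 84, ("F4", "F6"): 84, ("F4", "F7"): 84,
    ("F5", "F6"): 84, ("F5", "F7"): 132,
    ("F6", "F7"): 84,
}
DICE_LINKS = {
    ("F0", "F1"): 84, ("F0", "F2"): 84, ("F0", "F3"): 84, ("F0", "F6"): 24,
    ("F1", "F2"): 84, ("F1", "F3"): 84, ("F1", "F7"): 24,
    ("F2", "F3"): 84, ("F2", "F4"): 72,
    ("F3", "F5"): 120,
    ("F4", "F5"): 84, ("F4", "F6"): 84, ("F4", "F7"): 84,
    ("F5", "F6"): 84, ("F5", "F7"): 84,
    ("F6", "F7"): 84,
}


def link_deltas(hc, dice):
    """Return {link: hc_pairs - dice_pairs} for links whose width differs."""
    return {link: hc[link] - dice.get(link, 0)
            for link in hc if hc[link] != dice.get(link, 0)}


if __name__ == "__main__":
    deltas = link_deltas(HC_LINKS, DICE_LINKS)
    for (a, b), lost in sorted(deltas.items()):
        print(f"{a}-{b}: {lost} fewer pairs on 9F-DICE")
    print("total:", sum(deltas.values()))
```

Six links differ, for a total of 192 fewer on-module pairs on the 9F-DICE module; the largest reductions are on F1-F7 (60 pairs) and on F1-F3 and F5-F7 (48 pairs each).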



### ZeBu-Server Compiler

- Fast and automatic compilation of ASIC RTL, including:
  - Fast Synthesis with behavioral capabilities (DPI, \$display, SVA, ...)
  - Clock handling
  - Clustering/Design Distribution
  - Probe and Assertion Insertion
  - Power-aware randomization insertion
  - Timing-Driven Place & Route
- Parallel and incremental multi-core flow
  - Keeps compilation times flat with increasing design size and complexity
  - Typically 5-30MG/hr (on a PC farm); up to 100MG/hr on large multi-core designs!
- New Graphical User Interface
  - Project definition has been rationalized
  - Easier to use
  - Better log/report management
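For rough planning, the quoted throughput can be turned into a compile-time range. This is a back-of-the-envelope sketch that assumes compile time is simply design size divided by throughput; that linear model is an assumption rather than something the compiler guarantees, and the function names are hypothetical.

```python
def compile_hours(design_mgates, rate_mg_per_hr):
    """Estimated hours to compile, assuming time = size / throughput."""
    if rate_mg_per_hr <= 0:
        raise ValueError("throughput must be positive")
    return design_mgates / rate_mg_per_hr


def compile_time_range(design_mgates, slow=5, fast=30):
    """Return (best, worst) hours for the typical 5-30 MG/hr range."""
    return (compile_hours(design_mgates, fast),
            compile_hours(design_mgates, slow))


if __name__ == "__main__":
    best, worst = compile_time_range(60)  # one fully used 60MG module
    print(f"60 MG design: {best:.1f} to {worst:.1f} hours")
```

Under this model a 60MG design would take between 2 and 12 hours at the typical rates.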



