



a URI / NEU collaboration

# A Scalable Billion Transistor CPU With High IPC

submitted to
International Symposium on Microarchitecture (Micro-35) 2002

student: David Morano

advisors: Professor David Kaeli, Professor Augustus Uht

NUCAR talk 02/06/21

## Outline



- high IPC problems
- Levo solutions
- Levo physical
  - silicon layout
- results
  - bus delay
  - bus spans
  - machine geometry
  - machine IPC
- summary

## high IPC problems



#### microarchitecture can't support high IPC

- not enough parallelism
- limited bets with conditional branch predictions
- limited data prediction
- high interconnection cost
  - centralized resources cause interconnection congestion
- can't scale in space
  - bus lengths
  - propagation delays (RC delay)
  - contention for resources
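The bus-length and RC-delay bullets above can be made concrete with a first-order wire model. The sketch below uses the Elmore approximation for an unbuffered distributed RC wire, delay ≈ 0.5·r·c·L², and compares one long bus against the same span cut into fixed-length segments, each followed by a forwarding stage with a fixed delay. The resistance, capacitance, segment length, and forwarding delay are illustrative assumptions, not figures from the talk.

```python
def wire_delay(r_per_mm, c_per_mm, length_mm):
    """Elmore delay (seconds) of an unbuffered distributed RC wire."""
    return 0.5 * r_per_mm * c_per_mm * length_mm ** 2


def segmented_delay(r_per_mm, c_per_mm, length_mm, segments, stage_delay):
    """Total delay when the wire is cut into equal segments,
    each followed by a forwarding stage with a fixed delay."""
    seg_len = length_mm / segments
    return segments * (wire_delay(r_per_mm, c_per_mm, seg_len) + stage_delay)


# Assumed values: 100 ohm/mm, 0.2 pF/mm, 5 mm segments, 20 ps per stage.
R, C, SEG_MM, T_STAGE = 100.0, 0.2e-12, 5, 20e-12

for length in (5, 10, 20):
    flat = wire_delay(R, C, length)
    seg = segmented_delay(R, C, length, length // SEG_MM, T_STAGE)
    print(f"{length:>2} mm  unsegmented {flat * 1e12:7.1f} ps   segmented {seg * 1e12:7.1f} ps")
```

In this model the unsegmented delay grows with the square of the bus length, while the segmented delay grows linearly and the longest single wire stays at the fixed segment length.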

## Levo solutions



- segmented buses
  - scalable in space
  - limited signal lengths (constant)
- limited data prediction
- instruction predication within the microarchitecture
  - change control dependencies into flow dependencies (like data)
- speculative program flow
  - control, data, memory
  - DEE
- time-tags for dependency enforcement
  - control
  - data (register)
  - memory
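One way to picture time-tag dependency enforcement: every in-flight instruction carries a tag giving its position in program order, and a forwarded result is accepted only by later instructions, and only when it comes from a producer at least as recent as the one last accepted. The sketch below is a simplified software model under that assumption. The class names, field names, and the `snoop` method are hypothetical and are not taken from the Levo design.

```python
from dataclasses import dataclass, field


@dataclass
class Operand:
    name: str            # register or memory address being read
    value: int = 0
    last_tag: int = -1   # time-tag of the producer last accepted


@dataclass
class Station:
    tag: int                                      # position in program order
    sources: list = field(default_factory=list)   # input operands
    reexecute: bool = False                       # set when an input changes

    def snoop(self, name, value, producer_tag):
        """Accept a forwarded result only from the nearest earlier producer seen so far."""
        for src in self.sources:
            if src.name != name:
                continue
            if producer_tag < self.tag and producer_tag >= src.last_tag:
                if value != src.value:
                    self.reexecute = True         # input changed: run again
                src.value = value
                src.last_tag = producer_tag


consumer = Station(tag=5, sources=[Operand("r1")])
consumer.snoop("r1", 10, producer_tag=2)  # accepted: earlier producer
consumer.snoop("r1", 7, producer_tag=1)   # ignored: older than the accepted producer
consumer.snoop("r1", 9, producer_tag=4)   # accepted: nearer earlier producer
consumer.snoop("r1", 3, producer_tag=6)   # ignored: producer is later in program order
print(consumer.sources[0].value, consumer.sources[0].last_tag)
```

The consumer ends up holding the value from the tag-4 producer, whatever order the forwarded results arrive in. The same comparison applies whether the name refers to a register, a memory address, or a predicate.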



## Levo layout (overview)





## Levo layout (more detailed)





## bus delay performance impact



IDEAL F/M geo=8-8-8 R\_M\_P=4\_4\_4



## bus span performance impact



IDEAL F/M geo=8-8-8 R\_M\_P=4\_4\_4







## machine geometry

R\_M\_P=2\_2\_1

## machine IPC





R\_M\_P=2\_2\_1

## summary



- a CPU microarchitecture aimed at a 1 billion transistor budget
- executes a single control-flow thread at higher IPC
- uses a variety of mechanisms to achieve high IPC
  - instruction predication in microarchitecture
  - multipath execution using DEE
  - time-tags for dependency enforcement
  - speculative execution flow: control, data, memory
  - resource flow computing
- scales in space
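The instruction predication listed above replaces a control dependency with a dependency on a predicate value that flows like ordinary data. The sketch below is only a software analogy of that idea: both arms of a branch are computed and a predicate selects the result. Levo is described as doing predication inside the microarchitecture at run time, and nothing here models that hardware. The function names are hypothetical.

```python
def branchy(a, b, x):
    # Control dependency: which assignment runs depends on the branch outcome.
    if a < b:
        x = x + 1
    else:
        x = x - 1
    return x


def predicated(a, b, x):
    # The predicate is computed like any other data value (1 or 0).
    p = int(a < b)
    # Both arms execute unconditionally.
    taken = x + 1
    not_taken = x - 1
    # Arithmetic select: the result depends on p through data flow only.
    return p * taken + (1 - p) * not_taken


for a, b, x in [(1, 2, 10), (2, 1, 10), (3, 3, 0)]:
    print(a, b, x, branchy(a, b, x), predicated(a, b, x))
```

Both functions return the same value for every input. In the predicated form, a wrong guess about `p` only requires the select to be redone once `p` is known, not a change in which instructions are fetched.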