#### Systolic Architectures

- Replace a single processor with a regular array of processing elements (PEs)
- Orchestrate data flow between PEs to achieve high throughput with fewer memory accesses



- Different from pipelining:
  - Nonlinear array structure, multidirectional data flow; each PE may have a (small) local instruction and data memory
- Different from SIMD: each PE may perform a different operation
- Initial motivation: VLSI enables inexpensive special-purpose chips
- Represent algorithms directly by chips connected in a regular pattern



Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/

#2 lec # 1 Spring 2003 3-11-2003

- Processors arranged in a 2-D grid
- Each processor accumulates one element of the product matrix

Alignments in time



T = 1

EECC756 - Shaaban

T = 2

T = 3

T = 4

T = 5

T = 6

T = 7 (done)
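The time steps above can be reproduced with a small simulation. What follows is a minimal Python sketch, not the code behind the slides. It assumes 3×3 operands, which is consistent with the seven steps shown (3n − 2 = 7 for n = 3), and it assumes that rows of A enter from the left and columns of B enter from the top, each delayed by one step per row or column. The function name `systolic_matmul` and its return values are hypothetical, chosen only for illustration.

```python
def systolic_matmul(A, B):
    """Simulate an n x n grid of PEs computing C = A x B.

    Assumed data flow (an illustration, not taken from the slides):
    row i of A enters PE (i, 0) from the left, delayed by i steps;
    column j of B enters PE (0, j) from the top, delayed by j steps.
    Each PE multiplies the pair it holds, adds it to its own C[i][j],
    then passes the A value right and the B value down.

    Returns (C, active_per_step), where active_per_step[t] is the number
    of PEs that performed a multiply-accumulate at step T = t + 1.
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]
    # Operand registers of each PE; None means no data has arrived yet.
    a_reg = [[None] * n for _ in range(n)]
    b_reg = [[None] * n for _ in range(n)]
    active_per_step = []

    for t in range(3 * n - 2):  # t = 0 corresponds to T = 1
        new_a = [[None] * n for _ in range(n)]
        new_b = [[None] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j == 0:
                    k = t - i  # skewed input: row i starts i steps late
                    new_a[i][j] = A[i][k] if 0 <= k < n else None
                else:
                    new_a[i][j] = a_reg[i][j - 1]  # from the left neighbour
                if i == 0:
                    k = t - j  # skewed input: column j starts j steps late
                    new_b[i][j] = B[k][j] if 0 <= k < n else None
                else:
                    new_b[i][j] = b_reg[i - 1][j]  # from the upper neighbour
        a_reg, b_reg = new_a, new_b

        active = 0
        for i in range(n):
            for j in range(n):
                a, b = a_reg[i][j], b_reg[i][j]
                if a is not None and b is not None:
                    C[i][j] += a * b  # PE (i, j) accumulates one element
                    active += 1
        active_per_step.append(active)

    return C, active_per_step


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
C, active = systolic_matmul(A, B)
print(C)       # [[30, 24, 18], [84, 69, 54], [138, 114, 90]]
print(active)  # [1, 3, 6, 7, 6, 3, 1]
```

Under these assumptions, PE (i, j) sees the pair A[i][k], B[k][j] at step T = i + j + k + 1 (0-indexed i, j, k), so only the top-left PE is busy at T = 1 and only the bottom-right PE is busy at T = 7. Each operand is read from memory once and then reused by every PE along its row or column.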

