CS6560: Parallel Computer Architecture

Introduction to Parallel Computer Architecture

Madhu Mutyam
PACE Laboratory
Department of Computer Science and Engineering
Indian Institute of Technology Madras

Jan 31 - Feb 2, 2018

## Driving Forces for Multiprocessor/Multicore Systems

[Figure: 40 Years of Microprocessor Trend Data. Logarithmic vertical axis (10<sup>1</sup> to 10<sup>7</sup>) plotted against Year. Series: Transistors (thousands), Single-Thread Performance (SpecINT x 10<sup>3</sup>), Frequency (MHz). Original data up to the year 2010 collected and plotted by M. Horo; new plot and data collected for 2010-2015 by K. Rupp.]





[Figure: layered diagram. Labels: Communication abstraction (user/system boundary); Compilation or library; OS support; Hardware/software boundary; Communication hardware; Physical communication medium.]

[Figure: block diagram with PE nodes.]

- ► Communication operations:
  - ► load and store
- ► Interconnection networks:
  - ► Bus, Crossbar, Ring, Mesh, ...
- ► Effectiveness of a shared memory system depends on:
  - ► memory access latency
  - ► bandwidth of data transfer
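The dependence of a shared memory system on memory access latency and data transfer bandwidth can be illustrated with a first-order cost model. The sketch below is an assumption for illustration, not something taken from the slides: it estimates transfer time as a fixed latency plus size divided by bandwidth. The function name `transfer_time` and all numeric values are hypothetical.

```python
def transfer_time(num_bytes: int, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """First-order estimate: fixed latency plus size divided by bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + num_bytes / bandwidth_bytes_per_s


# Hypothetical numbers, chosen only for illustration.
LATENCY_S = 100e-9   # assumed 100 ns access latency
BANDWIDTH = 10e9     # assumed 10 GB/s transfer bandwidth

small = transfer_time(64, LATENCY_S, BANDWIDTH)        # one 64-byte block
large = transfer_time(1 << 20, LATENCY_S, BANDWIDTH)   # one 1 MiB block

# Fraction of each transfer's time that is spent on latency alone.
latency_share_small = LATENCY_S / small
latency_share_large = LATENCY_S / large

print(f"64 B : {small * 1e9:.1f} ns, latency share {latency_share_small:.2%}")
print(f"1 MiB: {large * 1e6:.1f} us, latency share {latency_share_large:.2%}")
```

Under these assumed numbers, the small transfer is dominated by latency and the large transfer by bandwidth, which is why both quantities appear in the list above.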



- ► For shared memory systems:
  - ► the CA (Communication Assist) is tightly integrated with the memory system
- ► For message passing systems:
  - ► the CA needs to initiate messages quickly and respond to incoming messages
- ► For data parallel systems:
  - ► the CA needs to support fast global synchronization
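The three styles listed above can be contrasted with a small software analogy. This is a minimal sketch, assuming Python threads stand in for processing elements; it does not model how a communication assist is built in hardware. The names `shared`, `mailbox`, `barrier`, and `worker` are hypothetical.

```python
import queue
import threading

NUM_WORKERS = 4

# Shared memory style: workers communicate through ordinary stores and
# loads on a common data structure.
shared = [0] * NUM_WORKERS

# Message passing style: workers communicate through explicit send/receive.
mailbox = queue.Queue()

# Data parallel style: all workers meet at a global synchronization point.
barrier = threading.Barrier(NUM_WORKERS)


def worker(rank: int) -> None:
    shared[rank] = rank * rank     # "store" into shared memory
    barrier.wait()                 # global synchronization across all workers
    total = sum(shared)            # "load" values written by the other workers
    mailbox.put((rank, total))     # explicit message send


threads = [threading.Thread(target=worker, args=(r,)) for r in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Receive one message per worker and order them by rank.
results = sorted(mailbox.get() for _ in range(NUM_WORKERS))
print(results)
```

The barrier guarantees that every store has completed before any worker reads `shared`, so each worker computes the same total. In a real machine, the speed of each of these three operations is what the slides attribute to the communication assist.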



Thank You


