## BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI FIRST SEMESTER 2015 - 2016 COURSE HANDOUT (PART II)

Date: 03 / 08 / 2016

In addition to Part I (General Handout for all courses appended to the timetable) this portion gives further specific details regarding the course.

Course No : MEL G624

Course Title : Advanced VLSI Architecture

Instructor - Goa : K.R.Anupama (anupkr@goa.bits-pilani.ac.in)

Instructor - Hyd : J.Sowmya (soumyaj@hyderabad.bits-pilani.ac.in)

Instructor - Pilani : Prof. Chandrasekhar (chandra.shekhar@pilani.bits-pilani.ac.in), Vineet Kumar (vineet@pilani.bits-pilani.ac.in)

#### 1. Scope and Objective:

The course aims to familiarize students with advanced parallel processing architectures suitable for high-performance computing. It deals with three levels of parallelism: Instruction-Level, Data-Level and Thread-Level.

#### 2. Text Books:

- T1. Computer Architecture: A Quantitative Approach, by J.L. Hennessy & D.A. Patterson, Morgan Kaufmann, 3<sup>rd</sup> Ed., 2006.
- T2. Modern Processor Design: Fundamentals of Superscalar Processors, John Paul Shen & Mikko H. Lipasti, Tata McGraw Hill, 2011.
- T3. Advanced Computer Architecture: A Design Space Approach, Sima, Fountain, Kacsuk, Pearson, 2012.

#### 3. Reference Books:

- (R1) Parallel Computer Architecture: A Hardware / Software Approach, David E. Culler & Jaswinder Pal Singh, Morgan Kaufmann / Harcourt India, 2002.
- (R2) Computer Architecture: Pipelined & Parallel Processor Design, M.J. Flynn, Narosa Publishing House, 2006.
- (R3) DSP Processor Fundamentals, Phil Lapsley, Jeff Bier, Amit Shoham, Edward A. Lee, Wiley India Edition, 2011.
- (R4) Journals & Conference Proceedings

<sup>\*</sup> It is assumed that students have a working knowledge of MIPS Architecture

#### 4. Course Plan:

| No.   | Topics | Sub-topics | Reference |
|-------|--------|------------|-----------|
| 01    | Introduction to P … | | Class Notes |
|       | Reading Assignment 1 – CPU Architecture Introduction (At the end of … | | |
| 02-04 | Introduction to ILP | Pipeline Dependencies<br>Arithmetic & Architectural Pipelines<br>Pipeline Idealism | T1-Ch3, T2-Ch1 |
| 05-07 | Pipeline architectures – Design of RISC Pipeline | Typical RISC Pipeline Design<br>CISC Pipeline<br>Pipeline Examples | T2-Ch2 |
|       | Reading Assignment 2 – Memory Design (At the end of 3<sup>rd</sup> week … | | T2-Ch2 |
| 08-11 | Superscalar Architectures – Pipeline Design – Data Path | Widening of Pipeline<br>Parallel Fetch & Decode<br>Instruction Dispatch & Issue<br>Register Renaming & Tomasulo<br>ROB<br>Superscalar Pipeline Operation - Examples | T2-Ch4, 5 |
|       | Reading Assignment 3 – VLIW Architectures (At the end of 6<sup>th</sup> week of course work) | | Ch1 – Appendix H |
| 12-16 | Superscalar Architectures – Branch Prediction | Basic Branch Prediction Schemes<br>BTA & Misprediction Penalty & Recovery<br>Advanced Branch Prediction – Correlated Branch Prediction<br>Advanced Branch Prediction – Hybrid<br>Advanced Branch Prediction – Tournament Predictors<br>Value Prediction - Introduction | T2-Ch9, Ch10 |
|       | Reading Assignment 5 – Advanced Branch Predicto … of cours … | | Relevant Journal & Conference Papers |
|       | Reading Assignment 6 – V … of 11<sup>th</sup> week o … | | T2-Ch10 |

| No.   | Topics | Sub-topics | Reference |
|-------|--------|------------|-----------|
| 17    | Instruction level Data … Introduction | | |
| 18-20 | SIMD Architectures | Fine Grained Par … SIMD<br>Coarse Grained SIMD<br>Examples of SIMD operation | T1-Ch4 & Class Notes |
| 21-22 | Vector Processors | VMIPS Architecture<br>Multi-Lane Systems<br>Performance Analysis of Vector Systems | T1-Ch4 |
| 23-24 | GPU | SIMD Extensions<br>NVIDIA GPU Architectures - SIMT | T1-Ch4 |
|       | Reading Assignment 7 – CUDA (At the end of 13<sup>th</sup> week of course work) | | |
| 25    | Thread & Process Level … Introduction | | T1-Ch5 |
| 26    | Multi-threaded architectures | Shared Memory & Distributed Memory Architecture<br>Cache in TLP | T1-Ch5, T2-Ch11 |
| 27-29 | Cache Architectures | Snoopy Cache Protocols<br>MSI, MESI<br>MESIF, MOSIF<br>4C of Cache<br>Directory based Cache | T1-Ch5 |
| 30    | Multi-Threaded Architectures | Explicit Multi-Threading<br>Implicit Multi-Threading | T2-Ch11 |
|       | Reading Assignment 8 – Interconnection Network in Multi-core Processors (At the end of penultimate week of course) | | T1-Appendix F + Relevant Papers |
| 31    | CPU vs ASIC: qualitative ar … (speed) and energy consumption … | | Class Notes + Relevant Papers |
| 32    | CPU vs ASIC: quantitative modeling of speed and energy consumption of fur … | | Class Notes + Relevant Papers |
| 33    | Application Specific Instruction-set Processor (ASIP) - the via media between CPU and ASIC | | Class Notes + Relevant Papers |
| 34-35 | Techniques for identifying Application Specific instructions | | Class Notes + Relevant Papers |
| 36-40 | Design approaches for ASIPs - examples and cases | | Class Notes + Relevant Papers |

#### Note:

- The material in the text will be supplemented with papers from journals. Class Notes will include journal papers and e-material.
- Reading Assignments will be evaluated based on activity in the on-line discussion forum on EdX, where discussion topics and open-ended problems based on each Reading Assignment will be put up.
- All students will have to do Reading Assignment 1.
- Students can pick 3 out of the remaining Reading Assignments. The maximum number of students per Reading Assignment will be decided based on class strength.

#### 5. Evaluation Scheme:

| EC No. | Evaluation Component | Duration (min) | Weightage (200) | Date, Time | Nature of Component |
|--------|----------------------|----------------|-----------------|------------|---------------------|
| 1      | Test I               | 60             | 25              | TBA        | Closed Book         |
| 2      | Test II              | 60             | 25              | TBA        | Closed Book         |
| 3      | Class Room & Online Interactions (15)<br>4 Reading Assignments (20)<br>GEM5 Assignment Problems (45) | | 80 | | Open Book |
| 4      | Comprehensive        | 180            | 70              |            | Closed / Open Book (25+45) |

- **6. Chamber Consultation Hours:** To be announced for Goa Campus. For students of the Pilani & Hyd Campuses, I will usually be available on-line on the EdX forum or via mail.
- **7. Make-up Policy:** Make-up for any component will be given only in genuine cases. In all cases, prior intimation must be given to the IC.
- **8. Plagiarism & Copying:** <u>Plagiarism and copying will be dealt with severely.</u> <u>Any student who plagiarizes or copies will automatically be given 0 for all assignment and class room interaction components.</u>
- **9. Notices:** Notices regarding the course will be displayed on the EdX site.