## Thapar University, Patiala Computer Science and Engineering Department Mid-Semester Examination

| B.E. (COE, CML, SEM, CAG)         | UCS608 (Parallel and Distributed Computing) |
|-----------------------------------|---------------------------------------------|
| September 23, 2017                | Saturday, 1:00 pm to 3:00 pm                |
| Time: 2 Hours, Maximum Marks: 25  | Name of Faculty: Miss Navneet Kaleka        |

## Note

- All questions are compulsory.
- Answer all parts of a question in the same place.
- Make any necessary assumptions where needed, and give examples too.

| Q.No | Questions                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |              |                 | Marks   |
|------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-----------------|---------|
| 1.   | Consider the execution of an object code with 200,000 instructions on a 40-MHz processor. The program consists of four major types of instructions. The instruction mix and the number of cycles (CPI) needed for each instruction type are given below, based on the result of a program trace experiment: |              |                 | (2,2)   |
|      | Instruction Type                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | CPI          | Instruction mix |         |
|      | Arithmetic and Logic                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 1            | 60%             |         |
|      | Load/store with cache hit                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | 2            | 18%             |         |
|      | Branch                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | 3            | 12%             |         |
|      | Memory reference with cache miss                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 4            | 10%             |         |
|      | (a) Calculate the average CPI when the program is executed on a uniprocessor with the above trace results.  (b) Calculate the corresponding MIPS rate based on the CPI obtained in part (a). |              |                 |         |
| 2.   | (a) Consider a 16-node hypercube network. Based on the E-cube routing algorithm, show how to route a message from node (0110) to node (1001). All intermediate nodes must be identified on the routing path. |              |                 |         |
| 3.   | Consider the execution of a program of 15,000 instructions by a linear pipeline processor with a clock rate of 20 MHz. Assume the instruction pipeline has five stages and that one instruction is issued per clock cycle. The penalties due to branch instructions and out-of-sequence executions are ignored.  (a) Calculate the speedup factor in using this pipeline to execute the program as compared with the use of an equivalent nonpipelined processor with an equal amount of flow-through delay.  (b) What are the efficiency and throughput of this pipelined processor? |              |                 |         |
| 4.   | Differentiate the following terms:  i. UMA and NUMA  ii. SIMD and MIMD  iii. Implicit Parallelism and Explicit Parallelism                                                                                                                                                                                                                                                                                                                                                                                                                                                              |              |                 | (2,2,2) |
| 5.   | Suppose through experimentation, it was verified that … time was spent on parallelizable execution. … that can be achieved with 6 processors? |              |                 | (2)     |
| 6.   | (a) Explain the cache-coherence problem and the reasons why it occurs in a multiprocessing environment.  (b) Name and explain one protocol to cope with the multicache inconsistency problem in network-connected systems. |              |                 |         |
| 7.   | Is the routing latency in wormhole routing dependent on the distance (number of nodes traversed)? Give reasons for your answer. |              |                 | (2)     |
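
The arithmetic behind Questions 1 to 3 can be checked with a short script. The following is only a sketch under stated assumptions, not an official solution: it takes the average CPI as the mix-weighted sum of per-type CPIs, the MIPS rate as clock rate divided by (CPI × 10^6), the pipelined execution time as k + n − 1 cycles against n × k cycles for the nonpipelined processor, and it assumes E-cube routing corrects differing address bits starting from the least significant dimension. The function names are illustrative and do not come from the question paper.

```python
def average_cpi(mix):
    """Mix-weighted average CPI; mix is a list of (cpi, fraction) pairs."""
    return sum(cpi * fraction for cpi, fraction in mix)


def mips_rate(clock_hz, cpi):
    """MIPS rate = clock rate / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)


def pipeline_metrics(n, k, clock_hz):
    """Speedup, efficiency and throughput (in MIPS) of a k-stage linear
    pipeline running n instructions, assuming no stalls or penalties."""
    pipelined_cycles = k + n - 1          # fill the pipe, then one result per cycle
    speedup = (n * k) / pipelined_cycles  # nonpipelined time is n * k cycles
    efficiency = speedup / k
    throughput_mips = n * clock_hz / pipelined_cycles / 1e6
    return speedup, efficiency, throughput_mips


def ecube_route(src, dst, dims):
    """E-cube path from src to dst in a dims-dimensional hypercube.
    Assumption: differing bits are corrected lowest dimension first."""
    path = [src]
    node = src
    for d in range(dims):
        bit = 1 << d
        if (node ^ dst) & bit:   # addresses differ in dimension d
            node ^= bit          # hop to the neighbour across that dimension
            path.append(node)
    return path


if __name__ == "__main__":
    # Question 1: (CPI, fraction of instruction mix) per instruction type
    mix = [(1, 0.60), (2, 0.18), (3, 0.12), (4, 0.10)]
    cpi = average_cpi(mix)
    print(f"average CPI = {cpi:.2f}, MIPS rate = {mips_rate(40e6, cpi):.2f}")

    # Question 3: 15,000 instructions, five stages, 20 MHz clock
    s, e, t = pipeline_metrics(15_000, 5, 20e6)
    print(f"speedup = {s:.4f}, efficiency = {e:.4f}, throughput = {t:.4f} MIPS")

    # Question 2: 16-node hypercube, node 0110 to node 1001
    route = ecube_route(0b0110, 0b1001, 4)
    print(" -> ".join(format(node, "04b") for node in route))
```

Under the lowest-dimension-first assumption, the route for Question 2 is 0110 → 0111 → 0101 → 0001 → 1001. A text that orders the dimensions from the most significant bit would visit the intermediate nodes in a different order, so the convention should be stated in the answer.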