## Computer Organization and Architecture

## Chapter 12 Homework

1. Both instructions and data are stored in internal memory. How can the CPU distinguish them?

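**What the CPU reads from memory during the fetch-instruction cycle (FI) is an instruction; what it reads from memory during the execute cycle (EX) is data. The CPU therefore tells instructions and data apart by the cycle it is in when the memory access occurs.**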

2. Suppose a pipeline with 5 stages: fetch instruction (FI), decode instruction (DI), execute (EX), memory access (MA) and write back (WB).
   1. Please draw the spatio-temporal diagram for a sequence of 12 instructions, in which there are no conflicts and no data dependencies.
   2. Under this situation, what are the throughput and the speedup of this pipeline? (Suppose the clock period is 100 ns.)

**t (clock cycles) →**

| Instruction | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| **I1** | FI | DI | EX | MA | WB |  |  |  |  |  |  |  |  |  |  |  |
| **I2** |  | FI | DI | EX | MA | WB |  |  |  |  |  |  |  |  |  |  |
| **I3** |  |  | FI | DI | EX | MA | WB |  |  |  |  |  |  |  |  |  |
| **I4** |  |  |  | FI | DI | EX | MA | WB |  |  |  |  |  |  |  |  |
| **I5** |  |  |  |  | FI | DI | EX | MA | WB |  |  |  |  |  |  |  |
| **I6** |  |  |  |  |  | FI | DI | EX | MA | WB |  |  |  |  |  |  |
| **I7** |  |  |  |  |  |  | FI | DI | EX | MA | WB |  |  |  |  |  |
| **I8** |  |  |  |  |  |  |  | FI | DI | EX | MA | WB |  |  |  |  |
| **I9** |  |  |  |  |  |  |  |  | FI | DI | EX | MA | WB |  |  |  |
| **I10** |  |  |  |  |  |  |  |  |  | FI | DI | EX | MA | WB |  |  |
| **I11** |  |  |  |  |  |  |  |  |  |  | FI | DI | EX | MA | WB |  |
| **I12** |  |  |  |  |  |  |  |  |  |  |  | FI | DI | EX | MA | WB |
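
The diagram above can also be generated programmatically. The following is a minimal sketch, assuming an ideal pipeline in which every stage takes exactly one clock cycle and instruction i enters the pipeline in cycle i; the function name `pipeline_diagram` and the list `STAGES` are illustrative names, not part of the assignment.

```python
def pipeline_diagram(stages, n_instructions):
    """Return the rows of an ideal (stall-free) pipeline space-time diagram.

    Row i (0-based) lists, for each clock cycle, the stage occupied by
    instruction i + 1, or an empty string when that instruction is not
    in the pipeline during that cycle.
    """
    k = len(stages)
    total_cycles = k + n_instructions - 1  # k cycles to fill, then one per instruction
    rows = []
    for i in range(n_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i + 1 reaches stage s in cycle i + s + 1
        rows.append(row)
    return rows


# Stage names taken from the question; 12 instructions as in part 1.
STAGES = ["FI", "DI", "EX", "MA", "WB"]
rows = pipeline_diagram(STAGES, 12)

# Print a plain-text version of the table.
print(" ".join(["    "] + [f"{c:>3}" for c in range(1, len(rows[0]) + 1)]))
for i, row in enumerate(rows, start=1):
    print(" ".join([f"I{i:<3}"] + [f"{cell:>3}" for cell in row]))
```

Each row is the previous one shifted right by one cycle, which is why the last instruction finishes in cycle 5 + 12 - 1 = 16.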

**Speedup = 12 × 5 / (5 + 12 - 1) = 60 / 16 = 3.75**

**Throughput = 12 / (16 × 100 ns) = 7.5 × 10^6 instructions/s**
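
The arithmetic above can be checked with a short script. This is a sketch under the same assumptions as the answer: the non-pipelined machine needs k cycles per instruction, the pipeline needs k + n - 1 cycles in total, and one cycle lasts 100 ns. The function name `pipeline_metrics` and its parameter names are illustrative.

```python
def pipeline_metrics(n, k, cycle_time_s):
    """Speedup and throughput of an ideal k-stage pipeline running n instructions.

    Assumes no stalls: the pipeline takes k + n - 1 cycles in total,
    while a non-pipelined processor takes n * k cycles.
    """
    pipelined_cycles = k + n - 1
    unpipelined_cycles = n * k
    speedup = unpipelined_cycles / pipelined_cycles
    throughput = n / (pipelined_cycles * cycle_time_s)  # instructions per second
    return speedup, throughput


# Values from the question: 12 instructions, 5 stages, 100 ns per cycle.
speedup, throughput = pipeline_metrics(n=12, k=5, cycle_time_s=100e-9)
print(f"speedup    = {speedup:.2f}")                        # speedup    = 3.75
print(f"throughput = {throughput:.3e} instructions/s")      # throughput = 7.500e+06 instructions/s
```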

3. A pipelined processor has a clock rate of 2.5 GHz and executes a program with 1.5 million instructions. The pipeline has five stages, and instructions are issued at a rate of one per clock cycle. Ignore penalties due to branch instructions and out-of-sequence executions.
   1. What is the speedup of this processor for this program compared to a nonpipelined processor, making the same assumptions used in Section 14.4?
   2. What is the throughput of the pipelined processor?

**Speedup = 1.5 × 10^6 × 5 / (5 + 1.5 × 10^6 - 1) ≈ 5**

**Throughput = 1.5 × 10^6 / [(1.5 × 10^6 + 4) × (1 / 2.5 GHz)] ≈ 2.5 × 10^9 instructions/s**
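
The reason both results round so cleanly can be seen by rewriting the two formulas. In the sketch below, the symbols are notation introduced here for illustration, not taken from the assignment: n is the number of instructions, k the number of stages, and f the clock rate. With n = 1.5 × 10^6 and k = 5, the correction term (k - 1)/n is about 2.7 × 10^-6, so the speedup is effectively the number of stages and the throughput is effectively the clock rate.

$$
\begin{aligned}
S &= \frac{nk}{k + n - 1} = \frac{k}{1 + \dfrac{k-1}{n}}
   = \frac{5}{1 + \dfrac{4}{1.5 \times 10^{6}}} \approx 4.99999 \approx 5, \\[6pt]
\text{Throughput} &= \frac{n}{(k + n - 1)\cdot \dfrac{1}{f}} = \frac{f}{1 + \dfrac{k-1}{n}}
   = \frac{2.5 \times 10^{9}\ \text{s}^{-1}}{1 + \dfrac{4}{1.5 \times 10^{6}}} \approx 2.5 \times 10^{9}\ \text{instructions/s}.
\end{aligned}
$$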