## 《计算机体系结构》课程试卷

| 考证                                   | 代时间:                                                         | <u>120</u>                           | 5                                      | 钟 开                                                                        | 课学院_                                    | <i>计隽</i>                 | <b>红机</b>              | 任课                             | 教师                    |                                  |  |
|--------------------------------------|--------------------------------------------------------------|--------------------------------------|----------------------------------------|----------------------------------------------------------------------------|-----------------------------------------|---------------------------|------------------------|--------------------------------|-----------------------|----------------------------------|--|
| 姓名                                   |                                                              | 学号                                   |                                        |                                                                            |                                         | 班级                        |                        |                                |                       |                                  |  |
| Γ                                    |                                                              | T                                    | .                                      | <u> </u>                                                                   | <u> </u>                                |                           | <u> </u>               |                                |                       | Total                            |  |
|                                      | 22                                                           | <u> </u>                             |                                        | 三(1)<br>12                                                                 | 三(2)<br>15                              |                           | 三(3)<br>15             | 三(4)<br>20                     |                       | Total<br>100                     |  |
|                                      | 22                                                           |                                      |                                        | 12                                                                         |                                         |                           | 13                     |                                |                       |                                  |  |
|                                      |                                                              |                                      |                                        | Ansv                                                                       | wer table fo                            | r Section                 | One:                   |                                | I                     |                                  |  |
|                                      | 1                                                            | 2                                    | 3                                      | 4                                                                          | 5                                       | 6                         | 7                      | 8                              | 9                     | 10                               |  |
|                                      | 11                                                           | 12                                   | 13                                     | 14                                                                         | 15                                      | 16                        | 17                     | 18                             | 19                    | 20                               |  |
|                                      | 21                                                           | 22                                   |                                        |                                                                            |                                         |                           |                        |                                |                       |                                  |  |
| •                                    |                                                              |                                      |                                        |                                                                            |                                         |                           |                        |                                |                       |                                  |  |
| cution is<br>ned by us<br>a. overall | limited by                                                   | the fracti<br>ular featu<br>mode \ 1 | ion of the<br>are. The m<br>/(1-F)     |                                                                            |                                         | can be limited by B. enl  | used. Am               | dahl's Law<br><br>de \ overall | defines the $1/(1-F)$ | some faster mone speedup that of |  |
| Vhich of<br>load-st                  |                                                              | ng archite                           |                                        | sed in MIPS<br>er-register                                                 | <b>S</b> ?                              | C. reg                    | ister-memo             | ory                            | D. r                  | nemory-memory                    |  |
| A. With to B. The fire C. The second | wo-Level cac<br>st-level cac<br>cond-level c<br>cond-level c | iche, we d<br>he should<br>cache gen | can decrea<br>l be large<br>erally use | t two-level<br>ase miss per<br>enough to c<br>as a bigger b<br>ge enough a | nalty.<br>obtain a sma<br>olock size th | ıll miss ra<br>ıan that o | ate.<br>f the first-le |                                |                       | y accesses in the                |  |
| Vhich of<br>A. RAR                   | the following                                                | ng type of                           | f data haz<br>B. WAW                   | ards will oc                                                               | cur in MIPS                             | S 5 stages<br>C. RA       |                        | ·                              | D. V                  | VAR                              |  |
| A. Compu<br>B. Compu<br>hardwa       | iter architec<br>iter architec                               | ture shou                            | ild include                            | out "Compute instruction the imple                                         | set design.                             |                           |                        | ch has two                     | o compone             | ents: organizatio                |  |

| organization and hardy                                                                                      | vare.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |                                                                                               |                                                             |
|-------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|-------------------------------------------------------------|
| 6. Labor costs belongs to<br>A. gross margin                                                                |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | C. component cost                                                                             | D. average discount                                         |
|                                                                                                             | e of recent compilers, globa<br>B. code generator                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | l and local optimizations belong<br>C. global optimizer                                       | to which of the following part?  D. high-level optimization |
| 8. CPU time can be further of A. user CPU time and sy C. Elapsed time wall-close                            | stem CPU time.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | B. Response time and I/O ti D. System CPU time and pr                                         |                                                             |
| B. harmonic mean ≤ ar<br>C. harmonic mean ≤ ge                                                              | elationship is always correctified mean $\leq$ harmon in ithmetic mean $\leq$ geometric mean $\leq$ arithmet armonic mean $\leq$ geometric | ic mean<br>ic mean<br>ic mean                                                                 |                                                             |
| 10. Compared with the mem A. Fixed-length instruction C. Higher CPI                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | the Register-register architecture B. Less instructions to comp D. Large variation in instruc | plete a function.                                           |
|                                                                                                             | ns, which is NOT the meas<br>B. Critical word first                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | urement for reducing the cache n<br>C. Merging write b                                        |                                                             |
| 12. Which RAID level is Bi<br>A. RAID 1                                                                     | t-Interleaved Parity?<br>B. RAID 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | C. RAID3                                                                                      | D. RAID 5                                                   |
| 13. To reduce data hazards,.<br>A. MEM/WB.ALUoutput<br>C. MEM/WB.LMD,ALU                                    | t, DM input                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | forwarding path.<br>B. MEM/WB.LMD ,DM in<br>D. EX/MEM.ALUoutput, D                            |                                                             |
| 14. We often use a technique A. colored paging                                                              | e named to solve reg<br>B. graph coloring                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |                                                                                               | D. page coloring                                            |
| <ul><li>15. Which is NOT the chara</li><li>A. Powerful instruction f</li><li>C. Complex memory ac</li></ul> | functions.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | B. Use a complex in D. Use the Load/Sto                                                       |                                                             |
| 16. The extension of MIPS J                                                                                 | pipeline to handle multi-cyc<br>B. WAR                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | cle operation will bring about<br>C. RAW                                                      | hazard. D. RAR                                              |
| 17. We often use to A. register                                                                             | allocate statically declared<br>B. stack                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | l objects.<br>C. heap                                                                         | D. global data area                                         |
| 18. Which of the following A. Write through                                                                 | policy will NOT improve v<br>B. full-associative m                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | irtual memory performance?<br>ap C. TLB cache                                                 | D. LRU replacement                                          |
|                                                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | t for resolution of data hazard?<br>he earlier pipeline stages as poss                        | ible.                                                       |
| 20. In MIPS pipeline, "Upda<br>A. IF B.                                                                     | ating the PC" is completed ID C. I                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | •                                                                                             | E. WB                                                       |

|                                                                             | e when an instruction depe<br>structions in the pipeline                                                                     |                          |                     |                           |                     |
|-----------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|--------------------------|---------------------|---------------------------|---------------------|
| A. Data, Structural                                                         | B. Structural, Str                                                                                                           | ructural                 | C. Data, Control    | D. Stru                   | ctural, Control     |
| correct? A. If K=1, then it's a B. If K=1, then it's a C. If K=M, then it's | M blocks in a cache, and a direct mapped cache. a full-associative cache. a direct mapped cache. then it's a M/K-way set ass | ·                        | re grouped in one   | set, then which follow    | wing description is |
| 二、 Fill in the                                                              | blanks (16 points )                                                                                                          |                          |                     |                           |                     |
| 1. Suppose a compute                                                        | er spends 90 percent of its ters make a change that impr                                                                     |                          | • •                 | -                         |                     |
| ID stage, while the                                                         | re implementation is the cla<br>branch-target address is kn<br>is predict-taken. Then how                                    | own at ID too. Bu        | it the branch condi | tion is evaluated till th |                     |
| Branch taken:                                                               | cycles;                                                                                                                      | Branch untaken:          | cycle               |                           |                     |
| 3. A cache has 64-KB c addresses.                                           | capacity, 128-bytes/line, and                                                                                                | l is 4-way set-asso      | ciative. The system | containing the cache      | uses 32-bit         |
| The cache has                                                               | lines.                                                                                                                       | The cache ha             | s                   | _ sets.                   |                     |
| Tag information                                                             | on is bits.                                                                                                                  |                          |                     |                           |                     |
| 4 50                                                                        | ance of the basic memory of<br>clock cycles to send the add<br>5 clock cycles for the access<br>clock cycles to transfer a w | dress<br>s time per word |                     |                           |                     |
| Given a cache block                                                         | of four words, and that a iven out.) clock cycles, with                                                                      | word is 8 bytes          |                     |                           | (Calculation ycle.  |
|                                                                             |                                                                                                                              |                          |                     |                           |                     |

## 三、Calculations (62 points)

- 1. (12 points) Your company is developing a program with high requirement on computation. You asked your R&D department to make some improvements on the execution time. After several months, they give you two solutions. The first one is to use a new hardware technology, by which 40% of the computation can be accelerated by 10 times. Another solution is focused on algorithm design, which can enhance 60% and 10% of the total computation by 2 and 20 times respectively. Question:
- a) What is the overall enhancement of the hardware solution?
- b) What is the overall enhancement of the software solution?
- c) Which one will you choose?

2. (15 points)Within some memory/cache memory hierarchies, there are 2 words in a block. Access time of Cache is 8ns and miss penalty is 70ns. For the following C code, assumes that each element is one word in array (A[i]). Variable s has be loaded into registers. Please answer questions:

- (1) What is the miss rate for data accesses?
- (2) What is the average memory access time for data read?
- (3) What is the overall CPI including memory access? Assume processor runs at 1.1GHz and has a CPI of 1 excluding memory accesses. Ignores instructions misses and data hazard and control hazard. Assumes assembler code is below:

 LOOP:
 LOAD
 R2, 0(R1) ;  $R2 \leftarrow Mem[R1+0]$  

 ADD
 R5, R1, #4
 ;  $R5 \leftarrow R1+4$  

 ADD
 R3, R2, R3
 ;  $R3 \leftarrow R2+R3$ , s was stored in R3

 BNE
 R5, LOOP
 ; if  $R5 \neq 0$ , then goto LOOP

3. (15 points) Consider the following pipeline. All instructions have five cycles but autoincrement addressing instruction, which is IF, ID, EX, MEM, WB. Branch will complete at the third cycle. The pipeline extended MIPS pipeline in autoincrement addressing mode which have seven pipe stages. The Fig 1 is an example in autoincrement addressing:

The register files can perform two reads and two write every clock cycle. To handle reads and writes to the same register, assume the register write in the first half of the clock cycle and the read in the second half.

Read the following code segment . The pipeline has forwarding path.

Loop: LW R1, 4(R2)
ADD R2, (R1)+
ADDI R3, R3, #4
SUB R4, R1, R2
SW R2, 4(R4)+
BNEZ R3, Loop

(1) if the pipeline has no delay slot, find out how forwarding path work in every instruction? Draw the pipe stage diagram, mark the every forwarding path in diagram.

eg: Add R1, (R2)+
means: Regs[R2]←Regs[R2]+4
Regs[R1]←Regs[R1] + Mem[Regs[R2]]
Fig 1: example instruction of autoincrement addressing



IF: Instruction fetch

ID: Instruction decode

ADDR: AutoIncrement Addressing

WB1: Write Result of ADDR to Register file

EX: Memory Reference: Calculate the absolute address

ALU Instruction: Calculate.

MEM: Memory Access

WB2: Write Result to Register file

Fig 2: seven pipeline stage of the autoincrement addressing

(2) if the pipeline has one delay slot, how to adjust the code segment. Draw the pipe stage diagram.

4 (**20 points**) You are building a system around a processor with inorder execution that runs at 1.1GHz and has a CPI of 0.7 excluding memory accesses. The only instructions that read or write data from memory are loads(20% of all instructions) and stores(5% of all instructions).

The memory system for this computer is composed of a split L1 cache that imposes no penalty on hits. Both the I-cache and D-cache are direct mapped and hold 32 KB each. The I-cache has a 2% miss rate and 32-byte blocks, and the D-cache is write through with a 5% miss rate and 16-byte blocks. There is a write buffer on the D-cache that eliminates stalls for 95% of all writes.

The 512KB write-back, unified L2 cache has 64-byte blocks and an access time of 15ns. It is connected to the L1 cache by a 128-bit data bus that runs at 266 MHz and can transfer one 128-bit word per bus cycle. Of all memory references sent to the L2 cache in this system, 80% are satisfied without going to main memory. Also, 50% of all blocks replaced are dirty.

The 128-bit-wide main memory has an access latency of 60ns, after which any number of bus words may be transferred at the rate of one per cycle on the 128-bit-wide 133 MHz main memory bus.

- a) [5 points] What is the average memory access time for instruction accesses?
- b) [5 points] What is the average memory access time for data read?
- c) [5 points] What is the average memory access time for data writes?

When a write miss occurred, Assume write back use write allocated, and write through use write around. So no matter whether there a write miss, we should write data to second level cache.

d) [5 points] what is the overall CPI, including memory accesses?