1. What are the 3C’s for cache misses?
2. Of the three factors in the Processor Performance Equation (instruction count, CPI, cycle time), which is most influenced by the compiler?
3. Suppose we have made the following measurements for a program: the frequency of floating-point (FP) instructions is 20%, the average CPI of FP instructions is 4, the average CPI of other instructions is 1.5, the frequency of FP multiplication instructions is 4% (out of all instructions), and the CPI of FP multiplication instructions is 10.
   * (a) What is the average CPI of the program?
   * (b) If we improve the FP hardware to reduce the average CPI of FP instructions to 2, what is the overall speedup?
   * (c) If we can instead decrease the average CPI of FP multiplication instructions to 2, what is the overall speedup?
4. When parallelizing an application, the ideal speedup equals the number of processors.
   * (a) If 75% of the application execution time is parallelizable, what is the overall speedup with 4 processors?
   * (b) For (a), what percentage of the application execution time after parallelization is spent in the parallelized portion?
   * (c) If only 50% of the application execution time is parallelizable, at least how many processors are needed to achieve an overall speedup no less than that obtained in (a)?
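
The CPI and parallel-speedup questions above rest on two formulas: a frequency-weighted average of per-class CPIs, and Amdahl's law. The following is a minimal Python sketch of both, not a model solution. The function names `average_cpi` and `amdahl_speedup` are illustrative, and the sketch assumes that instruction count and cycle time stay fixed, so that speedup equals the ratio of old CPI to new CPI.

```python
def average_cpi(classes):
    """Frequency-weighted CPI; `classes` is a list of (frequency, cpi) pairs."""
    return sum(freq * cpi for freq, cpi in classes)


def amdahl_speedup(parallel_fraction, processors):
    """Overall speedup when `parallel_fraction` of the time scales ideally."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)


# Instruction mix from the question: 20% FP at CPI 4, 80% other at CPI 1.5.
base_cpi = average_cpi([(0.20, 4.0), (0.80, 1.5)])

# Improving all FP instructions to CPI 2.
fp_cpi = average_cpi([(0.20, 2.0), (0.80, 1.5)])
fp_speedup = base_cpi / fp_cpi

# Improving only FP multiplication (4% of all instructions) from CPI 10 to
# CPI 2 removes 0.04 * (10 - 2) cycles per instruction from the baseline.
fmul_cpi = base_cpi - 0.04 * (10.0 - 2.0)
fmul_speedup = base_cpi / fmul_cpi

# Amdahl's law: 75% parallelizable on 4 processors.
speedup_a = amdahl_speedup(0.75, 4)
# Share of the new execution time spent in the parallelized portion.
parallel_share_after = (0.75 / 4) / ((1 - 0.75) + 0.75 / 4)
# With 50% parallelizable, the speedup is bounded above by 1 / (1 - 0.5).
limit_half = 1.0 / (1.0 - 0.5)

print(base_cpi, fp_speedup, fmul_speedup)
print(speedup_a, parallel_share_after, limit_half)
```

Comparing `limit_half` with `speedup_a` indicates whether any processor count can meet the target in the last sub-question.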

Consider a disk system with the following components and MTTF:

* 20 disks, each rated at 1,000,000-hour MTTF
* 1 disk controller, 500,000-hour MTTF

Assume that the lifetimes of all components are exponentially distributed and that failures are independent. Compute the MTTF of the whole disk system.
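
Under the stated assumptions (exponential lifetimes, independent failures), the failure rates of the components add. The sketch below illustrates this in Python. It additionally assumes the system fails as soon as any single component fails, which the question does not state explicitly, and `system_mttf` is an illustrative name.

```python
def system_mttf(components):
    """MTTF of a series system; `components` is a list of (count, mttf_hours)."""
    # With exponential lifetimes, each component has failure rate 1 / MTTF,
    # and the rates of independent components add.
    total_rate = sum(count / mttf for count, mttf in components)
    return 1.0 / total_rate


# 20 disks at 1,000,000 hours each and 1 controller at 500,000 hours.
mttf_hours = system_mttf([(20, 1_000_000), (1, 500_000)])
print(f"System MTTF: {mttf_hours:.1f} hours")
```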

Consider a processor with an instruction length of 10 bits and 16 general-purpose registers, so each address field is 4 bits wide. Is it possible to encode the following instruction set: 3 two-address instructions, 15 one-address instructions, and 17 zero-address instructions? Please justify your answer.
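
One way to approach the encoding question is a counting argument: every instruction with k address fields consumes 2^(4k) of the 2^10 available bit patterns, so the total consumed must not exceed 1024. The Python sketch below checks this necessary condition only. The helper name `patterns_needed` is illustrative, and a full justification would still walk through the expanding-opcode assignment.

```python
INSTRUCTION_BITS = 10
ADDRESS_BITS = 4


def patterns_needed(mix):
    """Bit patterns consumed by `mix`, a dict {address_fields: count}."""
    # An instruction with k address fields occupies 2**(k * ADDRESS_BITS)
    # distinct bit patterns, one for each combination of operand values.
    return sum(
        count * 2 ** (fields * ADDRESS_BITS) for fields, count in mix.items()
    )


# 3 two-address, 15 one-address, and 17 zero-address instructions.
mix = {2: 3, 1: 15, 0: 17}
needed = patterns_needed(mix)
available = 2 ** INSTRUCTION_BITS
feasible = needed <= available
print(needed, available, feasible)
```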

Consider the following chart of instruction type frequencies for a certain program.

| Operations | Load/Store | ALU | Branches |
| --- | --- | --- | --- |
| Frequency | 40% | 50% | 10% |
| CPI | 2 | 1 | 3 |

1. What is the average CPI of the program?
2. Suppose one of the following two proposals for speeding up the program is to be implemented. Proposal 1 decreases the CPI of load/store operations to 1.5 but increases the clock cycle time by 5%. Proposal 2 decreases the CPI of branch operations to 1 but increases the clock cycle time by 10%. Which proposal gives the higher speedup? Assume that the instruction count remains the same for both proposals.
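
Because the instruction count is unchanged, the two proposals can be compared through the product CPI × cycle time. The following Python sketch does that comparison. The names `FREQ`, `BASE_CPI`, and `speedup` are illustrative, and the cycle time is expressed as a factor relative to the original clock.

```python
FREQ = {"load_store": 0.40, "alu": 0.50, "branch": 0.10}
BASE_CPI = {"load_store": 2.0, "alu": 1.0, "branch": 3.0}


def average_cpi(cpi):
    """Frequency-weighted CPI over the three instruction classes."""
    return sum(FREQ[op] * cpi[op] for op in FREQ)


def speedup(new_cpi, cycle_time_factor):
    """Old time / new time, with the instruction count held constant."""
    old_time = average_cpi(BASE_CPI)
    new_time = average_cpi(new_cpi) * cycle_time_factor
    return old_time / new_time


# Proposal 1: load/store CPI 2 -> 1.5, cycle time +5%.
proposal_1 = speedup({**BASE_CPI, "load_store": 1.5}, 1.05)
# Proposal 2: branch CPI 3 -> 1, cycle time +10%.
proposal_2 = speedup({**BASE_CPI, "branch": 1.0}, 1.10)

print(average_cpi(BASE_CPI), proposal_1, proposal_2)
```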

Consider a system with a load-store ISA and split caches (one instruction cache and one data cache). Assume the frequency of load and store instructions is 30%. A cache hit takes one clock cycle, while the penalty for a cache miss is 100 clock cycles. Assume the measured cache misses are 4 misses per 1000 instructions for the instruction cache and 40 misses per 1000 instructions for the data cache.

1. What is the overall cache miss rate?
2. What is the average memory access time (count as number of clock cycles)?
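
The two cache questions can be set up as below. This Python sketch assumes that every instruction makes exactly one instruction-cache access, that each load or store makes exactly one data-cache access, and that AMAT = hit time + miss rate × miss penalty. These are common textbook conventions rather than something the question spells out.

```python
LOAD_STORE_FREQ = 0.30
HIT_TIME = 1            # cycles
MISS_PENALTY = 100      # cycles
I_MISSES_PER_1000 = 4   # instruction-cache misses per 1000 instructions
D_MISSES_PER_1000 = 40  # data-cache misses per 1000 instructions

# Assumption: per 1000 instructions there are 1000 instruction fetches plus
# 1000 * LOAD_STORE_FREQ data accesses.
accesses_per_1000 = 1000 * (1 + LOAD_STORE_FREQ)
misses_per_1000 = I_MISSES_PER_1000 + D_MISSES_PER_1000

overall_miss_rate = misses_per_1000 / accesses_per_1000
amat = HIT_TIME + overall_miss_rate * MISS_PENALTY

print(f"miss rate = {overall_miss_rate:.4f}, AMAT = {amat:.2f} cycles")
```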

Consider a byte-addressable machine with 16-bit virtual address and 16-bit physical address.

* The system has a 512B 2-way set associative physically indexed and physically tagged cache with write back, write allocate and FIFO replacement.
* The system has a 2-way set associative TLB with 16 entries.
* The cache block size is 16B.
* Page size is 1KB.

For the following questions, please justify your answer.

1. Determine how to break down the virtual address into virtual page number, virtual page offset, TLB index, and TLB tag.
2. Determine how to break down the physical address into physical page number, physical page offset, cache block offset, cache set index, and cache tag.
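
The field widths follow from the sizes given above. The Python sketch below derives them, assuming all sizes are powers of two and that the TLB is indexed by the low-order bits of the virtual page number (a common convention that the question does not state). The helper `bits` is an illustrative name.

```python
ADDRESS_BITS = 16   # both virtual and physical addresses
PAGE_SIZE = 1024    # bytes
CACHE_SIZE = 512    # bytes
CACHE_WAYS = 2
BLOCK_SIZE = 16     # bytes
TLB_ENTRIES = 16
TLB_WAYS = 2


def bits(n):
    """log2 of n for a positive power of two, using integer arithmetic."""
    return n.bit_length() - 1


# Virtual address: [ TLB tag | TLB index | page offset ]
page_offset_bits = bits(PAGE_SIZE)
vpn_bits = ADDRESS_BITS - page_offset_bits
tlb_index_bits = bits(TLB_ENTRIES // TLB_WAYS)
tlb_tag_bits = vpn_bits - tlb_index_bits

# Physical address: [ cache tag | set index | block offset ]
ppn_bits = ADDRESS_BITS - page_offset_bits
block_offset_bits = bits(BLOCK_SIZE)
cache_sets = CACHE_SIZE // (BLOCK_SIZE * CACHE_WAYS)
cache_index_bits = bits(cache_sets)
cache_tag_bits = ADDRESS_BITS - cache_index_bits - block_offset_bits

print(vpn_bits, page_offset_bits, tlb_tag_bits, tlb_index_bits)
print(ppn_bits, cache_tag_bits, cache_index_bits, block_offset_bits)
```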

Consider a system with three levels of cache. For every 1000 memory references, there are 100 misses in L1, 50 misses in L2, and 10 misses in L3.

1. What is the local miss rate for L2 cache?
2. What is the global miss rate for L3 cache?
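
A local miss rate divides a level's misses by the accesses that reach that level, while a global miss rate divides them by all memory references. The short Python sketch below applies both definitions, assuming each level is accessed only on a miss in the level above it.

```python
REFERENCES = 1000  # total memory references
L1_MISSES = 100
L2_MISSES = 50
L3_MISSES = 10

# Local miss rate of L2: L2 misses over accesses reaching L2 (the L1 misses).
l2_local_miss_rate = L2_MISSES / L1_MISSES

# Global miss rate of L3: L3 misses over all memory references.
l3_global_miss_rate = L3_MISSES / REFERENCES

print(l2_local_miss_rate, l3_global_miss_rate)
```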