1. Suppose that we have two implementations of the same instruction set architecture. Machine A has a clock cycle time of *50 ns* and a *CPI* of *4.0* for some program, and machine B has a clock cycle time of *65 ns* and a *CPI* of *2.5* for the same program. Which machine is faster, and by how much?
2. Three enhancements with the following speedups are proposed for a new machine: Speedup(a) = *30*, Speedup(b) = *20*, and Speedup(c) = *15*. Assume that for some set of programs, the fraction of use is *25%* for enhancement (a), *30%* for enhancement (b), and *45%* for enhancement (c). If only one enhancement can be implemented, which should be chosen to maximize the speedup? If two enhancements can be implemented, which two should be chosen to maximize the speedup?
3. Suppose we are considering two alternatives for a conditional branch instruction, as follows:

**CPU A.** A condition code is set by a compare instruction and followed by a branch that tests the condition code.

**CPU B.** A compare is included in the branch instruction itself.

On both CPUs, the conditional branch instruction takes 2 cycles, and all other instructions take 1 clock cycle. On CPU A, 20% of all instructions executed are conditional branches; since every branch needs a compare, it follows that another 20% of the instructions are compares. Because CPU A does not have the compare included in the branch, its clock cycle time is 25% faster than CPU B’s. Which CPU is faster?
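Comparisons such as the two-machine question above rest on the CPU performance equation, CPU time = instruction count × CPI × clock cycle time. The following is a minimal sketch of that setup, not a model solution; the function name `cpu_time_ns` and the instruction count `INSTRUCTIONS` are hypothetical, and the count cancels out whenever both machines run the same program on the same instruction set.

```python
def cpu_time_ns(instruction_count, cpi, cycle_time_ns):
    """CPU time in ns = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns


# Hypothetical instruction count (an assumption, not given in the problem).
# It is the same for both machines, so it cancels in the ratio.
INSTRUCTIONS = 1_000_000

# Values taken from the machine A / machine B problem statement.
time_a = cpu_time_ns(INSTRUCTIONS, cpi=4.0, cycle_time_ns=50)
time_b = cpu_time_ns(INSTRUCTIONS, cpi=2.5, cycle_time_ns=65)

# The faster machine is the one with the smaller CPU time; the factor is
# the ratio of the slower time to the faster time.
if time_a < time_b:
    faster, ratio = "A", time_b / time_a
else:
    faster, ratio = "B", time_a / time_b

print(f"Machine {faster} is faster by a factor of {ratio:.3f}")
```

For the CPU A versus CPU B branch question, the same function can be reused, but the instruction counts are no longer equal: CPU B executes no separate compare instructions, so its instruction count, instruction mix, and CPI all have to be recomputed before the two times are compared.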

1. Assume we have a computer where the CPI is 1.0 when all memory accesses hit in the cache. The only data accesses are loads and stores, and these total 50% of the instructions. If the miss penalty is 25 clock cycles and the miss rate is 2%, how much faster would the computer be if all accesses were cache hits?
2. How many bits are in a (0, 2) branch predictor with 8K entries? How many entries are in a (1, 2) predictor with the same number of bits?
3. You are going to enhance a computer, and there are two possible improvements: make multiplication instructions run four times faster than before, make memory access instructions run two times faster than before, or both. You repeatedly run a program that takes 100 seconds to execute. Of this time, 20% is used for multiplication, 40% for memory access instructions, and 40% for other tasks. What will the speedup be if you improve only multiplication? What will the speedup be if you improve only memory access? What will the speedup be if both improvements are made?
4. Consider a machine A for which the following performance measures were recorded when executing a set of benchmark programs. Assuming a benchmark program of 100 instructions is executed, what is the overall CPI of machine A?

| Instruction category | Percentage of occurrence (%) | Cycles per instruction |
| --- | --- | --- |
| ALU | 35 | 1 |
| Load & store | 20 | 3 |
| Branch | 40 | 4 |
| Others | 5 | 5 |
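Two standard formulas cover the instruction-mix table and the multiplication/memory enhancement question above: the weighted-average CPI and Amdahl's law. The sketch below is one way to set them up. The helper names `weighted_cpi` and `amdahl_speedup` are hypothetical, and the Amdahl helper assumes that the enhanced portions of execution time do not overlap.

```python
def weighted_cpi(mix):
    """Overall CPI for a list of (instruction fraction, cycles) pairs."""
    total = sum(fraction for fraction, _ in mix)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    return sum(fraction * cycles for fraction, cycles in mix)


def amdahl_speedup(enhancements):
    """Amdahl's law for (fraction of original time, speedup) pairs.

    Assumption: the enhanced fractions are disjoint, so the unenhanced
    fraction is whatever remains after summing them.
    """
    enhanced = sum(fraction for fraction, _ in enhancements)
    if enhanced > 1.0 + 1e-9:
        raise ValueError("enhanced fractions cannot exceed 1")
    new_time = (1.0 - enhanced) + sum(f / s for f, s in enhancements)
    return 1.0 / new_time


# Instruction mix from the table: ALU, load & store, branch, others.
machine_a_mix = [(0.35, 1), (0.20, 3), (0.40, 4), (0.05, 5)]
overall_cpi = weighted_cpi(machine_a_mix)

# Enhancement question: 20% of time in multiplication (4x faster),
# 40% of time in memory access (2x faster).
multiply_only = amdahl_speedup([(0.20, 4)])
memory_only = amdahl_speedup([(0.40, 2)])
both = amdahl_speedup([(0.20, 4), (0.40, 2)])

print(overall_cpi, multiply_only, memory_only, both)
```

The 100-instruction program size and the 100-second run time do not appear in the code because both formulas depend only on fractions; the absolute totals cancel.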

1. Assume that we make an enhancement to a computer that improves some mode of execution by a factor of 10. Enhanced mode is used 50% of the time, measured as a percentage of the execution time *when the enhanced mode is in use*.
   1. What is the speedup we have obtained from fast mode?
   2. What percentage of the original execution time has been converted to fast mode?
2. Suppose the branch frequencies (as percentages of all instructions) are as follows:

| Instruction type | Frequency |
| --- | --- |
| Conditional branches | 15% |
| Jumps and calls | 1% |
| Taken conditional branches | 60% of conditional branches are taken |

1. We are examining a four-deep pipeline where the branch is resolved at the end of the second cycle for unconditional branches and at the end of the third cycle for conditional branches. Assuming that only the first pipe stage can always be done independent of whether the branch goes and ignoring other pipeline stalls, how much faster would the machine be without any branch hazards?
2. Now assume a high-performance processor in which we have a 15-deep pipeline where the branch is resolved at the end of the fifth cycle for unconditional branches and at the end of the tenth cycle for conditional branches. Assuming that only the first pipe stage can always be done independent of whether the branch goes and ignoring other pipeline stalls, how much faster would the machine be without any branch hazards?
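The two pipeline questions above can be approached by adding branch stall cycles to an ideal CPI of 1. The sketch below shows one possible setup under a loudly stated assumption: every branch stalls the pipeline until it is resolved, and because only the first stage can proceed regardless of the outcome, a branch resolved at the end of cycle k costs k − 1 stall cycles. Untaken conditional branches are assumed to pay the same penalty as taken ones. A different reading of the problem (for example, a smaller penalty for untaken branches) would change the per-class stall values, which is why they are kept as explicit inputs. All names are hypothetical.

```python
# Frequencies from the problem statement (fractions of all instructions).
JUMP_FREQ = 0.01    # jumps and calls
COND_FREQ = 0.15    # conditional branches
TAKEN_RATIO = 0.60  # share of conditional branches that are taken


def stall_cycles(resolve_cycle):
    """Assumed penalty: only stage 1 overlaps, so k - 1 cycles are lost."""
    return resolve_cycle - 1


def cpi_with_branch_stalls(branch_classes, ideal_cpi=1.0):
    """Ideal CPI plus the sum of frequency x stall cycles per branch class."""
    return ideal_cpi + sum(freq * stalls for freq, stalls in branch_classes)


def branch_classes(jump_resolve, cond_resolve):
    """Build (frequency, stall cycles) pairs for one pipeline design."""
    return [
        (JUMP_FREQ, stall_cycles(jump_resolve)),
        (COND_FREQ * TAKEN_RATIO, stall_cycles(cond_resolve)),
        # Assumption: untaken branches stall as long as taken ones.
        (COND_FREQ * (1 - TAKEN_RATIO), stall_cycles(cond_resolve)),
    ]


# Four-deep pipeline: resolved at end of cycle 2 (jumps) and 3 (conditional).
shallow_cpi = cpi_with_branch_stalls(branch_classes(2, 3))
# 15-deep pipeline: resolved at end of cycle 5 (jumps) and 10 (conditional).
deep_cpi = cpi_with_branch_stalls(branch_classes(5, 10))

# With an ideal CPI of 1, the speedup from removing branch hazards equals
# the CPI with stalls divided by 1.
print(f"4-deep:  {shallow_cpi:.2f}x faster without branch hazards")
print(f"15-deep: {deep_cpi:.2f}x faster without branch hazards")
```

Keeping taken and untaken conditional branches as separate entries makes it easy to substitute a different untaken penalty if the intended pipeline model differs from the assumption used here.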