**ECE 462/562   
Homework 1 Deliverables**

**Due: Wednesday, February 12, 11:59 pm.**

**Group members (and NetIDs) who participated:**

| **Name** | **NetID** |
| --- | --- |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |

**PART 1**

Fill in the following tables with the required information and answer the following questions. The *notes* below contain the stats that should be used for filling out the tables.

1. Overall profiling stats (*30 pts*)

| **Benchmark** | Instruction count | # cycles simulated | IPC | % load | % store | % branches | % int | % fp |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| a2time01 |  |  |  |  |  |  |  |  |
| cacheb01 |  |  |  |  |  |  |  |  |
| bitmnp01 |  |  |  |  |  |  |  |  |
| mcf |  |  |  |  |  |  |  |  |
| libquantum |  |  |  |  |  |  |  |  |

Answer the following questions based on your simulations.

1. Using the number of ALU instructions (*system.cpu.commit.int\_insts* + *system.cpu.commit.fp\_insts*), determine the % compute intensity of each benchmark and list the benchmarks in order of compute intensity. (*5 pts*)

| **Benchmark** | **% ALU instructions** |
| --- | --- |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
| **GEOMETRIC MEAN** |  |

\*Note that the geometric mean is *not* the arithmetic mean!
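
As a reminder of the difference, the geometric mean of *n* positive values is the *n*-th root of their product. The following is a minimal sketch of that calculation; the function name `geometric_mean` and the sample numbers are illustrative only and are not taken from any benchmark.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean is only defined for positive values")
    # Averaging the logarithms is equivalent to taking the n-th root of the
    # product, and avoids overflow when many values are multiplied together.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Made-up numbers, not benchmark results.
sample = [2.0, 8.0]
print("geometric mean :", geometric_mean(sample))      # close to 4
print("arithmetic mean:", sum(sample) / len(sample))   # 5.0
```

Note that a single zero entry makes the geometric mean undefined in this sketch, so check for zero-valued rows before applying it.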

1. Using the number of executed memory references (*system.cpu.iew.exec\_refs*), determine the % memory reference of each benchmark and list the benchmarks in ascending order of memory intensity (*5 pts*).

| **Benchmark** | **% memory references** |
| --- | --- |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
|  |  |
| **GEOMETRIC MEAN** |  |

2. Cache performance (*30 pts*)

| **Benchmark** | iCache miss rate (%) | dCache miss rate (%) | iCache AMAT (cycles) | dCache AMAT (cycles) |
| --- | --- | --- | --- | --- |
| a2time01 |  |  |  |  |
| cacheb01 |  |  |  |  |
| bitmnp01 |  |  |  |  |
| mcf |  |  |  |  |
| libquantum |  |  |  |  |
| **GEOMETRIC MEAN** |  |  |  |  |

\*To calculate the average memory access time (AMAT), assume an L1 iCache and dCache latency of 1 cycle and main memory access latency of 80 cycles.
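
The standard single-level formula is AMAT = hit time + miss rate × miss penalty. The sketch below applies it under two assumptions that you should confirm for your own setup: the miss rate is given as a fraction (0.05 for 5%), and the 80-cycle main memory latency is used directly as the miss penalty. The function name `amat` and the sample miss rate are illustrative.

```python
def amat(miss_rate, hit_latency=1.0, miss_penalty=80.0):
    """Average memory access time in cycles for a single cache level.

    miss_rate    -- fraction of accesses that miss (0.05 means 5%)
    hit_latency  -- L1 access latency in cycles (1 in this assignment)
    miss_penalty -- cycles added on a miss (assumed here to be the
                    80-cycle main memory latency)
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be a fraction between 0 and 1")
    return hit_latency + miss_rate * miss_penalty


# Made-up miss rate, not a benchmark result: 1 + 0.05 * 80, about 5 cycles.
print(amat(0.05))
```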

Notes:

* Instruction count is *sim\_insts*.
* % load and % store are the percentages of load and store instructions executed, respectively. Calculate these using the total memory references executed (*system.cpu.iew.exec\_refs*), stores executed (*system.cpu.iew.exec\_stores*), and loads executed (*system.cpu.iew.iewExecLoadInsts*). These are events in the Issue/Execute/Writeback (iew) stages of the 7-stage O3CPU pipeline.
* Total branches: *system.cpu.iew.exec\_branches*.
* % int (*system.cpu.commit.int\_insts*) and % fp (*system.cpu.commit.fp\_insts*) are the percentages of integer and floating-point instructions, respectively, with respect to the total number of instructions.
* iCache and dCache are the instruction and data caches, respectively.
  + iCache miss rate: *system.cpu.icache.overall\_miss\_rate::total*
  + dCache miss rate: *system.cpu.dcache.overall\_miss\_rate::total*
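
The stats listed above can be collected by hand or with a short script. The following is a minimal sketch, assuming each stat appears on its own line as a name, a numeric value, and an optional `#` comment. The helper names `parse_stats` and `percent` are hypothetical, and the sample values are made up for illustration.

```python
def parse_stats(text):
    """Map stat names to numeric values from stats-file text.

    Assumes one stat per line: name, whitespace, value, optional '# comment'.
    Lines that do not fit this pattern (banners, blank lines) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # second field is not a number, e.g. a banner line
    return stats


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Made-up values in the assumed layout, not real simulation output.
sample = """
---------- Begin Simulation Statistics ----------
sim_insts                        1000   # Number of instructions simulated
system.cpu.commit.int_insts       600   # Number of committed integer instructions
system.cpu.commit.fp_insts        100   # Number of committed floating point instructions
"""

stats = parse_stats(sample)
total = stats["sim_insts"]
print("% int:", percent(stats["system.cpu.commit.int_insts"], total))
print("% fp :", percent(stats["system.cpu.commit.fp_insts"], total))
```

In practice you would read the text from your own stats file, for example with `open(path).read()`, instead of the inline sample.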

**PART 2**

Fill in the following tables with the required information and answer the following questions. The *notes* below contain the stats that should be used for filling out the tables.

1. L2 cache performance impact *(30 pts)*

| **Benchmark** | L2 cache miss rate | Simulation seconds without L2 | Simulation seconds with L2 | % time improvement with L2 | IPC without L2 | IPC with L2 | % IPC improvement with L2 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a2time01 |  |  |  |  |  |  |  |
| cacheb01 |  |  |  |  |  |  |  |
| bitmnp01 |  |  |  |  |  |  |  |
| mcf |  |  |  |  |  |  |  |
| libquantum |  |  |  |  |  |  |  |
| **GEOMETRIC MEAN** |  |  |  |  |  |  |  |

\*Simulation seconds is *sim\_seconds*; the L2 cache miss rate is *system.l2.overall\_miss\_rate::total*.
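
The handout does not spell out how to compute the improvement columns. One common convention, shown here only as an assumption, measures each change relative to the configuration without L2, so that a positive number means the L2 cache helped: time improves when it goes down, and IPC improves when it goes up. The function names and sample numbers below are illustrative.

```python
def time_improvement_pct(t_without, t_with):
    """Percent reduction in simulated time, relative to the run without L2."""
    return 100.0 * (t_without - t_with) / t_without


def ipc_improvement_pct(ipc_without, ipc_with):
    """Percent increase in IPC, relative to the run without L2."""
    return 100.0 * (ipc_with - ipc_without) / ipc_without


# Made-up numbers, not benchmark results: both work out to roughly 20%.
print(time_improvement_pct(0.010, 0.008))
print(ipc_improvement_pct(0.5, 0.6))
```

If the L2 cache makes a benchmark slower, both functions return a negative value, and such a row cannot be included in a geometric mean as-is.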

Think about and discuss the results and your observations with your group.

Some benchmarks’ L2 cache miss rates are much higher than the L1 cache miss rates. Why do you think that is the case?

Some benchmarks for which the L1 cache seems to suffice (i.e., very low miss rates) have very high L2 cache miss rates. Why do you think that is the case?