# 2.1

The results in `spark.out` were obtained by running the program `spark.c` on lab5p7.

In [25]:
import tabulate

NR_ITERATIONS = 100  # defined in spark.c
CACHE_MIN     = 4    # in KiB

sample = [["Cache Size (KiB)", "Avg Elapsed Time (s)", "# Accesses", "Avg Time per Access (ns)"]]

cache_size = CACHE_MIN
cum_elapsed_time = 0.0
elapsed_time_obs_count = 0

with open("spark.out", "r") as fp:
    next(fp)              # skip the column-header line
    line = fp.readline()  # first [LOG] line; read only to prime the loop condition

    while line:
        line = fp.readline()

        # A new [LOG] line (or end of file) closes the group for the current array size.
        if line.startswith("[LOG]") or line == "":
            avg_elapsed_time = cum_elapsed_time / elapsed_time_obs_count
            # In spark.c every stride value performs the same number of accesses.
            # Taking stride = 1, that is one access per array byte
            # (cache_size * 2**10 bytes), repeated NR_ITERATIONS times.
            nr_accesses = cache_size * 2**10 * NR_ITERATIONS

            sample.append([f"{cache_size} KiB", avg_elapsed_time, nr_accesses, (avg_elapsed_time / nr_accesses) * 10**9])

            cache_size *= 2
            cum_elapsed_time = 0.0
            elapsed_time_obs_count = 0
            continue

        # Data line, tab-separated: size, stride, elapsed(s), cycles
        _, _, elapsed_time, *_ = line.split("\t")
        cum_elapsed_time += float(elapsed_time)
        elapsed_time_obs_count += 1


tabulate.tabulate(sample, headers="firstrow", tablefmt="html")

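<table>
  <tr><th>Cache Size (KiB)</th><th>Avg Elapsed Time (s)</th><th># Accesses</th><th>Avg Time per Access (ns)</th></tr>
  <tr><td>4 KiB</td><td>0.00092925</td><td>409600</td><td>2.26868</td></tr>
  <tr><td>8 KiB</td><td>0.00188254</td><td>819200</td><td>2.29802</td></tr>
  <tr><td>16 KiB</td><td>0.00382021</td><td>1638400</td><td>2.33167</td></tr>
  <tr><td>32 KiB</td><td>0.00756427</td><td>3276800</td><td>2.30843</td></tr>
  <tr><td>64 KiB</td><td>0.0167127</td><td>6553600</td><td>2.55015</td></tr>
  <tr><td>128 KiB</td><td>0.0396387</td><td>13107200</td><td>3.02419</td></tr>
  <tr><td>256 KiB</td><td>0.0934102</td><td>26214400</td><td>3.56332</td></tr>
  <tr><td>512 KiB</td><td>0.203428</td><td>52428800</td><td>3.88008</td></tr>
  <tr><td>1024 KiB</td><td>0.442602</td><td>104857600</td><td>4.22098</td></tr>
  <tr><td>2048 KiB</td><td>0.921256</td><td>209715200</td><td>4.39289</td></tr>
</table>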


We conclude that the cache capacity is 32 KiB because that is where we see the first significant increase in average time per access (2.31 ns at 32 KiB to 2.55 ns at 64 KiB); up to 32 KiB the time stays roughly constant. It is tempting to guess that 64 KiB is the cache capacity, since the jump from 2.55 ns to 3.02 ns is much bigger. However, consider the following data in `spark.out`:
```
size	stride	elapsed(s)	cycles
(...)
[LOG]: running with array of size 64 KiB
65536	1	    0.014075	14076	
65536	2	    0.013435	13435	
65536	4	    0.015363	15363	
65536	8	    0.015878	15878	
65536	16	    0.015519	15519	
65536	32	    0.015379	15379	
65536	64	    0.013368	13368	
65536	128	    0.013804	13804	
65536	256	    0.013328	13329	
65536	512	    0.013572	13571	
65536	1024    0.013174	13174	
65536	2048    0.028942	28941	
65536	4096    0.035842	35842	
65536	8192    0.015605	15605	
65536	16384   0.015422	15422	
65536	32768   0.014697	14698
```
The elapsed times for stride values of 2048 and 4096 are much larger than the other observations. This indicates that the miss rate has increased for those strides (even though some cache misses were already happening at other stride values), meaning our L1 cache has filled up and cannot hold 64 KiB of data.
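The slow strides can also be picked out mechanically instead of by eye. The following is a minimal sketch, not part of the original analysis: the helper name `slow_strides` and the cutoff of 1.5 times the median are assumptions chosen for illustration, and the data is copied from the 64 KiB block of `spark.out` shown above.

```python
from statistics import median

# (stride, elapsed seconds) pairs copied from the 64 KiB block of spark.out
rows_64k = [
    (1, 0.014075), (2, 0.013435), (4, 0.015363), (8, 0.015878),
    (16, 0.015519), (32, 0.015379), (64, 0.013368), (128, 0.013804),
    (256, 0.013328), (512, 0.013572), (1024, 0.013174), (2048, 0.028942),
    (4096, 0.035842), (8192, 0.015605), (16384, 0.015422), (32768, 0.014697),
]


def slow_strides(rows, factor=1.5):
    """Return the strides whose elapsed time exceeds factor * median.

    The factor of 1.5 is an arbitrary, illustrative threshold.
    """
    typical = median(t for _, t in rows)
    return [stride for stride, t in rows if t > factor * typical]


print(slow_strides(rows_64k))  # [2048, 4096]
```

With this threshold only strides 2048 and 4096 are flagged, which matches the two observations singled out in the text.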

<table>
  <tr>
    <th>Array Size:</th>
    <td>8 KiB</td>
    <td>16 KiB</td>
    <td>32 KiB</td>
    <td>64 KiB</td>
    <td>128 KiB</td>
  </tr>
  <tr>
    <th>t2 - t1 (s):</th>
    <td>0.00188254</td>
    <td>0.00382021</td>
    <td>0.00756427</td>
    <td>0.0167127 </td>
    <td>0.0396387 </td>
  </tr>
  <tr>
    <th># accesses[i]:</th>
    <td>819200</td>
    <td>1638400</td>
    <td>3276800</td>
    <td>6553600</td>
    <td>13107200</td>
  </tr>
  <tr>
    <th>Mean access time (ns):</th>
    <td>2.29802</td>
    <td>2.33167</td>
    <td>2.30843</td>
    <td>2.55015</td>
    <td>3.02419</td>
  </tr>
</table>
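The "first significant increase" reading of this table can be checked numerically. The sketch below is illustrative only: the function name `first_jump` and the 5% threshold are assumptions, and the values are the mean access times from the table above.

```python
# Mean access time (ns) per array size (KiB), copied from the table above.
mean_access_ns = {8: 2.29802, 16: 2.33167, 32: 2.30843, 64: 2.55015, 128: 3.02419}


def first_jump(times, threshold=0.05):
    """Return (previous size, size) for the first step whose relative
    increase in access time exceeds threshold, or None if there is none.

    The 5% threshold is an assumption made for illustration.
    """
    sizes = sorted(times)
    for prev, cur in zip(sizes, sizes[1:]):
        relative_increase = (times[cur] - times[prev]) / times[prev]
        if relative_increase > threshold:
            return prev, cur
    return None


print(first_jump(mean_access_ns))  # (32, 64)
```

The steps up to 32 KiB change the access time by less than 2%, while the step from 32 KiB to 64 KiB is roughly a 10% increase, so the first jump lands between 32 KiB and 64 KiB.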

# 2.2

In the chart, we can identify one group of array sizes whose access times are small and another group whose access times are significantly higher. We conclude that the cache size is 64 KiB: up to that point the whole array fits in the cache, so read and write times stay small and roughly constant regardless of stride. At 128 KiB the array no longer fits in the cache all at once, which results in cache misses and, consequently, higher read and write times.


# 2.3

For arrays whose size exceeds the cache capacity (>= 128 KiB), we can determine the cache block size by finding the stride value at which the read and write times stabilize. For stride values below 16, the access time increases with each stride step, because fewer consecutive accesses land in the same cache block. From stride 16 onward the read and write times stabilize: every access falls in a different cache block and, since the array does not fit in the cache, the miss rate reaches 100%. All larger stride values keep the 100% miss rate.

We can therefore determine the cache block size to be **16 bytes** (each element is one byte, since our array is of type `uint8_t`).
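The stride argument can be reproduced with a toy model. This is a sketch under stated assumptions, not a model of the real hardware or of `spark.c`: it assumes a direct-mapped cache of 64 KiB with 16-byte blocks, a byte array swept front to back at a fixed stride, and it measures only the second sweep so that cold misses are excluded. The function name `miss_rate` and its parameters are hypothetical.

```python
def miss_rate(array_bytes, stride, cache_bytes=64 * 1024, block_bytes=16):
    """Steady-state miss rate of a strided sweep over a byte array.

    Assumes a direct-mapped cache (a simplification; real L1 caches are
    usually set-associative).
    """
    num_lines = cache_bytes // block_bytes
    lines = [None] * num_lines  # block number currently held by each cache line
    misses = accesses = 0
    for sweep in range(2):      # sweep 0 warms the cache, sweep 1 is measured
        for addr in range(0, array_bytes, stride):
            block = addr // block_bytes
            index = block % num_lines
            hit = lines[index] == block
            lines[index] = block
            if sweep == 1:
                accesses += 1
                misses += not hit
    return misses / accesses


# Array of 128 KiB, twice the assumed cache capacity.
for stride in (1, 2, 4, 8, 16, 32, 64):
    print(stride, miss_rate(128 * 1024, stride))
```

In this model the miss rate for the 128 KiB array doubles with each stride step (1/16, 1/8, 1/4, 1/2) until it reaches 1.0 at stride 16, and it stays at 1.0 for the larger strides. Calling `miss_rate(64 * 1024, stride)` for an array that fits in the cache gives 0.0 for any stride.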