# VisCPU: data pre-processing

Metrics to calculate:
- IPC (Instructions Per Cycle): how many instructions does the CPU complete per clock cycle, on average?
    - How to calculate? Instructions divided by CPU cycles;
    - How to interpret? Higher is better. As a rule of thumb, values above one are good, although the achievable maximum depends on the CPU;
    - https://stackoverflow.com/questions/51438407/how-to-correctly-measure-ipc-instructions-per-cycle-with-perf
    
- Stalled cycles (frontend and backend) in `perf stat` results: https://stackoverflow.com/questions/22165299/what-are-stalled-cycles-frontend-and-stalled-cycles-backend-in-perf-stat-resul

- How well is the cache working?
    - How to calculate? Cache misses divided by instructions (misses per instruction); lower is better;
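
The two metrics above can be sketched on toy data as follows. This is a minimal illustration, not part of `viscpu`: the helper name `derived_metrics` is hypothetical, and it assumes a DataFrame with an `event` column and a numeric `counter_value` column, with events named `instructions`, `cpu-cycles` and `cache-misses` as in the datasets loaded later in this notebook.

```python
import pandas as pd


def derived_metrics(df: pd.DataFrame) -> dict:
    """Compute IPC and cache misses per instruction (hypothetical helper).

    Assumes `df` has an "event" column and a numeric "counter_value" column.
    """
    # Total each counter over all CPUs and time intervals.
    totals = df.groupby("event")["counter_value"].sum()
    instructions = totals.get("instructions", 0)
    cycles = totals.get("cpu-cycles", 0)
    misses = totals.get("cache-misses", 0)
    return {
        # Guard against empty or missing counters instead of dividing by zero.
        "ipc": instructions / cycles if cycles > 0 else float("nan"),
        "misses_per_instruction": (
            misses / instructions if instructions > 0 else float("nan")
        ),
    }


# Toy data: 4000 instructions, 4000 cycles and 30 cache misses in total.
toy = pd.DataFrame({
    "event": ["instructions", "cpu-cycles", "cache-misses",
              "instructions", "cpu-cycles"],
    "counter_value": [3000, 2000, 30, 1000, 2000],
})
metrics = derived_metrics(toy)
print(metrics)
```

Summing the counters before dividing gives one overall ratio; computing the ratio per CPU or per time interval would need a different grouping.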

## Dependencies and imports

Install and import required packages.

In [None]:
!pip install pandas
!pip install psutil
!conda install -c plotly plotly-orca -y

In [2]:
import json
import pandas as pd
import numpy as np
from viscpu import utils, perf_record

%load_ext autoreload
%autoreload 2

## perf stat

### Load datasets

Load the two datasets to compare. The first dataset is the baseline for the comparison. For example, if the first experiment has 10 cache misses and the second has 15, the number of cache misses increased.
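
As a worked example of the baseline convention, the change can be expressed as a percentage of the first dataset. The function name `percent_change` is hypothetical; the formula has the same form as the `mean_value` computation in the aggregation cell of this notebook, including returning 0 when the baseline is not positive.

```python
def percent_change(base: float, other: float) -> float:
    """Percentage change of `other` relative to `base` (hypothetical helper)."""
    if base <= 0:
        # Same convention as the aggregation cell: report 0 when there is
        # no usable baseline.
        return 0.0
    return (other / base - 1) * 100


# 10 cache misses in the first experiment, 15 in the second.
print(percent_change(10, 15))  # 50.0
```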

In [42]:
dataset_1 = "../applications/simple-ff-test/data/perf-test-1.csv"
dataset_2 = "../applications/simple-ff-test/data/perf-test-2.csv"

column_names=["time", "cpu", "counter_value", "ignore_1", "event",
              "ignore_2", "ignore_3", "ignore_4", "ignore_5", "ignore_6"]
usecols=["time", "cpu", "counter_value", "event"]

df_1 = pd.read_csv(dataset_1, skiprows=1, header=None, names=column_names, usecols=usecols)
df_1["time"] = df_1["time"].round(4)
df_2 = pd.read_csv(dataset_2, skiprows=1, header=None, names=column_names, usecols=usecols)
df_2["time"] = df_2["time"].round(4)

df_1["counter_value"] = df_1["counter_value"].replace(["<not counted>"], 0).astype(int)
df_2["counter_value"] = df_2["counter_value"].replace(["<not counted>"], 0).astype(int)
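
The cell above maps the `<not counted>` placeholder to 0 before casting to `int`. `perf stat` can also report `<not supported>` for events the hardware does not provide, which would make the cast fail. A minimal sketch of a more tolerant conversion, assuming any non-numeric placeholder should be treated as 0 (the helper name `clean_counter` is hypothetical):

```python
import pandas as pd


def clean_counter(values: pd.Series) -> pd.Series:
    """Convert perf counter strings to integers (hypothetical helper).

    Any value that cannot be parsed as a number becomes 0. This is an
    assumption; dropping such rows may be more appropriate in some analyses.
    """
    # errors="coerce" turns unparseable entries into NaN, which we then zero.
    return pd.to_numeric(values, errors="coerce").fillna(0).astype(int)


raw = pd.Series(["1200", "<not counted>", "<not supported>", "35"])
cleaned = clean_counter(raw)
print(cleaned.tolist())  # [1200, 0, 0, 35]
```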

Define captured events:

In [43]:
events = list(df_1["event"].unique())

### Pre-processing

In [44]:
output = {
    "events": events,
    "dataset-1": {"raw": {}, "aggregated": {}},
    "dataset-2": {"raw": {}, "aggregated": {}},
    "comparison-1-2": {},
    "comparison-2-1": {}
}

Build the CPU layout. The `cpus_per_row` argument sets how many CPUs are shown on each row.

In [45]:
cpu_labels, cpu_setup = utils.get_cpu_setup(df_1["cpu"].unique(), cpus_per_row=4)

output["cpu_labels"] = cpu_labels
output["cpu_setup"] = cpu_setup
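
To illustrate what arranging CPUs in rows could look like, here is a small sketch. It is not the implementation of `utils.get_cpu_setup`, whose return format is not shown in this notebook; the function `layout_cpus`, its label format, and the assumption of integer CPU ids are all hypothetical.

```python
def layout_cpus(cpus, cpus_per_row=4):
    """Split CPU ids into rows of at most `cpus_per_row` (hypothetical helper).

    Assumes the ids are integers or can be converted with int().
    """
    ordered = sorted(int(c) for c in cpus)
    # Slice the ordered ids into consecutive chunks, one chunk per row.
    rows = [ordered[i:i + cpus_per_row]
            for i in range(0, len(ordered), cpus_per_row)]
    labels = [[f"CPU {c}" for c in row] for row in rows]
    return labels, rows


labels, rows = layout_cpus(range(6), cpus_per_row=4)
print(rows)    # [[0, 1, 2, 3], [4, 5]]
print(labels)  # [['CPU 0', 'CPU 1', 'CPU 2', 'CPU 3'], ['CPU 4', 'CPU 5']]
```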

Load the data for each captured event and write it to `output`:

In [48]:
for event in events:
    print(f"Processing event '{event}'...")
    times, captures = utils.get_event_data(df_1, cpu_setup, event)
    output["dataset-1"]["raw"][event] = {
        "captures": captures,
        "captures_min": float(df_1[df_1["event"] == event]["counter_value"].min()),
        "captures_max": float(df_1[df_1["event"] == event]["counter_value"].max())
    }
    
    times, captures = utils.get_event_data(df_2, cpu_setup, event)
    output["dataset-2"]["raw"][event] = {
        "captures": captures,
        "captures_min": float(df_2[df_2["event"] == event]["counter_value"].min()),
        "captures_max": float(df_2[df_2["event"] == event]["counter_value"].max())
    }
    print(f"Finished event '{event}'.")

Processing event 'cpu-cycles'...
Finished event 'cpu-cycles'.
Processing event 'instructions'...
Finished event 'instructions'.
Processing event 'cache-misses'...
Finished event 'cache-misses'.
Processing event 'cache-references'...
Finished event 'cache-references'.
Processing event 'L1-dcache-load-misses'...
Finished event 'L1-dcache-load-misses'.
Processing event 'L1-dcache-loads'...
Finished event 'L1-dcache-loads'.
Processing event 'L1-dcache-stores'...
Finished event 'L1-dcache-stores'.
Processing event 'L1-icache-load-misses'...
Finished event 'L1-icache-load-misses'.
Processing event 'LLC-loads'...
Finished event 'LLC-loads'.
Processing event 'LLC-load-misses'...
Finished event 'LLC-load-misses'.
Processing event 'LLC-stores'...
Finished event 'LLC-stores'.
Processing event 'LLC-store-misses'...
Finished event 'LLC-store-misses'.
Processing event 'mem-loads'...
Finished event 'mem-loads'.
Processing event 'mem-stores'...
Finished event 'mem-stores'.
Processing event 'Joules'...

Aggregate the time series of each event. This makes it possible to compare the overall performance of the experiments.

In [57]:
for event in events:
    print(f"Processing event '{event}'...")
    df_1_aggr = df_1[df_1["event"] == event].groupby(["cpu"], as_index=False)["counter_value"]
    df_2_aggr = df_2[df_2["event"] == event].groupby(["cpu"], as_index=False)["counter_value"]
    
    if event not in output["dataset-1"]["aggregated"]:
        output["dataset-1"]["aggregated"][event] = {
            "mean": {},
            "mean_relative": {},
            "sum": {},
            "sum_relative": {},
        }
        output["dataset-2"]["aggregated"][event] = {
            "mean": {},
            "mean_relative": {},
            "sum": {},
            "sum_relative": {}
        }
    
    if event not in output["comparison-1-2"]:
        output["comparison-1-2"][event] = {}
        output["comparison-2-1"][event] = {}
    
    a = utils.transform_cpu_data(df_1_aggr.mean(), cpu_setup)
    a_sum = np.sum(a)
    b = utils.transform_cpu_data(df_2_aggr.mean(), cpu_setup)
    b_sum = np.sum(b)
    output["dataset-1"]["aggregated"][event]["mean"] = a
    output["dataset-1"]["aggregated"][event]["mean_relative"] = a if a_sum <= 0 else ((np.array(a) / a_sum) * 100).tolist()
    output["dataset-2"]["aggregated"][event]["mean"] = b
    output["dataset-2"]["aggregated"][event]["mean_relative"] = b if b_sum <= 0 else ((np.array(b) / b_sum) * 100).tolist()
    
    output["comparison-1-2"][event]["mean"] = (np.array(a) - np.array(b)).tolist()
    output["comparison-1-2"][event]["mean_relative"] = b if b_sum <= 0 else ((np.array(a) / a_sum - np.array(b) / b_sum) * 100).tolist()
    output["comparison-1-2"][event]["mean_value"] = 0 if np.mean(a) <= 0 else ((np.mean(b) / np.mean(a)) - 1) * 100
    output["comparison-2-1"][event]["mean"] = (np.array(b) - np.array(a)).tolist()
    output["comparison-2-1"][event]["mean_relative"] = a if a_sum <= 0 else ((np.array(b) / b_sum - np.array(a) / a_sum) * 100).tolist()
    output["comparison-2-1"][event]["mean_value"] = 0 if np.mean(b) <= 0 else ((np.mean(a) / np.mean(b)) - 1) * 100
    
    a = utils.transform_cpu_data(df_1_aggr.sum(), cpu_setup)
    a_sum = np.sum(a)
    b = utils.transform_cpu_data(df_2_aggr.sum(), cpu_setup)
    b_sum = np.sum(b)
    output["dataset-1"]["aggregated"][event]["sum"] = a
    output["dataset-1"]["aggregated"][event]["sum_relative"] = a if a_sum <= 0 else ((np.array(a) / a_sum) * 100).tolist()
    output["dataset-2"]["aggregated"][event]["sum"] = b
    output["dataset-2"]["aggregated"][event]["sum_relative"] = b if b_sum <= 0 else ((np.array(b) / b_sum) * 100).tolist()
    
    output["comparison-1-2"][event]["sum"] = (np.array(a) - np.array(b)).tolist()
    output["comparison-1-2"][event]["sum_relative"] = b if b_sum <= 0 else (((np.array(a) / a_sum) - (np.array(b) / b_sum)) * 100).tolist()
    output["comparison-1-2"][event]["sum_value"] = 0 if np.sum(b) <= 0 else ((np.sum(a) / np.sum(b)) - 1) * 100
    output["comparison-2-1"][event]["sum"] = (np.array(b) - np.array(a)).tolist()
    output["comparison-2-1"][event]["sum_relative"] = a if a_sum <= 0 else (((np.array(b) / b_sum) - (np.array(a) / a_sum)) * 100).tolist()
    output["comparison-2-1"][event]["sum_value"] = 0 if np.sum(a) <= 0 else ((np.sum(b) / np.sum(a)) - 1) * 100
    
    print(f"Finished event '{event}'.")

Processing event 'cpu-cycles'...
Finished event 'cpu-cycles'.
Processing event 'instructions'...
Finished event 'instructions'.
Processing event 'cache-misses'...
Finished event 'cache-misses'.
Processing event 'cache-references'...
Finished event 'cache-references'.
Processing event 'L1-dcache-load-misses'...
Finished event 'L1-dcache-load-misses'.
Processing event 'L1-dcache-loads'...
Finished event 'L1-dcache-loads'.
Processing event 'L1-dcache-stores'...
Finished event 'L1-dcache-stores'.
Processing event 'L1-icache-load-misses'...
Finished event 'L1-icache-load-misses'.
Processing event 'LLC-loads'...
Finished event 'LLC-loads'.
Processing event 'LLC-load-misses'...
Finished event 'LLC-load-misses'.
Processing event 'LLC-stores'...
Finished event 'LLC-stores'.
Processing event 'LLC-store-misses'...
Finished event 'LLC-store-misses'.
Processing event 'mem-loads'...
Finished event 'mem-loads'.
Processing event 'mem-stores'...
Finished event 'mem-stores'.
Processing event 'Joules'...
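
The per-CPU aggregation in the cell above can be illustrated on toy data. This sketch covers only the pandas part (per-CPU sum and mean, plus each CPU's share of the total) and leaves out `utils.transform_cpu_data`, whose behavior is not shown here. The toy values are made up for illustration.

```python
import pandas as pd

# Two CPUs, two time intervals each, for a single event.
toy = pd.DataFrame({
    "cpu": [0, 0, 1, 1],
    "event": ["cache-misses"] * 4,
    "counter_value": [10, 30, 20, 40],
})

per_cpu = toy[toy["event"] == "cache-misses"].groupby("cpu")["counter_value"]
sums = per_cpu.sum()    # cpu 0: 40, cpu 1: 60
means = per_cpu.mean()  # cpu 0: 20.0, cpu 1: 30.0

# Share of the total per CPU, in percent; fall back to zeros when the
# total is not positive, to avoid dividing by zero.
total = sums.sum()
share = sums * 100 / total if total > 0 else sums * 0.0

print(sums.to_dict())
print(means.to_dict())
print(share.to_dict())
```

The relative values make two runs comparable even when their absolute counts differ, since each CPU is expressed as a fraction of its own run's total.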

### Output

Write `output` to a JSON file:

In [58]:
!mkdir -p data

with open("data/simple-ff-test-1-2.json", "w") as f:
    json.dump(output, f)


## perf record

### Parse datasets

In [74]:
dataset_1 = "../applications/simple-ff-test/data/perf-record-1.txt"
dataset_1_output = "../applications/simple-ff-test/data/perf-record-1.csv"
dataset_2 = "../applications/simple-ff-test/data/perf-record-2.txt"
dataset_2_output = "../applications/simple-ff-test/data/perf-record-2.csv"

# perf_record.parse_record_dataset(dataset_1, dataset_1_output)
# perf_record.parse_record_dataset(dataset_2, dataset_2_output)

df_1 = pd.read_csv(dataset_1_output)
df_2 = pd.read_csv(dataset_2_output)

In [75]:
# Make timestamps relative to the first sample, then truncate them to whole seconds.
df_1["time"] = df_1["time"] - df_1["time"].min()
df_1["time_second"] = df_1["time"].astype(int)
# Per second, event and CPU: total the counter and collect the `stack` values into a list.
df_1 = df_1.groupby(["time_second", "event", "cpu"]).agg({"counter": "sum", "stack": list}).reset_index()
print(df_1.head(20))

# Same steps for the second dataset.
df_2["time"] = df_2["time"] - df_2["time"].min()
df_2["time_second"] = df_2["time"].astype(int)
df_2 = df_2.groupby(["time_second", "event", "cpu"]).agg({"counter": "sum", "stack": list}).reset_index()
print(df_2.head(20))

    time_second                  event  cpu    counter  \
0             0  L1-dcache-load-misses    0       2148   
1             0  L1-dcache-load-misses    1    2055373   
2             0  L1-dcache-load-misses    2        331   
3             0  L1-dcache-load-misses    3        119   
4             0  L1-dcache-load-misses    4    1836128   
5             0  L1-dcache-load-misses    5        240   
6             0  L1-dcache-load-misses    6        114   
7             0  L1-dcache-load-misses    7       1439   
8             0  L1-dcache-load-misses    8        166   
9             0  L1-dcache-load-misses    9    1029360   
10            0  L1-dcache-load-misses   10        362   
11            0  L1-dcache-load-misses   11      39609   
12            0        L1-dcache-loads    1  406915287   
13            0        L1-dcache-loads    4  118291750   
14            0        L1-dcache-loads    9  368596511   
15            0       L1-dcache-stores    1  147452749   
16            