# Preparing Results CSV Files Before Plotting
**Table of Contents**
- [Preparing Results CSV Files Before Plotting](#preparing-results-csv-files-before-plotting)
- [Import Required Libraries](#import-required-libraries)
- [Load Results CSV Data](#load-results-csv-data)
- [Case 1: Extracting Information from Standard Model Filenames](#case-1-extracting-information-from-standard-model-filenames)
- [Case 2: Manual Column Creation (For Custom Filenames)](#case-2-manual-column-creation-for-custom-filenames)
- [(Optional) Merging with Model Metrics](#optional-merging-with-model-metrics)
- [Save Prepared Results to CSV](#save-prepared-results-to-csv)

Before using `AnalysisPlotter` to generate graphs, make sure your results file contains all the columns it requires. This notebook walks through preparing those columns.

If you want to plot energy consumption **versus model performance metrics** (e.g., accuracy, F1-score), you must add your metric of choice as a column.
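
A quick way to see what is still missing is to compare the DataFrame's columns against the ones you expect. The sketch below assumes the required columns are the three created later in this notebook (`GPR`, `Architecture`, `Pruning Distribution`); the helper name `missing_columns` is hypothetical and not part of `PruneEnergyAnalizer`.

```python
import pandas as pd

# Assumed required columns: the three created later in this notebook.
REQUIRED_COLUMNS = ["GPR", "Architecture", "Pruning Distribution"]


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names absent from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]


# Toy frame standing in for a freshly loaded results file
toy_df = pd.DataFrame({"MODEL_NAME": ["model_a.pth"], "BATCH_SIZE": [1], "GPR": [40]})
print(missing_columns(toy_df))  # ['Architecture', 'Pruning Distribution']
```

If you plan to plot against a metric, append its column name to the list before running the check.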



---

## Import Required Libraries

In [9]:
import sys
import os

lib_path = os.path.abspath(os.path.join(os.getcwd(), ".."))
if lib_path not in sys.path:
    sys.path.append(lib_path)

import pandas as pd
from PruneEnergyAnalizer import parse_model_name


---

## Load Results CSV Data

In [10]:
# Load the CSV file
results_df = pd.read_csv("results.csv")
results_df

| | MODEL_NAME | BATCH_SIZE | Mean Time per Sample (s) | FPS | STD Time per Sample (s) | Mean Energy per Sample (J) | STD Energy per Sample (J) | Parameters | FLOPs |
|---|---|---|---|---|---|---|---|---|---|
| 0 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 1 | 0.001067 | 937.360405 | 1.630872e-04 | 0.179271 | 0.020604 | 29288367 | 287175794 |
| 1 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 8 | 0.000128 | 7820.735324 | 7.151073e-06 | 0.034128 | 0.001762 | 29288367 | 2297406352 |
| 2 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 16 | 0.000076 | 13163.687820 | 5.275919e-06 | 0.022645 | 0.000626 | 29288367 | 4594812704 |
| 3 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 32 | 0.000063 | 15791.149994 | 1.574052e-07 | 0.019305 | 0.000840 | 29288367 | 9189625408 |
| 4 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 64 | 0.000060 | 16753.248936 | 2.279891e-07 | 0.018104 | 0.000374 | 29288367 | 18379250816 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 985 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 1 | 0.003198 | 312.691602 | 1.393133e-06 | 1.014720 | 0.002878 | 105475850 | 11196053973 |
| 986 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 8 | 0.001798 | 556.158070 | 2.682008e-07 | 0.560247 | 0.000482 | 105475850 | 89568431784 |
| 987 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 16 | 0.001701 | 587.937494 | 8.066138e-07 | 0.542115 | 0.001701 | 105475850 | 179136863568 |
| 988 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 32 | 0.001611 | 620.593181 | 2.604402e-07 | 0.513291 | 0.000686 | 105475850 | 358273727136 |



---

## Case 1: Extracting Information from Standard Model Filenames

If your model filenames follow a specific naming convention, you can use the function `parse_model_name()` to automatically create the columns: **GPR** (Global Pruning Ratio), **Architecture**, and **Pruning Distribution**.

### Expected Filename Structure

Filenames should follow this pattern:

```
{ARCHITECTURE}_{DATASET}_{PRUNING_DISTRIBUTION}_GPR-{PRUNING_RATIO}_{SEED}.pth
```

The placeholders are:

- `{ARCHITECTURE}`: Model architecture (e.g., `AlexNet`, `VGG16`)
- `{DATASET}`: Dataset used for training (e.g., `CIFAR10`)
- `{PRUNING_DISTRIBUTION}`: Pruning distribution strategy (e.g., `random_PD3`)
- `{PRUNING_RATIO}`: Global pruning ratio as a percentage (e.g., `20` for 20%)
- `{STATE}`: Either `PRUNED` or `UNPRUNED`

The sample filenames in this notebook place the state and a `SEED` token after the pruning ratio (e.g., `AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_...`).

### Unpruned Model Filenames

For unpruned models, use:

```
{ARCHITECTURE}_{DATASET}_UNPRUNED.pth
```

These have a pruning ratio of `0%` and serve as baselines for comparison with pruned models.
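
To make the two patterns concrete, here is a minimal regex sketch that splits such filenames into fields. It is an illustration only and not the implementation of `parse_model_name()`, which may apply different rules; the function `split_model_name`, both regular expressions, and the sample filenames are hypothetical. It assumes the architecture and dataset tokens contain no underscores.

```python
import re

# Hypothetical patterns for illustration only; parse_model_name() may differ.
# Assumes the architecture and dataset tokens contain no underscores.
PRUNED_RE = re.compile(
    r"^(?P<arch>[^_]+)_(?P<dataset>[^_]+)_(?P<dist>.+)_GPR-(?P<gpr>\d+)_.+\.pth$"
)
UNPRUNED_RE = re.compile(r"^(?P<arch>[^_]+)_(?P<dataset>[^_]+)_UNPRUNED\.pth$")


def split_model_name(name):
    """Split a filename into fields, or return None if neither pattern matches."""
    match = UNPRUNED_RE.match(name)
    if match:
        # Unpruned baselines carry no distribution and a pruning ratio of 0
        return {"Architecture": match["arch"], "Pruning Distribution": None, "GPR": 0}
    match = PRUNED_RE.match(name)
    if match:
        return {
            "Architecture": match["arch"],
            "Pruning Distribution": match["dist"],
            "GPR": int(match["gpr"]),
        }
    return None


# Made-up filenames that follow the documented patterns
pruned = split_model_name("VGG16_CIFAR10_random_PD3_GPR-20_42.pth")
baseline = split_model_name("AlexNet_CIFAR10_UNPRUNED.pth")
custom = split_model_name("my_custom_model.pth")
print(pruned, baseline, custom, sep="\n")
```

The sketch keeps the whole distribution token (`random_PD3`), whereas the output of `parse_model_name()` in the next cell shows a shorter label (`PD3`). A filename that matches neither pattern returns `None`, which is the situation Case 2 addresses.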

In [11]:
results_df = parse_model_name(results_df)
results_df[["MODEL_NAME", "GPR", "Architecture", "Pruning Distribution"]]


| | MODEL_NAME | GPR | Architecture | Pruning Distribution |
|---|---|---|---|---|
| 0 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 40 | AlexNet | PD3 |
| 1 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 40 | AlexNet | PD3 |
| 2 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 40 | AlexNet | PD3 |
| 3 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 40 | AlexNet | PD3 |
| 4 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 40 | AlexNet | PD3 |
| ... | ... | ... | ... | ... |
| 985 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 15 | VGG16 | PD5 |
| 986 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 15 | VGG16 | PD5 |
| 987 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 15 | VGG16 | PD5 |
| 988 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 15 | VGG16 | PD5 |



---

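## Case 2: Manual Column Creation (For Custom Filenames)

If your filenames do not follow the convention above, supply the columns yourself, for example by merging a metadata file keyed on `MODEL_NAME`.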

In [None]:
# Read the metadata file
metadata_df = pd.read_csv('metadata.csv')  # must contain MODEL_NAME, Architecture, Pruning Distribution

# Merge based on MODEL_NAME
results_df = pd.merge(results_df, metadata_df, on='MODEL_NAME', how='left')

# Uncomment to set a default value for "Pruning Distribution" because it is not present in the metadata
# results_df["Pruning Distribution"] = 'PD0'

# Display the merged DataFrame
results_df.head()
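
A left merge keeps every row of the results, so a model that is missing from the metadata file silently ends up with `NaN` in the new columns. The sketch below, using toy in-memory frames with made-up model names, shows one way to list such rows with the `indicator` option of `pd.merge`.

```python
import pandas as pd

# Toy stand-ins for results_df and metadata_df (made-up model names)
results = pd.DataFrame({"MODEL_NAME": ["net_a.pth", "net_b.pth"], "BATCH_SIZE": [1, 1]})
metadata = pd.DataFrame(
    {
        "MODEL_NAME": ["net_a.pth"],
        "Architecture": ["AlexNet"],
        "Pruning Distribution": ["PD3"],
    }
)

# indicator=True adds a "_merge" column saying where each row was found
merged = pd.merge(results, metadata, on="MODEL_NAME", how="left", indicator=True)

# Rows found only in the results have no metadata entry
unmatched = merged.loc[merged["_merge"] == "left_only", "MODEL_NAME"].tolist()
print(unmatched)  # ['net_b.pth']

# Drop the helper column once the check is done
merged = merged.drop(columns="_merge")
```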



---

## (Optional) Merging with Model Metrics

In [12]:
# Read the file containing the model metrics
metric_df = pd.read_csv('metadata.csv')  # must contain MODEL_NAME and YOUR_METRIC (e.g., Accuracy, F1 Score, etc.)

# Merge based on MODEL_NAME
results_df = pd.merge(results_df, metric_df, on='MODEL_NAME', how='left')

# Display the merged DataFrame
results_df

| | MODEL_NAME | BATCH_SIZE | Mean Time per Sample (s) | FPS | STD Time per Sample (s) | Mean Energy per Sample (J) | STD Energy per Sample (J) | Parameters | FLOPs | GPR | Architecture | Pruning Distribution | YOUR_METRIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 1 | 0.001067 | 937.360405 | 1.630872e-04 | 0.179271 | 0.020604 | 29288367 | 287175794 | 40 | AlexNet | PD3 | 0.517827 |
| 1 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 8 | 0.000128 | 7820.735324 | 7.151073e-06 | 0.034128 | 0.001762 | 29288367 | 2297406352 | 40 | AlexNet | PD3 | 0.517827 |
| 2 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 16 | 0.000076 | 13163.687820 | 5.275919e-06 | 0.022645 | 0.000626 | 29288367 | 4594812704 | 40 | AlexNet | PD3 | 0.517827 |
| 3 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 32 | 0.000063 | 15791.149994 | 1.574052e-07 | 0.019305 | 0.000840 | 29288367 | 9189625408 | 40 | AlexNet | PD3 | 0.517827 |
| 4 | AlexNet_DATASET_random_PD3_GPR-40_PRUNED_SEED_... | 64 | 0.000060 | 16753.248936 | 2.279891e-07 | 0.018104 | 0.000374 | 29288367 | 18379250816 | 40 | AlexNet | PD3 | 0.517827 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 985 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 1 | 0.003198 | 312.691602 | 1.393133e-06 | 1.014720 | 0.002878 | 105475850 | 11196053973 | 15 | VGG16 | PD5 | 0.531742 |
| 986 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 8 | 0.001798 | 556.158070 | 2.682008e-07 | 0.560247 | 0.000482 | 105475850 | 89568431784 | 15 | VGG16 | PD5 | 0.531742 |
| 987 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 16 | 0.001701 | 587.937494 | 8.066138e-07 | 0.542115 | 0.001701 | 105475850 | 179136863568 | 15 | VGG16 | PD5 | 0.531742 |
| 988 | VGG16_DATASET_random_PD5_GPR-15_PRUNED_SEED_23... | 32 | 0.001611 | 620.593181 | 2.604402e-07 | 0.513291 | 0.000686 | 105475850 | 358273727136 | 15 | VGG16 | PD5 | 0.531742 |
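
Because the results hold several rows per model (one per batch size), each metric value is repeated across those rows, as in the output above. If the metrics file accidentally lists a model twice, the merge would duplicate result rows instead. A minimal sketch of guarding against that with the `validate` option of `pd.merge`, using toy frames with made-up model names:

```python
import pandas as pd

# Toy stand-ins: several result rows per model, one metric row per model
results = pd.DataFrame(
    {"MODEL_NAME": ["net_a.pth", "net_a.pth", "net_b.pth"], "BATCH_SIZE": [1, 8, 1]}
)
metrics = pd.DataFrame({"MODEL_NAME": ["net_a.pth", "net_b.pth"], "YOUR_METRIC": [0.51, 0.53]})

# validate="many_to_one" requires MODEL_NAME to be unique in the right-hand frame
merged = pd.merge(results, metrics, on="MODEL_NAME", how="left", validate="many_to_one")

# A metrics table with a repeated model name is rejected instead of duplicating rows
dup_metrics = pd.concat([metrics, metrics.iloc[[0]]], ignore_index=True)
raised = False
try:
    pd.merge(results, dup_metrics, on="MODEL_NAME", how="left", validate="many_to_one")
except pd.errors.MergeError as exc:
    raised = True
    print("Duplicate MODEL_NAME in metrics:", exc)
```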



---

## Save Prepared Results to CSV

After all necessary processing, you can save your prepared DataFrame (e.g., `results_df`) to a new CSV file for later use or for plotting.


In [13]:
# Save the processed results DataFrame to a new CSV file
results_df.to_csv("prepared_results.csv", index=False)
print("Results saved to prepared_results.csv")


Results saved to prepared_results.csv
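
The `index=False` argument matters when the file is read back: without it, pandas writes the row index as an extra unnamed column, which reappears as `Unnamed: 0` on the next `read_csv`. The sketch below demonstrates this with an in-memory buffer and a toy frame rather than the real results file.

```python
import io

import pandas as pd

df = pd.DataFrame({"MODEL_NAME": ["net_a.pth"], "GPR": [40]})

# Default: the index is written as an unnamed first column
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
cols_with = pd.read_csv(with_index).columns.tolist()
print(cols_with)  # ['Unnamed: 0', 'MODEL_NAME', 'GPR']

# index=False: only the data columns are written
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
cols_without = pd.read_csv(without_index).columns.tolist()
print(cols_without)  # ['MODEL_NAME', 'GPR']
```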
