### EDA (Exploratory Data Analysis) of Power Trace Files
The goal of this notebook is to go through all power trace values and compute their average base-10 exponent. That average will be used to scale up all power values so that the CNN's inputs are not extremely small, which should make training easier and make vanishing-gradient problems less likely.
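
As a rough illustration of that scaling step, the sketch below multiplies a few made-up power samples by a power of ten derived from an assumed average exponent. The sample values, the `avg_exponent` value, and the choice to round it to an integer are all assumptions for illustration, not results from this notebook.

```python
# Hypothetical power samples (made-up values, not taken from the trace files).
power = [3.2e-05, 4.7e-05, 1.1e-04, 8.9e-05]

# Hypothetical average exponent, standing in for the value this notebook computes.
avg_exponent = -4.75

# Round to an integer so the scale factor is a clean power of ten.
scale_exponent = round(avg_exponent)

# Multiply by 10^(-scale_exponent) to bring the values close to order 1.
scaled = [p * 10.0 ** (-scale_exponent) for p in power]
print(scale_exponent, scaled)
```

Scaling by a single power of ten keeps the relative differences between samples unchanged; only the overall magnitude moves.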

### Essential imports
Rather than using Pandas, I decided to loop over all files and read each one line by line.

Seaborn will be used for visualization.

In [None]:
# import pandas as pd
import seaborn as sns
import numpy as np
from collections import defaultdict
import os
import re

### Helper Functions

In [None]:
def extract_exponents_from_file(filename):
    '''
    Function that computes the average exponent of all values in a SINGLE power trace file
    Input:
        1) filename: string; path to the power trace file
    Returns:
        1) avg_exponent: float; average exponent of all power trace values
           (NaN if the file contains no parsable values)
    '''
    exponents = []
    with open(filename, 'r') as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) < 2:
                continue  # Skip lines with fewer than 2 columns
            value_str = parts[1]
            normalized = value_str.lower()  # Accept both 'e' and 'E' notation
            if 'e' not in normalized:
                print(f"Skipping non-scientific value '{value_str}' in {filename}")
                continue
            try:
                # e.g. '3.2e-05' -> '-05' -> -5
                exponent = int(normalized.split('e')[1])
                exponents.append(exponent)
            except ValueError:
                print(f"Invalid exponent format in '{value_str}' from {filename}")
    if not exponents:
        # Avoid dividing by zero when no value could be parsed
        return float('nan')
    avg_exponent = float(np.mean(exponents))
    return avg_exponent
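
The helper above reads the exponent straight from the text after the `e`, so values written in plain decimal form are skipped. One possible alternative, sketched below, parses the value as a float first and reads the exponent from its scientific-notation formatting. The function name `decimal_exponent` is hypothetical, and the sketch assumes finite inputs.

```python
def decimal_exponent(value):
    """Return the base-10 exponent of a finite float, or None for zero."""
    if value == 0:
        return None  # Zero has no meaningful exponent
    # Format in scientific notation with 17 significant digits, which in
    # practice keeps rounding from changing the exponent, then read the
    # integer that follows the 'e'.
    return int(f"{abs(value):.16e}".split("e")[1])

print(decimal_exponent(float("3.2e-05")))  # value written in scientific notation
print(decimal_exponent(0.00012))           # value written in plain decimal form
print(decimal_exponent(1500.0))
```

Unlike the string-based approach, this reports the normalized exponent, so a value written as `32e-06` would be counted as exponent -5 rather than -6.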

### Data Analysis

In [None]:
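# Loop through all power trace files, create dictionary of avg exponent values
cwd = os.getcwd()
root_path = os.path.join(cwd, 'trace_files')
exp_results = defaultdict(list)

# exp_results: for each dir as key, saves all avg exponent values as a list
for root, dirs, files in os.walk(root_path):
    # `files` lists the files directly inside `root`, so key on root's own name
    dir_name = os.path.basename(root)
    for file in files:
        if file.endswith(".txt"):
            # Pass the full path; the bare filename only resolves from inside `root`
            file_path = os.path.join(root, file)
            exp_results[dir_name].append(extract_exponents_from_file(file_path))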
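
To make the traversal easier to check in isolation, here is a small self-contained sketch of how `os.walk` pairs each directory with the files directly inside it. The helper name `group_txt_files_by_dir` and the directory and file names are made up for illustration; the real layout of `trace_files` may differ.

```python
import os
import tempfile
from collections import defaultdict

def group_txt_files_by_dir(root_path):
    """Map each directory name under root_path to the .txt paths directly inside it."""
    grouped = defaultdict(list)
    for root, _dirs, files in os.walk(root_path):
        # `files` belongs to `root` itself, not to the entries in `_dirs`
        for name in sorted(files):
            if name.endswith(".txt"):
                grouped[os.path.basename(root)].append(os.path.join(root, name))
    return dict(grouped)

# Build a throwaway directory tree with made-up names and contents.
layout = {"run_a": ["t1.txt", "t2.txt"], "run_b": ["t1.txt", "notes.md"]}
with tempfile.TemporaryDirectory() as tmp:
    for sub, names in layout.items():
        os.makedirs(os.path.join(tmp, sub))
        for name in names:
            with open(os.path.join(tmp, sub, name), "w") as f:
                f.write("0 1.0e-05\n")
    grouped = group_txt_files_by_dir(tmp)
    counts = {k: len(v) for k, v in sorted(grouped.items())}
    print(counts)
```

Keying on `os.path.basename(root)` assumes directory names are unique across the tree; if two subdirectories share a name, their results would be merged under one key.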
                

In [None]:
# key = dir, val = list of its avg exponent values
for k, v in exp_results.items():
    