# **CPU Energy Model Construction Notebook**

This notebook builds a server-specific energy model from `stress` experiment runs.  
The aim is to estimate the **per-core power consumption** under controlled workloads and to use that estimate in a predictive model for energy estimation.

We assume:
- Each experiment run uses a known number of CPU cores.
- Power consumption is measured over time.
- The base (idle) power consumption of the system is known or measured separately.
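
The third assumption can be checked against data when an idle run is available. The following is a minimal sketch, assuming the idle run was recorded in the same CSV layout as the stress runs. The `power_active_W` column name is taken from the loading code in this notebook; the helper `estimate_idle_power` and the inlined samples are hypothetical. It uses only the standard library so that it runs without the measurement files.

```python
import csv
import io
import statistics


def estimate_idle_power(csv_file, column="power_active_W"):
    """Return the mean of `column` (in watts) over all rows of an idle-run CSV."""
    reader = csv.DictReader(csv_file)
    samples = [float(row[column]) for row in reader]
    if not samples:
        raise ValueError("idle CSV contains no samples")
    return statistics.fmean(samples)


# Hypothetical idle samples, inlined so the sketch runs without any files.
idle_csv = io.StringIO("power_active_W\n57.5\n58.0\n58.5\n")
p_base_estimate = estimate_idle_power(idle_csv)
print(f"Estimated idle power: {p_base_estimate:.1f} W")
```

If the estimate differs noticeably from the configured idle power, the per-core figures derived from it shift by the same offset divided by the core count.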

In [None]:
import os
import json
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
import matplotlib.pyplot as plt
import seaborn as sns

## Insert Experiment Configuration

Define the mapping of each CSV run to the number of CPU cores used during that run.  
Also specify the idle power $P_{\text{base}}$ (`P_BASE` in the code), measured separately.
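
When the runs do not use consecutive core counts, the mapping can be built from an explicit list instead. This is a small sketch under that assumption: `build_run_mapping` is a hypothetical helper, the core counts are made up for illustration, and the file naming follows the `measurement_run<i>.csv` pattern used in this notebook.

```python
def build_run_mapping(core_counts, base_name="measurement_run", extension=".csv"):
    """Map run files (numbered from 0) to the core count used in each run."""
    return {
        f"{base_name}{i}{extension}": cores
        for i, cores in enumerate(core_counts)
    }


# Hypothetical experiment: four runs at 1, 2, 4 and 8 active cores.
example_mapping = build_run_mapping([1, 2, 4, 8])
for filename, cores in example_mapping.items():
    print(f"{filename}: {cores} cores")
```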

In [None]:
# Path to folder with CSV files (one per run)
CSV_FOLDER = "/srv/testbed/results/warmuth/default/2024_stress_4_cores/energy/riga"

# Idle power measurement in watts
P_BASE = 58.0

# --- Configurable Parameters ---
base_name = "measurement_run"
file_extension = ".csv"
start = 0
end = 3

# Set what the run is mapped to (e.g., 'cores', 'duration', 'power', etc.)
map_to = "cores"

# Map each run file to its `map_to` value (here: run i used i + 1 cores)
run_mapping = {
    f"{base_name}{i}{file_extension}": i + 1
    for i in range(start, end + 1)
}

# Print with context
print(f"Mapping of measurement run filenames to {map_to}:")
print(json.dumps(run_mapping, indent=4))

## Load Stress Runs and Estimate Power per Core
Each CSV is loaded, its average power is computed, and the power attributable to each active core is estimated as:

$$
P_{\text{core}} = \frac{\bar{P}_{\text{measured}} - P_{\text{base}}}{n}
$$

where $n$ is the number of active cores in the run.
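
As a worked example of the formula above, the sketch below plugs in the idle power configured in this notebook (58.0 W) and a hypothetical measured average of 98.0 W over 4 active cores. The function name `per_core_power` and the measurement values are illustrative assumptions, not results from the experiment.

```python
P_BASE = 58.0  # idle power in watts, as configured in this notebook


def per_core_power(avg_power_w, p_base_w, active_cores):
    """Split the power above idle evenly across the active cores."""
    if active_cores <= 0:
        raise ValueError("active_cores must be positive")
    return (avg_power_w - p_base_w) / active_cores


# Hypothetical run: 98.0 W average with 4 active cores.
example_per_core = per_core_power(98.0, P_BASE, 4)
print(f"{example_per_core:.2f} W per core")  # (98.0 - 58.0) / 4 = 10.00
```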

In [None]:
# === Build model dataframe ===
results = []
for filename, cores in run_mapping.items():
    path = os.path.join(CSV_FOLDER, filename)
    try:
        df = pd.read_csv(path)
        avg_power = df["power_active_W"].mean()
        per_core_power = (avg_power - P_BASE) / cores

        results.append({
            "filename": filename,
            "cores": cores,
            "avg_power": round(avg_power, 2),
            "per_core_power": round(per_core_power, 2)
        })
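    except (OSError, KeyError, pd.errors.EmptyDataError, pd.errors.ParserError) as e:
        # Missing file, missing "power_active_W" column, or malformed CSV
        print(f"Failed to read {filename}: {e}")

model_df = pd.DataFrame(results)
if model_df.empty:
    raise RuntimeError("No runs could be loaded; check CSV_FOLDER and run_mapping.")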

## Modeling Power Consumption with Linear and Polynomial Regression

We aim to model the relationship between the number of active **CPU cores** and the **measured power consumption** of a server.

To do this, we fit two models:
- **Linear Regression**: Assumes each core adds a constant amount of power.
- **Polynomial Regression (Degree 2)**: Captures non-linear relationships (e.g., diminishing returns or thermal effects).

The slope of the linear fit gives a per-core power estimate that can be used in a server power estimation formula:

$$
P_{\text{server}} = P_{\text{base}} + \sum_{i=1}^{n} \lambda_i \cdot P_{\text{core}, i}
$$

Where:
- $P_{\text{base}}$: Idle (baseline) power consumption of the server  
- $n$: Number of **active cores**  
- $\lambda_i$: Load factor of the *i-th* core (0 to 1)  
- $P_{\text{core}, i}$: Power consumed by core *i* at full load

This generalized model allows core-specific power contributions and supports heterogeneous or dynamic core behaviors.
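
To make the formula concrete, here is a minimal sketch that evaluates it for cores running at different loads. The function `predict_server_power`, the 10 W per-core figure, and the load factors are hypothetical values chosen for illustration; only the 58.0 W idle power comes from this notebook's configuration.

```python
def predict_server_power(p_base, load_factors, core_powers):
    """Evaluate P_server = P_base + sum(lambda_i * P_core_i)."""
    if len(load_factors) != len(core_powers):
        raise ValueError("need one load factor per core power")
    if any(not 0.0 <= lam <= 1.0 for lam in load_factors):
        raise ValueError("load factors must lie in [0, 1]")
    return p_base + sum(lam * p for lam, p in zip(load_factors, core_powers))


# Hypothetical 4-core server: each core draws 10 W at full load.
loads = [1.0, 0.5, 0.5, 0.0]
core_powers = [10.0] * 4
total_power = predict_server_power(58.0, loads, core_powers)
print(f"{total_power:.1f} W")  # 58 + 10 + 5 + 5 + 0 = 78.0
```

Passing a separate power value per core is what allows heterogeneous cores; with identical cores the sum reduces to the per-core power times the total load.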

In [None]:
# === Fit Linear & Polynomial Regression ===
X = model_df[["cores"]]
y = model_df["avg_power"]

# Linear fit
reg_linear = LinearRegression().fit(X, y)
model_df["predicted_linear"] = reg_linear.predict(X)
model_df["error_linear"] = (
    abs(model_df["avg_power"] - model_df["predicted_linear"]) / model_df["avg_power"]
) * 100

# Polynomial fit (degree 2)
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)
reg_poly = LinearRegression().fit(X_poly, y)
model_df["predicted_poly"] = reg_poly.predict(X_poly)
model_df["error_poly"] = (
    abs(model_df["avg_power"] - model_df["predicted_poly"]) / model_df["avg_power"]
) * 100

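# === CPU Power Model Info ===
# Note: p_core is the slope of a fit with a free intercept, so p_base
# (the measured idle power) and fitted_intercept can differ; both are stored.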
cpu_model = {
    "p_base": P_BASE,
    "p_core": reg_linear.coef_[0],
    "fitted_intercept": reg_linear.intercept_,
    "timestamp": pd.Timestamp.now().isoformat(),
    "node_name": "riga"
}

# Save model (optional)
# with open("cpu_model.json", "w") as f:
#     json.dump(cpu_model, f, indent=2)

# === Predict total power at various CPU loads ===
def predict_cpu_power_detailed(per_core_loads, per_core_powers, p_base):
    return p_base + sum(l * p for l, p in zip(per_core_loads, per_core_powers))

num_cores = 4
P_CORE = reg_linear.coef_[0]
load_levels = [1.0, 0.5, 0.25]
labels = ["100% Load", "50% Load", "25% Load"]

predicted_powers = []
for load in load_levels:
    loads = [load] * num_cores
    powers = [P_CORE] * num_cores
    predicted_power = predict_cpu_power_detailed(loads, powers, P_BASE)
    predicted_powers.append(predicted_power)

# === Plot: Predicted Power at Different Load Levels ===
plt.figure(figsize=(7, 5))
bars = plt.bar(labels, predicted_powers, color="skyblue", edgecolor="black")
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval + 1, f"{yval:.1f} W", ha='center', va='bottom', fontsize=11)
plt.title(f"Predicted Server Power at Different CPU Loads ({num_cores} cores)")
plt.ylabel("Power (W)")
plt.ylim(0, max(predicted_powers) + 20)
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.tight_layout()
plt.show()

## Visualizing Fitted Models vs. Measured Power

We compare the **measured power consumption** to predictions from:
- A **Linear Regression** model
- A **Polynomial Regression** model (Degree 2)

This helps assess how well each model captures the trend in power usage as more CPU cores are activated.

In [None]:
# === Plot: Actual vs Fitted Power ===
x_vals = np.linspace(model_df["cores"].min(), model_df["cores"].max(), 100)
x_vals_df = pd.DataFrame(x_vals, columns=["cores"])
y_vals_linear = reg_linear.predict(x_vals_df)
y_vals_poly = reg_poly.predict(poly.transform(x_vals_df))

sns.set_theme(style="whitegrid")
plt.figure(figsize=(8, 5))
sns.scatterplot(data=model_df, x="cores", y="avg_power", s=80, label="Measured")
plt.plot(x_vals, y_vals_linear, color="red", label="Linear Fit", linewidth=2)
plt.plot(x_vals, y_vals_poly, color="blue", label="Polynomial Fit (Degree 2)", linewidth=2)
plt.title("Measured Power vs. Fitted Models")
plt.xlabel("CPU Cores Used")
plt.ylabel("Power (W)")
plt.grid(True)
plt.legend()
plt.tight_layout()
plt.show()

## Prediction Accuracy: Linear vs. Polynomial

Below is a comparison table showing the measured average power (`avg_power`), the predicted values from both models, and their corresponding percentage errors.
This helps evaluate how well each model fits the data.
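
The errors in this table are in-sample. Because the degree-2 model contains the linear model as a special case, its in-sample squared error is never larger than that of the linear fit, so the table alone tends to favor it. A leave-one-out check is one way to compare the two more fairly. The sketch below is an assumption-laden illustration, not part of the original analysis: it uses `numpy.polyfit` in place of the scikit-learn models, and the power values are hypothetical.

```python
import numpy as np


def loo_mean_abs_pct_error(cores, power, degree):
    """Leave-one-out mean absolute percentage error of a polynomial fit."""
    cores = np.asarray(cores, dtype=float)
    power = np.asarray(power, dtype=float)
    indices = np.arange(len(cores))
    errors = []
    for i in indices:
        train = indices != i  # hold out run i, fit on the rest
        coeffs = np.polyfit(cores[train], power[train], degree)
        predicted = np.polyval(coeffs, cores[i])
        errors.append(abs(power[i] - predicted) / power[i] * 100)
    return float(np.mean(errors))


# Hypothetical measurements for runs with 1 to 4 active cores.
example_cores = [1, 2, 3, 4]
example_power = [68.0, 77.5, 88.5, 98.0]

loo_linear = loo_mean_abs_pct_error(example_cores, example_power, degree=1)
loo_poly = loo_mean_abs_pct_error(example_cores, example_power, degree=2)
print(f"Leave-one-out error, linear:   {loo_linear:.2f}%")
print(f"Leave-one-out error, degree 2: {loo_poly:.2f}%")
```

With only four runs, each held-out degree-2 fit passes exactly through its three training points, so the held-out error is the only signal of how well that model generalizes.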


In [None]:
# Work on a copy so the string formatting below leaves model_df untouched
display_df = model_df.copy()

# Format error columns with a % sign
display_df["error_linear"] = display_df["error_linear"].map(lambda x: f"{x:.2f}%")
display_df["error_poly"] = display_df["error_poly"].map(lambda x: f"{x:.2f}%")

# Round the numeric power columns and display
for col in ["avg_power", "predicted_linear", "predicted_poly"]:
    display_df[col] = display_df[col].round(2)

display(display_df)