# **CPU Energy Model Construction Notebook**

This notebook builds a server-specific energy model by analyzing `stress` experiment runs.  
The aim is to estimate the **per-core power consumption** based on controlled workloads, and use this to form a predictive model for energy estimation.

We assume:
- Each experiment run uses a known number of CPU cores.
- Power consumption is measured over time.
- The base (idle) power consumption of the system is known or measured separately.
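
If the idle power has to be measured rather than looked up, one simple option is to average the samples of a run recorded with no stress workload. The sketch below assumes a hypothetical list of idle samples in watts; the `estimate_idle_power` helper and the sample values are illustrative and not part of the original experiment.

```python
from statistics import mean


def estimate_idle_power(idle_samples_w):
    """Return the mean of power samples (in watts) taken while the server is idle."""
    if not idle_samples_w:
        raise ValueError("need at least one idle power sample")
    return mean(idle_samples_w)


# Hypothetical idle samples in watts (illustrative values only)
idle_samples = [57.5, 58.0, 58.5, 58.0]
p_base_estimate = estimate_idle_power(idle_samples)
print(f"Estimated P_base: {p_base_estimate:.1f} W")
```

A longer idle window reduces the influence of short background activity on the estimate.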

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

## 🔧 Insert Experiment Configuration

Define the mapping of each CSV run to the number of CPU cores used during that run.  
Also specify the idle power (P_base), measured separately.
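
When the runs follow a regular pattern, the mapping can be generated instead of typed by hand. This sketch assumes that run `i` stressed `i + 1` cores and that the files follow the `measurement_run<i>.csv` naming used in this experiment; `NUM_RUNS` and `generated_mapping` are hypothetical names.

```python
# Assumption: run i stressed i + 1 cores and is stored as measurement_run<i>.csv.
NUM_RUNS = 8  # hypothetical constant: number of CSV files in the results folder

generated_mapping = {
    f"measurement_run{i}.csv": i + 1
    for i in range(NUM_RUNS)
}

for name, cores in generated_mapping.items():
    print(name, "->", cores, "core(s)")
```

An explicit dictionary remains the safer choice when runs were skipped or repeated.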

In [None]:
# Path to folder with CSV files (one per run)
CSV_FOLDER = "/srv/testbed/results/warmuth/default/2024_stress_8_cores/energy/riga"

# Idle power measurement in watts
P_BASE = 58.0

# Mapping of run filename to number of stressed cores
run_core_mapping = {
    "measurement_run0.csv": 1,
    "measurement_run1.csv": 2,
    "measurement_run2.csv": 3,
    "measurement_run3.csv": 4,
    "measurement_run4.csv": 5,
    "measurement_run5.csv": 6,
    "measurement_run6.csv": 7,
    "measurement_run7.csv": 8,
}

## Load Stress Runs and Estimate Power per Core
Each CSV is loaded and its average measured power is computed. The power attributable to each active core is then estimated by subtracting the idle power and dividing by the number of stressed cores:

$$
P_{\text{core}} = \frac{\bar{P}_{\text{measured}} - P_{\text{base}}}{\text{\# active cores}}
$$
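
As a quick sanity check of the formula, the sketch below applies it to made-up numbers. The `per_core_power` helper and the input values are illustrative assumptions, not measurements from the testbed.

```python
def per_core_power(avg_measured_w, p_base_w, active_cores):
    """Power above idle, split evenly across the active cores (in watts)."""
    if active_cores <= 0:
        raise ValueError("active_cores must be positive")
    return (avg_measured_w - p_base_w) / active_cores


# Hypothetical run: 98 W average draw, 58 W idle power, 4 stressed cores
example = per_core_power(avg_measured_w=98.0, p_base_w=58.0, active_cores=4)
print(f"Per-core power: {example:.1f} W")
```

Note that this splits the power evenly, so any power that does not scale with core count (for example uncore or fan power) is folded into the per-core figure.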

In [None]:
results = []

for filename, cores in run_core_mapping.items():
    path = os.path.join(CSV_FOLDER, filename)
    df = pd.read_csv(path)

    avg_power = df["power_active_W"].mean()
    per_core_power = (avg_power - P_BASE) / cores

    results.append({
        "filename": filename,
        "cores": cores,
        "avg_power": round(avg_power, 2),
        "per_core_power": round(per_core_power, 2)
    })

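model_df = pd.DataFrame(results)
# The regression cells below read a "real_power" column. Assumption: it is the
# mean measured active power of each run, i.e. the same quantity as avg_power.
model_df["real_power"] = model_df["avg_power"]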
model_df


## Modeling Power Consumption with Linear and Polynomial Regression

We aim to model the relationship between the number of active **CPU cores** and the **real power consumption** of a server.

To do this, we fit two models:
- **Linear Regression**: Assumes each core adds a constant amount of power.
- **Polynomial Regression (Degree 2)**: Captures non-linear relationships (e.g., diminishing returns or thermal effects).

The fitted per-core power can then be plugged into a server power estimation formula:

$$
P_{\text{server}} = P_{\text{base}} + \sum_{i=1}^{n} \lambda_i \cdot P_{\text{core}, i}
$$

Where:
- $P_{\text{base}}$: Idle (baseline) power consumption of the server  
- $n$: Number of **active cores**  
- $\lambda_i$: Load factor of the *i-th* core (0 to 1)  
- $P_{\text{core}, i}$: Power consumed by core *i* at full load

This generalized model allows core-specific power contributions and supports heterogeneous or dynamic core behaviors.
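
As a worked example with purely illustrative numbers (not measurements from this testbed), assume $P_{\text{base}} = 58$ W and two active cores with $P_{\text{core}, 1} = P_{\text{core}, 2} = 10$ W, loaded at $\lambda_1 = 1.0$ and $\lambda_2 = 0.5$:

$$
P_{\text{server}} = 58 + 1.0 \cdot 10 + 0.5 \cdot 10 = 73 \text{ W}
$$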


In [None]:
import json

# Note: reg_linear is fitted in the regression cell further below; run that cell first.
cpu_model = {
    "p_base": P_BASE,                # Idle power in W
    "p_core": reg_linear.coef_[0],   # Fitted power per core (slope of the linear model)
    "fitted_intercept": reg_linear.intercept_,  # Optional: full linear model P = intercept + slope * cores
    "timestamp": pd.Timestamp.now().isoformat(),
    "node_name": "riga"
}

with open("cpu_model.json", "w") as f:
    json.dump(cpu_model, f, indent=2)

def predict_cpu_power_detailed(per_core_loads, per_core_powers, p_base):
    """
    Predict total power based on individual core loads and power contributions.

    Parameters:
    - per_core_loads: list or array of lambda_i values (per-core load, 0–1)
    - per_core_powers: list or array of P_core_i values (per-core power at full load)
    - p_base: baseline server power (idle power)

    Returns:
    - Total predicted server power
    """
    return p_base + sum(l * p for l, p in zip(per_core_loads, per_core_powers))

# === Parameters ===
# P_BASE is reused from the configuration cell so the plot matches the exported model.
P_CORE = reg_linear.coef_[0]  # Per-core power from linear model
num_cores = 4

# === Load Scenarios ===
load_levels = [1.0, 0.5, 0.25]
labels = ["100% Load", "50% Load", "25% Load"]

# === Calculate power for each scenario ===
predicted_powers = []
for load in load_levels:
    loads = [load] * num_cores
    powers = [P_CORE] * num_cores
    power = predict_cpu_power_detailed(loads, powers, P_BASE)
    predicted_powers.append(power)

# === Plot ===
plt.figure(figsize=(7, 5))
bars = plt.bar(labels, predicted_powers, color="skyblue", edgecolor="black")

# Add value labels on top
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval + 1, f"{yval:.1f} W", ha='center', va='bottom', fontsize=11)

plt.title("Predicted Server Power at Different CPU Loads (4 cores)")
plt.ylabel("Power (W)")
plt.ylim(0, max(predicted_powers) + 20)
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.tight_layout()
plt.show()

## Fitting and Visualizing the Regression Models

In this section, we model the relationship between the number of CPU cores and the **real power consumption** using:
- A **Linear Regression** model
- A **Polynomial Regression** model (Degree 2)

We then visualize the measured values along with both fitted curves.
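
For a single input feature, the linear model is an ordinary least-squares line, which has a simple closed form. The sketch below computes it by hand to show what the slope (per-core power) and intercept represent. The `fit_line` helper and the data points are hypothetical and chosen to be exactly linear.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x for one feature."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements: 58 W idle plus 10 W per stressed core
cores = [1, 2, 3, 4]
power_w = [68.0, 78.0, 88.0, 98.0]
slope, intercept = fit_line(cores, power_w)
print(f"slope = {slope:.1f} W/core, intercept = {intercept:.1f} W")
```

With real measurements the fitted intercept generally differs from the separately measured idle power, which is why the two can be worth comparing.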


In [None]:
# === Fit models on real_power vs cores ===
X = model_df[["cores"]]
y = model_df["real_power"]

# --- Linear model ---
reg_linear = LinearRegression().fit(X, y)
model_df["predicted_linear"] = reg_linear.predict(X)
model_df["error_linear"] = (
    abs(model_df["real_power"] - model_df["predicted_linear"]) / model_df["real_power"]
) * 100

# --- Polynomial model (degree 2) ---
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)
reg_poly = LinearRegression().fit(X_poly, y)
model_df["predicted_poly"] = reg_poly.predict(X_poly)
model_df["error_poly"] = (
    abs(model_df["real_power"] - model_df["predicted_poly"]) / model_df["real_power"]
) * 100

# === Plot real_power with linear & polynomial regression fits ===
x_vals = np.linspace(model_df["cores"].min(), model_df["cores"].max(), 100)
x_vals_df = pd.DataFrame(x_vals, columns=["cores"])

y_vals_linear = reg_linear.predict(x_vals_df)
y_vals_poly = reg_poly.predict(poly.transform(x_vals_df))

sns.set_theme(style="whitegrid")
plt.figure(figsize=(8, 5))

# Scatter plot of actual measurements
sns.scatterplot(data=model_df, x="cores", y="real_power", s=80, label="Measured")

# Plot linear and polynomial fits
plt.plot(x_vals, y_vals_linear, color="red", label="Linear Fit", linewidth=2)
plt.plot(x_vals, y_vals_poly, color="blue", label="Polynomial Fit (Degree 2)", linewidth=2)

# Titles and labels
plt.title("Real Power Consumption vs. Number of CPU Cores")
plt.xlabel("CPU Cores Used")
plt.ylabel("Real Power (W)")
plt.grid(True)
plt.legend()
plt.tight_layout()
plt.show()


## Prediction Accuracy: Linear vs. Polynomial

Below is a comparison table showing the actual power consumption values (`real_power`), the predicted values from both models, and their corresponding percentage errors.
This helps evaluate how well each model fits the data.
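
The per-row percentage errors can also be condensed into a single number per model, for example the mean absolute percentage error. The sketch below uses hypothetical values; `abs_pct_error` and `mean_abs_pct_error` are illustrative helpers, not part of the original notebook.

```python
def abs_pct_error(actual, predicted):
    """Absolute error as a percentage of the actual value."""
    return abs(actual - predicted) / actual * 100


def mean_abs_pct_error(actuals, predictions):
    """Average of the per-point absolute percentage errors."""
    errors = [abs_pct_error(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


# Hypothetical measured and predicted power values in watts
actual_w = [100.0, 200.0]
predicted_w = [97.0, 204.0]
mape = mean_abs_pct_error(actual_w, predicted_w)
print(f"Mean absolute percentage error: {mape:.2f} %")
```

Because both models are evaluated on the same points they were fitted to, a lower error for the polynomial model does not by itself show that it generalizes better.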


In [None]:
columns_to_show = [
    "cores", "real_power",
    "predicted_linear", "error_linear",
    "predicted_poly", "error_poly"
]

display(model_df[columns_to_show].round(2))
