# **CPU Energy Model Construction Notebook**

This notebook builds a server-specific energy model by analyzing `stress` experiment runs.  
The aim is to estimate the **per-core power consumption** based on controlled workloads, and use this to form a predictive model for energy estimation.

We assume:
- Each experiment run uses a known number of CPU cores.
- Power consumption is measured over time.
- The base (idle) power consumption of the system is known or measured separately.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os
from sklearn.linear_model import LinearRegression

## 🔧 Insert Experiment Configuration

Define the mapping of each CSV run to the number of CPU cores used during that run.  
Also specify the idle power (P_base), measured separately.

In [None]:
# Path to folder with CSV files (one per run)
CSV_FOLDER = "/srv/testbed/results/warmuth/default/2025-03-18_13-03-16_395230/energy/riga"

# Idle power measurement in watts
P_BASE = 63.0  # Adjust as needed

# Mapping of run filename to number of stressed cores
run_core_mapping = {
    "measurement_run0.csv": 1,
    "measurement_run1.csv": 2,
    "measurement_run2.csv": 3,
    "measurement_run3.csv": 4,
}
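If the idle power is not known in advance, it could be estimated from a separate idle run. The sketch below assumes such a run is logged in the same CSV format as the stress runs, with a `power_active_W` column. The helper name `mean_power_from_csv` and the inline sample rows are made up for illustration and are not testbed data.

```python
import csv
import io
import statistics


def mean_power_from_csv(handle, column="power_active_W"):
    """Return the mean of one numeric column from an open CSV file handle."""
    reader = csv.DictReader(handle)
    samples = [float(row[column]) for row in reader]
    if not samples:
        raise ValueError("CSV contains no samples")
    return statistics.fmean(samples)


# Hypothetical idle-run samples; replace with open("<idle run>.csv") in practice.
idle_csv = io.StringIO("power_active_W\n62.5\n63.0\n63.5\n")
p_base_estimate = mean_power_from_csv(idle_csv)
print(f"Estimated idle power: {p_base_estimate:.2f} W")
```

The result could then be assigned to `P_BASE` instead of a hard-coded value.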

## Load Stress Runs and Estimate Power per Core
Each CSV is loaded and its average power is computed. The power attributable to each active core is then estimated as:

$$
P_{\text{core}} = \frac{\bar{P}_{\text{measured}} - P_{\text{base}}}{C}
$$

Here $\bar{P}_{\text{measured}}$ is the mean measured power of the run, $P_{\text{base}}$ is the idle power, and $C$ is the number of active (stressed) cores.
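As a quick sanity check of the formula, the following sketch applies it to made-up numbers. The function name `per_core_power` and the sample values are illustrative assumptions, not measurements from the testbed.

```python
def per_core_power(avg_power_w: float, p_base_w: float, active_cores: int) -> float:
    """Power attributable to each active core, in watts."""
    if active_cores <= 0:
        raise ValueError("active_cores must be positive")
    return (avg_power_w - p_base_w) / active_cores


# Hypothetical values: 75 W average with 3 stressed cores
# on top of a 63 W idle baseline.
example = per_core_power(avg_power_w=75.0, p_base_w=63.0, active_cores=3)
print(f"{example:.2f} W per core")
```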

In [None]:
results = []

for filename, cores in run_core_mapping.items():
    path = os.path.join(CSV_FOLDER, filename)
    df = pd.read_csv(path)

    avg_power = df["power_active_W"].mean()
    per_core_power = (avg_power - P_BASE) / cores

    results.append({
        "filename": filename,
        "cores": cores,
        "avg_power": round(avg_power, 2),
        "per_core_power": round(per_core_power, 2)
    })

model_df = pd.DataFrame(results)
model_df



## Visualize Power vs Number of Cores
The plot below shows how power increases as more cores are stressed.

In [None]:
# Fit linear regression
X = model_df[["cores"]]
y = model_df["avg_power"]
reg = LinearRegression().fit(X, y)

print(f"Fitted: base={reg.intercept_:.2f}, slope/core={reg.coef_[0]:.2f}")

# Generate regression line points
x_vals = np.linspace(model_df["cores"].min(), model_df["cores"].max(), 100)
# Wrap in a DataFrame so the feature name matches the one used in fit()
y_vals = reg.predict(pd.DataFrame({"cores": x_vals}))

# Plot measured values and regression line
sns.set_theme(style="whitegrid")
plt.figure(figsize=(8, 5))

sns.scatterplot(data=model_df, x="cores", y="avg_power", s=80, label="Measured")
plt.plot(x_vals, y_vals, color="red", label="Linear Fit", linewidth=2)

plt.title("Average Power Consumption vs. Number of CPU Cores")
plt.xlabel("CPU Cores Used")
plt.ylabel("Average Power (W)")
plt.grid(True)
plt.legend()
plt.show()
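To judge how well a straight line describes the measurements, the coefficient of determination $R^2$ can be reported alongside the plot. The sketch below computes it with NumPy on made-up points; the helper name `linear_fit_r_squared` is hypothetical. In the notebook itself, `reg.score(X, y)` returns the same quantity for the fitted `LinearRegression`.

```python
import numpy as np


def linear_fit_r_squared(x, y):
    """Fit y = intercept + slope * x and return (slope, intercept, R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    predicted = intercept + slope * x
    ss_res = np.sum((y - predicted) ** 2)   # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)


# Made-up, perfectly linear sample: 63 W base plus 4 W per core.
slope, intercept, r2 = linear_fit_r_squared([1, 2, 3, 4], [67.0, 71.0, 75.0, 79.0])
print(f"slope={slope:.2f} W/core, intercept={intercept:.2f} W, R^2={r2:.4f}")
```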



## Model Output

You can now use the estimated per-core power (P_core) in the full server model:

$$
P_{\text{server}} = P_{\text{base}} + \lambda \cdot C \cdot P_{\text{core}}
$$

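Where:
- $P_{\text{base}}$: Idle power of the server (W)
- $P_{\text{core}}$: Power per fully loaded core (W)
- $\lambda$: CPU load factor (0 to 1)
- $C$: Number of active cores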

In future steps, NIC power contributions can be integrated similarly.

In [None]:
cpu_model = {
    "p_base": P_BASE,                            # Measured idle power in W (not the fitted intercept)
    "p_core": float(reg.coef_[0]),               # Fitted slope: power per core in W
    "fitted_intercept": float(reg.intercept_),   # Optional: full linear model P = intercept + slope * cores
    "timestamp": pd.Timestamp.now().isoformat(),
    "node_name": "riga"
}

import json
with open("cpu_model.json", "w") as f:
    json.dump(cpu_model, f, indent=2)
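A later notebook or script could read the saved model back before predicting. The following is a minimal sketch of such a loader; the function name `load_cpu_model` and the key check are assumptions, and the demo writes placeholder values to a temporary directory rather than touching the real `cpu_model.json`.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ("p_base", "p_core")


def load_cpu_model(path):
    """Load a model JSON file and verify the keys needed for prediction."""
    with open(path) as f:
        model = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in model]
    if missing:
        raise KeyError(f"model file is missing keys: {missing}")
    return model


# Round-trip demo with placeholder values (not fitted results).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "cpu_model.json")
    with open(demo_path, "w") as f:
        json.dump({"p_base": 63.0, "p_core": 4.0, "node_name": "riga"}, f)
    loaded = load_cpu_model(demo_path)

print(loaded)
```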

In [None]:
def predict_cpu_power(c, load, model):
    """
    Predict power consumption based on core count and load.
    """
    return model["p_base"] + c * load * model["p_core"]

# Example usage: 4 cores at 75% load
power_pred = predict_cpu_power(4, 0.75, cpu_model)
print(f"Predicted power (4 cores at 75% load): {power_pred:.2f} W")
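Since the goal is energy estimation, a predicted power value still has to be combined with a duration. The sketch below assumes the power draw is constant over the interval; the helper name `estimate_energy` and the example numbers are hypothetical.

```python
def estimate_energy(power_w: float, duration_s: float) -> dict:
    """Energy for a constant power draw, in joules and watt-hours."""
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    joules = power_w * duration_s
    return {"joules": joules, "watt_hours": joules / 3600.0}


# Hypothetical example: a constant 79 W draw for 30 minutes.
energy = estimate_energy(power_w=79.0, duration_s=30 * 60)
print(f"{energy['joules']:.0f} J = {energy['watt_hours']:.2f} Wh")
```

For a time-varying load, the same calculation could be applied per sampling interval and summed.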

In [None]:
# Validation needs avg_load and real_power columns; derive them here if absent
model_df["avg_load"] = 1.0  # assumes stress tests always run at full load
model_df["real_power"] = model_df["avg_power"]  # reuse the measured averages from above

# Predict using model
model_df["predicted_power"] = model_df.apply(
    lambda row: predict_cpu_power(row["cores"], row["avg_load"], cpu_model), axis=1
)

# Compute error in %
model_df["error_percent"] = (
    abs(model_df["real_power"] - model_df["predicted_power"]) / model_df["real_power"]
) * 100

model_df[["cores", "avg_load", "real_power", "predicted_power", "error_percent"]]
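The per-run errors can be condensed into a single figure such as the mean absolute percentage error (MAPE). The sketch below uses plain Python lists with made-up values; the helper name `mean_abs_percent_error` is hypothetical. In the notebook, `model_df["error_percent"].mean()` would give the equivalent number. Note that the runs being checked here are the same ones used to fit the model, so this measures fit quality rather than predictive accuracy.

```python
def mean_abs_percent_error(real, predicted):
    """Mean of |real - predicted| / |real|, expressed in percent."""
    if len(real) != len(predicted) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(r - p) / abs(r) for r, p in zip(real, predicted))
    return 100.0 * total / len(real)


# Made-up values: errors of 10% and 5%.
mape = mean_abs_percent_error([100.0, 200.0], [90.0, 210.0])
print(f"MAPE: {mape:.2f} %")
```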


In [None]:
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# === Linear model ===
X_linear = model_df[["cores"]]
y = model_df["real_power"]

reg_linear = LinearRegression().fit(X_linear, y)
model_df["predicted_linear"] = reg_linear.predict(X_linear)
model_df["error_linear"] = (
    abs(model_df["real_power"] - model_df["predicted_linear"]) / model_df["real_power"]
) * 100

# === Polynomial model (degree 2) ===
# include_bias=False: LinearRegression already fits the intercept
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_linear)
reg_poly = LinearRegression().fit(X_poly, y)

model_df["predicted_poly"] = reg_poly.predict(X_poly)
model_df["error_poly"] = (
    abs(model_df["real_power"] - model_df["predicted_poly"]) / model_df["real_power"]
) * 100

# === Final comparison table ===
columns_to_show = [
    "cores", "real_power",
    "predicted_linear", "error_linear",
    "predicted_poly", "error_poly"
]

display(model_df[columns_to_show].round(2))
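With only four runs, a degree-2 model has three coefficients, so its in-sample error will be small almost by construction. One way to compare the two models more fairly is leave-one-out validation: fit on all runs but one and measure the error on the held-out run. The sketch below does this with `numpy.polyfit` instead of scikit-learn; the helper name `loo_mean_abs_error` and the data points are made up for illustration.

```python
import numpy as np


def loo_mean_abs_error(x, y, degree):
    """Leave-one-out mean absolute error (W) of a polynomial fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i            # hold out sample i
        coeffs = np.polyfit(x[keep], y[keep], deg=degree)
        errors.append(abs(np.polyval(coeffs, x[i]) - y[i]))
    return float(np.mean(errors))


# Made-up measurements, roughly linear with a little noise.
cores = [1, 2, 3, 4]
power = [67.2, 70.9, 75.1, 78.8]
for degree in (1, 2):
    print(f"degree {degree}: LOO MAE = {loo_mean_abs_error(cores, power, degree):.2f} W")
```

If the quadratic model does not clearly beat the linear one on held-out runs, the simpler linear model is probably the safer choice for so few data points.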
