# **PURPOSE OF THE NOTEBOOK**

 This notebook is designed to develop and train a predictive model aimed at optimizing power consumption in a laptop while minimizing the impact on system performance. The primary goal is to create a model that can accurately estimate power savings and suggest optimizations that balance power efficiency with system performance.

## **Step 0. What are we going to predict?**

 The objective of the model is to predict `estimated_power_savings`, the target variable, which represents the potential power savings achievable through various optimizations. Beyond the target itself, it is crucial to identify the features that are critical to maintaining system performance. These features include:

 - **CPU-related Features:**
   - `cpu_power_state`: Represents the power states of the CPU, including active and idle times. This is important as it directly impacts power consumption and performance.
   - `frequency_stats`: Indicates the CPU frequency statistics and how often each frequency is used. Higher frequencies generally mean better performance but increased power consumption.
   - `average_cpu_frequency`: The average CPU frequency over the monitoring period. This provides insight into the overall CPU performance.
   - `cpu_temperature`: Higher temperatures can lead to thermal throttling, reducing performance to prevent overheating.
   - `cpu_utilization_avg`: The average CPU utilization percentage, reflecting how much the CPU is being used.

 - **GPU-related Features:**
   - `gpu_power_usage`: Power usage of the GPU, a significant component of overall power consumption.
   - `gpu_core_clock`: GPU core clock frequency, indicating the speed of the GPU.
   - `gpu_memory_clock`: GPU memory clock frequency, affecting memory performance.
   - `gpu_temperature`: Similar to CPU temperature, it impacts thermal management and potential throttling.
   - `gpu_utilization`: GPU utilization percentage, reflecting how much the GPU is being used.
   - `gpu_memory_utilization`: Indicates how much of the GPU memory is being utilized.

 - **Display-related Features:**
   - `display_backlight_power`: Power consumption by the display backlight. The display is a significant power consumer, and managing its power usage can yield considerable savings.

These features are critical because they collectively represent the primary components affecting both power consumption and performance. By carefully analyzing these features, we can ensure that our model suggests optimizations that do not significantly degrade system performance.

## **Step 0.5. How are we going to evaluate our selected model?**

### Performance Metrics

To evaluate the performance of our predictive models, we will use the following metrics:

- **Root Mean Squared Error (RMSE):**
  - **Definition:** RMSE is the square root of the average of squared differences between the predicted and actual values. It provides a measure of the average magnitude of the error.

  - **Why RMSE?**
    - **Interpretability:** RMSE is easy to interpret as it is in the same units as the target variable, `estimated_power_savings`.
    - **Error Sensitivity:** RMSE is sensitive to large errors due to the squaring of differences, making it effective for identifying models that have larger deviations from actual values.

- **Mean Absolute Error (MAE):**
  - **Definition:** MAE is the average of the absolute differences between the predicted and actual values. It provides a straightforward measure of prediction accuracy.

  - **Why MAE?**
    - **Robustness to Outliers:** MAE is less sensitive to outliers compared to RMSE, as it does not square the error terms.
    - **Simplicity:** MAE is simple to calculate and interpret, providing a clear view of the average prediction error.
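
As a minimal sketch of the two definitions above, the following computes RMSE and MAE with NumPy on made-up values (the numbers are illustrative, not taken from the dataset). The two prediction sets are chosen so that they share the same MAE while one contains a single large error, which only RMSE penalizes more heavily.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared difference."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute differences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Illustrative values only (not from the notebook's dataset)
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred_even = [1.5, 2.5, 3.5, 4.5]     # four errors of 0.5 each
y_pred_outlier = [1.0, 2.0, 3.0, 6.0]  # a single error of 2.0

print("even errors   -> RMSE:", rmse(y_true, y_pred_even), "MAE:", mae(y_true, y_pred_even))
print("one big error -> RMSE:", rmse(y_true, y_pred_outlier), "MAE:", mae(y_true, y_pred_outlier))
```

Both prediction sets have an MAE of 0.5, but the RMSE doubles from 0.5 to 1.0 when the error is concentrated in one sample.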

### Justification for Using RMSE and MAE

- **Comprehensive Evaluation:** Using both RMSE and MAE allows for a comprehensive evaluation of the models' performance. RMSE will help identify models with larger prediction errors, while MAE will provide an overall measure of prediction accuracy.
- **Balance of Sensitivity and Robustness:** The combination of RMSE and MAE balances sensitivity to large errors (RMSE) and robustness to outliers (MAE). This dual approach ensures that our model selection considers both aspects of prediction accuracy.
- **Applicability to Regression Tasks:** Both metrics are standard evaluation measures for regression tasks, making them appropriate for our objective of predicting `estimated_power_savings`.

By employing these performance metrics, we aim to select the model that best balances accuracy and robustness, ultimately achieving our goal of optimizing power consumption while maintaining system performance.

> Note that although we evaluate the models on both metrics, only *RMSE* is used
> for **hyperparameter tuning**. *MAE* serves as a secondary metric.
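
One way to tune on RMSE while still recording MAE is scikit-learn's multi-metric scoring, sketched below. This is an illustration under assumptions, not the notebook's actual tuning code: the data is synthetic (`make_regression`), the grid is deliberately tiny, and the metric keys `"rmse"` and `"mae"` are names chosen for this example.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real dataset (assumption for illustration)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

# Track both metrics; scikit-learn maximizes scores, hence the "neg_" variants
scoring = {
    "rmse": "neg_root_mean_squared_error",
    "mae": "neg_mean_absolute_error",
}

search = GridSearchCV(
    DecisionTreeRegressor(random_state=42),
    param_grid={"max_depth": [2, 4, 8]},
    scoring=scoring,
    refit="rmse",  # select and refit the best model by RMSE only
    cv=5,
)
search.fit(X, y)

best_rmse = -search.best_score_
best_mae = -search.cv_results_["mean_test_mae"][search.best_index_]
print(search.best_params_, f"RMSE={best_rmse:.3f}", f"MAE={best_mae:.3f}")
```

With `refit="rmse"`, model selection depends only on RMSE, while the MAE of the selected candidate remains available in `cv_results_` for reporting.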

## **The Model Solution. What kind of predictive task do we have? How are we going to optimize for multiple features?**

 The predictive task at hand is a regression task, where we aim to predict a continuous target variable, `estimated_power_savings`. However, our goal is multi-objective: we want to optimize power consumption while also considering system performance.

 To address this, we will explore several modeling options:

### **Linear Regression**
 - **Advantages:**
   - Simple to implement and interpret.
   - Fast training time.
   - Works well when the relationship between the features and the target is approximately linear.
 - **Disadvantages:**
   - Limited to linear relationships.
   - Can underperform with complex datasets.

### **Decision Trees**
 - **Advantages:**
   - Easy to interpret and visualize.
   - Can handle non-linear relationships.
   - Requires little data preprocessing (and our data is already preprocessed).
 - **Disadvantages:**
   - Prone to overfitting.
   - Can be unstable with small variations in data.

### **Random Forest**

 - **Advantages:**

   - Reduces overfitting by averaging multiple decision trees.
   - Can handle large datasets and high-dimensional data.
   - Provides feature importance scores.

 - **Disadvantages:**
   - Longer training time.
   - Less interpretable compared to individual decision trees.

### **Gradient Boosting Machines (GBM)**
 - **Advantages:**
   - Often achieves high predictive performance.
   - Can handle various types of data (continuous, categorical).
   - Provides feature importance scores.
 - **Disadvantages:**
   - Computationally intensive.
   - Requires careful tuning of hyperparameters.

### **XGBoost**

 XGBoost (Extreme Gradient Boosting) is a powerful and popular implementation of gradient boosting algorithms. It is particularly known for its performance and efficiency.

 - **Advantages:**
   - High predictive accuracy.
   - Handles missing data and provides regularization to prevent overfitting.
   - Efficient memory usage and parallel processing capabilities.
   - Provides feature importance scores, which can help in feature selection.
 - **Disadvantages:**
   - Requires careful hyperparameter tuning.
   - Can be complex to interpret compared to simpler models.

<br>

## Achieving Multi-Objective Optimization

To achieve the goal of optimizing for multiple objectives—power savings and maintaining system performance—we can explore several strategies:

### Weighted Loss Function

  - **Approach:**

    - Combine multiple objectives into a single loss function by assigning weights to each objective.
    - For example, the loss function could be a weighted sum of power savings error and performance degradation.
  - **Advantages:**
    - Simple to implement.
    - Provides flexibility in adjusting the trade-off between objectives.
  - **Disadvantages:**
    - Requires careful tuning of weights.
    - May not capture complex relationships between objectives.
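
A weighted sum of this kind could be sketched as follows. Everything here is an assumption made for illustration: the function name `weighted_loss`, the default weights, and the idea of expressing performance degradation as a non-negative `perf_drop` value per sample are not defined anywhere in this notebook.

```python
import numpy as np

def weighted_loss(savings_true, savings_pred, perf_drop, w_power=0.7, w_perf=0.3):
    """Weighted sum of power-savings error and a performance-degradation penalty.

    savings_true / savings_pred: actual and predicted power savings.
    perf_drop: hypothetical per-sample performance loss; negative values
               (performance gains) are clipped to zero so they are not rewarded.
    """
    savings_true = np.asarray(savings_true, dtype=float)
    savings_pred = np.asarray(savings_pred, dtype=float)
    perf_drop = np.asarray(perf_drop, dtype=float)

    power_error = np.mean((savings_true - savings_pred) ** 2)  # MSE on the target
    perf_penalty = np.mean(np.clip(perf_drop, 0.0, None))      # mean degradation
    return float(w_power * power_error + w_perf * perf_penalty)

# Illustrative values only
loss = weighted_loss(
    savings_true=[1.0, 2.0],
    savings_pred=[1.0, 3.0],
    perf_drop=[0.2, -0.1],
)
print(f"weighted loss = {loss:.2f}")
```

Raising `w_perf` relative to `w_power` shifts the trade-off toward protecting performance. Choosing the weights is the tuning burden mentioned above.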

### Multi-Output Regression

  - **Approach:**
    - Train a model to predict multiple targets simultaneously, such as `estimated_power_savings` and performance metrics (e.g., CPU utilization, GPU utilization).
    - Use algorithms that support multi-output regression, like Random Forest Regressor or XGBoost.
  - **Advantages:**
    - Can capture relationships between multiple targets.
    - Efficiently handles multiple objectives in a single model.
  - **Disadvantages:**
    - Increased model complexity.
    - Requires more computational resources.
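
The multi-output idea can be sketched with scikit-learn's `RandomForestRegressor`, which accepts a two-dimensional target natively. The data below is synthetic and the two target columns merely stand in for `estimated_power_savings` and a performance metric such as CPU utilization. The coefficients are arbitrary assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))

# Two synthetic targets that share one input feature (illustrative only)
y_savings = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=300)
y_cpu_util = X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=300)
Y = np.column_stack([y_savings, y_cpu_util])  # shape (300, 2)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=42
)

# A single model fitted on both targets at once
model = RandomForestRegressor(n_estimators=50, random_state=42)
model.fit(X_train, Y_train)

Y_pred = model.predict(X_test)  # one column per target
print(Y_pred.shape)
```

Each row of `Y_pred` holds a predicted power saving alongside the predicted performance metric, so the two can be inspected together when judging an optimization.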

### Pareto Optimization

  - **Approach:**
    - Use Pareto optimization to identify solutions that offer the best trade-offs between multiple objectives.
    - Generate a Pareto front, representing the set of non-dominated solutions where no objective can be improved without degrading another.
  - **Advantages:**
    - Provides a clear view of trade-offs between objectives.
    - Helps in selecting optimal solutions based on preferences.
  - **Disadvantages:**
    - Computationally intensive.
    - Requires more complex optimization algorithms.
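
To make the notion of non-dominated solutions concrete, here is a small brute-force sketch that extracts a Pareto front from a list of candidates. It assumes both objectives are to be maximized, and the candidate values (power savings, a performance score) are invented for the example.

```python
import numpy as np

def pareto_front(points):
    """Return the indices of non-dominated rows, assuming every column is maximized."""
    points = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(points):
        # p is dominated if some point is >= p in every objective
        # and strictly > p in at least one objective
        at_least_as_good = np.all(points >= p, axis=1)
        strictly_better = np.any(points > p, axis=1)
        if not np.any(at_least_as_good & strictly_better):
            keep.append(i)
    return keep

# Columns: (power savings, performance score); illustrative values only
candidates = [
    [5.0, 0.60],
    [3.0, 0.90],
    [4.0, 0.50],  # dominated by [5.0, 0.60]
    [1.0, 0.95],
    [2.0, 0.80],  # dominated by [3.0, 0.90]
]

front = pareto_front(candidates)
print(front)
```

The pairwise comparison is quadratic in the number of candidates, which is fine for a handful of configurations but is one reason Pareto methods become expensive at scale.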

### Reinforcement Learning (RL)
  - **Approach:**
    - Implement an RL agent that learns to optimize power consumption and performance through interactions with the environment.
    - Define a reward function that balances power savings and performance.
  - **Advantages:**
    - Adaptive and can handle dynamic environments.
    - Learns optimal strategies over time.
  - **Disadvantages:**
    - Complex to implement.
    - Requires significant computational resources and time for training.

In this notebook, we will start with the weighted loss function and multi-output regression approaches. These methods offer a good balance of simplicity and effectiveness, making them suitable for an initial implementation. Depending on the results, we may explore more advanced techniques such as Pareto optimization and reinforcement learning in future iterations.

In summary, this notebook will guide us through the process of developing a predictive model to optimize power consumption while maintaining system performance. We will explore different modeling approaches, evaluate their advantages and disadvantages, and ultimately focus on implementing XGBoost due to its powerful capabilities in handling complex datasets and achieving high predictive accuracy.

 --- 
 
 ## Step-by-Step Process

 ### 1. Model Training

 In this step, we will train various regression models to predict `estimated_power_savings`. The process includes:

 - **Splitting the Data:** Divide the dataset into training and testing sets to evaluate the model's performance. A common split is 80% training and 20% testing; the training code in this notebook holds out 30% for testing (`test_size=0.3`).
 - **Training Models:** Train multiple regression models, including Linear Regression, Decision Trees, Random Forest, Gradient Boosting Machines (GBM), and XGBoost. Each model will be trained using the training set.
 - **Evaluating Models:** Assess the performance of each model using metrics such as Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). This will help us understand how well the models predict `estimated_power_savings`.
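
The split, train, and evaluate loop described above can be sketched in a few lines. This is a reduced illustration on synthetic data with only two of the listed models; the 80/20 split and the tree depth are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real dataset
X, y = make_regression(n_samples=500, n_features=8, noise=5.0, random_state=42)

# 1. Split: 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 2. Train each candidate model on the training set
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(max_depth=4, random_state=42),
}

# 3. Evaluate each model on the held-out test set
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    scores[name] = {
        "RMSE": float(np.sqrt(mean_squared_error(y_test, y_pred))),
        "MAE": float(mean_absolute_error(y_test, y_pred)),
    }

for name, metrics in scores.items():
    print(f"{name}: RMSE={metrics['RMSE']:.3f}, MAE={metrics['MAE']:.3f}")
```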

### 2. Hyperparameter Tuning

 To enhance the performance of our selected models, we will perform hyperparameter tuning:

 - **Grid Search:** Systematically search through a predefined set of hyperparameters to find the combination that yields the best performance. This method involves training the model with different hyperparameter combinations and evaluating their performance.
 - **Random Search:** Randomly sample hyperparameters from a defined range and evaluate model performance. This approach can be more efficient than grid search, especially when dealing with a large number of hyperparameters.
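
For comparison with an exhaustive grid, a random search could look roughly like the sketch below, using scikit-learn's `RandomizedSearchCV`. The data is synthetic and the sampling ranges are assumptions loosely modeled on typical decision-tree settings, not values prescribed by this notebook.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real dataset
X, y = make_regression(n_samples=300, n_features=6, noise=5.0, random_state=42)

# Distributions to sample from (randint's upper bound is exclusive)
param_distributions = {
    "max_depth": randint(2, 11),          # integers 2..10
    "min_samples_split": randint(2, 11),  # integers 2..10
    "min_samples_leaf": randint(1, 5),    # integers 1..4
}

search = RandomizedSearchCV(
    DecisionTreeRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=10,  # evaluate only 10 sampled combinations
    scoring="neg_root_mean_squared_error",
    cv=5,
    random_state=42,
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best CV RMSE: {-search.best_score_:.3f}")
```

Here the budget is fixed by `n_iter` regardless of how many hyperparameters are searched, whereas a grid over the same three ranges would require 9 x 9 x 4 = 324 candidates.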

 ### 3. Multi-Objective Optimization

 Given our dual objectives—optimizing power savings while maintaining system performance—we will implement strategies to handle multiple objectives:

 - **Weighted Loss Function:** Combine power savings and performance into a single loss function by assigning weights to each objective. This allows us to balance the trade-off between power savings and performance degradation. The loss function can be fine-tuned to prioritize one objective over the other based on the desired outcome.
 - **Multi-Output Regression:** Train models that can predict multiple targets simultaneously, such as `estimated_power_savings` and performance metrics (e.g., CPU utilization, GPU utilization). This approach captures the relationship between power savings and performance, enabling more informed optimizations.

 ### 4. Model Evaluation and Validation

 After training the models, we will validate their performance to ensure robustness and generalizability:

 - **Cross-Validation:** Use cross-validation techniques (e.g., k-fold cross-validation) to evaluate model performance on different subsets of the data. This helps in assessing the model's ability to generalize to unseen data.
 - **Performance Comparison:** Compare the performance of different models based on validation metrics. This involves analyzing the trade-offs between power savings and performance to select the best model.
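
As a minimal sketch of k-fold cross-validation, the snippet below scores a model on five shuffled folds with scikit-learn. The data is synthetic and the choice of `LinearRegression` is only for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the real dataset
X, y = make_regression(n_samples=300, n_features=6, noise=5.0, random_state=42)

# Five folds, shuffled once with a fixed seed for reproducibility
cv = KFold(n_splits=5, shuffle=True, random_state=42)

# scikit-learn returns negated errors so that higher is always better
neg_rmse = cross_val_score(
    LinearRegression(), X, y, scoring="neg_root_mean_squared_error", cv=cv
)
rmse_per_fold = -neg_rmse

print("RMSE per fold:", np.round(rmse_per_fold, 3))
print(f"Mean RMSE: {rmse_per_fold.mean():.3f} (std {rmse_per_fold.std():.3f})")
```

A large spread across folds suggests the model's performance depends heavily on which samples it sees, which is a warning sign for generalization.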



 ### 5. Final Model Implementation

 With the optimal hyperparameters identified, we will implement the final model:

 - **Training with Optimal Hyperparameters:** Retrain the model on the entire training dataset using the best hyperparameters identified during tuning.
 - **Testing on Hold-Out Set:** Evaluate the final model on the hold-out test set to assess its performance in predicting `estimated_power_savings` while maintaining system performance.
 - **Analysis and Interpretation:** Analyze the results to understand the impact of different features on power savings and performance. This includes examining feature importance scores and the model's predictions.
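
Feature importance scores from a tree ensemble can be inspected as sketched below. The column names are borrowed from the dataset, but the values are random and the target is constructed by hand so that the expected ranking is known in advance. None of this reflects real results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
feature_names = [
    "gpu_power_usage (W)",
    "display_backlight_power (W)",
    "cpu_utilization_avg (%)",
]
X = pd.DataFrame(rng.normal(size=(400, 3)), columns=feature_names)

# Hand-made target: strong dependence on the first feature, weak on the
# second, none on the third (assumption for illustration)
y = (
    3.0 * X["gpu_power_usage (W)"]
    + 0.5 * X["display_backlight_power (W)"]
    + rng.normal(scale=0.1, size=400)
)

model = RandomForestRegressor(n_estimators=50, random_state=42).fit(X, y)

# Impurity-based importances, normalized to sum to 1
importances = pd.Series(
    model.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```

Impurity-based importances describe how the fitted model uses each feature, not a causal effect on power savings, so they are best read as a starting point for interpretation.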

 ### 6. Future Work

 Based on the results, we may explore additional techniques and improvements:

 - **Advanced Optimization Techniques:** Investigate methods like Pareto optimization to better handle the trade-offs between power savings and performance. Pareto optimization identifies non-dominated solutions, in which neither objective can be improved without degrading the other.
 - **Reinforcement Learning (RL):** Implement RL agents to dynamically optimize power consumption and performance based on real-time data. This approach allows for continuous learning and adaptation to changing conditions.
 - **Continuous Monitoring and Updating:** Continuously monitor the model's performance and update it with new data to maintain accuracy and relevance. This involves setting up a feedback loop to incorporate new insights and improve the model over time.

 By following this detailed step-by-step process, we aim to develop a robust and efficient model that optimizes power consumption while maintaining system performance. The comprehensive approach ensures that we carefully consider multiple objectives and select the best solution for our goal.

<br>

 --- 

 <br>


# **STEP 1 AND 2. MODEL TRAINING AND HYPERPARAMETER TUNING**

---

<br>

## Loading Data and checking that it loaded OK

In [1]:
## imports

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

### Loading DATASET!

In [2]:
dataset = pd.read_csv("laptop_stats_normalized.csv")

In [3]:
dataset.head()

Unnamed: 0,signal_strength (dBm),tx_power (dBm),device_power_usage (W),gpu_power_usage (W),display_backlight_power (W),idle_stats (%),frequency_stats (GHz),wakeups_per_sec (Hz),battery_discharge_rate (W),energy_consumed (Wh),...,device_name_Device_10,device_name_Device_11,device_name_Device_2,device_name_Device_3,device_name_Device_4,device_name_Device_5,device_name_Device_6,device_name_Device_7,device_name_Device_8,device_name_Device_9
0,-0.210941,-0.174946,-0.429449,0.026303,-1.581139,0.115323,-0.141395,0.062007,-0.246961,0.09917,...,False,False,False,False,False,False,False,False,False,True
1,-0.149429,-1.302987,-0.184735,1.199539,-1.264911,-0.03087,0.046752,-1.245494,-0.103369,-0.136244,...,True,False,False,False,False,False,False,False,False,False
2,1.052561,-1.274283,1.020206,1.194412,-0.948683,-0.261191,1.121649,-1.112252,-0.089217,-1.298594,...,False,False,False,False,False,False,False,False,False,False
3,0.914397,-1.086263,1.070054,1.199539,-0.632456,1.191936,1.121649,-0.978453,-0.170196,-1.298594,...,False,False,False,False,False,False,False,False,False,False
4,-0.500518,-1.102761,1.119902,-0.439969,-0.316228,0.963792,1.093617,-1.222282,0.04827,0.53803,...,False,False,True,False,False,False,False,False,False,False


In [4]:
dataset.shape

(87648, 62)

In [5]:
dataset.columns

Index(['signal_strength (dBm)', 'tx_power (dBm)', 'device_power_usage (W)',
       'gpu_power_usage (W)', 'display_backlight_power (W)', 'idle_stats (%)',
       'frequency_stats (GHz)', 'wakeups_per_sec (Hz)',
       'battery_discharge_rate (W)', 'energy_consumed (Wh)',
       'software_activity_impact (W)', 'package_power_state',
       'estimated_power_savings (W)', 'thermal_power_management (W)',
       'disk_io_power_usage (W)', 'wireless_radio_power_usage (W)',
       'power_usage_trends (W)', 'system_base_power (W)',
       'wakeups_from_idle (Hz)', 'average_cpu_frequency (GHz)',
       'network_latency (ms)', 'gpu_core_clock (GHz)',
       'gpu_memory_clock (GHz)', 'gpu_temperature (°C)', 'gpu_power_limit (W)',
       'gpu_utilization (%)', 'gpu_memory_utilization (%)',
       'cpu_core_clock_avg (GHz)', 'cpu_temperature (°C)',
       'cpu_power_limit (W)', 'cpu_utilization_avg (%)',
       'cpu_undervolt_offset (V)', 'gpu_undervolt_offset (V)',
       'ram_voltage (V)', 'ram_f

**The dataset has 87,648 rows and 62 columns, which means it loaded successfully.**

## DOING THE TRAINING WITH HYPERPARAMETER TUNING

In [21]:
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
import xgboost as xgb
from sklearn.metrics import mean_absolute_error, root_mean_squared_error
import json
import os
from tqdm import tqdm
import joblib

# Load the dataset
dataset = pd.read_csv("laptop_stats_normalized.csv")

# Define the target variable and features
target = 'estimated_power_savings (W)'
features = ['signal_strength (dBm)', 'tx_power (dBm)', 'device_power_usage (W)', 'gpu_power_usage (W)',
            'display_backlight_power (W)', 'idle_stats (%)', 'frequency_stats (GHz)', 'wakeups_per_sec (Hz)',
            'battery_discharge_rate (W)', 'energy_consumed (Wh)', 'software_activity_impact (W)',
            'thermal_power_management (W)', 'disk_io_power_usage (W)', 'wireless_radio_power_usage (W)',
            'power_usage_trends (W)', 'system_base_power (W)', 'wakeups_from_idle (Hz)', 'average_cpu_frequency (GHz)',
            'network_latency (ms)', 'gpu_core_clock (GHz)', 'gpu_memory_clock (GHz)', 'gpu_temperature (°C)',
            'gpu_power_limit (W)', 'gpu_utilization (%)', 'gpu_memory_utilization (%)', 'cpu_core_clock_avg (GHz)',
            'cpu_temperature (°C)', 'cpu_power_limit (W)', 'cpu_utilization_avg (%)', 'cpu_undervolt_offset (V)',
            'gpu_undervolt_offset (V)', 'ram_voltage (V)', 'ram_frequency (MHz)', 'ram_utilization (%)',
            'ram_cas_latency_cl', 'cpu_power_state_encoded']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(dataset[features], dataset[target], test_size=0.3, random_state=42)

# Define hyperparameters for tuning
param_grids = {
    "Linear Regression": {},
    "Decision Tree": {
        'max_depth': [3, 5, 10, None],
        'min_samples_split': [2, 5, 10],
        'min_samples_leaf': [1, 2, 4]
    },
    "Random Forest": {
        'n_estimators': [100, 200, 500],
        'max_depth': [3, 5, 10, None],
        'min_samples_split': [2, 5, 10],
        'min_samples_leaf': [1, 2, 4]
    },
    "Gradient Boosting": {
        'n_estimators': [100, 200, 500],
        'learning_rate': [0.01, 0.1, 0.2],
        'max_depth': [3, 5, 10],
        'min_samples_split': [2, 5, 10],
        'min_samples_leaf': [1, 2, 4]
    },
    "XGBoost": {
        'n_estimators': [100, 200, 500],
        'learning_rate': [0.01, 0.1, 0.2],
        'max_depth': [3, 5, 10],
        'subsample': [0.8, 0.9, 1.0],
        'colsample_bytree': [0.8, 0.9, 1.0]
    }
}

# Initialize the models
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
    "XGBoost": xgb.XGBRegressor(random_state=42)
}

# Directory to store best hyperparameters and models
output_dir = "best_models"
os.makedirs(output_dir, exist_ok=True)

# Train the models with hyperparameter tuning and evaluate their performance
for model_name, model in tqdm(models.items(), desc="Training Models"):
    print(f"\nTraining {model_name} with hyperparameter tuning...")
    
    # Perform hyperparameter tuning if there are parameters to tune
    if param_grids[model_name]:
        grid_search = GridSearchCV(model, param_grids[model_name], scoring='neg_mean_squared_error', cv=5, n_jobs=-1, verbose=1)
        grid_search.fit(X_train, y_train)
        best_model = grid_search.best_estimator_
        best_params = grid_search.best_params_
        print(f"Best hyperparameters for {model_name}: {best_params}")
    else:
        best_model = model
        best_model.fit(X_train, y_train)
        best_params = {}
    
    # Make predictions
    y_pred = best_model.predict(X_test)
    
    # Evaluate the model
    rmse = root_mean_squared_error(y_test, y_pred)
    mae = mean_absolute_error(y_test, y_pred)
    
    print(f"{model_name} - RMSE: {rmse:.4f}, MAE: {mae:.4f}\n")
    
    # Define the results structure
    results = {
        "best_params": best_params,
        "metrics": {
            "RMSE": rmse,
            "MAE": mae
        }
    }
    
    # JSON file path
    json_file_path = os.path.join(output_dir, f"best_hyperparameters_{model_name.replace(' ', '_')}.json")
    
    # Model file path
    model_file_path = os.path.join(output_dir, f"best_model_{model_name.replace(' ', '_')}.joblib")
    
    # Check if the file exists and compare the new results with the existing ones
    if os.path.exists(json_file_path):
        with open(json_file_path, 'r') as file:
            existing_results = json.load(file)
        
        # Compare RMSE; lower RMSE indicates better performance
        if rmse < existing_results["metrics"]["RMSE"]:
            with open(json_file_path, 'w') as file:
                json.dump(results, file, indent=4)
                print(f"Updated best hyperparameters for {model_name}")
            
            # Save the model
            joblib.dump(best_model, model_file_path)
            print(f"Updated best model for {model_name}")
    else:
        with open(json_file_path, 'w') as file:
            json.dump(results, file, indent=4)
            print(f"Saved best hyperparameters for {model_name}")
        
        # Save the model
        joblib.dump(best_model, model_file_path)
        print(f"Saved best model for {model_name}")


Training Models:   0%|          | 0/5 [00:00<?, ?it/s]


Training Linear Regression with hyperparameter tuning...
Linear Regression - RMSE: 0.9992, MAE: 0.9418


Training Decision Tree with hyperparameter tuning...
Fitting 5 folds for each of 36 candidates, totalling 180 fits


Training Models:  40%|████      | 2/5 [00:16<00:24,  8.05s/it]

Best hyperparameters for Decision Tree: {'max_depth': 3, 'min_samples_leaf': 1, 'min_samples_split': 2}
Decision Tree - RMSE: 0.9996, MAE: 0.9419


Training Random Forest with hyperparameter tuning...
Fitting 5 folds for each of 108 candidates, totalling 540 fits
