# **DOING MULTIOBJECTIVE OPTIMIZATION**

---

## PURPOSE OF THE NOTEBOOK

This notebook aims to address the dual objectives of optimizing power consumption while maintaining system performance for a laptop. In our project, it's critical to achieve significant power savings without compromising on essential performance metrics such as CPU frequency, GPU clock speeds, and display backlight power. Given the constraints of battery life and performance requirements in portable computing, our goal is to find a balanced solution that maximizes power efficiency while ensuring optimal system performance. The best-performing model from the previous analysis, XGBoost, will be leveraged with its optimal hyperparameters as the base model for our multi-objective optimization efforts.

## Breakdown of What Will Be DOne In THiss Notebook

### 1. Introduction to Multi-Objective Optimization

In the context of our project, multi-objective optimization is essential because we have competing goals: reducing power consumption (to extend battery life) and maintaining system performance (to ensure a smooth user experience). These objectives often conflict, as aggressive power-saving measures can lead to degraded performance. Therefore, a balanced approach is required to find an optimal trade-off.

### 2. Weighted Loss Function

- **Objective**: Combine power savings and performance into a single loss function by assigning weights to each objective.
- **Approach**:
  - **Why It's Needed**: A weighted loss function allows us to explicitly balance power savings and performance. By adjusting the weights, we can prioritize one objective over the other based on specific use cases or user preferences. For example, during intensive tasks like gaming, performance might be prioritized, whereas, during idle times, power savings might be more important.
  - **Implementation**: 
    - Define a combined loss function that includes both `estimated_power_savings` and performance metrics such as `average_cpu_frequency`, `gpu_core_clock`, `gpu_memory_clock`, and `display_backlight_power`.
    - Assign weights to each metric to reflect their relative importance.
    - Fine-tune the loss function to achieve the desired balance between power savings and performance.

### 3. Multi-Output Regression

- **Objective**: Train models that can predict multiple targets simultaneously, such as `estimated_power_savings` and performance metrics (e.g., `average_cpu_frequency`, `gpu_core_clock`, `gpu_memory_clock`, `display_backlight_power`).
- **Approach**:
  - **Why It's Needed**: Multi-output regression is crucial for understanding the interdependencies between power savings and performance metrics. By predicting multiple outcomes, we can better capture the trade-offs and interactions between different system components, leading to more informed and balanced optimizations.
  - **Implementation**:
    - Use the best-performing model (XGBoost) to create a multi-output regression framework.
    - Train the model to predict both `estimated_power_savings` and performance metrics.
    - Evaluate the model's ability to capture the relationship between power savings and performance, ensuring that improvements in power savings do not come at an unacceptable cost to performance.

### 4. Advanced Optimization Techniques

- **Objective**: Investigate methods like Pareto optimization to better handle the trade-offs between power savings and performance.
- **Approach**:
  - **Why It's Needed**: Pareto optimization helps identify solutions that offer the best trade-offs between conflicting objectives. In our project, it ensures that any improvement in power savings does not degrade performance beyond an acceptable threshold. This method provides a set of optimal solutions (Pareto front) from which we can choose based on specific requirements or user preferences.
  - **Implementation**:
    - Implement Pareto optimization to identify non-dominated solutions where improving one objective cannot degrade the other.
    - Analyze the Pareto front to select the optimal trade-offs between power savings and performance, providing a clear visual representation of the trade-offs.

### 5. Implementation and Evaluation

- **Steps**:
  - **Implement Weighted Loss Function**: Develop and train the model using a combined loss function that includes power savings and performance metrics.
  - **Implement Multi-Output Regression**: Train the XGBoost model to predict both `estimated_power_savings` and performance metrics simultaneously.
  - **Implement Pareto Optimization**: Apply Pareto optimization to identify the set of optimal solutions balancing power savings and performance.
  - **Evaluate Performance**: Assess the performance of each approach using metrics such as RMSE and MAE for power savings, and similar metrics for performance metrics.
  - **Compare Results**: Compare the results of each approach to determine the most effective strategy for balancing power savings and performance.

## Best Model and Hyperparameters from our one objective exploration

The best-performing model from our previous analysis was XGBoost with the following hyperparameters:

```json
{
    "colsample_bytree": 0.9,
    "learning_rate": 0.1,
    "max_depth": 10,
    "n_estimators": 500,
    "subsample": 0.9
}


---

<br>

## Loading Libraries and the Data

In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor
import xgboost as xgb
from sklearn.metrics import mean_absolute_error, mean_squared_error
import joblib
import numpy as np
import optuna
import os
from tqdm import tqdm
import json


# Load the dataset
dataset = pd.read_csv("laptop_stats_normalized.csv")

# Define the target variables and features
targets = ['estimated_power_savings (W)', 'average_cpu_frequency (GHz)', 'gpu_core_clock (GHz)', 
           'gpu_memory_clock (GHz)', 'display_backlight_power (W)']
features = ['signal_strength (dBm)', 'tx_power (dBm)', 'device_power_usage (W)', 'gpu_power_usage (W)',
            'idle_stats (%)', 'frequency_stats (GHz)', 'wakeups_per_sec (Hz)',
            'battery_discharge_rate (W)', 'energy_consumed (Wh)', 'software_activity_impact (W)',
            'thermal_power_management (W)', 'disk_io_power_usage (W)', 'wireless_radio_power_usage (W)',
            'power_usage_trends (W)', 'system_base_power (W)', 'wakeups_from_idle (Hz)',
            'network_latency (ms)', 'gpu_temperature (°C)', 'gpu_power_limit (W)', 'gpu_utilization (%)', 
            'gpu_memory_utilization (%)', 'cpu_temperature (°C)', 'cpu_power_limit (W)', 'cpu_utilization_avg (%)', 
            'cpu_undervolt_offset (V)', 'gpu_undervolt_offset (V)', 'ram_voltage (V)', 'ram_frequency (MHz)', 
            'ram_utilization (%)', 'ram_cas_latency_cl', 'cpu_power_state_encoded']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(dataset[features], dataset[targets], test_size=0.3, random_state=42)


---

<br>

# Before we start with the Multi Objective OPtimization

The choice of weights for the weighted loss function in multi-objective optimization is a critical decision that can significantly impact the performance and outcomes of the model.

Because of that, we will implement a search of optimal weights for our model, as part of our iterative optimization process.

<br>

## Weighted Loss Function

In [3]:
# Define the combined loss function
def weighted_loss(y_true, y_pred, weights):
    power_savings_loss = mean_squared_error(y_true[:, 0], y_pred[:, 0])
    performance_loss = sum(weights[i] * mean_squared_error(y_true[:, i+1], y_pred[:, i+1]) for i in range(len(weights)))
    return power_savings_loss + performance_loss


## Objective Function for Optuna

The Optuna study was conducted to optimize the hyperparameters of the XGBoost model and the weights for the performance metrics in the multi-objective optimization task. 


In [4]:
def objective(trial):
    # Suggest hyperparameters for XGBoost. These are the same that we got from the hyperparameter optimization
    colsample_bytree = trial.suggest_float('colsample_bytree', 0.5, 1.0, step=0.1)
    learning_rate = trial.suggest_float('learning_rate', 0.01, 0.2, step=0.01)
    max_depth = trial.suggest_int('max_depth', 3, 10)
    n_estimators = trial.suggest_int('n_estimators', 100, 500, step=50)
    subsample = trial.suggest_float('subsample', 0.5, 1.0, step=0.1)
    
    # Suggest weights for the performance metrics
    weight_cpu = trial.suggest_float('weight_cpu', 0.1, 0.5, step=0.05)
    weight_gpu_core = trial.suggest_float('weight_gpu_core', 0.05, 0.3, step=0.05)
    weight_gpu_memory = trial.suggest_float('weight_gpu_memory', 0.05, 0.3, step=0.05)
    weight_display = trial.suggest_float('weight_display', 0.05, 0.2, step=0.05)
    
    weights = [weight_cpu, weight_gpu_core, weight_gpu_memory, weight_display]
    
    # Initialize the model with suggested hyperparameters
    model = MultiOutputRegressor(xgb.XGBRegressor(
        colsample_bytree=colsample_bytree, 
        learning_rate=learning_rate, 
        max_depth=max_depth, 
        n_estimators=n_estimators, 
        subsample=subsample, 
        random_state=42
    ))
    
    # Train the model
    model.fit(X_train, y_train)
    
    # Make predictions
    y_pred = model.predict(X_test)
    
    # Calculate the combined loss
    loss = weighted_loss(np.array(y_test), y_pred, weights)
    return loss


In [5]:
# Create the Optuna study and optimize
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)

# Save the best parameters
best_params = study.best_params
best_weights = [best_params['weight_cpu'], best_params['weight_gpu_core'], best_params['weight_gpu_memory'], best_params['weight_display']]
best_hyperparams = {
    'colsample_bytree': best_params['colsample_bytree'],
    'learning_rate': best_params['learning_rate'],
    'max_depth': best_params['max_depth'],
    'n_estimators': best_params['n_estimators'],
    'subsample': best_params['subsample']
}

print(f"Best hyperparameters: {best_hyperparams}")
print(f"Best weights: {best_weights}")

# Save the study results
output_dir = "best_models"
os.makedirs(output_dir, exist_ok=True)
study_file_path = os.path.join(output_dir, "optuna_study.pkl")
joblib.dump(study, study_file_path)
print(f"Saved Optuna study to {study_file_path}")


[I 2024-05-16 17:02:15,605] A new study created in memory with name: no-name-ab803169-9c24-4df0-b937-3319b98b1da1
[I 2024-05-16 17:02:17,823] Trial 0 finished with value: 1.6330228340699058 and parameters: {'colsample_bytree': 1.0, 'learning_rate': 0.04, 'max_depth': 5, 'n_estimators': 200, 'subsample': 0.8, 'weight_cpu': 0.30000000000000004, 'weight_gpu_core': 0.1, 'weight_gpu_memory': 0.15000000000000002, 'weight_display': 0.2}. Best is trial 0 with value: 1.6330228340699058.
[I 2024-05-16 17:02:20,786] Trial 1 finished with value: 1.7028833910414705 and parameters: {'colsample_bytree': 1.0, 'learning_rate': 0.09, 'max_depth': 3, 'n_estimators': 500, 'subsample': 0.9, 'weight_cpu': 0.1, 'weight_gpu_core': 0.3, 'weight_gpu_memory': 0.3, 'weight_display': 0.05}. Best is trial 0 with value: 1.6330228340699058.
[I 2024-05-16 17:02:24,893] Trial 2 finished with value: 1.5435846372129634 and parameters: {'colsample_bytree': 1.0, 'learning_rate': 0.06999999999999999, 'max_depth': 8, 'n_esti

Best hyperparameters: {'colsample_bytree': 0.5, 'learning_rate': 0.16, 'max_depth': 10, 'n_estimators': 450, 'subsample': 0.9}
Best weights: [0.1, 0.05, 0.05, 0.05]
Saved Optuna study to best_models/optuna_study.pkl


The results of the study are as follows:


## Best Hyperparameters

The best hyperparameters identified for the XGBoost model are:

- **colsample_bytree**: 0.5
  - This parameter controls the subsample ratio of columns when constructing each tree. A lower value can help in reducing overfitting.

- **learning_rate**: 0.16
  - The learning rate controls the step size at each iteration while moving toward a minimum of the loss function. A value of 0.1 is commonly used to balance convergence speed and the ability to find a better minimum.

- **max_depth**: 10
  - This parameter specifies the maximum depth of the trees. A deeper tree can model more complex relationships but may overfit the data. A depth of 9 is a reasonable balance between complexity and generalization.

- **n_estimators**: 450
  - The number of boosting rounds or trees to be built. More trees can improve performance but also increase training time. 350 trees provide a good balance between performance and training time.

- **subsample**: 0.9
  - This parameter controls the subsample ratio of the training instances. A value of 0.7 means that 70% of the training data is used to build each tree, helping to prevent overfitting.

## Best Weights

The best weights for the performance metrics are:

- **Weight for average_cpu_frequency**: 0.1
  - This indicates the relative importance of maintaining average CPU frequency. A higher weight means more emphasis is placed on this metric during optimization.

- **Weight for gpu_core_clock**: 0.05
  - This weight reflects the importance of the GPU core clock frequency. A lower weight indicates less emphasis compared to the CPU frequency.

- **Weight for gpu_memory_clock**: 0.05
  - Similar to the GPU core clock, this weight represents the importance of the GPU memory clock frequency.

- **Weight for display_backlight_power**: 0.05
  - This weight signifies the importance of the display backlight power consumption. A lower weight suggests less emphasis on this metric compared to CPU frequency.


The Optuna study results show a careful balance between different hyperparameters and weights to optimize the model's performance while maintaining system efficiency. The selected hyperparameters and weights are providing us with with a foundation to fine-tune and evaluate in the multi-objective optimization process.

These parameters will be used to train the final XGBoost model and evaluate its performance in achieving the dual objectives of optimizing power savings and maintaining system performance.

The study results have been saved for future reference and further analysis:
- **Study file path**: `best_models/optuna_study.pkl`

---

<br>

## Training Model with best parameters and final weights

In [8]:
# Initialize the final model with the best parameters
final_model = MultiOutputRegressor(xgb.XGBRegressor(
    colsample_bytree=best_hyperparams['colsample_bytree'], 
    learning_rate=best_hyperparams['learning_rate'], 
    max_depth=best_hyperparams['max_depth'], 
    n_estimators=best_hyperparams['n_estimators'], 
    subsample=best_hyperparams['subsample'], 
    random_state=42
))

# Add a loading bar to the training process
with tqdm(total=len(X_train)) as pbar:
    for i in range(len(X_train)):
        final_model.fit(X_train[i:i+1], y_train[i:i+1])
        pbar.update(1)

# Make predictions
y_pred = final_model.predict(X_test)

# Evaluate the final model using the combined loss function
combined_loss = weighted_loss(np.array(y_test), y_pred, best_weights)
print(f"Final Combined Weighted Loss: {combined_loss:.4f}")

# Evaluate the final model for each target
results = {}
for i, target in enumerate(targets):
    rmse = mean_squared_error(y_test[target], y_pred[:, i], squared=False)
    mae = mean_absolute_error(y_test[target], y_pred[:, i])
    results[target] = {
        "RMSE": rmse,
        "MAE": mae
    }
    print(f"{target} - RMSE: {rmse:.4f}, MAE: {mae:.4f}")

# Save the final model and results
model_file_path = os.path.join(output_dir, "multi_output_xgboost_optimized.joblib")
joblib.dump(final_model, model_file_path)
print(f"Saved final multi-output XGBoost model to {model_file_path}")

results_file_path = os.path.join(output_dir, "multi_objective_results.json")
with open(results_file_path, 'w') as file:
    json.dump({
        "best_hyperparams": best_hyperparams,
        "best_weights": best_weights,
        "combined_loss": combined_loss,
        "metrics": results
    }, file, indent=4)
print(f"Saved multi-objective results to {results_file_path}")

Training XGBoost Model:   0%|          | 0/61353 [00:08<?, ?it/s]
100%|██████████| 61353/61353 [4:37:05<00:00,  3.69it/s]   

Final Combined Weighted Loss: 1.9634
estimated_power_savings (W) - RMSE: 1.1758, MAE: 1.0121
average_cpu_frequency (GHz) - RMSE: 1.4891, MAE: 1.1007
gpu_core_clock (GHz) - RMSE: 1.0523, MAE: 0.9475
gpu_memory_clock (GHz) - RMSE: 1.6084, MAE: 1.2601
display_backlight_power (W) - RMSE: 1.8678, MAE: 1.5772
Saved final multi-output XGBoost model to best_models/multi_output_xgboost_optimized.joblib
Saved multi-objective results to best_models/multi_objective_results.json



