In [1]:
import os

In [4]:
! ls

config.yaml         eval.py             run.py              validation
dataset.py          model.py            tools.py
datasets            notebook_test.ipynb train.py


## TASK 1 : TRAINING

### Data Description (see data analysis notebook)

The dataset we created adds temporal features to the graph representations fed to the graph neural network. The description below uses the parameters `lag_step=4` and `lag_jump=10`:

### Temporal Feature Construction:

For each node in the graph at the current timestep `t`, we construct a feature vector that includes:

1. The injection value at the current timestep `t`.
2. The previous injection values at timesteps `t - lag_jump`, `t - 2*lag_jump`, `t - 3*lag_jump`, and `t - 4*lag_jump`. This captures the history of injections at intervals of 10 timesteps before the current timestep.
3. Similarly, it includes the previous load values at timesteps `t - lag_jump`, `t - 2*lag_jump`, `t - 3*lag_jump`, and `t - 4*lag_jump`.

### Parameters Description:

- **lag_step**: The number of past timesteps included in each node's feature vector. With `lag_step=4`, four past timesteps are used.
- **lag_jump**: The interval between those past timesteps. With `lag_jump=10`, every 10th past timestep is used, so for a current timestep `t` the feature vector draws on `t-10`, `t-20`, `t-30`, and `t-40`.

### Dataset Structure:

- The resulting feature vector for each node at each timestep is 9-dimensional: 4 past injections, 4 past loads, and the current injection value.
- Timesteps for which a full feature vector cannot be built (because there are not enough previous timesteps) are excluded from the dataset. With these parameters, that is the first `lag_step * lag_jump = 40` timesteps.
- For each graph in the dataset, the `edge_index` and `edge_attr` remain the same as in the original graph structure, representing the connections between nodes and their respective attributes (e.g., reactance).
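The feature construction described above can be sketched as follows. This is a minimal illustration, not the code in `dataset.py`: it assumes injections and loads are available as arrays of shape `(T, N)` (timesteps by nodes), and the function name `build_lagged_features` and the column order (past injections, past loads, current injection) are hypothetical choices.

```python
import numpy as np

def build_lagged_features(injections, loads, lag_step=4, lag_jump=10):
    """Build per-node lagged features (hypothetical helper, not from dataset.py).

    injections, loads: arrays of shape (T, N), an assumed layout.
    Returns (features, first_t), where features has shape
    (T - lag_step * lag_jump, N, 2 * lag_step + 1).
    """
    injections = np.asarray(injections, dtype=float)
    loads = np.asarray(loads, dtype=float)
    n_steps = injections.shape[0]
    first_t = lag_step * lag_jump  # earlier timesteps lack a full history
    rows = []
    for t in range(first_t, n_steps):
        # t - lag_jump, t - 2*lag_jump, ..., t - lag_step*lag_jump
        lags = [t - k * lag_jump for k in range(1, lag_step + 1)]
        past_inj = injections[lags].T       # (N, lag_step)
        past_load = loads[lags].T           # (N, lag_step)
        current = injections[t][:, None]    # (N, 1)
        rows.append(np.concatenate([past_inj, past_load, current], axis=1))
    return np.stack(rows), first_t

# Toy data: 50 timesteps, 3 nodes.
toy_inj = np.arange(150, dtype=float).reshape(50, 3)
toy_load = -toy_inj
features, first_t = build_lagged_features(toy_inj, toy_load)
print(features.shape, first_t)
```

With 50 timesteps and the default parameters, only the last 10 timesteps yield a full 9-dimensional vector per node.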

### Implications for the Model:

- By including temporal features, the model can potentially learn patterns related to the evolution of loads and injections over time, which may help in predicting future loads more accurately.
- The temporal resolution we've chosen (every 10 timesteps) suggests we're interested in capturing medium-term trends rather than very short-term fluctuations.
- The model can now potentially identify and learn from the cyclic patterns, trends, and time-based dependencies present in the data, which could be crucial for tasks like load forecasting.

This enriched dataset should give the model a fuller view of the grid's dynamics, so that its predictions account for the grid's recent history as well as its current state.

### Model architecture

The updated `GNNModel` includes layers suited to graph-structured data for tasks like load forecasting on an electricity grid. Here's a detailed breakdown of its features and functionalities:

1. **Layer Normalization at Input (`ln1`)**:
   - The input features `x` are first normalized using LayerNorm. This normalizes each node's feature vector to zero mean and unit standard deviation across its features, which can help stabilize learning.

2. **Convolutional Layers (`conv_layers`)**:
   - The model employs Graph Attention Network Convolution (GATConv) or Graph Convolution Network (GCNConv) layers, chosen based on the `use_gat` flag.
   - The first convolutional layer takes the input features and transforms them into `hidden_features`.

3. **Residual Connections and Normalization**:
   - From the second convolutional layer onwards, residual connections are used. The model adds the input of the layer (identity) to the output of the normalization layer.
   - This approach is beneficial for deeper models, as it helps in mitigating the vanishing gradient problem and enables the model to learn more complex patterns.

4. **Activation Function (`leaky_or_tanh_func`)**:
   - The choice between Leaky ReLU and Tanh as the activation function is controlled by the `leaky_or_tanh` flag. This allows flexibility in model behavior and non-linearity.

5. **Dropout for Regularization**:
   - Dropout is applied after the activation function during training to prevent overfitting. It randomly zeroes some of the elements of the input tensor with probability `dropout` during training.

6. **Layer Normalization for Edge Features (`ln2`)**:
   - After computing the edge features (by concatenating the node features for each edge), these features are normalized using another LayerNorm layer (`ln2`).

7. **Edge Features Transformation**:
   - The model employs a linear transformation (`edge_transform`) followed by the final fully connected layer (`fc`) to generate the output features for each edge.

8. **Attention Mechanism**:
   - If `use_attention` is true, the model includes an `AttentionLayer` that applies an attention mechanism to the edge outputs. This can allow the model to focus on the most relevant edges, potentially improving the accuracy of predictions.

9. **Output**:
   - The final output of the model is the transformed edge output, which represents the model’s predictions.

The architecture of this model is designed to process graph-structured data effectively by leveraging both node features and the structural information encoded in the graph's edges. The use of attention mechanisms and various normalization techniques aims to enhance the model's ability to capture and learn complex patterns within the data, which is crucial for accurate load forecasting in electrical grids. The flexibility in choosing convolution types (GAT or GCN) and activation functions allows for customization based on specific dataset characteristics and modeling requirements.
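To make the edge-feature step (items 6 and 7) concrete, here is a minimal NumPy sketch of concatenating the two endpoint embeddings of each edge and normalizing the result. It is an illustration under assumptions, not the code in `model.py`: the helper names are hypothetical, `edge_index` is assumed to be a `(2, E)` array of source and destination node indices, and the learnable scale and shift of a real LayerNorm are omitted.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learnable affine)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def edge_features(node_hidden, edge_index):
    """Concatenate source and destination node embeddings for every edge.

    node_hidden: (N, H) node embeddings after the convolution layers.
    edge_index:  (2, E) integer array, row 0 = source, row 1 = destination.
    Returns an (E, 2 * H) array.
    """
    src, dst = edge_index
    return np.concatenate([node_hidden[src], node_hidden[dst]], axis=1)

# Three nodes with 2-dimensional embeddings, two edges: 0 -> 1 and 1 -> 2.
node_hidden = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
edge_index = np.array([[0, 1], [1, 2]])
raw_edges = edge_features(node_hidden, edge_index)
normed_edges = layer_norm(raw_edges)
print(raw_edges.shape)
```

In the model described above, the normalized edge features would then pass through `edge_transform` and `fc` to produce one prediction per edge.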

### Training and testing the eval.py functionality

In [123]:
cur_dir = os.curdir

config_path = cur_dir + '/config.yaml'
dir_path = cur_dir + '/datasets'
verbose = 2
save_plot = "" # use --save_plot or --no-save_plot

(The plots are generated and saved in the model's log folder; we also include a TensorBoard visualisation.)

In [159]:
command = f"python train.py {config_path} {dir_path} {verbose} {save_plot}"
!{command}


The trained model and other data will be saved here : ./logs/20231221-230759

device is set to cpu
Training (epoch 1/25): 100%|█████████████████| 218/218 [00:01<00:00, 150.47it/s]
Epoch 1, Avg Training Loss: 211.4559, MAE: 0.0448, RMSE: 0.0632, MAPE: 13792.6411, R-squared: 0.0003, Max Error: 0.2599
Training (epoch 1/25): 100%|███████████████████| 94/94 [00:00<00:00, 187.75it/s]
Epoch 1, Avg Validation Loss: 170.9469, MAE: 0.0802, RMSE: 0.1021, MAPE: 41666.1277, R-squared: 0.0010, Max Error: 0.3173
Improved best validation loss at epoch 1 : saving...

 ----------------- 

Training (epoch 2/25): 100%|█████████████████| 218/218 [00:01<00:00, 169.16it/s]
Epoch 2, Avg Training Loss: 176.3360, MAE: 0.0432, RMSE: 0.0588, MAPE: 20706.3326, R-squared: 0.0006, Max Error: 0.2365
Training (epoch 2/25): 100%|███████████████████| 94/94 [00:00<00:00, 188.73it/s]
Epoch 2, Avg Validation Loss: 159.9592, MAE: 0.0799, RMSE: 0.1042, MAPE: 44307.3245, R-squared: 0.0006, Max Error: 0.3072
Improved best val

In [165]:
%load_ext tensorboard

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


In [166]:
%tensorboard --logdir ./logs

In [169]:
! ls logs

20231221-230759


(copy the folder name and paste it below after 'logs/....')

In [170]:
config_path_dataset = cur_dir + '/logs/20231221-230759' + '/config.yaml'
dir_path_dataset = cur_dir + '/datasets'
output_path_dataset = cur_dir + '/logs/20231221-230759'
verbose_dataset = 2
pred_only = "" # use --pred_only or --no-pred_only

example

In [173]:
command = f"python eval.py {config_path_dataset} {dir_path_dataset} {output_path_dataset} {verbose_dataset} {pred_only}"
!{command}

device is set to cpu
Evaluation (epoch 1/1): 100%|██████████████| 9960/9960 [00:13<00:00, 763.98it/s]

Loads.npy file generated at : ./logs/20231221-230759/Preds/loads_20231221-231820.npy

Epoch 1, Avg Evaluation Loss: 267.9728, MAE: 0.0006, RMSE: 0.0012, MAPE: 40.6160, R-squared: -0.0000, Max Error: 0.0038


## TASK 2: MODEL EVALUATION (underestimated prediction)

---------

Training (epoch 25/25): 100%|████████████████| 218/218 [00:01<00:00, 169.20it/s]

Epoch 25, Avg Training Loss: 82.9589, MAE: 0.0265, RMSE: 0.0406, MAPE: 5310.9375, R-squared: 0.0029, Max Error: 0.2100

Training (epoch 25/25): 100%|██████████████████| 94/94 [00:00<00:00, 188.15it/s]

Epoch 25, Avg Validation Loss: 76.6000, MAE: 0.0600, RMSE: 0.0785, MAPE: 10344.1416, R-squared: 0.0049, Max Error: 0.2664

Improved best validation loss at epoch 25 : saving...

-----------

The metrics above show clear room for improvement, which matters because underestimating loads on an electricity grid is critical. We analyze the scores below and then discuss strategies to address the underestimation issue.

### Analysis of Model Performance:

1. **Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE)**:
   - Both MAE and RMSE are relatively low, which initially seems positive. However, the specific values of these errors need to be contextualized against baseline models or domain-specific thresholds to fully understand their significance.

2. **Mean Absolute Percentage Error (MAPE)**:
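   - The MAPE values are very high (about 5311% in training and 10344% in validation at epoch 25). This suggests that the model's relative errors are significant, which is critical given the accuracy required in load forecasting. Because MAPE divides each error by the true value, it is inflated whenever true values are close to zero, so part of this may reflect near-zero loads rather than uniformly poor predictions.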

3. **R-squared**:
   - The R-squared values are very low for both training (0.0029) and validation (0.0049), indicating that the model explains almost none of the variance in the dataset.

4. **Max Error**:
   - The maximum errors observed are not negligible, particularly in the validation set. In the context of load forecasting, such errors could lead to significant issues, especially if they represent underestimations.
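The sketch below shows how these five metrics are commonly computed and why a small MAE can coexist with a very large MAPE. It uses the standard textbook definitions; whether `eval.py` computes them in exactly this way is an assumption, and the function name `regression_metrics` and the toy values are illustrative only.

```python
import numpy as np

def regression_metrics(y_true, y_pred, eps=1e-8):
    """Standard regression metrics (assumed definitions, not taken from eval.py)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        # eps guards against division by zero for exactly-zero targets
        "mape": float(100.0 * np.mean(np.abs(err) / np.maximum(np.abs(y_true), eps))),
        "r2": float(1.0 - ss_res / ss_tot),
        "max_error": float(np.max(np.abs(err))),
    }

# Every prediction is off by the same 0.05, but one target is close to zero.
toy_true = np.array([0.001, 0.5, 1.0])
toy_pred = toy_true + 0.05
toy_metrics = regression_metrics(toy_true, toy_pred)
print(toy_metrics)
```

In this toy case the MAE is 0.05 while the MAPE exceeds 1000%, driven entirely by the single near-zero target. Checking the distribution of true load values would show whether the same effect contributes to the MAPE values reported above.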

### Addressing Underestimation


**Regular monitoring** of the model's performance in a real-world setting is crucial. Over time, as more data is collected and the behavior of the grid evolves, the model should be retrained or fine-tuned to adapt to these changes. 

Implementing the strategies below requires care to avoid overfitting and to preserve the model's ability to generalize to unseen data. **The goal is to reduce underestimation risks without significantly compromising overall accuracy.**

#### 1. Custom Loss Function for Asymmetric Penalties:

We could introduce a loss function that penalizes underestimations more than overestimations, which directly targets the critical nature of underestimation in load forecasting. This requires a custom loss in which predictions below the actual values incur a higher penalty, for example a weighted mean squared error with larger weights on underestimations. The model would then learn to err on the side of overestimation, reducing the risk associated with underestimating loads.
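A minimal sketch of such a weighted mean squared error, written in NumPy for clarity. The function name `asymmetric_mse` and the weight `under_weight=3.0` are hypothetical; the weight would need to be tuned. In the training loop the same logic would have to be expressed with differentiable tensor operations rather than NumPy.

```python
import numpy as np

def asymmetric_mse(y_true, y_pred, under_weight=3.0):
    """Weighted MSE that penalizes underestimation more (illustrative sketch).

    Errors where y_pred < y_true are multiplied by under_weight;
    overestimations keep a weight of 1.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    weights = np.where(err < 0, under_weight, 1.0)  # err < 0 means underestimation
    return float(np.mean(weights * err ** 2))

# Same absolute error of 0.5, opposite directions.
loss_under = asymmetric_mse([1.0], [0.5])  # prediction too low
loss_over = asymmetric_mse([1.0], [1.5])   # prediction too high
print(loss_under, loss_over)
```

With `under_weight=3.0`, an underestimation costs three times as much as an overestimation of the same size. A larger weight pushes predictions upward more strongly, at the cost of a higher overall error.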

#### 2. Enhanced Model Architecture and Feature Engineering:

We could also improve the model's ability to capture complex patterns in the data, which may reduce underestimation errors. This means experimenting with different neural network architectures, such as deeper networks or different types of layers (e.g., adding attention mechanisms), and exploring feature engineering to include more relevant information or to transform existing features into a better representation. The expected outcome is a more robust model that captures the nuances of the data and predicts loads more accurately.

#### 3. Detailed Error Analysis and Model Monitoring:

Finally, we will focus on understanding the specific scenarios where underestimations occur and on continuously improving model performance. This involves analyzing the instances where the model underestimates and looking for patterns or common characteristics, monitoring performance over time (especially during peak loads or unusual conditions), and regularly updating the model with new data and insights. The expected outcomes are the identification of the key factors behind underestimation, which allows targeted improvements, and a model that adapts to changing conditions and stays accurate.
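One possible starting point for this error analysis is to measure how often, and by how much, the model underestimates at different load levels. The sketch below groups samples into quantile bins of the true load; the function name `underestimation_report`, the binning scheme, and the toy data are all illustrative assumptions.

```python
import numpy as np

def underestimation_report(y_true, y_pred, n_bins=4):
    """Underestimation rate and mean shortfall per quantile bin of the true load."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    under = y_pred < y_true                           # True where the model predicts too low
    shortfall = np.clip(y_true - y_pred, 0.0, None)   # size of the underestimation, else 0
    edges = np.quantile(y_true, np.linspace(0.0, 1.0, n_bins + 1))
    bin_ids = np.clip(np.searchsorted(edges, y_true, side="right") - 1, 0, n_bins - 1)
    report = []
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # skip empty bins (possible when many values are equal)
        report.append({
            "bin": b,
            "load_min": float(y_true[mask].min()),
            "load_max": float(y_true[mask].max()),
            "count": int(mask.sum()),
            "under_rate": float(under[mask].mean()),
            "mean_shortfall": float(shortfall[mask].mean()),
        })
    return report

# Toy case: the model is exact except for the two highest loads.
toy_true = np.arange(1.0, 9.0)
toy_pred = toy_true.copy()
toy_pred[6:] -= 1.0
report = underestimation_report(toy_true, toy_pred)
for row in report:
    print(row)
```

If the underestimation rate rises in the highest bins, as in this toy case, the model is weakest exactly where errors are most costly, which would support targeting peak loads first.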

**These strategies aim to tackle underestimation directly while preserving the model's overall predictive power. By focusing on an asymmetric loss, improving the model's structure, and committing to ongoing analysis and updates, we can mitigate the risks of underestimating loads. Within this framework we could also introduce new features, improve the data quality, try ensemble methods targeted at this issue, add post-processing adjustments (for example, comparing our predictions against a physics-based algorithm), and combine all of this into a new training strategy.**


## TASK 3: EVALUATION

In [176]:
! ls logs

20231221-230759


(copy the folder name and paste it below after 'logs/....')

In [177]:
config_path_eval = cur_dir + '/logs/20231221-230759' + '/config.yaml'
dir_path_eval = cur_dir + '/validation'
output_path_eval = cur_dir + '/logs/20231221-230759'
verbose_eval = 2
pred_only = ""

In [178]:
command = f"python eval.py {config_path_eval} {dir_path_eval} {output_path_eval} {verbose_eval} {pred_only}"
!{command}

device is set to cpu
Evaluation (epoch 1/1): 100%|██████████████████| 10/10 [00:00<00:00, 149.50it/s]

Loads.npy file generated at : ./logs/20231221-230759/Preds/loads_20231221-232242.npy

Epoch 1, Avg Evaluation Loss: 181.2689, MAE: 0.7335, RMSE: 1.3353, MAPE: 565.1479, R-squared: -0.0449, Max Error: 4.3679


The evaluation results on a separate dataset provide valuable insights into the model's generalization capabilities and areas for improvement. Let's analyze these results:

### Analysis of Model Performance on the Separate Dataset:

1. **Mean Absolute Error (MAE)**: 
   - An MAE of 0.7335 indicates that, on average, the model's predictions are off by this amount. While the absolute figure might seem low, its significance is context-dependent and should be compared against domain-specific benchmarks or baseline models.

2. **Root Mean Squared Error (RMSE)**:
   - The RMSE of 1.3353, being higher than the MAE, suggests the presence of some larger errors in the predictions. This could be indicative of the model struggling with certain instances in the data.

3. **Mean Absolute Percentage Error (MAPE)**:
   - A very high MAPE of 565.1479% indicates that the model's predictions have significant relative errors. This is especially concerning in load forecasting, where accuracy is crucial.

4. **R-squared (R²)**:
   - A negative R² value of -0.0449 implies that the model performs worse than a simple mean-based model. This is a strong indicator that the model is not capturing the underlying trends and patterns in the data effectively.

5. **Max Error**:
   - A Max Error of 4.3679 is concerning, especially in load forecasting, where large errors can have substantial consequences.
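To see why a negative R² means "worse than predicting the mean", recall the usual definition (assuming this is the one `eval.py` uses), where $y_i$ are the true values, $\hat{y}_i$ the predictions, and $\bar{y}$ the mean of the true values:

$$
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
$$

A model that always predicts $\bar{y}$ gets $R^2 = 0$. The value is negative exactly when the numerator exceeds the denominator, that is, when the model's squared error is larger than that of the constant mean predictor. Under this definition, $R^2 = -0.0449$ corresponds to a squared error about 4.5% larger than the mean predictor's.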

### Strategies for Improvement:

Given these results, it's clear that the model requires significant improvements to be effective for practical use. Here are some strategies:

1. **Model Complexity**

2. **Feature Engineering and Selection**

3. **Hyperparameter Tuning**

4. **Training Procedure**

5. **Data Quality and Preprocessing**

6. **Error Analysis**

7. **Model Monitoring and Updating**

8. **Comparison with Baselines**
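For the baseline comparison (item 8), two simple reference predictors are a per-series mean and a persistence forecast. The sketch below is an illustration under assumptions: loads are taken to be arrays of shape `(T, E)` (timesteps by edges), and the function names and toy values are hypothetical.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def mean_baseline(train_loads, eval_loads):
    """Predict, for every timestep, the mean load of each series in the training data."""
    per_series_mean = np.asarray(train_loads, dtype=float).mean(axis=0)
    return np.broadcast_to(per_series_mean, np.asarray(eval_loads).shape)

def persistence_baseline(eval_loads):
    """Predict the load at step t as the observed load at step t - 1.

    Returns predictions for steps 1..T-1; the first step has no prediction.
    """
    return np.asarray(eval_loads, dtype=float)[:-1]

# Toy data: 2 training timesteps and 3 evaluation timesteps for 2 series.
toy_train = np.array([[1.0, 2.0], [3.0, 4.0]])
toy_eval = np.array([[2.0, 3.0], [2.0, 3.0], [4.0, 5.0]])

mae_mean = mae(toy_eval, mean_baseline(toy_train, toy_eval))
mae_persist = mae(toy_eval[1:], persistence_baseline(toy_eval))
print(mae_mean, mae_persist)
```

Computing the same metrics for these baselines on the validation folder would give the reference points needed to judge whether the reported MAE and R² are good or bad.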