# Neural Network Verification for DC-OPF

Welcome to the Jupyter Notebook for the **Neural Network Verification for DC-OPF** tutorial, presented at the **Oslo Workshop of AI-EFFECT**. In this tutorial, we verify a neural network trained to predict solutions of the DC Optimal Power Flow (DC-OPF) problem, an essential optimization task in power system operations. The DC-OPF determines the generator dispatch that meets system demand, typically at minimum cost.

For our neural network, the input consists of the system's load profile, and the output is the predicted generator dispatch that minimizes cost. We trained the network on 80,000 samples and tested it using 20,000 samples.
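
For reference, one common textbook form of the DC-OPF with linear generation costs is sketched below. The notation is assumed here for illustration and does not come from the tutorial's code: $c$ is the cost vector, $p_g$ the generator dispatch, $p_d$ the vector of bus loads, $M_g$ the generator-to-bus incidence matrix, $\mathbf{PTDF}$ the power transfer distribution factor matrix, and $f^{\max}$ the line flow limits. The tutorial's implementation may use a different but equivalent formulation, for example one with voltage angles or quadratic costs.

```latex
\begin{aligned}
\min_{p_g}\quad & c^\top p_g \\
\text{s.t.}\quad & \mathbf{1}^\top p_g = \mathbf{1}^\top p_d, \\
& -f^{\max} \le \mathbf{PTDF}\,(M_g\, p_g - p_d) \le f^{\max}, \\
& p_g^{\min} \le p_g \le p_g^{\max}.
\end{aligned}
```

In this notation the neural network approximates the map from $p_d$ to the minimizer $p_g^\star$, and the last two constraint groups are the line flow and generator limits checked in the rest of the notebook.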

Throughout this tutorial, we will assess the network's performance by examining how accurately it predicts the generator dispatch, and whether any predicted dispatches violate generator limits or line flow constraints. The analysis will proceed as follows:

1. Violations on the Entire Dataset: We will begin by identifying any violations in the predictions across the complete dataset (100,000 samples).
2. Worst-Case Violations Over the Continuous Input Domain: Next, we will explore the worst-case violations across the entire continuous input domain, demonstrating how violations can be more severe when considering all possible inputs.
3. Worst-Case Distance to Optimality: Finally, we will evaluate the worst-case deviation from the optimal solution.

These three experiments will highlight the importance of verification in neural network models, especially in critical applications like power system operations.



### Data Loader

First, we load the dataset and the trained neural network for evaluation.

In [1]:
import os

os.environ['GRB_LICENSE_FILE'] = 'C:/Users/bagir/gurobi.lic'

# Resolve the parent of the current working directory
parent_directory = os.path.abspath(os.path.join(os.getcwd(), "../"))
print(parent_directory)

# Define the cases
case_name = 'case39_DCOPF'
case_iter = '1'
case_path = os.path.join(parent_directory, "python_based", "test_networks", "network_modified")

# Path to the trained neural network
nn_path = os.path.join(parent_directory, "python_based", "trained_nns", case_name, case_iter)
dataset_type = 'all'

c:\Users\bagir\OneDrive - Danmarks Tekniske Universitet\Dokumenter\1) Projects\3) AI Effect\5) Verification\DC-OPF verification


### Empirical Worst-Case Over Entire Dataset

In this section, we evaluate the network empirically over the entire dataset. The summary reports both average and worst-case metrics:

- **MAE (Mean Absolute Error) (%)**:  
    The Mean Absolute Error (MAE) represents the average of the absolute differences between predicted and actual values, expressed as a percentage.

- **Avg Max Generator Violation (MW)**:  
    The largest generator limit violation (in MW) of each sample, averaged over the entire dataset.

- **Avg Max Line Violation (MW)**:  
    The largest line flow limit violation (in MW) of each sample, averaged over the entire dataset.

- **Avg Distance to Optimal Setpoints (%)**:  
    This term quantifies the average distance, as a percentage, between the predicted operating points and the true optimal operating points across the dataset.

- **Avg Sub-Optimality (%)**:  
    The average sub-optimality measures the dispatch cost deviation from the optimal solution, averaged across the entire dataset.

- **Worst-Case Generator Violation (MW)**:  
    This represents the largest observed generator violation (in MW) over the entire dataset of 100,000 samples.

- **Worst-Case Line Violation (MW)**:  
    This refers to the largest observed line flow violation (in MW) over the entire dataset of 100,000 samples.

- **Worst-Case Distance to Optimal Setpoints (%)**:  
    This is the largest observed distance (in %) between the generator setpoints and the optimal setpoints across the dataset of 100,000 samples.

- **Worst-Case Sub-Optimality (%)**:  
    This refers to the largest observed sub-optimality, in terms of dispatch cost, compared to the optimal solution over the entire dataset.
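
The difference between the "Avg Max" and "Worst-Case" violation metrics can be sketched on toy data. The snippet assumes that "Avg Max" means the per-sample maximum violation averaged over the dataset; the limits, predictions, and the helper `max_generator_violation` are hypothetical and are not taken from `run_dc_opf_evaluation`, whose exact definitions may differ.

```python
# Hypothetical toy data: 3 samples, 2 generators (MW). Not from the tutorial's dataset.
pg_min = [0.0, 0.0]
pg_max = [100.0, 50.0]
pg_pred = [
    [98.0, 51.5],   # generator 2 exceeds its maximum by 1.5 MW
    [101.0, 20.0],  # generator 1 exceeds its maximum by 1.0 MW
    [40.0, -0.5],   # generator 2 is 0.5 MW below its minimum
]


def max_generator_violation(sample):
    """Largest limit violation (MW) over all generators of one sample; 0 if feasible."""
    worst = 0.0
    for p, lo, hi in zip(sample, pg_min, pg_max):
        worst = max(worst, p - hi, lo - p)
    return worst


# One number per sample, then aggregate over the dataset in two different ways.
per_sample = [max_generator_violation(s) for s in pg_pred]
avg_max_violation = sum(per_sample) / len(per_sample)  # "Avg Max ... Violation"
worst_case_violation = max(per_sample)                 # "Worst-Case ... Violation"

print(per_sample, avg_max_violation, worst_case_violation)
```

The same pattern would apply to line flows once the flows implied by each predicted dispatch have been computed.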


In [2]:
import sys
functions_path = os.path.join(parent_directory, "python_based/functions")
sys.path.append(functions_path)

from statistical_bound import run_dc_opf_evaluation

# do the empirical evaluation of the worst-case performance of the neural network
results_summary = run_dc_opf_evaluation(case_name, case_path, nn_path)

  net[table] = pd.concat([net[table], dd[dd.columns[~dd.isnull().all()]]], sort=False)


Loaded case data successfully.
Neural network data loaded successfully.

Summary Results:
------------------------------------------------------------
MAE (%)                                      :       0.11
Avg Max Generator Violation (MW)             :       1.21
Avg Max Line Violation (MW)                  :       0.55
Avg Distance to Optimal Setpoints (%)        :       0.49
Avg Sub-Optimality (%)                       :       0.01
Worst-Case Generator Violation (MW)          :      55.07
Worst-Case Line Violation (MW)               :      50.94
Worst-Case Distance to Optimal Setpoints (%) :       7.56
Worst-Case Sub-Optimality (%)                :       0.63


### Worst-Case Violations Analysis

In this section, we analyze the worst-case violations across the entire input domain to compare them with the empirically observed worst-case violations.

- **v g time (s)**:  
    This represents the time required to calculate the worst-case generator violation over the continuous input domain.

- **v g wc (MW)**:  
    This indicates the worst-case generator violation over the continuous input domain. It can be compared with the "Worst-Case Generator Violation (MW)" mentioned earlier.

- **v g ID**:  
    This is the index of the generator where the worst-case violation occurs.

- **v line time (s)**:  
    This represents the time needed to calculate the worst-case line flow violation over the continuous input domain.

- **v line wc (MW)**:  
    This indicates the worst-case line flow violation over the continuous input domain. It can be compared with the "Worst-Case Line Violation (MW)" mentioned earlier.

- **v line ID**:  
    This is the index of the line where the worst-case violation occurs.
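
The cell output reports that these worst cases are obtained by solving MILPs. A standard way to embed a ReLU network in a MILP is the big-M encoding sketched here; whether `exact_bound` uses exactly this form is an assumption, and the symbols are introduced only for illustration. For layer $k$ with weights $W_k$, biases $b_k$, pre-activation bounds $\hat{y}_k^{\min} \le 0 \le \hat{y}_k^{\max}$, and binary activation indicators $z_k$:

```latex
\begin{aligned}
\hat{y}_k &= W_k\, y_{k-1} + b_k, \\
y_k &\ge \hat{y}_k, \qquad y_k \ge 0, \\
y_k &\le \hat{y}_k - \hat{y}_k^{\min} \odot (1 - z_k), \\
y_k &\le \hat{y}_k^{\max} \odot z_k, \qquad z_k \in \{0,1\}^{n_k}.
\end{aligned}
```

With $z_k = 1$ the constraints force $y_k = \hat{y}_k$, and with $z_k = 0$ they force $y_k = 0$, so the encoding reproduces $y_k = \max(\hat{y}_k, 0)$ exactly. Under this assumption, a worst-case generator violation would correspond to maximizing, for example, $p_{g,i} - p_{g,i}^{\max}$ over all loads in the input domain subject to these constraints.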


In [3]:
from exact_bound import WorstCaseAnalyzer

dataset_type = 'test'
analyzer = WorstCaseAnalyzer(case_name, case_path, nn_path, dataset_type)
analyzer.run_analysis()



Loaded case data successfully.
Neural network data loaded successfully.
Set parameter Username
Academic license - for non-commercial use only - expires 2025-05-24


  self.model.addConstr(self.mpc.M_g @ gp.vstack((self.pg_slack, (self.pg_pred * self.mpc.pg_delta.reshape(-1, 1) / self.mpc.baseMVA)))


Solving MILP for PGMAX Violations
Generator 7, Mismatch in neural network prediction -- PGMAX: 0.0046160449337394015
Solving MILP for PGMIN Violations
Solving MILP for PLINE Violations

Worst-Case Summary:
--------------------------------------------------
v g time                      :      2.505
v g wc                        :    157.621
v g ID                        :          8
v line time                   :      3.408
v line wc                     :    227.530
v line ID                     :         35
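
The worst-case generator violation over the continuous domain (157.621 MW) is far larger than the largest violation observed on the dataset (55.07 MW). The toy example below illustrates how a finite sample can miss the true worst case. It is not the MILP used in the tutorial: it uses a hypothetical one-hidden-layer ReLU network with a scalar input, for which the exact maximum can be found by checking the interval endpoints and the kinks of the piecewise-linear output.

```python
# Hypothetical one-hidden-layer ReLU network with a scalar input x in [0, 1].
# Each hidden unit is (w, b, v) and contributes v * max(0, w * x + b) to the output.
# The three units form a narrow triangular spike of height 1 centred at x = 0.43.
HIDDEN = [(100.0, -42.0, 1.0), (100.0, -43.0, -2.0), (100.0, -44.0, 1.0)]
X_LO, X_HI = 0.0, 1.0


def network(x):
    """Scalar output of the toy network, read here as a 'violation'."""
    return sum(v * max(0.0, w * x + b) for w, b, v in HIDDEN)


# Empirical worst case: evaluate on a finite sample (here an 11-point grid).
samples = [i / 10 for i in range(11)]
empirical_worst = max(network(x) for x in samples)

# Exact worst case: the output is piecewise linear in x, so its maximum over
# the interval is attained at an endpoint or at a kink x = -b / w of some unit.
kinks = [-b / w for w, b, _ in HIDDEN if w != 0.0]
candidates = [X_LO, X_HI] + [x for x in kinks if X_LO <= x <= X_HI]
exact_worst = max(network(x) for x in candidates)

print(f"empirical worst case: {empirical_worst:.3f}")
print(f"exact worst case:     {exact_worst:.3f}")
```

The grid never lands inside the spike, so the sampled maximum stays near zero while the exact maximum is about 1. Kink enumeration only works for this scalar-input toy case; for networks with many inputs, an optimization-based method such as the MILP above is needed.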


### Distance to Optimal Solution

In this section, we evaluate the worst-case distance to optimality and the most sub-optimal dispatch across the entire continuous input domain.

- **v dist time (s)**:  
    This represents the time required to compute the largest distance between the predicted generator setpoints and the optimal setpoints over the continuous input domain.

- **v dist wc (%)**:  
    This indicates the largest distance between the predicted generator setpoints and the optimal setpoints over the continuous input domain. It can be compared with the "Worst-Case Distance to Optimal Setpoints (%)" mentioned earlier.

- **v dist ID**:  
    This is the index of the generator where the largest distance to the optimal setpoint occurs.

- **v opt time (s)**:  
    This represents the time needed to identify the most sub-optimal dispatch, in terms of dispatch cost, over the continuous input domain.

- **v opt wc (%)**:  
    This is the worst-case sub-optimality (in %) over the continuous input domain. It can be compared with the "Worst-Case Sub-Optimality (%)" mentioned earlier.
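
The two quantities can be illustrated for a single operating point. The sketch assumes linear costs, a sub-optimality defined as the relative cost increase over the optimal dispatch, and a distance normalized by each generator's capacity; these definitions and all numbers are hypothetical and may differ from those used in `optimality_gap`.

```python
# Hypothetical toy data for 2 generators: cost coefficients and setpoints (MW).
cost = [10.0, 30.0]
pg_max = [100.0, 50.0]
pg_opt = [100.0, 20.0]   # assumed optimal dispatch for some load
pg_pred = [95.0, 25.0]   # assumed neural network prediction for the same load


def dispatch_cost(pg):
    """Total cost of a dispatch under the assumed linear cost model."""
    return sum(c * p for c, p in zip(cost, pg))


# Sub-optimality: relative cost increase of the predicted dispatch (%).
cost_opt = dispatch_cost(pg_opt)
sub_optimality = 100.0 * (dispatch_cost(pg_pred) - cost_opt) / cost_opt

# Distance to optimal setpoints: largest per-generator gap,
# expressed as a percentage of that generator's capacity (assumed normalization).
distance = max(
    100.0 * abs(p - q) / cap for p, q, cap in zip(pg_pred, pg_opt, pg_max)
)

print(f"sub-optimality: {sub_optimality:.2f} %")
print(f"distance to optimal setpoints: {distance:.2f} %")
```

In the verification setting these quantities are maximized over the whole input domain rather than evaluated at one point, which requires the optimal dispatch itself to be represented inside the optimization problem.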


In [4]:
from optimality_gap import SubOptimalityAnalyzer

analyzer = SubOptimalityAnalyzer(case_name, case_path, nn_path, dataset_type)
analyzer.run_analysis()



Loaded case data successfully.
Neural network data loaded successfully.




this is the max distance for generator 0:  0.018453123082550538
KKT solution and rundcopf do match -- continue
this is the max distance for generator 1:  0.6167372241140243
KKT solution and rundcopf do match -- continue
this is the max distance for generator 2:  0.4250694808632653
KKT solution and rundcopf do match -- continue
this is the max distance for generator 3:  0.25860979540781137
KKT solution and rundcopf do match -- continue
this is the max distance for generator 4:  0.22106819506141315
KKT solution and rundcopf do match -- continue
this is the max distance for generator 5:  0.02861096982378769
KKT solution and rundcopf do match -- continue
this is the max distance for generator 6:  0.15753967213828016
KKT solution and rundcopf do match -- continue
this is the max distance for generator 7:  0.02614087342103477
KKT solution and rundcopf do match -- continue
this is the max distance for generator 8:  0.1579206611482067
KKT solution and rundcopf do match -- continue
this is the 