This notebook explains how to generate the datasets used for "Scenario 1" of the proposed benchmark.

In [1]:
import os

# Directory where the generated benchmark datasets are stored
path_benchmark = "reference_data"
# Create the directory if it does not exist yet
os.makedirs(path_benchmark, exist_ok=True)

# LIPS: Learning Industrial Physical Simulation benchmark suite: the power grid case

The benchmarking platform (shown in the figure below) is composed of three main modules: Data, Benchmarking and Evaluation. As the names suggest, each module provides its own set of functions for a specific task. The Data module generates datasets and distributions, with a set of parameters offering a high degree of flexibility. Once the datasets are generated, the Benchmarking module takes them as input and runs the required benchmark on them. Each benchmark is specified by the inputs and outputs required for modeling and prediction. Finally, to evaluate the performance on each benchmark, the Evaluation module provides a set of functions that inspect the models from different points of view (physics compliance, machine learning metrics, adaptability and readiness).

The figure below presents the scheme of the benchmarking platform, which is composed of three distinct parts:
* DataSet: generates datasets for training and evaluation
* Benchmark: coordinates the different parts and allows training and evaluating an augmented simulator
* Evaluation: once the benchmark is done, evaluates the performance from various points of view

![title](img/Benchmarking_scheme.png)

## Benchmark 1 : Risk assessment through contingency screening
The first power grid benchmark tackles the risk assessment step in power grid operation. The problem is to anticipate potential threats to the power grid (several hours ahead) and warn the operators accordingly. The benchmark simulates incidents (also known as contingencies) involving various elements of the grid (such as the disconnection of a line or a production unit), one by one. For each contingency, a risk (a weakness of the grid) is flagged when the simulation engine detects overloads on some lines. On a real grid, this scenario means running tens of thousands of simulations, so computation time is critical, especially since this risk assessment is refreshed every 15 minutes. In this benchmark, the main physical variable of interest is the electric current in the lines (in amperes), because an overload is detected when this value exceeds a line-dependent threshold.
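
The overload criterion described above can be illustrated with a minimal sketch. It is not the code used by the platform: the function names, the contingency labels and the numeric values are all made up for illustration, and the currents are assumed to have already been computed by a simulator for each contingency.

```python
def overloaded_lines(current_a, thermal_limit_a):
    """Return the indices of the lines whose current exceeds their own limit."""
    return [
        idx
        for idx, (current, limit) in enumerate(zip(current_a, thermal_limit_a))
        if current > limit
    ]


def screen_contingencies(currents_per_contingency, thermal_limit_a):
    """Keep only the contingencies that cause at least one overload."""
    risks = {}
    for name, current_a in currents_per_contingency.items():
        overloaded = overloaded_lines(current_a, thermal_limit_a)
        if overloaded:
            risks[name] = overloaded
    return risks


# Hypothetical 3-line grid: one threshold per line, in amperes
thermal_limit_a = [400.0, 250.0, 600.0]

# Hypothetical simulated currents (amperes) after each single disconnection
currents_per_contingency = {
    "line_0_disconnected": [0.0, 310.0, 520.0],
    "line_1_disconnected": [380.0, 0.0, 590.0],
}

risks = screen_contingencies(currents_per_contingency, thermal_limit_a)
print(risks)
```

In this toy case only the first contingency is reported, because it pushes line 1 above its 250 A threshold. In practice the costly part is producing the currents for every contingency, which is why computation time matters.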

First we need to create the associated benchmark:

In [2]:
from lips.neurips_benchmark import NeuripsBenchmark1
neurips_benchmark1 = NeuripsBenchmark1(path_benchmark=path_benchmark)

Then we need to generate the data. Note that this step needs to be done only once (by the person in charge of releasing the benchmark).

People who want to use this benchmark are then free to load the data (see the next notebook for an example).

**NB** To generate the datasets, you need extra libraries (for example `grid2op`, `lightsim2grid` or `leap_net`). These libraries are not required to use this platform once the data are generated.
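
Before starting a long generation run, it can help to check that these extra libraries are importable. The snippet below is a small sketch using only the standard library. It assumes that the import names match the package names quoted above, and `missing_packages` is a helper introduced here for illustration, not part of the platform.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names that cannot be found by the import system."""
    return [name for name in names if find_spec(name) is None]


# Libraries mentioned above as examples of data generation dependencies
GENERATION_DEPS = ["grid2op", "lightsim2grid", "leap_net"]

missing = missing_packages(GENERATION_DEPS)
if missing:
    print("Install before generating data:", ", ".join(missing))
else:
    print("All listed generation dependencies were found.")
```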

**NB** This process takes a long time (roughly 20 minutes of generation in total for the run whose output is shown below).

In [3]:
neurips_benchmark1.generate(nb_sample_train=int(1e5),
                            nb_sample_val=int(1e4),
                            nb_sample_test=int(1e4),
                            nb_sample_test_ood_topo=int(1e4),
                           )

Deleting path reference_data/NeuripsBenchmark1 that might contain previous runs
Creating path reference_data/NeuripsBenchmark1 to save the current data


train: 100%|██████████| 100000/100000 [11:33<00:00, 144.24it/s]


Creating the path /home/donnotben/Documents/lips/reference_data/NeuripsBenchmark1/train to store the dataset name train


val: 100%|██████████| 10000/10000 [00:53<00:00, 187.12it/s]


Creating the path /home/donnotben/Documents/lips/reference_data/NeuripsBenchmark1/val to store the dataset name val


test: 100%|██████████| 10000/10000 [01:39<00:00, 100.37it/s]


Creating the path /home/donnotben/Documents/lips/reference_data/NeuripsBenchmark1/test to store the dataset name test


test_ood_topo: 100%|██████████| 10000/10000 [05:28<00:00, 30.42it/s]


Creating the path /home/donnotben/Documents/lips/reference_data/NeuripsBenchmark1/test_ood_topo to store the dataset name test_ood_topo
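
The log above shows one folder per dataset (`train`, `val`, `test` and `test_ood_topo`) under `reference_data/NeuripsBenchmark1`. As a quick sanity check after generation, the sketch below verifies that these folders exist. It only inspects the directory layout seen in the log, it makes no assumption about the files stored inside, and `find_missing_datasets` is a helper introduced here for illustration.

```python
import os

# Dataset folder names as they appear in the generation log
EXPECTED_DATASETS = ("train", "val", "test", "test_ood_topo")


def find_missing_datasets(benchmark_dir, expected=EXPECTED_DATASETS):
    """Return the expected dataset folders that are absent from benchmark_dir."""
    return [
        name
        for name in expected
        if not os.path.isdir(os.path.join(benchmark_dir, name))
    ]


# Relative path taken from the log; adapt it if path_benchmark was changed
benchmark_dir = os.path.join("reference_data", "NeuripsBenchmark1")
missing = find_missing_datasets(benchmark_dir)
print("Missing dataset folders:", missing if missing else "none")
```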
