# Training a dense neural network to imitate a simulator

In this notebook, we explain how to use our package to train a simple neural network to imitate the output of an AC power flow simulator.

In [1]:
from torch.utils.data import DataLoader
import numpy as np
import tqdm
import jax
import jax.numpy as jnp

%load_ext autoreload
%autoreload 2

import sys; sys.path.insert(0, '../../..')
import ml4ps as mp

## Downloading a dataset

First of all, we need to download a dataset. We propose to download a small dataset of power grids derived from the case60nordic file (also known as nordic32), randomly generated using [powerdatagen](https://github.com/bdonon/powerdatagen).

The dataset is available on Zenodo [here](https://zenodo.org/record/7077699). The following cell relies on the `zenodo_get` command-line tool: it downloads the dataset if `data/case60/` does not exist yet, and does nothing otherwise. Please be patient, as the download may take several minutes (though no more than 10).

In [2]:
%%bash
if [ ! -d data/case60/ ]
then
    zenodo_get '10.5281/zenodo.7077699' -o data/
    unzip -qq data/case60.zip -d data/
    rm data/case60.zip data/md5sums.txt
fi

## Backend instantiation

We need to import a backend, which serves to read power grid data. In more complex problems, it is also used to perform power grid simulations.

In this case, we are considering a dataset of .json files that can be read by pandapower. We thus choose the backend that uses pandapower.

In [3]:
backend = mp.PandaPowerBackend()

In [4]:
train_dir = 'data/case60/train'

## Building a normalizer

In [5]:
normalizer = mp.Normalizer(data_dir=train_dir, backend=backend)

Loading power grids.: 100%|██████████| 100/100 [00:04<00:00, 20.52it/s]
Extracting features.: 100%|██████████| 100/100 [00:00<00:00, 110.63it/s]
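
To give an intuition of what a normalizer does, here is a minimal sketch that assumes plain standardization: subtracting the mean and dividing by the standard deviation, both estimated on the training set. The class `StandardNormalizer` and the toy data are hypothetical and written for illustration only; `mp.Normalizer` may well rely on a different transform.

```python
import numpy as np

class StandardNormalizer:
    """Hypothetical normalizer: standardizes each feature of each object class."""

    def __init__(self, samples):
        # samples: list of nested dicts {object_class: {feature_name: 1D array}}
        self.stats = {}
        for obj_class in samples[0]:
            self.stats[obj_class] = {}
            for feat in samples[0][obj_class]:
                values = np.concatenate([s[obj_class][feat] for s in samples])
                std = values.std()
                # Guard against constant features to avoid dividing by zero
                self.stats[obj_class][feat] = (values.mean(), std if std > 0 else 1.0)

    def __call__(self, sample):
        return {
            obj_class: {
                feat: (values - self.stats[obj_class][feat][0]) / self.stats[obj_class][feat][1]
                for feat, values in feats.items()
            }
            for obj_class, feats in sample.items()
        }

# Toy "training set" made of two samples (made-up values)
train_samples = [
    {'load': {'p_mw': np.array([10., 20.])}},
    {'load': {'p_mw': np.array([30., 40.])}},
]
toy_normalizer = StandardNormalizer(train_samples)
normalized = toy_normalizer(train_samples[0])
print(normalized['load']['p_mw'])
```

The statistics are computed once on the training data and then reused for every sample, including test samples, so that the model always sees inputs on the same scale.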


## Building a train set and a data loader

The normalizer is passed to the dataset, which is in turn wrapped by the data loader, so that ...

In [6]:
train_set = mp.PowerGridDataset(data_dir=train_dir, backend=backend, normalizer=normalizer)
train_loader = DataLoader(train_set,
                          batch_size=1,
                          shuffle=True,
                          num_workers=2,
                          collate_fn=mp.collate_power_grid,
                          prefetch_factor=8)
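
The `collate_fn` argument tells the `DataLoader` how to assemble individual samples into a batch. As an illustration, the following sketch stacks nested dictionaries of arrays along a new leading batch dimension. The function `stack_nested` is hypothetical and is not the implementation of `mp.collate_power_grid`; it only assumes that every sample shares the same nested structure and array shapes.

```python
import numpy as np

def stack_nested(samples):
    """Stack a list of identically structured nested dicts of arrays."""
    first = samples[0]
    if isinstance(first, dict):
        # Recurse into each key, gathering the matching sub-tree of every sample
        return {key: stack_nested([s[key] for s in samples]) for key in first}
    # Leaves are arrays: add a leading batch dimension
    return np.stack(samples, axis=0)

sample_a = {'line': {'address': {'from_bus': np.array([0, 0, 1])}}}
sample_b = {'line': {'address': {'from_bus': np.array([0, 1, 2])}}}
batch = stack_nested([sample_a, sample_b])
print(batch['line']['address']['from_bus'].shape)  # (2, 3)
```

Under this assumption, a batch of size 1 would still carry a leading dimension of size 1, which would be consistent with doubly bracketed arrays when a batch is displayed.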

In [7]:
x, nets = next(iter(train_loader))

In [8]:
x['line']

{'address': {'from_bus': array([[ 0,  0,  1,  1, 12, 12, 17, 17, 34, 34, 34, 34, 45, 45, 45, 56,
          56, 59, 59, 18, 18, 18, 18, 15, 15, 19, 19, 22, 20, 20, 24, 24,
          25, 24, 26, 21, 27, 21, 28,  5,  5,  6,  6,  7,  7,  7, 29, 29,
          30, 30, 30,  8, 11, 13, 13,  3,  3]]),
  'to_bus': array([[12, 12, 16, 16, 16, 16, 23, 23, 56, 56, 58, 58, 57, 57, 58, 57,
          57,  2,  2, 15, 19, 20,  3, 20,  3, 21, 22,  6, 24, 24, 21, 25,
           5, 26,  5, 27,  6, 28, 29, 29, 11,  7, 29, 29,  8,  9, 30, 30,
          10, 10, 13,  9, 13, 14, 14,  4,  4]]),
  'name': array([[ 83,  84,  95, 106, 117, 128, 136, 137, 138, 139,  85,  86,  87,
           88,  89,  90,  91,  92,  93,  94,  96,  97,  98,  99, 100, 101,
          102, 103, 104, 105, 107, 108, 109, 110, 111, 112, 113, 114, 115,
          116, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 129, 130,
          131, 132, 133, 134, 135]])},
 'features': {'c_nf_per_km': array([[-0.58 , -0.58 , -0.44 , -0.44 , -0.72 , -

In [9]:
nets[0].line

Unnamed: 0,name,std_type,from_bus,to_bus,length_km,r_ohm_per_km,x_ohm_per_km,c_nf_per_km,g_us_per_km,max_i_ka,df,parallel,type,in_service,max_loading_percent
0,,,0,2,1.0,1.69,11.83,259.997022,0.0,0.777202,1.0,1,ol,True,100.0
1,,,0,2,1.0,1.69,11.83,259.997022,0.0,0.777202,1.0,1,ol,True,100.0
2,,,1,23,1.0,2.370056,15.21,340.007696,0.0,0.777202,1.0,1,ol,True,100.0
3,,,1,23,1.0,2.370056,15.21,340.007696,0.0,0.777202,1.0,1,ol,True,100.0
4,,,2,23,1.0,1.179958,8.45,190.006517,0.0,0.777202,1.0,1,ol,True,100.0
5,,,2,23,1.0,1.179958,8.45,190.006517,0.0,0.777202,1.0,1,ol,True,100.0
6,,,24,3,1.0,5.07,33.8,570.000715,0.0,0.777202,1.0,1,ol,True,100.0
7,,,24,3,1.0,5.07,33.8,570.000715,0.0,0.777202,1.0,1,ol,True,100.0
8,,,4,6,1.0,1.69,10.14,229.993019,0.0,0.777202,1.0,1,ol,True,100.0
9,,,4,6,1.0,1.69,10.14,229.993019,0.0,0.777202,1.0,1,ol,True,100.0
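
The two outputs above present line data in two forms: a nested dictionary of arrays on one side, and a pandapower table on the other. To get a quick overview of such a nested dictionary without printing every value, one can walk it recursively and report array shapes. The helper `describe` below is hypothetical, and the toy dictionary only mimics the layout seen in `x['line']`.

```python
import numpy as np

def describe(tree, prefix=''):
    """Return a list of (path, shape) pairs for every array in a nested dict."""
    entries = []
    for key, value in tree.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            entries.extend(describe(value, path))
        else:
            entries.append((path, np.shape(value)))
    return entries

# Toy stand-in for x['line']: one sample in the batch, 57 lines
toy_x = {'line': {'address': {'from_bus': np.zeros((1, 57), dtype=int),
                              'to_bus': np.zeros((1, 57), dtype=int)},
                  'features': {'c_nf_per_km': np.zeros((1, 57))}}}
for path, shape in describe(toy_x):
    print(path, shape)
```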


## Building a fully connected neural network

First, we need to tell the neural network which features it should take as input and which features it should output. In this case, we want it to predict the voltage magnitude at each bus.

In [10]:
input_feature_names = {'load': ['p_mw', 'q_mvar'], 'gen': ['p_mw', 'vm_pu'], 'ext_grid': ['vm_pu']}
output_feature_names = {'bus': ['res_vm_pu']}

In [11]:
data_structure = {
    'load': {'input_feature_names': ['p_mw', 'q_mvar']},
    'gen': {'input_feature_names': ['p_mw', 'vm_pu']},
    'ext_grid': {'input_feature_names': ['vm_pu']},
    'bus': {'output_feature_names': ['res_vm_pu']}
}

Since we are working with a fully connected neural network, we need to pass a sample to the constructor, so that it knows how many objects of each class are present in the data. Fully connected neural networks can only take fixed-size vectors as input, so the number of objects determines the dimensions with which the weights are initialized.

In [12]:
x, nets = next(iter(train_loader))

fully_connected = mp.FullyConnected(
    x=x,
    data_structure=data_structure,
    hidden_dimensions=[1024,1024])
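
To make the dimension argument concrete, here is a minimal sketch of how a sample could be flattened into a single vector, and how the size of that vector fixes the shape of the first weight matrix. The helpers `flatten_inputs` and `init_mlp`, the toy sample and the assumed number of buses are all hypothetical; they illustrate the idea and are not how `mp.FullyConnected` is implemented.

```python
import numpy as np

toy_structure = {
    'load': ['p_mw', 'q_mvar'],
    'gen': ['p_mw', 'vm_pu'],
}

def flatten_inputs(sample, structure):
    """Concatenate the selected features of all objects into one vector."""
    parts = [sample[obj_class][feat]
             for obj_class, feats in structure.items()
             for feat in feats]
    return np.concatenate(parts)

def init_mlp(dimensions, seed=0):
    """Create one (weight, bias) pair per layer for the given layer sizes."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(scale=np.sqrt(2.0 / d_in), size=(d_in, d_out)), np.zeros(d_out))
            for d_in, d_out in zip(dimensions[:-1], dimensions[1:])]

# Toy sample with 3 loads and 2 generators
toy_sample = {
    'load': {'p_mw': np.ones(3), 'q_mvar': np.ones(3)},
    'gen': {'p_mw': np.ones(2), 'vm_pu': np.ones(2)},
}
vector = flatten_inputs(toy_sample, toy_structure)
n_bus = 4  # assumed number of buses, i.e. number of outputs
weights = init_mlp([vector.size, 1024, 1024, n_bus])
print(vector.size, [w.shape for w, _ in weights])
```

Changing the number of loads or generators changes `vector.size`, and therefore the shape of the first weight matrix. This is why such a model cannot be reused as is on a grid with a different number of objects.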

In addition, we need to specify post-processing functions, so that the model starts training with outputs in a reasonable range. Here, we know that voltage magnitudes should be around 1 p.u., so we post-process the neural network output by adding an offset of 1.

In [13]:
functions = {'bus': {'features': {'res_vm_pu': [mp.AffineTransform(offset=1.)]}}}
postprocessor = mp.PostProcessor(functions=functions)
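
The effect of such an offset can be illustrated in a few lines. The function `affine` below is a hypothetical stand-in that assumes the transform computes `slope * x + offset`; the exact signature and behavior of `mp.AffineTransform` are not shown in this notebook.

```python
import numpy as np

def affine(x, slope=1.0, offset=0.0):
    """Hypothetical affine post-processing: slope * x + offset."""
    return slope * x + offset

# A freshly initialized network typically outputs values close to 0 (made-up values)
raw_output = np.array([0.02, -0.01, 0.00])
vm_pu = affine(raw_output, offset=1.0)
print(vm_pu)  # values close to 1 p.u.
```

Starting around 1 p.u. means the network only has to learn small deviations from the nominal voltage, rather than the nominal value itself.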

## Training loop

Here, we propose to train our neural network using the Adam optimizer. The loss function is the mean squared error between the post-processed neural network prediction and the output of the simulator.

In [14]:
from jax.example_libraries import optimizers

learning_rate = 3e-4
opt_init, opt_update, get_params = optimizers.adam(learning_rate)
opt_state = opt_init(fully_connected.weights)

In [15]:
def loss_function(params, x, y):
    y_hat = fully_connected.forward_batch(params, x)
    y_post = postprocessor(y_hat)
    loss = jnp.mean((y_post['bus']['features']['res_vm_pu'] - y['bus']['features']['res_vm_pu'])**2)
    # TODO: get rid of the 'features' level
    return loss

@jax.jit
def update(params, x, y, opt_state, step):
    loss, grads = jax.value_and_grad(loss_function)(params, x, y)
    opt_state = opt_update(step, grads, opt_state)
    return get_params(opt_state), opt_state, loss
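
For readers unfamiliar with Adam, the sketch below reimplements a single parameter update in NumPy and applies it to a toy least-squares problem. It is illustrative only: the function `adam_step` and the toy data are hypothetical, the hyperparameters `b1`, `b2` and `eps` are the commonly used defaults, and the notebook itself relies on `jax.example_libraries.optimizers.adam` rather than on this code.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=3e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count."""
    m = b1 * m + (1 - b1) * grad          # running mean of the gradients
    v = b2 * v + (1 - b2) * grad ** 2     # running mean of the squared gradients
    m_hat = m / (1 - b1 ** t)             # bias correction for the zero initialization
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem: fit w so that w * x matches y = 2 * x
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = np.mean(2 * (w * x - y) * x)   # gradient of the mean squared error
    w, m, v = adam_step(w, grad, m, v, t, lr=1e-2)
print(w)  # should be close to 2
```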

In [16]:
step = 0
for epoch in range(2):
    for x, nets in (pbar := tqdm.tqdm(train_loader)):
        step += 1
        
        backend.run_batch(nets) # AC power flow simulation 
        y = backend.get_data_batch(nets, data_structure={'bus':{'feature_names': ['res_vm_pu']}}) # Ground truth extraction
        fully_connected.weights, opt_state, loss = update(fully_connected.weights, x, y, opt_state, step)
        
        pbar.set_description("Epoch {}, Loss = {:.2e}".format(epoch, loss))

Epoch 0, Loss = 9.44e-04: 100%|██████████| 10000/10000 [05:21<00:00, 31.12it/s]
Epoch 1, Loss = 2.49e-03: 100%|██████████| 10000/10000 [06:00<00:00, 27.73it/s]
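
With `batch_size=1`, the value shown in the progress bar is the loss of the last sample only, so it can fluctuate a lot from one step to the next. This may explain why the value displayed at the end of epoch 1 is higher than at the end of epoch 0. One option is to display an exponential moving average instead. The class `RunningAverage` below is a hypothetical helper, not part of `ml4ps`.

```python
class RunningAverage:
    """Exponential moving average with bias correction."""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.count += 1
        self.total = self.decay * self.total + (1 - self.decay) * float(value)
        # Dividing by (1 - decay**count) removes the bias towards the initial 0
        return self.total / (1 - self.decay ** self.count)

smoothed = RunningAverage(decay=0.9)
for loss in [1.0, 1.0, 1.0]:
    value = smoothed.update(loss)
print(value)  # 1.0 up to rounding, since all inputs are equal
```

In the training loop, `loss` would be passed to `update` at every step, and the returned value would be used in `pbar.set_description` instead of the raw loss.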


## Testing the model

We now wish to take a look at how well our model performs on the test data. In this notebook we propose to plot the ground truth versus the prediction for a sample of data.

In [17]:
test_dir = 'data/case60/test'
test_set = mp.PowerGridDataset(data_dir=test_dir, backend=backend, normalizer=normalizer)
test_loader = DataLoader(test_set,
                         batch_size=8,
                         shuffle=True,
                         num_workers=2,
                         collate_fn=mp.collate_power_grid,
                         prefetch_factor=8)

In [18]:
x, nets = next(iter(test_loader))

In [None]:
# Perform prediction
y_hat = fully_connected.forward_batch(fully_connected.weights, x)
y_post = postprocessor(y_hat)
y_post = np.reshape(y_post['bus']['features']['res_vm_pu'], [-1])

# Get ground truth (run the AC power flow first, as in the training loop)
backend.run_batch(nets)
y_truth = backend.get_data_batch(nets, data_structure={'bus':{'feature_names': ['res_vm_pu']}})
y_truth = np.reshape(y_truth['bus']['features']['res_vm_pu'], [-1])

# Compare results
import matplotlib.pyplot as plt
plt.scatter(y_truth, y_post)
plt.xlabel('Ground truth')
plt.ylabel('Prediction')
plt.show()
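
A scatter plot gives a qualitative picture; a few scalar metrics can complement it. The sketch below uses made-up arrays in place of `y_truth` and `y_post`, and the function `regression_metrics` is hypothetical.

```python
import numpy as np

def regression_metrics(truth, prediction):
    """Return the mean absolute error, RMSE and maximum absolute error."""
    error = np.asarray(prediction) - np.asarray(truth)
    return {
        'mae': float(np.mean(np.abs(error))),
        'rmse': float(np.sqrt(np.mean(error ** 2))),
        'max_abs': float(np.max(np.abs(error))),
    }

# Made-up voltage magnitudes in p.u., for illustration only
toy_truth = np.array([1.00, 1.02, 0.98, 1.01])
toy_prediction = np.array([1.01, 1.02, 0.97, 1.03])
metrics = regression_metrics(toy_truth, toy_prediction)
print(metrics)
```

The maximum absolute error is worth watching alongside the averages, since a model can be accurate on most buses while being far off on a few of them.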

In [None]:
y_hat

In [None]:
normalizer.functions