# Example Run of an Augmented Simulator (Tensorflow)

This tutorial notebook explains how to install the required packages and how to test the implemented augmented simulators using the LIPS platform.

**A quick walkthrough:**

- Install the required packages using the `requirements.txt` file in the GitHub repository for the relevant use case.

- Some baselines are already implemented in the LIPS platform and can serve as inspiration.

- The hyperparameters of an augmented simulator can be modified through dedicated configuration files.

- The LIPS platform is used to evaluate the trained augmented simulators against several categories of evaluation criteria and to assign a score to each run.

## How to implement your own augmented simulator

In the following, we show three ways to implement an augmented simulator (based on ML or a hybrid physics-AI model):

1- Use an existing augmented simulator (baseline) from the LIPS platform, train it, and then evaluate the results;

2- Implement an augmented simulator using the LIPS framework template to take advantage of the existing training loop and other features;

3- Implement an augmented simulator independently of the LIPS platform and plug the trained model into LIPS to evaluate its results.

Accordingly, this notebook is organized as follows:
1. [Generic step (Load the required data)](#generic_step)
2. [Evaluate an existing augmented simulator](#existing_sim) (Beginner users)
3. [Train and evaluate a custom augmented simulator developed using the LIPS framework](#train_using_lips) (Intermediate level users)
4. [Train a custom augmented simulator independently of LIPS and use the framework to evaluate the final results](#train_custom) (Advanced users)

### Prerequisites

Install the LIPS framework if you have not already done so. For more information, see the LIPS framework [GitHub repository](https://github.com/IRT-SystemX/LIPS).

#### For developments on local machine

In [None]:
### Install a virtual environment
# Option 1:  using conda (recommended)
!conda create -n venv_lips python=3.10
!conda activate venv_lips

# Option 2: using virtualenv
!pip install virtualenv
!virtualenv -p /usr/bin/python3.10 venv_lips
!source venv_lips/bin/activate

### Install the LIPS framework
# Option 1: Get the latest version of the LIPS framework from PyPI (Recommended)
!pip install 'lips-benchmark[recommended]'

# Option 2: Get the latest version from the GitHub repository
!git clone https://github.com/IRT-SystemX/LIPS.git
!pip install -U LIPS/.[recommended]

#### For Google Colab Users
You can also use a GPU by selecting `T4 GPU` under `Runtime > Change runtime type`.

In [None]:
### Install the LIPS framework
# Option 1: Get the latest version of the LIPS framework from PyPI (Recommended)
!pip install 'lips-benchmark[recommended]'

In [None]:
# Option 2: Get the latest version from the GitHub repository
!git clone https://github.com/IRT-SystemX/LIPS.git
!pip install -U LIPS/.[recommended]

Attention: You may need to restart the session after this installation for the changes to take effect.
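
After restarting, a quick check can confirm that the packages are visible to the current interpreter. The sketch below uses only the standard library; the helper name `is_importable` is illustrative and not part of LIPS. It assumes the package is imported as `lips`, as in the import statements used later in this notebook.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report whether the packages used in this notebook can be found.
for module_name in ("lips", "tensorflow"):
    status = "found" if is_importable(module_name) else "missing"
    print(f"{module_name}: {status}")
```

If a package is reported as missing right after installation, restarting the session and re-running the check is usually the first thing to try.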

In [None]:
# Clone the starting kit
!git clone https://github.com/IRT-SystemX/ml4physim_startingkit_powergrid.git
# and change the working directory to the starting kit so that this notebook runs correctly
import os
os.chdir("ml4physim_startingkit_powergrid")

### Generic Step (Load the required data) <a id='generic_step'></a>

Download the dataset

The datasets already provided in the starting kit are demo versions of the complete datasets. The complete datasets should be downloaded using the following function, and they replace the demo versions.

**NB.** <span style="color: red">The challenge dataset is based on the `lips_idf_2023` environment, and all solutions should be trained and evaluated on this dataset.</span> Executing the following cell will replace the demo dataset with the complete dataset.

In [None]:
## Download the dataset through the dedicated lips function
from lips.dataset.powergridDataSet import downloadPowergridDataset

downloadPowergridDataset("input_data_local", "lips_idf_2023")

In [None]:
# Define the required paths
import pathlib
import os
DATA_PATH = pathlib.Path().resolve() / "input_data_local" / "lips_idf_2023"
BENCH_CONFIG_PATH = pathlib.Path().resolve() / "configs" / "benchmarks" / "lips_idf_2023.ini"
SIM_CONFIG_PATH = pathlib.Path().resolve() / "configs" / "simulators"
TRAINED_MODELS = pathlib.Path().resolve() / "input_data_local" / "trained_models"
LOG_PATH = "logs.log"
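
Before loading the data, it can help to confirm that these locations exist on disk. The following is a minimal sketch using only `pathlib`; the helper `missing_paths` is a hypothetical name introduced for illustration and is not part of LIPS, and the demo paths are placeholders.

```python
import pathlib


def missing_paths(paths: dict) -> list:
    """Return the names whose path does not exist on disk."""
    return [name for name, path in paths.items()
            if not pathlib.Path(path).exists()]


# Demo with placeholder paths: the current directory exists,
# the second entry is deliberately a path that should not exist.
demo_paths = {
    "current_dir": pathlib.Path.cwd(),
    "placeholder": pathlib.Path.cwd() / "definitely_missing_dir_12345",
}
print(missing_paths(demo_paths))
```

In this notebook it could be called as `missing_paths({"DATA_PATH": DATA_PATH, "BENCH_CONFIG_PATH": BENCH_CONFIG_PATH})`; an empty list means every path was found.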

Loading the dataset with the dedicated class used by the LIPS platform offers several advantages:

1. It simplifies importing the datasets
1. It provides a set of functions to organize the `inputs` and `outputs` required by augmented simulators


In [None]:
# Load the required benchmark datasets
from lips.benchmark.powergridBenchmark import PowerGridBenchmark

benchmark_kwargs = {"attr_x": ("prod_p", "prod_v", "load_p", "load_q"),
                    "attr_y": ("a_or", "a_ex", "p_or", "p_ex", "v_or", "v_ex"),
                    "attr_tau": ("line_status", "topo_vect"),
                    "attr_physics": None}

benchmark = PowerGridBenchmark(benchmark_name="Benchmark_competition",
                               benchmark_path=DATA_PATH,
                               load_data_set=True,
                               log_path=None,
                               config_path=BENCH_CONFIG_PATH,
                               **benchmark_kwargs
                              )

TensorFlow users can use the commands below to select a GPU from the available physical devices.

In [None]:
# Use a GPU
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to only use the first GPU
    try:
        tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
    except RuntimeError as e:
        # Visible devices must be set at program startup
        print(e)

In [None]:
# see the list of selected devices
tf.config.experimental.get_visible_devices()

### Option-I (Evaluate an existing augmented simulator) <a id='existing_sim'></a>
**<font color='red'>For beginners.</font>**

We start by importing an architecture from the existing set of architectures and instantiating the `TfFullyConnected` class, which offers a set of utilities to train and analyze the selected augmented simulator. Users can edit the configuration file of an existing augmented simulator to modify the model hyperparameters.

The configuration file can be found at `configs/simulators/tf_fc.ini`:

```output
[DEFAULT]
name = "tf_fc"
layers = (300, 300, 300, 300)
activation = "relu"
layer = "linear"
input_dropout = 0.0
dropout = 0.0
metrics = ["mae"]
loss = {"name": "mse",
        "params": {"size_average": None,
                   "reduce": None,
                   "reduction": 'mean'}}
device = "cpu"
optimizer = {"name": "adam",
             "params": {"lr": 3e-4}}
train_batch_size = 128
eval_batch_size = 128
epochs = 5
shuffle = True
save_freq = False
ckpt_freq = 50
```

The example below uses the configuration provided in the `[DEFAULT]` section. A new configuration can be created by adding a section with a new name and overriding the existing parameters.
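
To illustrate how a named section relates to `[DEFAULT]`, here is a minimal sketch based on Python's standard `configparser`. LIPS reads these files through its own `ConfigManager`, so this only demonstrates generic INI semantics and is not the library's implementation. The section name `CONFIG_SMALL`, its values, and the helper `read_section` are hypothetical.

```python
import ast
import configparser

# A shortened configuration with one extra, hypothetical section.
INI_TEXT = """
[DEFAULT]
name = "tf_fc"
layers = (300, 300, 300, 300)
activation = "relu"
epochs = 5

[CONFIG_SMALL]
layers = (100, 100)
epochs = 20
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)


def read_section(section_name: str) -> dict:
    """Return the options of a section as Python objects.

    Options missing from the section fall back to the [DEFAULT] values.
    """
    section = parser[section_name]
    return {key: ast.literal_eval(value) for key, value in section.items()}


small = read_section("CONFIG_SMALL")
print(small["layers"])      # overridden in CONFIG_SMALL
print(small["activation"])  # inherited from DEFAULT
```

With this layout, only the parameters that differ from the defaults need to be written in the new section, and the section name is what would be passed as `sim_config_name`.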

In this example, we select the fully connected architecture with its corresponding configuration file.

In [None]:
# Indicate the path required for corresponding augmented simulator parameters
SIM_CONFIG_PATH = os.path.join("configs", "simulators", "tf_fc.ini")

In [None]:
from lips.augmented_simulators.tensorflow_models import TfFullyConnected
from lips.dataset.scaler import StandardScaler

tf_fc = TfFullyConnected(name="tf_fc",
                         bench_config_path=BENCH_CONFIG_PATH,
                         bench_config_name="Benchmark_competition",
                         bench_kwargs=benchmark_kwargs,
                         sim_config_path=SIM_CONFIG_PATH,
                         sim_config_name="DEFAULT",
                         scaler=StandardScaler,
                         log_path=LOG_PATH)

Train the augmented simulator using the benchmark datasets.

In [None]:
tf_fc.train(train_dataset=benchmark.train_dataset,
            val_dataset=benchmark.val_dataset,
            epochs=10
           )

#### Save & Load
You can also save and load the fitted model parameters along with the model metadata using the following functions.

Save your model

In [None]:
SAVE_PATH = TRAINED_MODELS / "fully_connected"
tf_fc.save(SAVE_PATH)

Load your trained augmented simulator

In [None]:
from lips.augmented_simulators.tensorflow_models import TfFullyConnected
from lips.dataset.scaler import StandardScaler

tf_fc = TfFullyConnected(name="tf_fc",
                         bench_config_path=BENCH_CONFIG_PATH,
                         bench_config_name="Benchmark_competition",
                         bench_kwargs=benchmark_kwargs,
                         sim_config_path=SIM_CONFIG_PATH,
                         sim_config_name="DEFAULT",
                         scaler=StandardScaler,
                         log_path=LOG_PATH)

LOAD_PATH = TRAINED_MODELS / "fully_connected"
tf_fc.restore(path=LOAD_PATH)

#### Evaluation

Visualize the convergence

In [None]:
tf_fc.visualize_convergence()

Finally, the trained augmented simulator can be evaluated using the `evaluate_simulator` function of the `Benchmark` class. You can choose the dataset on which to evaluate your trained augmented simulator. The possibilities are `all`, `val`, `test`, and `test_ood_topo`.

In [None]:
# EVAL_SAVE_PATH = get_path(EVALUATION_PATH, benchmark1)
tf_sim_metrics = benchmark.evaluate_simulator(augmented_simulator=tf_fc,
                                              eval_batch_size=100000,
                                              dataset="all",
                                              shuffle=False,
                                              save_path=None,
                                              save_predictions=False
                                             )

You can see how your model performs by looking at the evaluation metrics returned by the previous step.

In [None]:
tf_sim_metrics["test"]
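
Nested results can be hard to scan in a notebook. The sketch below flattens a nested dictionary into one line per metric. It assumes the returned metrics are nested dictionaries keyed by dataset name, as the `tf_sim_metrics["test"]` access suggests; the helper `flatten_metrics` and the placeholder structure are illustrative and do not reflect actual LIPS output.

```python
def flatten_metrics(metrics: dict, prefix: str = "") -> dict:
    """Flatten a nested dictionary into {"a/b/c": value} pairs."""
    flat = {}
    for key, value in metrics.items():
        path = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, extending the key path.
            flat.update(flatten_metrics(value, path))
        else:
            flat[path] = value
    return flat


# Placeholder structure and values, for illustration only (not real results).
dummy_metrics = {
    "category_a": {"metric_1": {"var_x": 0.0, "var_y": 0.0}},
    "category_b": {"metric_2": 0.0},
}
for name, value in flatten_metrics(dummy_metrics).items():
    print(f"{name}: {value}")
```

In this notebook it could be applied as `flatten_metrics(tf_sim_metrics["test"])`.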

### Option-II (Train and evaluate a new augmented simulator using LIPS platform) <a id='train_using_lips'></a>

**<font color='red'>For intermediate level users.</font>**

In [None]:
"""
Tensorflow based fully connected
"""
import os
import pathlib
from typing import Union
import json
import warnings

import numpy as np
# from leap_net import ResNetLayer

with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=FutureWarning)
    from tensorflow import keras

from lips.augmented_simulators.tensorflow_simulator import TensorflowSimulator
from lips.logger import CustomLogger
from lips.config import ConfigManager
from lips.dataset import DataSet
from lips.dataset.scaler import Scaler
from lips.utils import NpEncoder

class MyCustomFullyConnected(TensorflowSimulator):
    """Fully Connected architecture
    Parameters
    ----------
    sim_config_path : ``str``
        The path to the configuration file for simulator.
        It should contain all the required hyperparameters for this model.
    sim_config_name : Union[str, None], optional
        _description_, by default None
    name : Union[str, None], optional
        _description_, by default None
    scaler : Union[Scaler, None], optional
        _description_, by default None
    bench_config_path : Union[str, pathlib.Path, None], optional
        _description_, by default None
    bench_config_name : Union[str, None], optional
        _description_, by default None
    log_path : Union[None, str], optional
        _description_, by default None
    Raises
    ------
    RuntimeError
        _description_
    """
    def __init__(self,
                 sim_config_path: str,
                 bench_config_path: Union[str, pathlib.Path],
                 bench_config_name: Union[str, None]=None,
                 bench_kwargs: dict={},
                 sim_config_name: Union[str, None]=None,
                 name: Union[str, None]=None,
                 scaler: Union[Scaler, None]=None,
                 log_path: Union[None, str]=None,
                 **kwargs):
        super().__init__(name=name, log_path=log_path, **kwargs)
        if not os.path.exists(sim_config_path):
            raise RuntimeError("Configuration path for the simulator not found!")
        if not str(sim_config_path).endswith(".ini"):
            raise RuntimeError("The configuration file should have `.ini` extension!")
        sim_config_name = sim_config_name if sim_config_name is not None else "DEFAULT"
        self.sim_config = ConfigManager(section_name=sim_config_name, path=sim_config_path)
        self.bench_config = ConfigManager(section_name=bench_config_name, path=bench_config_path)
        self.bench_config.set_options_from_dict(**bench_kwargs)
        self.name = name if name is not None else self.sim_config.get_option("name")
        self.name = self.name + '_' + sim_config_name
        # scaler
        self.scaler = scaler() if scaler else None
        # Logger
        self.log_path = log_path
        self.logger = CustomLogger(__class__.__name__, log_path).logger
        # model parameters
        self.params = self.sim_config.get_options_dict()
        self.params.update(kwargs)
        # Define layer to be used for the model
        self.layers = {"linear": keras.layers.Dense}#, "resnet" : ResNetLayer}
        self.layer = self.layers.get(self.params["layer"], None)
        if self.layer is None:
            self.layer = keras.layers.Dense 

        # optimizer
        if "optimizer" in kwargs:
            if not isinstance(kwargs["optimizer"], keras.optimizers.Optimizer):
                raise RuntimeError("If an optimizer is provided, it should be an instance of tensorflow.keras.optimizers.Optimizer")
            # the provided optimizer is already an instance, so it is used as is
            self._optimizer = kwargs["optimizer"]
        else:
            self._optimizer = keras.optimizers.Adam(learning_rate=self.params["optimizer"]["params"]["lr"])

        self._model: Union[keras.Model, None] = None

        self.input_size = kwargs.get("input_size")
        self.output_size = kwargs.get("output_size")

    def build_model(self):
        """Build the model
        Returns
        -------
        Model
            _description_
        """
        super().build_model()
        input_ = keras.layers.Input(shape=(self.input_size,), name="input")
        x = input_
        x = keras.layers.Dropout(rate=self.params["input_dropout"], name="input_dropout")(x)
        for layer_id, layer_size in enumerate(self.params["layers"]):
            x = self.layer(layer_size, name=f"layer_{layer_id}")(x)
            x = keras.layers.Activation(self.params["activation"], name=f"activation_{layer_id}")(x)
            x = keras.layers.Dropout(rate=self.params["dropout"], name=f"dropout_{layer_id}")(x)
        output_ = keras.layers.Dense(self.output_size)(x)
        self._model = keras.Model(inputs=input_,
                                  outputs=output_,
                                  name=f"{self.name}_model")
        return self._model

    def process_dataset(self, dataset: DataSet, training: bool=False) -> tuple:
        """process the datasets for training and evaluation
        This function transforms all the dataset into something that can be used by the neural network (for example)
        Warning
        -------
        It works with StandardScaler only for the moment.
        Parameters
        ----------
        dataset : DataSet
            The dataset to process
        training : bool, optional
            If True, infer the model sizes and fit the scaler on this dataset
            before transforming it, by default False
        Returns
        -------
        tuple
            the normalized dataset with features and labels
        """
        if training:
            self._infer_size(dataset)
            inputs, outputs = dataset.extract_data(concat=True)
            if self.scaler is not None:
                inputs, outputs = self.scaler.fit_transform(inputs, outputs)
        else:
            inputs, outputs = dataset.extract_data(concat=True)
            if self.scaler is not None:
                inputs, outputs = self.scaler.transform(inputs, outputs)

        return inputs, outputs

    def _infer_size(self, dataset: DataSet):
        """Infer the size of the model
        Parameters
        ----------
        dataset : DataSet
            _description_
        Returns
        -------
        None
            _description_
        """
        *dim_inputs, self.output_size = dataset.get_sizes()
        self.input_size = np.sum(dim_inputs)

    def _post_process(self, dataset, predictions):
        if self.scaler is not None:
            predictions = self.scaler.inverse_transform(predictions)
        predictions = super()._post_process(dataset, predictions)
        return predictions

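    def _save_metadata(self, path: str):
        # accept both str and pathlib.Path, as done in `_load_metadata`
        if not isinstance(path, pathlib.Path):
            path = pathlib.Path(path)
        super()._save_metadata(path)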
        if self.scaler is not None:
            self.scaler.save(path)
        res_json = {}
        res_json["input_size"] = self.input_size
        res_json["output_size"] = self.output_size
        with open((path / "metadata.json"), "w", encoding="utf-8") as f:
            json.dump(obj=res_json, fp=f, indent=4, sort_keys=True, cls=NpEncoder)

    def _load_metadata(self, path: str):
        if not isinstance(path, pathlib.Path):
            path = pathlib.Path(path)
        super()._load_metadata(path)
        if self.scaler is not None:
            self.scaler.load(path)
        with open((path / "metadata.json"), "r", encoding="utf-8") as f:
            res_json = json.load(fp=f)
        self.input_size = res_json["input_size"]
        self.output_size = res_json["output_size"]

Once the augmented simulator is implemented, you should also create a configuration file that specifies all the hyperparameters required by this augmented simulator. An example configuration file is available at `configs/simulators/tf_fc.ini`, and its content is shown below.

The path and the section name of this configuration file should be passed to your architecture as arguments (`sim_config_path`, `sim_config_name`) so that it can load all the required hyperparameters.

```output
[DEFAULT]
name = "tf_fc"
layers = (300, 300, 300, 300)
activation = "relu"
layer = "linear"
input_dropout = 0.0
dropout = 0.0
metrics = ["mae"]
loss = {"name": "mse",
        "params": {"size_average": None,
                   "reduce": None,
                   "reduction": 'mean'}}
device = "cpu"
optimizer = {"name": "adam",
             "params": {"lr": 3e-4}}
train_batch_size = 128
eval_batch_size = 128
epochs = 5
shuffle = True
save_freq = False
ckpt_freq = 50
```

Instantiate your simulator with corresponding configurations and a scaler.

In [None]:
# Indicate the path required for corresponding augmented simulator parameters
SIM_CONFIG_PATH = os.path.join("configs", "simulators", "tf_fc.ini")

# Import a scaler
from lips.dataset.scaler import StandardScaler

custom_tf_fc = MyCustomFullyConnected(name="tf_fc",
                                      bench_config_path=BENCH_CONFIG_PATH,
                                      bench_config_name="Benchmark_competition",
                                      bench_kwargs=benchmark_kwargs,
                                      sim_config_path=SIM_CONFIG_PATH,
                                      sim_config_name="DEFAULT",
                                      scaler=StandardScaler,
                                      log_path=LOG_PATH)

The `train` function is implemented in the base class `TensorflowSimulator`. You can call it directly to train your custom augmented simulator. You can also overload it and define your own training function if necessary.

In [None]:
custom_tf_fc.train(train_dataset=benchmark.train_dataset,
                   val_dataset=benchmark.val_dataset,
                   epochs=5
                  )

Finally, you can evaluate your custom augmented simulator using the evaluation module of the LIPS framework.

In [None]:
# EVAL_SAVE_PATH = get_path(EVALUATION_PATH, benchmark1)
custom_fc_metrics = benchmark.evaluate_simulator(augmented_simulator=custom_tf_fc,
                                                 eval_batch_size=100000,
                                                 dataset="all",
                                                 shuffle=False,
                                                 save_path=None,
                                                 save_predictions=False
                                                ) 

### Option-III (Train an augmented simulator independently and evaluate it through LIPS) <a id='train_custom'></a>

**<font color='red'>For advanced users.</font>**

If you require functionalities that are not offered by the LIPS platform (e.g., adding advanced regularization to the training loop, or adding physics constraints to your model), you can implement your architecture independently of the LIPS platform and use only the evaluation part of the framework to assess your model's performance.

In the following, we show a simple architecture with a training loop and how it can be evaluated by the LIPS platform.

Functions required to preprocess the data and post-process the predictions:

In [None]:
import numpy as np
from lips.dataset import DataSet

def process_dataset(dataset: DataSet, training: bool=False, scaler=None) -> tuple:
    if training:
        inputs, outputs = dataset.extract_data(concat=True)
        if scaler is not None:
            inputs, outputs = scaler.fit_transform(inputs, outputs)
    else:
        inputs, outputs = dataset.extract_data(concat=True)
        if scaler is not None:
            inputs, outputs = scaler.transform(inputs, outputs)

    return inputs, outputs

def infer_size(dataset: DataSet):
    """Infer the size of the model
    Parameters
    ----------
    dataset : DataSet
        _description_
    Returns
    -------
    None
        _description_
    """
    *dim_inputs, output_size = dataset.get_sizes()
    input_size = np.sum(dim_inputs)
    return input_size, output_size

def post_process(dataset, predictions, scaler=None):
    if scaler is not None:
        predictions = scaler.inverse_transform(predictions)
    predictions = dataset.reconstruct_output(predictions)
    return predictions
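
The helpers above rely on a scaler that is fitted on the training data only, reused unchanged for the other datasets, and inverted on the predictions. The following pure-Python sketch illustrates that pattern on a single feature. `TinyStandardScaler` is a toy written for this illustration, not the LIPS `StandardScaler`, and the numbers are arbitrary.

```python
import statistics


class TinyStandardScaler:
    """Illustrative one-feature scaler; not the LIPS `StandardScaler`."""

    def fit_transform(self, values):
        # Statistics are estimated once, from the data passed here.
        self.mean = statistics.fmean(values)
        self.std = statistics.pstdev(values) or 1.0  # avoid division by zero
        return self.transform(values)

    def transform(self, values):
        return [(v - self.mean) / self.std for v in values]

    def inverse_transform(self, values):
        return [v * self.std + self.mean for v in values]


train_values = [10.0, 20.0, 30.0]
val_values = [20.0, 40.0]

tiny_scaler = TinyStandardScaler()
train_scaled = tiny_scaler.fit_transform(train_values)  # fit on training data
val_scaled = tiny_scaler.transform(val_values)          # reuse the same statistics
restored = tiny_scaler.inverse_transform(val_scaled)    # back to original units
print(restored)
```

A model trained on scaled targets produces predictions in the scaled space, which is why the same scaler has to be applied in reverse before the predictions are compared with the observations.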

##### STEP 1: Implement your architecture (based on the TensorFlow library in this example)

Your class should inherit from the `AugmentedSimulator` class of the LIPS framework and implement the following functions:

- `build_model`: design the architecture of the model;
- `train`: train the model;
- `predict`: predict using the trained model.

**NB.** This format must be respected because the evaluation module takes this class as input and measures its inference time.
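
As a rough illustration of why a fixed interface matters, the sketch below defines a toy class exposing the three methods and a generic helper that times a call to `predict`. This is a sketch under stated assumptions: `ConstantSimulator` and `timed_predict` are hypothetical names, the class does not inherit from LIPS, and the way LIPS actually measures inference time is not shown in this notebook.

```python
import time


class ConstantSimulator:
    """Toy stand-in exposing build_model, train and predict."""

    def __init__(self, output_size: int):
        self.output_size = output_size
        self._value = None

    def build_model(self):
        # The "model" is a single constant, initialized to zero.
        self._value = 0.0

    def train(self, train_targets):
        self.build_model()
        # "Training" here just memorizes the mean target.
        self._value = sum(train_targets) / len(train_targets)

    def predict(self, inputs):
        # One row of constant predictions per input sample.
        return [[self._value] * self.output_size for _ in inputs]


def timed_predict(simulator, inputs):
    """Return the predictions and the wall-clock time spent in `predict`."""
    start = time.perf_counter()
    predictions = simulator.predict(inputs)
    return predictions, time.perf_counter() - start


toy_sim = ConstantSimulator(output_size=2)
toy_sim.train([1.0, 2.0, 3.0])
toy_predictions, elapsed = timed_predict(toy_sim, inputs=[[0.0], [1.0]])
print(toy_predictions, f"{elapsed:.6f}s")
```

Because the helper only relies on the `predict` method, any class that follows the same interface can be timed in the same way.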

In [None]:
from tensorflow import keras
from lips.augmented_simulators import AugmentedSimulator

class MyFullyCustomFullyConnected(AugmentedSimulator):
    def __init__(self,
                 name: str="MyCustomFC",
                 input_size: int=None,
                 output_size: int=None,
                 hidden_sizes: tuple=(100,100),
                 ):
        self.name = name
        self.input_size = input_size
        self.output_size = output_size
        self.hidden_sizes = hidden_sizes
        self._model = None

    def build_model(self):
        input_ = keras.layers.Input(shape=(self.input_size,), name="input")
        x = input_
        for layer_id, layer_size in enumerate(self.hidden_sizes):
            x = keras.layers.Dense(layer_size, name=f"layer_{layer_id}")(x)
            x = keras.layers.Activation("relu", name=f"activation_{layer_id}")(x)
        output_ = keras.layers.Dense(self.output_size)(x)
        self._model = keras.Model(inputs=input_,
                                  outputs=output_,
                                  name=f"{self.name}_model")     

    def train(self,
              train_dataset,
              val_dataset,
              epochs=10,
              lr=3e-4,
              shuffle=True,
              batch_size=256,
              ):
        processed_x, processed_y = train_dataset
        # init the model
        self.build_model()
        optimizer = keras.optimizers.Adam(learning_rate=lr)
        self._model.compile(optimizer=optimizer,
                            loss="mse",
                            metrics=["mae"])        

        history_callback = self._model.fit(x=processed_x,
                                            y=processed_y,
                                            validation_data=val_dataset,
                                            epochs=epochs,
                                            batch_size=batch_size,
                                            shuffle=shuffle)
        return history_callback

    def predict(self, dataset: DataSet, scaler=None, eval_batch_size=128) -> dict:
        processed_x, _ = process_dataset(dataset, training=False, scaler=scaler)

        # make the predictions
        predictions = self._model.predict(processed_x, batch_size=eval_batch_size)

        # undo the scaling (if any) and reconstruct the outputs expected by LIPS
        predictions = post_process(dataset, predictions, scaler=scaler)

        return predictions

In [None]:
from lips.dataset.scaler import StandardScaler

scaler = StandardScaler()
processed_x_train, processed_y_train = process_dataset(benchmark.train_dataset, training=True, scaler=scaler)
processed_x_val, processed_y_val = process_dataset(benchmark.val_dataset, training=False, scaler=scaler)
training_data = (processed_x_train, processed_y_train)
validation_data = (processed_x_val, processed_y_val)

In [None]:
input_size, output_size = infer_size(benchmark.train_dataset)

In [None]:
model = MyFullyCustomFullyConnected(name="MyFullyCustomFC",
                                    input_size=input_size,
                                    output_size=output_size,
                                    hidden_sizes=(100,100))

In [None]:
history = model.train(training_data, validation_data, epochs=2)

##### Prediction on `test_dataset`
This dataset has the same distribution as the training set.

In [None]:
predictions = model.predict(dataset=benchmark._test_dataset, scaler=scaler)

#### Evaluation

In [None]:
# get the environment which is required for evaluation
from lips.dataset.utils.powergrid_utils import get_kwargs_simulator_scenario
from lips.benchmark.powergridBenchmark import get_env

env = get_env(get_kwargs_simulator_scenario(benchmark.config))

In [None]:
from lips.evaluation.powergrid_evaluation import PowerGridEvaluation
from pprint import pprint

evaluator = PowerGridEvaluation(benchmark.config)
metrics_test = evaluator.evaluate(observations=benchmark._test_dataset.data,
                                  predictions=predictions,
                                  dataset=benchmark._test_dataset,
                                  augmented_simulator=model,
                                  env=env)
pprint(metrics_test)

In [None]:
metrics_all = dict()
metrics_all["test"] = metrics_test

##### Prediction on `test_ood_topo_dataset`
This dataset has a different distribution from the training set.

In [None]:
predictions = model.predict(dataset=benchmark._test_ood_topo_dataset, scaler=scaler)

evaluator = PowerGridEvaluation(benchmark.config)
metrics_ood = evaluator.evaluate(observations=benchmark._test_ood_topo_dataset.data,
                                 predictions=predictions,
                                 env=env)

In [None]:
metrics_all["test_ood_topo"] = metrics_ood

### Compute score

In [None]:
from utils.compute_score import compute_global_score
import warnings

with warnings.catch_warnings():
    warnings.filterwarnings("ignore")
    score = compute_global_score(metrics_all, benchmark.config)