# Evaluate Baselines (AirfRANS use case)

This notebook demonstrates how to evaluate a baseline on a given benchmark.

We will show how to load a baseline (or any other `AugmentedSimulator`) and evaluate it on a `Benchmark` of our choice.

**To learn more about the training procedure, visit [this notebook](../03_TrainAnAugmentedSimulator.ipynb)**

#### Import required packages

In [None]:
import os
from lips import get_root_path
from lips.dataset.airfransDataSet import download_data
from lips.benchmark.airfransBenchmark import AirfRANSBenchmark

In [None]:
# indicate required paths
LIPS_PATH = get_root_path()
DIRECTORY_NAME = 'Dataset'
BENCHMARK_NAME = "Case1"
# build paths with os.path.join so the result does not depend on a trailing separator in LIPS_PATH
BENCH_CONFIG_PATH = os.path.join(LIPS_PATH, "..", "configurations", "airfrans", "benchmarks", "confAirfoil.ini")
SIM_CONFIG_PATH = os.path.join(LIPS_PATH, "..", "configurations", "airfrans", "simulators", "torch_fc.ini")
LOG_PATH = os.path.join(LIPS_PATH, "lips_logs.log")
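
Before going further, it can help to confirm that the configuration files are where we expect them. The snippet below is a minimal sketch using only the standard library; `missing_paths` is a hypothetical helper that is not part of LIPS, and the file name used in the demonstration is made up.

```python
import os


def missing_paths(paths):
    """Return the subset of `paths` that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


# Demonstration with one existing path (the current directory)
# and one made-up file name that is not expected to exist.
fake_config = os.path.join(os.getcwd(), "definitely_missing_lips_config_12345.ini")
example = missing_paths([os.getcwd(), fake_config])
print("Missing:", example)
```

In this notebook, the same helper could be called as `missing_paths([BENCH_CONFIG_PATH, SIM_CONFIG_PATH])`; an empty list means both configuration files were found.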

## Initial step: download the data

In [None]:
download_data(root_path=".", directory_name=DIRECTORY_NAME)

#  Benchmark <a id="Case1"></a>

## First step: load the dataset

A common dataset is used to evaluate the augmented simulators. This initial step loads it once so that it can be reused afterwards.

In [None]:
benchmark = AirfRANSBenchmark(benchmark_path=DIRECTORY_NAME,
                              config_path=BENCH_CONFIG_PATH,
                              benchmark_name=BENCHMARK_NAME,
                              log_path=LOG_PATH)
benchmark.load(path_train=DIRECTORY_NAME, path_test=DIRECTORY_NAME)

In [None]:
# to verify the config is loaded appropriately for this benchmark
print("Benchmark name: ", benchmark.config.section_name)
print("Environment name: ", benchmark.config.get_option("env_name"))
print("Input attributes: ", benchmark.config.get_option("attr_x"))
print("Output attributes: ", benchmark.config.get_option("attr_y"))
print("Evaluation criteria: ")
print(benchmark.config.get_option("eval_dict"))

## A baseline "augmented simulator" <a id="bench1-fc"></a>

Along with the datasets, we also provide a baseline based on a trained neural network. This baseline is a fully connected neural network that takes the available inputs of the AirfRANS case and tries to predict all the outputs of the simulator.

The fully connected neural network is made of XXX layer each with YYY units.

It is learned for KKK epochs on the training set of the `Case1`.

First, we need to load the baseline and initialize it properly:

In [None]:
from lips.augmented_simulators.torch_models.fully_connected import TorchFullyConnected
from lips.augmented_simulators.torch_simulator import TorchSimulator
from lips.dataset.scaler import StandardScaler

augmented_simulator = TorchSimulator(name="torch_fc",
                                     model=TorchFullyConnected,
                                     scaler=StandardScaler,
                                     log_path="log_benchmark",
                                     device="cuda:0",
                                     bench_config_path=BENCH_CONFIG_PATH,
                                     bench_config_name=BENCHMARK_NAME,
                                     sim_config_path=SIM_CONFIG_PATH,
                                     sim_config_name="DEFAULT",
                                     architecture_type="Classical",
                                    )
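
The cell above hard-codes `device="cuda:0"`, which assumes that a CUDA-capable GPU is available. The snippet below is a minimal sketch of a CPU fallback. It assumes that `TorchSimulator` accepts any PyTorch device string such as `"cpu"`, which we have not verified here; `pick_device` and `DEVICE` are hypothetical names that are not part of LIPS.

```python
def pick_device(cuda_available: bool, index: int = 0) -> str:
    """Return a PyTorch device string: the indexed GPU if available, else the CPU."""
    return f"cuda:{index}" if cuda_available else "cpu"


try:
    import torch

    # torch.cuda.is_available() reports whether PyTorch can use a CUDA GPU
    DEVICE = pick_device(torch.cuda.is_available())
except ImportError:
    # PyTorch is not installed: default to the CPU string
    DEVICE = pick_device(False)

print("Selected device:", DEVICE)
```

Under that assumption, `DEVICE` could be passed as the `device` argument instead of the literal `"cuda:0"`.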

Next, we train the neural network:

In [None]:
augmented_simulator.train(train_dataset=benchmark.train_dataset, epochs=1, train_batch_size=128000)

We can then evaluate it on the test datasets of the benchmark by passing the trained augmented simulator `augmented_simulator` as an argument:

In [None]:
fc_metrics = benchmark.evaluate_simulator(augmented_simulator=augmented_simulator, eval_batch_size=128000)

## Performance of an augmented simulator <a id="bench1-comp"></a>

### Machine learning metrics 

We can now assess the performance of the augmented simulator. For example, to retrieve the MSE (mean squared error) on the test dataset, we can use:

In [None]:
ML_metrics = "ML"
dataset_name = "test"
print("Fully Connected Augmented Simulator")
print(f"Dataset : {dataset_name}")
print("{:<10} : {}".format("MSE", fc_metrics[dataset_name][ML_metrics]))
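
As a reminder of what this metric measures, here is a minimal sketch of the textbook MSE definition in plain Python. It is not the implementation used by LIPS, which may normalize or aggregate per variable; `mean_squared_error`, `reference` and `prediction` are illustrative names, and the values are toy numbers rather than benchmark data.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between reference and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one value is required")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values for illustration only
reference = [1.0, 2.0, 3.0]
prediction = [1.0, 2.5, 2.0]
print("MSE on toy data:", mean_squared_error(reference, prediction))
```

A lower value means the predictions are closer to the reference; a value of 0 means they match exactly.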

### Physics compliance
A trained augmented simulator may still produce predictions that violate physical laws, so we also inspect the physics compliance metrics.

In [None]:
physic_compliances = "Physics"
dataset_name = "test"
physical_metrics = fc_metrics[dataset_name][physic_compliances]
print(physical_metrics)
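
Since the metrics are indexed first by dataset name and then by category (`"ML"`, `"Physics"`), printing them one per line can be easier to read than a raw dictionary. The snippet below is a minimal sketch; `flatten_metrics` is a hypothetical helper that is not part of LIPS, and `toy_metrics` contains placeholder names and values, not real results.

```python
def flatten_metrics(metrics, prefix=""):
    """Flatten a nested dict into {"level1/level2/...": value} pairs."""
    flat = {}
    for key, value in metrics.items():
        name = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, extending the path
            flat.update(flatten_metrics(value, name))
        else:
            flat[name] = value
    return flat


# Placeholder structure and values, NOT actual benchmark results
toy_metrics = {"test": {"ML": {"metric_a": 0.1}, "Physics": {"metric_b": 0.2}}}
for name, value in flatten_metrics(toy_metrics).items():
    print(f"{name:<25} : {value}")
```

In this notebook, the helper could be applied to `fc_metrics` directly, assuming the returned object is a nested dictionary as the indexing above suggests.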