# Submission examples

The aim of this notebook is to show examples of functional submission packages.

#### General submission process

All submissions are processed by the Codabench platform. To submit a model to the competition, the contents of the submission folder need to be compressed into a zip file. Be careful to compress the files themselves and not the folder that contains them: unzipping must recreate the files directly, not a folder containing the files. The zip file can then be uploaded on the `my_submission` tab:

![Alt text](utils/img/submission.png)
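
The zipping rule described above (files at the root of the archive, no enclosing folder) can be sketched with the standard library. The helper name `zip_submission` is hypothetical and not part of the competition tooling; any zip tool that produces the same layout works equally well.

```python
import zipfile
from pathlib import Path


def zip_submission(folder, archive):
    """Zip the files of `folder` so that they sit at the root of `archive`."""
    folder = Path(folder)
    archive = Path(archive).resolve()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(folder.rglob("*")):
            # Skip directories, and the archive itself if it lives inside `folder`
            if file.is_file() and file.resolve() != archive:
                # arcname is relative to `folder`, so no top-level folder is stored
                zf.write(file, arcname=file.relative_to(folder).as_posix())
    return archive
```

For example, after `zip_submission("submission/simple", "simple.zip")`, calling `zipfile.ZipFile("simple.zip").namelist()` should list `parameters.json` directly rather than `simple/parameters.json`.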

Once submitted, it is processed by the Codabench platform and sent to one of our compute nodes for evaluation. The current status of the submission (submitted, waiting for worker, running, done) can be seen at any time; however, the logs only become available once the submission has finished running.

Please note that we currently have a 12-hour limit for the execution of a submission (training, evaluation and scoring).

## Example 1 : simple submission

This example is available in submission/simple.

It corresponds to a simple submission that uses a pre-implemented model and scaler and recreates the 1st example from the 4th notebook.
This submission is composed of 3 files:
- parameters.json
- config.ini
- scaler_parameters.py




### parameters.json


In this example we use a fully connected model implemented with torch and already available in the LIPS package:
- `from lips.augmented_simulators.torch_models.fully_connected import TorchFullyConnected`

We want to both train and evaluate the model, so we set:
- `evaluateonly: false`
- `scoringonly: false`

#### simulator_config :
As we are using an already implemented simulator/model, we set:
- `custom_simulator: false`
which tells the compute node to load the model from the LIPS package.
We name the model (the name is used for saving and retrieving models and is not important in this type of submission):
- `name: "MyAugmentedSimulator"`
and indicate to the compute node which model class and implementation we are using:
- `model_type: "fully_connected"`
- `model: "TorchFullyConnected"`

At run time, this loads the following model: `from lips.augmented_simulators.torch_models.fully_connected import TorchFullyConnected`
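
The link between the two settings and the import statement can be sketched as follows. This is only an illustration of the mapping: the package prefix is read off the import shown in this notebook, the helper names are hypothetical, and how the compute node actually performs the import is an assumption.

```python
import importlib

# Prefix taken from the import statement shown in this notebook
LIPS_TORCH_MODELS = "lips.augmented_simulators.torch_models"


def module_path(model_type, prefix=LIPS_TORCH_MODELS):
    """Module name implied by `model_type` (hypothetical helper)."""
    return f"{prefix}.{model_type}"


def load_class(module_name, class_name):
    """Import `module_name` and return the attribute called `class_name`."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


print(module_path("fully_connected"))
```

With LIPS installed, `load_class(module_path("fully_connected"), "TorchFullyConnected")` would be equivalent to the import above. The same pattern applies to any other importable module, for instance `load_class("collections", "OrderedDict")`.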

In this example we also use a pre-implemented scaler: `from lips.dataset.scaler.standard_scaler_iterative import StandardScalerIterative`
Similarly, we indicate which scaler class and implementation to load:
- `scaler_class: "standard_scaler_iterative"`
- `scaler: "StandardScalerIterative"`

We then indicate which configuration to use from the `config.ini` file (in this example, the standard configuration presented in the 1st example of notebook 4):
- `config_name: "DEFAULT"`

#### simulator_extra_parameters:
This section is used to pass custom parameters to the model and is generally only used together with a custom model, as presented in the following example.
As we are running a pre-implemented model, we do not pass any custom parameters and `simulator_extra_parameters` stays empty:
- `simulator_extra_parameters: {}`

#### training_config:
We now configure the training to run for 10 epochs:
- `training_config: {"epochs": 10}`

`architecture_type` is not used by the submission process and stays at "Classical".

The resulting `parameters.json` file :

In [None]:
{
  "evaluateonly": false,
  "scoringonly": false,
  "simulator_config": {
    "custom_simulator": false,
    "name": "MyAugmentedSimulator",
    "model": "TorchFullyConnected",
    "model_type": "fully_connected",
    "custom_scaler": false,
    "scaler_class": "standard_scaler_iterative",
    "scaler": "StandardScalerIterative",
    "config_name": "DEFAULT",
    "architecture_type": "Classical"
  },
  "simulator_extra_parameters": {},
  "training_config": {
    "epochs": 10
  }
}
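
Before zipping, the file can be sanity-checked locally. The sketch below is a hypothetical helper, not part of LIPS or Codabench: the list of expected keys is simply read off the example file above, and the platform may accept or require other keys.

```python
import json

# Keys read off the example file above; the platform may accept others (assumption).
EXPECTED_KEYS = {
    "simulator_config": {
        "custom_simulator", "name", "model", "model_type", "custom_scaler",
        "scaler_class", "scaler", "config_name", "architecture_type",
    },
    "training_config": {"epochs"},
}


def check_parameters(params):
    """Return a list of problems found in a parsed parameters.json (empty if none)."""
    problems = []
    for flag in ("evaluateonly", "scoringonly"):
        if not isinstance(params.get(flag), bool):
            problems.append(f"{flag} should be true or false")
    for section, keys in EXPECTED_KEYS.items():
        missing = keys - set(params.get(section, {}))
        problems.extend(f"{section}.{key} is missing" for key in sorted(missing))
    if not isinstance(params.get("simulator_extra_parameters"), dict):
        problems.append("simulator_extra_parameters should be an object")
    return problems


def check_parameters_file(path="parameters.json"):
    """Parse the JSON file at `path` and check it."""
    with open(path, encoding="utf-8") as f:
        return check_parameters(json.load(f))
```

A check of this kind catches small slips such as writing `"epoch"` instead of `"epochs"` in `training_config`.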

### config.ini
This file passes the configuration used by the model, as presented in notebook 4.
We add the configuration file as defined in the previous notebook:

In [None]:
[DEFAULT]
name = "torch_fc"
layers = (64,64,8,64,64,64,8,64,64)
activation = "relu"
layer = "linear"
input_dropout = 0.0
dropout = 0.0
metrics = ("MAELoss",)
loss = {"name": "MSELoss",
        "params": {"size_average": None,
                   "reduce": None,
                   "reduction": 'mean'}}
device = "cpu"
optimizer = {"name": "adam",
             "params": {"lr": 2e-4}}
train_batch_size = 128000
eval_batch_size = 256000
epochs = 200
shuffle = False
save_freq = False
ckpt_freq = 50
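
To see how such a file turns into Python values, here is a minimal sketch using only the standard library. In a submission the file is read by the LIPS `ConfigManager`; the approach below (`configparser` plus `ast.literal_eval`) only assumes that every value is written as a Python literal, as in the file above, and the function name `read_options` is hypothetical.

```python
import ast
import configparser

# Shortened copy of the configuration above, including a multi-line value
EXAMPLE_INI = '''
[DEFAULT]
name = "torch_fc"
layers = (64,64,8,64,64,64,8,64,64)
optimizer = {"name": "adam",
             "params": {"lr": 2e-4}}
train_batch_size = 128000
shuffle = False
'''


def read_options(text, section="DEFAULT"):
    """Parse `text` and return the options of `section` as Python objects."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # Each raw value is a string; literal_eval turns it into a tuple, dict, number...
    return {key: ast.literal_eval(value) for key, value in parser[section].items()}


options = read_options(EXAMPLE_INI)
print(options["layers"], options["optimizer"]["params"]["lr"])
```

Continuation lines of a multi-line value must stay indented, as in the `optimizer` entry, otherwise the parser treats them as new options.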

### scaler_parameters.py

This file contains the function that returns the arguments for the scaler. The function takes the `benchmark` object as its argument (see notebook 4). In this example we use a standard iterative scaler implemented in LIPS.

In [None]:
# Define a function that returns the parameters of the scaler
from lips.benchmark.airfransBenchmark import AirfRANSBenchmark

def compute_scaler_parameters(benchmark: AirfRANSBenchmark) -> dict:
    """Return the arguments passed to the scaler."""
    chunk_sizes = benchmark.train_dataset.get_simulations_sizes()
    no_norm_x = benchmark.train_dataset.get_no_normalization_axis_indices()
    scaler_params = {"chunk_sizes": chunk_sizes, "no_norm_x": no_norm_x}
    return scaler_params
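
The shape of the returned dictionary can be checked without loading the real dataset. In the sketch below, `FakeTrainDataset` is a hypothetical stand-in for `benchmark.train_dataset`, and the numbers it returns are made up for illustration; the real return types of the two dataset methods are not shown in this notebook.

```python
from types import SimpleNamespace


def compute_scaler_parameters(benchmark):
    """Same logic as in scaler_parameters.py, repeated to keep the sketch self-contained."""
    chunk_sizes = benchmark.train_dataset.get_simulations_sizes()
    no_norm_x = benchmark.train_dataset.get_no_normalization_axis_indices()
    return {"chunk_sizes": chunk_sizes, "no_norm_x": no_norm_x}


class FakeTrainDataset:
    """Hypothetical stand-in exposing the two methods used above."""

    def get_simulations_sizes(self):
        return [1200, 950, 1100]  # made-up chunk sizes

    def get_no_normalization_axis_indices(self):
        return [4]  # made-up index


fake_benchmark = SimpleNamespace(train_dataset=FakeTrainDataset())
params = compute_scaler_parameters(fake_benchmark)
print(params)
```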

## Example 2 : custom model

This example is available in submission/custom_model.

It corresponds to a submission that uses a custom model implemented in `my_augmented_simulator.py` and a pre-implemented scaler. It recreates the 2nd example from the 4th notebook.
This submission is composed of 4 files:
- parameters.json
- config.ini
- scaler_parameters.py
- my_augmented_simulator.py

### parameters.json


In this example we use a custom model implemented in `my_augmented_simulator.py`.

We want to both train and evaluate the model, so we set:
- `evaluateonly: false`
- `scoringonly: false`

#### simulator_config :
As we are using a custom simulator/model, we set:
- `custom_simulator: true`
which tells the compute node to load the model from `my_augmented_simulator.py`.
We name the model (the name is used for saving and retrieving models, not important in this type of submission):
- `name: "MyAugmentedSimulator"`
and indicate to the compute node which model we are using:
- `model: "MyCustomFullyConnected"`
This corresponds to the name of the class implemented in `my_augmented_simulator.py`.
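
Loading a class by name from a file shipped with the submission can be sketched with `importlib`. This is one common way to do it, shown under the assumption that the compute node does something comparable; the helper `load_class_from_file` is hypothetical, and the demonstration writes a throwaway file instead of the real model.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_class_from_file(file_path, class_name):
    """Execute the Python file at `file_path` and return its attribute `class_name`."""
    file_path = Path(file_path)
    spec = importlib.util.spec_from_file_location(file_path.stem, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, class_name)


# Demonstration with a throwaway file standing in for the real model file
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "my_augmented_simulator.py"
    source.write_text("class MyCustomFullyConnected:\n    pass\n")
    cls = load_class_from_file(source, "MyCustomFullyConnected")
    print(cls.__name__)
```

If the value of `model` does not match a class defined in the file, `getattr` raises an `AttributeError`, so the two names must be kept in sync.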

In this example we also use a pre-implemented scaler: `from lips.dataset.scaler.standard_scaler_iterative import StandardScalerIterative`
Similarly, we indicate which scaler class and implementation to load:
- `scaler_class: "standard_scaler_iterative"`
- `scaler: "StandardScalerIterative"`

We then indicate which configuration to use from the `config.ini` file (in this example, the standard configuration presented in the 1st example of notebook 4):
- `config_name: "DEFAULT"`

#### simulator_extra_parameters:
This section is used to pass custom parameters to the model; it takes the same form as `training_config`.
In this case, we do not pass any custom parameters and `simulator_extra_parameters` stays empty:
- `simulator_extra_parameters: {}`

#### training_config:
We now configure the training to run for 10 epochs:
- `training_config: {"epochs": 10}`

`architecture_type` is not used by the submission process and stays at "Classical".

The resulting `parameters.json` file :

In [None]:
{
  "evaluateonly": false,
  "scoringonly": false,
  "simulator_config": {
    "custom_simulator": true,
    "name": "MyAugmentedSimulator",
    "model": "MyCustomFullyConnected",
    "model_type": "my_augmented_simulator",
    "custom_scaler": false,
    "scaler_class": "standard_scaler_iterative",
    "scaler": "StandardScalerIterative",
    "config_name": "DEFAULT",
    "architecture_type": "Classical"
  },
  "simulator_extra_parameters": {},
  "training_config": {
    "epochs": 10
  }
}


### config.ini
This file passes the configuration used by the model, as presented in notebook 4.
We add the configuration file as defined in the previous notebook:

In [None]:
[DEFAULT]
name = "torch_fc"
layers = (64,64,8,64,64,64,8,64,64)
activation = "relu"
layer = "linear"
input_dropout = 0.0
dropout = 0.0
metrics = ("MAELoss",)
loss = {"name": "MSELoss",
        "params": {"size_average": None,
                   "reduce": None,
                   "reduction": 'mean'}}
device = "cpu"
optimizer = {"name": "adam",
             "params": {"lr": 2e-4}}
train_batch_size = 128000
eval_batch_size = 256000
epochs = 200
shuffle = False
save_freq = False
ckpt_freq = 50

### scaler_parameters.py

This file contains the function that returns the arguments for the scaler. The function takes the `benchmark` object as its argument (see notebook 4). In this example we use a standard iterative scaler implemented in LIPS.

In [None]:
# Define a function that returns the parameters of the scaler
from lips.benchmark.airfransBenchmark import AirfRANSBenchmark

def compute_scaler_parameters(benchmark: AirfRANSBenchmark) -> dict:
    """Return the arguments passed to the scaler."""
    chunk_sizes = benchmark.train_dataset.get_simulations_sizes()
    no_norm_x = benchmark.train_dataset.get_no_normalization_axis_indices()
    scaler_params = {"chunk_sizes": chunk_sizes, "no_norm_x": no_norm_x}
    return scaler_params

### my_augmented_simulator.py

This file contains the implementation of a custom model. The implementation needs to be compatible with the LIPS simulator class so that the simulation and evaluation processes can access it. Here we implement an example of a fully connected PyTorch model, as seen in the 2nd example of the 4th notebook.

In [None]:
"""
Torch fully connected model
"""
import os
import pathlib
from typing import Union
import json

import numpy as np

import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader

from lips.dataset import DataSet
from lips.dataset.scaler import Scaler
from lips.logger import CustomLogger
from lips.config import ConfigManager
from lips.utils import NpEncoder

class MyCustomFullyConnected(nn.Module):
    def __init__(self,
                 sim_config_path: Union[pathlib.Path, str],
                 bench_config_path: Union[str, pathlib.Path],
                 sim_config_name: Union[str, None]=None,
                 bench_config_name: Union[str, None]=None,
                 name: Union[str, None]=None,
                 scaler: Union[Scaler, None]=None,
                 log_path: Union[None, pathlib.Path, str]=None,
                 **kwargs):
        super().__init__()
        if not os.path.exists(sim_config_path):
            raise RuntimeError("Configuration path for the simulator not found!")
        if not str(sim_config_path).endswith(".ini"):
            raise RuntimeError("The configuration file should have `.ini` extension!")
        # if test_custom_param:
        #     print("test_custom_param: ", test_custom_param)
        sim_config_name = sim_config_name if sim_config_name is not None else "DEFAULT"
        self.sim_config = ConfigManager(section_name=sim_config_name, path=sim_config_path)
        self.bench_config = ConfigManager(section_name=bench_config_name, path=bench_config_path)
        self.name = name if name is not None else self.sim_config.get_option("name")
        # scaler
        self.scaler = scaler
        # Logger
        self.log_path = log_path
        self.logger = CustomLogger(__class__.__name__, log_path).logger
        # model parameters
        self.params = self.sim_config.get_options_dict()
        self.params.update(kwargs)

        self.activation = {
            "relu": F.relu,
            "sigmoid": F.sigmoid,
            "tanh": F.tanh
        }

        # Sizes stay None until they are passed explicitly or inferred from the dataset
        self.input_size = kwargs.get("input_size")
        self.output_size = kwargs.get("output_size")

        self.input_layer = None
        self.input_dropout = None
        self.fc_layers = None
        self.dropout_layers = None
        self.output_layer = None

        #self.__build_model()

    def build_model(self):
        """Build the model architecture
        """
        linear_sizes = list(self.params["layers"])

        self.input_layer = nn.Linear(self.input_size, linear_sizes[0])
        self.input_dropout = nn.Dropout(p=self.params["input_dropout"])

        self.fc_layers = nn.ModuleList([nn.Linear(in_f, out_f) \
            for in_f, out_f in zip(linear_sizes[:-1], linear_sizes[1:])])

        self.dropout_layers = nn.ModuleList([nn.Dropout(p=self.params["dropout"]) \
            for _ in range(len(self.fc_layers))])

        self.output_layer = nn.Linear(linear_sizes[-1], self.output_size)

    def forward(self, data):
        """The forward pass of the model
        """
        out = self.input_layer(data)
        out = self.input_dropout(out)
        for fc_, dropout in zip(self.fc_layers, self.dropout_layers):
            out = fc_(out)
            out = self.activation[self.params["activation"]](out)
            out = dropout(out)
        out = self.output_layer(out)
        return out

    def process_dataset(self, dataset: DataSet, training: bool):
        """process the datasets for training and evaluation

        This function transforms all the dataset into something that can be used by the neural network (for example)

        Parameters
        ----------
        dataset : DataSet
            A dataset that should be processed
        training : bool
            whether we are in the training phase; if True, the input and output sizes are inferred and the scaler is fitted

        Returns
        -------
        DataLoader
            a data loader over the (scaled) inputs and targets
        """
        if training:
            self._infer_size(dataset)
            batch_size = self.params["train_batch_size"]
            extract_x, extract_y = dataset.extract_data()
            if self.scaler is not None:
                extract_x, extract_y = self.scaler.fit_transform(extract_x, extract_y)
        else:
            batch_size = self.params["eval_batch_size"]
            extract_x, extract_y = dataset.extract_data()
            if self.scaler is not None:
                extract_x, extract_y = self.scaler.transform(extract_x, extract_y)

        torch_dataset = TensorDataset(torch.from_numpy(extract_x).float(), torch.from_numpy(extract_y).float())
        data_loader = DataLoader(torch_dataset, batch_size=batch_size, shuffle=self.params["shuffle"])
        return data_loader

    def _post_process(self, data):
        """
        This function is used to inverse the predictions of the model to their original state, before scaling
        to be able to compare them with ground truth data
        """
        if self.scaler is not None:
            try:
                processed = self.scaler.inverse_transform(data)
            except TypeError:
                processed = self.scaler.inverse_transform(data.cpu())
        else:
            processed = data
        return processed

    def _infer_size(self, dataset: DataSet):
        """Infer the size of the input and output variables
        """
        *dim_inputs, self.output_size = dataset.get_sizes()
        # cast to a plain int so the value can be used directly as a layer size
        self.input_size = int(np.sum(dim_inputs))

    def get_metadata(self):
        res_json = {}
        res_json["input_size"] = self.input_size
        res_json["output_size"] = self.output_size
        return res_json

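    def _save_metadata(self, path: Union[str, pathlib.Path]):
        # accept plain strings as well, as _load_metadata does
        if not isinstance(path, pathlib.Path):
            path = pathlib.Path(path)
        res_json = {}
        res_json["input_size"] = self.input_size
        res_json["output_size"] = self.output_size
        with open((path / "metadata.json"), "w", encoding="utf-8") as f:
            json.dump(obj=res_json, fp=f, indent=4, sort_keys=True, cls=NpEncoder)

    def _load_metadata(self, path: Union[str, pathlib.Path]):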
        if not isinstance(path, pathlib.Path):
            path = pathlib.Path(path)
        with open((path / "metadata.json"), "r", encoding="utf-8") as f:
            res_json = json.load(fp=f)
        self.input_size = res_json["input_size"]
        self.output_size = res_json["output_size"]
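
The way `build_model` chains the entries of `layers`, and the way `_infer_size` splits the dataset sizes, can be traced in plain Python without torch. This is a sketch for illustration only: the helper `layer_shapes` is hypothetical, and the sizes `(2, 5, 4)` are made up; the assumption that the last entry returned by `dataset.get_sizes()` is the output size comes from the unpacking in `_infer_size`.

```python
def layer_shapes(input_size, layers, output_size):
    """List the (in_features, out_features) pairs created by build_model."""
    sizes = list(layers)
    shapes = [(input_size, sizes[0])]            # input_layer
    shapes += list(zip(sizes[:-1], sizes[1:]))   # fc_layers
    shapes.append((sizes[-1], output_size))      # output_layer
    return shapes


# Star-unpacking as in _infer_size: all entries but the last are input sizes
*dim_inputs, output_size = (2, 5, 4)  # made-up sizes
input_size = sum(dim_inputs)

shapes = layer_shapes(input_size, (64, 64, 8), output_size)
print(shapes)
```

With `n` entries in `layers`, the model therefore has one input layer, `n - 1` hidden `fc_layers` and one output layer.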