# Submission examples

The aim of this notebook is to show examples of functional submission packages.

#### General submission process

All submissions are processed by the Codabench platform. To submit a model for the competition, compress the contents of the submission folder into a zip file. Be careful to compress the files themselves and not the folder that contains them: unzipping must recreate the files directly, not a folder containing the files. The zip file can then be uploaded on the `my_submission` tab:

![Alt text](utils/img/submission.png)

Once submitted, the package is processed by the Codabench platform and sent to one of our compute nodes for evaluation. You can follow the current status of the submission (submitted, waiting for worker, running, done); however, the logs only become available once the submission has finished running.

Please note that we currently enforce a 12-hour limit on the execution of a submission (training, evaluation and scoring).
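
The archive layout described above (files at the root of the zip, no enclosing folder) can be produced with a short script. This is a minimal sketch using only the standard library; the helper name `zip_submission` and the paths in the usage note are illustrative and are not part of the competition tooling.

```python
import zipfile
from pathlib import Path


def zip_submission(folder, archive_path):
    """Zip the contents of `folder` so that files sit at the archive root."""
    folder = Path(folder)
    archive_path = Path(archive_path).resolve()
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            # Skip directories, and the archive itself if it lives in `folder`
            if path.is_file() and path.resolve() != archive_path:
                # arcname is relative to `folder`, so no enclosing directory is stored
                archive.write(path, arcname=path.relative_to(folder).as_posix())
    return archive_path
```

For instance, `zip_submission("submission/simple", "simple_submission.zip")` followed by `zipfile.ZipFile("simple_submission.zip").namelist()` should list `parameters.json` and the other files without any folder prefix.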

## Example 1 : simple submission

This example is available in submission/simple; torch and tensorflow variations are provided.

It corresponds to a simple submission that uses a pre-implemented model and scaler and recreates the 1st example from the 4th notebook.
This submission is composed of 3 files:
- parameters.json
- config.ini
- scaler_parameters.py




### parameters.json


In this example we use a fully connected model implemented with torch and already available in the LIPS package:
- `from lips.augmented_simulators.torch_models.fully_connected import TorchFullyConnected`

We also want to train and evaluate the model, so we set:
- `evaluateonly: false`
- `scoringonly: false`

#### simulator_config :
As we are using an already implemented simulator/model, we set:
- `simulator_type: "simple_torch"`

This indicates to the compute node that it needs to load the model from the LIPS package.
We name the model (the name is used for saving and retrieving models and is not important in this type of submission):
- `name: "MyAugmentedSimulator"`

We then indicate to the compute node which model class and implementation we are using:
- `model_type: "fully_connected"`
- `model: "TorchFullyConnected"`

This will load the following model when running :  `from lips.augmented_simulators.torch_models.fully_connected import TorchFullyConnected`
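
As an illustration of how `model_type` and `model` could map to an import, here is a sketch based on `importlib`. It is an assumption about the mechanism, not the actual ingestion code; `resolve_class` is a hypothetical helper, and the demonstration uses a standard-library class so that it runs without LIPS installed.

```python
import importlib


def resolve_class(package, module_name, class_name):
    """Import `package.module_name` and return the attribute `class_name`."""
    module = importlib.import_module(f"{package}.{module_name}")
    return getattr(module, class_name)


# Stand-in demonstration with the standard library
decoder_cls = resolve_class("json", "decoder", "JSONDecoder")
print(decoder_cls.__name__)
```

With LIPS installed, `resolve_class("lips.augmented_simulators.torch_models", "fully_connected", "TorchFullyConnected")` would correspond to the import statement shown above.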

In this example we also use a pre-implemented scaler : `from lips.dataset.scaler.standard_scaler import StandardScaler`
Similarly we indicate which scaler class and implementation to load :
- `scaler_type: "simple"`
- `scaler_class: "standard_scaler"`
- `scaler: "StandardScaler"`

We then indicate which configuration will need to be used from the config.ini file (in this example we use the standard config presented in the 1st example of notebook 4):
- `config_name: "DEFAULT"`

#### simulator_extra_parameters:
This section is used to pass custom parameters to the model and is generally only used with a custom model, as presented in the following examples.
As we are running a pre-implemented model, we do not pass any custom parameters and `simulator_extra_parameters` stays empty:
- `simulator_extra_parameters: {}`

#### training_config:
We now configure the training to run for 10 epochs:
- `training_config: {"epochs": 10}`

Architecture_type is not used for the submission process and stays at "Classical".

The resulting `parameters.json` file :

In [None]:
{
  "evaluateonly": false,
  "scoringonly": false,
  "simulator_config": {
    "simulator_type": "simple_torch",
    "name": "MyAugmentedSimulator",
    "model": "TorchFullyConnected",
    "model_type": "fully_connected",
    "scaler_type": "simple",
    "scaler_class": "standard_scaler",
    "scaler": "StandardScaler",
    "config_name": "DEFAULT"
  },
  "simulator_extra_parameters": {},
  "training_config": {
    "epochs": 10
  }
}
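
Before zipping, it can help to check that `parameters.json` parses as strict JSON, since Python's `json` module rejects trailing commas and similar slips. The sketch below is a local sanity check only; the list of expected keys is taken from the example above, and `check_parameters` is a hypothetical helper, not part of LIPS or Codabench.

```python
import json

EXPECTED_TOP_LEVEL = ("evaluateonly", "scoringonly", "simulator_config",
                      "simulator_extra_parameters", "training_config")


def check_parameters(text):
    """Parse the JSON text and return the list of missing top-level keys."""
    params = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    return [key for key in EXPECTED_TOP_LEVEL if key not in params]


example = """
{
  "evaluateonly": false,
  "scoringonly": false,
  "simulator_config": {"simulator_type": "simple_torch", "config_name": "DEFAULT"},
  "simulator_extra_parameters": {},
  "training_config": {"epochs": 10}
}
"""
print(check_parameters(example))  # → []
```

On a real package, the same check can be run with `check_parameters(open("parameters.json", encoding="utf-8").read())`.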

### config.ini
This file holds the configuration used by the model, as presented in notebook 4.
We add the configuration file as defined in the previous notebook:

In [None]:
[DEFAULT]
name = "torch_fc"
layers = (64,64,8,64,64,64,8,64,64)
activation = "relu"
layer = "linear"
input_dropout = 0.0
dropout = 0.0
metrics = ("MAELoss",)
loss = {"name": "MSELoss",
        "params": {"size_average": None,
                   "reduce": None,
                   "reduction": 'mean'}}
device = "cpu"
optimizer = {"name": "adam",
             "params": {"lr": 2e-4}}
train_batch_size = 128000
eval_batch_size = 256000
epochs = 200
shuffle = False
save_freq = False
ckpt_freq = 50

### scaler_parameters.py

This file contains the function that returns the arguments for the scaler. The function takes the `benchmark` object as its argument (see notebook 4). In this example we use a standard iterative scaler implemented in LIPS.

In [None]:
# Define a function that returns the parameters of the scaler
from lips.benchmark.airfransBenchmark import AirfRANSBenchmark

def compute_scaler_parameters(benchmark: AirfRANSBenchmark) -> dict:
    # Number of nodes in each training simulation
    chunk_sizes = benchmark.train_dataset.get_simulations_sizes()
    # Indices of the input axes that are excluded from normalization
    no_norm_x = benchmark.train_dataset.get_no_normalization_axis_indices()
    scaler_params = {"chunk_sizes": chunk_sizes, "no_norm_x": no_norm_x}
    return scaler_params
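
To check the function locally without loading the dataset, it can be called with a small stand-in object. The `SimpleNamespace` stub below is purely hypothetical: it only mimics the two dataset methods used here and returns made-up values, whereas the real `benchmark` is provided by the compute node.

```python
from types import SimpleNamespace


def compute_scaler_parameters(benchmark):
    chunk_sizes = benchmark.train_dataset.get_simulations_sizes()
    no_norm_x = benchmark.train_dataset.get_no_normalization_axis_indices()
    return {"chunk_sizes": chunk_sizes, "no_norm_x": no_norm_x}


# Hypothetical stand-in for the benchmark object (values are made up)
fake_dataset = SimpleNamespace(
    get_simulations_sizes=lambda: [1200, 1500, 900],
    get_no_normalization_axis_indices=lambda: [4],
)
fake_benchmark = SimpleNamespace(train_dataset=fake_dataset)

print(compute_scaler_parameters(fake_benchmark))
```

The returned dictionary has exactly the two keys built by the function, `chunk_sizes` and `no_norm_x`.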

## Example 2 : custom torch model using LIPS simulator class

This example is available in submission/custom_model

It corresponds to a submission that uses a custom model implemented in `my_augmented_simulator.py` and a pre-implemented scaler. It recreates the 2nd example from the 4th notebook.
This submission is composed of 4 files:
- parameters.json
- config.ini
- scaler_parameters.py
- my_augmented_simulator.py

### parameters.json


In this example we use a custom model implemented in `my_augmented_simulator.py`.

We also want to train and evaluate the model, so we set:
- `evaluateonly: false`
- `scoringonly: false`

#### simulator_config :
As we are using a custom torch simulator, we set:
- `simulator_type: "simple_torch"`
- `simulator_file: "my_augmented_simulator"`

This indicates to the compute node that it needs to load the model from `my_augmented_simulator.py`.

We name the model (the name is used for saving and retrieving models and is not important in this type of submission):
- `name: "MyAugmentedSimulator"`

We then indicate to the compute node which model we are using:
- `model: "MyCustomFullyConnected"`

This corresponds to the name of the class implemented in `my_augmented_simulator.py`.
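
To make the link between `simulator_file`, `model` and the class in the file concrete, the sketch below loads a class by name from a Python file with `importlib.util`. This is an assumption about the mechanism, not the code run by the compute node; `load_class_from_file` is a hypothetical helper, and a throwaway file stands in for the real `my_augmented_simulator.py`.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_class_from_file(file_path, class_name):
    """Load the module stored at `file_path` and return its `class_name` attribute."""
    file_path = Path(file_path)
    spec = importlib.util.spec_from_file_location(file_path.stem, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, class_name)


# Demonstration with a throwaway file standing in for my_augmented_simulator.py
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "my_augmented_simulator.py"
    source.write_text("class MyCustomFullyConnected:\n    pass\n", encoding="utf-8")
    cls = load_class_from_file(source, "MyCustomFullyConnected")

print(cls.__name__)
```

If the value of `model` does not match a class defined in the file, `getattr` raises `AttributeError`, which is the kind of mismatch worth catching before submitting.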

In this example we also use a pre-implemented scaler : `from lips.dataset.scaler.standard_scaler import StandardScaler`
Similarly we indicate which scaler class and implementation to load :
- `scaler_type: "simple"`
- `scaler_class: "standard_scaler"`
- `scaler: "StandardScaler"`

We then indicate which configuration will need to be used from the config.ini file (in this example we use the standard config presented in the 1st example of notebook 4):
- `config_name: "DEFAULT"`

#### simulator_extra_parameters:
This section is used to pass custom parameters to the model; it has the same form as `training_config`.
In this case, we do not pass any custom parameters and `simulator_extra_parameters` stays empty:
- `simulator_extra_parameters: {}`

#### training_config:
We now configure the training to run for 10 epochs:
- `training_config: {"epochs": 10}`

Architecture_type is not used for the submission process and stays at "Classical".

The resulting `parameters.json` file :

In [None]:
{
  "evaluateonly": false,
  "scoringonly": false,
  "simulator_config": {
    "simulator_type": "simple_torch",
    "simulator_file" : "my_augmented_simulator",
    "name": "MyAugmentedSimulator",
    "model": "MyCustomFullyConnected",
    "model_type": "my_augmented_simulator",
    "scaler_type": "simple",
    "scaler_class": "standard_scaler",
    "scaler": "StandardScaler",
    "config_name": "DEFAULT"
  },
  },
  "simulator_extra_parameters": {},
  "training_config": {
    "epochs": 10
  }
}


### config.ini
This file holds the configuration used by the model, as presented in notebook 4.
We add the configuration file as defined in the previous notebook:

In [None]:
[DEFAULT]
name = "torch_fc"
layers = (64,64,8,64,64,64,8,64,64)
activation = "relu"
layer = "linear"
input_dropout = 0.0
dropout = 0.0
metrics = ("MAELoss",)
loss = {"name": "MSELoss",
        "params": {"size_average": None,
                   "reduce": None,
                   "reduction": 'mean'}}
device = "cpu"
optimizer = {"name": "adam",
             "params": {"lr": 2e-4}}
train_batch_size = 128000
eval_batch_size = 256000
epochs = 200
shuffle = False
save_freq = False
ckpt_freq = 50

### scaler_parameters.py

This file contains the function that returns the arguments for the scaler. The function takes the `benchmark` object as its argument (see notebook 4). In this example we use a standard iterative scaler implemented in LIPS.

In [None]:
# Define a function that returns the parameters of the scaler
from lips.benchmark.airfransBenchmark import AirfRANSBenchmark

def compute_scaler_parameters(benchmark: AirfRANSBenchmark) -> dict:
    # Number of nodes in each training simulation
    chunk_sizes = benchmark.train_dataset.get_simulations_sizes()
    # Indices of the input axes that are excluded from normalization
    no_norm_x = benchmark.train_dataset.get_no_normalization_axis_indices()
    scaler_params = {"chunk_sizes": chunk_sizes, "no_norm_x": no_norm_x}
    return scaler_params

### my_augmented_simulator.py

This file contains the implementation of a custom model. The implementation needs to be compatible with the LIPS simulator class so that the simulation and evaluation processes can access it. Here we implement an example of a fully connected PyTorch model, as seen in the 2nd example of the 4th notebook.

In [None]:
"""
Torch fully connected model
"""
import os
import pathlib
from typing import Union
import json

import numpy as np

import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader

from lips.dataset import DataSet
from lips.dataset.scaler import Scaler
from lips.logger import CustomLogger
from lips.config import ConfigManager
from lips.utils import NpEncoder

class MyCustomFullyConnected(nn.Module):
    def __init__(self,
                 sim_config_path: Union[pathlib.Path, str],
                 bench_config_path: Union[str, pathlib.Path],
                 sim_config_name: Union[str, None]=None,
                 bench_config_name: Union[str, None]=None,
                 name: Union[str, None]=None,
                 scaler: Union[Scaler, None]=None,
                 log_path: Union[None, pathlib.Path, str]=None,
                 **kwargs):
        super().__init__()
        if not os.path.exists(sim_config_path):
            raise RuntimeError("Configuration path for the simulator not found!")
        if not str(sim_config_path).endswith(".ini"):
            raise RuntimeError("The configuration file should have `.ini` extension!")
        # if test_custom_param:
        #     print("test_custom_param: ", test_custom_param)
        sim_config_name = sim_config_name if sim_config_name is not None else "DEFAULT"
        self.sim_config = ConfigManager(section_name=sim_config_name, path=sim_config_path)
        self.bench_config = ConfigManager(section_name=bench_config_name, path=bench_config_path)
        self.name = name if name is not None else self.sim_config.get_option("name")
        # scaler
        self.scaler = scaler
        # Logger
        self.log_path = log_path
        self.logger = CustomLogger(__class__.__name__, log_path).logger
        # model parameters
        self.params = self.sim_config.get_options_dict()
        self.params.update(kwargs)

        self.activation = {
            "relu": F.relu,
            "sigmoid": F.sigmoid,
            "tanh": F.tanh
        }

        self.input_size = None if kwargs.get("input_size") is None else kwargs["input_size"]
        self.output_size = None if kwargs.get("output_size") is None else kwargs["output_size"]

        self.input_layer = None
        self.input_dropout = None
        self.fc_layers = None
        self.dropout_layers = None
        self.output_layer = None

        #self.__build_model()

    def build_model(self):
        """Build the model architecture
        """
        linear_sizes = list(self.params["layers"])

        self.input_layer = nn.Linear(self.input_size, linear_sizes[0])
        self.input_dropout = nn.Dropout(p=self.params["input_dropout"])

        self.fc_layers = nn.ModuleList([nn.Linear(in_f, out_f) \
            for in_f, out_f in zip(linear_sizes[:-1], linear_sizes[1:])])

        self.dropout_layers = nn.ModuleList([nn.Dropout(p=self.params["dropout"]) \
            for _ in range(len(self.fc_layers))])

        self.output_layer = nn.Linear(linear_sizes[-1], self.output_size)

    def forward(self, data):
        """The forward pass of the model
        """
        out = self.input_layer(data)
        out = self.input_dropout(out)
        for fc_, dropout in zip(self.fc_layers, self.dropout_layers):
            out = fc_(out)
            out = self.activation[self.params["activation"]](out)
            out = dropout(out)
        out = self.output_layer(out)
        return out

    def process_dataset(self, dataset: DataSet, training: bool):
        """process the datasets for training and evaluation

        This function transforms all the dataset into something that can be used by the neural network (for example)

        Parameters
        ----------
        dataset : DataSet
            A dataset that should be processed
        training : bool
            indicate if we are in training phase or not

        Returns
        -------
        DataLoader
            a torch DataLoader yielding batches of (inputs, targets) tensors
        """
        if training:
            self._infer_size(dataset)
            batch_size = self.params["train_batch_size"]
            extract_x, extract_y = dataset.extract_data()
            if self.scaler is not None:
                extract_x, extract_y = self.scaler.fit_transform(extract_x, extract_y)
        else:
            batch_size = self.params["eval_batch_size"]
            extract_x, extract_y = dataset.extract_data()
            if self.scaler is not None:
                extract_x, extract_y = self.scaler.transform(extract_x, extract_y)

        torch_dataset = TensorDataset(torch.from_numpy(extract_x).float(), torch.from_numpy(extract_y).float())
        data_loader = DataLoader(torch_dataset, batch_size=batch_size, shuffle=self.params["shuffle"])
        return data_loader

    def _post_process(self, data):
        """
        This function is used to inverse the predictions of the model to their original state, before scaling
        to be able to compare them with ground truth data
        """
        if self.scaler is not None:
            try:
                processed = self.scaler.inverse_transform(data)
            except TypeError:
                processed = self.scaler.inverse_transform(data.cpu())
        else:
            processed = data
        return processed

    def _infer_size(self, dataset: DataSet):
        """Infer the size of the input and ouput variables
        """
        *dim_inputs, self.output_size = dataset.get_sizes()
        self.input_size = np.sum(dim_inputs)

    def get_metadata(self):
        res_json = {}
        res_json["input_size"] = self.input_size
        res_json["output_size"] = self.output_size
        return res_json

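    def _save_metadata(self, path: Union[str, pathlib.Path]):
        # Accept plain strings as well as Path objects, as _load_metadata does
        if not isinstance(path, pathlib.Path):
            path = pathlib.Path(path)
        res_json = {}
        res_json["input_size"] = self.input_size
        res_json["output_size"] = self.output_size
        with open((path / "metadata.json"), "w", encoding="utf-8") as f:
            json.dump(obj=res_json, fp=f, indent=4, sort_keys=True, cls=NpEncoder)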

    def _load_metadata(self, path: str):
        if not isinstance(path, pathlib.Path):
            path = pathlib.Path(path)
        with open((path / "metadata.json"), "r", encoding="utf-8") as f:
            res_json = json.load(fp=f)
        self.input_size = res_json["input_size"]
        self.output_size = res_json["output_size"]

## Example 3 : fully custom model independent from the LIPS framework

This example is available in submission/fully_custom_model

It corresponds to a submission that uses a custom model implemented in `my_augmented_simulator.py`. It recreates the 3b notebook.
This submission is composed of 2 files:
- parameters.json
- my_augmented_simulator.py

### parameters.json


In this example we use a custom model implemented in `my_augmented_simulator.py`.

We also want to train and evaluate the model, so we set:
- `evaluateonly: false`
- `scoringonly: false`

#### simulator_config :
As we are using a custom simulator, we set:
- `simulator_type: "custom"`
- `simulator_file: "my_augmented_simulator"`

This indicates to the compute node that it needs to load the model from `my_augmented_simulator.py`.

We name the model (the name is used for saving and retrieving models and is not important in this type of submission):
- `name: "MyAugmentedSimulator"`

We then indicate to the compute node which model we are using:
- `model: "AugmentedSimulator"`

This corresponds to the name of the class implemented in `my_augmented_simulator.py`.

In this type of submission all data processing, including scalers, needs to be implemented in the model; we therefore use:
- `scaler_type: "None"`

#### simulator_extra_parameters:
This section is used to pass custom parameters to the model; it has the same form as `training_config`. For this example we use the following:
- `simulator_extra_parameters: {"encoder": [7, 64, 64, 8], "decoder": [8, 64, 64, 4], "nb_hidden_layers": 3, "size_hidden_layers": 64, "batch_size": 1, "nb_epochs": 600, "lr": 0.001, "bn_bool": true, "subsampling": 32000}`

#### training_config:
We do not use special training parameters (in this implementation they are passed directly to the simulator):
- `training_config: {}`

The resulting `parameters.json` file :

In [None]:
{
  "evaluateonly": false,
  "scoringonly": false,
  "simulator_config": {
    "simulator_type": "custom",
    "simulator_file": "my_augmented_simulator",
    "name": "MyAugmentedSimulator",
    "model": "AugmentedSimulator",
    "scaler_type": "None"
   },
  "simulator_extra_parameters": {
    "encoder": [7, 64, 64, 8],
    "decoder": [8, 64, 64, 4],
    "nb_hidden_layers": 3,
    "size_hidden_layers": 64,
    "batch_size": 1,
    "nb_epochs": 600,
    "lr": 0.001,
    "bn_bool": true,
    "subsampling": 32000
  },
  "training_config": {}
}
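
In the model code of this example, the output of the encoder feeds a hidden MLP whose input and output widths both equal the last encoder size, and the result is passed to the decoder. The sizes in `simulator_extra_parameters` therefore have to chain together. The sketch below checks this; `check_layer_chain` is a hypothetical helper for local use, not something the compute node runs.

```python
def check_layer_chain(hparams):
    """Return a list of problems found in the encoder/decoder sizes."""
    problems = []
    encoder, decoder = hparams["encoder"], hparams["decoder"]
    if len(encoder) < 2 or len(decoder) < 2:
        # The MLP class in this example asserts at least two sizes per network
        problems.append("encoder and decoder need at least two sizes each")
    elif encoder[-1] != decoder[0]:
        problems.append(
            f"latent size mismatch: encoder ends with {encoder[-1]}, "
            f"decoder starts with {decoder[0]}"
        )
    return problems


hparams = {"encoder": [7, 64, 64, 8], "decoder": [8, 64, 64, 4]}
print(check_layer_chain(hparams))  # → []
```

With the values used here, the encoder maps 7 input features to a latent size of 8, and the decoder maps that latent size to 4 outputs.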

### my_augmented_simulator.py

This file contains the implementation of a custom model. The corresponding class needs to be runnable by the ingestion process and therefore needs the following functions:
- `__init__(self, benchmark, **kwargs)`
- `train(self, train_dataset, save_path=None)`
- `predict(self, dataset, **kwargs)`
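
A quick way to catch a missing method before submitting is to inspect the class locally. The sketch below checks only `train` and `predict` (every Python class already has an `__init__`, so checking it would tell us nothing); `missing_methods` and the `Dummy` class are hypothetical and exist purely for illustration.

```python
REQUIRED_METHODS = ("train", "predict")


def missing_methods(cls, required=REQUIRED_METHODS):
    """Return the required method names that `cls` does not expose as callables."""
    return [name for name in required if not callable(getattr(cls, name, None))]


class Dummy:
    """Illustrative class exposing the expected interface (does no real work)."""

    def __init__(self, benchmark, **kwargs):
        self.hparams = kwargs

    def train(self, train_dataset, save_path=None):
        raise NotImplementedError

    def predict(self, dataset, **kwargs):
        raise NotImplementedError


print(missing_methods(Dummy))  # → []
```

Running `missing_methods` on the class named in the `model` field of `parameters.json` gives an early warning if one of the methods was renamed or forgotten.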

In [None]:
import os
import time
import random
import math 

from tqdm import tqdm
import numpy as np

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import Tensor
from torch.nn import BatchNorm1d, Identity
from torch.nn import Linear
from torch_geometric.loader import DataLoader
from torch_geometric.data import Data

from lips import get_root_path
from lips.benchmark.airfransBenchmark import AirfRANSBenchmark
from lips.dataset.airfransDataSet import download_data
from lips.dataset.scaler.standard_scaler_iterative import StandardScalerIterative

class MLP(torch.nn.Module):
    def __init__(self, channel_list, dropout = 0.,
                 batch_norm = True, relu_first = False):
        super().__init__()
        assert len(channel_list) >= 2
        self.channel_list = channel_list
        self.dropout = dropout
        self.relu_first = relu_first

        self.lins = torch.nn.ModuleList()
        for dims in zip(self.channel_list[:-1], self.channel_list[1:]):
            self.lins.append(Linear(*dims))

        self.norms = torch.nn.ModuleList()
        for dim in self.channel_list[1:-1]:
            self.norms.append(BatchNorm1d(dim, track_running_stats = False) if batch_norm else Identity())

        self.reset_parameters()

    def reset_parameters(self):
        for lin in self.lins:
            lin.reset_parameters()
        for norm in self.norms:
            if hasattr(norm, 'reset_parameters'):
                norm.reset_parameters()


    def forward(self, x: Tensor) -> Tensor:
        """"""
        x = self.lins[0](x)
        for lin, norm in zip(self.lins[1:], self.norms):
            if self.relu_first:
                x = x.relu_()
            x = norm(x)
            if not self.relu_first:
                x = x.relu_()
            x = F.dropout(x, p = self.dropout, training = self.training)
            x = lin.forward(x)
        return x


    def __repr__(self) -> str:
        return f'{self.__class__.__name__}({str(self.channel_list)[1:-1]})'



class NN(torch.nn.Module):
    def __init__(self, hparams, encoder, decoder):
        super(NN, self).__init__()
        self.nb_hidden_layers = hparams['nb_hidden_layers']
        self.size_hidden_layers = hparams['size_hidden_layers']
        self.bn_bool = hparams['bn_bool']
        self.activation = nn.ReLU()
        self.encoder = encoder
        self.decoder = decoder
        self.dim_enc = hparams['encoder'][-1]
        self.nn = MLP([self.dim_enc] + [self.size_hidden_layers]*self.nb_hidden_layers + [self.dim_enc], batch_norm = self.bn_bool)

    def forward(self, data):
        z = self.encoder(data.x)        
        z = self.nn(z)
        z = self.decoder(z)
        return z


class AugmentedSimulator():
    def __init__(self,benchmark,**kwargs):
        self.name = "AirfRANSSubmission"
        chunk_sizes=benchmark.train_dataset.get_simulations_sizes()
        scalerParams={"chunk_sizes":chunk_sizes}
        self.scaler = StandardScalerIterative(**scalerParams)

        self.model = None
        self.hparams = kwargs
        use_cuda = torch.cuda.is_available()
        self.device = 'cuda:0' if use_cuda else 'cpu'
        if use_cuda:
            print('Using GPU')
        else:
            print('Using CPU')

        encoder = MLP(self.hparams['encoder'], batch_norm = False)
        decoder = MLP(self.hparams['decoder'], batch_norm = False)
        self.model = NN(self.hparams, encoder, decoder)

    def process_dataset(self, dataset, training: bool) -> DataLoader:
        coord_x=dataset.data['x-position']
        coord_y=dataset.data['y-position']
        surf_bool=dataset.extra_data['surface']
        position = np.stack([coord_x,coord_y],axis=1)

        nodes_features,node_labels=dataset.extract_data()
        if training:
            print("Normalize train data")
            nodes_features, node_labels = self.scaler.fit_transform(nodes_features, node_labels)
        else:
            print("Normalize not train data")
            nodes_features, node_labels = self.scaler.transform(nodes_features, node_labels)

        torchDataset=[]
        nb_nodes_in_simulations = dataset.get_simulations_sizes()
        start_index = 0
        for nb_nodes_in_simulation in nb_nodes_in_simulations:
            end_index = start_index+nb_nodes_in_simulation
            simulation_positions = torch.tensor(position[start_index:end_index,:], dtype = torch.float) 
            simulation_features = torch.tensor(nodes_features[start_index:end_index,:], dtype = torch.float) 
            simulation_labels = torch.tensor(node_labels[start_index:end_index,:], dtype = torch.float) 
            simulation_surface = torch.tensor(surf_bool[start_index:end_index])

            sampleData=Data(pos=simulation_positions,
                            x=simulation_features, 
                            y=simulation_labels,
                            surf = simulation_surface.bool()) 
            torchDataset.append(sampleData)
            start_index += nb_nodes_in_simulation
        return DataLoader(dataset=torchDataset,batch_size=1)

    def train(self,train_dataset, save_path=None):
        train_dataset = self.process_dataset(dataset=train_dataset,training=True)
        # Keep the trained network on the simulator so that predict() uses it
        self.model = global_train(self.device, train_dataset, self.model, self.hparams,criterion = 'MSE_weighted')

    def predict(self,dataset,**kwargs):
        print(dataset)
        test_dataset = self.process_dataset(dataset=dataset,training=False)
        self.model.eval()
        avg_loss_per_var = np.zeros(4)
        avg_loss = 0
        avg_loss_surf_var = np.zeros(4)
        avg_loss_vol_var = np.zeros(4)
        avg_loss_surf = 0
        avg_loss_vol = 0
        iterNum = 0

        predictions=[]
        with torch.no_grad():
            for data in test_dataset:        
                data_clone = data.clone()
                data_clone = data_clone.to(self.device)
                out = self.model(data_clone)

                targets = data_clone.y
                loss_criterion = nn.MSELoss(reduction = 'none')

                loss_per_var = loss_criterion(out, targets).mean(dim = 0)
                loss = loss_per_var.mean()
                loss_surf_var = loss_criterion(out[data_clone.surf, :], targets[data_clone.surf, :]).mean(dim = 0)
                loss_vol_var = loss_criterion(out[~data_clone.surf, :], targets[~data_clone.surf, :]).mean(dim = 0)
                loss_surf = loss_surf_var.mean()
                loss_vol = loss_vol_var.mean()  

                avg_loss_per_var += loss_per_var.cpu().numpy()
                avg_loss += loss.cpu().numpy()
                avg_loss_surf_var += loss_surf_var.cpu().numpy()
                avg_loss_vol_var += loss_vol_var.cpu().numpy()
                avg_loss_surf += loss_surf.cpu().numpy()
                avg_loss_vol += loss_vol.cpu().numpy()  
                iterNum += 1

                out = out.cpu().data.numpy()
                prediction = self._post_process(out)
                predictions.append(prediction)
        print("Results for test")
        print(avg_loss/iterNum, avg_loss_per_var/iterNum, avg_loss_surf_var/iterNum, avg_loss_vol_var/iterNum, avg_loss_surf/iterNum, avg_loss_vol/iterNum)
        predictions= np.vstack(predictions)
        predictions = dataset.reconstruct_output(predictions)
        return predictions

    def _post_process(self, data):
        try:
            processed = self.scaler.inverse_transform(data)
        except TypeError:
            processed = self.scaler.inverse_transform(data.cpu())
        return processed


def global_train(device, train_dataset, network, hparams, criterion = 'MSE', reg = 1):
    model = network.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr = hparams['lr'])
    lr_scheduler = torch.optim.lr_scheduler.OneCycleLR(
            optimizer,
            max_lr = hparams['lr'],
            total_steps = (len(train_dataset) // hparams['batch_size'] + 1) * hparams['nb_epochs'],
        )
    start = time.time()

    train_loss_surf_list = []
    train_loss_vol_list = []
    loss_surf_var_list = []
    loss_vol_var_list = []

    pbar_train = tqdm(range(hparams['nb_epochs']), position=0)
    for epoch in pbar_train:        
        train_dataset_sampled = []
        for data in train_dataset:
            data_sampled = data.clone()
            idx = random.sample(range(data_sampled.x.size(0)), hparams['subsampling'])
            idx = torch.tensor(idx)

            data_sampled.pos = data_sampled.pos[idx]
            data_sampled.x = data_sampled.x[idx]
            data_sampled.y = data_sampled.y[idx]
            data_sampled.surf = data_sampled.surf[idx]
            train_dataset_sampled.append(data_sampled)
        train_loader = DataLoader(train_dataset_sampled, batch_size = hparams['batch_size'], shuffle = True)
        del(train_dataset_sampled)

        train_loss, _, loss_surf_var, loss_vol_var, loss_surf, loss_vol = train_model(device, model, train_loader, optimizer, lr_scheduler, criterion, reg = reg)        
        if criterion == 'MSE_weighted':
            train_loss = reg*loss_surf + loss_vol
        del(train_loader)

        train_loss_surf_list.append(loss_surf)
        train_loss_vol_list.append(loss_vol)
        loss_surf_var_list.append(loss_surf_var)
        loss_vol_var_list.append(loss_vol_var)

    loss_surf_var_list = np.array(loss_surf_var_list)
    loss_vol_var_list = np.array(loss_vol_var_list)

    return model

def train_model(device, model, train_loader, optimizer, scheduler, criterion = 'MSE', reg = 1):
    model.train()
    avg_loss_per_var = torch.zeros(4, device = device)
    avg_loss = 0
    avg_loss_surf_var = torch.zeros(4, device = device)
    avg_loss_vol_var = torch.zeros(4, device = device)
    avg_loss_surf = 0
    avg_loss_vol = 0
    iterNum = 0
    
    for data in train_loader:
        data_clone = data.clone()
        data_clone = data_clone.to(device)   
        optimizer.zero_grad()  
        out = model(data_clone)
        targets = data_clone.y

        if criterion == 'MSE' or criterion == 'MSE_weighted':
            loss_criterion = nn.MSELoss(reduction = 'none')
        elif criterion == 'MAE':
            loss_criterion = nn.L1Loss(reduction = 'none')
        loss_per_var = loss_criterion(out, targets).mean(dim = 0)
        total_loss = loss_per_var.mean()
        loss_surf_var = loss_criterion(out[data_clone.surf, :], targets[data_clone.surf, :]).mean(dim = 0)
        loss_vol_var = loss_criterion(out[~data_clone.surf, :], targets[~data_clone.surf, :]).mean(dim = 0)
        loss_surf = loss_surf_var.mean()
        loss_vol = loss_vol_var.mean()

        if criterion == 'MSE_weighted':            
            (loss_vol + reg*loss_surf).backward()           
        else:
            total_loss.backward()
        
        optimizer.step()
        scheduler.step()
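        # Detach before accumulating so the running sums do not keep the autograd graph alive
        avg_loss_per_var += loss_per_var.detach()
        avg_loss += total_loss.detach()
        avg_loss_surf_var += loss_surf_var.detach()
        avg_loss_vol_var += loss_vol_var.detach()
        avg_loss_surf += loss_surf.detach()
        avg_loss_vol += loss_vol.detach()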
        iterNum += 1

    return avg_loss.cpu().data.numpy()/iterNum, avg_loss_per_var.cpu().data.numpy()/iterNum, avg_loss_surf_var.cpu().data.numpy()/iterNum, avg_loss_vol_var.cpu().data.numpy()/iterNum, \
            avg_loss_surf.cpu().data.numpy()/iterNum, avg_loss_vol.cpu().data.numpy()/iterNum
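
In `global_train`, each epoch draws `hparams['subsampling']` node indices per simulation with `random.sample`, which raises `ValueError` when a simulation has fewer nodes than requested. If that can happen with your data, one possible guard is to cap the sample size, as sketched below. `sample_indices` is a hypothetical helper and is not part of the original example.

```python
import random


def sample_indices(nb_nodes, subsampling, rng=random):
    """Draw up to `subsampling` distinct node indices out of `nb_nodes`."""
    k = min(subsampling, nb_nodes)  # never ask for more indices than exist
    return rng.sample(range(nb_nodes), k)


idx = sample_indices(nb_nodes=10, subsampling=32000)
print(len(idx))  # → 10
```

In the training loop this would replace the direct call `random.sample(range(data_sampled.x.size(0)), hparams['subsampling'])`, with `nb_nodes` set to `data_sampled.x.size(0)`.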


## Example 4 : load trained model

This example is available in submission/load_trained_model

**Note: This type of submission is for information only. To be valid for the final ranking, a submission needs to be trained and evaluated by the compute node.**
We offer the possibility to load a trained model in order to evaluate and score it on the compute node. This can be useful to test the submission while limiting the compute used on the competition side.

In this example we use the same custom model as in the previous example. It recreates the 3b notebook with a pre-trained model.
This submission is composed of 2 files and a folder containing the pre-trained model:
- parameters.json
- my_augmented_simulator.py
- trained_model/

### parameters.json

As we are using the same model as in example 3, we only need to change the `evaluateonly` parameter:
- `evaluateonly: true`
- `scoringonly: false`

The rest of the parameters stay the same and correspond to the parameters needed by the model being loaded:

#### simulator_config :
As we are using a custom simulator, we set:
- `simulator_type: "custom"`
- `simulator_file: "my_augmented_simulator"`

This indicates to the compute node that it needs to load the model from `my_augmented_simulator.py`.

We name the model (the name is used for saving and retrieving models and is not important in this type of submission):
- `name: "MyAugmentedSimulator"`

We then indicate to the compute node which model we are using:
- `model: "AugmentedSimulator"`

This corresponds to the name of the class implemented in `my_augmented_simulator.py`.

In this type of submission all data processing, including scalers, needs to be implemented in the model; we therefore use:
- `scaler_type: "None"`

#### simulator_extra_parameters:
This section is used to pass custom parameters to the model; it has the same form as `training_config`. For this example we use the following:
- `simulator_extra_parameters: {"encoder": [7, 64, 64, 8], "decoder": [8, 64, 64, 4], "nb_hidden_layers": 3, "size_hidden_layers": 64, "batch_size": 1, "nb_epochs": 600, "lr": 0.001, "bn_bool": true, "subsampling": 32000}`

#### training_config:
We do not use special training parameters (in this implementation they are passed directly to the simulator):
- `training_config: {}`

The resulting `parameters.json` file :

In [None]:
{
  "evaluateonly": true,
  "scoringonly": false,
  "simulator_config": {
    "simulator_type": "custom",
    "simulator_file": "my_augmented_simulator",
    "name": "MyAugmentedSimulator",
    "model": "AugmentedSimulator",
    "scaler_type": "None"
   },
  "simulator_extra_parameters": {
    "encoder": [7, 64, 64, 8],
    "decoder": [8, 64, 64, 4],
    "nb_hidden_layers": 3,
    "size_hidden_layers": 64,
    "batch_size": 1,
    "nb_epochs": 600,
    "lr": 0.001,
    "bn_bool": true,
    "subsampling": 32000
  },
  "training_config": {}
}

In order to be saved and loaded, the simulator also needs to implement the following function, which is called while running the ingestion:
- `restore(self, path: str)`




In [None]:
# The following functions are added to the simulator class in order to load the model and the scaler:
    def restore(self, path):
        # `path` is the folder that holds the saved model and scaler
        self.load_model(path_model=os.path.join(path, 'SaveFCModel.pt'), path_scaler=os.path.join(path, 'SaveScaler'))

    def save_model(self, path_model:str,path_scaler:str):
        # Save only the weights (state_dict), then the fitted scaler
        modelWeight=self.model.state_dict()
        torch.save(modelWeight,path_model)
        self.scaler.save(path_scaler)

    def load_model(self, path_model:str,path_scaler:str):
        # map_location lets weights saved on a GPU be loaded on a CPU-only node
        model_loader=torch.load(path_model, map_location=self.device)
        self.model.load_state_dict(model_loader)
        self.model = self.model.to(self.device)
        self.scaler.load(path_scaler)
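
Since `restore` expects specific names inside the folder it receives, it can be worth verifying the contents of `trained_model/` before zipping. The sketch below uses the two names that appear in the `restore` code above (`SaveFCModel.pt` and `SaveScaler`); `missing_artifacts` is a hypothetical helper, and whether `SaveScaler` is a file or a folder depends on the scaler's `save` method, so only existence is checked.

```python
from pathlib import Path

# Names taken from the restore() code above
EXPECTED_ARTIFACTS = ("SaveFCModel.pt", "SaveScaler")


def missing_artifacts(folder, expected=EXPECTED_ARTIFACTS):
    """Return the expected names that do not exist inside `folder`."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).exists()]
```

Calling `missing_artifacts("trained_model")` returns an empty list when both entries are present.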