## Training the DIST_REL_C_05 model and performance analysis

We train a new model whose goal is to predict carbon-carbon bond lengths. It is identical to the DIST_REL_C_02 model (distances in the neighbourhood of the bonds are cut off at 2 Å), except that the input is limited to 15 atoms in the neighbourhood of the bond (down from 58 previously).

Since no bond has more than 15 atoms in its neighbourhood, this should not change the model's performance, but it should speed up training because the network's input layer and hidden layers will be narrower.
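To illustrate why capping the neighbourhood shrinks the network input, here is a minimal sketch. It assumes that each neighbouring atom contributes a fixed-size feature vector and that unused neighbour slots are zero-padded. The function `pad_neighbours` and the value of `FEATURES_PER_ATOM` are hypothetical and are not taken from the project code.

```python
from typing import List

FEATURES_PER_ATOM = 4  # hypothetical value, for illustration only


def pad_neighbours(neighbours: List[List[float]], max_neighbours: int) -> List[float]:
    """Flatten per-atom feature vectors into a fixed-width input, zero-padding unused slots."""
    if len(neighbours) > max_neighbours:
        raise ValueError("bond has more neighbours than the input can hold")
    flat = [value for atom in neighbours for value in atom]
    # Fill the remaining slots with zeros so every bond yields the same width.
    flat.extend([0.0] * ((max_neighbours - len(neighbours)) * FEATURES_PER_ATOM))
    return flat


# Two neighbouring atoms, each described by FEATURES_PER_ATOM made-up values.
bond_neighbours = [[1.0, 0.0, 12.0, 0.44], [0.0, 1.0, 1.0, 0.83]]

wide = pad_neighbours(bond_neighbours, max_neighbours=58)
narrow = pad_neighbours(bond_neighbours, max_neighbours=15)
print(len(wide), len(narrow))  # 232 60
```

Under these assumptions the informative values are the same in both cases; only the amount of padding differs, which is why a smaller cap should leave predictions unchanged while reducing the input width.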

### JSON

```json
{
  "paths":{
    "train_set_loc":"../../data/train_set_riken_v2_reduced.h5",
    "test_set_loc":"../../data/test_set_riken_v2_reduced.h5",
    "train_prepared_input_loc":"../../data/DIST_REL_C_05/train_set_prepared_input.h5",
    "test_prepared_input_loc":"../../data/DIST_REL_C_05/test_set_prepared_input.h5",
    "train_labels_loc":"../../data/DIST_REL_C_05/train_set_labels.h5",
    "test_labels_loc":"../../data/DIST_REL_C_05/test_set_labels.h5",
    "model_loc":"../../models/DIST_REL_C_05/DIST_REL_C_05.tflearn",
    "logs_dir":"../../models/DIST_REL_C_05/logs/"
  },
  "tasks":[
    {
      "prepare_model_data": {
        "selected_mols": {
          "mol_min_size": "2",
          "mol_max_size": "60",
          "max_anum": "9",
          "anum_1": "6",
          "anum_2": "6",
          "min_bond_size": "0",
          "max_bond_size": "1.6",
          "bond_max_neighbours":"15"
        },
        "params": {
          "nb_mol_from_train": "400000",
          "nb_mol_from_test": "80000",
          "pos_class": "True",
          "one_hot_anums": "True",
          "amasses": "True",
          "distances": "True",
          "distances_cut_off": "2",
          "batch_size": "100000",
          "distances_fun":"squareinv"
        }
      }
    },
    {
      "model_train":{
        "model_name":"DIST_REL_C_05",
        "model_type":"NN",
        "params":{
          "epochs":"150",
          "last_layer_width":"225",
          "batch_size":"5000",
          "learning_rate":"0.01",
          "epsilon":"0.001",
          "stddev_init":"0.001",
          "hidden_act":"elu",
          "outlayer_act":"linear",
          "depth":"3",
          "weight_decay":"0.001",
          "gpu_mem_prop":"0.48",
          "save_model":"True",
          "dropout":"0.98"
        }
      }
    }
  ]
}
```
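Every value in the configuration above is stored as a string, including numbers and booleans. The following sketch shows one way such values could be converted to typed Python values after loading. It is not the project's actual loader; the helper `parse_value` is hypothetical, and the embedded JSON is a shortened excerpt of the `model_train` parameters.

```python
import json


def parse_value(raw: str):
    """Convert a configuration string into bool, int, or float, or leave it as str."""
    if raw in ("True", "False"):
        return raw == "True"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


# Shortened excerpt of the configuration, for illustration only.
config_text = """
{
  "model_train": {
    "params": {
      "epochs": "150",
      "learning_rate": "0.01",
      "hidden_act": "elu",
      "save_model": "True"
    }
  }
}
"""

raw_params = json.loads(config_text)["model_train"]["params"]
params = {key: parse_value(value) for key, value in raw_params.items()}
print(params)
```

Trying `int` before `float` keeps integer parameters such as `epochs` as integers, while values like `"0.01"` fall through to `float` and names like `"elu"` remain strings.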

## Stats

### JSON


```json
{
  "paths":{
    "train_prepared_input_loc":"../../data/DIST_REL_C_05/train_set_prepared_input.h5",
    "test_prepared_input_loc":"../../data/DIST_REL_C_05/test_set_prepared_input.h5",
    "train_labels_loc":"../../data/DIST_REL_C_05/train_set_labels.h5",
    "test_labels_loc":"../../data/DIST_REL_C_05/test_set_labels.h5",
    "model_loc":"../../models/DIST_REL_C_05/DIST_REL_C_05.tflearn",
    "logs_dir":"../../models/DIST_REL_C_05/logs/",
    "bonds_lengths_loc":"/home/jleguy/data/stats/CC/CC_bonds_lengths_total_set.h5",
    "plots_dir":"../../figures/DIST_REL_C_05/"
  },
  "tasks":[
    {
      "plot_predictions": {
        "params": {
          "model_name": "DIST_REL_C_05",
          "model_type": "NN",
          "anum_1": "6",
          "anum_2": "6",
          "plot_error_distrib": "True",
          "plot_targets_error_distrib": "True",
          "plot_targets_predictions": "True",
          "batch_size": "10000",
          "last_layer_width": "225",
          "depth": "3",
          "hidden_act": "elu",
          "outlayer_act": "linear",
          "display_plots": "True"
        }
      }
    }
  ]
}
```

### DIST_REL_C_05

#### Error statistics

```
Plotting DIST_REL_C_05
Dataset size : 554434
Mean error : 0.7683516
Median error : 0.6100708
Standard deviation : 0.69729495
Min error : 0.0
Max error : 29.745947
```
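For reference, summary statistics of this kind can be computed as in the sketch below. It assumes the reported error is the absolute difference between prediction and target and that the standard deviation is the population one; the source states neither the exact metric nor its unit. The target and prediction values are made up for illustration.

```python
import statistics

# Hypothetical targets and predictions, for illustration only.
targets = [1.54, 1.34, 1.20, 1.40]
predictions = [1.53, 1.36, 1.20, 1.43]

# Assumption: the error is the absolute difference per bond.
errors = [abs(p - t) for p, t in zip(predictions, targets)]

print("Dataset size :", len(errors))
print("Mean error :", statistics.mean(errors))
print("Median error :", statistics.median(errors))
print("Standard deviation :", statistics.pstdev(errors))
print("Min error :", min(errors))
print("Max error :", max(errors))
```

With absolute errors the minimum can be exactly 0.0, as in the output above, and a mean larger than the median indicates a distribution skewed by a few large errors.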

#### Error distribution

![title](../figures/DIST_REL_C_05/DIST_REL_C_05_distrib_rmse_val.png)

#### Errors as a function of target distances

![title](../figures/DIST_REL_C_05/DIST_REL_C_05_distrib_rmse_dist.png)


#### Predictions as a function of target distances

![title](../figures/DIST_REL_C_05/DIST_REL_C_05_preds_targets.png)