# `split` tutorial

Welcome to the `split` module tutorial. Here, we
show how to use the `split` module to generate a training set and
a testing set from a trajectory.

We need to load all the packages, including the `split` module. 

## Loading modules

Here, we import the modules needed to load `split` and use it properly (some are optional).

In [None]:
import sys
import os
from pprint import pprint as pp # For pretty printing, optional

Now, we need to load the `split` module.

In [None]:
from amppcmt.split import Train_test_split

## Building the `Train_test_split` object

We need to build the `Train_test_split` object. In this example, we use the variable `trajectory` to store the object, and the `CO_disso.traj` file for demonstration. We also use the default split: 75% of the trajectory for the training set and 25% for the testing set. This share can be changed with the optional argument `training_set_percentage`, given as a fraction (0.75 here).

In [None]:
trajectory = Train_test_split(
    trajectory = 'CO_disso.traj',
    training_set_percentage = 0.75
)

## Split the trajectory

### Randomly

The initial trajectory can then be split randomly using the `split_sets_random()` method of the `Train_test_split` class.

In [None]:
trajectory.split_sets_random()
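
To illustrate the idea behind a random split, here is a minimal, standalone sketch that works on frame indices only. It is not the implementation used by `Train_test_split`: the function name `random_split`, the `seed` parameter, and the rounding rule (floor) are assumptions made for this example.

```python
import random


def random_split(n_frames, training_fraction=0.75, seed=0):
    """Hypothetical helper: randomly split frame indices into two sets."""
    indices = list(range(n_frames))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(indices)
    # Assumed rounding rule: floor of n_frames * training_fraction
    n_train = int(n_frames * training_fraction)
    train_idx = sorted(indices[:n_train])
    test_idx = sorted(indices[n_train:])
    return train_idx, test_idx


train_idx, test_idx = random_split(100)
print(len(train_idx), len(test_idx))  # 75 25
```

Every frame ends up in exactly one of the two sets, but frames that are neighbours in time can land on opposite sides of the split.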

### Sequentially (time-splitting)

This method is better suited to data from *ab initio* molecular dynamics simulations, where consecutive frames are strongly correlated. It prevents near-identical configurations from appearing in both the training and the testing set. Such similarity contaminates the testing set and creates artefacts in the results.
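
The reasoning above can be sketched on frame indices alone. This is only an illustration under stated assumptions, not the module's code: the helper name `time_split` is hypothetical, the earliest frames are assumed to go to the training set, and the split point is assumed to be rounded down.

```python
def time_split(n_frames, training_fraction=0.75):
    """Hypothetical helper: split frame indices at a single point in time."""
    # Assumed rounding rule: floor of n_frames * training_fraction
    n_train = int(n_frames * training_fraction)
    train_idx = list(range(n_train))            # earliest frames (assumption)
    test_idx = list(range(n_train, n_frames))   # latest frames
    return train_idx, test_idx


train_idx, test_idx = time_split(8)
print(train_idx)  # [0, 1, 2, 3, 4, 5]
print(test_idx)   # [6, 7]
```

Because the two sets are contiguous blocks, only the frames next to the split point have a close neighbour in the other set.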

Time-splitting is done using the `split_sets_time()` method of the `Train_test_split` class.

In [None]:
trajectory.split_sets_time()

Three files were generated:

- `CO_disso_train.traj`:
    Containing the data for the training set
- `CO_disso_test.traj`:
    Containing the data for the testing set
- `amp_split.log`:
    Logfile containing the information about the treatment

We are now done with this module. We hope that you enjoyed it.

> This module is under development. Please refer to the 'In development' page of the documentation for more details.