# Train-test splitting with Calendars

The Lilio calendar system was designed with machine learning in mind, so it includes a train-test module that helps generate train/test splits.

Currently this feature is only supported for `xarray` data.
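If your data lives in pandas, one possible route is to convert it to `xarray` first. The sketch below is a minimal illustration with a hypothetical series (the values and the name `precursor1` are made up); it relies only on pandas' `Series.to_xarray`, which requires `xarray` to be installed.

```python
import numpy as np
import pandas as pd

# Hypothetical pandas series with a datetime index named "time"
index = pd.date_range("2015-10-20", periods=5, freq="60D", name="time")
series = pd.Series(np.arange(5.0), index=index, name="precursor1")

# Convert to an xarray.DataArray; the index name becomes the dimension name
# and the series name becomes the DataArray name
data_array = series.to_xarray()
print(data_array)
```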

Let's start by generating some dummy data:

In [None]:
import numpy as np
import pandas as pd
import xarray as xr
import lilio

# Hide the full data when displaying a dataset in the notebook
xr.set_options(display_expand_data=False)

# 50 samples spaced 60 days apart, starting on 2015-10-20
n = 50
time_index = pd.date_range("20151020", periods=n, freq="60D")
time_coord = {"time": time_index}
x1 = xr.DataArray(np.random.randn(n), coords=time_coord, name="precursor1")
x2 = xr.DataArray(np.random.randn(n), coords=time_coord, name="precursor2")
y = xr.DataArray(np.random.randn(n), coords=time_coord, name="target")
print(x1)

Next, we need a calendar, which we use to resample the dummy data:

In [None]:
calendar = lilio.daily_calendar(anchor="10-15", length="180d")
calendar.map_to_data(x1)
x1 = lilio.resample(calendar, x1)
x2 = lilio.resample(calendar, x2)
y = lilio.resample(calendar, y)

print(x1)

Now we are ready to create train and test splits of our data. We set up a splitting strategy (here scikit-learn's `KFold`)
and pass it to `lilio.traintest.TrainTestSplit`.

We can use this cross-validator to split our datasets `x1` and `x2`, as well as the target data `y`:

In [None]:
# Cross-validation
from sklearn.model_selection import KFold
import lilio.traintest

kfold = KFold(n_splits=3)
cv = lilio.traintest.TrainTestSplit(kfold)
for (x1_train, x2_train), (x1_test, x2_test), y_train, y_test in cv.split(x1, x2, y=y):
    print("Train:", x1_train.anchor_year.values)
    print("Test:", x1_test.anchor_year.values)

# Inspect the training data from the last fold
print(x1_train)
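To build intuition for what the `KFold` strategy does on its own, the sketch below applies it directly to a hypothetical array of anchor years. It uses only scikit-learn and NumPy, and illustrates the splitting strategy itself, not how Lilio applies it internally. The `anchor_years` array is made up for the example.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical anchor years, standing in for the `anchor_year` coordinate
anchor_years = np.arange(2016, 2022)  # 2016 through 2021

# Without shuffling (the default), each test fold is a contiguous block
kfold = KFold(n_splits=3)

folds = []
for train_idx, test_idx in kfold.split(anchor_years):
    train_years = anchor_years[train_idx]
    test_years = anchor_years[test_idx]
    folds.append((train_years.tolist(), test_years.tolist()))
    print("Train:", train_years, "Test:", test_years)
```

Each year appears in exactly one test fold, so every year is held out once across the three splits.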

The same `cv.split` loop can be written more compactly by passing the inputs as an unpacked list:

In [None]:
# Alternative using shorthand notation
x = [x1, x2]
for x_train, x_test, y_train, y_test in cv.split(*x, y=y):
    x1_train, x2_train = x_train
    x1_test, x2_test = x_test
    print("Train:", x1_train.anchor_year.values)
    print("Test:", x1_test.anchor_year.values)

Now you are ready to train your models!
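As a rough sketch of what per-fold training could look like, the example below fits a scikit-learn `LinearRegression` on each training fold and scores it on the held-out fold. It runs on synthetic NumPy arrays (`X_demo` and `y_demo` are hypothetical) and slices them by index. In a Lilio workflow, the train and test arrays returned by `cv.split` would take the place of this index-based slicing; how to reshape them into 2-D feature arrays depends on your data and is not shown here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Hypothetical data: 12 anchor years, 2 precursor features, 1 target per year
n_years = 12
X_demo = rng.standard_normal((n_years, 2))
y_demo = X_demo @ np.array([1.5, -0.5]) + 0.1 * rng.standard_normal(n_years)

scores = []
for train_idx, test_idx in KFold(n_splits=3).split(X_demo):
    # Fit on the training years only, then evaluate on the held-out years
    model = LinearRegression().fit(X_demo[train_idx], y_demo[train_idx])
    scores.append(model.score(X_demo[test_idx], y_demo[test_idx]))  # R^2

print("R^2 per fold:", np.round(scores, 3))
```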