# Giotto-Time

Welcome to `giotto-time`, our new library for time series forecasting!

Let's start with an example.

## First example

### Ingredients

These are the main ingredients of `giotto-time`:

In [1]:
from giottotime.time_series_preparation import TimeSeriesPreparation
from giottotime.feature_creation import FeatureCreation, ShiftFeature, MovingAverageFeature
from giottotime.model_selection import FeatureSplitter
from giottotime.models import GAR

- `TimeSeriesPreparation`: checks the input format of the time series and converts it to the expected format.
- `FeatureCreation`, `ShiftFeature`, `MovingAverageFeature`: create the features on the time series that are used for forecasting.
- `FeatureSplitter`: prepares the custom `giotto-time` train-test matrices that are used in the model.
- `GAR`: generalized auto-regressive model. This is the only time series model provided in the first release.

We also need a `scikit-learn` model. For this example we use a standard linear regressor.

In [2]:
from sklearn.linear_model import LinearRegression

### Data

We use the `pandas.util.testing` module to create a test time series with 500 daily observations in a single column.

In [3]:
def test_time_series():
    from pandas.util import testing

    # N is the number of rows and K the number of columns of the generated frame
    testing.N, testing.K = 500, 1
    df = testing.makeTimeDataFrame(freq="D")  # daily frequency
    return df

In [4]:
time_series = test_time_series()
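
`pandas.util.testing` was deprecated in pandas 1.0 and is no longer available in recent releases, so the helper above may fail on a current installation. A minimal alternative sketch follows, assuming all that is needed is 500 daily observations of random values in one column. The function name `make_test_time_series`, the column name `A`, the start date, and the seed are illustrative choices, not part of `giotto-time`.

```python
import numpy as np
import pandas as pd


def make_test_time_series(n_periods=500, seed=0):
    """Return a one-column DataFrame of random values on a daily DatetimeIndex."""
    rng = np.random.default_rng(seed)  # seeded for reproducibility
    index = pd.date_range("2000-01-01", periods=n_periods, freq="D")
    return pd.DataFrame({"A": rng.standard_normal(n_periods)}, index=index)


time_series = make_test_time_series()
```

The resulting frame has a `DatetimeIndex`, so it still needs the preparation step described in the next section.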

### Time Series Preparation

The input time series has to be a `pandas.DataFrame` with a `PeriodIndex`. Use the provided class `TimeSeriesPreparation` to convert the time series into this format.

In [5]:
time_series_preparation = TimeSeriesPreparation()

In [6]:
period_index_time_series = time_series_preparation.transform(time_series)
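
For intuition about the target format, the conversion from a `DatetimeIndex` to a `PeriodIndex` can be reproduced with plain pandas. This sketch uses `DataFrame.to_period` on a small made-up frame `df`; it is not necessarily what `TimeSeriesPreparation` does internally, and it covers only one possible input format.

```python
import pandas as pd

# A tiny daily series with a DatetimeIndex (illustrative data).
df = pd.DataFrame(
    {"A": [0.1, 0.2, 0.3]},
    index=pd.date_range("2000-01-01", periods=3, freq="D"),
)

# Convert timestamps to daily periods; the values are left unchanged.
period_df = df.to_period(freq="D")
```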

### Feature Creation

Feature creation is one of the core parts of our library and the bridge between traditional time series forecasting techniques and machine learning.

Starting with a time series in a `pandas.DataFrame`, we create two matrices `X` and `y` which can be used for training and testing.

We provide 12 different features. For simplicity we train a model using only `ShiftFeature` and `MovingAverageFeature`. 

`ShiftFeature` provides a temporal shift of the time series. Using two `ShiftFeature` instances with shifts 1 and 2 as the only inputs of a linear regression is equivalent to an `AR(2)` model.

Because you choose which features to add, you can tailor the model to your data.
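
To make the idea of shift and moving-average features concrete, here is a rough pandas-only sketch of the columns they correspond to. It uses `Series.shift` and `Series.rolling` on a made-up series and reuses the output names from this example. Whether the library's moving average includes the current observation is an assumption of this sketch, not something confirmed here.

```python
import pandas as pd

series = pd.Series([1.0, 2.0, 4.0, 7.0, 11.0], name="A")

X = pd.DataFrame(
    {
        # Value observed one and two steps earlier (the AR(2) regressors).
        "shift_1": series.shift(1),
        "shift_2": series.shift(2),
        # Mean of the current value and the two preceding ones (assumed window).
        "moving_average_3": series.rolling(3).mean(),
    }
)
```

The first rows contain `NaN` because there is not enough history to compute the lagged values.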

In [7]:
features = [
    ShiftFeature(1, output_name='shift_1'),
    ShiftFeature(2, output_name='shift_2'),
    MovingAverageFeature(3, output_name='moving_average_3'),
]

In [8]:
feature_creation = FeatureCreation(time_series_features=features, horizon=3)

In [9]:
features_X, features_y = feature_creation.fit_transform(period_index_time_series)
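
The role of `horizon` can be illustrated with a small pandas sketch of a target matrix. It assumes that column `y_k` holds the value observed `k` steps after each row, which matches the column names in the forecast output of this example but is not taken from the library's source. The series is made up.

```python
import pandas as pd

series = pd.Series(range(10), dtype=float, name="A")
horizon = 3

# Column y_k holds the value observed k steps after each row's time step.
y = pd.DataFrame({f"y_{k}": series.shift(-k) for k in range(1, horizon + 1)})

# The last `horizon` rows have at least one target that is not known yet.
incomplete_rows = y.isna().any(axis=1)
```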

### Train-Test split

We use `FeatureSplitter` to split the matrices `X` and `y` into train and test sets.

In [10]:
feature_splitter = FeatureSplitter()

In [11]:
X_train, y_train, X_test, y_test = feature_splitter.transform(features_X, features_y)
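
One plausible way to picture this split, given that the forecast in this example has one row per horizon step, is the following pandas sketch. It assumes that the test set consists of the rows whose targets are not fully known and that training rows must have complete features and targets. This is an illustration on made-up data, not the `FeatureSplitter` implementation.

```python
import pandas as pd

series = pd.Series(range(10), dtype=float)
X = pd.DataFrame({"shift_1": series.shift(1), "shift_2": series.shift(2)})
y = pd.DataFrame({f"y_{k}": series.shift(-k) for k in (1, 2, 3)})

# Rows whose targets are not fully known yet form the test set.
is_test = y.isna().any(axis=1)
# Training rows need complete features and complete targets.
is_train = ~is_test & X.notna().all(axis=1)

X_train, y_train = X[is_train], y[is_train]
X_test, y_test = X[is_test], y[is_test]
```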

### Training

We provide a `GAR` (Generalized Auto Regressive) model to forecast the time series.

The traditional `AR` model is equivalent to our `GAR` model that uses only `ShiftFeature` columns in the `X` matrix.
`GAR` supports all the features compatible with the feature creation step.

Moreover, `GAR` uses a `scikit-learn` compatible model for the underlying regression.
In this example we use `LinearRegression`. In principle, any regressor that follows the `scikit-learn` `fit`/`predict` interface is compatible (e.g. ridge regression, random forest, boosting, etc.).

In [12]:
model = GAR(base_model=LinearRegression())

In [13]:
model = model.fit(X_train, y_train)
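
The idea of wrapping an arbitrary regressor can be sketched as follows, assuming one independent copy of the base model is fitted per forecast step. Both classes are hypothetical stand-ins written for this illustration: `LeastSquares` mimics a `fit`/`predict` regressor using NumPy only, and `PerHorizonRegressor` is not the `GAR` implementation.

```python
import copy

import numpy as np


class LeastSquares:
    """Tiny stand-in for a scikit-learn style regressor (fit/predict only)."""

    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        A = np.column_stack([np.ones(len(X)), X])
        return A @ self.coef_


class PerHorizonRegressor:
    """Hypothetical wrapper: one independent copy of base_model per target column."""

    def __init__(self, base_model):
        self.base_model = base_model

    def fit(self, X, Y):
        self.models_ = [
            copy.deepcopy(self.base_model).fit(X, Y[:, k]) for k in range(Y.shape[1])
        ]
        return self

    def predict(self, X):
        # One output column per forecast step, in the same order as Y's columns.
        return np.column_stack([m.predict(X) for m in self.models_])


# Noise-free toy data: each target column is an exact linear function of X.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2))
Y = np.column_stack([1.0 + X @ np.array([2.0, -1.0]), 3.0 - X[:, 0]])

model = PerHorizonRegressor(LeastSquares()).fit(X, Y)
predictions = model.predict(X)
```

Because the wrapper only calls `fit` and `predict`, any object exposing those two methods could be swapped in for `LeastSquares`.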

### Forecasting

We forecast 3 time steps of the time series (we set this parameter in `FeatureCreation`).

The output has the following format:
- the index is the time step at which the prediction is made.
- the column `y_1` is the prediction one time step ahead, `y_2` two steps ahead, and `y_3` three steps ahead.

In [14]:
predictions = model.predict(X_test)

In [15]:
predictions

| | y_1 | y_2 | y_3 |
|---|---|---|---|
| 2001-05-12 | -0.149298 | -0.164899 | -0.092473 |
| 2001-05-13 | -0.150681 | -0.08571 | -0.063871 |
| 2001-05-14 | -0.066199 | -0.134353 | -0.095745 |
