In [None]:
#| hide
%load_ext autoreload
%autoreload 2

# Custom training
> Customize the training procedure for your models

mlforecast abstracts away most of the training details, which is useful for iterating quickly. However, sometimes you want more control over the fit parameters, the data that goes into the model, etc. This guide shows how you can train a model in a specific way and then give it back to mlforecast to produce forecasts with it.

## Data setup

In [None]:
from mlforecast.utils import generate_daily_series

In [None]:
series = generate_daily_series(5)

## Creating forecast object

In [None]:
import lightgbm as lgb
import numpy as np
from sklearn.linear_model import LinearRegression

from mlforecast import MLForecast

Suppose we want to train a linear regression with the default settings.

In [None]:
fcst = MLForecast(
    models={'lr': LinearRegression()},
    freq='D',
    lags=[1],
    date_features=['dayofweek'],
)

## Generate training set

Use `MLForecast.preprocess` to generate the training data.

In [None]:
prep = fcst.preprocess(series)
prep.head()

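|   | unique_id | ds | y | lag1 | dayofweek |
|---|-----------|------------|----------|----------|-----------|
| 1 | id_0 | 2000-01-02 | 1.423626 | 0.428973 | 6 |
| 2 | id_0 | 2000-01-03 | 2.311782 | 1.423626 | 0 |
| 3 | id_0 | 2000-01-04 | 3.192191 | 2.311782 | 1 |
| 4 | id_0 | 2000-01-05 | 4.148767 | 3.192191 | 2 |
| 5 | id_0 | 2000-01-06 | 5.028356 | 4.148767 | 3 |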
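
To make the `lag1` and `dayofweek` columns concrete, here is a rough pandas sketch of the same two features on a made-up two-series frame. It is an illustration only, not how mlforecast computes them internally; the frame `toy` and its values are invented for the example. The first row of each series has no previous value and is dropped, which is consistent with the index in the output above starting at 1.

```python
import pandas as pd

# Made-up data: two short daily series (names and values are illustrative)
toy = pd.DataFrame({
    'unique_id': ['a'] * 3 + ['b'] * 3,
    'ds': pd.to_datetime(['2000-01-01', '2000-01-02', '2000-01-03'] * 2),
    'y': [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# lag1: previous value of y within the same series
toy['lag1'] = toy.groupby('unique_id')['y'].shift(1)

# dayofweek: Monday=0 ... Sunday=6, taken from the timestamp
toy['dayofweek'] = toy['ds'].dt.dayofweek

# Drop the rows where the lag is undefined (the first row of each series)
manual = toy.dropna(subset=['lag1']).reset_index(drop=True)
```

Shifting within each `unique_id` group matters here: a plain `shift` on the whole column would leak the last value of one series into the first row of the next.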


In [None]:
X = prep.drop(columns=['unique_id', 'ds', 'y'])
y = prep['y']

## Regular training

Since we don't want to do anything special in our training process for the linear regression, we can just call `MLForecast.fit_models`.

In [None]:
fcst.fit_models(X, y)

MLForecast(models=[lr], freq=D, lag_features=['lag1'], date_features=['dayofweek'], num_threads=1)

This has trained the linear regression model, which is now available in the `MLForecast.models_` attribute.

In [None]:
fcst.models_

{'lr': LinearRegression()}
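
Conceptually, this step can be thought of as fitting a copy of each configured estimator and storing the result under its name. The sketch below shows that idea with scikit-learn on made-up data; it is an assumption about the general pattern, not mlforecast's actual implementation, and `X_toy`, `y_toy`, `unfitted` and `fitted` are names invented for the example.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LinearRegression

# Made-up features with a known linear relationship to the target
X_toy = pd.DataFrame({
    'lag1': np.arange(10, dtype=float),
    'dayofweek': np.arange(10) % 7,
})
y_toy = 2.0 * X_toy['lag1'] + 0.5 * X_toy['dayofweek'] + 1.0

# Estimators as configured by the user, keyed by name
unfitted = {'lr': LinearRegression()}

# Fit a fresh copy of each estimator and store it under the same name
fitted = {name: clone(est).fit(X_toy, y_toy) for name, est in unfitted.items()}
```

Because the result is a plain dictionary of fitted estimators, adding another entry later only requires a model that was trained on the same feature columns.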

## Custom training

Now suppose you also want to train a LightGBM model on the same data, but treating the day of the week as a categorical feature and logging the train loss.

In [None]:
model = lgb.LGBMRegressor(n_estimators=100, verbosity=-1)
model.fit(
    X,
    y,
    eval_set=[(X, y)],
    categorical_feature=['dayofweek'],
    callbacks=[lgb.log_evaluation(20)],
);

[20]	training's l2: 0.0823528
[40]	training's l2: 0.0230292
[60]	training's l2: 0.0207829
[80]	training's l2: 0.019675
[100]	training's l2: 0.018778
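
The call above evaluates on the training data itself, so the logged metric is a training loss. If you would rather monitor a held-out loss, one possible approach is to keep the last few rows of each series out of the fit. The sketch below does this on a toy stand-in for the preprocessed frame; the frame `toy`, its values and the choice of `n_valid = 2` are assumptions made for the example.

```python
import pandas as pd

# Toy stand-in for the preprocessed frame; values are made up
toy = pd.DataFrame({
    'unique_id': ['id_0'] * 5 + ['id_1'] * 4,
    'ds': list(pd.date_range('2000-01-02', periods=5))
          + list(pd.date_range('2000-01-02', periods=4)),
    'y': [1.0, 2.0, 3.0, 4.0, 5.0, 2.0, 4.0, 6.0, 8.0],
    'lag1': [0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 2.0, 4.0, 6.0],
})
toy['dayofweek'] = toy['ds'].dt.dayofweek

n_valid = 2  # rows held out at the end of each series

# Position of each row counted from the end of its series (last row is 0);
# this assumes rows are sorted by `ds` within each series
from_end = toy.groupby('unique_id').cumcount(ascending=False)
is_valid = from_end < n_valid

feature_cols = ['lag1', 'dayofweek']
X_train, y_train = toy.loc[~is_valid, feature_cols], toy.loc[~is_valid, 'y']
X_valid, y_valid = toy.loc[is_valid, feature_cols], toy.loc[is_valid, 'y']
```

The held-out pair could then be passed as `eval_set=[(X_valid, y_valid)]` in the `fit` call, with the model trained on `X_train` and `y_train`, so that the logged metric reflects rows the model has not seen.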


## Computing forecasts

Now we just assign this model to the `MLForecast.models_` dictionary. Note that you can assign as many models as you want.

In [None]:
fcst.models_['lgbm'] = model
fcst.models_

{'lr': LinearRegression(), 'lgbm': LGBMRegressor(verbosity=-1)}

And now when calling `MLForecast.predict`, mlforecast will use those models to compute the forecasts.

In [None]:
fcst.predict(1)

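|   | unique_id | ds | lr | lgbm |
|---|-----------|------------|----------|----------|
| 0 | id_0 | 2000-08-10 | 3.549124 | 5.166797 |
| 1 | id_1 | 2000-04-07 | 3.154285 | 4.25249 |
| 2 | id_2 | 2000-06-16 | 2.880933 | 3.224506 |
| 3 | id_3 | 2000-08-30 | 4.061801 | 0.245443 |
| 4 | id_4 | 2001-01-08 | 2.904872 | 2.225106 |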
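
With `lag1` as a feature, a forecast of more than one step needs each prediction to become the lag of the next step. The sketch below illustrates that recursive idea for a single daily series. It is a simplified illustration rather than mlforecast's implementation, and `recursive_forecast` and `toy_predict` are hypothetical names standing in for the library's internals and for a fitted model's `predict` method.

```python
import pandas as pd

def recursive_forecast(predict_fn, last_y, last_ds, horizon):
    """Forecast `horizon` daily steps for one series, feeding predictions back as lag1."""
    rows = []
    lag1, ds = last_y, last_ds
    for _ in range(horizon):
        ds = ds + pd.Timedelta(days=1)
        # Build the feature row for the next timestamp
        features = pd.DataFrame({'lag1': [lag1], 'dayofweek': [ds.dayofweek]})
        # The prediction becomes the lag for the following step
        lag1 = float(predict_fn(features)[0])
        rows.append((ds, lag1))
    return pd.DataFrame(rows, columns=['ds', 'forecast'])

# Hypothetical stand-in for a fitted model: predicts the previous value plus one
def toy_predict(features):
    return (features['lag1'] + 1.0).to_numpy()

out = recursive_forecast(
    toy_predict, last_y=5.0, last_ds=pd.Timestamp('2000-01-06'), horizon=3
)
```

For a horizon of 1, as in the call above, the loop runs once and the only lag used is the last observed value of the series.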
