AutoML searches for the best model and hyperparameter combination over several trials. At the end of the experiment, instead of returning only the single best model, it can also build a **model ensemble** from the top-k models (``topk``) to improve the generalization of the pipeline.

HyperTS uses a mechanism called GreedyEnsemble for model ensembling. It proceeds as follows [1]:

1. Start with the empty ensemble.

2. Add to the ensemble the model in the library that maximizes the ensemble’s performance on the error metric, measured on a hillclimb (validation) set.

3. Repeat Step 2 for a fixed number of iterations or until all the models have been used.

4. Return the ensemble from the nested set of ensembles that has maximum performance on the hillclimb (validation) set.

References

[1] Caruana, Rich, et al. "Ensemble selection from libraries of models." In ICML, 2004.
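The four steps above can be sketched in a few lines of plain Python. This is only an illustration of the procedure as listed, not the GreedyEnsemble implementation used by HyperTS: the function names (`greedy_ensemble`, `average`, `mae`), the use of a simple unweighted average to combine predictions, and the toy data are all assumptions made for the example. Since the metric here is an error, "maximizing performance" means picking the candidate with the lowest validation error.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error (lower is better)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def average(preds: List[Sequence[float]]) -> List[float]:
    """Element-wise mean of several prediction vectors."""
    return [sum(vals) / len(vals) for vals in zip(*preds)]


def greedy_ensemble(
    model_preds: Dict[str, Sequence[float]],
    y_val: Sequence[float],
    max_size: int,
    error: Callable[[Sequence[float], Sequence[float]], float] = mae,
) -> Tuple[List[str], float]:
    """Hypothetical helper: greedy forward selection on a validation set."""
    remaining = list(model_preds)   # models not yet in the ensemble
    chosen: List[str] = []          # Step 1: start with the empty ensemble
    history: List[Tuple[float, List[str]]] = []

    # Step 3: repeat for a fixed number of iterations or until no models remain
    while remaining and len(chosen) < max_size:
        # Step 2: add the model that gives the lowest validation error
        best = min(
            remaining,
            key=lambda name: error(
                y_val, average([model_preds[m] for m in chosen + [name]])
            ),
        )
        chosen.append(best)
        remaining.remove(best)
        score = error(y_val, average([model_preds[m] for m in chosen]))
        history.append((score, list(chosen)))

    # Step 4: return the best of the nested ensembles
    best_score, best_members = min(history, key=lambda item: item[0])
    return best_members, best_score


# Toy validation targets and per-model validation predictions (made up).
y_val = [10.0, 12.0, 14.0, 16.0]
preds = {
    "a": [11.0, 13.0, 15.0, 17.0],  # always 1 too high
    "b": [9.0, 11.0, 13.0, 15.0],   # always 1 too low
    "c": [14.0, 16.0, 18.0, 20.0],  # always 4 too high
}
members, score = greedy_ensemble(preds, y_val, max_size=3)
print(members, score)
```

In this toy case the errors of `a` and `b` cancel when averaged, so the two-model ensemble beats both the single best model and the full three-model ensemble, which is why Step 4 looks back over all nested ensembles rather than returning the last one.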

**Example usage:**

#### 1. Prepare Dataset

In [1]:
from hyperts.datasets import load_network_traffic
from sklearn.model_selection import train_test_split

In [2]:
df = load_network_traffic(univariate=True)
train_data, test_data = train_test_split(df, test_size=168, shuffle=False)
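With `shuffle=False` and an integer `test_size`, `train_test_split` keeps the rows in their original order and holds out the last `test_size` rows, which is what a forecasting task needs. The sketch below shows the equivalent slicing in plain Python; `chronological_split` is a hypothetical helper written for this illustration, and the integer list stands in for the real time-indexed rows.

```python
from typing import List, Sequence, Tuple


def chronological_split(rows: Sequence, test_size: int) -> Tuple[List, List]:
    """Hold out the last `test_size` rows, keeping the original order."""
    if not 0 < test_size < len(rows):
        raise ValueError("test_size must be between 1 and len(rows) - 1")
    cut = len(rows) - test_size
    return list(rows[:cut]), list(rows[cut:])


# Stand-in for 1000 time-ordered observations.
timestamps = list(range(1000))
train_rows, test_rows = chronological_split(timestamps, test_size=168)

# Every training row precedes every test row, so no future data leaks
# into training.
print(len(train_rows), len(test_rows), train_rows[-1], test_rows[0])
```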

#### 2. Create Experiment and Run

In [3]:
from hyperts import make_experiment

Set the ``ensemble_size`` parameter to control the number of models in the ensemble.

In [4]:
experiment = make_experiment(train_data=train_data.copy(),
                             task='forecast',
                             mode='dl',
                             timestamp='TimeStamp',
                             covariates=['HourSin', 'WeekCos', 'CBWD'],
                             forecast_train_data_periods=24*12,
                             max_trials=10,
                             ensemble_size=5)
model = experiment.run()




In [5]:
model.get_params

<bound method Pipeline.get_params of Pipeline(steps=[('data_preprocessing',
                 TSFDataPreprocessStep(covariate_cleaner=CovariateTransformer(covariables=['HourSin',
                                                                                           'WeekCos',
                                                                                           'CBWD'],
                                                                              data_cleaner_args={'correct_object_dtype': False,
                                                                                                 'int_convert_to': 'str'}),
                                       covariate_cleaner__covariables=['HourSin',
                                                                       'WeekCos',
                                                                       'CBWD'],
                                       covariate_cleaner__data_cleaner_args={'correct_object_dtype': False,
                

#### 3. Inference and Evaluation

In [6]:
X_test, y_test = model.split_X_y(test_data.copy())
forecast = model.predict(X_test)
results = model.evaluate(y_true=y_test, y_pred=forecast)
results

|   | Metric | Score  |
|---|--------|--------|
| 0 | mae    | 2.5579 |
| 1 | mse    | 15.8978 |
| 2 | rmse   | 3.9872 |
| 3 | mape   | 0.4066 |
| 4 | smape  | 0.3483 |
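For readers unfamiliar with the metric names in the table, the sketch below computes them from their usual textbook definitions. It is an assumption that HyperTS uses exactly these formulas; in particular, sMAPE has several conventions, and the one shown (a factor of 2 over the sum of absolute values) is only one of them. The function name `forecast_metrics` and the toy values are made up for illustration. The relationship rmse = sqrt(mse) does hold in the table above, since 3.9872² ≈ 15.8978.

```python
import math
from typing import Dict, Sequence


def forecast_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Hypothetical helper: common point-forecast error metrics."""
    n = len(y_true)
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in abs_err) / n
    return {
        "mae": sum(abs_err) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
        # MAPE is undefined when a true value is 0; not handled in this sketch.
        "mape": sum(e / abs(t) for e, t in zip(abs_err, y_true)) / n,
        # One common sMAPE convention (assumed here).
        "smape": sum(
            2 * e / (abs(t) + abs(p)) for e, t, p in zip(abs_err, y_true, y_pred)
        ) / n,
    }


# Toy values, unrelated to the experiment above.
scores = forecast_metrics(y_true=[2.0, 4.0], y_pred=[1.0, 5.0])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```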
