## Solving Time Series Forecasting Tasks

First, import the `AutoMLTimeSeries` class.

In [1]:
import numpy as np
import pandas as pd
from alpha_automl import AutoMLTimeSeries

### Generating Pipelines for CSV Datasets

In this example, we are generating pipelines for a CSV dataset. The [LL1_736_stock_market dataset](https://datasets.datadrivendiscovery.org/d3m/datasets/-/tree/master/seed_datasets_current/LL1[…]stock_market_MIN_METADATA/TRAIN/dataset_TRAIN/tables) is used for this example.

In [2]:
output_path = 'tmp/'
train_dataset = pd.read_csv('datasets/time_series/train.csv')
test_dataset = pd.read_csv('datasets/time_series/test.csv')

Select the rows for the stock we would like to predict and keep only the columns we need.

In [3]:
stock_name = "ebay"
train_data = train_dataset[train_dataset['Company'] == stock_name].reset_index(drop=True)
test_data = test_dataset[test_dataset['Company'] == stock_name].reset_index(drop=True)

X_train = train_data[["Date", "Close"]]
y_train = train_data[["Close"]]
X_test = test_data[["Date", "Close"]]
y_test = test_data[["Close"]]

X_train

index,Date,Close
0,1/4/1999,4.2088
1,1/5/1999,3.9310
2,1/6/1999,4.9621
3,1/7/1999,5.2273
4,1/8/1999,5.1305
...,...,...
3707,6/2/2016,24.2500
3708,6/3/2016,23.9800
3709,6/6/2016,23.9900
3710,6/7/2016,24.2800
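
The `Date` column in the output above holds month/day/year strings. Before searching, it can be useful to parse those strings and confirm that the rows are in chronological order. The following is a minimal sketch of that check; it uses a small hand-made frame that copies the first rows shown above rather than the real dataset, and the variable names are illustrative.

```python
import pandas as pd

# Illustrative sample only: the first rows of the X_train output above.
sample = pd.DataFrame({
    "Date": ["1/4/1999", "1/5/1999", "1/6/1999", "1/7/1999", "1/8/1999"],
    "Close": [4.2088, 3.9310, 4.9621, 5.2273, 5.1305],
})

# Parse month/day/year strings into proper timestamps.
parsed_dates = pd.to_datetime(sample["Date"], format="%m/%d/%Y")

# Forecasting assumes the rows are ordered in time.
is_chronological = parsed_dates.is_monotonic_increasing

# Gaps larger than one day would indicate weekends or holidays.
day_gaps = parsed_dates.diff().dt.days.dropna().tolist()

print(is_chronological)
print(day_gaps)
```

The same two checks can be applied to `X_train["Date"]` once the real CSV files are loaded.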


### Searching Pipelines

In [4]:
automl = AutoMLTimeSeries(output_path, time_bound=10, verbose=False, 
                          date_column="Date", target_column="Close",
                          split_strategy_kwargs={'n_splits': 3, 'test_size': 20})
automl.fit(X_train, y_train)

INFO:gluonts.mx.context:Using CPU
INFO:lightning_fabric.utilities.seed:[rank: 0] Global seed set to 1
INFO:alpha_automl.builtin_primitives.time_series_forecasting:Estimated differencing term: 0
Performing stepwise search to minimize aic
 ARIMA(2,0,2)(0,0,0)[0]             : AIC=inf, Time=0.89 sec
 ARIMA(0,0,0)(0,0,0)[0]             : AIC=12851.608, Time=0.01 sec
...
INFO:gluonts.trainer:Number of parameters in DeepARTrainingNetwork: 25884
INFO:gluonts.trainer:Epoch[0] Elapsed time 5.936 seconds
INFO:gluonts.trainer:Epoch[0] Evaluation metric 'epoch_loss'=3.586218
...
INFO:gluonts.trainer:Epoch[4] Elapsed time 5.613 seconds
INFO:gluonts.trainer:Epoch[4] Evaluation metric 'epoch_loss'=3.243720
INFO:gluonts.trainer:End model training
...
INFO:alpha_automl.automl_api:Scored pipeline, score=1
...
INFO:alpha_automl.automl_api:Scored pipeline, score=2.8773987019881787e-13
INFO:alpha_automl.automl_api:Found pipeline, time=0:04:03, scoring...
...
INFO:alpha_automl.automl_api:Scored pipeline, score=33.41865672031425
INFO:alpha_automl.automl_api:Found pipeline, time=0:06:15, scoring...
INFO:alpha_automl.automl_api:Scored pipeline, score=0.00010179999999999514
INFO:alpha_automl.automl_api:Found pipeline, time=0:06:15, scoring...
...
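
The search above is configured with `split_strategy_kwargs={'n_splits': 3, 'test_size': 20}`. The sketch below shows what those two values would mean if they are forwarded to scikit-learn's `TimeSeriesSplit`. That forwarding is an assumption made for illustration and is not confirmed by this notebook. The row count of 3711 comes from the `X_train` index shown earlier (0 to 3710).

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

n_rows = 3711  # X_train above has index 0..3710

# Assumption: the kwargs reach a splitter equivalent to this one.
splitter = TimeSeriesSplit(n_splits=3, test_size=20)

folds = []
for train_idx, test_idx in splitter.split(np.arange(n_rows)):
    # Each fold trains on every row that precedes its 20-row validation window.
    folds.append((len(train_idx), int(test_idx[0]), int(test_idx[-1])))
    print(f"train rows: 0..{train_idx[-1]}, "
          f"validation rows: {test_idx[0]}..{test_idx[-1]}")
```

Under that assumption, each candidate pipeline would be validated on three consecutive 20-row windows taken from the end of the training data, and it would never be trained on rows that come after the window it is scored on.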

### Exploring Pipelines

After the pipeline search is complete, we can display the leaderboard:

In [5]:
automl.plot_leaderboard()

ranking,pipeline,mean_squared_error
1,"ColumnTransformer, OrdinalEncoder, RobustScaler, LinearRegression",0.0
2,"ColumnTransformer, OrdinalEncoder, MaxAbsScaler, TheilSenRegressor",0.0
3,"ColumnTransformer, CyclicalFeature, MaxAbsScaler, LinearRegression",0.0
4,"ColumnTransformer, Datetime64ExpandEncoder, MaxAbsScaler, LinearRegression",0.0
5,"ColumnTransformer, OrdinalEncoder, MaxAbsScaler, LinearRegression",0.0
6,"ColumnTransformer, OrdinalEncoder, MaxAbsScaler, RANSACRegressor",0.0
7,"ColumnTransformer, OrdinalEncoder, MaxAbsScaler, Lars",0.0
8,"ColumnTransformer, OrdinalEncoder, ARDRegression",0.0
9,"ColumnTransformer, OrdinalEncoder, MaxAbsScaler, ARDRegression",0.0
10,"ColumnTransformer, Datetime64ExpandEncoder, MaxAbsScaler, ARDRegression",0.0


To explore the generated pipelines, we can use [PipelineProfiler](https://github.com/VIDA-NYU/PipelineVis), a visualization tool for comparing and exploring the pipelines produced by the AlphaAutoML system.

Once the search has finished, PipelineProfiler can be launched with:

In [None]:
automl.plot_comparison_pipelines()

For more information about how to use PipelineProfiler, see [this article](https://towardsdatascience.com/exploring-auto-sklearn-models-with-pipelineprofiler-5b2c54136044). A video demo is also available [here](https://www.youtube.com/watch?v=2WSYoaxLLJ8).

### Testing Pipelines

Pipeline predictions are accessed with:

In [6]:
y_pred = automl.predict(X_test)
y_pred

array([[24.33 ],
       [24.065],
       [23.89 ],
       [23.88 ],
       [23.96 ],
       [23.85 ],
       [23.79 ],
       [24.57 ],
       [24.7  ],
       [24.36 ],
       [24.85 ],
       [23.13 ],
       [22.72 ],
       [22.99 ],
       [23.31 ],
       [23.41 ],
       [23.78 ],
       [23.76 ],
       [23.83 ],
       [23.93 ],
       [24.61 ],
       [24.86 ],
       [25.13 ],
       [25.12 ],
       [26.09 ],
       [26.34 ],
       [26.49 ],
       [26.5  ],
       [26.99 ],
       [29.93 ],
       [30.49 ],
       [30.68 ],
       [31.4  ],
       [31.31 ],
       [31.17 ],
       [31.16 ],
       [31.25 ],
       [30.79 ],
       [30.95 ],
       [31.06 ],
       [31.39 ],
       [31.15 ],
       [31.11 ],
       [31.12 ],
       [31.2  ],
       [30.89 ],
       [31.05 ],
       [30.83 ],
       [30.61 ],
       [30.52 ],
       [30.63 ],
       [30.62 ],
       [30.67 ],
       [31.25 ],
       [31.34 ],
       [31.31 ],
       [31.4  ],
       [31.77 ],
       [32.16 
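
The predictions come back as a two-dimensional array with one column, which is awkward to read next to the dates. A minimal sketch of one way to line predictions up with dates and actual prices follows. The three predicted values are copied from the output above, while the dates and actual prices are hypothetical placeholders for the test set.

```python
import numpy as np
import pandas as pd

# First three predictions from the output above, shape (n, 1).
y_pred_sample = np.array([[24.33], [24.065], [23.89]])

# Hypothetical placeholders: these are NOT taken from the real test set.
dates_sample = pd.to_datetime(["2016-06-08", "2016-06-09", "2016-06-10"])
y_true_sample = np.array([24.30, 24.10, 23.80])

comparison = pd.DataFrame({
    "Date": dates_sample,
    "Predicted": y_pred_sample.ravel(),  # flatten (n, 1) to (n,)
    "Actual": y_true_sample,
})
# Signed error per row: positive means the prediction was too high.
comparison["Error"] = comparison["Predicted"] - comparison["Actual"]

print(comparison)
```

With the real data, `dates_sample` and `y_true_sample` would be replaced by `X_test["Date"]` and `y_test["Close"]`.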

The pipeline can be evaluated against a held-out dataset with:

In [7]:
automl.score(X_test, y_test)

INFO:alpha_automl.automl_api:Metric: mean_squared_error, Score: 1.3176959065629877e-28


{'metric': 'mean_squared_error', 'score': 1.3176959065629877e-28}
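
The reported metric is the mean squared error: the average of the squared differences between actual and predicted values. The sketch below computes it by hand with NumPy on made-up numbers so the arithmetic is easy to follow. The helper is illustrative and is not the function the library calls internally.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up numbers, not taken from the dataset.
actual = [24.0, 25.0, 26.0]
predicted = [[24.5], [25.0], [25.0]]  # shape (n, 1), like the predict output

# Squared errors are 0.25, 0.0 and 1.0, so the mean is 1.25 / 3.
mse = mean_squared_error(actual, predicted)
print(mse)
```

A score on the order of 1e-28, as reported above, is zero up to floating-point rounding. This may reflect that `X_test` includes the `Close` column, which is also the target.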