Request to provide a tutorial of some sort to implement AutoML variants of ETS and prophet for univariate forecasting #40

riteshchhetri10 · 2021-10-29T11:44:02Z

Hey, I have been going through the paper "Merlion: A Machine Learning Library for Time Series" and came across AutoML variants of the ETS and Prophet models for univariate forecasting. It would be a great help if you could provide a tutorial for implementing them on a simple univariate dataset such as the air-passengers dataset.
I have also tried AutoSarima from the merlion.models.automl module on the same dataset, but it gives much larger errors than the auto_arima model from the pmdarima library and even the basic statsmodels.tsa SARIMAX methods.
What could be the reason, given that the air-passengers dataset isn't very complicated to forecast?

My code for AutoSarima:

max_iter = [10,20,50,100,200,400,1000]
list_autosarima_merlion_models = []  #stores all models with diff parameters
parameters_autosarima_merlion_models = [] #stores different params used for diff models

for mi in max_iter:
    config1 = AutoSarimaConfig(max_forecast_steps=len(test_df), order=("auto", "auto", "auto"),
                           seasonal_order=("auto", "auto", "auto", 12), approximation=True, maxiter=mi)
    model1  = SeasonalityLayer(model = AutoSarima(model = Sarima(config1)))
    train_pred, train_err = model1.train(train_df_merlion, train_config={"enforce_stationarity": True,"enforce_invertibility": True})
    list_autosarima_merlion_models.append(model1)
    parameters_autosarima_merlion_models.append(f'{mi} maximum iterations')

Link to the paper that I had gone through.
https://arxiv.org/abs/2109.09265

aadyotb · 2021-10-29T18:28:18Z

@chenghaoliu89, can you investigate this issue with AutoSARIMA?

@riteshchhetri10 re: AutoProphet and AutoETS, you may see the linked API docs. Prophet has a parameter add_seasonality which can be set to "auto", and ETS has parameter seasonal_periods which can be set to "auto" as well to enable auto seasonality detection. We plan to integrate these more directly into the AutoML framework and add the tutorials you requested in a future release.

aadyotb · 2021-10-29T18:30:05Z

@riteshchhetri10, would you mind including a code snippet of how you are computing the error as well?

riteshchhetri10 · 2021-10-30T10:25:07Z

from merlion.evaluate.forecast import ForecastMetric
from merlion.utils import TimeSeries

test_df_merlion = TimeSeries.from_pd(test_df)
test_pred, test_err = model.forecast(len(test_df))
rmse = ForecastMetric.RMSE.value(ground_truth=test_df_merlion, predict=test_pred)
mae = ForecastMetric.MAE.value(ground_truth=test_df_merlion, predict=test_pred)

# Here test_df is just a pandas DataFrame containing the univariate air-passengers dataset (~44 rows).
# The RMSE values were used for comparison between the pmdarima auto_arima, statsmodels.tsa SARIMAX, and kats.models.sarima models.

Merlion AutoSarima did not give good results based on RMSE values for the air-passengers dataset.
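
When comparing forecasts across several libraries, it can help to compute the metrics with one shared helper so that every model is scored identically. The following is a minimal sketch in plain Python; the function names and the sample numbers are illustrative assumptions and are not results from this thread.

```python
import math


def rmse(ground_truth, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(ground_truth) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(g - p) ** 2 for g, p in zip(ground_truth, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(ground_truth, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(ground_truth) != len(predicted):
        raise ValueError("sequences must have the same length")
    absolute_errors = [abs(g - p) for g, p in zip(ground_truth, predicted)]
    return sum(absolute_errors) / len(absolute_errors)


# Illustrative values only (hypothetical forecast, not output of any model here).
actual = [112.0, 118.0, 132.0, 129.0]
forecast = [110.0, 120.0, 130.0, 133.0]
print("RMSE:", rmse(actual, forecast))
print("MAE:", mae(actual, forecast))
```

Passing the raw forecast values of each library through the same two functions removes any difference in how the libraries themselves define or align the metric.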

chenghaoliu89 · 2021-11-02T15:54:12Z

@riteshchhetri10 please remove SeasonalityLayer as follows:

    config1 = AutoSarimaConfig(max_forecast_steps=len(test_df), order=("auto", "auto", "auto"),
                           seasonal_order=("auto", "auto", "auto", 12), approximation=True, maxiter=mi)
    model1  =  AutoSarima(model = Sarima(config1))

If you wrap the model in SeasonalityLayer, it automatically detects the periodicity, even if you have specified a periodicity of 12. The periodicity detection module uses a stricter confidence level of 0.975, which leads to a detected periodicity of 1 on this dataset. The results are therefore poor, because the predictions do not include any periodic pattern. If we relax the confidence level to 0.95, it outputs 12, as expected.

@aadyotb I think we can change the API and expose this confidence level parameter in periodicity detection to the user. What do you think?
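
To illustrate why the confidence level matters, here is a simplified sketch of autocorrelation-based periodicity detection. It is not Merlion's implementation: the threshold rule (a normal quantile divided by the square root of the series length), the peak-picking logic, and all function names are assumptions made for illustration. The point is only that a higher confidence level raises the bar an autocorrelation peak must clear, and when no peak clears it the detector falls back to a periodicity of 1.

```python
import math
from statistics import NormalDist


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0
    num = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return num / denom


def significance_threshold(n, confidence):
    """Assumed rule: an ACF value must exceed z(confidence) / sqrt(n)."""
    return NormalDist().inv_cdf(confidence) / math.sqrt(n)


def detect_period(values, max_lag, confidence=0.975):
    """Return the lag of the strongest significant ACF peak, or 1 if none."""
    threshold = significance_threshold(len(values), confidence)
    acf = [autocorrelation(values, lag) for lag in range(max_lag + 2)]
    best_lag, best_value = 1, threshold
    for lag in range(2, max_lag + 1):
        is_peak = acf[lag] > acf[lag - 1] and acf[lag] > acf[lag + 1]
        if is_peak and acf[lag] > best_value:
            best_lag, best_value = lag, acf[lag]
    return best_lag


# Synthetic monthly-style series with a clean period of 12.
seasonal = [math.sin(2 * math.pi * t / 12) for t in range(144)]
print("detected period:", detect_period(seasonal, max_lag=24))
print("threshold at 0.95: ", significance_threshold(144, 0.95))
print("threshold at 0.975:", significance_threshold(144, 0.975))
```

Exposing `confidence` as a parameter, as proposed above, would let the user relax the threshold when a known seasonality is being missed.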

riteshchhetri10 · 2021-11-02T19:46:22Z

@chenghaoliu89 You are right. I removed the SeasonalityLayer and the RMSE dropped significantly, although the RMSE values are still higher than those of the kats.models.prophet or statsmodels.tsa.Sarima models. Thanks for the help. I am closing this issue.

aadyotb · 2021-11-02T19:53:05Z

@chenghaoliu89 sounds good. Can you create a PR adding the confidence parameter to the SeasonalityLayer API?

chenghaoliu89 · 2021-11-03T01:27:29Z

@riteshchhetri10 Would you mind sharing the code for the comparison with statsmodels and kats? I may be able to help diagnose the cause if I can reproduce the RMSE results.

riteshchhetri10 · 2021-11-03T19:17:57Z

@chenghaoliu89

Please see the kaggle notebook containing the entire code which builds the following models and tracks the RMSE for them.
a) statsmodels.tsa SARIMAX model
b) kats.prophet model
c) merlion autosarima model
d) pmdautoarima model

You can simply upload the air passengers dataset available on Kaggle and run all cells to get a data frame with the name of the model and RMSE values along with the parameters used.

Link to kaggle notebook
https://www.kaggle.com/ritesh11/statsmodel-kats-merlion-pmdautoarima

resulting table

What could be the reason that Merlion AutoSarima does not achieve similar scores on the RMSE metric?

Also, could you please explain how to obtain the parameters of the best model chosen by Merlion AutoSarima?
That is, how to get the (p,d,q) values of the best model that is found.

chenghaoliu89 · 2021-11-09T17:26:42Z

@riteshchhetri10
You can enable debug-level logging with logging.basicConfig(level=logging.DEBUG) to display the details of the hyper-parameter search. In your example, the AutoML procedure for Merlion is as follows:

For comparison, you can also print the details for pmdarima with model = pm.auto_arima(train_df, seasonal=True, m=12, trace=True). Its AutoML procedure is as follows:

The key difference is the detected differencing order. I checked the implementations: both Merlion and pmdarima use the KPSS test to choose the differencing order. In Merlion, we call it directly from statsmodels (https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.kpss.html). Merlion currently uses a trend regression model as the default setting for the KPSS test by setting regression="ct", while pmdarima uses a level regression model by default. Even with trend regression, the confidence for setting the differencing order to zero is not high. I think we can change the default to level regression, since this is the default in the original R implementation (https://rdrr.io/cran/forecast/src/R/unitRoot.R). It is also hard to say which one is better, so we can expose the choice to the user.
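
To make the level-versus-trend distinction concrete, here is a small self-contained sketch of the KPSS statistic in plain Python. It is a simplified illustration, not the statsmodels, pmdarima, or Merlion code: the fixed lag count of 4, the helper names, and the synthetic series are assumptions. The 5% critical values (0.463 for level, 0.146 for trend) are the standard ones tabulated for the KPSS test.

```python
import math


def ols_residuals(values, regression):
    """Residuals after removing a constant ("c") or a constant plus linear trend ("ct")."""
    n = len(values)
    mean_y = sum(values) / n
    if regression == "c":
        return [y - mean_y for y in values]
    if regression == "ct":
        mean_t = (n - 1) / 2
        slope = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values)) / sum(
            (t - mean_t) ** 2 for t in range(n)
        )
        return [y - mean_y - slope * (t - mean_t) for t, y in enumerate(values)]
    raise ValueError("regression must be 'c' or 'ct'")


def kpss_statistic(values, regression="c", lags=4):
    """KPSS statistic with a Bartlett-kernel long-run variance (fixed lag count assumed)."""
    resid = ols_residuals(values, regression)
    n = len(resid)
    running, sum_sq_partial = 0.0, 0.0
    for e in resid:
        running += e
        sum_sq_partial += running ** 2
    long_run_var = sum(e * e for e in resid) / n
    for s in range(1, lags + 1):
        weight = 1 - s / (lags + 1)
        autocov = sum(resid[t] * resid[t - s] for t in range(s, n)) / n
        long_run_var += 2 * weight * autocov
    return sum_sq_partial / (n ** 2 * long_run_var)


CRITICAL_5PCT = {"c": 0.463, "ct": 0.146}


def needs_differencing(values, regression, lags=4):
    """True when the KPSS test rejects stationarity at the 5% level."""
    return kpss_statistic(values, regression, lags) > CRITICAL_5PCT[regression]


# Synthetic series: linear trend plus a period-12 seasonal component.
series = [0.5 * t + 3 * math.sin(2 * math.pi * t / 12) for t in range(120)]
print("level regression says difference:", needs_differencing(series, "c"))
print("trend regression says difference:", needs_differencing(series, "ct"))
```

On this synthetic trending series, the level-regression test rejects stationarity and so suggests differencing, while the trend-regression test absorbs the trend and does not. This is the same kind of disagreement in the chosen differencing order described above.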

aadyotb assigned chenghaoliu89 Oct 29, 2021

riteshchhetri10 closed this as completed Nov 2, 2021

riteshchhetri10 reopened this Nov 3, 2021

chenghaoliu89 closed this as completed Jun 16, 2022
