[BUG] SeasonalityACF `nlags` is not inclusive #4169

ngupta23 · 2023-01-29T02:18:45Z

Describe the bug

SeasonalityACF does not consider lags upto nlags but up to nlags - 1. This is because of this statement.

Line 143 in daf5f81

candidate_sp = range(2, nlags)

I think this should be increased by 1 to allows nlags to be inclusive.

To Reproduce

import numpy as np
import pandas as pd
from sktime.param_est.seasonality import SeasonalityACF

np.random.seed(42)
data = np.random.randint(0, 100, size=60)
# Data with seasonality of 60
data = pd.DataFrame(np.concatenate((np.tile(data, 2), [data[0]])))

sp_est = SeasonalityACF(nlags=60)
sp_est.fit(data)
print(sp_est.get_fitted_params().get("sp_significant"))

sp_est = SeasonalityACF(nlags=61)
sp_est.fit(data)
print(sp_est.get_fitted_params().get("sp_significant"))

[4]
[60  4]

Expected behavior

Would have expected nlags to be inclusive per the documentation.

sktime/sktime/param_est/seasonality.py

Lines 36 to 38 in daf5f81

    
               nlags : int, optional, default=None 
        
                   Number of lags to compute autocorrelations for and select from. 
        
                   At default None, uses min(10 * np.log10(nobs), nobs - 1).

Additional context

Versions

sktime == 0.15.0

The text was updated successfully, but these errors were encountered:

fkiraly · 2023-01-29T12:25:49Z

oh, you are right. I've been writing python for a while now and still sometimes get confused with the plus/minus one thing...

fkiraly · 2023-01-29T13:24:08Z

would appreciate a look at #4171, if it is now as expected for you, @ngupta23 (and whether the docstring is now clear)

fkiraly · 2023-01-29T13:25:06Z

also, general feedback about the parameter estimator module is appreciated! It's still a bit experimental, e.g., is the pipelining/plugin functionality useful?

ngupta23 · 2023-01-29T19:06:19Z

This is indeed a very nice piece of functionality. We have incorporated it into pycaret now for auto seasonal detection. We add a wrapper before this to detrend the series if needed before passing to SeasonalityACF.

https://github.com/pycaret/pycaret/blob/38f01856b5ca14ca79aa800677b489c381d5aa71/pycaret/utils/time_series/__init__.py#L222

One difference is that we detect the seasonality upfront (since it is sometimes hard to explain from a business perspective why seasonality is changing from fold to fold). Hence we are not using this in the forecasting pipeline for now. We just detect the SP upfront and set it to the same value for all folds.

fkiraly · 2023-01-30T10:18:53Z

Hence we are not using this in the forecasting pipeline for now. We just detect the SP upfront and set it to the same value for all folds.

You can just pipeline it outside, i.e., SeasonalityACF(etc) * ForecastingGridSearchCV(stuff), that will ensure it's detected upfront.
You just need to change the parameter name sp to component__component__sp

ngupta23 · 2023-01-31T12:31:11Z

You can just pipeline it outside, i.e., SeasonalityACF(etc) * ForecastingGridSearchCV(stuff), that will ensure it's detected upfront.
You just need to change the parameter name sp to component__component__sp

We could, but the flow in pycaret separates out the setup/EDA from model developent. The SP detection falls in the setup/EDA section so we have to do it upfront before doing model development. Moreover, since the SP detection step is common to all models, it should not be repeated in training (adds extra time).

from pycaret.time_series import TSForecastingExperiment

exp = TSForecastingExperiment()
exp.setup(data)  # SP detection happens here

# Uses SP detected in setup unless user passes SP explicitly through arguments here.
model1 = exp.create_model("arima")  
model2 = exp.create_model("ets")

fkiraly · 2023-01-31T15:24:46Z

I see - interesting point where one parameter is plugged into multiple models. What happens to those models then - are they being benchmarked? Or do you select them, like in a multiplexer forecaster?

ngupta23 · 2023-01-31T21:56:43Z

Yes similar to the multiplexer, we have a wrapper called compare_models that calls create_model in a loop and then sorts the models based on the metric of choice. Users can get back top N models, and follow up with tuning, ensemble, etc after that.

An example can be found here: https://nbviewer.org/github/pycaret/pycaret/blob/master/examples/time_series/forecasting/time_series_103.ipynb

…4171) Fixes #4169, by correcting the `range` expression, and clarifying the docstring.

ngupta23 added the bug Something isn't working label Jan 29, 2023

fkiraly mentioned this issue Jan 29, 2023

[BUG] fix nlag logic in SeasonalityACF and SeasonalityACFqstat #4171

Merged

fkiraly added the module:parameter-estimators parameter fitters and estimators label Jan 29, 2023

fkiraly closed this as completed in #4171 Feb 4, 2023

fkiraly added a commit that referenced this issue Feb 4, 2023

[BUG] fix nlag logic in SeasonalityACF and SeasonalityACFqstat (#…

87b9444

…4171) Fixes #4169, by correcting the `range` expression, and clarifying the docstring.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] SeasonalityACF `nlags` is not inclusive #4169

[BUG] SeasonalityACF `nlags` is not inclusive #4169

ngupta23 commented Jan 29, 2023

fkiraly commented Jan 29, 2023

fkiraly commented Jan 29, 2023

fkiraly commented Jan 29, 2023

ngupta23 commented Jan 29, 2023

fkiraly commented Jan 30, 2023

ngupta23 commented Jan 31, 2023

fkiraly commented Jan 31, 2023

ngupta23 commented Jan 31, 2023

[BUG] SeasonalityACF nlags is not inclusive #4169

[BUG] SeasonalityACF nlags is not inclusive #4169

Comments

ngupta23 commented Jan 29, 2023

fkiraly commented Jan 29, 2023

fkiraly commented Jan 29, 2023

fkiraly commented Jan 29, 2023

ngupta23 commented Jan 29, 2023

fkiraly commented Jan 30, 2023

ngupta23 commented Jan 31, 2023

fkiraly commented Jan 31, 2023

ngupta23 commented Jan 31, 2023

[BUG] SeasonalityACF `nlags` is not inclusive #4169

[BUG] SeasonalityACF `nlags` is not inclusive #4169