Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] SeasonalityACF nlags is not inclusive #4169

Closed
ngupta23 opened this issue Jan 29, 2023 · 8 comments · Fixed by #4171
Closed

[BUG] SeasonalityACF nlags is not inclusive #4169

ngupta23 opened this issue Jan 29, 2023 · 8 comments · Fixed by #4171
Labels
bug Something isn't working module:parameter-estimators parameter fitters and estimators

Comments

@ngupta23
Copy link
Contributor

Describe the bug

SeasonalityACF does not consider lags upto nlags but up to nlags - 1. This is because of this statement.

candidate_sp = range(2, nlags)

I think this should be increased by 1 to allows nlags to be inclusive.

To Reproduce

import numpy as np
import pandas as pd
from sktime.param_est.seasonality import SeasonalityACF

np.random.seed(42)
data = np.random.randint(0, 100, size=60)
# Data with seasonality of 60
data = pd.DataFrame(np.concatenate((np.tile(data, 2), [data[0]])))

sp_est = SeasonalityACF(nlags=60)
sp_est.fit(data)
print(sp_est.get_fitted_params().get("sp_significant"))

sp_est = SeasonalityACF(nlags=61)
sp_est.fit(data)
print(sp_est.get_fitted_params().get("sp_significant"))
[4]
[60  4]

Expected behavior

Would have expected nlags to be inclusive per the documentation.

nlags : int, optional, default=None
Number of lags to compute autocorrelations for and select from.
At default None, uses min(10 * np.log10(nobs), nobs - 1).

Additional context

Versions

sktime == 0.15.0

@ngupta23 ngupta23 added the bug Something isn't working label Jan 29, 2023
@fkiraly
Copy link
Collaborator

fkiraly commented Jan 29, 2023

oh, you are right. I've been writing python for a while now and still sometimes get confused with the plus/minus one thing...

@fkiraly
Copy link
Collaborator

fkiraly commented Jan 29, 2023

would appreciate a look at #4171, if it is now as expected for you, @ngupta23 (and whether the docstring is now clear)

@fkiraly fkiraly added the module:parameter-estimators parameter fitters and estimators label Jan 29, 2023
@fkiraly
Copy link
Collaborator

fkiraly commented Jan 29, 2023

also, general feedback about the parameter estimator module is appreciated! It's still a bit experimental, e.g., is the pipelining/plugin functionality useful?

@ngupta23
Copy link
Contributor Author

This is indeed a very nice piece of functionality. We have incorporated it into pycaret now for auto seasonal detection. We add a wrapper before this to detrend the series if needed before passing to SeasonalityACF.

https://github.com/pycaret/pycaret/blob/38f01856b5ca14ca79aa800677b489c381d5aa71/pycaret/utils/time_series/__init__.py#L222

One difference is that we detect the seasonality upfront (since it is sometimes hard to explain from a business perspective why seasonality is changing from fold to fold). Hence we are not using this in the forecasting pipeline for now. We just detect the SP upfront and set it to the same value for all folds.

@fkiraly
Copy link
Collaborator

fkiraly commented Jan 30, 2023

Hence we are not using this in the forecasting pipeline for now. We just detect the SP upfront and set it to the same value for all folds.

You can just pipeline it outside, i.e., SeasonalityACF(etc) * ForecastingGridSearchCV(stuff), that will ensure it's detected upfront.
You just need to change the parameter name sp to component__component__sp

@ngupta23
Copy link
Contributor Author

You can just pipeline it outside, i.e., SeasonalityACF(etc) * ForecastingGridSearchCV(stuff), that will ensure it's detected upfront.
You just need to change the parameter name sp to component__component__sp

We could, but the flow in pycaret separates out the setup/EDA from model developent. The SP detection falls in the setup/EDA section so we have to do it upfront before doing model development. Moreover, since the SP detection step is common to all models, it should not be repeated in training (adds extra time).

from pycaret.time_series import TSForecastingExperiment

exp = TSForecastingExperiment()
exp.setup(data)  # SP detection happens here

# Uses SP detected in setup unless user passes SP explicitly through arguments here.
model1 = exp.create_model("arima")  
model2 = exp.create_model("ets")

@fkiraly
Copy link
Collaborator

fkiraly commented Jan 31, 2023

I see - interesting point where one parameter is plugged into multiple models. What happens to those models then - are they being benchmarked? Or do you select them, like in a multiplexer forecaster?

@ngupta23
Copy link
Contributor Author

Yes similar to the multiplexer, we have a wrapper called compare_models that calls create_model in a loop and then sorts the models based on the metric of choice. Users can get back top N models, and follow up with tuning, ensemble, etc after that.

An example can be found here: https://nbviewer.org/github/pycaret/pycaret/blob/master/examples/time_series/forecasting/time_series_103.ipynb

fkiraly added a commit that referenced this issue Feb 4, 2023
…4171)

Fixes #4169, by correcting the `range` expression, and clarifying the docstring.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working module:parameter-estimators parameter fitters and estimators
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants