Does multi-stage forecasting support weekly aggregation as well? #72

Closed
canamika27 opened this issue Mar 30, 2022 · 4 comments

@canamika27

Hi Team,

Can you please confirm whether multi-stage forecasting works with weekly aggregation as well?

I tried it with data that has daily frequency: one stage keeps the daily frequency, and the next stage uses a weekly aggregation of the daily data.

But I am getting the error below:

TypeError: '<' not supported between instances of 'pandas._libs.tslibs.offsets.Day' and 'pandas._libs.tslibs.offsets.Week'

@KaixuYang
Contributor

Hi @canamika27, thanks for checking out the new functionality! The multistage algorithm is intended to work with any frequency. In this initial implementation, however, we did not see significant gains from weekly aggregation over daily aggregation, so the code in its current form does not support weekly aggregation yet. In our next release we will extend the multistage algorithm to support more scenarios.

@KaixuYang
Contributor

@canamika27 As a workaround, you could try "7D" instead of "W" in the aggregation string, and it should work that way.
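A small sketch of that workaround: rewriting a week-based frequency string as a whole number of days before passing it to the config. The helper name `weeks_to_days` is hypothetical and not part of Greykite. It also drops any weekday anchor such as `-SUN`, so the resulting bins are not guaranteed to line up with calendar weeks.

```python
import re

# Hypothetical helper (not part of Greykite): rewrite "W"-based frequency
# strings such as "W", "52W", or "W-SUN" as a whole number of days.
_WEEK_PATTERN = re.compile(r"^(\d*)W(?:-[A-Z]{3})?$")


def weeks_to_days(freq: str) -> str:
    """Return a day-based equivalent of a week-based frequency string.

    Strings that are not week-based are returned unchanged.
    """
    match = _WEEK_PATTERN.match(freq)
    if match is None:
        return freq
    weeks = int(match.group(1) or 1)  # a bare "W" means one week
    return f"{weeks * 7}D"


print(weeks_to_days("W-SUN"))  # 7D
print(weeks_to_days("52W"))    # 364D
```

The returned strings could then be used for `agg_freq` and `train_length` in place of the week-based ones.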

@canamika27
Author

canamika27 commented Mar 31, 2022

@KaixuYang Thanks for the information.

Actually, I tried using both 'W' and '7D'.

When I used the weekly config with a train_length of '364D' (52 weeks * 7 days), like below:

# the daily model
daily_config = SilverkiteMultistageTemplateConfig(
    train_length="730D",  # use 2 years of data to train
    fit_length=None,  # fit on the same period as training
    agg_func="nanmean",  # aggregation function is nanmean
    agg_freq=None,  # None means no aggregation, so the data stays daily
    model_template=ModelTemplateEnum.SILVERKITE.name,  # the model template
    model_components=daily_model_components  # the daily model components specified above
)

# the weekly model
weekly_config = SilverkiteMultistageTemplateConfig(
    train_length="364D",  # use 364 days of data to train
    fit_length=None,  # fit on the same period as training
    agg_func="nanmean",  # aggregation function is nanmean
    agg_freq='W-SUN',  # aggregate to weekly frequency, weeks ending on Sunday
    model_template=ModelTemplateEnum.SILVERKITE.name,  # the model template
    model_components=weekly_model_components  # the weekly model components
)

I got the error below:

/usr/local/lib/python3.7/dist-packages/greykite/sklearn/estimator/silverkite_multistage_estimator.py in fit(self, X, y, time_col, value_col, **fit_params)
300 self.forecast_horizons.append(sample_df_agg.shape[0])
301
--> 302 min_agg_freq = min([to_offset(freq) for freq in self.agg_freqs])
303 if min_agg_freq < to_offset(self.freq):
304 raise ValueError(f"The minimum aggregation frequency {min_agg_freq} "

TypeError: '<' not supported between instances of 'pandas._libs.tslibs.offsets.Day' and 'pandas._libs.tslibs.offsets.Week'
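The failing comparison can be reproduced with pandas alone. This is a minimal sketch, assuming the estimator ends up ordering the offset of the daily stage against the offset parsed from 'W-SUN'. The helper `is_orderable` is a name made up for this illustration.

```python
from pandas.tseries.frequencies import to_offset


def is_orderable(freq_a: str, freq_b: str) -> bool:
    """Return True if pandas can order the two frequency offsets with '<'."""
    try:
        to_offset(freq_a) < to_offset(freq_b)
    except TypeError:
        # Raised when the two offset types define no ordering between them,
        # as in the traceback above (Day versus Week).
        return False
    return True


print(is_orderable("D", "W-SUN"))
```

The same check can be run on `("D", "7D")` to see whether a day-based aggregation string avoids the error in your pandas version.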

Similarly, I also tried with '52W':

# the daily model
daily_config = SilverkiteMultistageTemplateConfig(
    train_length="730D",  # use 2 years of data to train
    fit_length=None,  # fit on the same period as training
    agg_func="nanmean",  # aggregation function is nanmean
    agg_freq=None,  # None means no aggregation, so the data stays daily
    model_template=ModelTemplateEnum.SILVERKITE.name,  # the model template
    model_components=daily_model_components  # the daily model components specified above
)

# the weekly model
weekly_config = SilverkiteMultistageTemplateConfig(
    train_length="52W",  # use 52 weeks of data to train
    fit_length=None,  # fit on the same period as training
    agg_func="nanmean",  # aggregation function is nanmean
    agg_freq='W-SUN',  # aggregate to weekly frequency, weeks ending on Sunday
    model_template=ModelTemplateEnum.SILVERKITE.name,  # the model template
    model_components=weekly_model_components  # the weekly model components
)

For this I am getting the error below:

/usr/local/lib/python3.7/dist-packages/greykite/sklearn/estimator/silverkite_multistage_estimator.py in <listcomp>(.0)
365 # Assumes train length is integer multiples of 1 second, which is most of the cases.
366 self.train_lengths_in_seconds: List[int] = [
--> 367 to_offset(length).delta // timedelta(seconds=1) for length in self.train_lengths]
368 self.fit_lengths_in_seconds: List[int] = [
369 to_offset(length).delta // timedelta(seconds=1)

AttributeError: 'pandas._libs.tslibs.offsets.Week' object has no attribute 'delta'
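As the traceback shows, the `Week` offset parsed from '52W' has no `delta` attribute, so the conversion to seconds fails before any model is fit. A minimal sketch of the same conversion for a day-based length, using `pd.Timedelta` rather than the estimator's own code path; `length_in_seconds` is a hypothetical helper, not a Greykite function.

```python
from datetime import timedelta

import pandas as pd
from pandas.tseries.frequencies import to_offset


def length_in_seconds(length: str) -> int:
    """Convert a fixed-duration string such as "364D" to whole seconds."""
    return pd.Timedelta(length) // timedelta(seconds=1)


# The Week offset parsed from "52W" has no `delta` attribute.
print(hasattr(to_offset("52W"), "delta"))  # False

# 364 days * 86400 seconds per day
print(length_in_seconds("364D"))  # 31449600
```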

Is there any other part we need to change?

@KaixuYang
Contributor

Hi @canamika27, thanks for sharing the code. The current implementation of the algorithm does not support frequencies containing "W". For the agg_freq parameter, could you use "7D" instead of 'W-SUN'? I know the result may be slightly off depending on the starting point of your time series, but that could be the closest config that mimics weekly aggregation.
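To see how "7D" bins can differ from calendar weeks, the two layouts can be compared in plain pandas. This is a sketch on made-up data using `Series.resample`. It is an assumption that Greykite's internal aggregation aligns its bins in a similar way.

```python
import pandas as pd

# Made-up daily series starting on a Wednesday (2022-03-30).
index = pd.date_range("2022-03-30", periods=14, freq="D")
series = pd.Series(range(14), index=index, dtype="float64")

# Calendar weeks ending on Sunday: bins are anchored to the weekday.
weekly = series.resample("W-SUN").mean()

# Fixed 7-day bins: anchored to the first day of the series instead.
seven_day = series.resample("7D").mean()

print(weekly)
print(seven_day)
```

Here the 'W-SUN' bins end on 2022-04-03, 2022-04-10, and 2022-04-17, while the "7D" bins start on 2022-03-30 and 2022-04-06, so the two only coincide when the series starts on a week boundary.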
