BUG: Check freq only when useful in forecasting. #7565

Closed
soham-d opened this issue Jul 9, 2021 · 6 comments · Fixed by #7574

@soham-d

soham-d commented Jul 9, 2021

Describe the bug

I'm getting the TypeError shown under Error Message while executing line 23 (the forecast() call), even though my frequency is not a 'str'.

Code Sample, a copy-pastable example if possible

from statsmodels.tsa.holtwinters import ExponentialSmoothing as HWES

#read the data file. the date column is expected to be in the mm-dd-yyyy format.
df_train_y = pd.DataFrame(data = tsne_train_output)
df_train_y.index.freq = 'd'

df_test_y = pd.DataFrame(data = tsne_test_output)
df_test_y.index.freq = 'd'

#plot the data
df_train_y.plot()
plt.show()


#build and train the model on the training data
model = HWES(df_train_y, seasonal_periods=144, trend='add', seasonal='add')
fitted = model.fit(optimized=True, use_brute=True)

#print out the training summary
print(fitted.summary())

#create an out of sample forcast for the next 12 steps beyond the final data point in the training data set
trend_forecast = fitted.forecast(steps= 157200)

#plot the training data, the test data and the forecast on the same plot
fig = plt.figure()
fig.suptitle('Actual #pickups vs. predicted #pickups')
past, = plt.plot(df_train_y.index, df_train_y, 'b.-', label='Actual #Pickups')
future, = plt.plot(df_test_y.index, df_test_y, 'r.-', label='Predicted #pickup')
predicted_future, = plt.plot(df_test_y.index, trend_forecast, 'g.-', label='#pickups forecasted')
plt.legend(handles=[past, future, predicted_future])
plt.show()

Error Message

TypeError                                 Traceback (most recent call last)
<ipython-input-153-3c04122733a1> in <module>()
     21 
     22 #create an out of sample forcast for the next 12 steps beyond the final data point in the training data set
---> 23 trend_forecast = fitted.forecast(steps= 157200)
     24 
     25 #plot the training data, the test data and the forecast on the same plot

1 frames
/usr/local/lib/python3.7/dist-packages/statsmodels/tsa/holtwinters.py in forecast(self, steps)
    344         try:
    345             freq = getattr(self.model._index, 'freq', 1)
--> 346             start = self.model._index[-1] + freq
    347             end = self.model._index[-1] + steps * freq
    348             return self.model.predict(self.params, start=start, end=end)

TypeError: unsupported operand type(s) for +: 'int' and 'str'
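
The failing expression in the traceback is `self.model._index[-1] + freq`. The following is a minimal sketch of that arithmetic only, not the statsmodels implementation. It assumes the index holds plain integers and that a string such as 'd' was assigned to `.freq`; `FakeIntIndex` is a hypothetical stand-in class, not a pandas or statsmodels type.

```python
class FakeIntIndex:
    """Hypothetical stand-in for an integer index (not a pandas type)."""

    def __init__(self, n):
        self._values = list(range(n))

    def __getitem__(self, i):
        return self._values[i]


index = FakeIntIndex(5)
index.freq = "d"  # assumption: mirrors `df_train_y.index.freq = 'd'` from the report

# Same lookup as line 345 of the traceback: the default of 1 is used
# only when the attribute is missing, so the string 'd' is returned here.
freq = getattr(index, "freq", 1)

try:
    start = index[-1] + freq  # 4 + "d"
    error = None
except TypeError as exc:
    error = str(exc)

print(error)  # unsupported operand type(s) for +: 'int' and 'str'
```

If this reading is right, the 'str' in the message is the frequency itself, and the 'int' is the last index value.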

Expected Output

I would like to get the forecast values in trend_forecast.

Output of import statsmodels.api as sm; sm.show_versions()

INSTALLED VERSIONS

Python: 3.7.10.final.0
OS: Linux 5.4.104+ #1 SMP Sat Jun 5 09:50:34 PDT 2021 x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

Statsmodels

Installed: 0.10.2 (/usr/local/lib/python3.7/dist-packages/statsmodels)

Required Dependencies

cython: 0.29.23 (/usr/local/lib/python3.7/dist-packages/Cython)
numpy: 1.19.5 (/usr/local/lib/python3.7/dist-packages/numpy)
scipy: 1.4.1 (/usr/local/lib/python3.7/dist-packages/scipy)
pandas: 1.1.5 (/usr/local/lib/python3.7/dist-packages/pandas)
dateutil: 2.8.1 (/usr/local/lib/python3.7/dist-packages/dateutil)
patsy: 0.5.1 (/usr/local/lib/python3.7/dist-packages/patsy)

Optional Dependencies

matplotlib: 3.2.2 (/usr/local/lib/python3.7/dist-packages/matplotlib)
backend: module://ipykernel.pylab.backend_inline
cvxopt: 1.2.6 (/usr/local/lib/python3.7/dist-packages/cvxopt)
joblib: 1.0.1 (/usr/local/lib/python3.7/dist-packages/joblib)

Developer Tools

IPython: 5.5.0 (/usr/local/lib/python3.7/dist-packages/IPython)
jinja2: 2.11.3 (/usr/local/lib/python3.7/dist-packages/jinja2)
sphinx: 1.8.5 (/usr/local/lib/python3.7/dist-packages/sphinx)
pygments: 2.6.1 (/usr/local/lib/python3.7/dist-packages/pygments)
pytest: 3.6.4 (/usr/local/lib/python3.7/dist-packages)
virtualenv: Not installed

@bashtage
Member

bashtage commented Jul 9, 2021

Running the code below on master produces no errors.

from statsmodels.tsa.holtwinters import ExponentialSmoothing as HWES

import numpy as np

y = np.random.standard_normal(144*1000)

#build and train the model on the training data
model = HWES(y, seasonal_periods=144, trend='add', seasonal='add')
fitted = model.fit(optimized=True, use_brute=True)

#print out the training summary
print(fitted.summary())

#create an out-of-sample forecast for the next 157200 steps beyond the final data point in the training data set
trend_forecast = fitted.forecast(steps= 157200)

If you run this and see an error, please upgrade to master. If you upgrade to master and see an error in your code, it is likely something with the dataset you are using.

@bashtage
Member

bashtage commented Jul 9, 2021

You might reset the index to be an integer index. We really only support integer, DateTime or Period indices. We might not check this enough.
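
A minimal sketch of resetting a DataFrame index to integers with pandas. The toy data and the column name `pickups` are made up for illustration and are not taken from the reporter's dataset.

```python
import pandas as pd

# Toy data with a non-numeric index; the labels are invented for this example.
df = pd.DataFrame({"pickups": [3.0, 5.0, 4.0, 6.0]}, index=["a", "b", "c", "d"])

# drop=True discards the old labels instead of moving them into a column,
# leaving a default integer index that starts at 0.
df_int = df.reset_index(drop=True)

print(type(df_int.index).__name__)
print(list(df_int.index))
```

The resulting frame can then be passed to the model in place of the original one.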

bashtage changed the title from "Getting TypeError: unsupported operand type(s) for +: 'int' and 'str' when using forecast() in statsmodels/tsa/holtwinters.py" to "BUG: Index type should be checked in holt winters and related codes" on Jul 9, 2021
bashtage added the comp-tsa, type-invalid and type-bug labels and removed the type-invalid label on Jul 9, 2021
@soham-d
Author

soham-d commented Jul 10, 2021

@bashtage Thank you for the reply!
I ran the code you provided to check whether it works, and it ran without any errors.

As suggested, I also reset the index of the dataframe and reran my code, but it still shows the same error.
Also, when I run type(fitted.model._index[-1]) on the model fitted on my dataset, the output is int.

This is what my dataframe looks like, for your reference:
[image: screenshot of the dataframe]

Kindly help!

@bashtage
Member

Do you see the bug on master? Which version of statsmodels are you using?

@bashtage
Member

The solution is to not set .freq. Leave this as None.
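
A rough sketch of why leaving `.freq` unset avoids the error, based only on the three lines quoted in the traceback. `PlainIndex` and `forecast_bounds` are hypothetical names used for illustration; they are not part of statsmodels or pandas.

```python
class PlainIndex:
    """Hypothetical integer index with no freq attribute at all."""

    def __init__(self, n):
        self._values = list(range(n))

    def __getitem__(self, i):
        return self._values[i]


def forecast_bounds(index, steps):
    """Mimic the start/end arithmetic shown in the traceback."""
    # With no freq attribute present, getattr falls back to the integer 1.
    freq = getattr(index, "freq", 1)
    start = index[-1] + freq
    end = index[-1] + steps * freq
    return start, end


index = PlainIndex(10)  # last value is 9, and .freq is never assigned
print(forecast_bounds(index, steps=12))  # (10, 21)
```

With an integer step the bounds stay integers, which is what an integer index needs.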

bashtage changed the title from "BUG: Index type should be checked in holt winters and related codes" to "BUG: Check freq only when useful in forecasting." on Jul 12, 2021
bashtage self-assigned this on Jul 12, 2021
@soham-d
Author

soham-d commented Jul 13, 2021

> The solution is to not set .freq. Leave this as None.

This helped get rid of the KeyError. Thanks a ton!

bashtage added a commit to bashtage/statsmodels that referenced this issue Jul 14, 2021
Improve handling of index that has a freq but is not a date index

closes statsmodels#7565
bashtage added a commit to bashtage/statsmodels that referenced this issue Jul 14, 2021
Improve handling of index that has a freq but is not a date index

closes statsmodels#7565
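
One way to picture "check freq only when useful" is sketched below. This is an illustration of the idea in the commit message, not the code from the referenced commits, and it uses standard-library `datetime` values in place of pandas index types. The helper name `forecast_bounds` is hypothetical.

```python
from datetime import datetime, timedelta


def forecast_bounds(last, freq, steps):
    """Hypothetical helper: consult freq only when the index is date-like."""
    if isinstance(last, datetime):
        if not isinstance(freq, timedelta):
            raise ValueError("a date index needs a timedelta-like freq")
        step = freq
    else:
        # Non-date index: ignore whatever freq was attached and count by 1.
        step = 1
    return last + step, last + steps * step


# Integer index with a stray string freq: the freq is simply ignored.
print(forecast_bounds(9, "d", steps=12))

# Date index: the freq is meaningful and is used for the arithmetic.
print(forecast_bounds(datetime(2021, 7, 9), timedelta(days=1), steps=3))
```

The design choice here is to branch on the index type first, so a freq attached to a non-date index can never reach the addition.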
bashtage added this to the 0.13 milestone on Sep 9, 2021