SARIMAX prediction DECAYS to ZERO for future data and does not print corresponding future dates! #8459

Open
ZEVtheNukeGuy opened this issue Oct 21, 2022 · 0 comments

Howdy folks, I'm new to Python and trying to forecast a time series of total sales from a store using SARIMAX. The data is daily and covers roughly five years, from '2013-01-01' to '2017-08-16'. I want to predict sales for the next 30 days (i.e. from '2017-08-17' to '2017-09-15').

Here is the sample data, read using this piece of code:
import pandas as pd
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX
import pmdarima as pmd

DF_store = pd.read_csv('Per_store_DF', index_col = "date", parse_dates = True)
DF_store
image
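One thing that may be worth checking at this point, since the missing dates come up later: as far as I understand, statsmodels only labels forecasts with dates when the DatetimeIndex carries a frequency, and otherwise warns and falls back to integer positions. The sketch below uses a tiny made-up frame (the column name `sales` and the values are hypothetical, not taken from Per_store_DF) with one missing day, and shows how `asfreq("D")` gives the index an explicit daily frequency.

```python
import pandas as pd

# Hypothetical stand-in for the real file: daily dates with one day missing.
dates = pd.to_datetime(["2017-08-12", "2017-08-13", "2017-08-15", "2017-08-16"])
demo = pd.DataFrame({"sales": [10.0, 12.0, 11.0, 13.0]}, index=dates)
demo.index.name = "date"

# An index built from parsed dates carries no frequency, and the gap
# keeps pandas from inferring one.
print(demo.index.freq)
print(pd.infer_freq(demo.index))

# asfreq("D") reindexes onto a regular daily grid; the missing day
# appears as a NaN row and the index now has an explicit frequency.
regular = demo.asfreq("D")
print(regular.index.freq)
print(regular["sales"].isna().sum())
```

If the real data has gaps like this, the NaN rows created by `asfreq` would still need a decision (fill, interpolate, or leave as missing) before fitting.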

I decided to split the data into train and test sets (80/20), then use pmdarima's auto_arima to pick the ARIMA model order, and then fit that model with SARIMAX on my endogenous data.
N = round((80/100)*len(DF_store))
DF_train = DF_store.iloc[:N]
DF_test = DF_store.iloc[N:]
DF_train

Model Function

def arima_generator(ts):
    auto_model = pmd.auto_arima(ts, start_p = 1, start_q = 1, max_p = 3, max_q = 3, m = 30, test = "adf", trace = True)
    return auto_model

Apply the model function

ARIMA_model = arima_generator(DF_train)
ARIMA_model.summary()

Problems arise when I fit the resulting model on my training set and try to predict the test data.
model = SARIMAX(DF_train, order =(1,0,0), seas_order = (0,0,0, 30))
result = model.fit()
result.summary()

start = len(DF_train)
end = len(DF_store)

Using

predict = result.predict(start, end, dynamic = False)   # same with dynamic = True
DF_train.plot(legend = True, figsize=(15,10))
predict.plot(legend = True)
predict

  1. The output has no dates.
  2. It decays to zero, as shown in these pictures.
    image
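A possible explanation for the decay, offered as a sketch rather than a confirmed diagnosis: order = (1,0,0) with no trend term is an AR(1) model without a constant (SARIMAX defaults to trend=None, as far as I can tell). For such a model the h-step-ahead forecast is phi^h times the last observed value, which shrinks toward zero whenever |phi| < 1. The helper names and the numbers below (last value 1000, phi 0.9, mean 950) are made up for illustration and use plain Python only, not statsmodels.

```python
def ar1_forecast(last_value, phi, steps):
    """Multi-step forecasts of y_t = phi * y_{t-1} + e_t (no constant)."""
    forecasts = []
    current = last_value
    for _ in range(steps):
        current = phi * current  # future shocks have expectation zero
        forecasts.append(current)
    return forecasts


def ar1_forecast_with_mean(last_value, phi, mean, steps):
    """Same recursion applied to deviations from a non-zero mean."""
    deviations = ar1_forecast(last_value - mean, phi, steps)
    return [mean + d for d in deviations]


# Hypothetical numbers: last observed sales of 1000 and phi = 0.9.
no_constant = ar1_forecast(last_value=1000.0, phi=0.9, steps=30)
with_mean = ar1_forecast_with_mean(last_value=1000.0, phi=0.9, mean=950.0, steps=30)

print([round(v, 1) for v in no_constant[:3]])  # [900.0, 810.0, 729.0]
print(round(no_constant[-1], 1))               # heads toward zero
print(round(with_mean[-1], 1))                 # heads toward the mean instead
```

If this is what is going on, including a constant (for example trend='c' in SARIMAX) or differencing the series would change where the forecasts settle, though whether either suits this data is a separate question.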

Using "start = len(DF_test)" Only

start = len(DF_test)
predict = result.predict(start, dynamic = False)   # same with dynamic = True
image

Starting Somewhere along the Train dataset

It predicts up to the end of the training set, but then decays to zero for future values, as shown:
start = len(DF_test)
end = len(DF_train)+ len(DF_test)-1
predict = result.predict(start, end, dynamic = False)
image
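To make the start/end positions above easier to reason about, here is a small sketch of the index arithmetic. It assumes the positions are zero-based and that `end` is inclusive, which is my understanding of `predict`; the row count of 1000 and the helper `split_prediction_range` are hypothetical. With an 80/20 split, `len(DF_test)` is a position well inside the training set, so most of that range is in-sample prediction and only the tail is a true forecast.

```python
def split_prediction_range(start, end, n_train):
    """Count in-sample and out-of-sample steps for an inclusive [start, end] range."""
    in_sample = max(0, min(end, n_train - 1) - start + 1)
    out_of_sample = max(0, end - max(start, n_train) + 1)
    return in_sample, out_of_sample


n_total = 1000                  # hypothetical number of rows in the full data
n_train = round(0.8 * n_total)  # 800 rows used for fitting
n_test = n_total - n_train      # 200 rows held out

# start = len(DF_test), end = len(DF_train) + len(DF_test) - 1
print(split_prediction_range(n_test, n_total - 1, n_train))  # (600, 200)

# start = len(DF_train), end = len(DF_store)
print(split_prediction_range(n_train, n_total, n_train))     # (0, 201)
```

The second case asks for 201 steps, one more than the 200 held-out rows, because `end = len(DF_store)` points one position past the last row.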

Questions

In summary, my model will NOT work for any future dates.

  • Why is this happening?
  • Is there anything I'm missing?

I'll be glad if someone could help.
Thanks in advance.
This is the dataset: Per_store_DF.csv