Howdy folks, I'm new to Python and trying to forecast a time series of total sales from a store using SARIMAX. The data is daily and covers a little over four and a half years, from '2013-01-01' to '2017-08-16'. I wish to predict sales for the next 30 days (i.e. from '2017-08-17' to '2017-09-15').
Here is the sample data, read using this piece of code:
import pandas as pd
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX
import pmdarima as pmd
DF_store = pd.read_csv('Per_store_DF.csv', index_col = "date", parse_dates = True)
DF_store
I decided to split the data into train and test sets (80/20), use the pmdarima auto_arima function to guess the optimal ARIMA model order, and then train the model with SARIMAX on my endogenous data.
N = round((80/100)*len(DF_store))
DF_train = DF_store.iloc[:N]
DF_test = DF_store.iloc[N:]
DF_train
Model Function
def arima_generator(ts):
    auto_model = pmd.auto_arima(ts, start_p = 1, start_q = 1, max_p = 3, max_q = 3, m = 30, test = "adf", trace = True)
    return auto_model
Apply the model function
ARIMA_model = arima_generator(DF_train)
ARIMA_model.summary()
Problems arise when I train the resulting model on my training set and try to predict the test data.
model = SARIMAX(DF_train, order =(1,0,0), seas_order = (0,0,0, 30))
result = model.fit()
result.summary()
start = len(DF_train)
end = len(DF_store)
Using 'predict = result.predict(start, end, dynamic = False)' (same with dynamic = True)
DF_train.plot(legend = True, figsize=(15,10))
Using "start = len(DF_test)" only
start = len(DF_test)
predict = result.predict(start, dynamic = False)
(same with dynamic = True)
Starting Somewhere along the Train dataset
It predicts to the end of the set but then decays to zero for future values as shown
start = len(DF_test)
end = len(DF_train)+ len(DF_test)-1
predict = result.predict(start, end, dynamic = False)
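To make the three choices of start and end above easier to compare, here is a small stdlib-only sketch that counts how many requested positions fall inside the training set and how many fall after it. It assumes one row per calendar day with no gaps, which may not hold for the actual CSV, and `describe_range` is a hypothetical helper written for this illustration, not part of statsmodels. Integer positions are treated as zero-based offsets from the first training observation.

```python
from datetime import date, timedelta

# Assumption: one row per calendar day, no missing dates (the real CSV may differ).
first_day = date(2013, 1, 1)
last_day = date(2017, 8, 16)
n_total = (last_day - first_day).days + 1

# Same 80/20 split as in the post.
n_train = round((80 / 100) * n_total)
n_test = n_total - n_train


def describe_range(start, end, n_train):
    """Hypothetical helper: count in-sample vs. out-of-sample positions.

    start and end are inclusive, zero-based offsets from the first
    training observation.
    """
    in_sample = max(0, min(end, n_train - 1) - start + 1)
    out_of_sample = max(0, end - max(start, n_train) + 1)
    return in_sample, out_of_sample


# start = len(DF_test): most of the requested range is still training data.
print("start=len(DF_test): ", describe_range(n_test, n_total - 1, n_train))

# start = len(DF_train): the requested range is exactly the held-out period.
print("start=len(DF_train):", describe_range(n_train, n_total - 1, n_train))

# Positions of the 30 days that follow the last observation in the file.
future_start = n_total
future_end = n_total + 30 - 1
print(first_day + timedelta(days=future_start),
      first_day + timedelta(days=future_end))
```

Under these assumptions, starting at len(DF_test) asks mostly for in-sample one-step-ahead predictions, and only the tail of the range is a true forecast. Also note that a model fitted on the training set alone has to forecast across the whole test period before it reaches '2017-08-17', so refitting on the full series before forecasting the next 30 days is one option worth considering.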
Questions
In summary, my model will NOT work for any future dates.
Why is this happening?
Is there anything I'm missing?
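One possible explanation for the decay, sketched in plain Python rather than with statsmodels: a pure AR(1) model without a constant term forecasts each step as phi times the previous forecast, so multi-step forecasts shrink geometrically toward zero whenever |phi| < 1. As far as I know, SARIMAX does not fit a constant unless a trend argument is passed, so order (1,0,0) would behave this way. The function `ar1_forecast` and the numbers below are hypothetical and only illustrate the recursion.

```python
def ar1_forecast(last_value, phi, steps, const=0.0):
    """Multi-step forecasts of y[t] = const + phi * y[t-1] + noise.

    Future noise has mean zero, so each forecast is just the
    recursion applied to the previous forecast.
    """
    forecasts = []
    prev = last_value
    for _ in range(steps):
        prev = const + phi * prev
        forecasts.append(prev)
    return forecasts


# Hypothetical values: last observed sales of 1000 and phi = 0.9.
no_const = ar1_forecast(1000.0, 0.9, steps=60)
with_const = ar1_forecast(1000.0, 0.9, steps=60, const=100.0)

# Without a constant the forecasts head toward zero.
print("no constant:  ", round(no_const[0], 2), "->", round(no_const[-1], 2))

# With a constant they head toward const / (1 - phi) instead.
print("with constant:", round(with_const[0], 2), "->", round(with_const[-1], 2))
```

If this is what is going on, the one-step-ahead predictions inside the training range look fine because each one uses the actual previous observation, while the out-of-sample forecasts only have earlier forecasts to build on.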
I'll be glad if someone could help.
Thanks in advance.
This is the dataset: Per_store_DF.csv