WIP: Automatic forecasting GSoC-2018-Project #4621

Open
wants to merge 49 commits into
base: main
49 commits
0f758a0
automatic package created
abhijeetpanda12 May 11, 2018
5d01590
function definition of order selection
abhijeetpanda12 May 13, 2018
17c697c
test file created for sarimax
abhijeetpanda12 May 14, 2018
cf0b2cb
intermediate commit
abhijeetpanda12 May 21, 2018
74771b0
stepwise algorithm added with the test function
abhijeetpanda12 May 22, 2018
dca4b1d
required changes were made
abhijeetpanda12 May 23, 2018
4857bd8
BUG: order p=-1 sometimes considered solved
abhijeetpanda12 May 25, 2018
d2a3d7b
Forecast and ForecastSet classes added
abhijeetpanda12 May 25, 2018
8b85370
solved an alert raised for call to a Non-Callable list
abhijeetpanda12 May 25, 2018
ec498f1
improvements in test_sample parameter
abhijeetpanda12 May 27, 2018
de2ca4b
modifications in forecast class
abhijeetpanda12 May 27, 2018
3a44396
smoke test for Forecast and ForecasSet class added
abhijeetpanda12 May 28, 2018
cac1ffc
new style class support added, naming of test functions were made cle…
abhijeetpanda12 May 30, 2018
a2918ed
auto order for stepwise algorith changed and support for seasonal var…
abhijeetpanda12 Jun 2, 2018
9fce742
seasonality and intercept added to both auto_order algorithms
abhijeetpanda12 Jun 4, 2018
e970049
minor improvements made to auto_order funtion - intercept return valu…
abhijeetpanda12 Jun 5, 2018
fbb45a3
test functions modified for sarimax auto_order
abhijeetpanda12 Jun 9, 2018
7a514da
auto_transform with lambda prediction added
abhijeetpanda12 Jun 10, 2018
5420d78
code cleaned with minor modifications
abhijeetpanda12 Jun 11, 2018
4492eac
BUG: predict_lambda generates undersired result
abhijeetpanda12 Jun 15, 2018
15dbf5b
R code for auto_transform added
abhijeetpanda12 Jun 16, 2018
f8b9833
simple unit test for predict lambda added
abhijeetpanda12 Jun 16, 2018
90fcea0
Basic documentation of the modules added.
abhijeetpanda12 Jun 19, 2018
7019966
alert fixed for syntax errors.
abhijeetpanda12 Jun 19, 2018
5bdb7d1
auto transform unit test fixed
abhijeetpanda12 Jun 20, 2018
1ad48ee
automatic model selection for exponential smoothing
abhijeetpanda12 Jun 23, 2018
17f51d2
alert fixes: syntax and imports
abhijeetpanda12 Jun 23, 2018
d7d5905
new parameters added to auto_es function
abhijeetpanda12 Jun 27, 2018
db90223
unified automatic Forecast interface for SARIMAX and ES models added
abhijeetpanda12 Jul 5, 2018
4634b44
temporarily removed lambda parameter in auto_es function
abhijeetpanda12 Jul 5, 2018
5a6a2f3
sarimax function made flexible. New unit tests and documentation added.
abhijeetpanda12 Jul 9, 2018
041ad88
Forecast and ForecastSet classes made flexible to understand various …
abhijeetpanda12 Jul 12, 2018
4500e1d
validate function added to ForecastSet class
abhijeetpanda12 Jul 14, 2018
0d49a5a
time-series cross validation added
abhijeetpanda12 Jul 25, 2018
92ae7b1
test cases and documentation forauto_order of SARIMAX updated
abhijeetpanda12 Jul 31, 2018
b14b904
tscv updated and tests for tscv added
abhijeetpanda12 Aug 2, 2018
8fcbd5c
TSCV test for CPI data added
abhijeetpanda12 Aug 3, 2018
40417b8
new test functions added for ES models
abhijeetpanda12 Aug 3, 2018
c5c04a3
code cleaned for Forecast classes
abhijeetpanda12 Aug 3, 2018
5b29eb8
intercept_val variable referenced before assignment for non stepwise …
abhijeetpanda12 Aug 6, 2018
c2861e8
variable typo fixed
abhijeetpanda12 Aug 6, 2018
c205bd6
documentation for TSCV updated
abhijeetpanda12 Aug 8, 2018
4d15d36
documentation for create_exog in auto transform updated
abhijeetpanda12 Aug 8, 2018
057fe84
auto order model looping fixed for stepwise algorithm
abhijeetpanda12 Aug 9, 2018
907f44b
test cases for SARIMAX auto order added
abhijeetpanda12 Aug 9, 2018
fb30ce6
test cases for auto_es Exponential Smoothing models added
abhijeetpanda12 Aug 9, 2018
9ce6cc3
unit test updated for auto transform
abhijeetpanda12 Aug 10, 2018
899e9cf
unit test updated for time series cross validation
abhijeetpanda12 Aug 10, 2018
a007dd7
fixed dataset path error
abhijeetpanda12 Aug 11, 2018
2 changes: 2 additions & 0 deletions statsmodels/tsa/automatic/__init__.py
@@ -0,0 +1,2 @@
from .forecast import Forecast
from .forecast import ForecastSet
82 changes: 82 additions & 0 deletions statsmodels/tsa/automatic/exponentialsmoothing.py
@@ -0,0 +1,82 @@
"""Automatic selection of the Exponential Smoothing Model."""
import numpy as np
import warnings
import statsmodels.api as sm


def auto_es(endog, measure='aic', seasonal_periods=1, damped=False,
            additive_only=False, alpha=None, beta=None,
            gamma=None, phi=None):
    """Perform automatic calculation of the parameters used in ES models.

    This uses a brute force approach: it fits every candidate combination of
    trend and seasonal component and selects the model with the lowest value
    of the chosen measure.

    Parameters
    ----------
    endog : list
        input contains the time series data over a period of time.
    measure : str
        specifies which information measure to use for model evaluation.
        'aic' is the default measure.
    seasonal_periods : int
        the length of a seasonal period.
    damped : boolean
        includes damped trends.
    additive_only : boolean
        Allows only additive trend and seasonal models.
    alpha : float
        Smoothing level.
    beta : float
        Smoothing slope.
    gamma : float
        Smoothing seasonal.
    phi : float
        damping slope.

    Returns
    -------
    model : list
        Two-element list holding the trend and seasonal component types
        of the selected model.

    Notes
    -----
    Status : Work In Progress.

    """
    # a damped trend requires a trend component, so None is not a candidate
    if damped:
        trends = ["add", "mul"]
    else:
        trends = ["add", "mul", None]
    seasonal = ["add", "mul", None]
    if additive_only:
        trends = ["add"]
        seasonal = ["add"]
    min_measure = np.inf
    model = [None, None]
    for t in trends:
        if seasonal_periods > 1:
            for s in seasonal:
                try:
                    # forward `damped` so the flag affects the fitted model
                    mod = sm.tsa.ExponentialSmoothing(
                        endog, trend=t, damped=damped, seasonal=s,
                        seasonal_periods=seasonal_periods)
                    res = mod.fit(smoothing_level=alpha,
                                  smoothing_slope=beta,
                                  smoothing_seasonal=gamma,
                                  damping_slope=phi)
                    if getattr(res, measure) < min_measure:
                        min_measure = getattr(res, measure)
                        model = [t, s]
                except Exception as e:
                    # a failed fit is reported but does not stop the search
                    warnings.warn(str(e))
        else:
            try:
                mod = sm.tsa.ExponentialSmoothing(
                    endog, trend=t, damped=damped,
                    seasonal_periods=seasonal_periods)
                res = mod.fit(smoothing_level=alpha, smoothing_slope=beta,
                              smoothing_seasonal=gamma, damping_slope=phi)
                if getattr(res, measure) < min_measure:
                    min_measure = getattr(res, measure)
                    model = [t, None]
            except Exception as e:
                warnings.warn(str(e))
    return model
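The brute-force search in auto_es can be sketched without statsmodels. This is a minimal illustration only, not the PR's code: `select_best`, `toy_score` and the `toy_aic` values are hypothetical names and made-up numbers, with `score` standing in for "fit the model and read off the information criterion".

```python
import itertools
import math
import warnings


def select_best(trends, seasonals, score):
    """Return the (trend, seasonal) pair with the lowest score.

    `score` is a hypothetical callable standing in for fitting a model and
    reading off its information criterion; it may raise for invalid pairs.
    """
    best_pair, best_score = (None, None), math.inf
    for trend, seasonal in itertools.product(trends, seasonals):
        try:
            value = score(trend, seasonal)
        except Exception as exc:
            # a failed fit is reported but does not stop the search
            warnings.warn(str(exc))
            continue
        if value < best_score:
            best_pair, best_score = (trend, seasonal), value
    return best_pair, best_score


# Made-up scores standing in for AIC values (illustration only).
toy_aic = {("add", "add"): 12.0, ("add", "mul"): 10.5,
           ("mul", "add"): 11.0, ("mul", "mul"): 13.0}


def toy_score(trend, seasonal):
    """Look up a toy score; unknown pairs fail like an invalid model."""
    if (trend, seasonal) not in toy_aic:
        raise ValueError("no fit for %r, %r" % (trend, seasonal))
    return toy_aic[(trend, seasonal)]


with warnings.catch_warnings():
    # silence the expected warnings for the pairs that cannot be fitted
    warnings.simplefilter("ignore")
    best_pair, best_score = select_best(["add", "mul", None],
                                        ["add", "mul"], toy_score)
print(best_pair, best_score)
```

Keeping the scoring step behind a callable means the same loop could be reused for any model family or measure, at the cost of fitting every candidate.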
224 changes: 224 additions & 0 deletions statsmodels/tsa/automatic/forecast.py
@@ -0,0 +1,224 @@
"""Classes to hold the Forecast results individually and in sets."""
import numpy as np
import warnings
import statsmodels.api as sm
from statsmodels.tools.decorators import cache_readonly

from statsmodels.tsa.automatic import sarimax
from statsmodels.tsa.automatic import exponentialsmoothing


class Forecast(object):
    """Class to hold the data of a single forecast model."""

    def __init__(self, endog, model, s=1, test_sample=0.2, auto_params=False,
                 **spec):
        """Initialize the data for the Forecast class.

        Parameters
        ----------
        endog : list
            input contains the time series data over a period of time.
        model : class
            specifies which model to use for forecasting the data.
        s : int
            the length of a seasonal period.
        test_sample : float, int, str or None
            a value between 0 and 1 is the ratio of the endog to keep for
            testing, any other number is the count of test observations,
            a str is the index label at which the test set starts, and
            None uses all of the endog for training.
        auto_params : boolean
            ensures automatic selection of model parameters.

        Notes
        -----
        Status : Work In Progress.

        """
        # TODO: make date selection of test sample more robust
        if test_sample is None:
            self.endog_training = endog
            self.indicator = True
        elif isinstance(test_sample, str):
            # test_sample is an index label (e.g. a date string) marking
            # the start of the test set
            self.endog_training = endog[:test_sample][:-1]
            self.endog_test = endog[test_sample:]
        elif isinstance(test_sample, (int, float)):
            if 0.0 < test_sample < 1.0:
                # convert the ratio into the number of observations
                # to consider for the endog_test
                test_sample = int(test_sample * len(endog))
            self.endog_training = endog[:-test_sample]
            self.endog_test = endog[-test_sample:]
        if auto_params:
            if model == sm.tsa.SARIMAX:
                if s > 1:
                    trend, p, d, q, P, D, Q = sarimax.auto_order(
                        self.endog_training, stepwise=True, s=s, **spec)
                    # update dictionary
                    spec['order'] = (p, d, q)
                    spec['seasonal_order'] = (P, D, Q, s)
                    if trend:
                        spec['trend'] = 'c'
                else:
                    trend, p, d, q = sarimax.auto_order(
                        self.endog_training, stepwise=True, **spec)
                    # update dictionary
                    spec['order'] = (p, d, q)
                    if trend:
                        spec['trend'] = 'c'
            if model == sm.tsa.ExponentialSmoothing:
                mod = exponentialsmoothing.auto_es(self.endog_training,
                                                   **spec)
                # update dictionary
                spec['trend'] = mod[0]
                spec['seasonal'] = mod[1]
        # Fitting the appropriate model
        self.model = model(self.endog_training, **spec)
        # self.model = model(endog, **spec)
        self.results = self.model.fit()

    @property
    def resid(self):
        """(array) The list of residuals while fitting the model."""
        return self.results.resid

    @property
    def fittedvalues(self):
        """(array) The list of fitted values of the time-series model."""
        return self.results.fittedvalues

    @cache_readonly
    def nobs_training(self):
        """(int) Number of observations in the training set."""
        return len(self.endog_training)

    @cache_readonly
    def nobs_test(self):
        """(int) Number of observations in the test set."""
        return len(self.endog_test)

    @cache_readonly
    def forecasts(self):
        """(array) The model forecast values."""
        return self.results.forecast(self.nobs_test)

    # In this case we'll be computing accuracy using forecast errors
    # instead of residual values. The forecast error is calculated on
    # the test set.
    @cache_readonly
    def forecasts_error(self):
        """(array) The model forecast errors."""
        return self.endog_test - self.forecasts

    @cache_readonly
    def mae(self):
        """(float) Mean absolute error."""
        return np.mean(np.abs(self.forecasts_error))

    @cache_readonly
    def rmse(self):
        """(float) Root mean squared error."""
        return np.sqrt(np.mean(self.forecasts_error ** 2))

    @cache_readonly
    def mape(self):
        """(float) Mean absolute percentage error."""
        return np.mean(np.abs(self.forecasts_error / self.endog_test)) * 100

    @cache_readonly
    def smape(self):
        """(float) Symmetric mean absolute percentage error."""
        return np.mean(
            200 * np.abs(self.forecasts_error) /
            np.abs(self.endog_test + self.forecasts))

    @cache_readonly
    def mase(self):
        """(float) Mean absolute scaled error."""
        # for non-seasonal time series: scale the forecast errors by the
        # in-sample mean absolute error of the naive one-step forecast,
        # i.e. the mean absolute first difference of the training data
        scale = np.mean(np.abs(np.diff(np.asarray(self.endog_training))))
        return np.mean(np.abs(self.forecasts_error)) / scale


class ForecastSet(object):
    """Class to hold various Forecast objects.

    The class holds various Forecast objects and selects
    the best Forecast object based on some evaluation measure.
    """

    def __init__(self, endog, test_sample):
        """Initialize the values of the ForecastSet class.

        Parameters
        ----------
        endog : list
            input contains the time series data over a period of time.
        test_sample : float
            ratio of the endog to keep for testing.

        """
        self.endog = endog
        self.test_sample = test_sample
        self.models = []

    def add(self, model, **spec):
        """Create a Forecast object and add it to this ForecastSet.

        Parameters
        ----------
        model : class
            specifies which model to use for forecasting the data.

        """
        # pass test_sample by keyword: the third positional argument of
        # Forecast is the seasonal period `s`
        fcast = Forecast(self.endog, model, test_sample=self.test_sample,
                         **spec)
        self.models.append(fcast)

    def add_fcast(self, fcast_ob):
        """Add an existing Forecast object to this ForecastSet.

        Parameters
        ----------
        fcast_ob : Forecast object
            already created Forecast object.

        """
        self.models.append(fcast_ob)

    def validate(self):
        """Check that all forecasts share the same training and test data."""
        train = self.models[0].endog_training
        test = self.models[0].endog_test
        for fcast in self.models[1:]:
            if not (fcast.endog_training.equals(train) and
                    fcast.endog_test.equals(test)):
                return False
        return True

    def select(self, measure='mae'):
        """Select the best forecast based on criteria provided by the user.

        Parameters
        ----------
        measure : str
            specifies the measure to select the best performing model.
            mae is the default measure.

        Returns
        -------
        model : Forecast object
            the Forecast with the lowest value of the measure, or None if
            the forecasts do not share the same training and test data.

        """
        if self.validate():
            measure_vals = np.array(
                [getattr(fcast, measure) for fcast in self.models])
            # index of the first forecast with the smallest measure value
            return self.models[int(np.argmin(measure_vals))]
        else:
            warnings.warn("Cannot compare models with different training "
                          "and test data")
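The accuracy measures defined on Forecast can be checked by hand on a small example. The following is a sketch under stated assumptions, not part of the PR: `accuracy_measures` is a hypothetical helper, the input numbers are made up, and the sMAPE follows the same form as the `smape` property above (absolute value of the sum in the denominator).

```python
import numpy as np


def accuracy_measures(train, test, forecasts):
    """Compute forecast accuracy measures on a hold-out test set."""
    train = np.asarray(train, dtype=float)
    test = np.asarray(test, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    error = test - forecasts
    # MASE scale: in-sample mean absolute error of the naive one-step
    # forecast, i.e. the mean absolute first difference of the training data
    scale = np.mean(np.abs(np.diff(train)))
    return {
        "mae": np.mean(np.abs(error)),
        "rmse": np.sqrt(np.mean(error ** 2)),
        "mape": np.mean(np.abs(error / test)) * 100,
        "smape": np.mean(200 * np.abs(error) / np.abs(test + forecasts)),
        "mase": np.mean(np.abs(error)) / scale,
    }


# Made-up data: errors are [1, -2], training differences are [1, 2, 3].
measures = accuracy_measures(train=[1.0, 2.0, 4.0, 7.0],
                             test=[10.0, 12.0],
                             forecasts=[9.0, 14.0])
for name, value in measures.items():
    print(name, value)
```

With these inputs the MAE is 1.5 and the MASE scale is 2, so the MASE is 0.75: the forecasts err less, on average, than a naive one-step forecast did on the training data.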