New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: New Seasonal TimeSeries Model:TBATS/BATS/MTBATS #2892
base: main
Are you sure you want to change the base?
Conversation
statsmodels/tsa/statespace/tbats.py
Outdated
|
||
import numpy as np | ||
from numpy.linalg import LinAlgError | ||
import statsmodels.api as sm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICS, this adds a circular import and causes the TravisCI test error.
inside the code we only import directly from the modules (or subpackages)
(I'm not sure what this circular import does and why it fails for some tests but not all.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking into it
statsmodels/tsa/statespace/tbats.py
Outdated
|
||
if self.boxcox: | ||
endog, lmbda = self.transform_boxcox(self.data.endog) | ||
if np.isnan(lmbda) or not (0 <= lmbda <= 1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of the next commit you can specify bounds=(0, 1)
as an optional argument to transform_boxcox
. This will ensure optimisation takes place over the specified interval.
I finally had a chance to look at Hyndman's papers and I'm starting to look this over. |
elif key == 'transition': | ||
self.ssm['transition', :-1, :-1] = value | ||
elif key == 'selection': | ||
self.ssm['transition', :-1, -1:] = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this should be 'selection'
, and the same below in __getitem__
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, follow the construction of your blog post which you gave to me it should be like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why trying to set the selection should set things in the transition matrix. Can you remind me which blog post? It could have been a typo on my part.
Given this, it looks like you never set the actual selection matrix, which I think would render the model useless? Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http://nbviewer.jupyter.org/gist/ChadFulton/f0b8fc2e7d47a0103f97
and the origin google groups discuss is here
confuse with the example of AR(1) State Space Model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, yes, you're correct; and now I see the setting of the actual selection matrix in the __init__
call, so that should be fine.
@ChadFulton Maybe you can help me to look at the start params initialization at arma error and regression part, actually I am not familar with this. |
statsmodels/tsa/statespace/tbats.py
Outdated
---------- | ||
periods: iterable of float or None or array, optional | ||
The period of the seasonal component. Default is None. Can be multiple | ||
k: int or None, optional, should be same length of periods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be renamed into a descriptive name.
Sure - you're asking about how to compute |
I can't quite tell what the status of this model is. It looks pretty complete, but there are no unit tests. Are things generally working, or is there something holding it up? I know you were running into a problem with "variance initialization of the model", but I'm not sure what that means - is that still an issue? |
It looks to me like this model pretty much works. Hard to test against the R version of TBATS because they allow larger parameter ranges for alpha and beta. I don't quite understand, but running: library(forecast)
dta <- read.csv('datasets/macrodata/macrodata.csv')
endog <- log(dta$realgdp) * 400
res <- tbats(endog) yields the following:
alpha > 1 which is not a huge problem if TBATS is allowing the larger parameter range, but I thought beta was always supposed to be positive (e.g. see Hyndman's book, page 45). I'm not an expert in these models though, so I may have misunderstood the applicable parameter ranges. @Leoyzen @davidljung do you know what's going on with R here? |
@ChadFulton According to the file checkAdmissibility.R I don't see any limitation to the alpha/beta/gamma, so I just use the information from the book. Your question is also my problem , it became more different result to TBATS of R. The model is almost done here , but some issue with arma and variance initial coefs(within initialize_state function). I'm not familar with this, so please help. The optimize should be done otherwise it takes long time and large nums of iteration to the fit, and I found that it's hard to optimization to converge with lbfgs. My next goal is to get more docs and right code style of statsmodels. |
Actually I think I have found the reference here - take a look at Table 10.1 in Hyndman's book. It looks like he's using the third set of constraints, which relax the positivity constraint if there is damping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
After 8 years, I've making a progress now. The key change is from the fomula below. Considering SSOE (or Innovation State space) formula: and the normal state space formula: according to the timing differences, we can rewrite the normal formual to then we make sure and def update(self, params, transformed=True, includes_fixed=False, complex_step=False):
......
# exclude ma from caculation.It's ok.
end = -q if q != 0 else None
# we assume $TR = g $
# so we have to resolve R from $R = T^ * g$
self["selection", 1:end, 0] = la.solve(
self["transition", 1:end, 1:end], self._internal_selection[:end, 0]
)
# we need ensure $ ZR = 1 $. so the first value of selection matrix should be
# caculate from others.
self["selection", 0, 0] = 1 - self["design", :, 1:].dot(
self["selection", 1:, :]
) Then the code from can be removed. def _initialize_constant_statespace(self, initial_state):
.....
# Apply the prediction step to get to what we need for our Kalman
# filter implementation
# This is not needed anymore because we compute within update step.
# constant = np.dot(self.ssm["transition"], constant)
self.initialization.constant = constant This implement has been verified by both official implements and python implements, also manually caculated. So it will be nice when a new class such as InnovationRepresentation which inherent from Representation and implement such changes. @ChadFulton Can you take a look and give some advices before I moving forward? |
Thanks @Leoyzen for coming back to this, I will take a look soon (unfortunately, it might not be until this weekend) |
2. Add fourier seasonal transform 3. Add some related tools
2. fix circular import issue
I just created a class named InnovationsMLEModel to simplify the procedure implementing the SSOE Model in Kalman Filters. The params and results has been verified with both Python tbats and R forecast package. |
A new timeseries forecast/decompose model based on Rob.J.Hyndman's paper.
http://robjhyndman.com/papers/complex-seasonality/
It is the same as the tbats model of R's forecast package.
Feature:
work: