ENH: Statespace: Add unobserved components. #2432
Notebook looks very nice. I haven't looked at the PR itself. Is it possible to add an outlier dummy for the year 1913? |
Never mind, I shouldn't just look at pretty plots; I should also read some of the code. |
Question: in 1502e11 I added specification of the level / trend component via a string name; for example the deterministic trend model (see #2416) can be specified as: mod = UnobservedComponents(endog, irregular=True, level=True, trend=True) or as: mod = UnobservedComponents(endog, 'dtrend') The question I have is that I used the string specifications that Stata uses. Stata didn't come up with the names themselves, which are pretty standard in the UCM literature, but they did come up with these specific abbreviations. There are other (maybe better) abbreviations (e.g. I personally like "local_level" more than "llevel"). So can we / should we use the Stata names, or should we use our own (possibly more verbose) names? |
Oh, or we could support both. |
I think I'll move forward with supporting them both, since there are a couple of specifications for which Stata does not have a name. I plan to primarily use more verbose names ('llevel' -> 'local_level') but then also accept Stata's names. |
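A minimal sketch of how both naming schemes could be accepted, with the verbose names as primary and Stata's abbreviations as aliases. The table contents, the `resolve_spec` helper, the verbose name `deterministic_trend`, and the `stochastic_level` flag are all assumptions made for illustration; only `llevel`, `local_level`, `dtrend`, and the keywords `irregular`, `level`, `trend` appear in this thread.

```python
# Hypothetical alias table: Stata-style abbreviation -> verbose name.
STATA_ALIASES = {
    "llevel": "local_level",
    "dtrend": "deterministic_trend",  # verbose name is an assumption
}

# Hypothetical mapping from verbose name to component keyword flags.
SPECS = {
    "local_level": dict(irregular=True, level=True, stochastic_level=True),
    "deterministic_trend": dict(irregular=True, level=True, trend=True),
}


def resolve_spec(name):
    """Return (verbose_name, flags) for a verbose or Stata-style name."""
    canonical = STATA_ALIASES.get(name, name)
    if canonical not in SPECS:
        raise ValueError("Unknown trend specification: %r" % (name,))
    # Copy so callers cannot mutate the shared table.
    return canonical, dict(SPECS[canonical])
```

With this layout, `resolve_spec('llevel')` and `resolve_spec('local_level')` give the same result, and a specification that Stata has no name for only needs an entry in `SPECS`.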
In general I also prefer names that are not easy to misspell. With abbreviations I often have to guess how the word is abbreviated. Question (I haven't looked at the details yet): is the trend limited to a constant or linear trend, or can this be extended to a quadratic or higher-order polynomial trend? |
Trend has a bit of a different meaning here, so the "level" refers to a possibly time-varying intercept, and the "trend" refers to a possibly time-varying slope, on time. If I think of the intercept as a "degree 1 trend" and the slope as a "degree 2 trend" then it is possible to have higher-order trends (and in fact R's package KFAS allows this), but I've never seen it in practice, and it isn't referenced in the standard texts on UC models. |
If you look at the trend component from #2416, the degree 3 trend would allow the degree 2 trend (beta) to have a new component in it (let's call it gamma) that evolves according to a random walk. Higher-order trends are added in the same way. |
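Written out, the degree 3 extension might look like the following. This assumes the usual local linear trend form and timing convention, which may differ from the exact equations in #2416; the symbols for the irregular, level, and gamma disturbances (epsilon, eta, xi) are chosen here for illustration, while mu, beta, gamma, and zeta follow the discussion.

```latex
\begin{aligned}
y_t          &= \mu_t + \varepsilon_t \\
\mu_{t+1}    &= \mu_t + \beta_t + \eta_{t+1} \\
\beta_{t+1}  &= \beta_t + \gamma_t + \zeta_{t+1} \\
\gamma_{t+1} &= \gamma_t + \xi_{t+1}
\end{aligned}
```

Setting gamma_t = 0 for all t and dropping the last equation recovers the degree 2 case, where beta itself is a random walk.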
About #2416: in the table you list (local) trend and stochastic trend separately, but the two equations look to me like a stochastic trend, i.e. a stochastic drift that is a random walk. Which is the deterministic trend and which is the stochastic trend? If beta = 0, is the mean mu then also zero? My initial analogy for local trend was local polynomial regression (as in lowess or similar kernel regression). |
The distinction is whether the stochastic part is allowed or not. So, for example, if you had a trend but not a stochastic trend, that would mean that zeta_{t+1} = 0 for all t, or another way to put it is that sigma_{zeta}^2 = 0.
If beta = 0, then mu is not necessarily zero, although it is constant. This is how a mean component can be specified. For example, take a look at the notebook I provided for the Kalman smoother: http://nbviewer.ipython.org/gist/ChadFulton/c3bc720c2fa31c564844/. If you look at the "Deterministic level + stochastic cycle" model, the smoothed level component is non-zero (although here there's also a regression effect, but nonetheless the mu term is non-zero). Stata has a nice page in its ucm manual showing the formulas for several of the specifications. See http://www.stata.com/manuals13/tsucm.pdf, page 20 |
Ok, I guess I understand the stochastic distinction about the mean: if beta = 0, then the mean is equal to the initial condition. |
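A small simulation can make this concrete. It is only a sketch, assuming the recursion mu_{t+1} = mu_t + beta_t + eta_{t+1}, beta_{t+1} = beta_t + zeta_{t+1}; the function name and arguments are made up for illustration and are not part of the PR.

```python
import numpy as np


def simulate_level(mu0, beta0, sigma_eta=0.0, sigma_zeta=0.0, n=50, seed=0):
    """Simulate the level mu_t of a local linear trend (illustrative only)."""
    rng = np.random.default_rng(seed)
    mu = np.empty(n)
    mu_t, beta_t = mu0, beta0
    for t in range(n):
        mu[t] = mu_t
        # Level update uses the current slope; both shocks vanish when
        # the corresponding standard deviation is zero.
        mu_t = mu_t + beta_t + sigma_eta * rng.normal()
        beta_t = beta_t + sigma_zeta * rng.normal()
    return mu


# Deterministic case, beta = 0: the level stays at its initial condition.
flat = simulate_level(mu0=5.0, beta0=0.0)
# Deterministic case, beta = 2: the level is a straight line in time.
line = simulate_level(mu0=5.0, beta0=2.0)
```

Passing a positive `sigma_zeta` turns the slope into a random walk, which is the stochastic trend case discussed above.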
Yes, that's right. There may be a better way to set this up? But maybe people will mostly just use the keywords anyway.
Right now:
This behavior is not quite consistent, so maybe instead of (1), above, we should have: 1(b). If you set stochastic_trend=True, then it issues a warning "Stochastic trend component specified without trend component; trend component added" and sets trend=True
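The proposed 1(b) behavior could be sketched roughly as below. The helper name `normalize_trend_flags` is hypothetical and the actual constructor logic in the PR may be organized differently; the warning text is the one quoted above.

```python
import warnings


def normalize_trend_flags(trend=False, stochastic_trend=False):
    """Make the trend flags consistent, warning when one is implied."""
    if stochastic_trend and not trend:
        warnings.warn("Stochastic trend component specified without trend"
                      " component; trend component added")
        trend = True
    return trend, stochastic_trend
```

A consistent specification such as `normalize_trend_flags(trend=True, stochastic_trend=True)` passes through unchanged and without a warning.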
If beta=0, then it is just an intercept estimated by recursive OLS. If level=True and trend=True but neither are stochastic, then it is an intercept and the slope of a time trend, both estimated by recursive OLS. The initialization is diffuse (mean=0, variance = 1e6) which allows it to pretty quickly pick out the OLS estimate (e.g. the red line settles down relatively quickly in the graph in the notebook I mentioned above). You could also estimate the intercept (or even the slope) parameter by MLE (which is what we do in SARIMAX, which allows arbitrary time trends), but that is not typically done in the UC models because the stochastic extensions allow time-varying-parameters. I suppose if you wanted arbitrary (non-time-varying) time-trends, you could add them as regressors in the |
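To illustrate the recursive-OLS remark for the intercept-only case, here is a scalar Kalman filter for a constant level with the diffuse-style initialization described above (mean 0, variance 1e6). It is a standalone sketch, not the statespace code from the PR; the function name, the simulated data, and the known observation variance are assumptions.

```python
import numpy as np


def filter_constant_level(y, obs_var=1.0, init_mean=0.0, init_var=1e6):
    """Filtered estimates of a non-stochastic level (state variance 0)."""
    mean, var = init_mean, init_var
    filtered = np.empty(len(y))
    for t, obs in enumerate(y):
        gain = var / (var + obs_var)       # Kalman gain
        mean = mean + gain * (obs - mean)  # update with the forecast error
        var = (1.0 - gain) * var           # no state noise, so this carries over
        filtered[t] = mean
    return filtered


rng = np.random.default_rng(0)
y = 10.0 + rng.normal(size=200)
filtered = filter_constant_level(y)
# Recursive OLS estimate of an intercept is the running sample mean.
running_mean = np.cumsum(y) / np.arange(1, len(y) + 1)
```

Because the initial variance is large relative to the observation variance, `filtered` tracks `running_mean` almost exactly from the first observation on.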
I've started adding the results class. Here's a sample of what I currently have the summary looking like:
Here I allow multiple lines for the "Model:" specification, because the string specification is potentially very long. Does this seem like a reasonable approach? |
The last commit (tentatively) adds a convenience function for plotting the components of the model. For example, the code:

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.tsa.statespace import structural

    # Annual Nile flow data, indexed by year start
    y = sm.datasets.nile.load_pandas().data
    y.index = pd.date_range('1871', '1970', freq='AS')
    mod = structural.UnobservedComponents(y['volume'], 'random walk', cycle=True, damped_cycle=True, stochastic_cycle=True)
    res = mod.fit(method='powell')
    print(res.summary())
    res.plot_components(figsize=(13, 5))

produces the following summary table:
and the following graph: |
I'm still not sure about when adding convenience plotting functions is appropriate, so we can remove this if it is not in the right place. |
Failure due to time-out on coverage run. |
With the last four commits, I believe I have exhausted everything I had planned to do with this model. A decision needs to be made on |
There is one other thing I wanted to check on here, actually: right now, as with all state space models, the default optimization method is l_bfgs. The unobserved components models don't always have the best starting parameters though (I think that For that reason, I have found that doing a first round with powell and then a second round with l_bfgs, starting from powell's parameters, does the best job of optimizing. Is this something I should set up in the |
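The two-round idea can be sketched with plain `scipy.optimize.minimize`, using the Rosenbrock function as a stand-in for a likelihood with poor starting parameters. The helper name and the test function are illustrative assumptions; this is not how the fit method in the PR is wired.

```python
from scipy.optimize import minimize, rosen


def two_stage_minimize(fun, start):
    """Run derivative-free Powell first, then refine with L-BFGS-B."""
    rough = minimize(fun, start, method="Powell")
    # Start the gradient-based round from Powell's solution.
    return minimize(fun, rough.x, method="L-BFGS-B")


res = two_stage_minimize(rosen, [-3.0, 4.0])
```

The first round is there for robustness when the starting point is far from the optimum, and the second is usually cheap once the parameters are already close.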
about the two method optimization: We have plans to add this more generally, Skipper has a PR to allow sequences in the optimization method. In GLM with the scipy optimizers, we use by default a few steps of
That depends on the tradeoffs between speed in nice cases and how often a preliminary more robust optimizer is necessary. You could, for example, add a It would also be useful to add a link to a difficult example case in #1649 |
- k_exog to specification Bunch
- regression_coefficients returns all coefficients
I didn't see (check) that you had rebased the branch. Please add a comment, so it sends out a notification. Merging now. Thanks @ChadFulton |
PR corresponding to #2416.
This is a pretty good first draft, but it needs work in the following areas:
filtered_states
matrix)
Here's a notebook showing an example usage (the Nile models considered there are from http://www.jstatsoft.org/v41/i02/paper):
http://nbviewer.ipython.org/gist/ChadFulton/9ec4aaa13841bbdbe4df/