Parameterless models #1259

Closed
bashtage opened this Issue Dec 19, 2013 · 9 comments


@bashtage
Contributor

A recent comment on Google Groups about an ARIMA(0,1,0) got me thinking about the usefulness of parameterless models. These can be useful for likelihood ratio tests and often come up when implementing model selection.

https://github.com/bashtage/statsmodels/compare/ARIMA-p-0-q-0

I've started working on this, but it will probably have to go deeper than just the actual estimation code.

Groups comment:

https://groups.google.com/forum/?fromgroups#!topic/pystatsmodels/mnKRw3eKzhU

@josef-pkt
Member

Can you try to avoid some of the reformatting in the changes? It makes it very difficult to see what you actually changed.
For Eclipse/PyDev I had to turn off all automatic formatting because it was changing too many things around. The only thing I kept was removing trailing whitespace.

@josef-pkt
Member

I worry a bit that we are getting too many special cases into the classes, so that the code will become very difficult to follow.
Or maybe not? Perhaps it just means lots of empty arrays.

What is there actually to estimate in a "parameterless" model?
I understand the likelihood ratio part, but it's also possible to construct the parameterless likelihood without going through the full model.
ARMAX(0, 0) is just OLS, isn't it?
(I'm not sure anymore, or don't remember, whether the differencing in ARIMAX(0, 1, 0) matters.)

I'm just trying to get a rough idea of where this is going; for the most part I stay out of ARMA and ARIMA.

@bashtage
Contributor

ARIMAX(0, d, 0) is just OLS on the appropriately d-differenced data, which is good. The reason I think a parameterless model is useful is that it is far more intuitive to fit an ARIMA(1,1,0), an ARIMA(0,1,1), and an ARIMA(0,1,0) (or even an ARIMA(0,0,0)).
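To illustrate the equivalence (a minimal numpy-only sketch, not statsmodels code): an ARIMA(0, 1, 0) with a constant is OLS of the first difference on a constant, so the drift estimate is just the mean of the differences.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulate a random walk with drift: y_t = 0.5 + y_{t-1} + e_t
e = rng.standard_normal(200)
y = np.cumsum(0.5 + e)

# ARIMA(0, 1, 0) with a constant is OLS of diff(y) on a constant,
# so the drift estimate equals the sample mean of the differences.
dy = np.diff(y)
X = np.ones((dy.shape[0], 1))
beta, *_ = np.linalg.lstsq(X, dy, rcond=None)

assert np.allclose(beta[0], dy.mean())
```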

The code was mostly fine for the ARMA(0,0)/ARIMA(0,1,0) as long as it had a constant. Things got more difficult when there were literally no parameters in the model, but the primary fix was just to make sure the likelihood can handle this case correctly.

I used a hack to work around printing the parameters in summary(), since it assumes any model will have len(parameters) > 0. This can (and should) be fixed in the SummaryTable code rather than in the high-level function where I have put it for the time being.

Similarly, OLS should be able to return results when there are no regressors, since this is just a model where all coefficients have been restricted to be 0.

The more useful parts of a parameterless model are things like information criteria, the log-likelihood, and predictions. Not that these aren't usually simple to compute, but the point is to ensure they are coherent with everything produced by a model that is not parameterless. Suppose, for example, that ARIMA used a scaled log-likelihood that omitted the log(2*pi) term while OLS included it - this would make it hard to compare across models.
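The log(2*pi) issue can be made concrete (a hypothetical sketch, not statsmodels code; the function name is made up): the two conventions differ by a constant n/2 * log(2*pi), which cancels within one model but silently shifts AIC/BIC comparisons across models that make different choices.

```python
import numpy as np

def gaussian_llf(resid, scaled=False):
    """Concentrated Gaussian log-likelihood of a residual vector.

    With scaled=True the log(2*pi) term is dropped, as some
    implementations do; the two versions differ by the constant
    n/2 * log(2*pi)."""
    n = resid.shape[0]
    sigma2 = resid @ resid / n
    llf = -0.5 * n * (np.log(sigma2) + 1.0)
    if not scaled:
        llf -= 0.5 * n * np.log(2.0 * np.pi)
    return llf

resid = np.array([0.5, -1.2, 0.3, 0.9, -0.4])
full = gaussian_llf(resid)
scaled = gaussian_llf(resid, scaled=True)

# The offset is constant, so it cancels within one model family but
# not across models that treat the term differently.
assert np.allclose(full - scaled, -0.5 * 5 * np.log(2 * np.pi))
```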

Finally, since parameterless models often overlap with other parameterless models (but don't once parameters are introduced), this provides a stringent cross-model sanity check.

@bashtage
Contributor

> Can you try to avoid some of the reformatting in the changes? It makes it very difficult to see what you actually changed.
> For Eclipse/PyDev I had to turn off all automatic formatting because it was changing too many things around. The only thing I kept was removing trailing whitespace.

Yeah, I can see that. I actually started without letting it clean up the code, but in a really large source file uneven formatting really makes it more difficult to read the code (even code I wrote in the past). There were also a non-trivial number of things like unused imports and unused local variables that made it harder to read.

@bashtage
Contributor

I reverted most non-essential changes, so it is quite a bit easier to see what's changed. I think 2 or 3 deep changes, mostly to handle empty parameters when estimating the model, could accomplish this on a general basis (and then a tiny number of model-specific changes).

@josef-pkt
Member

Some thoughts on this, while trying to figure out what this would mean in other models:

A parameterless OLS doesn't have any "model": y = nan + epsilon.
What we actually do have use cases for are models with fixed, not estimated, parameters. (For example, at some point we had the discussion of how to reuse predict or forecast from the model and results instances without estimating the model. For some features of TSA we have the Process classes.)

The "parameterless" OLS model, AFAICS, is just a model with a constant or other exog with params = zeros.
In the same way, an ARMA(0, 0) or ARIMA(0, 1, 0) is the same as an ARMA(1, 0), ARIMA(1, 1, 0), ARMA(1, 1) or ARIMA(1, 1, 1) with the lag coefficients fixed at zero.
We don't need to handle empty parameters, we just set them to zero and we can reuse the current loglike to calculate for example the LikelihoodRatioTest.

Similarly, if ARIMA(0, 1, 0) has a constant, trend or other exog, then we can just use OLS to estimate the parameters, concatenate some zeros, and we have a valid ARIMA(1, 1, 1) model, not for the estimation but for all other parts: llf, predict, and so on.
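The idea of reusing the unrestricted likelihood with the lag coefficients fixed at zero can be sketched with a toy conditional-sum-of-squares AR(1) likelihood (numpy-only; the function is hypothetical, not the statsmodels API):

```python
import numpy as np

def ar1_css_llf(y, phi):
    """Conditional-sum-of-squares Gaussian log-likelihood of an AR(1)."""
    resid = y[1:] - phi * y[:-1]
    n = resid.shape[0]
    sigma2 = resid @ resid / n
    return -0.5 * n * (np.log(2 * np.pi) + np.log(sigma2) + 1.0)

rng = np.random.default_rng(1)
y = rng.standard_normal(500)  # white noise: the AR(0) null is true

# The "parameterless" AR(0) likelihood is just the AR(1) likelihood
# evaluated at phi = 0, so no separate model class is needed for the
# likelihood-ratio test.
phis = np.linspace(-0.9, 0.9, 181)  # crude grid "estimation"
llf_hat = max(ar1_css_llf(y, p) for p in phis)
llf_0 = ar1_css_llf(y, 0.0)
lr = 2 * (llf_hat - llf_0)  # ~ chi2(1) under the null
assert lr >= 0
```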

The only direct non-estimation loglike that might be similar is the saturated likelihood that is used in the deviance calculations in GLM. But that is explicitly coded for each case.

@bashtage
Contributor

I don't think the model is y = nan + eps, but is instead y = eps.

The difficulty I see with a 0 parameter is that these will need a special case in each model to first set the parameter, and then to forget the parameter (and possibly its standard error, the covariance matrix, information criteria, etc). This seems like it may be more complex than evaluating the model with params = array([]).
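For what it's worth, numpy already handles params = array([]) gracefully in the generic linear-prediction path (a minimal sketch, assuming the model's fitted values come from an exog-times-params product):

```python
import numpy as np

n = 10
y = np.arange(n, dtype=float)

# A model with no regressors: exog has shape (n, 0), params is empty.
X = np.empty((n, 0))
params = np.array([])

# The matrix product of an (n, 0) array with a length-0 vector is well
# defined and gives fitted values of zero, so y = eps falls out of the
# generic code path with no special case.
fitted = X @ params
resid = y - fitted

assert fitted.shape == (n,)
assert np.all(fitted == 0)
assert np.allclose(resid, y)
```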

> Similarly, if ARIMA(0, 1, 0) has a constant, trend or other exog, then we can just use OLS to estimate the parameters, concatenate some zeros, and we have a valid ARIMA(1, 1, 1) model, not for the estimation but for all other parts: llf, predict, and so on.

I agree with this, and I changed the behavior to use "css" whenever the model is only exogenous since this has a closed form estimator.

> The only direct non-estimation loglike that might be similar is the saturated likelihood that is used in the deviance calculations in GLM. But that is explicitly coded for each case.

I was thinking about a GLS-type model with non-constant variances but no parameters. Or perhaps all of the parameters are restricted under the null so that the restricted model has no free parameters.

@josef-pkt
Member

> I don't think the model is y = nan + eps, but is instead y = eps.
>
> The difficulty I see with a 0 parameter is that these will need a special case in each model to first set the parameter, and then to forget the parameter (and possibly its standard error, the covariance matrix, information criteria, etc). This seems like it may be more complex than evaluating the model with params = array([]).

Three cases:

1. If there are no parameters at all, then we can just NaN the entire cov_params, since no parameter inference is possible.

2. We already have the provision to NaN some parameters in cov_params and related statistics. It's currently used only in discrete models after estimating with an L1 penalty, and hasn't been extended or checked for other cases.

3. We don't have much on real restricted estimation, where we have some free parameters that are estimated subject to restrictions. Only the new GEE and some linear RLS in the sandbox can estimate under equality constraints. GEE drops the parameters, as far as I have seen. Stata in some cases reports fixed parameters in the summary table, but I never looked closely at that case. (An example I remember: offset or exposure in CountModels, where the parameter is fixed at 1; Stata shows it in the summary, we don't, yet.) It's not clear to me yet what will be the best way for us to handle this.

> I was thinking about a GLS-type model with non-constant variances but no parameters. Or perhaps all of the parameters are restricted under the null so that the restricted model has no free parameters.

This case might require a separate discussion, because I'm not sure what you have in mind, and it's something I'd like to see added soon. In the standard two-step estimation for feasible GLS, we estimate the variance parameters in a separate step. The mean parameter table in summary is independent of the variance model and could be concatenated or not. Currently GEE, GLSAR and the sandbox GLSHet don't report the (co)variance parameters together with the mean parameters (and I guess we don't have standard errors on any parameters of the (co)variance function). A similar "parameterless" variance function would appear with non-parametric estimation of the variance function, I guess.

@jseabold
Member

Closing this. AFAIK this is now implemented.

@jseabold jseabold closed this Apr 11, 2014