SUMM/ENH: two-part models with heteroscedasticity, dispersion modelling #7639

Open

josef-pkt opened this issue Aug 11, 2021 · 6 comments

@josef-pkt
(I didn't find a general issue for this)


Beta regression has been merged. It includes two link functions, one for the mean and one for the precision (1/variance scale).

The same pattern can be used for other distributions, e.g. Gaussian, ...
This can include discrete models that have a dispersion parameter, e.g. negative binomial and generalized Poisson.
A possible alternative/complement is to add a subclass of GLM, but the GLM implementation does not model scale by MLE.

We have many issues on diagnostics for heteroscedasticity, but not much on modelling heteroscedasticity.
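For reference, the two-link pattern as it already looks in the merged beta regression; a minimal usage sketch with made-up data (only the BetaModel API is real, the data are invented):

import numpy as np
from statsmodels.othermod.betareg import BetaModel

# made-up data for illustration
nobs = 100
rng = np.random.default_rng(0)
x = np.column_stack([np.ones(nobs), rng.normal(size=nobs)])
y = rng.beta(2.0, 5.0, size=nobs)  # endog must lie in (0, 1)

# one design matrix for the mean (logit link by default),
# one for the precision (log link by default)
res = BetaModel(y, x, exog_precision=x).fit()
print(res.params)  # mean parameters followed by precision parameters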

@josef-pkt

I'm not sure what the best way to get started is.

  • start with a specific model, e.g. the normal linear model with parametric heteroscedasticity
  • add a generic model with two link functions that delegates everything to distribution/family classes

For simplicity we use full, joint MLE estimation/optimization, which means we don't need any of the old computational tricks in the fit method.
Possible improvements from model-specific optimization methods, e.g. iterated FGLS-type estimation, can be added as an extra _fit_xxx method or hidden in a _get_start_params method. (I think that's the case for beta regression.)
That is, the model structure is full MLE; any FGLS-type estimation is subordinate.
(In WLS and GLM we can also use var_weights to get just the partial model conditional on the estimated variance/heteroscedasticity.)
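As a sketch only, the full joint MLE for the normal case could be built on GenericLikelihoodModel; the class name GaussianHetSketch and the exog_scale argument are hypothetical illustrations, not the eventual API:

import numpy as np
from scipy import stats
from statsmodels.base.model import GenericLikelihoodModel

class GaussianHetSketch(GenericLikelihoodModel):
    """Normal linear model with log-linear variance: y ~ N(x'b, exp(z'g))."""

    def __init__(self, endog, exog, exog_scale, **kwds):
        self.exog_scale = np.asarray(exog_scale)
        self.k_mean = np.asarray(exog).shape[1]
        # label the extra (variance) parameters for the results summary
        kwds.setdefault("extra_params_names",
                        ["scale%d" % i for i in range(self.exog_scale.shape[1])])
        super().__init__(endog, exog, **kwds)

    def loglike(self, params):
        beta = params[:self.k_mean]
        gamma = params[self.k_mean:]
        mean = self.exog @ beta
        var = np.exp(self.exog_scale @ gamma)  # log link keeps variance positive
        return stats.norm.logpdf(self.endog, loc=mean,
                                 scale=np.sqrt(var)).sum()

# usage sketch: OLS start params for the mean, zeros for the variance part
# mod = GaussianHetSketch(y, x, exog_scale=z)
# res = mod.fit(start_params=np.r_[np.linalg.lstsq(x, y, rcond=None)[0],
#                                  np.zeros(z.shape[1])], method="bfgs")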

@josef-pkt commented Sep 29, 2021

I think I'll do this next.

Also useful as examples to work out design issues.
The structure is similar to betareg, and related to parametric survival #7666 (which uses a scale, not precision, parameterization and has no mean function).

location: othermod.dispersion_model (econometrics would use "heteroscedastic" in the name)

first candidates: GLM families for continuous endog, Gaussian and gamma (and maybe a reason to look at inverse Gaussian)
The t-distribution would also be a good candidate; estimates would be more robust to outliers. miscmodels.TModel can be built out to a full model (maybe starting with fixed df).

One advantage of the weight/precision parameterization is that we can have observations with weight = 0, as in RLM.
(But scipy distributions use the scale parameterization.)

not at first: dispersion models for discrete data

  • negative binomial: the dispersion parameter needs to be positive
  • generalized Poisson: the dispersion parameter can be negative, but the lower bound is non-trivial
  • beta-binomial: nothing yet
  • DiscretizedCountModel: the distribution exists, but we don't have a regression model yet

@josef-pkt

relevant here (parking a reference for when I remember; I don't have the article right now)
#1074 (comment)

Romano, Joseph P., and Michael Wolf. "Resurrecting weighted least squares." Journal of Econometrics 197, no. 1 (2017): 1-19.

@josef-pkt commented Oct 1, 2021

Update: I had a bug in loglike, using sqrt(scale) instead of scale in one term.

Now, without fixing scale=1 in WLS, the scale estimate differs from one only by the df correction, nobs / (nobs - k_mean).

That should make a good unit test (sketched after the snippets below).
We can similarly cross-check GammaHet with GLM-Gamma using var_weights.

# mean-parameter standard errors: MLE Hessian sub-block vs. WLS vs. full het model
np.sqrt(np.diag(np.linalg.inv(-hess[:2, :2]))), res_wls2.bse, res_gau2.bse
(array([0.00943792, 0.22941801]),
 array([0.00960207, 0.23340819]),
 array([0.00951549, 0.23779801, 0.18804021, 4.03474698]))

# the ratio equals the df correction sqrt(nobs / (nobs - k_mean)), with nobs=59, k_mean=2
1 / (np.sqrt(np.diag(np.linalg.inv(-hess[:2, :2]))) / res_wls2.bse), np.sqrt(res_wls2.scale), np.sqrt(59 / (59 - 2))
(array([1.01739259, 1.01739259]), 1.0173925808044655, 1.0173926082384548)
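A sketch of that unit test, reusing the session names from the snippets above (hess and res_wls2 are assumed to exist):

from numpy.testing import assert_allclose

nobs, k_mean = 59, 2
df_corr = np.sqrt(nobs / (nobs - k_mean))
bse_mle = np.sqrt(np.diag(np.linalg.inv(-hess[:2, :2])))
# WLS bse should equal the MLE mean-block bse times the df correction
assert_allclose(res_wls2.bse, bse_mle * df_corr, rtol=1e-5)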

Initial comment below:

Something puzzling: comparing the scale estimate in WLS with GaussianHet, the WLS scale is around 0.5, but it should be 1, I thought.

As a consequence, the WLS bse are only around half of the het-MLE bse.
The WLS weights are the variance estimates from the full MLE.

If I fix WLS scale=1, then the WLS cov_params agrees with the inverted negative mean sub-matrix of the full MLE Hessian. This corresponds to EIM, while OIM differs a bit (the Hessian/cov_params is not block-diagonal in a smaller sample, nobs=59).

# with WLS scale fixed at 1: MLE Hessian sub-block and WLS bse agree exactly
np.sqrt(np.diag(np.linalg.inv(-hess[:2, :2]))), res_wls2.bse, res_gau2.bse
(array([0.01334723, 0.32444601]),
 array([0.01334723, 0.32444601]),
 array([0.01345694, 0.33629707, 0.26592898, 5.70599272]))

(Maybe I still have a bug somewhere.)

@josef-pkt

We also need GMMHet, e.g. for quasi-likelihood models with variance functions #1777;
#1790 is for replicating GLM by GMM.

Example: excess dispersion for binomial counts.

  • weighted mean function
  • variance is e.g. the binomial variance function with an adjustment factor (sketched below):
    • beta-binomial is a proper distribution with an overdispersion factor
    • underdispersion could have the same variance function but with "negative overdispersion", e.g. because of correlation
      (Simonoff 2003, Categorical book, p. 94)
      However, there is no explicit distribution for that. We can still estimate mean and variance and get joint inference.
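A sketch of that adjusted variance function (name and parameterization are illustrative only):

def var_binomial_disp(p, n, phi):
    # binomial variance n * p * (1 - p) scaled by a dispersion factor;
    # phi > 1 gives over-, phi < 1 "negative over-" (under-) dispersion
    return phi * n * p * (1 - p)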

Outside of GLM/LEF (and elliptical, symmetric distributions), the mean is not (asymptotically) orthogonal to the variance.

Even if we don't use it as an estimation model, we can use it for specification tests and robustness checks.

More generally: classes of moment conditions that use (inverse) link functions.

I guess we need weight functions in the (core) moment conditions and not in the "instruments",
e.g. use Pearson residuals for the mean model.
Moment conditions for variance/dispersion would be similar to those used in excess dispersion tests.
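A rough sketch of such stacked moment conditions (my illustration, not existing statsmodels code; the log-link mean and dispersion and the function name are assumptions):

import numpy as np

def moment_conditions(params, y, x, z, variance_func):
    """Stacked moment conditions: Pearson-weighted mean equations plus
    excess-dispersion-style equations for the dispersion."""
    k = x.shape[1]
    beta, gamma = params[:k], params[k:]
    mu = np.exp(x @ beta)          # illustrative log-link mean
    phi = np.exp(z @ gamma)        # log link keeps dispersion positive
    v = variance_func(mu)
    resid_p = (y - mu) / np.sqrt(v)                       # Pearson residuals
    g_mean = ((y - mu) / v)[:, None] * (mu[:, None] * x)  # dmu/dbeta = mu * x
    g_disp = (resid_p**2 - phi)[:, None] * z              # E[pearson^2] = phi
    return np.column_stack([g_mean, g_disp])  # nobs x (k_mean + k_disp)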

One limitation of GMM: if we have multiple local minima, then GMM doesn't provide a (parametric) objective function, such as a likelihood, to choose between those local optima.

@josef-pkt

another thought:

An offset in the variance function of GaussianHet should be able to replicate var_weights,
e.g. linpred_scale = offset + constant = log(var_weights) + constant

Then, with the exp function (log link):

exp(linpred_scale) = var_weights * exp(constant)

where exp(constant) corresponds to the scalar scale estimate in WLS or similar.

I haven't added offsets to MultiLinkModel yet.
But I guess this would be a test case.
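A tiny numeric check of that identity (offset support in MultiLinkModel is hypothetical here; this only verifies the algebra):

import numpy as np

var_weights = np.array([0.5, 1.0, 2.0])
const = 0.3
offset = np.log(var_weights)
# exp(offset + const) == var_weights * exp(const)
print(np.allclose(np.exp(offset + const), var_weights * np.exp(const)))  # True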
