Weights #505

Open
jseabold opened this Issue Oct 5, 2012 · 8 comments

Owner
jseabold commented Oct 5, 2012

Make sure weights are correctly handled throughout the models. This includes GLM, RLM, ANOVA, and the discrete choice models. I think it also might make sense to have weights objects. It might also be interesting to see how far we can get with those provided by PySAL, but I haven't spoken with their developers since the summer. Many of their estimators are just duplicating ours. We should make it easy for them to use our code.

Owner

I just read this a few days ago:

Carroll, Raymond J., and David Ruppert. "Robust estimation in heteroscedastic linear models." The Annals of Statistics (1982): 429-441.

There are also articles on RLM (M-estimation) with AR(1) errors and with spatial errors.

So far I don't know what (prior) heteroscedasticity weights would mean in discrete models, or in the same models in GLM.

Owner

to check what MATLAB has:
a robust option in curvefit
http://www.mathworks.com/help/curvefit/least-squares-fitting.html#bq_5kr9-4
and robust regression without a weights option (wfun corresponds to our M-estimator norms)
http://www.mathworks.com/help/stats/robustfit.html

Owner

GLM https://groups.google.com/d/msg/pystatsmodels/QtSH8T47pZg/KYwJCrxD3eYJ
Stata and SAS apply weights to loglikeobs, i.e. w_i * loglike_i
Stata poisson only mentions fweights and pweights (and iweights), but doesn't have aweights.
Stata glm also has aweights, but it's not clear how they are used
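
A minimal sketch of the w_i * loglike_i convention, in plain Python rather than statsmodels code. The helper names (`poisson_loglikeobs`, `weighted_loglike`) and the data are made up for illustration. It only checks the frequency-weight reading: integer weights give the same log-likelihood as repeating each observation that many times.

```python
import math

def poisson_loglikeobs(y, mu):
    """Per-observation Poisson log-likelihood: y*log(mu) - mu - log(y!)."""
    return [yi * math.log(mi) - mi - math.lgamma(yi + 1)
            for yi, mi in zip(y, mu)]

def weighted_loglike(y, mu, weights):
    """Weighted log-likelihood: sum_i w_i * loglike_i."""
    return sum(w * ll for w, ll in zip(weights, poisson_loglikeobs(y, mu)))

# Hypothetical data: counts, fitted means, integer frequency weights.
y = [0, 2, 5]
mu = [1.0, 2.5, 4.0]
freq = [3, 1, 2]

# Expand the data set by repeating each row freq_i times.
y_rep = [yi for yi, f in zip(y, freq) for _ in range(f)]
mu_rep = [mi for mi, f in zip(mu, freq) for _ in range(f)]

ll_weighted = weighted_loglike(y, mu, freq)
ll_expanded = weighted_loglike(y_rep, mu_rep, [1] * len(y_rep))
```

The two values agree up to floating point error. This says nothing about aweights or pweights, which differ in how the covariance of the estimates is computed.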

more on robust:
Some papers use a weighted likelihood to discount influential observations (outliers in x).
Trimmed MLE uses 0-1 weights on the loglike to cut outliers (the same as subset selection in this case).
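
To illustrate the 0-1 weights remark, here is a small sketch for an intercept-only Poisson model, where the weighted MLE of the mean has the closed form sum(w_i * y_i) / sum(w_i). The function name and the data are hypothetical, and the trimming rule (drop the last observation) is fixed by hand rather than chosen by an actual trimmed-likelihood algorithm.

```python
def weighted_poisson_mean(y, weights):
    """Weighted MLE of the mean in an intercept-only Poisson model.

    Setting the derivative of sum_i w_i * (y_i*log(mu) - mu) to zero
    gives mu = sum(w_i * y_i) / sum(w_i).
    """
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)

# Hypothetical counts with one obvious outlier at the end.
y = [1, 2, 3, 2, 40]
keep = [1, 1, 1, 1, 0]  # 0-1 weights: trim the outlier

mu_full = weighted_poisson_mean(y, [1] * len(y))
mu_trimmed = weighted_poisson_mean(y, keep)

# Subset selection: estimate on the kept observations only.
y_subset = [yi for yi, k in zip(y, keep) if k == 1]
mu_subset = sum(y_subset) / len(y_subset)
```

Here `mu_trimmed` and `mu_subset` coincide, while `mu_full` is pulled up by the outlier.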

Owner

to the last point: importance weights for Poisson and GLM, a question on Stack Overflow
http://stackoverflow.com/questions/28951982/using-weightings-in-a-poisson-model-using-statsmodels-module

GEE has weights, #2090

Owner

a Stack Overflow question asking for weights in GLM or Logit to compensate for an imbalanced sample
http://stackoverflow.com/questions/31661552/statsmodels-python-weighted-glm
This might be similar in interpretation to inverse probability weights, #2443 and #2442.
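
A rough sketch of the inverse probability weight interpretation, using an intercept-only logit so that the weighted MLE is just a weighted proportion. Everything here is an assumption for illustration: the population sizes, the sampling scheme, and the function name. It is not how statsmodels handles weights.

```python
import math

def weighted_logit_intercept(y, weights):
    """Weighted MLE for an intercept-only logit model.

    The fitted probability is the weighted proportion of ones,
    and the intercept is its log-odds.
    """
    p = sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
    return math.log(p / (1.0 - p)), p

# Hypothetical population: 1000 cases with 50 events.
# Balanced sample: all 50 events, plus 50 of the 950 non-events.
y = [1] * 50 + [0] * 50

# Inverse sampling probabilities: events 1/1, non-events 950/50 = 19.
ipw = [1.0] * 50 + [19.0] * 50

_, p_unweighted = weighted_logit_intercept(y, [1.0] * len(y))
const_weighted, p_weighted = weighted_logit_intercept(y, ipw)
```

The unweighted fit reproduces the sample proportion (0.5), while the weighted fit recovers the assumed population proportion (0.05).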

Owner

also related: using the variance function in GLM to introduce weights and heteroscedasticity #1777

Owner

another similar question on Stack Overflow (imbalanced sample in Logit)
http://stackoverflow.com/questions/33605979/statsmodels-logistic-regression-class-imbalance
(by now I have figured out caseweights in GLM Binomial a bit better)

I'm opening an issue specific to rare events and unbalanced samples.
