I would like to estimate the conditional expectation E(y | x) that is robust to other misspecifications. Which models in statsmodels can I use? What are the properties and advantages of each?
Suppose we have a mean function E(y | x) = m(x, b) where m is a known function and we want to estimate the parameters b. m is the inverse link in GLM or GEE terminology.
Quasi-maximum likelihood based on any distribution in the linear exponential family estimates the parameters b consistently, provided the mean function is correctly specified; no further assumptions about the distribution are required. However, the efficiency of the estimate depends on the specific distribution, or more precisely on its implied weighting function.
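As a sketch of why this works, the estimating equations for a linear exponential family can be written in the form below. The symbol V for the variance function implied by the chosen family is notation introduced here for illustration, not taken from the text above.

```latex
\sum_{i=1}^{n} \frac{\partial m(x_i, b)}{\partial b} \,
\frac{y_i - m(x_i, b)}{V\big(m(x_i, b)\big)} = 0
```

If E(y | x) = m(x, b) holds at the true b, each term has conditional expectation zero whatever V is, which is the source of consistency. The choice of family only changes the weights 1 / V(m), and therefore the efficiency.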
Also, if we don't want to make variance or distributional assumptions, then we need to use robust covariance matrices for inference.
Caveat: Since we only require that the mean function is correctly specified, we cannot use other properties of the distribution for prediction. For example, the implied distribution of future observations might be incorrect, even though the expectation is consistently estimated.
This robustness applies to all currently implemented families and links in genmod, GLM and GEE, and to the corresponding distributions in discrete and regression.
normal/OLS: for data on the real line R, implied variance function is constant, independent of mean
(GLM family Gaussian, OLS)
Bernoulli: for data bounded in a compact interval, specifically the set {0, 1} or the real interval [0, 1]. An example of the first is a binary response, of the second proportions. Variance function p * (1 - p), where p = m = E(y | x) ? check
Binomial: similar to Bernoulli, for integer or real data between zero and a known upper bound. check
Note: robust cov most likely does not work
Poisson: for nonnegative data, either nonnegative integers or the positive real line R_+ (including zero)
implied variance function, equal or proportional to m
NegativeBinomial (NB2): nonnegative data as in Poisson, variance function m + c m**2 with the dispersion c held fixed. Negbin 1 (NB1) has variance function m + c m instead.
geometric : special case of NegativeBinomial
...
gamma
inverse...