Doc work GEE, GMM, sphinx warnings #1264

Merged
merged 4 commits into from Dec 24, 2013
Changes from 3 commits
@@ -33,12 +33,8 @@ Module Reference
GMM
GMMResults
IV2SLS

not sure what the status is on the following

.. autosummary::
:toctree: generated/

IVGMM
IVGMMResults
IVRegressionResults
LinearIVGMM
NonlinearIVGMM
DistQuantilesGMM
@@ -91,6 +91,7 @@ Table of Contents

regression
glm
gee
rlm
discretemod
anova
@@ -9,7 +9,7 @@
mdata = ds.macrodata.load_pandas().data

# prepare the dates index
dates = mdata[['year', 'quarter']].astype(int).astype(str)
dates = mdata[['year', 'quarter']].astype(int).astype('S4')
quarterly = dates["year"] + "Q" + dates["quarter"]
quarterly = dates_from_str(quarterly)
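
The integer cast in the snippet above matters because a float such as 1959.0 would otherwise be converted to the string "1959.0". A minimal pure-Python sketch of the same label construction, assuming the year and quarter values arrive as floats; `quarterly_labels` is a hypothetical helper, not part of statsmodels or pandas:

```python
def quarterly_labels(years, quarters):
    """Build labels such as '1959Q1' from numeric year and quarter values."""
    labels = []
    for year, quarter in zip(years, quarters):
        # Cast to int first so 1959.0 becomes '1959' rather than '1959.0'.
        labels.append(str(int(year)) + "Q" + str(int(quarter)))
    return labels


# Hypothetical input values, stored as floats for illustration.
labels = quarterly_labels([1959.0, 1959.0, 1959.0], [1.0, 2.0, 3.0])
print(labels)
```

Strings in this `<year>Q<quarter>` form are what the snippet then passes to `dates_from_str`.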

@@ -11,6 +11,10 @@ Release summary.

Major changes:

Addition of Generalized Estimating Equations GEE



Header for Change
~~~~~~~~~~~~~~~~~

@@ -117,7 +117,7 @@ Vector Autoregressive Processes (VAR)
vector_ar.var_model.VARResults
vector_ar.dynamic.DynamicVAR

.. seealso:: :ref:`VAR documentation <var>`
.. seealso:: tutorial :ref:`VAR documentation <var>`

.. currentmodule:: statsmodels.tsa

@@ -137,7 +137,7 @@ estimation are available for vector autoregressive processes.
vector_ar.var_model.FEVD
vector_ar.dynamic.DynamicVAR

.. seealso:: :ref:`VAR documentation <var>`
.. seealso:: tutorial :ref:`VAR documentation <var>`

ARMA Process
""""""""""""
@@ -3,11 +3,11 @@

class CovStruct(object):
"""
A base class for correlation and covariance structures of repeated
measures data. Each implementation of this class takes the
residuals from a regression model that has been fit to clustered
data, and uses them to estimate the within-cluster variance and
dependence structure of the model errors.
The base class for correlation and covariance structures of cluster data.
Each implementation of this class takes the residuals from a regression
model that has been fit to clustered data, and uses them to estimate the
within-cluster variance and dependence structure of the model errors.
"""

# Parameters describing the dependency structure
@@ -505,8 +505,7 @@ def summary(self):

class GlobalOddsRatio(CovStruct):
"""
Estimate the global odds ratio for a GEE with either ordinal or
nominal data.
Estimate the global odds ratio for a GEE with ordinal or nominal data.
References
----------
@@ -155,6 +155,12 @@ def unpack_cov(self, bcov):

class GEE(base.Model):
__doc__ = """
Generalized Estimating Equations Models
GEE estimates Generalized Linear Models when the data has a cluster
structure and the observations are possibly correlated within but not
across clusters.
Parameters
----------
endog : array-like
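
To illustrate the dependence pattern described in the GEE docstring, the following sketch builds a working correlation matrix in which two observations are correlated only when they share a cluster. It is plain Python rather than statsmodels code; `working_correlation` and the single within-cluster correlation `rho` are assumptions chosen for illustration.

```python
def working_correlation(groups, rho):
    """Return an n x n correlation matrix as nested lists.

    Entries are 1.0 on the diagonal, `rho` for two different observations
    in the same cluster, and 0.0 for observations in different clusters.
    """
    n = len(groups)
    corr = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                corr[i][j] = 1.0
            elif groups[i] == groups[j]:
                corr[i][j] = rho
    return corr


# Hypothetical cluster labels: two observations in "a", three in "b".
groups = ["a", "a", "b", "b", "b"]
corr = working_correlation(groups, rho=0.3)
for row in corr:
    print(row)
```

When the observations are sorted by cluster, as here, the matrix is block diagonal, which is the "correlated within but not across clusters" structure.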
@@ -909,6 +915,60 @@ def _derivative_exog(self, params, exog=None, transform='dydx',


class GEEResults(base.LikelihoodModelResults):
'''
Class to contain GEE results.
GEEResults inherits from statsmodels.LikelihoodModelResults
Parameters
----------
See statsmodels.LikelihoodModelResults
Returns
-------
**Attributes**
naive_covariance : ndarray

vincentarelbundock Dec 22, 2013

Contributor

Not sure if we have a standard for this, but it would make sense to me to reverse these: "covariance_naive", "covariance_robust". That way, they would all show up together when using tab completion (which usually prints in alphabetical order).

Also, can "robust_covariance_bc" just be called "robust_covariance"?

Edit: OK, I see that there's a choice below between "robust" and "robust bias reduced"

josef-pkt Dec 23, 2013

Author Member

I'll add this to the GEE-followup issue,
I also prefer post-fix qualifiers, covariance_xxx, resid_xxx
( how the robust cov are attached still needs to change to make t_test, wald_test work correctly)
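
The tab-completion argument in this thread can be checked with a small sketch. It sorts the attribute names alphabetically, as most completers do, and reports where the covariance matrices land under each naming style. The postfix names are the hypothetical renames from the suggestion, and `robust_covariance` is assumed to exist alongside the two covariance attributes shown in the docstring.

```python
# Other result attributes listed in the docstring under review.
OTHER = ["converged", "covariance_type", "fit_history", "fittedvalues",
         "model", "params", "scale", "score_norm"]

# Current style: qualifier first.
PREFIX_STYLE = ["naive_covariance", "robust_covariance", "robust_covariance_bc"]
# Suggested style (hypothetical names): qualifier last.
POSTFIX_STYLE = ["covariance_naive", "covariance_robust", "covariance_robust_bc"]


def completion_positions(matrix_names, other_names):
    """Positions of the covariance matrices in an alphabetical listing."""
    ordered = sorted(matrix_names + other_names)
    return [ordered.index(name) for name in sorted(matrix_names)]


def is_contiguous(positions):
    """True if the positions form one unbroken run."""
    start = positions[0]
    return positions == list(range(start, start + len(positions)))


prefix_positions = completion_positions(PREFIX_STYLE, OTHER)
postfix_positions = completion_positions(POSTFIX_STYLE, OTHER)
print(prefix_positions, is_contiguous(prefix_positions))
print(postfix_positions, is_contiguous(postfix_positions))
```

With the qualifier last, the three matrices sit next to each other and next to `covariance_type`; with the qualifier first, `params` falls between them.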

covariance of the parameter estimates that is not robust to correlation
or variance misspecification
robust_covariance_bc : ndarray
covariance of the parameter estimates that is robust and bias reduced
converged : bool
indicator for convergence of the optimization.
True if the norm of the score is smaller than a threshold
covariance_type : string
string indicating whether a "robust", "naive" or "robust bias reduced"
covariance is used as default
fit_history : dict
Contains information about the iterations. Its keys are `iterations`,
`deviance` and `params`.
fittedvalues : array
Linear predicted values for the fitted model.
dot(exog, params)
model : class instance
Pointer to GLM model instance that called fit.

vincentarelbundock Dec 22, 2013

Contributor

GLM -> GEE?

nobs : float

josef-pkt Dec 23, 2013

Author Member

nobs is not available as attribute of results

The number of observations n.
normalized_cov_params : array
See GEE docstring
params : array
The coefficients of the fitted model. Note that interpretation
of the coefficients often depends on the distribution family and the
data.
scale : float
The estimate of the scale / dispersion for the model fit.
See GLM.fit and GLM.estimate_scale for more information.

vincentarelbundock Dec 22, 2013

Contributor

GLM -> GEE

josef-pkt Dec 23, 2013

Author Member

estimate_scale in GEE has different signature and pattern than in GLM, and no "more information" in docstring.

score_norm : float
norm of the score at the end of the iterative estimation.
stand_errors : array
The standard errors of the fitted GLM. #TODO still named bse

vincentarelbundock Dec 22, 2013

Contributor

GLM -> GEE?

josef-pkt Dec 23, 2013

Author Member

including stand_error here might be wrong, needs checking

josef-pkt Dec 23, 2013

Author Member

change to bse which is still the inherited attribute

See Also
--------
statsmodels.LikelihoodModelResults
GEE
'''


def __init__(self, model, params, cov_params, scale):
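
Two of the attributes documented in the GEEResults docstring above reduce to short computations: `converged` compares the norm of the score to a threshold, and `fittedvalues` is described as `dot(exog, params)`. The sketch below shows both in plain Python. The Euclidean norm, the tolerance value and all function names are assumptions for illustration, not the statsmodels implementation.

```python
import math


def score_norm(score):
    """Euclidean norm of the score vector (choice of norm is an assumption)."""
    return math.sqrt(sum(s * s for s in score))


def has_converged(score, tol=1e-6):
    """True if the score norm is below the threshold (tol is hypothetical)."""
    return score_norm(score) < tol


def linear_predictor(exog, params):
    """Row-wise dot product of the design matrix with the coefficients."""
    return [sum(x * b for x, b in zip(row, params)) for row in exog]


print(score_norm([3.0, 4.0]))
print(has_converged([1e-9, 0.0]))
print(linear_predictor([[1.0, 2.0], [1.0, 3.0]], [0.5, 2.0]))
```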

@@ -260,17 +260,17 @@ class GLS(RegressionModel):
%(params)s
sigma : scalar or array
`sigma` is the weighting matrix of the covariance.
The default is None for no scaling. If `sigma` is a scalar, it is
assumed that `sigma` is an n x n diagonal matrix with the given
scalar, `sigma` as the value of each diagonal element. If `sigma`
is an n-length vector, then `sigma` is assumed to be a diagonal
matrix with the given `sigma` on the diagonal. This should be the
same as WLS.
`sigma` is the weighting matrix of the covariance.
The default is None for no scaling. If `sigma` is a scalar, it is
assumed that `sigma` is an n x n diagonal matrix with the given
scalar, `sigma` as the value of each diagonal element. If `sigma`
is an n-length vector, then `sigma` is assumed to be a diagonal
matrix with the given `sigma` on the diagonal. This should be the
same as WLS.
%(extra_params)s
Attributes
----------
**Attributes**
pinv_wexog : array
`pinv_wexog` is the p x n Moore-Penrose pseudoinverse of `wexog`.
cholsimgainv : array
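
The scalar and vector cases of `sigma` described in the GLS docstring above can be sketched as follows. This is plain Python for illustration only: `expand_sigma` is a hypothetical helper, treating `None` as the identity matrix is an assumption based on "no scaling", and a full n x n `sigma` is not handled.

```python
def expand_sigma(sigma, nobs):
    """Expand a scalar or n-length sigma into an n x n diagonal matrix."""
    if sigma is None:
        # Assumption: "no scaling" corresponds to the identity matrix.
        diagonal = [1.0] * nobs
    elif isinstance(sigma, (int, float)):
        # A scalar is repeated on every diagonal element.
        diagonal = [float(sigma)] * nobs
    else:
        # An n-length vector supplies the diagonal directly.
        diagonal = [float(value) for value in sigma]
        if len(diagonal) != nobs:
            raise ValueError("sigma must have one entry per observation")
    return [[diagonal[i] if i == j else 0.0 for j in range(nobs)]
            for i in range(nobs)]


print(expand_sigma(2.0, 2))
print(expand_sigma([1, 4], 2))
print(expand_sigma(None, 2))
```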
@@ -1563,7 +1563,7 @@ def get_robustcov_results(self, cov_type='HC1', use_t=None, **kwds):
currently available:
- 'HC0', 'HC1', 'HC2', 'HC3' and no keyword arguments:
heteroscedasticity robust covariance
heteroscedasticity robust covariance
- 'HAC' and keywords
- `maxlag` integer (required) : number of lags to use
@@ -1591,8 +1591,8 @@ def get_robustcov_results(self, cov_type='HC1', use_t=None, **kwds):
adjusted.
- 'hac-groupsum' Driscoll and Kraay, heteroscedasticity and
autocorrelation robust standard errors in panel data
keywords
autocorrelation robust standard errors in panel data
keywords
- `time` array_like (required) : index of time periods
- `maxlag` integer (required) : number of lags to use
@@ -1608,12 +1608,13 @@ def get_robustcov_results(self, cov_type='HC1', use_t=None, **kwds):
#TODO: we need more options here
- 'hac-panel' heteroscedasticity and autocorrelation robust standard
errors in panel data.
The data needs to be sorted in this case, the time series for
each panel unit or cluster need to be stacked.
keywords
errors in panel data.
The data needs to be sorted in this case, the time series for
each panel unit or cluster need to be stacked.
keywords
- `time` array_like (required) : index of time periods
- `maxlag` integer (required) : number of lags to use
- `kernel` string (optional) : kernel, default is Bartlett
- `use_correction` False or string in ['hac', 'cluster'] (optional) :
@@ -1625,10 +1626,10 @@ def get_robustcov_results(self, cov_type='HC1', use_t=None, **kwds):
Reminder:
`use_correction` in "nw-groupsum" and "nw-panel" is not bool,
needs to be in [False, 'hac', 'cluster']
needs to be in [False, 'hac', 'cluster']
TODO: Currently there is no check for extra or misspelled keywords,
except in the case of cov_type `HCx`
except in the case of cov_type `HCx`
"""
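
The TODO in the docstring above notes that extra or misspelled keywords are only checked for the `HCx` types. A minimal sketch of what a check for the required keywords could look like is given below. `check_cov_keywords` is hypothetical and not part of statsmodels; it covers only the keywords marked "(required)" in the excerpt and the `use_correction` values from the reminder, and it passes any other `cov_type` through unchecked.

```python
HC_TYPES = ("HC0", "HC1", "HC2", "HC3")

# Required keywords as listed in the docstring excerpt.
REQUIRED_KEYWORDS = {
    "HAC": ("maxlag",),
    "hac-groupsum": ("time", "maxlag"),
    "hac-panel": ("time", "maxlag"),
}

ALLOWED_USE_CORRECTION = ("hac", "cluster")


def check_cov_keywords(cov_type, **kwds):
    """Return a list of problems found in the keywords (empty if none)."""
    problems = []
    if cov_type in HC_TYPES:
        # The HCx types take no keyword arguments.
        if kwds:
            problems.append("%s takes no keyword arguments" % cov_type)
        return problems
    for name in REQUIRED_KEYWORDS.get(cov_type, ()):
        if name not in kwds:
            problems.append("missing required keyword '%s'" % name)
    if "use_correction" in kwds:
        value = kwds["use_correction"]
        # use_correction is not a bool flag: only False or a string is valid.
        if value is not False and value not in ALLOWED_USE_CORRECTION:
            problems.append("use_correction must be False, 'hac' or 'cluster'")
    return problems


print(check_cov_keywords("HAC", maxlag=2))
print(check_cov_keywords("hac-groupsum", maxlag=2))
print(check_cov_keywords("hac-panel", time=[0, 1], maxlag=1, use_correction=True))
```

Returning a list instead of raising keeps the sketch easy to inspect; a real check would more likely raise `ValueError` on the first problem.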
