0.6 Release

Release 0.6.0

Release summary.

Major changes:

Addition of Generalized Estimating Equations (GEE)

Generalized Estimating Equations

Generalized Estimating Equations (GEE) provide an approach to handling dependent data in a regression analysis. Dependent data arise commonly in practice, for example in a longitudinal study where repeated observations are collected on each subject. GEE can be viewed as an extension of the generalized linear modeling (GLM) framework to the dependent data setting. The familiar GLM families such as the Gaussian, Poisson, and binomial (logistic) families can be used to accommodate dependent variables with various distributions.

Here is an example of GEE Poisson regression in a data set with four count-type repeated measures per subject, and three explanatory covariates.

import numpy as np
import pandas as pd
from statsmodels.genmod.generalized_estimating_equations import GEE
from statsmodels.genmod.dependence_structures import Independence
from statsmodels.genmod.families import Poisson
# Later statsmodels releases import Independence from
# statsmodels.genmod.cov_struct and name the keyword cov_struct.

# Epilepsy data set: repeated seizure counts per subject
data_url = "http://vincentarelbundock.github.io/Rdatasets/csv/MASS/epil.csv"
data = pd.read_csv(data_url)

# Poisson family with an independence working dependence structure
fam = Poisson()
ind = Independence()
md1 = GEE.from_formula("y ~ age + trt + base", data, groups=data["subject"],
                       covstruct=ind, family=fam)
mdf1 = md1.fit()
print(mdf1.summary())

The dependence structure in a GEE is treated as a nuisance parameter and is modeled in terms of a "working dependence structure". The statsmodels GEE implementation currently includes five working dependence structures (independent, exchangeable, autoregressive, nested, and a global odds ratio for working with categorical data). Since the GEE estimates are not maximum likelihood estimates, alternative approaches to some common inference procedures have been developed. The statsmodels GEE implementation currently provides standard errors and allows score tests for arbitrary parameter contrasts.
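To make the idea of a working dependence structure concrete, the sketch below builds the textbook working correlation matrices for three of the structures named above. The helper functions are hypothetical and are not part of statsmodels; they show only the shape of each matrix, not how the dependence parameter is estimated.

```python
# Illustrative only: these helpers are hypothetical and are not statsmodels code.

def independence_corr(n):
    """Identity matrix: observations within a group are treated as uncorrelated."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def exchangeable_corr(n, rho):
    """Every pair of distinct observations in a group shares the correlation rho."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def autoregressive_corr(n, rho):
    """Correlation decays as rho ** |i - j| with the gap between observations."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


# Four repeated measures per subject; rho = 0.5 is an arbitrary choice.
structures = [
    ("independence", independence_corr(4)),
    ("exchangeable", exchangeable_corr(4, 0.5)),
    ("autoregressive", autoregressive_corr(4, 0.5)),
]
for name, matrix in structures:
    print(name)
    for row in matrix:
        print("  ", [round(value, 3) for value in row])
```

In a GEE fit, the dependence parameter (rho here) is a nuisance parameter estimated from the data rather than fixed in advance.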

Seasonality Plots

New plotting functions make it easier to examine seasonality. Two new functions are sm.graphics.tsa.month_plot and sm.graphics.tsa.quarter_plot. A third function, sm.graphics.tsa.seasonal_plot, is available for power users.

import statsmodels.api as sm
import pandas as pd

dta = sm.datasets.elnino.load_pandas().data
dta['YEAR'] = dta.YEAR.astype(int).astype(str)
dta = dta.set_index('YEAR').T.unstack()
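# Build first-of-month dates from the (year, month) index labels
dates = pd.to_datetime(['-'.join(x) + '-1' for x in dta.index.values])

# 'MS' (month start) matches dates that fall on the first of each month
dta.index = pd.DatetimeIndex(dates, freq='MS')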
fig = sm.tsa.graphics.month_plot(dta)
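As a rough picture of what a month plot summarizes, the sketch below groups a monthly series by calendar month and computes each month's mean across years. It assumes the plot shows one subseries per month together with that month's mean. The data and the helper name are made up for illustration and no statsmodels code is involved.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical monthly observations as (year, month, value); not the El Nino data.
observations = [
    (2001, 1, 10.0), (2001, 2, 12.0), (2001, 3, 15.0),
    (2002, 1, 11.0), (2002, 2, 13.0), (2002, 3, 17.0),
    (2003, 1, 12.0), (2003, 2, 14.0), (2003, 3, 16.0),
]


def monthly_subseries(obs):
    """Collect the values for each calendar month, ordered by year."""
    by_month = defaultdict(list)
    for year, month, value in sorted(obs):
        by_month[month].append(value)
    return dict(by_month)


subseries = monthly_subseries(observations)
month_means = {month: mean(values) for month, values in subseries.items()}

for month in sorted(subseries):
    print(month, subseries[month], month_means[month])
```

Differences between the monthly means indicate a seasonal pattern, while a drift within one month's subseries indicates a change across years.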

Other important new features

Added sm.tsa.arma_order_select_ic, a convenience function to quickly get the information criteria for use in tentative order selection of ARMA processes.
Plotting functions for time series are now imported under the sm.tsa.graphics namespace in addition to sm.graphics.tsa.
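For readers unfamiliar with order selection by information criteria, the sketch below applies the standard AIC and BIC formulas to a set of candidate (p, q) orders. The log-likelihood values are made up and the parameter-counting convention is an assumption; a real order-selection routine would fit each candidate model to obtain these quantities.

```python
import math


def aic(loglike, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * k - 2 * loglike


def bic(loglike, k, nobs):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return k * math.log(nobs) - 2 * loglike


def n_params(order):
    """Assumed convention: AR terms + MA terms + constant + innovation variance."""
    p, q = order
    return p + q + 2


nobs = 200
# Hypothetical maximized log-likelihoods keyed by (p, q); the numbers are made up.
candidates = {(0, 0): -310.0, (1, 0): -295.0, (1, 1): -293.5, (2, 1): -293.0}

table = {
    order: (aic(ll, n_params(order)), bic(ll, n_params(order), nobs))
    for order, ll in candidates.items()
}
best_aic = min(table, key=lambda order: table[order][0])
best_bic = min(table, key=lambda order: table[order][1])

for order, (a, b) in sorted(table.items()):
    print(order, round(a, 2), round(b, 2))
print("lowest AIC:", best_aic)
print("lowest BIC:", best_bic)
```

With these made-up numbers the two criteria disagree, because BIC penalizes extra parameters more heavily. This is why the selection is described as tentative.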

Major Bugs fixed

Bullet list of major bugs
With a link to its github issue.
Use the syntax :ghissue:`###`.

Backwards incompatible changes and deprecations

RegressionResults.norm_resid is now a read-only property rather than a function.
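The practical effect of this change is that call sites drop the parentheses. The sketch below uses a hypothetical stand-in class, not the statsmodels RegressionResults implementation, and the normalization it computes is only a placeholder.

```python
class ResultsSketch:
    """Hypothetical stand-in for a results object; not statsmodels code."""

    def __init__(self, resid, scale):
        self._resid = list(resid)
        self._scale = scale

    @property
    def norm_resid(self):
        # Placeholder normalization: divide each residual by sqrt(scale).
        return [r / self._scale ** 0.5 for r in self._resid]


res = ResultsSketch(resid=[2.0, -4.0], scale=4.0)

# New style: plain attribute access, no parentheses.
print(res.norm_resid)

# Old call style now fails, because the property returns a list and a list
# is not callable.
try:
    res.norm_resid()
except TypeError as exc:
    print("TypeError:", exc)

# Read-only: no setter is defined, so assignment is rejected.
try:
    res.norm_resid = [0.0]
except AttributeError as exc:
    print("AttributeError:", exc)
```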

Development summary and credits

A blurb about the number of changes and the contributors list.

Note

Obtained by running git log v0.5.0..HEAD --format='* %aN <%aE>' | sed 's/@/\-at\-/' | sed 's/<>//' | sort -u.
