Skip to content
Epidemiology analysis package
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.
docs Adding missing_model() to GEstimationSNM Jul 17, 2019
tests Adding missing_model() to GEstimationSNM Jul 17, 2019
zepid some ReadTheDocs fixes Jul 17, 2019
.travis.yml TravisCI 3.7 update Feb 2, 2019 v0.8.0 information Jul 16, 2019
LICENSE.txt Add files via upload Feb 13, 2018 v0.1.6 Jul 16, 2018 Improved testing for SNM Apr 24, 2019
setup.cfg Updated documents fro PyPI distribution Jun 25, 2018 update for matplotlib 3.1+ Apr 21, 2019
travis-requirements.txt removing TravisCI SciPy restriction Jul 9, 2019



PyPI version Build Status Documentation Status Join the chat at

zEpid is an epidemiology analysis package, providing easy to use tools for epidemiologists coding in Python 3.5+. The purpose of this library is to provide a toolset to make epidemiology e-z. A variety of calculations and plots can be generated through various functions. For a sample walkthrough of what this library is capable of, please look at the tutorials available at

A few highlights: basic epidemiology calculations, easily create functional form assessment plots, easily create effect measure plots, and causal inference tools. Implemented estimators include; inverse probability of treatment weights, inverse probability of censoring weights, inverse probabilitiy of missing weights, augmented inverse probability of treatment weights, time-fixed g-formula, Monte Carlo g-formula, Iterative conditional g-formula, and targeted maximum likelihood (TMLE). Additionally, generalizability/transportability tools are available including; inverse probability of sampling weights, g-transport formula, and doubly robust generalizability/transportability formulas.

If you have any requests for items to be included, please contact me and I will work on adding any requested features. You can contact me either through GitHub (, email (gmail: zepidpy), or twitter (@zepidpy).



You can install zEpid using pip install zepid


pandas >= 0.18.0, numpy, statsmodels >= 0.7.0, matplotlib >= 2.0, scipy, tabulate

Module Features


Calculate measures directly from a pandas dataframe object. Implemented measures include; risk ratio, risk difference, odds ratio, incidence rate ratio, incidence rate difference, number needed to treat, sensitivity, specificity, population attributable fraction, attributable community risk

Measures can be directly calculated from a pandas DataFrame object or using summary data.

Other handy features include; splines, Table 1 generator, interaction contrast, interaction contrast ratio, positive predictive value, negative predictive value, screening cost analyzer, counternull p-values, convert odds to proportions, convert proportions to odds

For guided tutorials with Jupyter Notebooks:


Uses matplotlib in the background to generate some useful plots. Implemented plots include; functional form assessment (with statsmodels output), p-value function plots, spaghetti plot, effect measure plot (forest plot), receiver-operator curve, dynamic risk plots, and L'Abbe plots

For examples see:


The causal branch includes various estimators for causal inference with observational data. Details on currently implemented estimators are below:

G-Computation Algorithm

Current implementation includes; time-fixed exposure g-formula, Monte Carlo g-formula, and iterative conditional g-formula

Inverse Probability Weights

Current implementation includes; IP Treatment W, IP Censoring W, IP Missing W. Diagnostics are also available for IPTW. IPMW supports monotone missing data

Augmented Inverse Probability Weights

Current implementation includes the augmented-IPTW estimator described by Funk et al 2011 AJE

Targeted Maximum Likelihood Estimator

TMLE can be estimated through standard logistic regression model, or through user-input functions. Alternatively, users can input machine learning algorithms to estimate probabilities. Supported machine learning algorithms include sklearn

Generalizability / Transportability

For generalizing results or transporting to a different target population, several estimators are available. These include inverse probability of sampling weights, g-transport formula, and doubly robust formulas

Tutorials for the usage of these estimators are available at:

G-estimation of Structural Nested Mean Models

Single time-point g-estimation of structural nested mean models are supported.

Sensitivity Analyses

Includes trapezoidal distribution generator, corrected Risk Ratio

Tutorials are available at:

You can’t perform that action at this time.