Targeted Maximum Likelihood Estimation for a binary treatment: A tutorial. Statistics in Medicine. 2017
Appendix_SIM_2018.pdf MALUQUE-SIM-MAY-2018 Apr 13, 2018
OneSimulationBoxes_SIM_2018.R MALUQUE-SIM-MAY-2018 Apr 13, 2018 update.5.14.19 May 15, 2019
Table-2.SimulationsTable2a_SIM_2018.R MALUQUE-SIM-MAY-2018 Apr 13, 2018

Targeted Maximum Likelihood Estimation for a Binary Treatment: A tutorial. Statistics in Medicine, 2018

This repository makes available to the scientific community the data and code presented in the Statistics in Medicine (SIM) manuscript:

Luque‐Fernandez, MA, Schomaker, M, Rachet, B, Schnitzer, ME. Targeted maximum likelihood estimation for a binary treatment: A tutorial. Statistics in Medicine. 2018; 37: 2530–2546.

Link to the Statistics in Medicine Open Access Article

CITE this repository:


Miguel Angel Luque Fernandez, Michael Schomaker, Bernard Rachet, Mireille Schnitzer


When estimating the average effect of a binary treatment (or exposure), methods that incorporate propensity scores, the G-formula, or targeted maximum likelihood estimation (TMLE) are preferred over naive regression approaches, which are biased under misspecification of a parametric outcome model. In contrast, propensity score methods require correct specification of an exposure model. Double-robust methods require correct specification of only one of these two models. TMLE is a semi-parametric double-robust method that improves the chances of correct model specification by allowing flexible estimation with non-parametric machine-learning methods. It therefore requires weaker assumptions than its competitors. We provide a step-by-step guided implementation of TMLE and illustrate it in a realistic scenario based on cancer epidemiology, where the assumptions of correct model specification and positivity (violated when a study participant has zero probability of receiving the treatment) are nearly violated. This article provides a concise and reproducible educational introduction to TMLE for a binary outcome and exposure. The reader should gain sufficient understanding of TMLE from this introductory tutorial to be able to apply the method in practice. Extensive R code is provided in easy-to-read boxes throughout the article for replicability. Stata users will find a testing implementation of TMLE and additional material in the appendix and at the following GitHub repository:

KEYWORDS: Causal Inference, Machine Learning, Observational Studies, Targeted Maximum Likelihood Estimation, Super Learner, Epidemiology, Statistics
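
The TMLE steps summarized in the abstract above can be sketched in a few lines of base R. This is a minimal illustration and not the code from the article's boxes. The simulated data, the variable names (W, A, Y, Q1W, gW, and so on), and the coefficients of the data-generating process are all assumptions made for this example. It also uses plain logistic regressions (glm) for the outcome and exposure models, where the article recommends flexible machine-learning estimators such as the Super Learner.

```r
# Minimal TMLE sketch for the average treatment effect (ATE) with a binary
# treatment A, binary outcome Y, and one binary confounder W.
# ASSUMPTION: the data-generating process below is hypothetical.
set.seed(2018)
n <- 5000
W <- rbinom(n, 1, 0.5)                              # confounder
A <- rbinom(n, 1, plogis(-0.5 + 1.0 * W))           # treatment depends on W
Y <- rbinom(n, 1, plogis(-1 + 0.8 * A + 0.7 * W))   # outcome depends on A and W
d <- data.frame(W, A, Y)

# Step 1: initial outcome model Q(A, W) = E[Y | A, W] and its predictions
# under the observed treatment, under A = 1, and under A = 0.
Qfit <- glm(Y ~ A + W, family = binomial, data = d)
QAW <- predict(Qfit, type = "response")
Q1W <- predict(Qfit, newdata = transform(d, A = 1), type = "response")
Q0W <- predict(Qfit, newdata = transform(d, A = 0), type = "response")

# Step 2: exposure model (propensity score) g(W) = P(A = 1 | W).
gfit <- glm(A ~ W, family = binomial, data = d)
gW <- predict(gfit, type = "response")

# Step 3: clever covariates built from the propensity score.
H1W <- A / gW
H0W <- (1 - A) / (1 - gW)

# Step 4: fluctuation step. Regress Y on the clever covariates with the
# initial prediction as an offset on the logit scale and no intercept.
eps <- coef(glm(Y ~ -1 + H0W + H1W + offset(qlogis(QAW)), family = binomial))

# Step 5: update the initial predictions under A = 1 and A = 0.
Q1W_star <- plogis(qlogis(Q1W) + eps[["H1W"]] / gW)
Q0W_star <- plogis(qlogis(Q0W) + eps[["H0W"]] / (1 - gW))

# Step 6: targeted estimate of the ATE (a risk difference).
ate_tmle <- mean(Q1W_star - Q0W_star)

# Step 7: standard error and 95% confidence interval from the estimated
# efficient influence curve.
QAW_star <- ifelse(A == 1, Q1W_star, Q0W_star)
ic <- (H1W - H0W) * (Y - QAW_star) + Q1W_star - Q0W_star - ate_tmle
se <- sqrt(var(ic) / n)
ci <- ate_tmle + c(-1, 1) * qnorm(0.975) * se

cat("TMLE ATE:", round(ate_tmle, 3), " 95% CI:", round(ci, 3), "\n")
```

In this sketch both working models happen to match the simulated data-generating process, so the targeting step changes the initial estimate very little. The double-robustness described above matters when one of the two models is misspecified, which can be explored by, for example, dropping W from the outcome model in Step 1.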
