acdcourse: Course on Analyzing Cohort Datasets with R

Cohort studies are a powerful study design that allows researchers to better understand how to reduce, manage, or prevent disease in a population. The findings from cohorts are critical to creating effective public health interventions. Because people's lives and health are directly affected by the findings from cohort studies, there is immense pressure to ensure that the analysis is done correctly and appropriately and that the results are presented as meaningfully and as simply as possible. The analysis is also difficult because study designs, data collection methods, data types, and research questions vary widely. In this course, we'll cover which analytical questions to ask of cohort data, what to consider when processing the data, which statistical techniques to choose, and how to present the results so that they communicate effectively to public health professionals.

Installation and usage

You can install acdcourse from GitHub with

# install.packages("remotes")

To work through each chapter, run these commands:

learnr::run_tutorial("chapter1", package = "acdcourse")
learnr::run_tutorial("chapter2", package = "acdcourse")
learnr::run_tutorial("chapter3", package = "acdcourse")
learnr::run_tutorial("chapter4", package = "acdcourse")

General course outline

  • Chapter 1: Introduction to cohorts and to analyzing them
    • Lesson 1: Introduction to cohort designs
    • Lesson 2: Cohort types, variables, and the Framingham Study
    • Lesson 3: Prevalence and incidence in cohorts
  • Chapter 2: Exploring, wrangling, and transforming cohort data
    • Lesson 1: Pre-wrangling exploration
    • Lesson 2: Discrete data and tidying it for later analysis
    • Lesson 3: Variable transformations
  • Chapter 3: Statistical methods for cohort data
    • Lesson 1: Statistical analyses for cohort studies
    • Lesson 2: Adjustment, confounding, and model building
    • Lesson 3: Testing for interactions and sensitivity analyses
    • Lesson 4: Tidying and interpreting model results
  • Chapter 4: Presentation of results from cohort analyses
    • Lesson 1: Presenting cohort findings is tricky, be careful
    • Lesson 2: Communicating cohort findings through graphs
    • Lesson 3: Use tables effectively to show your results

The dataset used is:

  • framingham: Framingham Heart Study.

Learner persona

This course is primarily targeted at three (hypothetical) learners:

  • Catherine: She will likely already know most of the basic statistical techniques in the course. However, this course may give her material for creating her own course, as it presents various types of cohort study designs and shows how to interpret the results from the statistical analyses in the cohort context. This will be useful to her medical students, who rely on this type of data to make medical decisions that could save people's lives.
  • Jamie: The material in this course will teach her some fundamentals of applying statistical techniques using code. If she teaches health insurance policy research at her alma mater, knowing the specifics of analyzing cohort datasets would help her students become better at creating, analyzing, and studying health insurance policies.
  • Moe: He will likely not be interested in the statistical portion of this course, though he may find the sections with code for analyzing and interpreting cohort data useful. If he were in a biomedical or health sciences graduate program, he would find this course very useful.

Learning Objectives

The general learning objectives are to:

  • Learn what cohort data looks like and what questions to ask of it
  • Learn what to look for when exploring and pre-processing the cohort data before analyzing it
  • Learn how to think about and approach processing, analyzing, and presenting results from cohort studies
  • Approach analyzing cohort data in a way that makes the results meaningful and interpretable to their main users: clinicians and public health professionals

Prerequisite or familiar knowledge / concepts

  • Tidyverse R packages
  • Descriptive epidemiology
  • Statistics:
    • Correlation and regression
    • Multiple and Logistic Regression
    • Mixed Effects Models
    • Network Analysis

Citing this learning resource


#> To cite package 'acdcourse' in publications use:
#>   Luke Johnston (2019). acdcourse: Course on Analyzing Cohort
#>   Datasets in R. R package version 0.1.0.
#> A BibTeX entry for LaTeX users is
#>   @Manual{,
#>     title = {acdcourse: Course on Analyzing Cohort Datasets in R},
#>     author = {Luke Johnston},
#>     year = {2019},
#>     note = {R package version 0.1.0},
#>     url = {},
#>   }

