An R package for UK Biobank data wrangling.

License: MIT (see LICENSE.md)
rmgpanw/ukbwranglr

ukbwranglr

Project status: WIP. Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

Overview

The goal of ukbwranglr is to facilitate analysing UK Biobank data, including:

  1. Reading a selection of UK Biobank variables into R.
  2. Summarising repeated continuous variable measurements (see footnote 1).
  3. Extracting phenotypic outcomes of interest from clinical events data (see footnote 2).
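
To make item 2 concrete, here is a minimal base R sketch of summarising repeated measurements row by row. It is not the package's implementation: the toy data frame, the column names (bmi_visit_0 and so on) and the helper summarise_rows() are all made up for illustration.

```r
# Toy data: one row per participant, one column per repeated BMI measurement.
# Column names are hypothetical; real UK Biobank field names differ.
bmi <- data.frame(
  eid = c(1, 2, 3),
  bmi_visit_0 = c(22.0, 30.5, NA),
  bmi_visit_1 = c(24.0, NA, NA),
  bmi_visit_2 = c(26.0, 31.5, NA)
)

bmi_cols <- grep("^bmi_visit_", names(bmi), value = TRUE)

# Hypothetical helper: apply a summary function to each row,
# ignoring missing values and returning NA when every value is missing.
summarise_rows <- function(df, cols, fn) {
  unname(apply(df[, cols, drop = FALSE], 1, function(x) {
    x <- x[!is.na(x)]
    if (length(x) == 0) NA_real_ else fn(x)
  }))
}

bmi$bmi_mean <- summarise_rows(bmi, bmi_cols, mean)
bmi$bmi_min <- summarise_rows(bmi, bmi_cols, min)
bmi$bmi_max <- summarise_rows(bmi, bmi_cols, max)

print(bmi)
```

Participants with no measurements at all are given NA rather than NaN or Inf, which is one reasonable choice for downstream analysis; other conventions are possible.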

Installation

You can install the development version of ukbwranglr with:

# install.packages("devtools")
devtools::install_github("rmgpanw/ukbwranglr")

Basic workflow

The basic workflow is as follows:

  1. Create a data dictionary for your main UK Biobank dataset with make_data_dict().
  2. Read selected variables into R with read_ukb().
  3. Summarise continuous variables with summarise_numerical_variables().
  4. Tidy clinical events data with tidy_clinical_events() or make_clinical_events_db(), and extract outcomes of interest with extract_phenotypes().
  5. Analyse.
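
As a rough illustration of the idea behind step 4, the base R sketch below flags participants with a hypertension code in a long-format clinical events table and records the earliest matching date. It does not call ukbwranglr: the table layout, the column names and the single-code list are assumptions, and the coding is simplified (real primary care records use different coding systems from hospital records).

```r
# Toy clinical events in long format: one row per recorded code.
# Column names, sources and codes are hypothetical, for illustration only.
clinical_events <- data.frame(
  eid = c(1, 1, 2, 3, 3),
  source = c("primary_care", "hospital", "primary_care", "hospital", "hospital"),
  code = c("I10", "E11", "J45", "I10", "I10"),
  date = as.Date(c("2010-05-01", "2012-01-15", "2011-03-20",
                   "2015-07-04", "2009-11-30")),
  stringsAsFactors = FALSE
)

# Hypothetical code list defining the phenotype
# (ICD-10 I10: essential hypertension).
hypertension_codes <- c("I10")

# Keep only events whose code is in the list
matches <- clinical_events[clinical_events$code %in% hypertension_codes, ]

# Sort by participant and date, then keep the first (earliest) row per participant
matches <- matches[order(matches$eid, matches$date), ]
first_event <- matches[!duplicated(matches$eid), c("eid", "date")]
names(first_event)[2] <- "hypertension_first_date"

# Flag every participant, including those with no matching event
participants <- data.frame(eid = c(1, 2, 3))
participants$hypertension <- participants$eid %in% first_event$eid
participants <- merge(participants, first_event, by = "eid", all.x = TRUE)

print(participants)
```

Keeping the earliest date alongside the flag makes it possible to distinguish prevalent from incident cases later in the analysis.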

Please see vignette('ukbwranglr') for further details.

Footnotes

  1. For example, calculating a mean/minimum/maximum body mass index (BMI) from repeated BMI measurements.

  2. For example, identifying participants with a diagnosis of hypertension from linked primary and secondary health care records.