RNHANES

RNHANES is an R package, developed by Silent Spring Institute, for accessing and analyzing data from the CDC's National Health and Nutrition Examination Survey (NHANES).

Features

  • Download and search NHANES variable and data file lists
  • Download and cache NHANES data files
  • Compute survey-weighted detection frequencies, quantiles, and geometric means
  • Plot weighted histograms

Install

You can install the latest stable version through CRAN:

install.packages("RNHANES")

Or you can install the latest development version from GitHub:

library(devtools)

install_github("silentspringinstitute/RNHANES")

Documentation

You can browse the package's documentation on the RNHANES website: http://silentspringinstitute.github.io/RNHANES/.

Examples

library(RNHANES)

# Download environmental phenols & parabens data from the 2011-2012 survey cycle
dat <- nhanes_load_data("EPH", "2011-2012")

# Download the same data, but this time include demographics data (which includes sample weights)
dat <- nhanes_load_data("EPH", "2011-2012", demographics = TRUE)

# Find the sample size for urinary triclosan
nhanes_sample_size(dat,
  column = "URXTRS",
  comment_column = "URDTRSLC",
  weights_column = "WTSA2YR")

# Compute the detection frequency of urinary triclosan
nhanes_detection_frequency(dat,
  column = "URXTRS",
  comment_column = "URDTRSLC",
  weights_column = "WTSA2YR")

# Compute the 95th and 99th percentiles for urinary triclosan
nhanes_quantile(dat,
  column = "URXTRS",
  comment_column = "URDTRSLC",
  weights_column = "WTSA2YR",
  quantiles = c(0.95, 0.99))
  
# Compute geometric mean of urinary triclosan
nhanes_geometric_mean(dat,
  column = "URXTRS",
  weights_column = "WTSA2YR")

# Plot a histogram of the urinary triclosan distribution
nhanes_hist(dat,
  column = "URXTRS",
  comment_column = "URDTRSLC",
  weights_column = "WTSA2YR")

# Build a survey design object for use with survey package
design <- nhanes_survey_design(dat, weights_column = "WTSA2YR")
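
To make the idea of a survey-weighted detection frequency concrete, here is a minimal base R sketch on invented toy data. It is only an illustration of the concept, not RNHANES's implementation: the data frame `toy` and its column names are hypothetical, and the sketch assumes the NHANES comment-code convention in which 0 means the result is at or above the detection limit and 1 means it is below.

```r
# Toy data, invented for illustration (not NHANES values).
# Assumed comment codes: 0 = at or above the detection limit, 1 = below it.
toy <- data.frame(
  comment = c(0, 0, 1, 0, 1),
  weight  = c(100, 200, 300, 150, 250)  # hypothetical sample weights
)

# Weighted detection frequency: share of the total weight carried by detected samples
detected <- toy$comment == 0
weighted_freq <- sum(toy$weight[detected]) / sum(toy$weight)

# Unweighted frequency, for comparison
unweighted_freq <- mean(detected)

weighted_freq
unweighted_freq
```

The two numbers differ because each participant stands in for a different number of people in the population, which is why the examples above pass a weights column.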

Geometric mean

The nhanes_geometric_mean function provides an easy way to calculate geometric means, but it has not yet been released in the CRAN version of RNHANES. If you are using the CRAN version, you can still compute a geometric mean by taking the survey-weighted arithmetic mean of the log-transformed variable and exponentiating the result. Here's an example:

library(survey)
library(RNHANES)
library(tidyverse)

dat <- nhanes_load_data("EPHPP_H", "2013-2014", demographics = TRUE) %>%
  filter(!is.na(URXBPH))

des <- nhanes_survey_design(dat, "WTSB2YR")

logmean <- svymean(~log(URXBPH), des, na.rm = TRUE)

# Lower bound of the 95% confidence interval for the geometric mean
# (computed on the log scale, then exponentiated)
exp(logmean[1] - 1.96 * sqrt(attr(logmean, "var")))

# Geometric mean
exp(logmean)[1]

# Upper bound of the 95% confidence interval for the geometric mean
exp(logmean[1] + 1.96 * sqrt(attr(logmean, "var")))
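
The log-then-exponentiate approach works because the exponentiated weighted mean of the logs equals the weighted geometric mean in its product form. The base R sketch below checks this on invented toy values. The vectors `x` and `w` are hypothetical, and the sketch covers only the point estimate, not the survey-design-based standard errors that svymean provides.

```r
# Toy values and weights, invented for illustration (not NHANES data)
x <- c(1, 10, 100)
w <- c(1, 2, 1)

# Geometric mean as the exponentiated weighted mean of the logs
gm_log <- exp(weighted.mean(log(x), w))

# Equivalent product form: prod(x_i ^ (w_i / sum(w)))
gm_prod <- prod(x^(w / sum(w)))

# Both forms agree up to floating-point error
all.equal(gm_log, gm_prod)
```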

Correlations

I recommend using the svycor function from the jtools package to compute survey-weighted Pearson correlations between NHANES variables:

library(RNHANES)
library(tidyverse)
library(jtools)

# Download PAH dataset
nhanes_dat <- nhanes_load_data("PAH_H", "2013-2014", demographics = TRUE)

# Build the survey design object
des <- nhanes_survey_design(nhanes_dat)

svycor(~log(URXP01) + log(URXP04) + log(URXP06) + log(URXP10), design = des, na.rm = TRUE)
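
For intuition about what a weighted Pearson correlation measures, here is a minimal sketch using `cov.wt` from base R's stats package on invented toy data. It is not how svycor is implemented, and it ignores the survey's strata and clusters, so it illustrates only the point estimate. The vectors `x`, `y`, and `w` are hypothetical.

```r
# Toy data, invented for illustration (not NHANES values)
x <- c(1.2, 2.5, 3.1, 4.8, 5.0)
y <- c(2.0, 2.9, 4.2, 5.1, 6.3)
w <- c(100, 200, 300, 150, 250)  # hypothetical sample weights

# Log-transform both variables, as in the svycor call above
m <- cbind(x = log(x), y = log(y))

# cov.wt() rescales the weights to sum to 1 and, with cor = TRUE,
# also returns the weighted correlation matrix
weighted_r <- cov.wt(m, wt = w, cor = TRUE)$cor["x", "y"]

# With equal weights the result matches the ordinary Pearson correlation
equal_r <- cov.wt(m, wt = rep(1, nrow(m)), cor = TRUE)$cor["x", "y"]
all.equal(equal_r, cor(m[, "x"], m[, "y"]))

weighted_r
```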