Skip to content
Tools for visualizing uncertainty with ggplot2
R
Branch: master
Clone or download
Latest commit aeae12b Jan 24, 2019
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
R add code comment Jan 24, 2019
data-raw/cacao add data set, update readme Jan 2, 2019
data add data set, update readme Jan 2, 2019
docs rebuild animations Jan 7, 2019
man rebuild animations Jan 7, 2019
tests fixes to get cran check to succeed Jan 5, 2019
vignettes add code comment Jan 24, 2019
.Rbuildignore add pkgdown build Jan 4, 2019
.gitignore started a vignette Jan 3, 2019
DESCRIPTION
ISSUE_TEMPLATE.md basic package structure Aug 26, 2018
NAMESPACE modify code for pending new strapgod version Jan 5, 2019
README.Rmd site update Jan 5, 2019
README.md site update Jan 5, 2019
_pkgdown.yml add pkgdown build Jan 4, 2019
ungeviz.Rproj

README.md

ungeviz

Tools for visualizing uncertainty with ggplot2.

This package is meant to provide helpful add-on functionality for ggplot2 to visualize uncertainty. The package is particularly focused on hypothetical outcome plots (HOPs) and provides bootstrapping and sampling functionality that integrates well with the ggplot2 API.

The package name comes from the German word “Ungewissheit”, which means uncertainty.

Installation

devtools::install_github("wilkelab/ungeviz")

Sampling and bootstrapping

The sampler() and bootstrapper() functions generate sampling and bootstrapping objects, respectively, that can be used in place of data in ggplot2 layers. These objects are helpful when creating HOPs.

library(ggplot2)
library(dplyr)
library(forcats)
library(ungeviz)
library(gganimate)

cacao %>% filter(location %in% c("Canada", "U.S.A.")) %>%
  ggplot(aes(rating, location)) +
  geom_point(
    position = position_jitter(height = 0.3, width = 0.05), 
    size = 0.4, color = "#0072B2", alpha = 1/2
  ) +
  geom_vpline(data = sampler(25, group = location), height = 0.6, color = "#D55E00") +
  theme_bw() + 
  # `.draw` is a generated column indicating the sample draw
  transition_states(.draw, 1, 3)

Both the bootstrapper and sampler objects can be used for repeated reproducible sampling, by passing the same bootstrapper or sampler object as data to multiple ggplot2 layers.

data(BlueJays, package = "Stat2Data")

# set up bootstrapping object that generates 20 bootstraps
# and groups by variable `KnownSex`
bsr <- bootstrapper(20, KnownSex)

ggplot(BlueJays, aes(BillLength, Head, color = KnownSex)) +
  geom_smooth(method = "lm", color = NA) +
  geom_point(alpha = 0.3) +
  # `.row` is a generated column providing a unique row number for all rows
  geom_point(data = bsr, aes(group = .row)) +
  geom_smooth(data = bsr, method = "lm", fullrange = TRUE, se = FALSE) +
  facet_wrap(~KnownSex, scales = "free_x") +
  scale_color_manual(values = c(F = "#D55E00", M = "#0072B2"), guide = "none") +
  theme_bw() +
  transition_states(.draw, 1, 1) + 
  enter_fade() + exit_fade()

Smooth draws

Instead of bootstrapping smoothers or regression lines, we can also fit a smoothing model to the data and then generate fit lines by randomly drawing from the posterior distribution. This strategy is automated in stat_smooth_draws(), which works similar to stat_smooth() but generates multiple equally probable fit draws rather than one best-fit line.

ggplot(mtcars, aes(hp, mpg)) + 
  geom_point() +
  stat_smooth_draws(times = 20, aes(group = stat(.draw))) + 
  theme_bw() +
  transition_states(stat(.draw), 1, 2) +
  enter_fade() + exit_fade()

Miscellaneous geoms and stats

Several geoms and stats are provided that can be helpful when visualizing uncertainty, including geom_hpline() and geom_vpline() used in the sampling example above, and stat_confidence_density() which can draw confidence strips.

library(broom)
library(emmeans)

cacao_lumped <- cacao %>%
  mutate(
    location = fct_lump(location, n = 20)
  )
  
cacao_means <- lm(rating ~ location, data = cacao_lumped) %>%
  emmeans("location") %>%
  tidy() %>%
  mutate(location = fct_reorder(location, estimate))

ggplot(cacao_means, aes(x = estimate, moe = std.error, y = location)) +
  stat_confidence_density(fill = "lightblue", height = 0.8, confidence = 0.68) +
  geom_point(aes(x = estimate), size = 2) +
  geom_errorbarh(aes(xmin = estimate - std.error, xmax = estimate + std.error), height = 0.5) +
  xlim(2.6, 3.7) +
  theme_minimal()

You can’t perform that action at this time.