Description
I would like to be able to add a small amount of jittering to outliers in a boxplot or alternatively stack the points to avoid having them overlap.
Here is an example of where points in a boxplot overlap:
library(ggplot2)
library(dplyr)
# outliers are overlapping
ggplot(mpg, aes(drv, cty)) +
  geom_boxplot()
To add jittering to these outliers, we currently have to resort to the following hack: creating a separate dataset of outliers and plotting them manually with geom_jitter().
# adding jittering to outliers is a bit of work
outliers <-
  mpg %>%
  group_by(drv) %>%
  filter(cty > quantile(cty, 0.75) + 1.5 * IQR(cty) |
           cty < quantile(cty, 0.25) - 1.5 * IQR(cty))
ggplot(mpg, aes(drv, cty)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(height = 0, width = 0.1, data = outliers)
I understand that the position argument in geom_boxplot() is already "occupied", so the simplest solution would probably be to add a new argument, outlier.jitter = c(0, 0) (for jittering along the x and y coordinates, respectively).
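To make the request concrete, here is a rough sketch of how the proposed behaviour could be emulated today with a wrapper function. The names find_outliers() and boxplot_jittered_outliers() are hypothetical helpers, not part of ggplot2, and outlier.jitter is only the argument suggested in this issue; the outlier rule is the same 1.5 * IQR filter used in the workaround.

```r
library(ggplot2)
library(dplyr)

# Hypothetical helper (not part of ggplot2): keep only the rows that fall
# outside the 1.5 * IQR fences within each group.
find_outliers <- function(data, group, value, coef = 1.5) {
  data %>%
    group_by({{ group }}) %>%
    filter(
      {{ value }} > quantile({{ value }}, 0.75) + coef * IQR({{ value }}) |
        {{ value }} < quantile({{ value }}, 0.25) - coef * IQR({{ value }})
    ) %>%
    ungroup()
}

# Hypothetical stand-in for the proposed argument:
# outlier.jitter = c(x jitter, y jitter).
boxplot_jittered_outliers <- function(data, x, y, outlier.jitter = c(0.1, 0)) {
  outliers <- find_outliers(data, {{ x }}, {{ y }})
  ggplot(data, aes({{ x }}, {{ y }})) +
    # hide the built-in outlier points ...
    geom_boxplot(outlier.shape = NA) +
    # ... and draw them again as a jittered layer
    geom_jitter(
      data = outliers,
      width = outlier.jitter[1],
      height = outlier.jitter[2]
    )
}

boxplot_jittered_outliers(mpg, drv, cty)
```

Having this built into geom_boxplot() would avoid computing the outliers twice and would keep the jittered points in the same layer as the box.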
An even better solution would of course be to incorporate the beeswarm algorithm from ggbeeswarm:
library(ggbeeswarm)
ggplot(mpg, aes(drv, cty)) +
  geom_boxplot(outlier.shape = NA) +
  geom_beeswarm(data = outliers)
Created on 2021-05-17 by the reprex package (v2.0.0)