Add option to jitter outliers in a boxplot #4480

@jolars

I would like to be able to add a small amount of jittering to outliers in a boxplot or alternatively stack the points to avoid having them overlap.

Here is an example of where points in a boxplot overlap:

library(ggplot2)
library(dplyr)

# outliers are overlapping
ggplot(mpg, aes(drv, cty)) +
  geom_boxplot()

To jitter these outliers, we currently have to resort to the following hack: build a separate dataset of outliers and plot it manually with geom_jitter().

# adding jittering to outliers is a bit of work
outliers <- 
  mpg %>%
  group_by(drv) %>%
  filter(cty > quantile(cty, 0.75) + 1.5 * IQR(cty) | 
           cty < quantile(cty, 0.25) - 1.5 * IQR(cty))

ggplot(mpg, aes(drv, cty)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(height = 0, width = 0.1, data = outliers)
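
The filtering step above can be factored into a small reusable helper. This is only a sketch in base R: is_outlier() and its coef argument are hypothetical names, not part of ggplot2, and it assumes the same 1.5 * IQR rule as the filter() call above.

```r
# Hypothetical helper (not part of ggplot2): flag values that lie beyond
# the coef * IQR fences, mirroring the filter() call above.
is_outlier <- function(x, coef = 1.5) {
  q <- stats::quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  x < q[1] - coef * iqr | x > q[2] + coef * iqr
}

# Flag outliers within each group of a toy data set, using base R only.
toy <- data.frame(
  g = rep(c("a", "b"), each = 5),
  y = c(1, 2, 3, 4, 100, 10, 11, 12, 13, 14)
)
# ave() applies is_outlier() per group; the result comes back as 0/1.
toy$outlier <- as.logical(ave(toy$y, toy$g, FUN = is_outlier))
toy[toy$outlier, ]
```

With dplyr, the same helper would shorten the pipeline to group_by(drv) followed by filter(is_outlier(cty)).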

I understand that the position argument in geom_boxplot() is already "occupied", so the simplest solution would probably be to add a new argument, outlier.jitter = c(0, 0), giving the amount of jittering in the x and y directions respectively.
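
To make the proposed interface concrete, here is a rough user-level sketch of how it might behave. Everything in it is an assumption for illustration: geom_boxplot_jittered() is a hypothetical wrapper, not an existing ggplot2 function, and it takes the grouping and value columns as strings because the outliers are selected from the raw plot data.

```r
library(ggplot2)

# Hypothetical stand-in for the proposed `outlier.jitter` argument.
# `jitter` is c(width, height), i.e. jittering in x and y respectively.
# `x` and `y` are column names given as strings (an assumption of this sketch).
geom_boxplot_jittered <- function(x, y, jitter = c(0.1, 0), coef = 1.5, ...) {
  # Keep only the rows that fall outside the coef * IQR fences of their group.
  keep_outliers <- function(d) {
    flag <- ave(d[[y]], d[[x]], FUN = function(v) {
      q <- stats::quantile(v, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
      iqr <- q[2] - q[1]
      v < q[1] - coef * iqr | v > q[2] + coef * iqr
    })
    d[which(flag == 1), , drop = FALSE]
  }
  # A list of layers can be added to a plot with `+`.
  list(
    geom_boxplot(outlier.shape = NA, coef = coef, ...),
    # A function passed as `data` is applied to the plot's default data.
    geom_jitter(data = keep_outliers, width = jitter[1], height = jitter[2])
  )
}

p <- ggplot(mpg, aes(drv, cty)) +
  geom_boxplot_jittered("drv", "cty", jitter = c(0.1, 0))
```

A built-in argument would not need the column names, since the boxplot stat already knows which points are outliers; the strings are only a workaround in this sketch.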

An even better solution would of course be to incorporate the beeswarm algorithm from ggbeeswarm:

library(ggbeeswarm)

ggplot(mpg, aes(drv, cty)) +
  geom_boxplot(outlier.shape = NA) +
  geom_beeswarm(data = outliers)

Created on 2021-05-17 by the reprex package (v2.0.0)
