Beeswarm-style plots with ggplot2

Introduction

Beeswarm plots (also known as column scatter plots or violin scatter plots) plot points that would ordinarily overlap so that they fall next to each other instead. In addition to reducing overplotting, they help visualize the density of the data at each point (similar to a violin plot) while still showing each data point individually.
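
To get a rough sense of how much overplotting a dataset involves, one can count the points that share exact coordinates with an earlier point. The following is a minimal sketch using only base R and the built-in iris data; the variable names are illustrative and not part of ggbeeswarm.

```r
# Count iris observations whose (Species, Sepal.Length) pair repeats an
# earlier row. These points would be drawn exactly on top of one another
# in a plain scatter plot of Sepal.Length by Species.
coords <- iris[, c("Species", "Sepal.Length")]
n_overlapping <- sum(duplicated(coords))
n_overlapping
```

A large count relative to `nrow(iris)` suggests that a plain scatter plot would hide many observations.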

ggbeeswarm provides two different methods to create beeswarm-style plots using ggplot2. It does this by adding two new ggplot geom objects:

  • geom_quasirandom: Uses a van der Corput sequence or Tukey texturing (Tukey and Tukey, "Strips displaying empirical distributions: I. textured dot strips") to space the dots and avoid overplotting. This uses the vipor package (sherrillmix/vipor).

  • geom_beeswarm: Uses the beeswarm package to offset points based on point size.
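
For readers unfamiliar with the van der Corput sequence mentioned above, here is a minimal base R sketch of the standard base-2 construction. It is an illustration only, not the code vipor uses, and the function name `van_der_corput` is hypothetical.

```r
# Sketch of the van der Corput sequence: the i-th value is obtained by
# reversing the base-`base` digits of i around the radix point.
van_der_corput <- function(n, base = 2) {
  vapply(seq_len(n), function(i) {
    value <- 0
    denom <- 1
    while (i > 0) {
      denom <- denom * base
      value <- value + (i %% base) / denom  # next digit, mirrored
      i <- i %/% base
    }
    value
  }, numeric(1))
}

van_der_corput(4)
# 0.500 0.250 0.750 0.125
```

Successive values keep filling the largest remaining gaps in the unit interval, which is what makes such a sequence useful for spreading points without the clumping that purely random jitter can produce.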

Features:

  • Can handle categorical variables on the y-axis (thanks @smsaladi, @koncina)
  • Automatically dodges if a grouping variable is categorical and dodge.width is specified (thanks @josesho)

See the examples below.

Installation

This package is on CRAN, so installation should be as simple as:

install.packages('ggbeeswarm')

If you want the development version from GitHub, you can install it with:

devtools::install_github("eclarke/ggbeeswarm")

Examples

Here is a comparison between geom_jitter and geom_quasirandom on the iris dataset:

set.seed(12345)
library(ggplot2)
library(ggbeeswarm)
#compare to jitter
ggplot(iris,aes(Species, Sepal.Length)) + geom_jitter()


ggplot(iris,aes(Species, Sepal.Length)) + geom_quasirandom()


geom_quasirandom()

Using geom_quasirandom:

#default geom_quasirandom
ggplot(mpg,aes(class, hwy)) + geom_quasirandom()


# With categorical y-axis
ggplot(mpg,aes(hwy, class)) + geom_quasirandom(groupOnX=FALSE)


# Some groups may have only a few points. Use `varwidth=TRUE` to adjust width dynamically.
ggplot(mpg,aes(class, hwy)) + geom_quasirandom(varwidth = TRUE)


# Automatic dodging
sub_mpg <- mpg[mpg$class %in% c("midsize", "pickup", "suv"),]
ggplot(sub_mpg, aes(class, displ, color=factor(cyl))) + geom_quasirandom(dodge.width=1)


Alternative methods

geom_quasirandom can also use several other methods to distribute points. For example:

ggplot(iris, aes(Species, Sepal.Length)) + geom_quasirandom(method = "tukey") + 
    ggtitle("Tukey texture")


ggplot(iris, aes(Species, Sepal.Length)) + geom_quasirandom(method = "tukeyDense") + 
    ggtitle("Tukey + density")


ggplot(iris, aes(Species, Sepal.Length)) + geom_quasirandom(method = "frowney") + 
    ggtitle("Banded frowns")


ggplot(iris, aes(Species, Sepal.Length)) + geom_quasirandom(method = "smiley") + 
    ggtitle("Banded smiles")


ggplot(iris, aes(Species, Sepal.Length)) + geom_quasirandom(method = "pseudorandom") + 
    ggtitle("Jittered density")


ggplot(iris, aes(Species, Sepal.Length)) + geom_beeswarm() + ggtitle("Beeswarm")


geom_beeswarm()

Using geom_beeswarm:

ggplot(iris,aes(Species, Sepal.Length)) + geom_beeswarm()


ggplot(iris,aes(Species, Sepal.Length)) + geom_beeswarm(beeswarmArgs=list(side=1))


ggplot(mpg,aes(class, hwy)) + geom_beeswarm(size=.5)


# With categorical y-axis
ggplot(mpg,aes(hwy, class)) + geom_beeswarm(size=.5,groupOnX=FALSE)


# Also watch out for points escaping from the plot with geom_beeswarm
ggplot(mpg,aes(hwy, class)) + geom_beeswarm(size=.5,groupOnX=FALSE) + scale_y_discrete(expand=expansion(add=c(0.5,1)))


ggplot(mpg,aes(class, hwy)) + geom_beeswarm(size=1.1)


# With automatic dodging
ggplot(sub_mpg, aes(class, displ, color=factor(cyl))) + geom_beeswarm(dodge.width=0.5)


#With different beeswarm point distribution priority
dat<-data.frame(x=rep(1:3,c(20,40,80)))
dat$y<-rnorm(nrow(dat),dat$x)
ggplot(dat,aes(x,y)) + geom_beeswarm(size=2) + ggtitle('Default (ascending)') + scale_x_continuous(expand=expansion(add=c(0.5,.5)))


ggplot(dat,aes(x,y)) + geom_beeswarm(size=2,priority='descending') + ggtitle('Descending') + scale_x_continuous(expand=expansion(add=c(0.5,.5)))


ggplot(dat,aes(x,y)) + geom_beeswarm(size=2,priority='density') + ggtitle('Density') + scale_x_continuous(expand=expansion(add=c(0.5,.5)))


ggplot(dat,aes(x,y)) + geom_beeswarm(size=2,priority='random') + ggtitle('Random') + scale_x_continuous(expand=expansion(add=c(0.5,.5)))



Authors: Erik Clarke and Scott Sherrill-Mix
