Rebooting ggplot2 for scalable big data visualization


ggplot2.SparkR is an R package for scalable visualization of big data stored in Spark DataFrames. It extends the original ggplot2 package and handles both R data.frame and Spark DataFrame inputs with no modifications to the original API.

ggplot2.SparkR requires no additional training for R users who are already familiar with ggplot2, and it lets them use Spark's distributed processing to visualize big data efficiently.

So far, 6 graph types (bar, bin2d, boxplot, freqpoly, histogram, and stat-sum graphs) and 15 options are supported. We plan to extend this further in the future.
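
As a minimal sketch, the standard ggplot2 calls for three of these graph types look like the following on a local R data.frame. Per the description above, the same calls are expected to work unchanged when `df` is a Spark DataFrame and ggplot2.SparkR is loaded. Creating the Spark DataFrame is not shown because it depends on your Spark setup, and the use of the built-in `mtcars` dataset is purely illustrative.

```r
library(ggplot2)

# Illustrative local R data.frame; mtcars ships with base R.
# Assumption: with ggplot2.SparkR loaded, df could instead be a Spark DataFrame.
df <- mtcars
df$cyl <- factor(df$cyl)

# Three of the supported graph types, written with standard ggplot2 calls.
p_bar  <- ggplot(df, aes(x = cyl)) + geom_bar()
p_hist <- ggplot(df, aes(x = mpg)) + geom_histogram(binwidth = 2)
p_box  <- ggplot(df, aes(x = cyl, y = mpg)) + geom_boxplot()

# Render one of the plots on the active graphics device.
print(p_bar)
```

Because the plot specification does not mention where the data lives, switching between local and distributed data should only require changing how `df` is created.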

Find out more at


Get the development version from GitHub:

# install.packages("devtools")

Mailing list

You are welcome to ask questions about ggplot2.SparkR or report bugs on the ggplot2.SparkR mailing list or by email. Anyone can read the archived discussions, so any message you post is public.

Other Resources

  • ggplot2: Plotting system for R by Hadley Wickham
  • Apache Spark: Large-scale data processing engine.