sparklyr Project Proposal

Name of the project: sparklyr

Requested maturity level: Incubation


The sparklyr R package provides a modern interface to Apache Spark, a fast and general engine for big data processing. It supports connecting to local and remote Apache Spark clusters and provides a dplyr-compatible back-end, an interface to Spark’s built-in machine learning algorithms, support for Spark structured streams and Spark pipelines, and the ability to execute custom R code across Spark clusters. It also enables multiple extensions that bring H2O, XGBoost, GraphFrames, MLeap, and many others to Spark from R.

Possible integrations with existing LF AI projects:

- Enable support for Horovod in R with sparklyr.
- Enable support to export models and pipelines.

License: Apache License 2.0

External dependencies:

Depends on:

Initial committers:

Infrastructure requests: None

Current mailing lists: None. We will ask LF to create lists for users, developers, and the TSC.


Website: The current site needs to be transferred to a new domain.

Release methodology & mechanics: A release is made once the committers vote to approve it. The Release Manager (Javier Luraschi) performs the release.

The release procedure is described in

Social media accounts: None

Existing sponsorship: RStudio, Databricks, and Qubole have provided developer resources to improve sparklyr.