



Hai Qian edited this page · 18 revisions

This page is for master branch only.


Development Guide

Newest features in master

  • ARIMA wrapper function for MADlib

  • Full support for HAWQ

  • Better delete function

  • Better organization of user manual

  • A graphical user interface based on the shiny package

  • sample with or without replacement

  • for cross-validation

  • generic.bagging for bagging

  • crossprod for multiplication of two matrices

  • Full support for columns with array values

PivotalR is an R package that you can download from CRAN. The code on GitHub is newer and has more features, but it is less stable.


  • Display the summary of a table in GUI

Table's summary view

  • Linear regression in GUI

Linear regression

Big Data

PivotalR is an R front-end to PostgreSQL and PostgreSQL-like databases such as Pivotal Inc.'s Greenplum Database (GPDB) and Pivotal HD / HAWQ.

When running on Pivotal Inc.'s products, PivotalR uses the parallel computation and distributed storage built into Greenplum Database or HAWQ, and thus gives the ordinary R user access to Big Data.

Machine Learning

PivotalR also provides an R wrapper for MADlib. MADlib is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning algorithms for structured and unstructured data. The number of machine learning algorithms that MADlib covers is growing quickly.

Easily Exploring Data Using the Familiar R Syntax

PivotalR mimics the regular R syntax for manipulating a data.frame, and applies it to tables stored in the database. We strive to make PivotalR's learning curve as smooth as possible.

PivotalR also brings R's powerful graphical functionality to Big Data stored in a database or in Hadoop.
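As a rough illustration of the data.frame-style syntax described above, the following sketch wraps a database table and subsets it as if it were an ordinary data.frame. It assumes a reachable PostgreSQL, GPDB or HAWQ instance; the helper name `explore_abalone`, the connection settings in the usage comment, and the column names (`length`, `diameter`, `rings`, taken to be those of the `abalone` example data shipped with PivotalR) are assumptions made for this example.

```r
library(PivotalR)

# Hypothetical helper for this sketch.
# `cid` is a connection id as returned by db.connect().
explore_abalone <- function(cid) {
  # Copy the bundled example data into a database table.
  # With no table name given, PivotalR picks a random one.
  dat <- as.db.data.frame(abalone, conn.id = cid, verbose = FALSE)

  print(dim(dat))    # table dimensions, obtained from the database
  print(names(dat))  # column names

  # data.frame-style subsetting: this describes a query,
  # it does not pull the rows into R
  older <- dat[dat$rings > 10, c("length", "diameter", "rings")]

  # Rows are fetched into R only here, and only the first five
  preview <- lookat(older, 5)

  delete(dat)  # drop the table created above
  preview
}

# Usage, with assumed connection settings:
# cid <- db.connect(host = "localhost", user = "gpadmin",
#                   dbname = "testdb", password = "", port = 5432)
# explore_abalone(cid)
# db.disconnect(cid)
```

The point of the sketch is that `dat` and `older` are handles to data that stays in the database; only `lookat` moves rows into R's memory.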

Quick Prototype of Data-Parallel Machine Learning Algorithms

PivotalR enables the user to create prototypes of machine learning algorithms quickly using the regular R syntax. These prototypes acquire parallel computation power automatically when running on GPDB or HAWQ.

In practice, you can start from a copy of an existing R script and modify it until it runs in PivotalR. The goal of PivotalR is to minimize the number of changes needed to convert a normal R script into a parallel one.
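To give a feel for how small such a change can be, here is a hedged sketch that fits the same linear model twice: once with base R's `lm` on an in-memory data.frame, and once with PivotalR's `madlib.lm` on a table in the database. The function names `fit_local` and `fit_in_db` and the model formula are made up for this example, and the second function assumes that MADlib is installed on the database being used.

```r
library(PivotalR)

# Ordinary R: `df` is an in-memory data.frame and lm() runs locally.
fit_local <- function(df) {
  lm(rings ~ length + diameter, data = df)
}

# PivotalR version of the same step: `tbl` is a db.data.frame, and the
# fit is delegated to MADlib running inside the database.
fit_in_db <- function(tbl) {
  madlib.lm(rings ~ length + diameter, data = tbl)
}

# Usage, assuming `cid` is an open connection from db.connect():
# fit_local(abalone)
# tbl <- as.db.data.frame(abalone, conn.id = cid, verbose = FALSE)
# fit_in_db(tbl)
```

Only the fitting function and the type of the data argument change; the formula is written the same way in both versions.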

See here for some examples.

Minimizing the data flow between R and databases

PivotalR class hierarchy structure

The demo script of PivotalR
