This page describes the master branch only.
Newest features in master
ARIMA wrapper function for MADlib
Full support for HAWQ
Better organization of user manual
A graphical user interface based upon package
sample with or without replacement
generic.cv for cross-validation
generic.bagging for bagging
crossprod for multiplication of two matrices
Full support for columns with array values
When running on Pivotal Inc. products, PivotalR uses the parallel computation and distributed storage built into Greenplum Database or HAWQ, and thus gives ordinary R users access to Big Data.
PivotalR also provides an R wrapper for MADlib. MADlib is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning algorithms for structured and unstructured data. The number of machine-learning algorithms that MADlib covers is growing quickly.
PivotalR mimics the regular R syntax for manipulating data.frame to operate on the tables stored in the databases. We strive hard to make PivotalR's learning curve as smooth as possible.
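As a rough sketch of what "regular R syntax" means here, the snippet below filters rows and derives a column on an in-memory data.frame, then shows how the same expressions might be written against a database table. The table name `cars_table`, its columns, and the connection id are hypothetical. The remote function is only defined, not called, since it needs a live database with PivotalR installed.

```r
# Base R: filter rows and derive a column on an in-memory data.frame.
heavy_cars_local <- function(df) {
  out <- df[df$wt > 3, c("mpg", "wt")]
  out$ratio <- out$mpg / out$wt
  out
}

# Assumed PivotalR counterpart: the same indexing applied to a table wrapper.
# "cars_table" is a hypothetical table with columns mpg and wt; conn_id is
# the id returned by PivotalR::db.connect() for your own database.
heavy_cars_remote <- function(conn_id) {
  tbl <- PivotalR::db.data.frame("cars_table", conn.id = conn_id)
  out <- tbl[tbl$wt > 3, c("mpg", "wt")]
  out$ratio <- out$mpg / out$wt
  PivotalR::lookat(out, 10)  # pull at most 10 rows back into R
}

# Run the local version on the built-in mtcars data set.
res <- heavy_cars_local(mtcars)
print(head(res))
```

Only the wrapper construction and the final fetch differ between the two functions; the row filter and column arithmetic are written identically.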
PivotalR also brings R's powerful graphics capabilities to Big Data stored in a database or in Hadoop.
PivotalR enables the user to create prototypes of machine learning algorithms quickly using the regular R syntax. These prototypes acquire parallel computation power automatically when running on GPDB or HAWQ.
To port an existing analysis, first copy your R script and then make the changes needed for it to run in PivotalR. The goal of PivotalR is to minimize the number of changes needed to convert a regular R script into a parallel one.
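To illustrate the kind of change involved, here is a minimal sketch of one script step before and after conversion, assuming a linear regression. The table name passed to the converted function and the connection id are hypothetical, and the converted function is only defined here because calling it requires a database with MADlib installed.

```r
# Original script step: linear regression on a local data.frame.
fit_local <- function(df) {
  lm(mpg ~ wt + hp, data = df)
}

# Converted step (sketch): wrap the table, then swap lm() for madlib.lm().
# table_name is a hypothetical table with columns mpg, wt and hp; conn_id
# would come from PivotalR::db.connect() with your own host, dbname and port.
fit_remote <- function(table_name, conn_id) {
  tbl <- PivotalR::db.data.frame(table_name, conn.id = conn_id)
  PivotalR::madlib.lm(mpg ~ wt + hp, data = tbl)
}

# Run the local version on the built-in mtcars data set.
fit <- fit_local(mtcars)
print(coef(fit))
```

In this sketch the formula is unchanged; the edits are limited to how the data object is created and which fitting function is called.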
See here for some examples.