Skip to content
This repository

R and Hadoop Integrated Programming Environment

branch: master
README.org

RHIPE: R and Hadoop Integrated Processing Environment

Analyze Data using Hadoop tools from within the R environment (e.g. MapReduce)

Documentation: http://saptarshiguha.github.com/RHIPE/

Binary downloads: http://ml.stat.purdue.edu/rhipebin/

Changes

Many! And better documentation to come soon. Very importantly, rhmr has gone and is now replaced by rhwatch. Though the warning says ‘use the same arguments’, this is not true. Here is a quick conversion

The parameters ifolder, ofolder and inout have gone.

  • For sequence input and sequence output, rhwatch(, input=path-to-input, output=path-to-output)
  • For text input and sequence output, rhwatch(, input=rhfmt(path-to-input,type‘text’), output=path-to-output=
  • For text output, sequence input, rhwatch(, input=path-to-input, output=rhfmt(path-to-output, type‘text’)=
  • For the mapreduce equivalent of lapply(1:N, F), do rhwatch(map=rhmap({ rhcollect(k, F(k) )}), input=N) for documentation regarding these , ask the google groups mailing list.

And there is support for reading from HBase, writing to HBase too(experimental).

Something went wrong with that request. Please try again.