forked from delta-rho/RHIPE

R and Hadoop Integrated Programming Environment

hlin09/RHIPE

RHIPE: R and Hadoop Integrated Programming Environment

Analyze data using Hadoop tools (e.g. MapReduce) from within the R environment.

Documentation: http://saptarshiguha.github.com/RHIPE/

Binary downloads: http://ml.stat.purdue.edu/rhipebin/

Changes

Many, with better documentation to come soon. Most importantly, rhmr has been removed and replaced by rhwatch. Although the warning says 'use the same arguments', this is not true. Here is a quick conversion guide.

The parameters ifolder, ofolder and inout have been removed.

  • For sequence input and sequence output: rhwatch(..., input=path-to-input, output=path-to-output)
  • For text input and sequence output: rhwatch(..., input=rhfmt(path-to-input, type='text'), output=path-to-output)
  • For sequence input and text output: rhwatch(..., input=path-to-input, output=rhfmt(path-to-output, type='text'))
  • For the MapReduce equivalent of lapply(1:N, F): rhwatch(map=rhmap({ rhcollect(k, F(k)) }), input=N)

For documentation on any of these, ask on the Google Groups mailing list.
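
As a rough illustration of the conversions listed above, here is a minimal sketch in R. It assumes a working Rhipe and Hadoop installation (with library(Rhipe) loaded and rhinit() already called) before either function is actually invoked. The function names count_chars_job and lapply_job, the example paths, and the choice of F(k) = k^2 are all hypothetical and only serve the illustration.

```r
# Sketch only: calling these functions needs a working Rhipe + Hadoop setup.
# Function names, paths and the example computation are hypothetical.

# Text input, sequence output: the input format is attached to the path
# with rhfmt() rather than passed through ifolder/ofolder/inout.
count_chars_job <- function(in_path, out_path) {
  map <- expression({
    # map.keys / map.values hold the current batch of input records
    lapply(seq_along(map.values), function(i) {
      rhcollect(map.keys[[i]], nchar(map.values[[i]]))
    })
  })
  Rhipe::rhwatch(
    map    = map,
    input  = Rhipe::rhfmt(in_path, type = "text"),
    output = out_path
  )
}

# MapReduce counterpart of lapply(1:N, F), with F(k) = k^2 as a stand-in.
lapply_job <- function(N) {
  Rhipe::rhwatch(
    map   = Rhipe::rhmap({ rhcollect(k, k^2) }),
    input = N
  )
}
```

On a configured cluster, a call might look like count_chars_job("/tmp/example/in.txt", "/tmp/example/out") or lapply_job(10). Defining the functions does not contact Hadoop, so nothing runs until one of them is called.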

There is also support for reading from HBase, and for writing to HBase too (experimental).
