
Added install instructions for cluster

1 parent f1f27d0 commit f0031119dff4d97a696a30182c86bfe00b373e1a @laserson laserson committed Feb 20, 2013
Showing with 38 additions and 0 deletions.
  1. +38 −0 INSTALL.md
38 INSTALL.md
@@ -0,0 +1,38 @@
+Installing R and RHadoop on a small CentOS 6 cluster
+====================================================
+
+Use csshX to open a root session on every node, so that the commands below run on all hosts at once:
+
+ csshX --login root --hosts ~/cloudera/cluster/hosts.txt
+
+Install R:
+
+ yum -y --enablerepo=epel install R R-devel
+ R CMD javareconf
+
+Start the R REPL and install some packages:
+
+ install.packages(c('Rcpp', 'RJSONIO', 'itertools', 'digest'), repos="http://cran.revolutionanalytics.com", INSTALL_opts=c('--byte-compile') )
+ install.packages(c('functional', 'stringr', 'plyr'), repos="http://cran.revolutionanalytics.com", INSTALL_opts=c('--byte-compile') )
+ install.packages(c('rJava'), repos="http://cran.revolutionanalytics.com" )
+ install.packages(c('randomForest'), repos="http://cran.revolutionanalytics.com" )
+
+Then download RHadoop and install `rmr2`:
+
+ wget --no-check-certificate https://github.com/RevolutionAnalytics/RHadoop/tarball/master -O - | tar zx
+ mv RevolutionAnalytics-RHadoop* RHadoop
+ R CMD INSTALL --byte-compile RHadoop/rmr2/pkg/
+
+Set the following environment variables in `.bashrc`:
+
+ export HADOOP_HOME=/usr/lib/hadoop
+ export HADOOP_CMD=/usr/bin/hadoop
+ export HADOOP_STREAMING=/usr/lib/hadoop-mapreduce/hadoop-streaming-2.0.0-cdh4.1.2.jar
+
+Make sure to source `.bashrc` so the new variables take effect before continuing.
+
+Install `rhdfs`:
+
+ R CMD INSTALL --byte-compile RHadoop/rhdfs/pkg/
+
+Done.
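
The instructions above ask for `HADOOP_HOME`, `HADOOP_CMD`, and `HADOOP_STREAMING` to be exported and sourced before `rhdfs` is installed. A minimal sketch of a pre-flight check for that step follows. The function name `check_rhadoop_env` is hypothetical and not part of this commit. It only confirms that each variable is set and points at an existing path, and assumes nothing about the Hadoop installation beyond that.

```shell
# Hypothetical helper (not part of INSTALL.md): confirm that the three
# variables exported in .bashrc are set and point at existing paths.
check_rhadoop_env() {
    status=0
    for name in HADOOP_HOME HADOOP_CMD HADOOP_STREAMING; do
        # Look up the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            status=1
        elif [ ! -e "$value" ]; then
            echo "error: $name points to a missing path: $value" >&2
            status=1
        fi
    done
    # Zero only if every variable passed both checks.
    return "$status"
}
```

One way to use it is to gate the last install step, for example `check_rhadoop_env && R CMD INSTALL --byte-compile RHadoop/rhdfs/pkg/`, so that a node with a missing or mistyped path reports the problem before the `rhdfs` install is attempted.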
