JRuby on Hadoop
JRuby on Hadoop is a thin wrapper that lets you write Hadoop Mapper / Reducer code in JRuby.
Required gems are all on GemCutter.
Upgrade RubyGems to 1.3.5.
$ gem install jruby-on-hadoop
Run a Hadoop cluster on your machines and set the HADOOP_HOME environment variable.
Put your input files into HDFS, e.g. test/inputs/file1
Now you can run 'joh' like below:
$ joh examples/wordcount.rb test/inputs test/outputs
The Hadoop job results are written to HDFS under test/outputs/part-*
Script example. (see also examples/wordcount.rb)
def setup(conf)
  # setup jobconf
end

def map(script, key, value, output, reporter)
  # mapper process
end

def reduce(script, key, values, output, reporter)
  # reducer process
end
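As a rough illustration of how these three hooks might be filled in, here is a word-count sketch. It is not the contents of examples/wordcount.rb. It assumes plain Ruby strings and integers instead of Hadoop's Writable types, and ArrayCollector and run_locally are hypothetical helpers that stand in for Hadoop's output collector and job runner, so the script can be tried with plain ruby and no cluster.

```ruby
# Hypothetical local stand-in for Hadoop's OutputCollector (illustration only).
class ArrayCollector
  attr_reader :pairs

  def initialize
    @pairs = []
  end

  # Same shape as OutputCollector#collect(key, value).
  def collect(key, value)
    @pairs << [key, value]
  end
end

def setup(conf)
  # Nothing to configure in this sketch.
end

# Emit (word, 1) for every whitespace-separated word in the input line.
def map(script, key, value, output, reporter)
  value.to_s.split.each { |word| output.collect(word, 1) }
end

# Sum all counts collected for a single word.
def reduce(script, key, values, output, reporter)
  sum = 0
  values.each { |v| sum += v }
  output.collect(key, sum)
end

# Hypothetical helper: simulate the map, shuffle, and reduce phases in-process.
def run_locally(lines)
  mapped = ArrayCollector.new
  lines.each_with_index { |line, i| map(nil, i, line, mapped, nil) }

  grouped = mapped.pairs.group_by { |word, _| word }
  reduced = ArrayCollector.new
  grouped.keys.sort.each do |word|
    reduce(nil, word, grouped[word].map { |_, n| n }, reduced, nil)
  end
  reduced.pairs
end

counts = run_locally(["hello hadoop", "hello jruby"])
counts.each { |word, n| puts "#{word}\t#{n}" }
```

When the script runs under joh, output would be Hadoop's own collector rather than ArrayCollector. Whether it accepts plain Ruby values or needs them wrapped in Writable types (e.g. Text, IntWritable) is not confirmed here, so check examples/wordcount.rb before relying on this.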
You can build hadoop-ruby.jar with “ant”.
The build requires the HADOOP_HOME environment variable to be set on your system. The assumed Hadoop version is 0.19.2.
Koichi Fujikawa <email@example.com>
License: Apache License