enron-jruby-sinatra-hbase-pig

Hortonworks demo of Enron emails using Hadoop, Pig, HBase, JRuby, Sinatra.

See blog post at: http://hortonworks.com/blog

Installing HBase

Download HBase: http://www.apache.org/dyn/closer.cgi/hbase/

wget http://apache.mesi.com.ar/hbase/hbase-0.94.1/hbase-0.94.1.tar.gz
tar -xvzf hbase-0.94.1.tar.gz
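sudo mkdir /var/hbase
# Assumption: HBase will run as your own user, which then needs write access to this directory
sudo chown "$USER" /var/hbase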

Now edit hbase-0.94.1/conf/hbase-site.xml to include the following inside the <configuration> element:

<property>
  <name>hbase.rootdir</name>
  <value>file:///var/hbase</value>
</property>
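
To double-check the setting before starting HBase, the file can be parsed and the property read back. The following is a minimal sketch, not part of the demo: the helper name hbase_property and the inline sample document are assumptions made for illustration, and it relies on REXML being available in your Ruby or JRuby installation.

```ruby
require 'rexml/document'

# Return the value of a named property from hbase-site.xml content,
# or nil when the property is absent. (Hypothetical helper.)
def hbase_property(xml_text, name)
  doc = REXML::Document.new(xml_text)
  doc.elements.each('configuration/property') do |prop|
    prop_name = prop.elements['name']
    next unless prop_name && prop_name.text == name
    value = prop.elements['value']
    return value && value.text
  end
  nil
end

# Sample content mirroring the property shown above.
SAMPLE_SITE_XML = <<-XML
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///var/hbase</value>
  </property>
</configuration>
XML

puts hbase_property(SAMPLE_SITE_XML, 'hbase.rootdir')
```

To check a real installation, pass File.read('hbase-0.94.1/conf/hbase-site.xml') instead of the sample string.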

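Launch HBase in local (standalone) mode from the hbase-0.94.1 directory:

cd hbase-0.94.1
./bin/start-hbase.sh

Then add HBASE_HOME to your shell profile, replacing /me with the directory where you unpacked HBase:

echo 'export HBASE_HOME=/me/hbase-0.94.1' >> ~/.bash_profile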

Installing Pig

You can download Pig here: http://www.apache.org/dyn/closer.cgi/pig

wget http://apache.mirrors.lucidnetworks.net/pig/pig-0.10.0/pig-0.10.0.tar.gz
tar -xvzf pig-0.10.0.tar.gz

Installing JRuby

You can download JRuby at http://jruby.org/download or, better yet, install it via rvm. Instructions for installing rvm are here: https://rvm.io/rvm/install/.

Installing Sinatra

jgem install sinatra
jgem install json
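
Once the json gem is installed, a quick way to confirm it loads is to serialize a record by hand. This is a minimal sketch under stated assumptions: the helper email_to_json and the field names message_id, from and subject are hypothetical and are not taken from the demo's web.rb.

```ruby
require 'json'

# Serialize a hypothetical email record to a JSON string.
# The field names are assumptions chosen for illustration only.
def email_to_json(email)
  JSON.generate(
    'message_id' => email[:message_id],
    'from'       => email[:from],
    'subject'    => email[:subject]
  )
end

sample = {
  :message_id => '<1@example.com>',
  :from       => 'a@example.com',
  :subject    => 'Hello'
}
puts email_to_json(sample)
```

In a Sinatra application, a string like this would typically be the return value of a route block.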

JRuby and HBase

cd $HBASE_HOME
wget http://central.maven.org/maven2/org/jruby/jruby-complete/1.6.7.2/jruby-complete-1.6.7.2.jar
export CLASSPATH=$CLASSPATH:`java -jar jruby-complete-1.6.7.2.jar -e "puts Dir.glob('{.,build,lib}/*.jar').join(':')"`
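
The export line above runs a one-line Ruby script that collects every .jar in the current directory and in build/ and lib/, then joins the paths with colons. The sketch below shows the same glob in isolation. The function name jar_classpath and the file names are hypothetical, and the sort call is an addition to make the output order predictable.

```ruby
require 'tmpdir'
require 'fileutils'

# Build a classpath string from every .jar found directly inside the
# given base directory and its build/ and lib/ subdirectories.
# (Hypothetical helper; sorting is added here for stable output.)
def jar_classpath(base_dir)
  Dir.chdir(base_dir) do
    Dir.glob('{.,build,lib}/*.jar').sort.join(':')
  end
end

# Demonstrate on a throwaway directory with two empty, made-up jar files.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'lib'))
  FileUtils.touch(File.join(dir, 'hbase-example.jar'))
  FileUtils.touch(File.join(dir, 'lib', 'zookeeper-example.jar'))
  puts jar_classpath(dir)
end
```

Paths are relative to the directory being scanned, which is why the original command is run after changing into $HBASE_HOME.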