ekoontz edited this page Sep 14, 2010 · 12 revisions


hbase-ec2 is a Ruby library to help manage a set of Amazon EC2 instances as a single HBase cluster.


hbase-ec2 is currently supplied as a set of Ruby files:

  • lib/hcluster.rb : the Hadoop::HCluster and Hadoop::HImage class definitions
  • lib/TestDFSIO.rb : a subclass of HCluster, intended as an example for testing Hadoop filesystem functionality.


  • export RUBYOPT="rubygems"
  • AWS::EC2. You can install this with gem install amazon-ec2.
  • AWS::S3. You can install this with gem install aws-s3.
  • Net::SSH. You can install this with gem install net-ssh.
  • Net::SCP. You can install this with gem install net-scp.
  • OpenSSL support for Ruby. This may already be installed with your Ruby, but on Ubuntu I had to run: apt-get install libruby-extras.
    As the Puppet Installation docs put it:

You can test for it by running 'ruby -ropenssl -e "puts :yep"'. If that errors out, you're missing the library.

  • An Amazon EC2 account. You must add the following to your environment prior to starting irb:
export AWS_ACCESS_KEY_ID=...
export AWS_ACCOUNT_ID=...
  • An EC2 key pair called “root”. This should be stored in your home directory in ~/.ec2/root.pem.
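
Before starting irb, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of hbase-ec2: the helper names (missing_env_vars, root_key_path, openssl_available?, preflight_problems) are hypothetical, and it checks only the environment variables, key pair location, and OpenSSL support named in the list above.

```ruby
# Hypothetical pre-flight helpers; these are NOT part of hbase-ec2.
# Variable names are taken from the prerequisites list.
REQUIRED_ENV_VARS = %w[AWS_ACCESS_KEY_ID AWS_ACCOUNT_ID].freeze

# Returns the names of required variables that are unset or empty in env.
# env can be ENV or any Hash-like object.
def missing_env_vars(env = ENV, names = REQUIRED_ENV_VARS)
  names.select { |name| env[name].to_s.strip.empty? }
end

# Returns the expected location of the "root" key pair under a home directory.
def root_key_path(home)
  File.join(home, ".ec2", "root.pem")
end

# Returns true if Ruby can load its OpenSSL bindings.
def openssl_available?
  require "openssl"
  true
rescue LoadError
  false
end

# Collects human-readable descriptions of anything that looks missing.
def preflight_problems(env, home)
  problems = missing_env_vars(env).map { |name| "#{name} is not set" }
  key = root_key_path(home)
  problems << "key pair not found at #{key}" unless File.exist?(key)
  problems << "OpenSSL support for Ruby is missing" unless openssl_available?
  problems
end
```

For example, `puts preflight_problems(ENV, Dir.home)` prints one line per problem found and prints nothing when everything checks out.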

Optional configuration

You can set your preferred EC2 region with the EC2_URL environment variable; for example:

export EC2_URL=""

By default, will be used. You can see a complete list of available regions by using the ec2-describe-regions command (see Amazon’s Region and Availability Zone FAQ).
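
To see which endpoint a session would pick up from the environment, something like the following could be used. This is a sketch only: ec2_endpoint_host is a hypothetical helper, not an hbase-ec2 function, and it only inspects the EC2_URL variable; it does not know which default hbase-ec2 falls back to.

```ruby
require "uri"

# Hypothetical helper (not part of hbase-ec2): returns the host portion of
# the endpoint named by EC2_URL, or nil when the variable is unset or empty.
# env can be ENV or any Hash-like object.
def ec2_endpoint_host(env = ENV)
  url = env["EC2_URL"].to_s.strip
  return nil if url.empty?
  URI.parse(url).host
end
```

Calling `ec2_endpoint_host` with no arguments reads the real environment. The URL used in any test of this helper is an illustrative value, not necessarily the one this page originally showed.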

Downloading hbase-ec2

git clone git://



$ irb
>> $:.unshift("~/hbase-ec2/lib")
=> ["~/hbase-ec2/lib", ...]
>> load 'hcluster.rb'
=> true
>> include Hadoop
=> Object

Creating an image from hadoop-core and hbase source trees

See: Himage Usage

Starting a new Amazon HBase cluster

>> cluster = HCluster.new :label => 'hbase-0.20.5-x86_64'
=> #<Hadoop::HCluster:0x1010e2098 @rs_key_name="root",
>> cluster.launch
=> "running"
>> cluster.run_test("TestDFSIO -write -nrFiles 10 -fileSize 1000")
(stderr): 10/06/22 19:43:24 INFO mapred.FileInputFormat: nrFiles = 10
(stderr): 10/06/22 19:43:24 INFO mapred.FileInputFormat: fileSize (MB) = 1000
10/06/22 19:44:32 INFO mapred.FileInputFormat:  IO rate std deviation: 1.0992092756403666
10/06/22 19:44:32 INFO mapred.FileInputFormat:     Test exec time sec: 67.721
10/06/22 19:44:32 INFO mapred.FileInputFormat: 
=> nil
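
Note that run_test returns nil in the session above, so the figures only appear in the printed log. If you have that log text as a string (how you capture it is outside what this page covers), a small parser can pull out the summary values. This is a sketch under that assumption: parse_dfsio_summary and SUMMARY_LINE are hypothetical names, and the pattern is based only on the log lines shown above.

```ruby
# Hypothetical helper (not part of hbase-ec2).
# Matches summary lines of the form
#   "... INFO mapred.FileInputFormat:   <label>: <number>"
SUMMARY_LINE = /INFO mapred\.FileInputFormat:\s+(.+?):\s+(-?\d+(?:\.\d+)?)\s*\z/

# Returns a Hash mapping each summary label to its numeric value.
# Lines that do not look like "<label>: <number>" are ignored.
def parse_dfsio_summary(output)
  output.each_line.each_with_object({}) do |line, metrics|
    match = SUMMARY_LINE.match(line.chomp)
    metrics[match[1].strip] = match[2].to_f if match
  end
end
```

Given the two summary lines from the run above, the result would contain the keys "IO rate std deviation" and "Test exec time sec"; the nrFiles and fileSize lines use "=" rather than ":" and are skipped.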

Terminating a Cluster

>> cluster.terminate
terminating zookeeper: i-5144a73b
terminating master: i-9344a7f9
terminating regionserver: i-4d4aa927
terminating regionserver: i-434aa929
terminating regionserver: i-414aa92b
terminating regionserver: i-474aa92d
terminating regionserver: i-454aa92f
=> {"name"=>"hdfs", "num_zookeepers"=>1, "master"=>"i-9344a7f9", "launchTime"=>"2010-06-22T23:22:13.000Z", "num_regionservers"=>5, "dnsName"=>"", "state"=>"terminated"}
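
The hash returned by terminate can be checked in a script rather than read by eye. The sketch below assumes only the keys visible in the output above; terminated? and describe_cluster are hypothetical helper names, not part of hbase-ec2.

```ruby
# Hypothetical helpers (not part of hbase-ec2) that work on a summary hash
# shaped like the one returned by cluster.terminate above.

# Returns true when the summary reports the cluster as terminated.
def terminated?(summary)
  summary["state"] == "terminated"
end

# Builds a one-line description from the keys shown in the output above.
def describe_cluster(summary)
  "#{summary['name']}: regionservers=#{summary['num_regionservers']} " \
    "zookeepers=#{summary['num_zookeepers']} state=#{summary['state']}"
end
```

In an irb session this might be used as `summary = cluster.terminate` followed by `puts describe_cluster(summary) if terminated?(summary)`.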