Ruby ORM for HBase - Forked from sqs and updated for current HBase

= Rhino - a Ruby ORM for HBase

Rhino is a Ruby object-relational mapping (ORM) for HBase.

=== Support & contact

Authors: Quinn Slack, Jay Hoover

Contributors: Dru Jensen

== Getting started

=== Install Rhino

  gem install rhino

=== Installing HBase and Thrift

Since Rhino uses the HBase Thrift API, you must first install both HBase and Thrift. Downloading the latest trunk revisions of each is recommended, but if you encounter problems, try using the latest stable release instead. Here are the basic steps for installing both.

==== Installing HBase

Many tutorials and help pages cover installing HBase. The HBase Wiki
is the best place to get started.

==== Installing Thrift

The Thrift installation instructions
can help you install a recent version of the Thrift libraries.

== Usage  

=== Connect to HBase

The following code points Rhino to your running HBase Thrift server (which listens on localhost:9090 by default).

  Rhino::Model.connect('localhost', 9090)

=== Describe your table

A class definition like:

  class Page < Rhino::Model
    include Rhino::Constraints

    column_family :title
    column_family :contents
    column_family :links
    column_family :meta
    column_family :images
    alias_attribute :author, 'meta:author'

    has_many :links, Link
    has_many :images, Image

    constraint(:title_required) { |page| page.title and !page.title.empty? }
  end

is mapped to the following HBase table (described in HBase Query Language):

  CREATE TABLE pages(title:, contents:, links:, meta:, images:);

Or, in the JRuby shell language of HBase version 0.2:

  create 'pages', 'title', 'contents', 'links', 'meta', 'images'
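
The <tt>:title_required</tt> constraint in the class definition is an ordinary Ruby block that receives the record. The following standalone sketch shows how such a predicate behaves for a missing, empty, and present title. <tt>PageStub</tt> and <tt>constraint_passes?</tt> are hypothetical names used only for illustration; this is not how Rhino evaluates constraints internally.

```ruby
# Stand-in for a Rhino model row; only the title attribute matters here.
PageStub = Struct.new(:title)

# Same predicate as the :title_required constraint in the Page class.
title_required = lambda { |page| page.title and !page.title.empty? }

# Hypothetical helper: run a predicate and coerce the result to a strict
# boolean, since `nil and ...` evaluates to nil rather than false.
def constraint_passes?(predicate, record)
  !!predicate.call(record)
end

puts constraint_passes?(title_required, PageStub.new(nil))        # false
puts constraint_passes?(title_required, PageStub.new(""))         # false
puts constraint_passes?(title_required, PageStub.new("Welcome"))  # true
```
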

=== Basic operations

==== Getting records

  page = Page.get('some-page')
  all_pages = Page.get_all()

==== Creating new records

  # data can be specified in the second argument of Page.new...
  page = Page.new('the-row-key', {:title=>"my title"})
  # ...or as attributes on the model
  page.contents = "<p>welcome</p>"

==== Reading and updating attributes

  page = Page.get('some-key')
  puts "the old title is: #{page.title}"
  page.title = "another title"
  puts "the new title is: #{page.title}"

You can also read from and write to specific columns in a column family.
Since we already defined the <tt>meta:</tt> column family, Rhino knows we want to set the <tt>meta:author</tt> column:

  page = Page.get('some-key')
  page.meta_author = "John Doe"
  puts "the author is: #{page.meta_author}"
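
To make the naming convention concrete, here is a minimal sketch of how an accessor name such as <tt>meta_author</tt> could be resolved to the column <tt>meta:author</tt> once the column families are known. <tt>column_for</tt> is a hypothetical helper written for this example, not part of Rhino, and Rhino's own lookup may work differently.

```ruby
# Hypothetical helper: turn an accessor name such as :meta_author into an
# HBase column name such as "meta:author", given the declared families.
# Returns nil when the name does not start with a known family prefix.
def column_for(attr_name, column_families)
  name = attr_name.to_s
  family = column_families.map(&:to_s).find { |f| name.start_with?("#{f}_") }
  return nil unless family

  qualifier = name[(family.length + 1)..-1]
  "#{family}:#{qualifier}"
end

families = [:title, :contents, :links, :meta, :images]

puts column_for(:meta_author, families)            # meta:author
puts column_for(:unknown_thing, families).inspect  # nil
```
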

=== has_many and belongs_to

In the model definition above, we stated that a Page <tt>has_many :links</tt> and <tt>has_many :images</tt>. We can
now define <tt>Link</tt> and <tt>Image</tt> in more detail.

  class Link < Rhino::Cell
    belongs_to :page

    # Rebuild a host-first URL from a reversed-host key such as com.example/path.
    def url
      url_parts = key.split('/')
      backwards_host = url_parts.shift
      path = url_parts.join('/')
      host = backwards_host.split('.').reverse.join('.')
      # Assumption: the key records no scheme, so "http" is used here.
      "http://#{host}/#{path}"
    end
  end

  class Image < Rhino::Cell
    belongs_to :page
  end

Now that we've defined <tt>Link</tt> and <tt>Image</tt>, we can work with them easily. The following code adds a link to the page <tt>com.example</tt>, which when written to HBase becomes a cell in the <tt>links:</tt> column family, named by the given key, with the contents <tt>search engine</tt>.

  page = Page.get('com.example')
  page.links << { '' => 'search engine' }

You can add cells to an association in two ways. First, you can pass
in a hash of 'key' to 'contents' data. Second, you can pass in
cells. Either mechanism allows you to add one or more cells at a time.

  page.links << Link.new('', 'search engine')
  page.links.add [ Link.new('', 'search engine'),
                   Link.new('', 'search engine') ]

You can also add cells to a row when you are creating it:

  page = Page.new( 'my_page',
                   :title => 'Basic Page',
                   :links => [ Link.new('', 'search engine') ] )

You can iterate over the collection of links. In this example, we use
<tt>Link#url</tt>, a method we defined on the Link class, to convert
from the common <tt>com.example/path</tt> URL storage style to a
conventional host-first URL.

  page.links.each do |link|
    puts "Link to #{link.url} with text: '#{link.contents}'"
  end

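
The reversed-host storage style can be tried outside Rhino. The sketch below converts in both directions using plain string operations. <tt>host_first</tt> and <tt>storage_key</tt> are hypothetical helper names, and the scheme is left out because the key does not record one.

```ruby
# Convert a reversed-host key ("com.example.www/docs/intro") into a
# host-first string ("www.example.com/docs/intro").
def host_first(key)
  backwards_host, *path_parts = key.split('/')
  host = backwards_host.split('.').reverse.join('.')
  ([host] + path_parts).join('/')
end

# Inverse: build a reversed-host storage key from a host-first string.
def storage_key(host_and_path)
  host, *path_parts = host_and_path.split('/')
  ([host.split('.').reverse.join('.')] + path_parts).join('/')
end

puts host_first('com.example.www/docs/intro')   # www.example.com/docs/intro
puts storage_key('www.example.com/docs/intro')  # com.example.www/docs/intro
```

One likely reason for storing hosts backwards is that keys from the same domain then sort next to each other, which suits a store that keeps rows and columns in sorted order.
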
You can also get a specific link.

  google_link_text = page.links.find('').contents

The find method can also take either a regular expression or a block:

  first_match = page.links.find(/google$/)
  first_match = page.links.find{ |cell| cell.contents == 'search engine' }
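
The three lookup styles (exact key, regular expression, block) can be modeled with a plain array. This is only a sketch: <tt>CellStub</tt> and <tt>find_cell</tt> are hypothetical names, the keys are made up, and matching the regular expression against the cell key is an assumption about how Rhino applies it.

```ruby
# Stand-in for a Rhino cell with a key and contents.
CellStub = Struct.new(:key, :contents)

# Hypothetical lookup: accepts a block, a Regexp matched against cell keys,
# or an exact key. Returns the first matching cell or nil.
def find_cell(cells, matcher = nil, &block)
  return cells.find(&block) if block

  case matcher
  when Regexp then cells.find { |cell| cell.key =~ matcher }
  else cells.find { |cell| cell.key == matcher }
  end
end

cells = [
  CellStub.new('org.example', 'reference site'),
  CellStub.new('com.google', 'search engine')
]

by_key   = find_cell(cells, 'org.example')
by_regex = find_cell(cells, /google$/)
by_block = find_cell(cells) { |cell| cell.contents == 'search engine' }

puts by_key.contents    # reference site
puts by_regex.contents  # search engine
puts by_block.key       # com.google
```
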

Similarly, you can iterate over cells that match an expression:

  ...$/) do |cell|
    puts cell.contents
  end

=== JSON Cells

TODO: document JSON cells

=== ActiveModel Validations

TODO: document ActiveModel Validations on models, cells and

=== Setting timestamps and retrieving by timestamp

First, let's create some Pages with different timestamps.

  a_week_ago = Time.now - 7 * 24 * 3600
  a_month_ago = Time.now - 30 * 24 * 3600
  newer_page = Page.create('', {:title=>'newer google'}, {:timestamp=>a_week_ago})
  older_page = Page.create('', {:title=>'older google'}, {:timestamp=>a_month_ago})

Now you can <tt>get</tt> by the timestamps you just set.

  Page.get('', :timestamp=>a_week_ago).title # => "newer google"
  Page.get('', :timestamp=>a_month_ago).title # => "older google"

If you call <tt>get</tt> without a timestamp, you will get the most recent Page.

  Page.get('').title # => "newer google"
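
The behaviour above can be pictured as a map from timestamps to versions of the same row. The following sketch models that idea in plain Ruby. <tt>fetch_version</tt> is a hypothetical helper, and the exact-match lookup is a simplifying assumption; HBase's real timestamp semantics may differ.

```ruby
# Two timestamps, stored as whole seconds since the epoch.
now = Time.now
a_week_ago  = (now - 7 * 24 * 3600).to_i
a_month_ago = (now - 30 * 24 * 3600).to_i

# Versions of one row's title, keyed by timestamp.
versions = {
  a_month_ago => 'older google',
  a_week_ago  => 'newer google'
}

# Hypothetical lookup: return the version stored at the given timestamp,
# or the version with the largest timestamp when none is given.
def fetch_version(versions, timestamp = nil)
  timestamp ? versions[timestamp] : versions[versions.keys.max]
end

puts fetch_version(versions, a_week_ago)   # newer google
puts fetch_version(versions, a_month_ago)  # older google
puts fetch_version(versions)               # newer google
```
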

== More information

Read the specs in the spec/ directory to see more usage examples. Also look at the spec models in spec/spec_helper.rb.