Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Ruby lib for interacting with a Hadoop JobTracker / TaskTrackers

branch: master

Fetching latest commit…

Cannot retrieve the latest commit at this time

README.rdoc

houdah – Ruby lib for interacting with a Hadoop JobTracker / TaskTrackers

Houdah is a Ruby thrift lib for interfacing with a Hadoop jobtracker.

To install:

gem install houdah

Starting thrift server

Houdah expects that you have Cloudera's Hue installed and working. As a part of the Hue installation, you will compile and install a thrift interface that runs when the jobtracker is running. To be clear, you do not have to run Hue in order for Houdah to work - you just need to make sure that the org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin is working.

Houdah has been tested with Hue 1.1.0. If you have it working with a later version of Hue, patches are welcome.

Basic Usage

require 'rubygems'
require 'houdah'

# args are host, optional port (9290 default), optional username (default "houdah"), and optional timeout
client = Houdah::Client.new "192.168.1.1"

# get list of running jobs
client.jobs

# get a list of failed jobs
client.jobs(:failed)

# get jobtracker name
client.name

# get percent mapped / reduced for all running jobs
client.jobs.each { |j| puts "#{j.percent_mapped} mapped #{j.percent_reduced} reduced" }

# get percent done for the most recent job
client.jobs.first.percent_done

# print out info about each tracker
client.trackers.each { |t| puts "#{t.host}: #{t.failureCount} failures, #{t.tasks.length} active tasks" }

# get number of active / blacklisted trackers
client.status.numActiveTrackers
client.status.numBlacklistedTrackers

# get number of excluded nodes
client.status.numExcludedNodes

# get the mapred.local.dir config variable for the most recent job
puts client.jobs.first.config['mapred.local.dir']

# get the hive query for the hive job running
puts client.jobs.first.config['hive.query.string']

# kill all jobs
client.jobs.each { |j| j.kill! }

# close the client connection
client.close

# Quick and dirty way to get the number of killed jobs.  The run class method takes care of closing the client.
puts Houdah::Client.run("192.168.1.1") { |h| h.jobs(:killed).length }

Bugs / Contact

Bugs and patches welcome at github.com/livingsocial/houdah.

Something went wrong with that request. Please try again.