This repository has been archived by the owner. It is now read-only.

Timberlake is a Job Tracker for Hadoop.


Timberlake is a Go server paired with a React.js frontend. It improves on existing Hadoop job trackers by providing a lightweight realtime view of your running and finished MapReduce jobs. Timberlake exposes the counters and configuration that are the most useful, allowing you to get a quick overview of the whole cluster or dig into the performance and behavior of a single job.

It also provides waterfall and boxplot visualizations for jobs. We've found that these visualizations can be really helpful for figuring out why a job is slow. Is it launching too many mappers and overloading the cluster? Are reducers launching early and starving the mappers? Does the job have reducer skew? You can use the counters of bytes written, shuffled, and read to understand the network and I/O behavior of your jobs. And when there's a crash, Timberlake will show you tracebacks from the logs to help you debug the job.

Timberlake pairs well with Scalding and Cascading. It uses extra data from the Cascading planner to show the relationships between steps, and to clarify which jobs' outputs are used as inputs to other jobs in the flow. Visualizing that flow makes it much easier to figure out which steps are causing bottlenecks.

Finally, we've included a Slackbot that has significantly improved our Hadooping lives. The bot can notify you when your jobs start and finish, and provides links back to Timberlake.


Job Details

List of Jobs


The best way to install is with tarballs, which are available on the release page.

Download it somewhere on your server, and then untar it:

$ tar zxvf timberlake-v1.0.2-linux-amd64.tar.gz
$ mv -T timberlake-v1.0.2-linux-amd64 /opt/timberlake

Now you can start the server:

$ /opt/timberlake/bin/timberlake \
    --bind :8000 \
    --resource-manager-url http://resourcemanager:8088 \
    --history-server-url http://resourcemanager:19888 \
    --namenode-address namenode:9000

And optionally, start the Slackbot:

$ /opt/timberlake/bin/slack \
    --internal-timberlake-url http://localhost:8000 \
    --external-timberlake-url \

You'll need to create a new Incoming Webhook to generate the Slack URL for your bot.

Building from Source

You'll need npm, go, and node on your PATH.

$ go get -u \ \ \

$ git clone
$ cd timberlake
$ make


Timberlake only works with the YARN Resource Manager API. It's been tested on v2.4.x and v2.5.x, but the Kill Job feature uses an endpoint that's only available in v2.5.x+.

Our cluster has 10-40 jobs running simultaneously and about 2,000 jobs running per day. Timberlake's performance has not been tested outside these bounds.
