Timberlake is a Job Tracker for Hadoop.

Intro

Timberlake is a Go server paired with a React.js frontend. It improves on existing Hadoop job trackers by providing a lightweight, real-time view of your running and finished MapReduce jobs. Timberlake exposes the most useful counters and configuration, so you can get a quick overview of the whole cluster or dig into the performance and behavior of a single job.

It also provides waterfall and boxplot visualizations for jobs. We've found that these visualizations can be really helpful for figuring out why a job is slow. Is it launching too many mappers and overloading the cluster? Are reducers launching early and starving the mappers? Does the job have reducer skew? You can use the counters of bytes written, shuffled, and read to understand the network and I/O behavior of your jobs. And when there's a crash, Timberlake will show you tracebacks from the logs to help you debug the job.

Timberlake pairs well with Scalding and Cascading. It uses extra data from the Cascading planner to show the relationships between steps, and to clarify which jobs' outputs are used as inputs to other jobs in the flow. Visualizing that flow makes it much easier to figure out which steps are causing bottlenecks.

Finally, we've included a Slackbot that has significantly improved our Hadooping lives. The bot can notify you when your jobs start and finish, and provides links back to Timberlake.

Screenshots

Job Details

List of Jobs

Installation

The easiest way to install Timberlake is from a release tarball, available on the release page.

Download the tarball to your server, then untar it and move it into place:

$ tar zxvf timberlake-v1.0.2-linux-amd64.tar.gz
$ mv -T timberlake-v1.0.2-linux-amd64 /opt/timberlake

Now you can start the server:

$ /opt/timberlake/bin/timberlake \
    --bind :8000 \
    --resource-manager-url http://resourcemanager:8088 \
    --history-server-url http://resourcemanager:19888 \
    --namenode-address namenode:9000
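
Before starting the server, it can help to confirm that the value you pass as --resource-manager-url really points at a YARN Resource Manager. The sketch below builds the URL for YARN's Cluster Applications REST endpoint and decodes a response body of the shape that endpoint returns. The helper names (runningAppsURL, parseApps) and the sample payload are illustrative assumptions; this is not how Timberlake itself is implemented.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// app holds a few of the fields YARN reports for each application.
type app struct {
	ID    string `json:"id"`
	Name  string `json:"name"`
	State string `json:"state"`
}

// appsResponse mirrors the envelope of GET /ws/v1/cluster/apps.
type appsResponse struct {
	Apps struct {
		App []app `json:"app"`
	} `json:"apps"`
}

// runningAppsURL builds the Cluster Applications URL filtered to running apps.
// (Hypothetical helper, not part of Timberlake.)
func runningAppsURL(resourceManagerURL string) (string, error) {
	u, err := url.Parse(strings.TrimRight(resourceManagerURL, "/") + "/ws/v1/cluster/apps")
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("states", "RUNNING")
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// parseApps decodes a Cluster Applications response body.
// A body of {"apps":null} (no applications) yields an empty slice.
func parseApps(body []byte) ([]app, error) {
	var resp appsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	return resp.Apps.App, nil
}

func main() {
	u, err := runningAppsURL("http://resourcemanager:8088")
	if err != nil {
		panic(err)
	}
	fmt.Println("GET", u)

	// Sample body for illustration; a real one comes from fetching the URL above.
	sample := []byte(`{"apps":{"app":[{"id":"application_1_0001","name":"wordcount","state":"RUNNING"}]}}`)
	apps, err := parseApps(sample)
	if err != nil {
		panic(err)
	}
	for _, a := range apps {
		fmt.Println(a.ID, a.Name, a.State)
	}
}
```

If fetching that URL from the Timberlake host fails or returns something other than JSON, the Resource Manager address is the first thing to check.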

And optionally, start the Slackbot:

$ /opt/timberlake/bin/slack \
    --internal-timberlake-url http://localhost:8000 \
    --external-timberlake-url https://timberlake.example.com \
    --slack-url https://hooks.slack.com/services/...

You'll need to create a new Incoming Webhook to generate the Slack URL for your bot.
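
To check a new webhook URL independently of the bot, you can send it a test message yourself. Slack Incoming Webhooks accept a JSON body with a text field via HTTP POST. The following is a minimal sketch of building such a request; the helper names, the placeholder webhook URL, and the message text are assumptions for illustration and do not reflect the format the Timberlake Slackbot actually sends.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// slackMessage is the minimal Incoming Webhook payload: a single text field.
type slackMessage struct {
	Text string `json:"text"`
}

// slackPayload encodes a message as the JSON body an Incoming Webhook accepts.
func slackPayload(text string) ([]byte, error) {
	return json.Marshal(slackMessage{Text: text})
}

// slackRequest builds (but does not send) the POST to the webhook URL.
func slackRequest(webhookURL, text string) (*http.Request, error) {
	body, err := slackPayload(text)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, webhookURL, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder webhook URL and message text, for illustration only.
	req, err := slackRequest(
		"https://hooks.slack.com/services/EXAMPLE",
		"Job finished: https://timberlake.example.com",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// To actually deliver the message, pass req to http.DefaultClient.Do.
}
```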

Building from Source

You'll need npm, Go, and Node.js on your PATH.

$ git clone https://github.com/stripe/timberlake.git
$ cd timberlake
$ make

Limitations

Timberlake only works with the YARN Resource Manager API. It's been tested on Hadoop v2.4.x and v2.5.x, but the Kill Job feature uses an endpoint that's only available in v2.5.x and later.
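
For context on that version requirement: YARN's REST API lets a client kill an application by sending a PUT to the application's state resource with the target state KILLED. The sketch below builds such a request. It assumes an unsecured cluster and a YARN application ID (application_..., not job_...); the helper name killRequest is hypothetical, and Timberlake's own kill code may differ.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// killBody is the state change sent to the Cluster Application State API.
const killBody = `{"state":"KILLED"}`

// killRequest builds (but does not send) the PUT that asks the Resource
// Manager to kill an application. (Hypothetical helper, not Timberlake's code.)
func killRequest(resourceManagerURL, appID string) (*http.Request, error) {
	endpoint := strings.TrimRight(resourceManagerURL, "/") +
		"/ws/v1/cluster/apps/" + appID + "/state"
	req, err := http.NewRequest(http.MethodPut, endpoint, strings.NewReader(killBody))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// The application ID here is a made-up example.
	req, err := killRequest("http://resourcemanager:8088", "application_1_0001")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// To actually kill the application, pass req to http.DefaultClient.Do.
}
```

On a Resource Manager older than v2.5.x this state resource does not exist, so the request will not succeed.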

Our cluster runs 10-40 jobs simultaneously and about 2,000 jobs per day. Timberlake's performance has not been tested outside these bounds.