doopla

H(ad)oopla! A Python script to fetch the output of failed Python Hadoop streaming jobs. It scrapes the Hadoop web interface and picks a random failed mapper and reducer task, then prints the output with code highlighting for easy reading.

doopla -h

Usage:
doopla [<jobid>]
doopla -h | --help
doopla --version

Options:
-h --help       Show this screen.
--version       Show version.

Features

  • Automatically get the last failed job for a user
  • Code highlighting via Pygments.
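
As an illustration of the highlighting feature, here is a minimal sketch of colouring a Python traceback for the terminal with Pygments. It is not taken from doopla's source: the `render_traceback` helper and the sample traceback are hypothetical, and the sketch falls back to plain text if Pygments is not installed.

```python
def render_traceback(text):
    """Return `text` with ANSI colours if Pygments is available, else unchanged.

    Hypothetical helper for illustration; doopla's own code may differ.
    """
    try:
        from pygments import highlight
        from pygments.formatters import TerminalFormatter
        from pygments.lexers import PythonTracebackLexer
    except ImportError:
        # Pygments is missing: degrade gracefully to plain output.
        return text
    return highlight(text, PythonTracebackLexer(), TerminalFormatter())


# A made-up traceback of the kind a failed streaming task might log.
SAMPLE_TRACEBACK = (
    "Traceback (most recent call last):\n"
    '  File "mapper.py", line 7, in <module>\n'
    "    ratio = total / count\n"
    "ZeroDivisionError: division by zero\n"
)

if __name__ == "__main__":
    print(render_traceback(SAMPLE_TRACEBACK))
```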

Install

Two options for installing:

Via pip:

pip install doopla

Via git clone and setup.py:

git clone git@github.com:trustyou/doopla.git
cd doopla
python setup.py install

Usage

Before using doopla, create a file called .doopla in your home directory and add the following:

[main]
hadoop_version: <HADOOP_VERSION> # either 1 or 2 - defaults to 2
hadoop_user: <HADOOP_USER>
hadoop_url: <HADOOP_URL> # For Hadoop 2.x use the Job history URL
http_user: <USER>
http_password: <THE_PASSWORD>

Replace HADOOP_URL with the HTTP URL of your Hadoop web interface, HADOOP_USER with your Hadoop user (or the one you want to check), and USER and THE_PASSWORD with the HTTP credentials you normally use to log into the web interface.
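
To check that a .doopla file is well-formed, the format above can be read with Python's standard `configparser`. This is a minimal sketch, not doopla's actual loader: the `load_settings` helper and the sample values (including the example URL) are assumptions made for illustration.

```python
import configparser

# Sample contents in the format shown above; all values are made up.
SAMPLE = """\
[main]
hadoop_version: 2 # either 1 or 2 - defaults to 2
hadoop_user: alice
hadoop_url: https://hadoop.example.com:19888
http_user: alice
http_password: secret
"""


def load_settings(text):
    """Parse .doopla-style text into a dict (hypothetical helper)."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",),  # allow "value # comment" lines
        interpolation=None,              # keep "%" in passwords literal
    )
    parser.read_string(text)
    main = parser["main"]
    return {
        # The README states that the version defaults to 2.
        "hadoop_version": main.getint("hadoop_version", fallback=2),
        "hadoop_user": main["hadoop_user"],
        "hadoop_url": main["hadoop_url"],
        "http_user": main["http_user"],
        "http_password": main["http_password"],
    }


if __name__ == "__main__":
    print(load_settings(SAMPLE))
```

To read the real file instead of the sample string, pass the contents of `os.path.expanduser("~/.doopla")` to the same function.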

Then it is simply a matter of executing:

$ doopla

It will search for the most recently failed job and get the output.

Or, to get the output of a specific job:

$ doopla JOB_ID

You can also append 2>/dev/null if you want to silence the HTTPS certificate warnings.

Development

This is a four-hour hack written while skipping lunch and waiting for a job to finish, so it is in alpha stage and full of bugs. Feel free to create pull requests if you see something that can be improved.

Requirements

  • Python >= 2.6 or >= 3.3
  • Colorama
  • BeautifulSoup
  • Requests
  • Pygments

License

MIT licensed. See the bundled LICENSE file (https://github.com/mfcabrera/doopla/blob/master/LICENSE) for more details.
