

The "helpful" contrib is an example of using Bixo to mine the
Hadoop mailing list archives to find the "most helpful Hadooper".

This was created as part of a presentation at the Bay Area
Hadoop User Group meetup in Sept 2009. The URL for this is:

It should also be available at

In a nutshell, this web mining app is a Cascading workflow that
uses a Bixo FetchPipe to fetch pages and mailbox archive files,
then parses the mailbox archives using a custom Tika mbox parser.
The analysis phase awards points to a Hadoop mailing list user
whenever one of their posts draws a reply that expresses
thanks or gratitude.
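
The scoring idea can be illustrated with a small plain-Java sketch.
This is not the contrib's actual Cascading code. The Message class,
the thanks pattern, the one-point-per-reply rule, and the choice to
ignore people thanking themselves are all assumptions made for
illustration only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class HelpfulScoreSketch {

    /** Hypothetical, simplified view of one parsed mailing list message. */
    static final class Message {
        final String id;        // Message-ID header
        final String sender;    // address of the poster
        final String inReplyTo; // Message-ID being answered, or null
        final String body;

        Message(String id, String sender, String inReplyTo, String body) {
            this.id = id;
            this.sender = sender;
            this.inReplyTo = inReplyTo;
            this.body = body;
        }
    }

    // Assumed heuristic for "some expression of thanks/gratitude".
    private static final Pattern THANKS = Pattern.compile(
        "\\b(thanks|thank you|thx|appreciate\\w*)\\b",
        Pattern.CASE_INSENSITIVE);

    /** Gives one point to a poster for each reply to them that contains thanks. */
    static Map<String, Integer> score(List<Message> messages) {
        // Index senders by message id so replies can be matched to the original poster.
        Map<String, String> senderById = new HashMap<>();
        for (Message m : messages) {
            senderById.put(m.id, m.sender);
        }

        Map<String, Integer> points = new HashMap<>();
        for (Message reply : messages) {
            if (reply.inReplyTo == null) {
                continue; // not a reply
            }
            String helper = senderById.get(reply.inReplyTo);
            if (helper == null || helper.equals(reply.sender)) {
                continue; // unknown parent, or a self-reply (assumed not to count)
            }
            if (THANKS.matcher(reply.body).find()) {
                points.merge(helper, 1, Integer::sum);
            }
        }
        return points;
    }

    public static void main(String[] args) {
        List<Message> thread = Arrays.asList(
            new Message("1", "asker@example.com", null, "How do I change the block size?"),
            new Message("2", "helper@example.com", "1", "Set it in your site configuration."),
            new Message("3", "asker@example.com", "2", "Thanks, that worked!"));
        System.out.println(score(thread));
    }
}
```

In the real workflow the messages would come from the parsed mbox
archives rather than an in-memory list, and the matching would run
as part of the Cascading flow.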

To build this code:

% cd <path to Bixo>/contrib/helpful
% ant clean compile

To create an Eclipse project:

% cd <path to Bixo>/contrib/helpful
% ant eclipse

Then import the project into your Eclipse workspace.
