
Hadoop Tools

Tools for working with Hadoop written with performance in mind.

This has been tested with the HDFS protocol used by CDH 5.x.

Where can I get it?

See our latest release v1.0.1!


By default, hh behaves the same as hdfs dfs or hadoop fs when choosing which user name and which namenodes to use for HDFS.


The default is to use your current unix username when accessing HDFS.

This can be overridden either by using the HADOOP_USER_NAME environment variable:

# This trick also works with `hdfs dfs` and `hadoop fs`
export HADOOP_USER_NAME=amber

or by adding the following to your ~/.hh configuration file:

hdfs {
  user = "amber"
}
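
As a rough illustration of the environment-variable override, the sketch below reports which user name would apply when only HADOOP_USER_NAME and the Unix login are considered. The function name effective_hdfs_user is hypothetical, the ~/.hh file is ignored, and treating an empty variable as unset is an assumption rather than documented hh behaviour.

```shell
# Hypothetical helper, not part of hh: print the user name implied by the
# environment alone. HADOOP_USER_NAME wins if set and non-empty; otherwise
# fall back to the current Unix user name.
effective_hdfs_user() {
  printf '%s\n' "${HADOOP_USER_NAME:-$(id -un)}"
}

effective_hdfs_user
```

Prefixing a single command with HADOOP_USER_NAME=amber limits the override to that command, whereas export keeps it for the rest of the shell session.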


The default is to look up the namenode configuration from /etc/hadoop/conf/core-site.xml and /etc/hadoop/conf/hdfs-site.xml.

This can be overridden by adding the following to your ~/.hh configuration file:

namenode {
  host = "hostname or ip address"
}

or if you're using a non-standard namenode port:

namenode {
  host = "hostname or ip address"
  port = 7020 # defaults to 8020
}
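
If you manage several machines, it may be handy to generate this block rather than type it. The following is a minimal sketch: namenode_block is a hypothetical helper and nn1.example.com is a placeholder host. It only prints text in the syntax shown above and does not touch ~/.hh.

```shell
# Hypothetical helper: print a namenode block in the syntax shown above.
# $1 = host, $2 = port (optional; falls back to 8020, the default stated above).
namenode_block() {
  host=$1
  port=${2:-8020}
  printf 'namenode {\n  host = "%s"\n  port = %s\n}\n' "$host" "$port"
}

# Placeholder host on a non-standard port.
namenode_block nn1.example.com 7020
```

Review the output before appending it to your ~/.hh file, since the file may already contain a namenode block.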

NOTE: You cannot currently specify multiple namenodes in the ~/.hh config file, but this would be easy to add. If you would like this feature, please open an issue.


Sometimes it can be convenient to access HDFS over a SOCKS proxy. The easiest way to set this up is to run ssh <host> -D1080, where <host> is a server that can reach the namenode. This sets up a SOCKS proxy locally on port 1080 which can access everything that <host> can access.

To get hh to make use of this proxy, add the following to your ~/.hh configuration file:

proxy {
  host = ""
  port = 1080
}
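
The tunnel from the ssh command above can be wrapped in a small function so that it does not also open an interactive shell. This is only a sketch: socks_tunnel and the DRY_RUN switch are hypothetical conveniences, while -D (local SOCKS port) and -N (do not run a remote command) are standard OpenSSH options.

```shell
# Hypothetical wrapper around the ssh tunnel described above.
# $1 = a host that can reach the namenode, $2 = local SOCKS port (default 1080).
# With DRY_RUN=1 the command is printed instead of executed.
socks_tunnel() {
  gateway=$1
  port=${2:-1080}
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "ssh -N -D $port $gateway"
  else
    ssh -N -D "$port" "$gateway"
  fi
}
```

Run it in a separate terminal (or in the background) and leave it open for as long as hh needs the proxy.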

Kerberos / SASL

To use Kerberos authentication you must supply the principal for both your user and your namenode. These are looked up in /etc/hadoop/conf/core-site.xml and /etc/hadoop/conf/hdfs-site.xml by default. To set them explicitly, add the following to your ~/.hh configuration file:

namenode {
  principal = "hdfs/hostname@REALM.COM"
}

auth {
  user = "username@REALM.COM"
}

If you don't provide an auth.user field, hh assumes it is hdfs.user@REALM.COM, where REALM.COM comes from the principal of one of the namenodes.
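
The fallback rule can be illustrated with plain shell string handling. This is a sketch of the rule as stated, not hh's actual implementation; default_auth_user is a hypothetical name, and the realm is assumed to be everything after the last @ in the namenode principal.

```shell
# Hypothetical illustration of the fallback: <hdfs.user>@<realm>, where the
# realm is taken from a namenode principal such as hdfs/hostname@REALM.COM.
default_auth_user() {
  hdfs_user=$1
  nn_principal=$2
  realm=${nn_principal##*@}   # strip everything up to and including the last '@'
  printf '%s@%s\n' "$hdfs_user" "$realm"
}

default_auth_user amber hdfs/hostname@REALM.COM   # prints amber@REALM.COM
```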
