Hadoop Tools

Join the chat at https://gitter.im/CommBank/hadoop-tools

Tools for working with Hadoop, written with performance in mind.

The tools have been tested against the HDFS protocol used by CDH 5.x.

Where can I get it?

See our latest release v0.7.2!

Configuration

By default, hh chooses the HDFS user name and the namenodes the same way that hdfs dfs and hadoop fs do.

User

By default, hh accesses HDFS as your current Unix user.

This can be overridden either by using the HADOOP_USER_NAME environment variable:

# This trick also works with `hdfs dfs` and `hadoop fs`
export HADOOP_USER_NAME=amber

or by adding the following to your ~/.hh configuration file:

hdfs {
  user = "amber"
}
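The default and the environment-variable override described above can be sketched as a small shell function. This is only an illustration: effective_hdfs_user is a hypothetical helper, not part of hh, and it does not model the ~/.hh setting, whose precedence relative to HADOOP_USER_NAME is not stated here.

```shell
# Hypothetical helper (not part of hh): print the user name that the
# rules above would select, ignoring any ~/.hh override.
effective_hdfs_user() {
  if [ -n "${HADOOP_USER_NAME:-}" ]; then
    # The environment variable overrides the default.
    printf '%s\n' "$HADOOP_USER_NAME"
  else
    # Default: the current Unix user name.
    id -un
  fi
}

effective_hdfs_user
```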

Namenode

By default, hh looks up the namenode configuration in /etc/hadoop/conf/core-site.xml and /etc/hadoop/conf/hdfs-site.xml.

This can be overridden by adding the following to your ~/.hh configuration file:

namenode {
  host = "hostname or ip address"
}

or, if you're using a non-standard namenode port:

namenode {
  host = "hostname or ip address"
  port = 7020 # defaults to 8020
}

NOTE: The ~/.hh config file cannot currently specify multiple namenodes, but this would be easy to add. If you would like this feature, please open an issue.
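Before overriding the namenode, it can help to check what the Hadoop configuration files already say. The sketch below prints the standard Hadoop fs.defaultFS property from a core-site.xml file. default_fs is a hypothetical helper, not part of hh. It assumes the usual layout, with <name> immediately followed by <value> inside each <property>, and it does not cover HA settings in hdfs-site.xml. Exactly which properties hh reads is an assumption here.

```shell
# Hypothetical helper (not part of hh): print the fs.defaultFS value
# from the given core-site.xml file.
default_fs() {
  # Join the file into a single line so the <name>/<value> pair can be
  # matched even when the two elements sit on separate lines.
  { tr -d '\n' < "$1"; echo; } |
    sed -n 's/.*<name>fs\.defaultFS<\/name>[[:space:]]*<value>\([^<]*\)<\/value>.*/\1/p'
}

# Usage:
#   default_fs /etc/hadoop/conf/core-site.xml
```

The host and port in the printed URI are the values you would put into the namenode block of ~/.hh.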

SOCKS Proxy

Sometimes it is convenient to access HDFS over a SOCKS proxy. The easiest way to set this up is to run ssh <host> -D1080, where <host> is a server that can reach the namenode. This opens a SOCKS proxy on local port 1080 that can reach everything <host> can.

To get hh to make use of this proxy, add the following to your ~/.hh configuration file:

proxy {
  host = "127.0.0.1"
  port = 1080
}
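The two steps above can be wrapped in a pair of shell functions. This is a sketch rather than anything hh ships: start_tunnel and proxy_stanza are hypothetical names, and -N (a standard OpenSSH flag) is added so that ssh opens the tunnel without starting a remote shell.

```shell
# Hypothetical helper: open a SOCKS proxy on a local port (default 1080)
# through HOST. Runs in the foreground until interrupted.
start_tunnel() {
  ssh -N -D "${2:-1080}" "$1"
}

# Hypothetical helper: print the matching proxy block for ~/.hh.
proxy_stanza() {
  printf 'proxy {\n  host = "127.0.0.1"\n  port = %s\n}\n' "${1:-1080}"
}

# Usage (in two terminals, or background the first command):
#   start_tunnel <host> 1080
#   proxy_stanza 1080
```

Review the printed block before copying it into ~/.hh, so that an existing proxy section is not duplicated.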
