Shashank Sahni edited this page Oct 13, 2013 · 6 revisions

To automate our deployments we decided to use Chef to configure Hadoop. To avoid the time spent installing Hadoop on each instance, we use images with Hadoop pre-installed and its daemons configured as services. We are currently using pre-configured Ubuntu images.

The steps are briefly outlined below.

- Download and install the Hadoop 1.2.1 .deb package from http://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/
- In /etc/init.d/hadoop-{jobtracker,tasktracker}, add a line exporting the user: `export USER=mapred`. Apparently, the Hadoop user was not being set correctly when the daemons were launched via start-stop-daemon.
- In /etc/init.d/hadoop-{namenode,datanode}, add a line exporting the user: `export USER=hdfs`.
- In /etc/init.d/hadoop-namenode, add the `-nonInteractive` option to the format() command.
- Install s3fs.
- Include the Hadoop compression libraries.
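The init-script edits above amount to the following fragments (a sketch: the exact format invocation inside hadoop-namenode is an assumption based on how Hadoop 1.2.1 formats HDFS, so match it against the actual script):

```shell
# /etc/init.d/hadoop-jobtracker and /etc/init.d/hadoop-tasktracker:
# near the top, after the shebang, so the MapReduce daemons run as the
# right user even when launched via start-stop-daemon
export USER=mapred

# /etc/init.d/hadoop-namenode and /etc/init.d/hadoop-datanode:
# same idea for the HDFS daemons
export USER=hdfs

# /etc/init.d/hadoop-namenode: inside the format() command, add
# -nonInteractive so a first boot never blocks on the re-format prompt
# (the exact command line here is an assumption, not taken from the script):
hadoop namenode -format -nonInteractive
```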

Our AMIs are public:

- ami-bddd8dd4 (us-east-1)
- ami-74ce5744 (us-west-2)
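Launching a node from one of these images can be sketched with an EC2 client; the example below assumes the AWS CLI, and the instance type, key pair name, and any security-group settings are placeholders to substitute with your own:

```shell
# Start one instance from the public us-east-1 image.
# --instance-type and --key-name are placeholders, not recommendations.
aws ec2 run-instances \
  --region us-east-1 \
  --image-id ami-bddd8dd4 \
  --instance-type m1.large \
  --key-name my-keypair
```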

Note that we update these images regularly; this page will be kept up to date with the most recent image IDs.

Our cookbook can be found at https://github.com/siel-iiith/hadoop-cookbook.
