apachecon-bigtop

Resources for the demo given during the Bigtop presentation at ApacheCon 2013

Introduction

This repository contains data and code for a beginner's introduction to Apache Bigtop. All instructions are adapted from the instructions on the Bigtop wiki page for 0.5.0.

File list and description

README.md: This file, with instructions for your use
demo-setup.sh: The script I used to provision a single node for my demo. It installs some essential tools and software, downloads a well-known, publicly available dataset, and loads it into a relational database. I ran it on a 64-bit Ubuntu Lucid machine; it should also work on other Ubuntu/Debian variants and can be easily ported to RPM-based systems. I haven't had a chance to port it, but if you do, please send me a pull request, thanks! Ubuntu doesn't ship JDK 6 by default, so the script also installs Oracle JDK 6. Once the script finishes, your machine is ready for installing Bigtop as per the instructions below.
median_income_by_zipcode_census_2000.zip: A household income dataset from the 2000 United States Census, for use in the demo.

Instructions

Adapted from the Bigtop wiki page

Add Bigtop key so you can use the Bigtop artifacts with apt-get

wget -O- http://archive.apache.org/dist/bigtop/bigtop-0.5.0/repos/GPG-KEY-bigtop | sudo apt-key add -

Add Bigtop list, so apt-get knows where to find the Bigtop artifacts

sudo wget -O /etc/apt/sources.list.d/bigtop-0.5.0.list http://archive.apache.org/dist/bigtop/bigtop-0.5.0/repos/`lsb_release --codename --short`/bigtop.list
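The backtick substitution in the command above fills in your release codename (for example `lucid`) so that the matching repository list is fetched. As a minimal sketch of how that URL is assembled: the function name `bigtop_list_url` is hypothetical, introduced here only for illustration, and it just prints the URL without downloading anything.

```shell
# Hypothetical helper: print the Bigtop 0.5.0 apt list URL for a codename.
# Falls back to the codename reported by lsb_release when no argument is given.
bigtop_list_url() {
    codename="${1:-$(lsb_release --codename --short)}"
    printf 'http://archive.apache.org/dist/bigtop/bigtop-0.5.0/repos/%s/bigtop.list\n' "$codename"
}
```

Calling `bigtop_list_url lucid` prints the same URL the wget command would request on Ubuntu Lucid, which can be handy for checking that a list exists for your release before adding it.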

Update apt-get so it sees our newly added Bigtop repository

sudo apt-get update

Install the pseudo-distributed Hadoop package from Apache Bigtop

sudo apt-get install hadoop-conf-pseudo

Initialize the namenode (this needs to be done only once). Don't run it again: it will format (i.e. wipe) the data in your cluster (i.e. the data on HDFS).

sudo service hadoop-hdfs-namenode init

Start the namenode and datanode

sudo service hadoop-hdfs-namenode start
sudo service hadoop-hdfs-datanode start
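Because the namenode init step earlier must run only once, a small guard can help avoid wiping HDFS by accident. The sketch below relies on the fact that a formatted namenode directory contains a `current/VERSION` file. The function name `namenode_is_formatted` is hypothetical, and the default directory is an assumption about the pseudo-distributed configuration; check `dfs.namenode.name.dir` in your hdfs-site.xml for the real location.

```shell
# Hypothetical guard: succeed only if the given namenode directory already
# holds a formatted filesystem (marked by a current/VERSION file).
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Assumed location; verify against dfs.namenode.name.dir in hdfs-site.xml.
NAME_DIR="${NAME_DIR:-/var/lib/hadoop-hdfs/cache/hdfs/dfs/name}"

if namenode_is_formatted "$NAME_DIR"; then
    echo "namenode already initialized, skipping init"
else
    echo "namenode not initialized; run: sudo service hadoop-hdfs-namenode init"
fi
```

The script only reports what it finds and never runs the init itself, so the destructive command stays a deliberate manual step.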

Initialize HDFS. This creates a bunch of directories on HDFS that are required for YARN to run

./init-hdfs.sh

Restart the YARN daemons. YARN needs the directories created by the previous step to work properly; since we just created them, restart the daemons so they pick them up.

sudo service hadoop-yarn-resourcemanager restart
sudo service hadoop-yarn-nodemanager restart

Time to run our first MapReduce job. It runs using MapReduce v2, on top of YARN.

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples*.jar pi 10 1000

If you want to install more artifacts, all you need to do is run a simple apt-get command

sudo apt-get install hive sqoop

To use Sqoop with MySQL, you need to download the MySQL connector and drop it into Sqoop's lib directory. It isn't shipped with Sqoop for licensing reasons.

curl -L 'http://www.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.24.tar.gz/from/http://mysql.he.net/' | tar xz
sudo cp mysql-connector-java-5.1.24/mysql-connector-java-5.1.24-bin.jar /usr/lib/sqoop/lib/
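Before running an import, it can be worth confirming that the connector jar actually landed in Sqoop's lib directory. A minimal sketch, assuming the jar keeps its `mysql-connector-java-*.jar` name; the function name `has_mysql_connector` and the `SQOOP_LIB` variable are hypothetical and exist only for this example.

```shell
# Hypothetical check: succeed if a MySQL connector jar exists in the directory.
has_mysql_connector() {
    for jar in "$1"/mysql-connector-java-*.jar; do
        # When nothing matches, the glob stays literal and -f fails.
        if [ -f "$jar" ]; then
            return 0
        fi
    done
    return 1
}

# Directory used by the cp command above; override SQOOP_LIB if yours differs.
SQOOP_LIB="${SQOOP_LIB:-/usr/lib/sqoop/lib}"

if has_mysql_connector "$SQOOP_LIB"; then
    echo "MySQL connector found in $SQOOP_LIB"
else
    echo "MySQL connector missing from $SQOOP_LIB"
fi
```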

Now we can run the command to import our table containing census data from MySQL to Hive, using Sqoop

sqoop import --connect jdbc:mysql://localhost/demo --table zipcode_incomes --username root -P -m 1 --create-hive-table --hive-import --hive-overwrite
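The import command above packs several flags into one line. The sketch below spells them out in comments and rebuilds the same command for an arbitrary database, table, and user. The wrapper name `sqoop_import_cmd` is hypothetical, and it only prints the command rather than running it, so you can review it first.

```shell
# Hypothetical wrapper: print (do not run) the Sqoop import command.
#   --connect            JDBC URL of the source MySQL database
#   --table              source table to import
#   --username / -P      database user; -P prompts for the password
#   -m 1                 use a single map task for the import
#   --create-hive-table  create the Hive table (the job fails if it exists)
#   --hive-import        load the imported data into Hive
#   --hive-overwrite     overwrite existing data in the Hive table
sqoop_import_cmd() {
    db="$1"
    table="$2"
    user="$3"
    echo sqoop import --connect "jdbc:mysql://localhost/$db" \
        --table "$table" --username "$user" -P -m 1 \
        --create-hive-table --hive-import --hive-overwrite
}

sqoop_import_cmd demo zipcode_incomes root
```

With the arguments shown, the printed line matches the command used in this demo.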
