Create a box which include openjdk 1.7, Apache Hadoop 1.2.1 and Apache Hive 0.11.0
Shell
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
hadoop-hive-installing
README.md
Vagrantfile
hedgefund-data.sql

README.md

vagrant-hadoop-hive

Create a box which includes

  • OpenJDK 1.7
  • Apache Hadoop 1.2.1
  • Apache Hive 0.11.0
  • Mysql 5.5
  • Apache Sqoop 1.4.4

Use

Required:

  • Installed latest Vagrant
  • Installed latest Virtual box

Clone this project to local machine and run

vagrant up

This would up the ubuntu server box and install some packages. Once running is done, you should see the "Good luck!" message, so everything shoud be ready. NOTE: If you running on Windows, you might meet the error "/bin/sh^M: bad interpreter: No such file or directory". for fixing:

sudo perl -i -pe 'y|\r||d' install.sh

and run "vagrang up" again

  • SSH to VM

      vagrant ssh
    
  • Create hdfs folder

      mkdir /home/vagrant/hdfs
      chmod -R 777 /home/vagrant/hdfs/
    
  • Format Hadoop HDFS

      hadoop namenode -format -force
    
  • For starting hadoop without asking the password

      ssh-keygen -t dsa -P '' -f /home/vagrant/.ssh/id_dsa
      cat /home/vagrant/.ssh/id_dsa.pub >> /home/vagrant/.ssh/authorized_keys
    
  • Starting Hadoop

      start-all.sh
    
  • Starting Hive

      hive
    
  • For testing, create a sample table

      hive>CREATE TABLE pokes (foo INT, bar STRING);
    
  • Use Sqoop

      sqoop help
    
  • Use MySql

      mysql -uroot -proot
    

Access DFS on browser http://dev-hadoop:50070/dfshealth.jsp

Access job tracker on browser http://dev-hadoop:50030/jobtracker.jsp

Test import data from mysql to hive

  • Move to /vagrant

      cd /vagrant
    
  • Create mysql database

      mysql -uroot -proot create database test_db
    
  • Import sample database to mysql

      mysql test_db -uroot -proot < hedgefund-data.sql
    
  • Create a Hive database called "test_db"

      hive> create database test_db;
    
  • Import mysql table to Hive

      sqoop import --verbose --fields-terminated-by ',' --connect jdbc:mysql://localhost/test_db --table filings --username root --password root --hive-import --warehouse-dir /user/hive/warehouse/test_db.db --fields-terminated-by ',' --split-by id --hive-database test_db --hive-table filings
    

Notes

  • filings a sample table.
  • dev-hadoop is forwarding to 192.168.33.15 automatic, this is private ip of Virtual machine, you can change in Vagrantfile
  • When vagrant up again, you must start hadoop manualy too.

Feel free to give me some comment over email congdng@gmail.com. Thanks~