Hog is an interactive development environment created to help analysts and developers write Apache Pig scripts with minimal knowledge of the language. The web application provides tools for generating scripts, analyzing their output, and archiving them. In the Simple environment, users generate scripts via drag and drop. The Complex side of the application is a development environment where developers can create, save, and analyze the output of their scripts.
Install Java
- Mac OS
brew cask install java
- CentOS
yum install -y java-1.8.0-openjdk
or
yum install -y java-1.7.0-openjdk
- Ubuntu
apt-get install default-jdk
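Whichever platform you are on, a quick way to confirm the JDK is available is to print its version:
java -version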
Install Apache Pig Client
- Mac OS
brew install pig
- CentOS or Ubuntu
mkdir -p /etc/pig/0.15.0 && \
cd /etc/pig/0.15.0 && \
wget -q -P /etc/pig/0.15.0 'http://apache.cs.utah.edu/pig/pig-0.15.0/pig-0.15.0.tar.gz' && \
tar xf pig-0.15.0.tar.gz && \
rm pig-0.15.0.tar.gz && \
chown root:root -R pig-0.15.0 && \
mv pig-0.15.0/* . && \
rm -rf pig-0.15.0
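As a quick sanity check of the tarball install (assuming the paths above and a working Java), print the Pig version:
/etc/pig/0.15.0/bin/pig -version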
Install Apache Hadoop Client
- Mac OS
brew install hadoop
- CentOS or Ubuntu
mkdir -p /etc/hadoop/2.7.1 && \
cd /etc/hadoop/2.7.1 && \
wget -q -P /etc/hadoop/2.7.1 'http://apache.cs.utah.edu/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz' && \
tar xf hadoop-2.7.1.tar.gz && \
rm hadoop-2.7.1.tar.gz && \
chown root:root -R hadoop-2.7.1 && \
mv hadoop-2.7.1/* . && \
rm -rf hadoop-2.7.1
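The Hadoop tarball can be checked the same way (again assuming the paths above):
/etc/hadoop/2.7.1/bin/hadoop version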
To configure your Pig client to run locally, edit the pig.properties file and set this parameter:
exectype=local
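With the tarball layout above, pig.properties lives in the conf directory; appending the line is a minimal sketch (a brew install keeps its conf elsewhere, so adjust the path accordingly):
echo 'exectype=local' >> /etc/pig/0.15.0/conf/pig.properties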
To configure your Hadoop and Pig clients to work with an existing Hadoop cluster, follow the steps below, either by updating your .bashrc or by exporting the environment variables directly:
- Configure Hadoop Client
export HADOOP_HOME=/etc/hadoop/2.7.1
export HADOOP_PREFIX=$HADOOP_HOME
export PATH=$HADOOP_HOME/bin:$PATH
From an existing Hadoop cluster, copy its hadoop/conf/*.xml files into your $HADOOP_HOME/etc/hadoop/ directory.
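A minimal sketch of that copy, assuming SSH access to a cluster node (user and cluster-node are placeholders for your environment):
scp user@cluster-node:/etc/hadoop/conf/*.xml $HADOOP_HOME/etc/hadoop/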
- Configure Pig Client
export PIG_HOME=/etc/pig/0.15.0
export PATH=$PIG_HOME/bin:$PATH
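If you take the .bashrc route mentioned above, appending both sets of exports keeps them across shells; a sketch assuming the install paths used here:
cat >> ~/.bashrc <<'EOF'
export HADOOP_HOME=/etc/hadoop/2.7.1
export HADOOP_PREFIX=$HADOOP_HOME
export PIG_HOME=/etc/pig/0.15.0
export PATH=$HADOOP_HOME/bin:$PIG_HOME/bin:$PATH
EOF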
To test your Hadoop client, run:
hadoop fs -ls /
This should display any files and subdirectories under the root (/) directory of HDFS.
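If the listing works, a fuller round trip writes and reads back a file; /tmp/hog-test is an arbitrary scratch path, not something Hog requires:
hadoop fs -mkdir -p /tmp/hog-test
echo 'hello' | hadoop fs -put - /tmp/hog-test/hello.txt
hadoop fs -cat /tmp/hog-test/hello.txt
hadoop fs -rm -r /tmp/hog-test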
To test your Pig client, run:
pig -x mapreduce
This will connect to HDFS and your YARN Resource Manager from within the Pig grunt shell.
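Beyond opening the grunt shell, a tiny end-to-end script exercises the client. This sketch runs in local mode against /etc/passwd, so it works even before a cluster is wired up:
cat > /tmp/smoke.pig <<'EOF'
-- load /etc/passwd as colon-delimited records and print the usernames
users = LOAD '/etc/passwd' USING PigStorage(':') AS (name:chararray);
DUMP users;
EOF
pig -x local /tmp/smoke.pig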
- Node.js >= v6.5.0
The recommended way to install Node.js is with NVM:
nvm install 6
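Afterwards, confirm the toolchain nvm activated:
node --version
npm --version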
git clone https://github.com/KeyWCorp/hog.git
cd hog
You will need to install node_modules and bower_components. (If prompted to choose a dependency version, pick the option that corresponds to hog.)
npm run build
If running as root, use this option instead:
npm run build:root
gulp migrate
mv /path/to/hog/server/api/pig/pig.data.db.mig /path/to/hog/server/api/pig/pig.data.db
mv /path/to/hog/server/api/settings/settings.data.db.mig /path/to/hog/server/api/settings/settings.data.db
Move others as needed.
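If several .mig files need the same treatment, a small sketch renames every migrated database file in one pass (same /path/to/hog placeholder as above):
find /path/to/hog/server/api -name '*.data.db.mig' | while read -r f; do
  mv "$f" "${f%.mig}"
done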
Open a second terminal
cd /path/to/hog
npm start
A tab will automatically open in your browser at 'localhost:9000' with the application running. We find Hog runs best in the Chrome web browser.
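If no tab opens automatically, you can check that the server is listening by hand (assuming curl is installed):
curl -sI http://localhost:9000 | head -n 1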