Quick Setup Instructions (Must Read)

Shubham Gupta edited this page Feb 27, 2020 · 5 revisions

Step 1: Create an account on GitHub and fork the Dr. Elephant project.

Step 2: Check out the code.

$> git clone https://github.com/<username>/dr-elephant
$> cd dr-elephant*

Step 3: Prerequisites:

  1. You must have the play or activator command installed. Download the activator zip from https://downloads.typesafe.com/typesafe-activator/1.3.12/typesafe-activator-1.3.12.zip, unzip it, and add the activator command to your $PATH. For older versions of Play, add the play command instead of activator.
     export ACTIVATOR_HOME=/path/to/unzipped/activator
     export PATH=$ACTIVATOR_HOME/bin:$PATH
  2. Dr. Elephant stores the analyzed results in a MySQL database. Install and set up MySQL if you do not have it yet (version 5.5+ recommended).
  3. (Optional, but recommended) To use the new Dr. Elephant UI, install npm and the UI dependencies:
     sudo yum install npm
     sudo npm install -g bower
     cd web; bower install; cd ..
  4. Lastly, you should already have Hadoop and/or Spark set up.
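
Before moving on, it can help to confirm that the tools listed above are actually reachable from your shell. The following is a minimal sketch and is not part of Dr. Elephant: the helper name have_cmd is made up for illustration, and the tool list simply mirrors the prerequisites above (it assumes the mysql client and the hadoop command are expected on the same machine).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Dr. Elephant): succeeds if the
# given command can be found on $PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Either activator or play is enough, depending on the Play version.
if have_cmd activator || have_cmd play; then
    echo "ok:      activator/play"
else
    echo "missing: activator or play"
fi

# npm and bower only matter if you want the new UI.
for tool in mysql npm bower hadoop; do
    if have_cmd "$tool"; then
        echo "ok:      $tool"
    else
        echo "missing: $tool"
    fi
done
```

A "missing" line for npm or bower is harmless if you plan to skip the optional UI step.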

Step 4: (Optional, Beta Phase) Please follow the below steps if you wish to try out the auto-tuning feature. (More details: https://github.com/linkedin/dr-elephant/wiki/Auto-Tuning)

  • Enable it by setting the value of property autotuning.enabled to true in app-conf/AutoTuningConf.xml
  • Install Python 2.6 or later
  • If you want to use a Python installation other than the default one in your environment:
    • Either set PYTHON_PATH to the path of the desired Python executable: $> export PYTHON_PATH=/path/to/python/executable
    • Or uncomment the optional property python.path in app-conf/AutoTuningConf.xml and set it to the path of the desired Python executable
  • Install inspyred package by executing: sudo pip install inspyred
  • If pip is missing, it can be installed from https://pip.pypa.io/en/stable/installing/
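
The Python-related bullets above can be checked from the shell. This is only a sketch under stated assumptions: resolve_python is a hypothetical helper, it looks at PYTHON_PATH and then falls back to whatever python is on $PATH, and it does not read the python.path property from AutoTuningConf.xml.

```shell
#!/bin/sh
# Hypothetical helper: print the interpreter to check, preferring
# PYTHON_PATH over the "python" found on $PATH.
resolve_python() {
    if [ -n "${PYTHON_PATH:-}" ]; then
        printf '%s\n' "$PYTHON_PATH"
    else
        printf '%s\n' python
    fi
}

py=$(resolve_python)
if command -v "$py" >/dev/null 2>&1; then
    # Python 2 prints its version to stderr, so merge the streams.
    "$py" --version 2>&1
    # The import exits non-zero if the inspyred package is unavailable.
    if "$py" -c 'import inspyred' 2>/dev/null; then
        echo "inspyred: installed"
    else
        echo "inspyred: not installed"
    fi
else
    echo "python interpreter not found: $py"
fi
```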

Step 5: Compile the Dr. Elephant code and generate the zip. The compile.sh script optionally takes a configuration file that specifies the versions of Hadoop and Spark to compile with. For instructions, see the Developer Guide.

$> ./compile.sh [./compile.conf]

After compiling, the distribution is created under the dist directory.

$> ls dist
dr-elephant*.zip

Step 6: Copy the distribution file to the machine where you want to deploy Dr. Elephant.
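
Steps 5 and 6 can be scripted together. The sketch below is an illustration only: find_dist_zip is a hypothetical helper, and the host and target path in the commented scp line are placeholders that you would replace with your own.

```shell
#!/bin/sh
# Hypothetical helper: print the first dr-elephant*.zip found in the
# given directory (default: dist), or fail if there is none.
find_dist_zip() {
    dir=${1:-dist}
    for f in "$dir"/dr-elephant*.zip; do
        # If the glob matched nothing, $f is the literal pattern,
        # which is not an existing file.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

if dist_zip=$(find_dist_zip dist); then
    echo "found distribution: $dist_zip"
    # Example copy step; user@deploy-host and the target path are placeholders.
    # scp "$dist_zip" user@deploy-host:/path/to/deploy/dir/
else
    echo "no dr-elephant*.zip under dist; run ./compile.sh first"
fi
```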

Step 7: On the machine where you want to deploy Dr. Elephant, make sure the following environment variables are set.

$> export HADOOP_HOME=/path/to/hadoop/home
$> export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
$> export SPARK_HOME=/path/to/spark/home
$> export SPARK_CONF_DIR=/path/to/conf

Add Hadoop's bin directory to the system path, because Dr. Elephant uses 'hadoop classpath' to load the right classes.

$> export PATH=$HADOOP_HOME/bin:$PATH
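
A quick sanity check of the environment from Step 7 might look like the following. This is a sketch, not something Dr. Elephant provides: require_env is a hypothetical helper, and the Spark variables only matter if you run Spark.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if every named variable is set
# and non-empty; prints the names of the ones that are not.
require_env() {
    status=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name"
            status=1
        fi
    done
    return $status
}

require_env HADOOP_HOME HADOOP_CONF_DIR SPARK_HOME SPARK_CONF_DIR \
    || echo "export the variables listed above before starting Dr. Elephant"

# Dr. Elephant relies on "hadoop classpath", so hadoop must resolve via $PATH.
if command -v hadoop >/dev/null 2>&1; then
    hadoop classpath >/dev/null && echo "hadoop classpath: ok"
else
    echo "hadoop not found on PATH"
fi
```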

Step 8: You also need a backend to store the data. Configure the MySQL database in the elephant.conf file.

# Database configuration
db_url=localhost
db_name=drelephant
db_user=root
db_password=""
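
To double-check which database settings will be picked up, you can read them back from the file. The sketch below makes a few assumptions: conf_get is a hypothetical helper, the file is treated as plain key=value lines, and the path app-conf/elephant.conf is relative to the directory you run it from. The commented mysql line further assumes the database itself has to exist before the first start.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key from a key=value file,
# ignoring commented-out lines and stripping surrounding double quotes.
conf_get() {
    sed -n "s/^[[:space:]]*$1=//p" "$2" | tail -n 1 | sed 's/^"\(.*\)"$/\1/'
}

conf=app-conf/elephant.conf
if [ -f "$conf" ]; then
    db_url=$(conf_get db_url "$conf")
    db_name=$(conf_get db_name "$conf")
    db_user=$(conf_get db_user "$conf")
    echo "database '$db_name' on '$db_url' as user '$db_user'"
    # Assumption: the database must already exist. If it does not, the
    # standard mysql client could create it, for example:
    # mysql -h "$db_url" -u "$db_user" -p -e "CREATE DATABASE IF NOT EXISTS $db_name"
else
    echo "config file not found: $conf"
fi
```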

Step 9: (Optional) To enable SSL for Dr. Elephant, add the following settings to elephant.conf.

# SSL related configuration
# Any free port can be configured here.
https_port=8090
https_keystore_location="/path/to/keystore"
# Type of the keystore, for instance JKS.
https_keystore_type=JKS
https_keystore_password="password_for_keystore"
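
If you do not have a keystore yet, the JDK's keytool can create a self-signed one for testing. The sketch below only prints a candidate command instead of running it. Everything in it is an assumption for illustration: keystore_cmd is a hypothetical helper, and the alias, key size and validity are arbitrary example values. When you run the printed command, keytool prompts for the keystore password and certificate details.

```shell
#!/bin/sh
# Hypothetical helper: print a keytool command that would create a
# self-signed JKS keystore. $1 is the key alias, $2 the keystore path.
keystore_cmd() {
    printf 'keytool -genkeypair -alias %s -keyalg RSA -keysize 2048 -validity 365 -storetype JKS -keystore %s\n' "$1" "$2"
}

# Print the command for review; the alias and path are placeholders.
keystore_cmd drelephant /path/to/keystore
```

The keystore path and the password you choose at the prompt are what go into https_keystore_location and https_keystore_password.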

Step 10: If your cluster is Kerberized, update the keytab user and the keytab file location in the elephant.conf file.

Step 11: If you are running Dr. Elephant for the first time, you need to enable evolutions. To do so, append -Devolutionplugin=enabled and -DapplyEvolutions.default=true to jvm_props in the elephant.conf file (uncomment jvm_props if it is commented out). This automatically creates the MySQL tables for you. Remember to disable evolutions the next time you restart Dr. Elephant.

$> vim ./app-conf/elephant.conf
jvm_props=" -Devolutionplugin=enabled -DapplyEvolutions.default=true"
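
Because evolutions should be on for the first start and off afterwards, a small check before each start can prevent surprises. This is a sketch only: evolutions_enabled is a hypothetical helper that looks for the flag on an uncommented jvm_props line, and the config path is assumed to be relative to the current directory.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an uncommented jvm_props line in the
# given file sets -DapplyEvolutions.default=true.
evolutions_enabled() {
    grep -E '^[[:space:]]*jvm_props=' "$1" \
        | grep -q -e '-DapplyEvolutions\.default=true'
}

conf=app-conf/elephant.conf
if [ ! -f "$conf" ]; then
    echo "config file not found: $conf"
elif evolutions_enabled "$conf"; then
    echo "evolutions are ON: fine for the first start, disable before the next restart"
else
    echo "evolutions are OFF"
fi
```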

Step 12: To start Dr. Elephant, run the start script, passing the path to the directory that contains the application's configuration files.

$> bin/start.sh /path/to/app-conf/directory

To verify that Dr. Elephant started correctly, check the dr.log file.

$> less $DR_RELEASE/dr.log
...
play - database [default] connected at jdbc:mysql://localhost/drelephant?characterEncoding=UTF-8
application - Starting Application...
play - Application started (Prod)
play - Listening for HTTP on /0:0:0:0:0:0:0:0:8080
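
Instead of paging through the log by hand, the same check can be scripted. The sketch below is an illustration: app_started is a hypothetical helper that looks for the "Application started" line from the sample output above, and falling back to the current directory when DR_RELEASE is unset is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeeds once the Play startup line shown above
# appears in the given log file.
app_started() {
    grep -q 'Application started' "$1" 2>/dev/null
}

log="${DR_RELEASE:-.}/dr.log"
if app_started "$log"; then
    echo "Dr. Elephant started"
    # Show which address and port the server is listening on.
    grep 'Listening for HTTP' "$log"
else
    echo "no startup line in $log yet"
fi
```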

To verify that Dr. Elephant is analyzing jobs correctly, check the dr_elephant.log file.

$> less $DR_RELEASE/../logs/elephant/dr_elephant.log

Step 13: Once the application starts, you can open the UI at ip:port (for example, localhost:8080).

Step 14: To stop Dr. Elephant, run:

$> bin/stop.sh