Hortonworks DataFlow (HDF) Trucking Application

Example application demonstrating how to integrate all of the components of Hortonworks DataFlow:

  • Apache NiFi/MiNiFi
  • Apache Kafka
  • Apache Storm

Setup

  1. Install VirtualBox

    https://www.virtualbox.org/wiki/Downloads

  2. Install Vagrant

    https://www.vagrantup.com/

  3. Clone this repository

     git clone https://github.com/bbende/hdf-trucking-app.git
    
  4. Start the Vagrant VM

    NOTE: This project uses the "centos/6" Vagrant box. If you have used this box before, run "vagrant box update" before "vagrant up" to ensure you are running the latest version. Older versions of the box put the synced folder at /home/vagrant/sync; newer versions put it at /vagrant.

     cd hdf-trucking-app
     vagrant up
    

    NOTE: This step can take a while. At the end there should be a line that looks something like:

     Cluster build status at: http://localhost:8080/api/v1/clusters/HDF/requests/1
    
  5. Wait until all HDF services have been installed and started

    Go to http://localhost:8080 in your browser and login with admin/admin.

    Wait until you see all services running in Ambari.

  6. Setup the demo application

     vagrant ssh
     sudo su -
     /vagrant/scripts/setup_hdf_trucking_app.sh
    
  7. Install Banana Dashboard

NOTE: To gracefully shut down the VM, exit your SSH session and run:

vagrant halt

To completely destroy the VM and start over, execute:

vagrant destroy

Overview

This section contains an overview of the demo trucking application.

Trucking Data Simulator

The simulator was originally forked from this GitHub repo: https://github.com/georgevetticaden/hdp.

The trucking data simulator is responsible for writing truck events to a file. Events look like the following:

2016-12-09 16:07:24.211|truck_geo_event|47|10|George Vetticaden|1390372503|Saint Louis to Tulsa|Normal|36.18|-95.76|1|
2016-12-09 16:07:24.212|truck_speed_event|47|10|George Vetticaden|1390372503|Saint Louis to Tulsa|66|
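As a hedged illustration of the event format (the field names below are inferred from the sample lines above, not taken from the simulator source, which is the authoritative reference), the pipe-delimited events could be parsed like this:

```python
def parse_truck_event(line: str) -> dict:
    """Parse one pipe-delimited truck event line.

    Field names are assumptions inferred from the sample data;
    see the simulator source tree for the real schema.
    """
    # Drop the trailing pipe, then split on the delimiter
    fields = line.strip().rstrip("|").split("|")
    event = {
        "timestamp": fields[0],
        "event_type": fields[1],
        "truck_id": int(fields[2]),
        "driver_id": int(fields[3]),
        "driver_name": fields[4],
        "route_id": int(fields[5]),
        "route_name": fields[6],
    }
    # The remaining fields differ by event type
    if event["event_type"] == "truck_speed_event":
        event["speed"] = int(fields[7])
    elif event["event_type"] == "truck_geo_event":
        event["status"] = fields[7]
        event["latitude"] = float(fields[8])
        event["longitude"] = float(fields[9])
    return event
```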

The source code for the simulator is on the VM at:

/root/hdp-bbende/reference-apps/iot-trucking-app/trucking-data-simulator/

The simulator is installed and running on the VM at:

/opt/hdf-trucking-app/simulator/

The simulator is writing events to a file on the VM at:

/tmp/truck-sensor-data/truck-1.txt

MiNiFi

The MiNiFi Java distribution is installed on the VM at:

/opt/hdf-trucking-app/minifi-0.1.0/

MiNiFi is tailing the file of truck events described above and sending the events to NiFi via site-to-site.
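The actual flow configuration ships with the VM, but a MiNiFi config.yml for this kind of tail-and-ship flow generally looks like the sketch below. The processor name, port name, and port id here are placeholders, not values from this project; the input port id must match the id of the Input Port created on the NiFi canvas.

```yaml
Flow Controller:
  name: truck-events

Processors:
  - name: TailTruckEvents
    class: org.apache.nifi.processors.standard.TailFile
    scheduling strategy: TIMER_DRIVEN
    scheduling period: 1 sec
    Properties:
      File to Tail: /tmp/truck-sensor-data/truck-1.txt

Connections:
  - name: TailTruckEvents/success
    source name: TailTruckEvents
    source relationship name: success
    destination name: truck-events-port

Remote Processing Groups:
  - name: NiFi
    url: http://localhost:9090/nifi
    timeout: 30 secs
    Input Ports:
      # id must match the Input Port's id in NiFi
      - id: REPLACE-WITH-NIFI-PORT-ID
        name: truck-events-port
```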

NiFi

The NiFi console is available at http://localhost:9090/nifi.

NiFi is receiving the truck events from MiNiFi and publishing them to a Kafka topic.

NiFi is also consuming from a separate Kafka topic where results are being written by a Storm topology.

Kafka

There are two Kafka topics:

  • truck_speed_events - Where speed events are published by NiFi
  • truck_average_speed - Where average speed events are published by Storm

Storm

The Storm console is available at http://localhost:8744/index.html.

The source code for the average speed topology is at:

/vagrant/src/hdf-trucking-storm/
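The topology's exact windowing logic lives in that source tree; as a rough, hedged sketch of the kind of per-driver rolling average it computes (the window size and keying by driver id are assumptions for illustration):

```python
from collections import defaultdict, deque


class AverageSpeedTracker:
    """Rolling average speed per driver over the last N readings.

    A simplified stand-in for the Storm topology's aggregation;
    window size and keying are illustrative assumptions.
    """

    def __init__(self, window_size: int = 10):
        self.window_size = window_size
        self.speeds = defaultdict(deque)  # driver_id -> recent speeds

    def update(self, driver_id: int, speed: int) -> float:
        """Record a speed reading and return the driver's current average."""
        window = self.speeds[driver_id]
        window.append(speed)
        if len(window) > self.window_size:
            window.popleft()  # evict the oldest reading
        return sum(window) / len(window)
```

In the real application an event like this would flow in from the truck_speed_events topic, and the resulting average would be published to truck_average_speed.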

Solr/Banana

The Solr Admin UI is available at http://localhost:8886/solr/.

There is a single collection called 'truck_average_speed' that holds the average speed events computed by Storm.

NiFi is responsible for consuming those events from Kafka and ingesting them to Solr.

The Banana dashboard is available at http://localhost:8886/solr/banana/src/index.html.