# Hortonworks DataFlow (HDF) Trucking Application
Example application demonstrating how to integrate all of the components of Hortonworks DataFlow:
- Apache NiFi/MiNiFi
- Apache Kafka
- Apache Storm
## Clone this repository

```
git clone https://github.com/bbende/hdf-trucking-app.git
```
## Start the Vagrant VM

NOTE: This project uses the "centos/6" Vagrant box. If you have used this box before, you may need to run `vagrant box update` before `vagrant up` to ensure you are running the latest version; older versions put the synced folder at /home/vagrant/sync, while newer versions put it at /vagrant.

```
cd hdf-trucking-app
vagrant up
```
NOTE: This step can take a while. At the end there should be a line that looks like:

```
Cluster build status at: http://localhost:8080/api/v1/clusters/HDF/requests/1
```
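You can request that URL to watch the build progress. The sketch below shows how the response might be parsed, assuming the usual Ambari v1 REST shape (a `Requests` object carrying `request_status` and `progress_percent`); the sample JSON is illustrative, not captured from a live cluster.

```python
import json

# Illustrative sample of an Ambari request-status response (assumed shape;
# verify against your Ambari version).
sample = json.dumps({
    "Requests": {"request_status": "COMPLETED", "progress_percent": 100.0}
})

def parse_request_status(body: str) -> tuple:
    """Pull the request status and progress percentage out of the response."""
    req = json.loads(body)["Requests"]
    return req["request_status"], req["progress_percent"]

status, pct = parse_request_status(sample)
print(status, pct)  # COMPLETED 100.0
```

In practice you would fetch the URL with admin/admin credentials and loop until the status reaches COMPLETED.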
## Wait until all HDF services have been installed and started

Go to http://localhost:8080 in your browser and log in with admin/admin. Wait until you see all services running.
## Set up the demo application

```
vagrant ssh
sudo su -
/vagrant/scripts/setup_hdf_trucking_app.sh
```
## Install the Banana Dashboard

- Go to http://localhost:8886/solr/banana/src/index.html
- Click the Load icon in the top-right
- Choose Local File
- Select hdf-trucking-app/conf/banana/HDF_Truck_Events-1478197521141
- Click Save & Set As Browser Default
NOTE: To gracefully shut down the VM, make sure to exit out of your SSH session first, then execute:

To completely destroy the VM and start over, execute:
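The commands themselves do not appear above; for a standard Vagrant project they would presumably be:

```shell
# From the host, after exiting the SSH session — graceful shutdown:
vagrant halt

# Wipe the VM entirely so the next "vagrant up" rebuilds from scratch:
vagrant destroy
```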
This section contains an overview of the demo trucking application.
## Trucking Data Simulator

The simulator was originally forked from George Vetticaden's GitHub repo: https://github.com/georgevetticaden/hdp.
The trucking data simulator is responsible for writing truck events to a file. Events look like the following:
```
2016-12-09 16:07:24.211|truck_geo_event|47|10|George Vetticaden|1390372503|Saint Louis to Tulsa|Normal|36.18|-95.76|1|
2016-12-09 16:07:24.212|truck_speed_event|47|10|George Vetticaden|1390372503|Saint Louis to Tulsa|66|
```
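The events are pipe-delimited. A minimal parser sketch is shown below; the field names are assumptions inferred from the sample lines, not taken from the simulator source.

```python
def parse_truck_event(line: str) -> dict:
    """Parse one pipe-delimited simulator event into a dict.

    Field names here are illustrative guesses based on the sample output.
    """
    parts = line.strip().strip("|").split("|")
    event = {
        "timestamp": parts[0],
        "event_type": parts[1],
        "truck_id": int(parts[2]),
        "driver_id": int(parts[3]),
        "driver_name": parts[4],
        "route_id": int(parts[5]),
        "route_name": parts[6],
    }
    if event["event_type"] == "truck_geo_event":
        # Geo events carry a status plus coordinates.
        event.update(status=parts[7],
                     latitude=float(parts[8]),
                     longitude=float(parts[9]))
    else:
        # Speed events carry a single speed value.
        event["speed"] = int(parts[7])
    return event

geo = parse_truck_event(
    "2016-12-09 16:07:24.211|truck_geo_event|47|10|George Vetticaden|"
    "1390372503|Saint Louis to Tulsa|Normal|36.18|-95.76|1|")
print(geo["driver_name"], geo["latitude"])  # George Vetticaden 36.18
```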
The source code for the simulator is on the VM at:
The running simulator is installed on the VM at:
The simulator is writing events to a file on the VM at:
The MiNiFi Java distribution is installed on the VM at:
MiNiFi is tailing the file of truck events described above and sending the events to NiFi via site-to-site.
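Conceptually, tailing a file boils down to remembering a byte offset and reading whatever has been appended since the last poll. The rough Python sketch below illustrates that idea only; it is not how MiNiFi's TailFile processor is implemented.

```python
import os
import tempfile

def read_new_lines(path: str, offset: int) -> tuple:
    """Return lines appended since `offset`, plus the new offset."""
    with open(path, "r") as f:
        f.seek(offset)
        lines = f.readlines()
        return [line.rstrip("\n") for line in lines], f.tell()

# Demo against a temp file standing in for the simulator's event file.
with tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt") as tmp:
    tmp.write("event-1\n")
    path = tmp.name

first, pos = read_new_lines(path, 0)
with open(path, "a") as f:
    f.write("event-2\n")
second, pos = read_new_lines(path, pos)
print(first, second)  # ['event-1'] ['event-2']
os.remove(path)
```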
The NiFi console is available at http://localhost:9090/nifi.
NiFi is receiving the truck events from MiNiFi and publishing them to a Kafka topic.
NiFi is also consuming from a separate Kafka topic where results are being written by a Storm topology.
There are two Kafka topics:
- truck_speed_events - Where speed events are published by NiFi
- truck_average_speed - Where average speed events are published by Storm
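As a rough illustration of what the average-speed topology computes, here is a windowed per-driver average in Python; the window size and the choice to key by driver are assumptions for the sketch, not taken from the topology source.

```python
from collections import defaultdict, deque

class AverageSpeedWindow:
    """Keep a sliding window of recent speeds per driver and average them."""

    def __init__(self, window_size: int = 10):
        self.window_size = window_size
        self.speeds = defaultdict(deque)  # driver_id -> recent speeds

    def add(self, driver_id: int, speed: int) -> float:
        """Record a speed event and return the driver's current average."""
        window = self.speeds[driver_id]
        window.append(speed)
        if len(window) > self.window_size:
            window.popleft()  # drop the oldest reading
        return sum(window) / len(window)

w = AverageSpeedWindow(window_size=3)
for s in (60, 70, 80, 90):
    avg = w.add(driver_id=10, speed=s)
print(avg)  # 80.0 -- window now holds 70, 80, 90
```

A real Storm topology would do the same aggregation in a windowed bolt and emit the result to the truck_average_speed topic.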
The Storm console is available at http://localhost:8744/index.html.
The source code for the average speed topology is at:
The Solr Admin UI is available at http://localhost:8886/solr/.
There is a single collection called 'truck_average_speed' that holds the average speed events computed by Storm.
NiFi is responsible for consuming those events from Kafka and ingesting them to Solr.
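To make that ingest step concrete, the sketch below builds a JSON document of the kind NiFi might send to the truck_average_speed collection; the field names (driverId, driverName, averageSpeed) are illustrative assumptions, not the actual collection schema.

```python
import json

def to_solr_doc(driver_id: int, driver_name: str, avg_speed: float) -> str:
    """Build a JSON update payload for a hypothetical average-speed doc."""
    doc = {
        "driverId": driver_id,
        "driverName": driver_name,
        "averageSpeed": avg_speed,
    }
    # Solr's JSON update handler accepts a list of documents.
    return json.dumps([doc])

payload = to_solr_doc(10, "George Vetticaden", 72.5)
print(payload)
```

In the live flow, NiFi's Solr processor handles this conversion and posts the documents to the collection for the Banana dashboard to visualize.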
The Banana dashboard is available at http://localhost:8886/solr/banana/src/index.html.