Visualizing Flight Delays
Flight delays are extracted from the on-time performance data published by the Bureau of Transportation Statistics, using a Hadoop MapReduce job.
Once the data of interest is gathered, the visualizations are created in R. The following sections show the visualizations; to learn more about the analysis, please read
You can run the whole pipeline (the MapReduce job plus building the report file with graphics) by following the instructions in the Running the project section below.
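To give a feel for the R side of the pipeline, here is a minimal sketch of how a per-year mean delay could be computed and plotted. It is not the project's actual report code: the column names (`year`, `carrier`, `total_delay`, `flights`), the sample rows, and the helper `mean_delay_per_year` are all made up for illustration, and the real MapReduce output format may differ.

```r
# Illustrative rows only: NOT real BTS figures. The column layout is an
# assumption about what an aggregated MapReduce output might look like.
delays <- data.frame(
  year        = c(2014, 2014, 2015, 2015),
  carrier     = c("AA", "DL", "AA", "DL"),
  total_delay = c(1200, 900, 1500, 800),  # summed delay minutes
  flights     = c(100, 90, 120, 100)      # number of flights
)

# Hypothetical helper: flight-weighted mean delay per year.
mean_delay_per_year <- function(df) {
  totals <- aggregate(cbind(total_delay, flights) ~ year, data = df, FUN = sum)
  totals$mean_delay <- totals$total_delay / totals$flights
  totals[, c("year", "mean_delay")]
}

yearly <- mean_delay_per_year(delays)

# Build the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(yearly, ggplot2::aes(x = year, y = mean_delay)) +
    ggplot2::geom_line() +
    ggplot2::geom_point() +
    ggplot2::labs(x = "Year", y = "Mean delay (minutes)")
}
```

Dividing summed delay by summed flight count weights each carrier by its traffic, which avoids averaging per-carrier averages.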
Top 5 Airports and Airlines based on activity
Mean Delay Per Year
Mean Delay across all years for top 5 airports
Mean Delay Per Month
Running the project
Install the following dependencies to avoid errors while generating the R Markdown report.
From your R console, execute the following commands:
install.packages("ggplot2")
install.packages("RColorBrewer")
install.packages("gridExtra")
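If some of these packages may already be installed, a small check can skip them. The following is a sketch rather than part of the project; the helper name `missing_packages` is made up for illustration, and the install step is guarded by `interactive()` so it only runs from an R console.

```r
# Packages the report needs (taken from the list above).
required <- c("ggplot2", "RColorBrewer", "gridExtra")

# Hypothetical helper: return the packages that are not yet installed.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Install only what is missing, and only when run from an R console.
if (interactive()) {
  to_install <- missing_packages(required)
  if (length(to_install) > 0) install.packages(to_install)
}
```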
How to run the project end to end:
- Change the Hadoop Home path in the Makefile to the Hadoop Home path on your system
- Make sure you put all the input files inside the input folder and run
- Open a terminal in the root directory of the project and execute the command