Visualizations of flight delays in US (1987 - 2015)


Visualizing Flight Delays

A Hadoop MapReduce job collects flight delays from the on-time performance data published by the Bureau of Transportation Statistics.

After the data of interest was gathered, the visualizations were created in R. The following sections show the visualizations. To learn more about the analysis, please read report.pdf.

You can run the whole pipeline (running the MapReduce job and building the report file with graphics) by following the instructions in the Running the project section below.

Top 5 Airports and Airlines based on activity

[Image: top]

Mean Delay Per Year

[Image: perYear]

Mean Delay across all years for top 5 airports

[Image: airports]

Mean Delay Per Month

[Image: monthly]

Running the project

Requirements

Install Java 1.8

Download Hadoop 2.8.1

R Requirements

Install the following dependencies to avoid errors while generating the R Markdown report.

From your R console, execute the following commands:

install.packages("ggplot2")
install.packages("RColorBrewer")
install.packages("gridExtra")
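
If you prefer the command line to the R console, the same installation can be sketched in shell. This is a minimal sketch, not part of the project: the helper name `r_install_expr` is hypothetical, the CRAN mirror URL is an assumption, and it presumes `Rscript` is on your PATH. The helper only builds an R expression that installs the packages not already present.

```shell
# Sketch: build an R expression that installs only the packages that are missing.
# r_install_expr is a hypothetical helper; the CRAN mirror URL is an assumption.
r_install_expr() {
  local quoted=""
  local pkg
  for pkg in "$@"; do
    # Append each package name as a quoted, comma-separated R string.
    quoted="${quoted:+$quoted, }\"$pkg\""
  done
  printf 'pkgs <- c(%s); ' "$quoted"
  printf 'todo <- setdiff(pkgs, rownames(installed.packages())); '
  printf 'if (length(todo) > 0) install.packages(todo, repos = "https://cloud.r-project.org")\n'
}

# Usage (requires R to be installed):
#   Rscript -e "$(r_install_expr ggplot2 RColorBrewer gridExtra)"
```

Building the expression separately from running it lets you inspect the generated R code before it touches your library.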

How to run the project end to end:

  • Set the Hadoop home path in the Makefile to the Hadoop installation directory on your system
  • Put all the input files inside the input folder, then run make setup
  • Open a terminal in the root directory of the project and execute make run
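
The steps above can be preceded by a quick pre-flight check. The following is a minimal sketch under stated assumptions: `missing_tools` is a hypothetical helper, and the list of tools to check (`java`, `make`, `Rscript`) is a guess at what the Makefile invokes, so adjust it to match your setup. Hadoop is not checked here because its location is configured in the Makefile rather than taken from PATH.

```shell
# Sketch: report which required command-line tools are not on PATH.
# missing_tools is a hypothetical helper, not part of the project.
missing_tools() {
  local missing=""
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      # Collect missing names as a space-separated list.
      missing="${missing:+$missing }$tool"
    fi
  done
  printf '%s\n' "$missing"
}

# Usage, from the root directory of the project (tool list is an assumption):
#   missing=$(missing_tools java make Rscript)
#   if [ -z "$missing" ]; then make setup && make run; else echo "Missing: $missing"; fi
```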