onelii/Flight-Data-Analysis-using-Python-and-R


Flight-Data-Analysis-using-Python-and-R

The 2006 and 2007 datasets from the Harvard Dataverse (https://doi.org/10.7910/DVN/HG7NV7) were used for this project. The data was cleaned before the in-depth analysis and then visualised in both Python and R to answer the following research questions:

  1. When is the best time of day, day of the week, and time of year to fly to minimise delays?
  2. Do older planes suffer more delays?
  3. How does the number of people flying between different locations change over time?
  4. Can you detect cascading failures as delays in one airport create delays in others?
  5. Use the available variables to construct a model that predicts delays.

The analysis uses line graphs, scatter plots, box plots, spatio-temporal heat maps, hypothesis tests, and correlation heat maps to answer the questions above. Supervised classification models (Decision Tree, Logistic Regression, and Random Forest) were built to predict delay status, that is, whether a flight will arrive late or not. A multiple linear regression model was also built to predict future arrival delays in real-world scenarios. A report presents the conclusions from both the Python and R analyses and shows that the two lead to similar conclusions.
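As an illustration of the first research question, the time-of-day comparison can be sketched as a group-and-average over scheduled departure hour. This is a minimal standard-library sketch, not the code used in the project: the column names `CRSDepTime` (scheduled departure as hhmm) and `ArrDelay` (arrival delay in minutes, `NA` when missing) are assumptions about the dataset layout, and the function name `mean_delay_by_hour` and the tiny inline sample are hypothetical.

```python
import csv
import io
from collections import defaultdict


def mean_delay_by_hour(rows):
    """Return {scheduled departure hour: mean arrival delay in minutes}."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of delays, flight count]
    for row in rows:
        delay = row.get("ArrDelay", "NA")   # assumed column name
        dep = row.get("CRSDepTime", "NA")   # assumed column name, hhmm format
        if delay in ("NA", "") or dep in ("NA", ""):
            continue  # skip rows with no recorded delay or departure time
        hour = int(dep) // 100  # e.g. 1830 -> 18
        totals[hour][0] += float(delay)
        totals[hour][1] += 1
    return {hour: s / n for hour, (s, n) in sorted(totals.items())}


# Hypothetical sample rows standing in for a yearly CSV file.
SAMPLE = """CRSDepTime,ArrDelay
600,-5
615,3
1830,40
1845,20
2100,NA
"""

averages = mean_delay_by_hour(csv.DictReader(io.StringIO(SAMPLE)))
best_hour = min(averages, key=averages.get)  # hour with the lowest mean delay
print(averages, best_hour)
```

To run the same aggregation on a real file, pass `csv.DictReader(open(path, newline=""))` instead of the inline sample. Grouping by day of week or month follows the same pattern with a different key.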
