Jupyter notebook here
Background
The datasets cover domestic commercial passenger flights in the United States (routes, departure and arrival times, and delay information). A second table contains weather information.
Business case
- What are the trends over time in the number of flights and in delay times, both for the US as a whole and per region?
- Which airports/states/routes have the most delays?
- Can flight delays be accurately predicted with machine learning?
These are just a few examples; use your creativity to find other ways to extract insights from this data.
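As a starting point for the first two questions, a grouped aggregation in pandas is usually enough. The sketch below is only illustrative: the toy rows are made up, and the column names `date`, `origin`, and `dep_delay` are assumptions, not the actual schema of the datasets.

```python
import pandas as pd

# Toy stand-in for the flights table. Rows and column names are
# hypothetical; adapt them to the real schema.
flights = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-03", "2015-01-20", "2015-02-11", "2015-02-15"]),
    "origin": ["JFK", "ORD", "JFK", "ORD"],
    "dep_delay": [10.0, 30.0, 20.0, 40.0],  # minutes
})

# Trend over time: number of flights and mean departure delay per month.
monthly = flights.groupby(flights["date"].dt.to_period("M")).agg(
    n_flights=("dep_delay", "size"),
    mean_delay=("dep_delay", "mean"),
)

# Ranking: mean departure delay per origin airport, worst first.
by_airport = (
    flights.groupby("origin")["dep_delay"].mean().sort_values(ascending=False)
)

print(monthly)
print(by_airport)
```

The same pattern extends to states or routes by changing the grouping key, for example grouping on an origin/destination pair if the data has both columns.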
Success criteria
- Focus on efficient transformation and integration of the data
- Focus on a well-performing machine learning algorithm
- Focus on effective communication of your findings
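For the first criterion, integrating the flights and weather tables could look roughly like the left join below. This is a minimal sketch under stated assumptions: the toy rows are made up, and the join keys (`date` plus an airport code) and the column names `origin`, `airport`, and `precip_mm` are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the two tables. Rows and column names are
# hypothetical; adapt them to the real schema.
flights = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-03", "2015-01-03", "2015-01-04"]),
    "origin": ["JFK", "ORD", "JFK"],
    "dep_delay": [10.0, 30.0, 5.0],
})
weather = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-03", "2015-01-04"]),
    "airport": ["JFK", "JFK"],
    "precip_mm": [2.5, 0.0],
})

# Align the key names, then left-join so every flight is kept even when
# no weather record exists for it (those rows get NaN weather values).
merged = flights.merge(
    weather.rename(columns={"airport": "origin"}),
    how="left",
    on=["date", "origin"],
    validate="many_to_one",  # raise if weather has duplicate (date, origin) keys
)

print(merged)
```

The `validate` argument is a cheap guard against accidentally multiplying flight rows when the weather table contains duplicate keys.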
$ git clone https://github.com/alemelis/airplanes.git
$ cd airplanes
$ conda env create -f env.yml
$ conda activate ds
$ jupyter notebook