
# Flight Delay Time Statistics

This notebook analyzes flight delay times in an airline on-time performance dataset.
For a selected year, it plots the monthly average of five delay types (carrier, weather, NAS, security, and late aircraft) for each reporting airline, using interactive Plotly charts.


In [1]:

# Import required libraries
import pandas as pd
import plotly.express as px

# Load the airline data into a pandas DataFrame
airline_data = pd.read_csv(
    'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DV0101EN-SkillsNetwork/Data%20Files/airline_data.csv',
    encoding="ISO-8859-1",
    dtype={'Div1Airport': str, 'Div1TailNum': str, 'Div2Airport': str, 'Div2TailNum': str}
)

# Display the first few rows of the dataset
airline_data.head()


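| Unnamed: 0.1 | Unnamed: 0 | Year | Quarter | Month | DayofMonth | DayOfWeek | FlightDate | Reporting_Airline | DOT_ID_Reporting_Airline | IATA_CODE_Reporting_Airline | ... | Div4WheelsOff | Div4TailNum | Div5Airport | Div5AirportID | Div5AirportSeqID | Div5WheelsOn | Div5TotalGTime | Div5LongestGTime | Div5WheelsOff | Div5TailNum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1295781 | 1998 | 2 | 4 | 2 | 4 | 1998-04-02 | AS | 19930 | AS | ... | | | | | | | | | | |
| 1 | 1125375 | 2013 | 2 | 5 | 13 | 1 | 2013-05-13 | EV | 20366 | EV | ... | | | | | | | | | | |
| 2 | 118824 | 1993 | 3 | 9 | 25 | 6 | 1993-09-25 | UA | 19977 | UA | ... | | | | | | | | | | |
| 3 | 634825 | 1994 | 4 | 11 | 12 | 6 | 1994-11-12 | HP | 19991 | HP | ... | | | | | | | | | | |
| 4 | 1888125 | 2017 | 3 | 8 | 17 | 4 | 2017-08-17 | UA | 19977 | UA | ... | | | | | | | | | | |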
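
The preview above contains flights from several different decades, so before choosing a year it can help to check which years are present and how many rows in each year actually carry delay values. The following is a minimal sketch, assuming the delay column names used by `compute_info` in this notebook; `summarize_years` and the small `sample` frame are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Delay columns referenced by compute_info in this notebook
DELAY_COLS = ['CarrierDelay', 'WeatherDelay', 'NASDelay',
              'SecurityDelay', 'LateAircraftDelay']

def summarize_years(df: pd.DataFrame) -> pd.DataFrame:
    """Per year: number of flights and number of non-missing delay values."""
    flights = df.groupby('Year').size().rename('Flights')
    # count() skips NaN, so it shows how many rows have a delay value at all
    reported = df.groupby('Year')[DELAY_COLS].count()
    return pd.concat([flights, reported], axis=1)

# Hypothetical stand-in for airline_data (values are made up for illustration)
sample = pd.DataFrame({
    'Year': [1998, 2013, 2013, 2017],
    'CarrierDelay': [None, 5.0, None, 12.0],
    'WeatherDelay': [None, 0.0, None, 0.0],
    'NASDelay': [None, 3.0, None, 0.0],
    'SecurityDelay': [None, 0.0, None, 0.0],
    'LateAircraftDelay': [None, 20.0, None, 7.0],
})

summary = summarize_years(sample)
print(summary)
```

Running `summarize_years(airline_data)` on the full dataset would show which years have enough delay values to produce meaningful averages; a year with zero non-missing values would yield empty line plots.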



## Select Year for Analysis

The function below filters the data to a single year and computes the average of each delay type, grouped by month and reporting airline. It returns one DataFrame per delay type, ready for plotting.


In [2]:

# Function to compute delay averages based on selected year
def compute_info(airline_data, entered_year):
    # Filter data for the selected year
    df = airline_data[airline_data['Year'] == int(entered_year)]
    # Compute delay averages for different categories
    avg_car = df.groupby(['Month', 'Reporting_Airline'])['CarrierDelay'].mean().reset_index()
    avg_weather = df.groupby(['Month', 'Reporting_Airline'])['WeatherDelay'].mean().reset_index()
    avg_NAS = df.groupby(['Month', 'Reporting_Airline'])['NASDelay'].mean().reset_index()
    avg_sec = df.groupby(['Month', 'Reporting_Airline'])['SecurityDelay'].mean().reset_index()
    avg_late = df.groupby(['Month', 'Reporting_Airline'])['LateAircraftDelay'].mean().reset_index()
    return avg_car, avg_weather, avg_NAS, avg_sec, avg_late
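
The five group-bys in `compute_info` share the same keys, so the same averages can also be obtained in a single aggregation. The sketch below is one possible variant, not a replacement the notebook relies on; `compute_info_combined` and the `sample` data are hypothetical names and values introduced for illustration.

```python
import pandas as pd

DELAY_COLS = ['CarrierDelay', 'WeatherDelay', 'NASDelay',
              'SecurityDelay', 'LateAircraftDelay']

def compute_info_combined(airline_data: pd.DataFrame, entered_year) -> pd.DataFrame:
    """Hypothetical variant of compute_info: one DataFrame holding all five averages."""
    # Filter data for the selected year
    df = airline_data[airline_data['Year'] == int(entered_year)]
    # Average every delay column in one pass over the same groups
    return (df.groupby(['Month', 'Reporting_Airline'])[DELAY_COLS]
              .mean()
              .reset_index())

# Made-up rows standing in for airline_data
sample = pd.DataFrame({
    'Year': [2013, 2013, 2013, 2017],
    'Month': [1, 1, 2, 1],
    'Reporting_Airline': ['AA', 'AA', 'UA', 'AA'],
    'CarrierDelay': [10.0, 20.0, 5.0, 99.0],
    'WeatherDelay': [0.0, 4.0, 1.0, 99.0],
    'NASDelay': [2.0, 2.0, 0.0, 99.0],
    'SecurityDelay': [0.0, 0.0, 0.0, 99.0],
    'LateAircraftDelay': [6.0, 0.0, 3.0, 99.0],
})

averages = compute_info_combined(sample, 2013)
print(averages)
```

A single delay type can still be pulled out when needed, for example `averages[['Month', 'Reporting_Airline', 'CarrierDelay']]`.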



## Run Analysis

Now we can create an interactive line plot for each type of delay.
Set `year` in the cell below to the year you would like to analyze, then run the cell to generate the five plots.


In [4]:

# Set the year for analysis
year = 2020  # Change this to any year present in airline_data['Year']

# Compute required information for creating graphs from the data
avg_car, avg_weather, avg_NAS, avg_sec, avg_late = compute_info(airline_data, year)

# Line plot for carrier delay
carrier_fig = px.line(avg_car, x='Month', y='CarrierDelay', color='Reporting_Airline', title='Average Carrier Delay Time (minutes) by Airline')
carrier_fig.show()

# Line plot for weather delay
weather_fig = px.line(avg_weather, x='Month', y='WeatherDelay', color='Reporting_Airline', title='Average Weather Delay Time (minutes) by Airline')
weather_fig.show()

# Line plot for NAS delay
nas_fig = px.line(avg_NAS, x='Month', y='NASDelay', color='Reporting_Airline', title='Average NAS Delay Time (minutes) by Airline')
nas_fig.show()

# Line plot for security delay
sec_fig = px.line(avg_sec, x='Month', y='SecurityDelay', color='Reporting_Airline', title='Average Security Delay Time (minutes) by Airline')
sec_fig.show()

# Line plot for late aircraft delay
late_fig = px.line(avg_late, x='Month', y='LateAircraftDelay', color='Reporting_Airline', title='Average Late Aircraft Delay Time (minutes) by Airline')
late_fig.show()
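
The five figures above share the same axes and color encoding, so they could also be drawn as one faceted figure. The sketch below assumes the five averages have been merged into one wide DataFrame keyed by `Month` and `Reporting_Airline` (a hypothetical intermediate that this notebook does not build); `to_long`, `faceted_delay_figure`, and the sample values are illustrative names and data, not part of the original analysis.

```python
import pandas as pd

DELAY_COLS = ['CarrierDelay', 'WeatherDelay', 'NASDelay',
              'SecurityDelay', 'LateAircraftDelay']

def to_long(averages: pd.DataFrame) -> pd.DataFrame:
    """Reshape wide averages into one row per (month, airline, delay type)."""
    return averages.melt(
        id_vars=['Month', 'Reporting_Airline'],
        value_vars=DELAY_COLS,
        var_name='DelayType',
        value_name='AvgDelay',
    )

def faceted_delay_figure(averages: pd.DataFrame, year):
    """Build a single figure with one facet row per delay type."""
    # Imported here so the reshaping helper can be used without Plotly installed
    import plotly.express as px
    return px.line(
        to_long(averages),
        x='Month', y='AvgDelay',
        color='Reporting_Airline',
        facet_row='DelayType',
        title=f'Average Delay Time (minutes) by Airline, {year}',
    )

# Made-up wide averages for two (month, airline) groups
averages = pd.DataFrame({
    'Month': [1, 2],
    'Reporting_Airline': ['AA', 'UA'],
    'CarrierDelay': [15.0, 5.0],
    'WeatherDelay': [2.0, 1.0],
    'NASDelay': [2.0, 0.0],
    'SecurityDelay': [0.0, 0.0],
    'LateAircraftDelay': [3.0, 3.0],
})

long_df = to_long(averages)
print(long_df)
```

Calling `faceted_delay_figure(averages, year).show()` would then render all delay types in one figure. Separate figures, as in the cell above, remain easier to read when many airlines are present.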
