# ✈️ Flight Delay Analysis

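This notebook analyzes departure delay patterns in a sample flight dataset (`flights_sample.csv`), comparing average delays across airlines and across scheduled departure hours.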

## 1. Import Libraries

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style='whitegrid')

## 2. Load Dataset

In [None]:
df = pd.read_csv('flights_sample.csv')
df.head()
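
Every later cell depends on a handful of column names, so it can help to confirm they exist right after loading. The following is a minimal sketch: `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names introduced here, and the column list is simply collected from the cells in this notebook.

```python
import pandas as pd

# Columns that the later cells in this notebook read from the CSV.
REQUIRED_COLUMNS = [
    'FlightDate', 'Cancelled', 'Airline', 'DepDelayMinutes', 'ScheduledDepTime',
]


def missing_columns(frame: pd.DataFrame, required=REQUIRED_COLUMNS) -> list:
    """Return the required column names that are absent from `frame` (hypothetical helper)."""
    return [col for col in required if col not in frame.columns]
```

Calling `missing_columns(df)` after `pd.read_csv` should return an empty list if the file matches what the notebook expects; a non-empty list names the columns to investigate before going further.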

## 3. Data Cleaning

In [None]:
# Convert FlightDate to datetime
df['FlightDate'] = pd.to_datetime(df['FlightDate'])

# Remove cancelled flights; .copy() avoids a SettingWithCopyWarning when columns are added later
df = df[df['Cancelled'] == 0].copy()
df.shape
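
The cleaning steps above could also be wrapped in one function that tolerates malformed rows. This is only a sketch under assumptions: `clean_flights` is a hypothetical helper, and dropping rows with an unparseable date or a missing `DepDelayMinutes` value is a choice made for illustration, not something the dataset is known to require.

```python
import pandas as pd


def clean_flights(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the flights table (hypothetical helper)."""
    out = raw.copy()  # leave the caller's DataFrame untouched
    # Unparseable dates become NaT instead of raising an error.
    out['FlightDate'] = pd.to_datetime(out['FlightDate'], errors='coerce')
    # Keep only flights that were not cancelled.
    out = out[out['Cancelled'] == 0]
    # Assumption: rows without a valid date or delay value are unusable here.
    out = out.dropna(subset=['FlightDate', 'DepDelayMinutes'])
    return out.reset_index(drop=True)
```

Comparing `len(df)` before and after `clean_flights(df)` shows how many rows the extra checks removed, which is worth knowing before interpreting any averages.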

## 4. Exploratory Data Analysis

In [None]:
# Average delay per airline
airline_delay = df.groupby('Airline')['DepDelayMinutes'].mean().sort_values(ascending=False)
plt.figure(figsize=(10, 5))
airline_delay.plot(kind='bar', color='skyblue')
plt.title('Average Departure Delay per Airline')
plt.ylabel('Delay (minutes)')
plt.xlabel('Airline')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
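
A mean can be pulled upward by a few very long delays, and an airline with only a handful of flights in the sample can look unusually good or bad. One way to check, sketched below, is to report the median and the flight count next to the mean. `summarize_by_airline` and its output column names are hypothetical and chosen for this example.

```python
import pandas as pd


def summarize_by_airline(flights: pd.DataFrame) -> pd.DataFrame:
    """Mean, median and flight count of departure delay per airline (hypothetical helper)."""
    return (
        flights.groupby('Airline')['DepDelayMinutes']
        # Named aggregation: each keyword becomes an output column.
        .agg(mean_delay='mean', median_delay='median', n_flights='count')
        .sort_values('mean_delay', ascending=False)
    )
```

If `summarize_by_airline(df)` shows a large gap between `mean_delay` and `median_delay` for an airline, a few extreme delays are probably driving its position in the bar chart.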

In [None]:
# Delays by hour of day (assumes ScheduledDepTime holds 'HH:MM' strings)
df['Hour'] = pd.to_datetime(df['ScheduledDepTime'], format='%H:%M').dt.hour
hourly_delay = df.groupby('Hour')['DepDelayMinutes'].mean()
plt.figure(figsize=(10, 5))
hourly_delay.plot(kind='line', marker='o')
plt.title('Average Departure Delay by Hour of Day')
plt.ylabel('Delay (minutes)')
plt.xlabel('Hour')
plt.grid(True)
plt.tight_layout()
plt.show()
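
Average delay is one view of the hourly pattern; the share of flights that leave late is another. The sketch below assumes a flight counts as delayed at 15 minutes or more, a common convention in on-time reporting but an assumption for this dataset. `DELAY_THRESHOLD_MIN` and `delayed_share_by_hour` are hypothetical names.

```python
import pandas as pd

DELAY_THRESHOLD_MIN = 15  # assumption: minutes late before a flight counts as delayed


def delayed_share_by_hour(flights: pd.DataFrame,
                          threshold: float = DELAY_THRESHOLD_MIN) -> pd.Series:
    """Fraction of flights delayed by at least `threshold` minutes, per scheduled hour."""
    # Same parsing as the cell above: 'HH:MM' strings to an integer hour.
    hours = (
        pd.to_datetime(flights['ScheduledDepTime'], format='%H:%M')
        .dt.hour.rename('Hour')
    )
    is_delayed = flights['DepDelayMinutes'] >= threshold
    # The mean of a boolean Series is the fraction of True values.
    return is_delayed.groupby(hours).mean().rename('delayed_share')
```

The result can be plotted the same way as `hourly_delay`, for example with `delayed_share_by_hour(df).plot(kind='line', marker='o')`.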

## 5. Key Insights

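- Average departure delay differs across airlines; the bar chart ranks them from highest to lowest mean delay.
- Average delay rises during certain scheduled departure hours, visible as peaks in the line chart.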
- Further analysis could include destination-based trends or seasonal effects.
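
As a starting point for the seasonal follow-up mentioned above, delays could be averaged by calendar month using the existing `FlightDate` column. This is a rough sketch: `monthly_delay` is a hypothetical helper, and it is only informative if the sample spans several months, which the notebook does not confirm.

```python
import pandas as pd


def monthly_delay(flights: pd.DataFrame) -> pd.Series:
    """Mean departure delay per calendar month (1-12) (hypothetical helper)."""
    # to_datetime is a no-op if FlightDate was already converted during cleaning.
    months = pd.to_datetime(flights['FlightDate']).dt.month.rename('Month')
    return flights['DepDelayMinutes'].groupby(months).mean()
```

A destination-based version would follow the same pattern with a destination column as the grouping key, provided the dataset includes one.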

## 6. Conclusion

This project shows how departure delays vary by airline and by time of day. It demonstrates data wrangling, visualization, and insight derivation in Python with pandas, Matplotlib, and seaborn.