# Airline Arrival Delay Analysis

This notebook analyzes arrival delay data for Alaska Airlines and AM WEST across five destinations: Los Angeles, Phoenix, San Diego, San Francisco, and Seattle.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Load data from CSV
df = pd.read_csv('project1.csv')
print("Raw data:")
print(df)

In [None]:
destinations = ['Los Angeles', 'Phoenix', 'San Diego', 'San Francisco', 'Seattle']

# Extracting on-time and delayed flight counts for each airline
alaska_ontime  = df[(df['Airline'] == 'Alaska')  & (df['Status'] == 'on time')][destinations].values[0]
alaska_delayed = df[(df['Airline'] == 'Alaska')  & (df['Status'] == 'delayed')][destinations].values[0]

amwest_ontime  = df[(df['Airline'] == 'AM WEST') & (df['Status'] == 'on time')][destinations].values[0]
amwest_delayed = df[(df['Airline'] == 'AM WEST') & (df['Status'] == 'delayed')][destinations].values[0]

# Delay rate = delayed / total flights
alaska_rate = alaska_delayed / (alaska_ontime + alaska_delayed)
amwest_rate = amwest_delayed / (amwest_ontime + amwest_delayed)

summary = pd.DataFrame({
    'Destination':     destinations,
    'Alaska Delay %':  (alaska_rate * 100).round(1),
    'AM WEST Delay %': (amwest_rate * 100).round(1)
})
print("\nDelay rates by destination:")
print(summary.to_string(index=False))
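
The four selections above repeat the same filter and rely on `.values[0]`, which fails with a bare `IndexError` if a row is missing. One way to make that step more defensive is sketched below. It assumes the layout the code above implies: one row per airline/status pair, with `Airline`, `Status`, and one column per destination. The helper name `flight_counts` and the `demo` frame are hypothetical, and the counts are made up for illustration; they are not the contents of `project1.csv`.

```python
import pandas as pd

def flight_counts(df, airline, status, destinations):
    """Return per-destination counts for one airline/status row as a float array."""
    mask = (df['Airline'] == airline) & (df['Status'] == status)
    rows = df.loc[mask, destinations]
    # Fail loudly if the row is missing or duplicated instead of silently taking row 0
    if len(rows) != 1:
        raise ValueError(
            f"expected exactly one row for {airline!r}/{status!r}, found {len(rows)}"
        )
    return rows.iloc[0].to_numpy(dtype=float)

# Made-up counts for illustration only (hypothetical airline 'A', two cities)
demo = pd.DataFrame({
    'Airline': ['A', 'A'],
    'Status':  ['on time', 'delayed'],
    'City X':  [90, 10],
    'City Y':  [40, 10],
})
cities = ['City X', 'City Y']

ontime  = flight_counts(demo, 'A', 'on time', cities)
delayed = flight_counts(demo, 'A', 'delayed', cities)

# Delay rate = delayed / total flights, per city
rates = delayed / (ontime + delayed)
print((rates * 100).round(1))
```

With a helper like this, each of the four arrays in the cell above could be built with a single call, and a misspelled airline or status value would surface as a readable error.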

In [None]:
# Overall delay rate across all destinations combined
alaska_overall = (alaska_delayed.sum() / (alaska_ontime + alaska_delayed).sum()) * 100
amwest_overall = (amwest_delayed.sum() / (amwest_ontime + amwest_delayed).sum()) * 100

print(f'Alaska overall delay rate:  {alaska_overall:.1f}%')
print(f'AM WEST overall delay rate: {amwest_overall:.1f}%')

In [None]:
# Grouped bar chart comparing delay rates per destination
x = range(len(destinations))
width = 0.35

fig, ax = plt.subplots(figsize=(10, 6))
ax.bar([i - width/2 for i in x], summary['Alaska Delay %'],  width, label='Alaska',  color='steelblue')
ax.bar([i + width/2 for i in x], summary['AM WEST Delay %'], width, label='AM WEST', color='salmon')

ax.set_xlabel('Destination')
ax.set_ylabel('Delay Rate (%)')
ax.set_title('Arrival Delay Rate by Airline and Destination')
ax.set_xticks(list(x))
ax.set_xticklabels(destinations)
ax.legend()
ax.grid(axis='y', linestyle='--', alpha=0.5)

plt.tight_layout()
plt.savefig('delay_chart.png', dpi=150)
plt.show()

## Conclusions

After analyzing the data:

1. AM WEST has the lower overall delay rate (~10.9%) compared to Alaska Airlines (~13.3%).

2. At every individual destination, Alaska has a lower delay rate than AM WEST.

3. The overall ranking reverses the per-city ranking because AM WEST flies a very large number of flights through Phoenix (its hub), where delays are relatively rare. This large volume of on-time Phoenix flights pulls AM WEST's overall rate down, even though it performs worse at every city individually.

4. Aggregate numbers can be misleading without the per-destination context.
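
The mechanism in point 3 can be checked numerically: an airline's overall delay rate is the flight-weighted average of its per-destination rates, so a carrier that concentrates its flights where delays are rare can come out ahead overall. Below is a minimal sketch with made-up counts for two hypothetical carriers (`carrier_a`, `carrier_b`) and two cities. These are not the values in `project1.csv`.

```python
import numpy as np

def overall_rate(delayed, total):
    """Overall delay rate computed as the flight-weighted average of per-city rates."""
    per_city = delayed / total       # delay rate in each city
    weights = total / total.sum()    # share of the carrier's flights in each city
    return float((per_city * weights).sum())

# Made-up counts for illustration only, ordered as [hub city, other city]
a_delayed, a_total = np.array([10, 60]),  np.array([100, 300])   # carrier_a: few hub flights
b_delayed, b_total = np.array([120, 25]), np.array([1000, 100])  # carrier_b: mostly hub flights

a_overall = overall_rate(a_delayed, a_total)
b_overall = overall_rate(b_delayed, b_total)

print("carrier_a per-city rates:", a_delayed / a_total)
print("carrier_b per-city rates:", b_delayed / b_total)
print(f"carrier_a overall: {a_overall:.1%}")
print(f"carrier_b overall: {b_overall:.1%}")
```

In this toy example `carrier_b` has the higher delay rate in both cities yet the lower overall rate, because most of its flights land in the low-delay hub. This pattern is known as Simpson's paradox.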