# Project Title: Flight Accidents Analysis

## [TODO] Team Members with uniqname
* Yufeng Song (yfsong)
* Ziqi Wang ()
* Muyu Lin (mlin567)

## Overview
Our project analyzes historical flight accident data to determine trends in aviation safety and identify key factors contributing to accidents. We aim to assess whether the perceived increase in plane accidents this year is due to actual incidents or amplified media coverage. By uncovering patterns, we can help improve aviation safety and inform future risk mitigation strategies.

## Motivation
### Why this topic?
We selected this topic because there is a growing public perception that plane accidents have increased since the end of 2024. However, we are uncertain whether this perception is driven by an actual rise in incidents or by increased media attention. By exploring the underlying causes of flight accidents, we hope to provide data-driven insights that clarify trends in aviation safety and contribute to discussions on how to enhance airline security and risk management.

### 3 Key Questions and What We Hope to Learn
1. What are the most common causes of flight accidents, and how have these causes changed over time?
    * By analyzing historical data, we aim to determine whether certain factors (e.g., mechanical failure, weather conditions, human error) have become more or less frequent contributors to flight accidents. This could help identify emerging risks and areas for improvement in aviation safety.
2. What are the most common factors contributing to flight accidents?
    * By analyzing historical accident data, we aim to identify the leading causes, such as weather conditions, mechanical failures, human error, or operational inefficiencies.
3. Is there a correlation between media coverage and public perception of aviation safety?
    * We seek to understand whether heightened media reporting on aviation incidents correlates with increased public fear, regardless of actual accident trends.

## [TODO] Data Sources & Description
* https://asn.flightsafety.org/database/
* https://www.kaggle.com/datasets/ahmadrafiee/airports-airlines-planes-and-routes-update-2024/data?select=routes.csv

Explain how the two (or more) datasets complement each other

For each data source, list the variables of interest, the size of the data sets, missing values, etc.

## Data Manipulation
This is where you merge your data sets, as well as create new columns (if appropriate)
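
One possible way to link the two sources is through airport codes. The sketch below is only an illustration under assumptions: the `routes` frame and its column names (`source_airport`, `destination_airport`) are hypothetical stand-ins, not the confirmed schema of `routes.csv`, and the accident rows are a hand-copied sample. It pulls the IATA code out of the `(IATA/ICAO)` suffix in the ASN airport columns and attaches a per-airport route count.

```python
import pandas as pd

# Tiny stand-in for the ASN accident table; values copied from sample rows.
accidents = pd.DataFrame({
    "Departure.airport": [
        "Moskva-Sheremetyevo Airport (SVO/UUEE)",
        "Bunia Airport (BUX/FZKA)",
        "Chicago-Executive Airport, IL (PWK/KPWK)",
    ],
    "Destination.airport": [
        "Kharkov Airport (HRK/UKHH)",
        "Aru Airstrip",  # no code in the source string
        "Minocqua-Noble F. Lee Airport, WI (ARV/KARV)",
    ],
})

# Hypothetical routes table: column names and rows are assumptions for illustration.
routes = pd.DataFrame({
    "source_airport": ["SVO", "SVO", "PWK"],
    "destination_airport": ["HRK", "LED", "ARV"],
})

# Capture the three-character IATA code from "(IATA/ICAO)"; no match gives NaN.
iata_pattern = r"\(([A-Z0-9]{3})/[A-Z0-9]{4}\)"
accidents["dep_iata"] = accidents["Departure.airport"].str.extract(
    iata_pattern, expand=False
)
accidents["dest_iata"] = accidents["Destination.airport"].str.extract(
    iata_pattern, expand=False
)

# Count listed routes leaving each airport, then attach the count to each accident.
route_counts = (
    routes.groupby("source_airport").size().rename("n_routes_from_dep").reset_index()
)
merged = accidents.merge(
    route_counts, how="left", left_on="dep_iata", right_on="source_airport"
).drop(columns="source_airport")
```

A left merge keeps every accident row even when the departure airport has no match in the routes table, so unmatched rows can be inspected rather than silently dropped.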

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [3]:
pd.set_option('display.max_columns', None)

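In [4]:
import glob

# Locate every ASN accident-detail export in the data folder.
path = "data/ASN/"
file_pattern = f"{path}accident_details_*.csv"

csv_files = glob.glob(file_pattern)

# Read each file and stack them into one DataFrame with a fresh 0..n-1 index.
dfs = [pd.read_csv(file) for file in csv_files]
df = pd.concat(dfs, ignore_index=True)
df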

Unnamed: 0,Time,Type,Date,Owner.operator,Registration,MSN,Year.of.manufacture,Fatalities,Other.fatalities,Aircraft.damage,Category,Location,Phase,Nature,Departure.airport,Destination.airport,Confidence.Rating,Narrative,Engine.model,Investigating.agency,detail_link
0,11:16,British Aerospace BAe-125-700A,Monday 2 January 2006,Avcom,P4-AOD,257153,1981,3 / Occupants: 3,0,"Destroyed, written off",Accident,"4,3 km W of Kharkov Airport (HRK) - Ukraine",Approach,Ferry/positioning,Moskva-Sheremetyevo Airport (SVO/UUEE),Kharkov Airport (HRK/UKHH),"Information is only available from news, socia...",The BAe-125-700A corporate jet was being repos...,,,https://aviation-safety.net/database/record.ph...
1,12:40,Antonov An-26,Tuesday 3 January 2006,Ruwenzori Air Asala,,,,0 / Occupants:,0,Substantial,Accident,Fataki - Congo (Democratic Republic),En route,Unknown,Bunia Airport (BUX/FZKA),Aru Airstrip,,The Antonov An-26 was operating on a flight fr...,Ivchenko AI-24,,https://aviation-safety.net/database/record.ph...
2,08:00,Cessna 560 Citation Ultra,Thursday 5 January 2006,NetJets,N391QS,560-0493,1985Total airframe hrs:6276 hours,0 / Occupants: 7,0,"Substantial, repaired",Accident,"Minocqua-Noble F. Lee Airport, WI (ARV) - Un...",Landing,Passenger - Non-Scheduled/charter/Air Taxi,"Chicago-Executive Airport, IL (PWK/KPWK)","Minocqua-Noble F. Lee Airport, WI (ARV/KARV)",Accident investigation report completed and in...,"A Cessna 560 Citation Ultra, N391QS, sustained...",Pratt & Whitney Canada JT15D,NTSB,https://aviation-safety.net/database/record.ph...
3,02:00,Beechcraft A100 King Air,Thursday 5 January 2006,North Country Aviation,N700NC,B-138,1972Total airframe hrs:13033 hours,0 / Occupants: 3,0,"Destroyed, written off",Accident,"Sault Ste Marie Municipal-Sanderson Airport, M...",Landing,Ambulance,"Traverse City-Cherry Capital Airport, MI (TVC/...","Sault Ste Marie Municipal-Sanderson Airport, M...",Accident investigation report completed and in...,"A Beech A100, N700NC, operated as an emergency...",Pratt & Whitney PT6-28,NTSB,https://aviation-safety.net/database/record.ph...
4,17:04,Douglas C-54G (DC-4),Thursday 5 January 2006,Buffalo Airways,C-GXKN,36090,1946,0 / Occupants: 4,0,"Substantial, written off",Accident,"Norman Wells Airport, NT (YVQ) - Canada",En route,Cargo,"Norman Wells Airport, NT (YVQ/CYVQ)","Yellowknife Airport, NT (YZF/CYZF)",Accident investigation report completed and in...,The C-54 departed Norman Wells at 16:52. Six m...,Pratt & Whitney R-2000-7M2,TSB,https://aviation-safety.net/database/record.ph...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4123,01:30 LT,Boeing 767-224,Monday 22 April 2002,Continental Airlines,N68160,30439/851,2001Total airframe hrs:2076 hours,0 / Occupants: 122,0,,Accident,"75 miles N of San Juan, Puerto Rico - Atlant...",En route,Passenger - Scheduled,"São Paulo-Guarulhos International Airport, SP...","Newark-Liberty International Airport, NJ (EWR/...",Accident investigation report completed and in...,The airplane was in cruise flight at flight le...,General Electric CF6-80C2B4F,NTSB,https://aviation-safety.net/wikibase/309637
4124,14:36 LT,Cessna 560 Citation V,Thursday 25 April 2002,Executive Jet Managemant,N560RP,560-0158Total airframe hrs:3927 hours,,0 / Occupants: 2,0,Substantial,Accident,"Lake In The Hil, Illinois - United States of...",Unknown,Ferry/positioning,"San Angelo-Mathis Field, TX (SJT/KSJT)","LAKE IN THE HIL, IL (3CK)",Accident investigation report completed and in...,The on-demand air-taxi positioning flight sust...,Pratt & Whitney JT15D-5A,NTSB,https://aviation-safety.net/wikibase/297366
4125,,Boeing 707-366C,Friday 26 April 2002,Hewa Bora Airways,9Q-CKB,19844/744,1973,0 / Occupants: 3,0,"Destroyed, written off",Accident,Kinshasa-N'Djili Airport (FIH) - Congo (Demo...,Landing,Cargo,Johannesburg International Airport (JNB/FAJS),Kinshasa-N'Djili Airport (FIH/FZAA),Information verified through data from acciden...,Weather at Kinshasa-NDjili Airport was stormy ...,Pratt & Whitney JT3D-7,,https://aviation-safety.net/database/record.ph...
4126,03:00,McDonnell Douglas DC-10-40F,Saturday 27 April 2002,Centurion Air Cargo,N141WE,46661/224,1976,0 / Occupants: 5,0,"Substantial, repaired",Accident,San Salvador-Comalapa International Airport (S...,Take off,Cargo,San Salvador-Comalapa International Airport (S...,Guatemala City-La Aurora Airport (GUA/MGGT),Information verified through data from acciden...,"At rotation, the flightcrew heard a sound desc...",,,https://aviation-safety.net/database/record.ph...
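
Several columns in the output above pack more than one value into a string: `Fatalities` looks like `3 / Occupants: 3`, `Year.of.manufacture` sometimes carries a `Total airframe hrs:` suffix, `Date` includes the weekday, and `Time` occasionally ends in `LT`. A minimal sketch of how new, cleaner columns might be derived is shown below. The new column names (`date`, `n_fatalities`, `n_occupants`, `manufacture_year`, `time_clean`) are our own choices, and the regular expressions assume the formats seen in these sample rows hold for the full data.

```python
import pandas as pd

# Small sample copied from the rows shown above.
sample = pd.DataFrame({
    "Time": ["11:16", "01:30 LT", None],
    "Date": ["Monday 2 January 2006", "Monday 22 April 2002", "Friday 26 April 2002"],
    "Fatalities": ["3 / Occupants: 3", "0 / Occupants: 122", "0 / Occupants:"],
    "Year.of.manufacture": ["1981", "2001Total airframe hrs:2076 hours", None],
})

# Parse "Weekday D Month YYYY"; anything that does not fit becomes NaT.
sample["date"] = pd.to_datetime(sample["Date"], format="%A %d %B %Y", errors="coerce")

# Split "F / Occupants: N" into two numeric columns; a missing N becomes NaN.
pattern = r"(?P<n_fatalities>\d+)\s*/\s*Occupants:\s*(?P<n_occupants>\d+)?"
counts = sample["Fatalities"].str.extract(pattern).apply(pd.to_numeric, errors="coerce")
sample = sample.join(counts)

# Keep only the leading four-digit year, dropping any airframe-hours suffix.
sample["manufacture_year"] = pd.to_numeric(
    sample["Year.of.manufacture"].str.extract(r"^(\d{4})", expand=False),
    errors="coerce",
)

# Remove the trailing " LT" (local time) marker so times share one format.
sample["time_clean"] = sample["Time"].str.replace(" LT", "", regex=False)
```

Using `errors="coerce"` turns unexpected values into missing values instead of raising, which makes it easy to count how many rows fail each parsing rule before deciding how to handle them.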


## Data Visualization
Produce 3 or more plots that help describe your data. Choose appropriate visualizations.
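
As one candidate plot, a bar chart of accidents per year would speak directly to the question of whether incidents are actually rising. The sketch below is illustrative only: it uses a handful of `Date` strings copied from the sample rows rather than the full dataset, and the variable names are our own.

```python
import pandas as pd
import matplotlib.pyplot as plt

# A few Date values copied from the sample rows; the real plot would use df["Date"].
dates = pd.Series([
    "Monday 2 January 2006",
    "Tuesday 3 January 2006",
    "Thursday 5 January 2006",
    "Monday 22 April 2002",
    "Thursday 25 April 2002",
])

# Parse the dates, then count records per calendar year in chronological order.
parsed = pd.to_datetime(dates, format="%A %d %B %Y", errors="coerce")
per_year = parsed.dt.year.value_counts().sort_index()

# Draw one bar per year.
fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(per_year.index.astype(str), per_year.values)
ax.set_xlabel("Year")
ax.set_ylabel("Number of recorded accidents")
ax.set_title("Recorded accidents per year (sample rows only)")
fig.tight_layout()
```

Raw counts depend on how completely each year is covered by the database, so a rate (for example, accidents per number of flights) would be a fairer comparison if a traffic denominator is available.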
