## Aviation Safety Risk Analysis: Identifying Low-Risk Aircraft Models

### Overview
This project evaluates civil aviation accident data from the National Transportation Safety Board (NTSB), covering incidents between 1948 and 2022. The purpose is to provide data-driven insights into aircraft safety, focusing on risk levels associated with different aircraft types and operational conditions. The final outcome is a set of actionable recommendations to guide business stakeholders in selecting aircraft models with the lowest risk profiles for potential acquisition and operation.

### Business Understanding
The company intends to expand operations into the aviation industry, with an interest in both commercial and private aircraft. Entering this market involves significant financial and reputational risks, particularly if aircraft selected for acquisition are associated with high accident or fatality rates. The key business need is to identify aircraft models and operational factors that demonstrate the lowest risk. The findings will support informed decision-making for the head of the aviation division, whose objective is to minimize risk exposure while ensuring operational safety and long-term viability.

### Objectives
* Identify aircraft models with the lowest accident and fatality rates.
* Determine the operational or environmental factors (such as weather conditions, phase of flight, or purpose of flight) that most strongly influence accident severity.
* Analyze historical trends to assess whether safety outcomes have improved over time, particularly with newer aircraft or changing regulations.
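
One way the first objective might be approached, sketched on a few made-up rows: group records by `Make` and `Model` (both columns exist in the dataset), then compute the event count and the share of events with at least one fatality. The sample rows, the `summarize_models` helper, and the `min_events` threshold are illustrative assumptions, not part of the project's analysis. The dataset has no exposure measure such as flight hours, so these figures are shares of recorded events rather than true accident rates.

```python
import pandas as pd

# Made-up rows for illustration only; column names follow the NTSB dataset.
sample = pd.DataFrame({
    "Make": ["Cessna", "Cessna", "Cessna", "Piper", "Piper"],
    "Model": ["172", "172", "172", "PA-28", "PA-28"],
    "Total.Fatal.Injuries": [0.0, 2.0, None, 1.0, 0.0],
})

def summarize_models(df, min_events=2):
    """Hypothetical helper: per-model event count and fatal-event share."""
    out = df.assign(
        # Missing fatality counts are treated as zero here, which is an assumption.
        fatalities=df["Total.Fatal.Injuries"].fillna(0),
    )
    out["is_fatal"] = out["fatalities"] > 0
    summary = (
        out.groupby(["Make", "Model"])
        .agg(events=("is_fatal", "size"),
             fatal_events=("is_fatal", "sum"),
             total_fatalities=("fatalities", "sum"))
        .reset_index()
    )
    summary["fatal_share"] = summary["fatal_events"] / summary["events"]
    # Drop models with too few records to support a comparison.
    summary = summary[summary["events"] >= min_events]
    return summary.sort_values("fatal_share").reset_index(drop=True)

model_summary = summarize_models(sample)
print(model_summary)
```

Models with only a handful of records can look misleadingly safe or dangerous, which is why the sketch filters on a minimum event count before ranking.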

### 1. Data Loading

In [3]:
# Import Required Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [5]:
# Load the Dataset
df = pd.read_csv('./Aviation_Data.csv')
df.head()

| | Event.Id | Investigation.Type | Accident.Number | Event.Date | Location | Country | Latitude | Longitude | Airport.Code | Airport.Name | ... | Purpose.of.flight | Air.carrier | Total.Fatal.Injuries | Total.Serious.Injuries | Total.Minor.Injuries | Total.Uninjured | Weather.Condition | Broad.phase.of.flight | Report.Status | Publication.Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20001218X45444 | Accident | SEA87LA080 | 1948-10-24 | MOOSE CREEK, ID | United States | NaN | NaN | NaN | NaN | ... | Personal | NaN | 2.0 | 0.0 | 0.0 | 0.0 | UNK | Cruise | Probable Cause | NaN |
| 1 | 20001218X45447 | Accident | LAX94LA336 | 1962-07-19 | BRIDGEPORT, CA | United States | NaN | NaN | NaN | NaN | ... | Personal | NaN | 4.0 | 0.0 | 0.0 | 0.0 | UNK | Unknown | Probable Cause | 19-09-1996 |
| 2 | 20061025X01555 | Accident | NYC07LA005 | 1974-08-30 | Saltville, VA | United States | 36.9222 | -81.8781 | NaN | NaN | ... | Personal | NaN | 3.0 | NaN | NaN | NaN | IMC | Cruise | Probable Cause | 26-02-2007 |
| 3 | 20001218X45448 | Accident | LAX96LA321 | 1977-06-19 | EUREKA, CA | United States | NaN | NaN | NaN | NaN | ... | Personal | NaN | 2.0 | 0.0 | 0.0 | 0.0 | IMC | Cruise | Probable Cause | 12-09-2000 |
| 4 | 20041105X01764 | Accident | CHI79FA064 | 1979-08-02 | Canton, OH | United States | NaN | NaN | NaN | NaN | ... | Personal | NaN | 1.0 | 2.0 | NaN | 0.0 | VMC | Approach | Probable Cause | 16-04-1980 |
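
The preview shows `Event.Date` in ISO format (`1948-10-24`) and `Publication.Date` in day-month-year format (`19-09-1996`), so both are likely to load as plain strings. Below is a minimal sketch of converting them for later trend analysis, using three rows copied from the preview and trimmed to a few columns. The in-memory CSV stands in for `Aviation_Data.csv`, and it is an assumption that every row in the full file follows these formats; `errors="coerce"` turns anything that does not match into `NaT`.

```python
import io
import pandas as pd

# Three rows from the preview above, trimmed to a few columns for illustration.
csv_text = """Event.Id,Event.Date,Total.Fatal.Injuries,Publication.Date
20001218X45444,1948-10-24,2.0,
20001218X45447,1962-07-19,4.0,19-09-1996
20061025X01555,1974-08-30,3.0,26-02-2007
"""
dates = pd.read_csv(io.StringIO(csv_text))

# Event.Date appears to be ISO formatted (YYYY-MM-DD).
dates["Event.Date"] = pd.to_datetime(
    dates["Event.Date"], format="%Y-%m-%d", errors="coerce"
)
# Publication.Date appears to be day-month-year.
dates["Publication.Date"] = pd.to_datetime(
    dates["Publication.Date"], format="%d-%m-%Y", errors="coerce"
)
# A year column makes grouping by period straightforward.
dates["Event.Year"] = dates["Event.Date"].dt.year
print(dates.dtypes)
```

Counting the `NaT` values after conversion is a quick way to check whether the assumed formats hold across the full file.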


In [6]:
# Shape of the dataset
df.shape

(90348, 31)

In [9]:
# Column names
df.columns

Index(['Event.Id', 'Investigation.Type', 'Accident.Number', 'Event.Date',
       'Location', 'Country', 'Latitude', 'Longitude', 'Airport.Code',
       'Airport.Name', 'Injury.Severity', 'Aircraft.damage',
       'Aircraft.Category', 'Registration.Number', 'Make', 'Model',
       'Amateur.Built', 'Number.of.Engines', 'Engine.Type', 'FAR.Description',
       'Schedule', 'Purpose.of.flight', 'Air.carrier', 'Total.Fatal.Injuries',
       'Total.Serious.Injuries', 'Total.Minor.Injuries', 'Total.Uninjured',
       'Weather.Condition', 'Broad.phase.of.flight', 'Report.Status',
       'Publication.Date'],
      dtype='object')
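
With 31 columns and many empty cells visible in the preview, a natural next check is how much of each column is missing. Below is a minimal sketch on a few made-up rows; the `missing_share` helper and the sample values are illustrative, and only the column names come from the dataset. It also shows one possible way to convert the dotted column names to snake_case, which is a convenience choice rather than something the dataset requires.

```python
import numpy as np
import pandas as pd

# Made-up rows; only the column names come from the dataset.
sample = pd.DataFrame({
    "Event.Id": ["A1", "A2", "A3", "A4"],
    "Latitude": [np.nan, np.nan, 36.9222, np.nan],
    "Weather.Condition": ["UNK", "IMC", "VMC", None],
})

def missing_share(df):
    """Hypothetical helper: fraction of missing values per column, highest first."""
    return df.isna().mean().sort_values(ascending=False)

print(missing_share(sample))

# Optional: dotted names such as 'Weather.Condition' become 'weather_condition'.
cleaned = sample.copy()
cleaned.columns = cleaned.columns.str.lower().str.replace(".", "_", regex=False)
print(list(cleaned.columns))
```

Passing `regex=False` matters here: with regex matching turned on, `.` would match every character rather than a literal dot.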