# PHASE 1 PROJECT - Aviation Risk Analysis & Recommendations

---

## Introduction

As the company seeks to diversify its portfolio by expanding into commercial and private aviation, it is important to have a deep understanding of operational risks. Using accident data from the National Transportation Safety Board (NTSB), this project evaluates the safety records of various aircraft manufacturers and models from 1962 to 2023. I will explore the data to determine the lowest-risk options available. The findings will inform actionable recommendations to stakeholders on acquiring a safe and reliable fleet.

## Business Understanding

### The Business Problem
While the company has capital ready for investment, the new aviation division lacks historical knowledge of the operational risks associated with different aircraft manufacturers and models. Purchasing aircraft without a clear understanding of their safety profiles poses significant risks.

### Objectives
The objective of this analysis is to:
1. Identify the top 3 low-risk aircraft categories and manufacturers.
2. Understand what factors increase accident severity.
3. Recommend where the company should invest.

This project directly supports strategic growth by giving leadership a clear, evidence-based foundation for selecting aircraft that offer the best balance of safety, reliability, and long-term operational value.

### Goals
To achieve the business objectives, we will perform the following data science tasks using the NTSB dataset:
- **Quantify risk:** Create measurable risk metrics from **Total.Fatal.Injuries** and **Aircraft.damage**.
- **Trend analysis:** Analyze safety performance over time (**Event.Date**) so that recommendations rest on relevant, modern data rather than outdated historical trends.
- **Comparative analysis:** Compare **Make** and **Model** performance against average incident rates to identify outliers (both good and bad).
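
As a minimal sketch of the first and third tasks, the snippet below derives two per-event flags and averages them by manufacturer. The rows are invented toy data (not NTSB records), the names `is_fatal`, `is_destroyed`, and `risk_by_make` are illustrative, and treating a missing fatality count as zero is an assumption that should be revisited during cleaning.

```python
import pandas as pd

# Toy rows invented for illustration; they are NOT real NTSB records.
toy = pd.DataFrame({
    "Make": ["Alpha", "Alpha", "Beta", "Beta", "Beta"],
    "Total.Fatal.Injuries": [0, 2, 0, 0, 1],
    "Aircraft.damage": ["Minor", "Destroyed", "Substantial", "Minor", "Destroyed"],
})

# Flag each event: did anyone die, and was the aircraft destroyed?
# Assumption: a missing fatality count is treated as zero.
toy["is_fatal"] = toy["Total.Fatal.Injuries"].fillna(0) > 0
toy["is_destroyed"] = toy["Aircraft.damage"].eq("Destroyed")

# Share of each make's recorded events that were fatal or destroyed the aircraft
risk_by_make = toy.groupby("Make").agg(
    events=("is_fatal", "size"),
    fatal_rate=("is_fatal", "mean"),
    destroyed_rate=("is_destroyed", "mean"),
)
print(risk_by_make)
```

Because the dataset only contains accidents and incidents, these rates describe how severe a recorded event tends to be, not how likely an aircraft is to have an event in the first place.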

### Key Stakeholders
1. **Head of Aviation Division:** Primary consumer of the insights; responsible for the final purchasing decision.
2. **Investment Committee:** Responsible for the financial implications of "Hull Loss" and asset durability.
3. **Operations Team:** Interested in reliability and the potential for non-fatal incidents that cause downtime.

### Business Impact
Effective use of these insights will:
- Reduce financial risk through the purchase of safer aircraft
- Improve passenger and crew safety
- Support compliance with aviation safety standards
- Strengthen long-term operational sustainability
- Enhance the company's reputation as a safety-first aviation operator

---

## Business Requirements

To support leadership in making informed decisions about entering the aviation market, the following business requirements define the analysis.

## Data Strategy

The relevant columns include:
- Make
- Model
- Aircraft.Category
- Broad.phase.of.flight
- Weather.Condition
- Number.of.Engines
- Total.Fatal.Injuries
- Total.Serious.Injuries
- Total.Minor.Injuries
- Total.Uninjured
- Aircraft.damage
- Purpose.of.flight
- Event.Date
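
Column names in the loaded file are case-sensitive, so it can help to check a candidate list against the actual header before subsetting. The sketch below uses a hypothetical empty frame named `toy` whose header mimics a few of the columns; `wanted`, `present`, and `missing` are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for the loaded DataFrame: only the header matters here.
toy = pd.DataFrame(
    columns=["Make", "Model", "Event.Date", "Total.Uninjured", "Broad.phase.of.flight"]
)

# Candidate names, two of them deliberately spelled differently from the header
wanted = ["Make", "Model", "Event.Date", "Total_Uninjured", "Broad.Phase.of.Flight"]

# Split the candidates into names that exist and names that do not
present = [c for c in wanted if c in toy.columns]
missing = [c for c in wanted if c not in toy.columns]

# Subset only on names that exist, and report the rest for correction
subset = toy[present]
print("missing from header:", missing)
```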

---

#### Identify Low-Risk Aircraft for Purchase
The business needs clear, data-driven guidance on which aircraft categories and manufacturers have the safest historical performance. The analysis provides this by:
1. Comparing accident frequency across aircraft categories (e.g., single-engine, multi-engine, helicopters).
2. Measuring accident severity (fatal, serious, minor).
3. Assessing aircraft damage outcomes (destroyed, substantial, minor).
4. Determining which aircraft types have consistently low fatality and damage rates.
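
One possible way to sketch steps 1 and 3 is a frequency count per category plus a row-normalized cross-tabulation of damage outcomes. The rows below are invented toy data, and `counts` and `damage_share` are illustrative names rather than part of the original analysis.

```python
import pandas as pd

# Toy rows invented for illustration; they are NOT real NTSB records.
toy = pd.DataFrame({
    "Aircraft.Category": ["Airplane", "Airplane", "Airplane", "Helicopter", "Helicopter"],
    "Aircraft.damage": ["Substantial", "Destroyed", "Substantial", "Destroyed", "Minor"],
})

# Step 1: number of recorded events per aircraft category
counts = toy["Aircraft.Category"].value_counts()

# Step 3: within each category, the share of events at each damage level
damage_share = pd.crosstab(
    toy["Aircraft.Category"], toy["Aircraft.damage"], normalize="index"
)

print(counts)
print(damage_share)
```

Raw counts favor rare categories simply because fewer of them fly, so the normalized shares are usually the fairer comparison between categories.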

#### Evaluate Key Risk Factors That Impact Safety

Leadership needs insight into which conditions increase accident severity. The analysis addresses this by:

- Determining how weather conditions influence accidents.
- Analyzing safety differences between **commercial, corporate, and personal flights**.
- Examining which phases of flight **(takeoff, landing, cruise)** have the highest risk.
- Evaluating whether certain aircraft features **(engine count, aircraft category)** correlate with higher severity.
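
The comparisons above share one pattern: group events by a factor and compare the share that were fatal. A minimal sketch follows, using invented toy rows and a hypothetical helper `fatal_rate_by`; the same call could be pointed at `Purpose.of.flight` or `Number.of.Engines`.

```python
import pandas as pd

# Toy rows invented for illustration; they are NOT real NTSB records.
toy = pd.DataFrame({
    "Weather.Condition": ["VMC", "VMC", "IMC", "IMC", "UNK"],
    "Broad.phase.of.flight": ["Landing", "Cruise", "Cruise", "Approach", "Cruise"],
    "Total.Fatal.Injuries": [0, 0, 3, 1, 2],
})
toy["is_fatal"] = toy["Total.Fatal.Injuries"] > 0


def fatal_rate_by(frame, column):
    """Return event count and fatal-event share for each value of `column`."""
    return (
        frame.groupby(column)["is_fatal"]
        .agg(events="size", fatal_rate="mean")
        .sort_values("fatal_rate", ascending=False)
    )


by_weather = fatal_rate_by(toy, "Weather.Condition")
by_phase = fatal_rate_by(toy, "Broad.phase.of.flight")
print(by_weather)
print(by_phase)
```

Keeping the `events` column next to the rate makes it easier to spot groups whose rate rests on only a handful of records.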

---

## Data Understanding
#### Data Source
The dataset for this analysis is sourced from the **National Transportation Safety Board (NTSB)** and covers aviation accidents and incidents involving aircraft in the United States and international waters from 1962 to 2023.
This data serves as a reliable source for determining aircraft safety records.

#### Data Schema
The dataset includes tens of thousands of records, where each row represents a single aviation accident or incident.
Each record contains information about:
- The aircraft (type, category, manufacturer, model)
- The event (date, location, purpose of flight)
- The environment (weather, light conditions)
- The outcome (injuries, fatalities, damage level)

This makes the dataset comprehensive enough to evaluate both accident frequency and accident severity across aircraft types.


#### Data Quality
##### Missing Values
- Some older records (1960s–1980s) may lack detail.
- Certain columns may have missing values (None) that need to be filled or dropped.
- Weather and light condition fields frequently contain “Unknown”.
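
A quick way to size the problem is to compute the share of missing values per column. The sketch below uses invented toy rows and assumes that the placeholders `UNK` and `Unknown` should count as missing; the names `PLACEHOLDERS`, `cleaned`, and `missing_share` are illustrative.

```python
import numpy as np
import pandas as pd

# Toy rows invented for illustration; they are NOT real NTSB records.
toy = pd.DataFrame({
    "Weather.Condition": ["VMC", "UNK", "Unknown", None],
    "Total.Fatal.Injuries": [0.0, np.nan, 1.0, 2.0],
})

# Assumption: these placeholder strings carry no information and count as missing
PLACEHOLDERS = ["UNK", "Unknown"]

cleaned = toy.copy()
weather = cleaned["Weather.Condition"]
# mask() sets a value to NaN wherever the condition is True
cleaned["Weather.Condition"] = weather.mask(weather.isin(PLACEHOLDERS))

# Fraction of rows missing in each column, worst first
missing_share = cleaned.isna().mean().sort_values(ascending=False)
print(missing_share)
```

Whether to fill or drop a column can then be decided per column from these shares, rather than applying one rule to the whole table.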

##### Inconsistent Values
Because the dataset spans a long period, certain values may show format changes or spelling errors.
- Manufacturers may appear in multiple forms
- Aircraft models may use different formatting or spacing
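
A simple normalization pass can merge such variants before grouping. The values below are invented to illustrate casing and spacing differences, and the `Make_clean` and `Model_clean` columns are hypothetical names; trimming, upper-casing, and collapsing repeated spaces is one reasonable choice, not the only one.

```python
import pandas as pd

# Toy values invented to illustrate casing and spacing variants
toy = pd.DataFrame({
    "Make": ["Cessna", "CESSNA", " cessna ", "Piper", "PIPER"],
    "Model": ["172 N", "172  N", "172 N ", "PA-28", "pa-28"],
})

# Trim surrounding whitespace and unify case for manufacturers
toy["Make_clean"] = toy["Make"].str.strip().str.upper()

# Same for models, also collapsing runs of internal whitespace to one space
toy["Model_clean"] = (
    toy["Model"].str.strip().str.upper().str.replace(r"\s+", " ", regex=True)
)

make_counts = toy["Make_clean"].value_counts()
print(make_counts)
print(toy["Model_clean"].unique())
```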

##### Outliers
- Extremely old or rare aircraft types
- Occasional data-entry errors
- Records with zero injuries but aircraft recorded as “Destroyed”
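
The last case can be flagged for review with a boolean filter. This is a sketch on invented toy rows; it assumes missing injury counts should be treated as zero, and `suspect` is an illustrative name.

```python
import pandas as pd

# Toy rows invented for illustration; they are NOT real NTSB records.
toy = pd.DataFrame({
    "Total.Fatal.Injuries": [0, 1, 0],
    "Total.Serious.Injuries": [0, 0, 0],
    "Total.Minor.Injuries": [0, 2, 0],
    "Aircraft.damage": ["Destroyed", "Destroyed", "Minor"],
})

injury_cols = [
    "Total.Fatal.Injuries",
    "Total.Serious.Injuries",
    "Total.Minor.Injuries",
]

# Assumption: a missing injury count is treated as zero
total_injured = toy[injury_cols].fillna(0).sum(axis=1)

# Events with no recorded injuries but a destroyed aircraft
suspect = toy[(total_injured == 0) & toy["Aircraft.damage"].eq("Destroyed")]
print(suspect)
```

Such rows are not necessarily errors, since an aircraft can be destroyed without anyone being hurt, so they are better inspected than dropped automatically.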

---

## Data Preparation

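```python
import pandas as pd

# Load the NTSB accident data. Bytes that are not valid UTF-8 are replaced
# instead of raising an error (encoding_errors requires pandas 1.3 or later).
df = pd.read_csv('AviationData.csv', encoding='utf-8', encoding_errors='replace')

# Preview the first five rows
df.head()
```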



| Unnamed: 0 | Event.Id | Investigation.Type | Accident.Number | Event.Date | Location | Country | Latitude | Longitude | Airport.Code | Airport.Name | ... | Purpose.of.flight | Air.carrier | Total.Fatal.Injuries | Total.Serious.Injuries | Total.Minor.Injuries | Total.Uninjured | Weather.Condition | Broad.phase.of.flight | Report.Status | Publication.Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20001218X45444 | Accident | SEA87LA080 | 1948-10-24 | MOOSE CREEK, ID | United States | | | | | ... | Personal | | 2.0 | 0.0 | 0.0 | 0.0 | UNK | Cruise | Probable Cause | |
| 1 | 20001218X45447 | Accident | LAX94LA336 | 1962-07-19 | BRIDGEPORT, CA | United States | | | | | ... | Personal | | 4.0 | 0.0 | 0.0 | 0.0 | UNK | Unknown | Probable Cause | 19-09-1996 |
| 2 | 20061025X01555 | Accident | NYC07LA005 | 1974-08-30 | Saltville, VA | United States | 36.922223 | -81.878056 | | | ... | Personal | | 3.0 | | | | IMC | Cruise | Probable Cause | 26-02-2007 |
| 3 | 20001218X45448 | Accident | LAX96LA321 | 1977-06-19 | EUREKA, CA | United States | | | | | ... | Personal | | 2.0 | 0.0 | 0.0 | 0.0 | IMC | Cruise | Probable Cause | 12-09-2000 |
| 4 | 20041105X01764 | Accident | CHI79FA064 | 1979-08-02 | Canton, OH | United States | | | | | ... | Personal | | 1.0 | 2.0 | | 0.0 | VMC | Approach | Probable Cause | 16-04-1980 |
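
In the preview, `Event.Date` appears as year-month-day while `Publication.Date` appears as day-month-year, so the two columns likely need different parse formats. The sketch below uses two date pairs copied from the preview rows; the formats are inferred from those few rows only, and `Event.Year` is a hypothetical helper column for the trend analysis.

```python
import pandas as pd

# Two date pairs copied from the preview rows (indexes 2 and 0)
toy = pd.DataFrame({
    "Event.Date": ["1974-08-30", "1948-10-24"],
    "Publication.Date": ["26-02-2007", None],
})

# Assumption: formats inferred from the preview; unparseable values become NaT
toy["Event.Date"] = pd.to_datetime(
    toy["Event.Date"], format="%Y-%m-%d", errors="coerce"
)
toy["Publication.Date"] = pd.to_datetime(
    toy["Publication.Date"], format="%d-%m-%Y", errors="coerce"
)

# Year of the event, convenient for grouping by period
toy["Event.Year"] = toy["Event.Date"].dt.year
print(toy)
```

With `errors="coerce"`, counting the resulting `NaT` values afterwards shows how many entries did not match the assumed format.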
