## Final Project Submission

Please fill out:
* Student name: Colleta Kiilu
* Student pace: self paced / **part time** / full time
* Scheduled project review date/time: 08/09/2024
* Instructor name: William Okomba
* Blog post URL:


![airplane accident](https://gifdb.com/images/high/plane-jet-crash-explosion-jy9cqg43kri1oa2m.webp)
Source: [GIFDB.com](https://gifdb.com/)

# **OVERVIEW**
As part of the company's strategic growth into new markets, there is increasing interest in joining the aviation industry. The goal is to purchase and operate aircraft for both commercial and private enterprises. However, before making any decisions, the business needs to be aware of the possible dangers associated with various aircraft types.

This analysis focuses on identifying the lowest-risk aircraft models by reviewing data from the National Transportation Safety Board (NTSB) aviation accident database. The key indicators to be evaluated are damage to the aircraft, frequency of accidents/incidents and severity of injuries.

Ultimately, this analysis will provide the company, under the guidance of the Head of the new Aviation Division, with data-driven insights and recommendations about which aircraft models to invest in for this new business venture.


## **BUSINESS** **UNDERSTANDING**
xxxxx

# **Business Questions**

1.   Which aircraft models have the lowest accident rates?
2.   Are there specific factors (e.g., weather conditions, flight phases) that significantly increase the risk of accidents for certain aircraft?
3. What are the trends over time, and how do they impact decision-making for future aircraft purchases?





# **DATA UNDERSTANDING**

In [718]:
# import data analysis libraries
import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings('ignore')

In [719]:
# import data and create df
# df = pd.read_csv("AviationData.csv") # encoding issues
df = pd.read_csv("AviationData.csv", encoding="latin-1")

#checking the first 5 columns
df.head()

Unnamed: 0,Event.Id,Investigation.Type,Accident.Number,Event.Date,Location,Country,Latitude,Longitude,Airport.Code,Airport.Name,...,Purpose.of.flight,Air.carrier,Total.Fatal.Injuries,Total.Serious.Injuries,Total.Minor.Injuries,Total.Uninjured,Weather.Condition,Broad.phase.of.flight,Report.Status,Publication.Date
0,20001218X45444,Accident,SEA87LA080,1948-10-24,"MOOSE CREEK, ID",United States,,,,,...,Personal,,2.0,0.0,0.0,0.0,UNK,Cruise,Probable Cause,
1,20001218X45447,Accident,LAX94LA336,1962-07-19,"BRIDGEPORT, CA",United States,,,,,...,Personal,,4.0,0.0,0.0,0.0,UNK,Unknown,Probable Cause,19-09-1996
2,20061025X01555,Accident,NYC07LA005,1974-08-30,"Saltville, VA",United States,36.922223,-81.878056,,,...,Personal,,3.0,,,,IMC,Cruise,Probable Cause,26-02-2007
3,20001218X45448,Accident,LAX96LA321,1977-06-19,"EUREKA, CA",United States,,,,,...,Personal,,2.0,0.0,0.0,0.0,IMC,Cruise,Probable Cause,12-09-2000
4,20041105X01764,Accident,CHI79FA064,1979-08-02,"Canton, OH",United States,,,,,...,Personal,,1.0,2.0,,0.0,VMC,Approach,Probable Cause,16-04-1980


In [720]:
#checking the last 5 columns
df.tail()

Unnamed: 0,Event.Id,Investigation.Type,Accident.Number,Event.Date,Location,Country,Latitude,Longitude,Airport.Code,Airport.Name,...,Purpose.of.flight,Air.carrier,Total.Fatal.Injuries,Total.Serious.Injuries,Total.Minor.Injuries,Total.Uninjured,Weather.Condition,Broad.phase.of.flight,Report.Status,Publication.Date
88884,20221227106491,Accident,ERA23LA093,2022-12-26,"Annapolis, MD",United States,,,,,...,Personal,,0.0,1.0,0.0,0.0,,,,29-12-2022
88885,20221227106494,Accident,ERA23LA095,2022-12-26,"Hampton, NH",United States,,,,,...,,,0.0,0.0,0.0,0.0,,,,
88886,20221227106497,Accident,WPR23LA075,2022-12-26,"Payson, AZ",United States,341525N,1112021W,PAN,PAYSON,...,Personal,,0.0,0.0,0.0,1.0,VMC,,,27-12-2022
88887,20221227106498,Accident,WPR23LA076,2022-12-26,"Morgan, UT",United States,,,,,...,Personal,MC CESSNA 210N LLC,0.0,0.0,0.0,0.0,,,,
88888,20221230106513,Accident,ERA23LA097,2022-12-29,"Athens, GA",United States,,,,,...,Personal,,0.0,1.0,0.0,1.0,,,,30-12-2022


In [721]:
# checking the dataset information
df.info()

# The aviation data frame indicates that we have 88889 Rows and 31 columns

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 88889 entries, 0 to 88888
Data columns (total 31 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Event.Id                88889 non-null  object 
 1   Investigation.Type      88889 non-null  object 
 2   Accident.Number         88889 non-null  object 
 3   Event.Date              88889 non-null  object 
 4   Location                88837 non-null  object 
 5   Country                 88663 non-null  object 
 6   Latitude                34382 non-null  object 
 7   Longitude               34373 non-null  object 
 8   Airport.Code            50132 non-null  object 
 9   Airport.Name            52704 non-null  object 
 10  Injury.Severity         87889 non-null  object 
 11  Aircraft.damage         85695 non-null  object 
 12  Aircraft.Category       32287 non-null  object 
 13  Registration.Number     87507 non-null  object 
 14  Make                    88826 non-null

In [722]:
df.describe()

Unnamed: 0,Number.of.Engines,Total.Fatal.Injuries,Total.Serious.Injuries,Total.Minor.Injuries,Total.Uninjured
count,82805.0,77488.0,76379.0,76956.0,82977.0
mean,1.146585,0.647855,0.279881,0.357061,5.32544
std,0.44651,5.48596,1.544084,2.235625,27.913634
min,0.0,0.0,0.0,0.0,0.0
25%,1.0,0.0,0.0,0.0,0.0
50%,1.0,0.0,0.0,0.0,1.0
75%,1.0,0.0,0.0,0.0,2.0
max,8.0,349.0,161.0,380.0,699.0


In [723]:
# Listing the columns in the dataset
df.columns

Index(['Event.Id', 'Investigation.Type', 'Accident.Number', 'Event.Date',
       'Location', 'Country', 'Latitude', 'Longitude', 'Airport.Code',
       'Airport.Name', 'Injury.Severity', 'Aircraft.damage',
       'Aircraft.Category', 'Registration.Number', 'Make', 'Model',
       'Amateur.Built', 'Number.of.Engines', 'Engine.Type', 'FAR.Description',
       'Schedule', 'Purpose.of.flight', 'Air.carrier', 'Total.Fatal.Injuries',
       'Total.Serious.Injuries', 'Total.Minor.Injuries', 'Total.Uninjured',
       'Weather.Condition', 'Broad.phase.of.flight', 'Report.Status',
       'Publication.Date'],
      dtype='object')

In [724]:
#checking the dataset shape
df.shape

# 88889 events, 31 columns

(88889, 31)

### **Data Cleaning**

In [725]:
df.duplicated().sum()

# The data set has no duplicates

0

In [726]:
# make df copy to be used in data cleaning

data = df.copy()

In [727]:
# percentage of missing values per column
# sorted in descending order

data.isna().sum().sort_values(ascending=False)/len(data)*100

Unnamed: 0,0
Schedule,85.845268
Air.carrier,81.271023
FAR.Description,63.97417
Aircraft.Category,63.67717
Longitude,61.330423
Latitude,61.320298
Airport.Code,43.60157
Airport.Name,40.708074
Broad.phase.of.flight,30.560587
Publication.Date,15.492356


In [728]:
# check for unique values in each column

for col in data.columns:
    print({col})
    print(data[col].unique())
    print()

{'Event.Id'}
['20001218X45444' '20001218X45447' '20061025X01555' ... '20221227106497'
 '20221227106498' '20221230106513']

{'Investigation.Type'}
['Accident' 'Incident']

{'Accident.Number'}
['SEA87LA080' 'LAX94LA336' 'NYC07LA005' ... 'WPR23LA075' 'WPR23LA076'
 'ERA23LA097']

{'Event.Date'}
['1948-10-24' '1962-07-19' '1974-08-30' ... '2022-12-22' '2022-12-26'
 '2022-12-29']

{'Location'}
['MOOSE CREEK, ID' 'BRIDGEPORT, CA' 'Saltville, VA' ... 'San Manual, AZ'
 'Auburn Hills, MI' 'Brasnorte, ']

{'Country'}
['United States' nan 'GULF OF MEXICO' 'Puerto Rico' 'ATLANTIC OCEAN'
 'HIGH ISLAND' 'Bahamas' 'MISSING' 'Pakistan' 'Angola' 'Germany'
 'Korea, Republic Of' 'Martinique' 'American Samoa' 'PACIFIC OCEAN'
 'Canada' 'Bolivia' 'Mexico' 'Dominica' 'Netherlands Antilles' 'Iceland'
 'Greece' 'Guam' 'Australia' 'CARIBBEAN SEA' 'West Indies' 'Japan'
 'Philippines' 'Venezuela' 'Bermuda' 'San Juan Islands' 'Colombia'
 'El Salvador' 'United Kingdom' 'British Virgin Islands' 'Netherlands'
 'Costa 

In [729]:
#Drop irrelevant columns with a high percentage of missing values and may not add any value in my analysis

data.drop(['Schedule', 'Air.carrier', 'Latitude', 'Longitude','FAR.Description',
           'Accident.Number', 'Airport.Code', 'Airport.Name', 'Publication.Date', 'Location', 'Report.Status', 'Registration.Number'],
          axis=1, inplace=True)

In [730]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 88889 entries, 0 to 88888
Data columns (total 19 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Event.Id                88889 non-null  object 
 1   Investigation.Type      88889 non-null  object 
 2   Event.Date              88889 non-null  object 
 3   Country                 88663 non-null  object 
 4   Injury.Severity         87889 non-null  object 
 5   Aircraft.damage         85695 non-null  object 
 6   Aircraft.Category       32287 non-null  object 
 7   Make                    88826 non-null  object 
 8   Model                   88797 non-null  object 
 9   Amateur.Built           88787 non-null  object 
 10  Number.of.Engines       82805 non-null  float64
 11  Engine.Type             81793 non-null  object 
 12  Purpose.of.flight       82697 non-null  object 
 13  Total.Fatal.Injuries    77488 non-null  float64
 14  Total.Serious.Injuries  76379 non-null

In [731]:
# Check the number of missing values in each column

data.isna().sum().sort_values(ascending=False)

Unnamed: 0,0
Aircraft.Category,56602
Broad.phase.of.flight,27165
Total.Serious.Injuries,12510
Total.Minor.Injuries,11933
Total.Fatal.Injuries,11401
Engine.Type,7096
Purpose.of.flight,6192
Number.of.Engines,6084
Total.Uninjured,5912
Weather.Condition,4492


In [732]:
print(data['Aircraft.Category'].unique())

[nan 'Airplane' 'Helicopter' 'Glider' 'Balloon' 'Gyrocraft' 'Ultralight'
 'Unknown' 'Blimp' 'Powered-Lift' 'Weight-Shift' 'Powered Parachute'
 'Rocket' 'WSFT' 'UNK' 'ULTR']


In [733]:
# Replace missing values in 'Aircraft.Category' column with 'Unknown'
data['Aircraft.Category'].fillna('Unknown', inplace=True)

# Replace missing values in 'Broad.phase.of.flight' column with 'Unknown'
data['Broad.phase.of.flight'].fillna('Unknown', inplace=True)

# Replace missing values in 'Engine.Type' column with 'Unknown'
data['Engine.Type'].fillna('Unknown', inplace=True)

# Replace missing values in 'Purpose.of.flight' column with 'Unknown'
data['Purpose.of.flight'].fillna('Unknown', inplace=True)

# Replace missing values in 'Number.of.Engines' column with 'Unknown'
# the number of engines is also a critical factor when determing the model of aircraft to purchase
data['Number.of.Engines'].fillna('Unknown', inplace=True)

# Replace missing values in 'Weather.Condition' column with 'UNK'
data['Weather.Condition'].fillna('UNK', inplace=True)

# Replace missing values in 'Aircraft.damage' column with 'Unspecified'
data['Aircraft.damage'].fillna('Unspecified', inplace=True)

# Replace missing values in 'Country' column with 'Unknown'
data['Country'].fillna('Unknown', inplace=True)

# Replace missing values in 'Amateur.Built' column with 'Unknown'
data['Amateur.Built'].fillna('Unknown', inplace=True)

# Replace missing values in 'Make' column with 'Unknown'
data['Make'].fillna('Unknown', inplace=True)

# Replace missing values in 'Total.Serious.Injuries' column with 0
data['Total.Serious.Injuries'].fillna(0, inplace=True)

# Replace missing values in 'Total.Minor.Injuries' column with 0
data['Total.Minor.Injuries'].fillna(0, inplace=True)

In [734]:
# Replace all 'Incident' in 'Injury.Severity' column with 'Non-Fatal'

data['Injury.Severity'].replace('Incident',  'Non-Fatal', inplace=True)

# Replace all 'Unavailable' in 'Injury.Severity' column with 'Unknown'
data['Injury.Severity'].replace('Unavailable',  'Unknown', inplace=True)

# Strip all numbericals from 'Injury.Severity' column as they are already included in the 'Total.Fatal.Injuries' column
# data['Injury.Severity'].str.replace(r'\(\d+\)', '', regex=True)
data['Injury.Severity'] = data['Injury.Severity'].str.replace(r'\(\d+\)', '', regex=True)

# Replace missing values in 'Injury.Severity' column with 'Unknown'
data['Injury.Severity'].fillna('Unknown', inplace=True)


In [735]:
# Replace all missing values(Blanks) in 'Total.Fatal.Injuries' column with 0 as they are all 'Non-Fatal' cases
# when compared with data in the 'Injury.severity' column
data['Total.Fatal.Injuries'].fillna(0, inplace=True)

# Replace all blanks in 'Total.Uninjured' column with '0' if 'Total.Fatal.Injuries' column has a value greater than or equal to 1.
# The reasoning behind this is that you cannot have a fatality and uninjured within the same event.

data.loc[(data['Total.Fatal.Injuries'] >= 1) & (data['Total.Uninjured'].isnull()), 'Total.Uninjured'] = 0
data['Total.Uninjured'] = data['Total.Uninjured'].fillna(0, inplace=True)

# Now replace the remaining null values (blanks) in the 'Total.Uninjured' column with 'Unknown'
data.loc[data['Total.Uninjured'].isnull(), 'Total.Uninjured'] = 'Unknown'
data['Total.Uninjured'].fillna('Uknown', inplace=True)

In [736]:
column_name = 'Model'
print(data[column_name].unique())

['108-3' 'PA24-180' '172M' ... 'ROTORWAY EXEC 162-F' 'KITFOX S5'
 'M-8 EAGLE']


In [737]:
# Data cleaning for the 'model' column which is very critical for analysis
# The goal is to purchase and operate aircraft for both commercial and private enterprises hence the anlayisis will focus on idenifying the low risk models for evidence based deciosn making

# Replace all 'wrongly spelt model names' in 'model' column with 'the correct model names as listed'
data['Model'].replace('PA24-180',  'PA-24-180', inplace=True)
data['Model'].replace('PA28-161',  'PA-28-161', inplace=True)
data['Model'].replace('R22 MARINER',  'R-22 MARINER', inplace=True)
data['Model'].replace('S2R', 'S-2R', inplace=True)

# Replace missing values in 'Model' column with 'Unknown'
data['Model'].fillna('Unknown', inplace=True)



In [738]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 88889 entries, 0 to 88888
Data columns (total 19 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Event.Id                88889 non-null  object 
 1   Investigation.Type      88889 non-null  object 
 2   Event.Date              88889 non-null  object 
 3   Country                 88889 non-null  object 
 4   Injury.Severity         88889 non-null  object 
 5   Aircraft.damage         88889 non-null  object 
 6   Aircraft.Category       88889 non-null  object 
 7   Make                    88889 non-null  object 
 8   Model                   88889 non-null  object 
 9   Amateur.Built           88889 non-null  object 
 10  Number.of.Engines       88889 non-null  object 
 11  Engine.Type             88889 non-null  object 
 12  Purpose.of.flight       88889 non-null  object 
 13  Total.Fatal.Injuries    88889 non-null  float64
 14  Total.Serious.Injuries  88889 non-null

In [739]:
data.head()

Unnamed: 0,Event.Id,Investigation.Type,Event.Date,Country,Injury.Severity,Aircraft.damage,Aircraft.Category,Make,Model,Amateur.Built,Number.of.Engines,Engine.Type,Purpose.of.flight,Total.Fatal.Injuries,Total.Serious.Injuries,Total.Minor.Injuries,Total.Uninjured,Weather.Condition,Broad.phase.of.flight
0,20001218X45444,Accident,1948-10-24,United States,Fatal,Destroyed,Unknown,Stinson,108-3,No,1.0,Reciprocating,Personal,2.0,0.0,0.0,Unknown,UNK,Cruise
1,20001218X45447,Accident,1962-07-19,United States,Fatal,Destroyed,Unknown,Piper,PA-24-180,No,1.0,Reciprocating,Personal,4.0,0.0,0.0,Unknown,UNK,Unknown
2,20061025X01555,Accident,1974-08-30,United States,Fatal,Destroyed,Unknown,Cessna,172M,No,1.0,Reciprocating,Personal,3.0,0.0,0.0,Unknown,IMC,Cruise
3,20001218X45448,Accident,1977-06-19,United States,Fatal,Destroyed,Unknown,Rockwell,112,No,1.0,Reciprocating,Personal,2.0,0.0,0.0,Unknown,IMC,Cruise
4,20041105X01764,Accident,1979-08-02,United States,Fatal,Destroyed,Unknown,Cessna,501,No,Unknown,Unknown,Personal,1.0,2.0,0.0,Unknown,VMC,Approach


In [740]:
data['Injury.Severity'].unique()

array(['Fatal', 'Non-Fatal', 'Unknown', 'Minor', 'Serious'], dtype=object)

insert a comment to confirm no missing values in the data set

### **Convert Event.Date Column to DateTime**

In [741]:
data['Event.Date'] = pd.to_datetime(data['Event.Date'])
data['Event.Date'].dtype # confirm datatype

dtype('<M8[ns]')

In [742]:
data['Event.Date'].head()

Unnamed: 0,Event.Date
0,1948-10-24
1,1962-07-19
2,1974-08-30
3,1977-06-19
4,1979-08-02


In [743]:
# extract year and month and create Event.Year and Event.Month columns

data["Event.Year"] = data["Event.Date"].dt.year
data["Event.Month"] = data["Event.Date"].dt.month_name()
data.head()

Unnamed: 0,Event.Id,Investigation.Type,Event.Date,Country,Injury.Severity,Aircraft.damage,Aircraft.Category,Make,Model,Amateur.Built,...,Engine.Type,Purpose.of.flight,Total.Fatal.Injuries,Total.Serious.Injuries,Total.Minor.Injuries,Total.Uninjured,Weather.Condition,Broad.phase.of.flight,Event.Year,Event.Month
0,20001218X45444,Accident,1948-10-24,United States,Fatal,Destroyed,Unknown,Stinson,108-3,No,...,Reciprocating,Personal,2.0,0.0,0.0,Unknown,UNK,Cruise,1948,October
1,20001218X45447,Accident,1962-07-19,United States,Fatal,Destroyed,Unknown,Piper,PA-24-180,No,...,Reciprocating,Personal,4.0,0.0,0.0,Unknown,UNK,Unknown,1962,July
2,20061025X01555,Accident,1974-08-30,United States,Fatal,Destroyed,Unknown,Cessna,172M,No,...,Reciprocating,Personal,3.0,0.0,0.0,Unknown,IMC,Cruise,1974,August
3,20001218X45448,Accident,1977-06-19,United States,Fatal,Destroyed,Unknown,Rockwell,112,No,...,Reciprocating,Personal,2.0,0.0,0.0,Unknown,IMC,Cruise,1977,June
4,20041105X01764,Accident,1979-08-02,United States,Fatal,Destroyed,Unknown,Cessna,501,No,...,Unknown,Personal,1.0,2.0,0.0,Unknown,VMC,Approach,1979,August


In [744]:
# The data is now clean for Exploratory data analysis
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 88889 entries, 0 to 88888
Data columns (total 21 columns):
 #   Column                  Non-Null Count  Dtype         
---  ------                  --------------  -----         
 0   Event.Id                88889 non-null  object        
 1   Investigation.Type      88889 non-null  object        
 2   Event.Date              88889 non-null  datetime64[ns]
 3   Country                 88889 non-null  object        
 4   Injury.Severity         88889 non-null  object        
 5   Aircraft.damage         88889 non-null  object        
 6   Aircraft.Category       88889 non-null  object        
 7   Make                    88889 non-null  object        
 8   Model                   88889 non-null  object        
 9   Amateur.Built           88889 non-null  object        
 10  Number.of.Engines       88889 non-null  object        
 11  Engine.Type             88889 non-null  object        
 12  Purpose.of.flight       88889 non-null  object

In [745]:
#save the new dataframe in csv format

data.to_csv('cleaned_aviation_data.csv', index=False)

# **EXPLORATORY DATA ANALYSIS (EDA)**

**EDA columns of interest**
 'Country' 'Injury.Severity', 'Aircraft.damage', 'Aircraft.Category', , 'Make', 'Model', 'Engine.Type', 'Schedule', 'Purpose.of.flight', 'Total.Fatal.Injuries', 'Total.Uninjured', 'Weather.Condition', 'Broad.phase.of.flight'


** Questions **

1. Which aircraft models have the lowest accident rates? Model

2. Are there specific factors (e.g., weather conditions, flight phases) that significantly increase the risk of accidents for certain aircraft?

3. What is the level of severity for the top 15 prefered mmodels?

In [747]:
# Now load the clean data set fro EDA

aviation = pd.read_csv('/content/cleaned_aviation_data.csv')
aviation.head()

Unnamed: 0,Event.Id,Investigation.Type,Event.Date,Country,Injury.Severity,Aircraft.damage,Aircraft.Category,Make,Model,Amateur.Built,...,Engine.Type,Purpose.of.flight,Total.Fatal.Injuries,Total.Serious.Injuries,Total.Minor.Injuries,Total.Uninjured,Weather.Condition,Broad.phase.of.flight,Event.Year,Event.Month
0,20001218X45444,Accident,1948-10-24,United States,Fatal,Destroyed,Unknown,Stinson,108-3,No,...,Reciprocating,Personal,2.0,0.0,0.0,Unknown,UNK,Cruise,1948,October
1,20001218X45447,Accident,1962-07-19,United States,Fatal,Destroyed,Unknown,Piper,PA-24-180,No,...,Reciprocating,Personal,4.0,0.0,0.0,Unknown,UNK,Unknown,1962,July
2,20061025X01555,Accident,1974-08-30,United States,Fatal,Destroyed,Unknown,Cessna,172M,No,...,Reciprocating,Personal,3.0,0.0,0.0,Unknown,IMC,Cruise,1974,August
3,20001218X45448,Accident,1977-06-19,United States,Fatal,Destroyed,Unknown,Rockwell,112,No,...,Reciprocating,Personal,2.0,0.0,0.0,Unknown,IMC,Cruise,1977,June
4,20041105X01764,Accident,1979-08-02,United States,Fatal,Destroyed,Unknown,Cessna,501,No,...,Unknown,Personal,1.0,2.0,0.0,Unknown,VMC,Approach,1979,August


In [749]:
aviation.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 88889 entries, 0 to 88888
Data columns (total 21 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Event.Id                88889 non-null  object 
 1   Investigation.Type      88889 non-null  object 
 2   Event.Date              88889 non-null  object 
 3   Country                 88889 non-null  object 
 4   Injury.Severity         88889 non-null  object 
 5   Aircraft.damage         88889 non-null  object 
 6   Aircraft.Category       88889 non-null  object 
 7   Make                    88889 non-null  object 
 8   Model                   88889 non-null  object 
 9   Amateur.Built           88889 non-null  object 
 10  Number.of.Engines       88889 non-null  object 
 11  Engine.Type             88889 non-null  object 
 12  Purpose.of.flight       88889 non-null  object 
 13  Total.Fatal.Injuries    88889 non-null  float64
 14  Total.Serious.Injuries  88889 non-null