## Civil Aviation Authority of the Philippines - Civil Aviation Accidents, Incidents, and Serious Incidents

Data Source:
https://www.caap.gov.ph/

The *same data* can be found at [Open Data Philippines](https://data.gov.ph/?q=dataset/civil-aviation-authority-philippines-aircraft-accidents), but I chose to scrape the Civil Aviation Authority of the Philippines (CAAP) website since it contains more complete information on aircraft accidents, incidents, and serious incidents. I also found that there are some error in the data uploaded by CAAP in the Open Data Philippines website.

#### Errors found in the data uploaded at [Open Data Philippines](https://data.gov.ph/?q=dataset/civil-aviation-authority-philippines-aircraft-accidents):  
- [(Typo?) Error found at 6th row of place of occurrence column:](https://data.gov.ph/?q=dataset/civil-aviation-authority-philippines-aircraft-accidents/resource/29c1d129-11b2-4aac-89e7#{view-grid:{columnsWidth:[{column:!place_of_occurance,width:504}]}})
The place of occurrence written is 'Runway Excursion during Landing' which is not a place. The correct row is found at the [CAAP 2014 Accidents Page](https://www.caap.gov.ph/?page_id=3096).
- Occurrence is mispelled as 'occurance' on the Place of Occurrence and Type of Occurrence columns.


In [307]:
import glob
import pandas as pd
import requests
from bs4 import BeautifulSoup

In [308]:
# Get website source
website = (requests.get("https://www.caap.gov.ph/").text) 
soup = BeautifulSoup(website, 'lxml')

In [309]:
# Find the 'Aircraft Accident and Incident Report'
dd = soup.find("dd", {"class": "level1 nextend-nav-6425 parent"})

In [310]:
# Get the description terms of dt('Aircraft Accident and Incident Report')
dt = dd.find_all("dt")
print(dt)

[<dt class="level2 nextend-nav-6506 parent first" data-menuid="6506">
<span class="outer">
<span class="inner">
<a><span>Accidents</span></a> </span>
</span>
</dt>, <dt class="level3 nextend-nav-7521 notparent first" data-menuid="7521">
<span class="outer">
<span class="inner">
<a href="https://www.caap.gov.ph/?page_id=7509"><span>2018 Accidents</span></a> </span>
</span>
</dt>, <dt class="level3 nextend-nav-6508 notparent" data-menuid="6508">
<span class="outer">
<span class="inner">
<a href="https://www.caap.gov.ph/?page_id=6439"><span>2017 Accidents</span></a> </span>
</span>
</dt>, <dt class="level3 nextend-nav-6546 notparent" data-menuid="6546">
<span class="outer">
<span class="inner">
<a href="https://www.caap.gov.ph/?page_id=2960"><span>2016 Accidents</span></a> </span>
</span>
</dt>, <dt class="level3 nextend-nav-6614 notparent" data-menuid="6614">
<span class="outer">
<span class="inner">
<a href="https://www.caap.gov.ph/?page_id=3055"><span>2015 Accidents</span></a> </span>


In [311]:
accident_urls = []
incident_urls = []
serious_incident_urls = []
switch = 'a'
for item in dt:
    if item.find("a").get_text() == 'Accidents':
        continue
    if item.find("a").get_text() == 'Incidents':
        switch = 'i'
        continue
    if item.find("a").get_text() == 'Serious Incidents':
        switch = 's'
        continue
    if switch == 'a':
        accident_urls.append(item.find("a")['href'])
    elif switch == 'i':
        incident_urls.append(item.find("a")['href'])
    else:
        #print(item.find("a"))
        serious_incident_urls.append(item.find("a")['href'])

In [312]:
# Check contents of url
print("List of accident url: ", accident_urls)
print("List of incident url: ", incident_urls)
print("List of serious incident url: ", serious_incident_urls)

List of accident url:  ['https://www.caap.gov.ph/?page_id=7509', 'https://www.caap.gov.ph/?page_id=6439', 'https://www.caap.gov.ph/?page_id=2960', 'https://www.caap.gov.ph/?page_id=3055', 'https://www.caap.gov.ph/?page_id=3096', 'https://www.caap.gov.ph/?page_id=3175', 'https://www.caap.gov.ph/?page_id=3283', 'https://www.caap.gov.ph/?page_id=3349', 'https://www.caap.gov.ph/?page_id=3446', 'https://www.caap.gov.ph/?page_id=3526', 'https://www.caap.gov.ph/?page_id=3620']
List of incident url:  ['https://www.caap.gov.ph/?page_id=6421', 'https://www.caap.gov.ph/?page_id=7850', 'https://www.caap.gov.ph/?page_id=3028', 'https://www.caap.gov.ph/?page_id=3086', 'https://www.caap.gov.ph/?page_id=3154', 'https://www.caap.gov.ph/?page_id=3250', 'https://www.caap.gov.ph/?page_id=3333', 'https://www.caap.gov.ph/?page_id=3401', 'https://www.caap.gov.ph/?page_id=3468', 'https://www.caap.gov.ph/?page_id=3568', 'https://www.caap.gov.ph/?page_id=3676']
List of serious incident url:  ['https://www.caap.

In [313]:
def get_data(url_list):
    """
    Get the concatenated DataFrame from the url in the url_list
    
    @param url_list: List of urls from the CAAP website (accident, incident, or serious_incident)
    """
    data_df = pd.DataFrame(columns=['date', 'aircraft_registration', 'aircraft_type', 'type_of_occurance',
                                           'place_of_occurance', 'status', 'report', 'report_link'])
    for url in url_list:
        print(url)
        website = (requests.get(url).text)
        soup = BeautifulSoup(website, 'lxml')
        accident_table = soup.find('tbody')
        listOfRows = accident_table.find_all('tr')
        temp_data_df = pd.DataFrame(columns=['date', 'aircraft_registration', 'aircraft_type', 'type_of_occurance',
                                           'place_of_occurance', 'status', 'report', 'report_link'])
        i = 0
        for row in listOfRows:
            listOfCells = row.find_all('td')
            if len(listOfCells) == 7 and listOfCells[6].get_text() == "Back":
                continue
            if len(listOfCells) == 7 and i > 1:
                getText = lambda x: x.get_text()
                # print(listOfCells[3])
                temp_data_df.loc[i] = list(map(getText, listOfCells)) + list([listOfCells[6].find('a')['href']])
            i += 1
        data_df = pd.concat([data_df,temp_data_df], axis = 0)
    return data_df

In [314]:
accident_df = get_data(accident_urls)
incident_df = get_data(incident_urls)
serious_incident_df = get_data(serious_incident_urls)

https://www.caap.gov.ph/?page_id=7509
https://www.caap.gov.ph/?page_id=6439
https://www.caap.gov.ph/?page_id=2960
https://www.caap.gov.ph/?page_id=3055
https://www.caap.gov.ph/?page_id=3096
https://www.caap.gov.ph/?page_id=3175
https://www.caap.gov.ph/?page_id=3283
https://www.caap.gov.ph/?page_id=3349
https://www.caap.gov.ph/?page_id=3446
https://www.caap.gov.ph/?page_id=3526
https://www.caap.gov.ph/?page_id=3620
https://www.caap.gov.ph/?page_id=6421
https://www.caap.gov.ph/?page_id=7850
https://www.caap.gov.ph/?page_id=3028
https://www.caap.gov.ph/?page_id=3086
https://www.caap.gov.ph/?page_id=3154
https://www.caap.gov.ph/?page_id=3250
https://www.caap.gov.ph/?page_id=3333
https://www.caap.gov.ph/?page_id=3401
https://www.caap.gov.ph/?page_id=3468
https://www.caap.gov.ph/?page_id=3568
https://www.caap.gov.ph/?page_id=3676
https://www.caap.gov.ph/?page_id=7895
https://www.caap.gov.ph/?page_id=8002
https://www.caap.gov.ph/?page_id=4343
https://www.caap.gov.ph/?page_id=4274
https://www.

In [315]:
accident_df

Unnamed: 0,date,aircraft_registration,aircraft_type,type_of_occurance,place_of_occurance,status,report,report_link
2,"September 9, 2018",RP-C8158,McDonnell Douglas (MD) 369 E,Ditched into the sea during over,International Water near the Federated States ...,Completed,Summary,https://www.caap.gov.ph?download=9397
3,"August 16, 2018",B-5498,Boeing 737-800,Runway Lateral Excursion,Ninoy Aquino International Airport (RPLL) Mani...,Completed,Summary,https://www.caap.gov.ph?download=9119
4,"June 6, 2018",RP-C1811,Robinson Helicopter Company R44 II,Main Rotor Failure during Take-Off,"DOLE Plantation, Sitio Glandang, Barangay Kabl...",Completed,Summary,http://www.caap.gov.ph?download=7517
2,"December 10, 2017",RP-C938,BE-P35,Departure Stall,"Skyhawk Airstrip, Tuy, Batangas, Philippines",Completed,Summary,http://www.caap.gov.ph?download=7806
3,"November 26, 2017",RP-C2586,Cessna 152,Engine Fire,"South East Apron, Subic Bay International Airp...",Completed,Summary,http://www.caap.gov.ph?download=6472
...,...,...,...,...,...,...,...,...
12,"Mar 01, 2008",RP-C1129,Cessna 150 M,Crashed short at threshold of Runway 35,"Plaridel Airport, Bulacan",Completed,Summary,http://www.caap.gov.ph?download=3636
13,"Feb 24, 2008",RP-C5328,Dornier DO-328-100,Runway Excursion,Ninoy Aquino International Airport,Completed,Summary,http://www.caap.gov.ph?download=3633
14,"Feb 17, 2008",RP-C654,Cessna 172M,Engine Malfunction,"Hermana Mayor, Zambales",Completed,Summary,http://www.caap.gov.ph?download=3627
15,"Feb 01, 2008",RP-C229,Cessna 172M,Fuel Starvation,"Reclamation area near S&R, Parañaque City",Completed,Summary,http://www.caap.gov.ph?download=3624


In [316]:
accident_df = accident_df.reset_index(drop=True)
incident_df = incident_df.reset_index(drop=True)
serious_incident_df = serious_incident_df.reset_index(drop=True)

#### Get geolocation of the Place of Occurrence column

Originally, I used the Nominatim's geocoders to get the location's coordinates (`from geopy.geocoders import Nominatim`); however, it returned a lot of Nones because of the inconsistent addresses. I switched to the Googlemaps api which returns the closest address string location if the exact location isn't found. For example, in the 2008 ACCIDENTS Dataset, "Taliban, Bohol" is one of the places of occurrence. Nominatim's geocoder returns None, while Google maps returns details of "Talibon, Bohol".

In [323]:
import googlemaps
from datetime import datetime

with open('api_key.txt') as f:
    api_key = f.readline()
    f.close
gmaps = googlemaps.Client(api_key)

In [324]:
# Geocode helper functions

# returns the geocode (json containing details of the location)
def google_get_geocode(location):
    try: 
        geocode_result = gmaps.geocode(location)
        return geocode_result[0]
    except: return None

# returns latitude given the geocode json
def google_get_latitude(geocode):
    try: 
        return geocode['geometry']['location']['lat']
    except: 
        return None

# returns longitude given the geocode json
def google_get_longitude(geocode):
    try: 
        return geocode['geometry']['location']['lng']
    except: 
        return None

# returns address given the geocode json
def google_get_address(geocode):
    try: 
        return geocode['formatted_address']
    except: 
        return None

In [325]:
geocode_data = []
data_list = [accident_df, incident_df, serious_incident_df]

# get geocode, latitude, longitude, and formatted address of each place of occurrence 
# in the accident_df, incident_df, and serious_incident_df tables
for table in data_list:
    table['geocode'] = table.place_of_occurance.apply(google_get_geocode)
    table['latitude'] = table.geocode.apply(google_get_latitude)
    table['longitude'] = table.geocode.apply(google_get_longitude)
    table['formatted_address'] = table.geocode.apply(google_get_address)

In [326]:
# Check how many 'None' geocode of the places of occurrence did the google maps return
print(f"Accident_df: \n{accident_df.geocode.isna().value_counts()}\n")
print(f"Incident_df: \n{incident_df.geocode.isna().value_counts()}\n", )
print(f"Serious_incident_df: \n{serious_incident_df.geocode.isna().value_counts()}", )

Accident_df: 
False    135
True       2
Name: geocode, dtype: int64

Incident_df: 
False    78
True      2
Name: geocode, dtype: int64

Serious_incident_df: 
False    15
Name: geocode, dtype: int64


Manually input the geocode, latitude, longitude, and formatted_address of the 4 places of occurrence with 'None' geocodes.

In [327]:
accident_df.loc[accident_df.geocode.isna()]

Unnamed: 0,date,aircraft_registration,aircraft_type,type_of_occurance,place_of_occurance,status,report,report_link,geocode,latitude,longitude,formatted_address
0,"September 9, 2018",RP-C8158,McDonnell Douglas (MD) 369 E,Ditched into the sea during over,International Water near the Federated States ...,Completed,Summary,https://www.caap.gov.ph?download=9397,,,,
69,"May 8, 2013",RP-C1095,Piper Aztec Twin-Engine,Belly Landing,"MIA, Runway 06",Completed,Summary,http://www.caap.gov.ph?download=3230,,,,


In [328]:
temp = pd.DataFrame({'latitude' : [3.65, 14.49], 'longitude': [160.266667, 121.001500]})
accident_df.loc[accident_df.geocode.isna(), ['latitude', 'longitude']] = temp.values

- September 9, 2018
    - International Water near the Federated States of Micronesia
        - As listed from the report_link the helicopter's final position was noted at coordinates 3°39'00.0"N 160°16'00.0"E (latitude: 3.650000, longitude: 160.266667)
- May 8, 2013
    - MIA, Runway 06 (Ninoy Aquino International Airport Runway 06)
        - 14°29'54.6"N 121°00'05.4"E (latitude: 14.498500, longitude: 121.001500)

In [329]:
incident_df.loc[incident_df.geocode.isna()]

Unnamed: 0,date,aircraft_registration,aircraft_type,type_of_occurance,place_of_occurance,status,report,report_link,geocode,latitude,longitude,formatted_address
47,"Mar 12, 2010",RP-C1322,Cessna 172,Jet Blast,Runway 13 Extension,Completed,Summary,http://www.caap.gov.ph?download=3484,,,,
52,"Jan 24, 2010",RP-C1086,Eurocopter Deutschland GMBH,Detachment of Unsecured Detachable Step,"Gen. Aviation Area, Domestic Airport",Completed,Summary,http://www.caap.gov.ph?download=3472,,,,


In [330]:
temp = pd.DataFrame({'latitude' : [14.522667, 121.005000], 'longitude': [14.524806, 121.001028]})
incident_df.loc[incident_df.geocode.isna(), ['latitude', 'longitude']] = temp.values

- Mar 12, 2010
    - Runway 13 Extension
        - NAIA Runway 13 Extension 14°31'21.6"N 121°00'18.0"E (latitude: 14.522667, longitude: 121.005000)
- Jan 24, 2010
    - Gen. Aviation Area, Domestic Airport
        - Domestic Airport, Brgy 191, Pasay City 14°31'29.3"N 121°00'03.7"E (latitude: 14.524806, longitude: 121.001028)

In [332]:
serious_incident_df.loc[serious_incident_df.geocode.isna()]

Unnamed: 0,date,aircraft_registration,aircraft_type,type_of_occurance,place_of_occurance,status,report,report_link,geocode,latitude,longitude,formatted_address


Looks like geocode is returned successfully for all the places of occurrence in the serious incident table.

#### Clean date column

In [333]:
from datetime import datetime

def try_parsing_date(text):
    for fmt in ('%B %d, %Y', '%b %d, %Y', '%b %d,%Y', '%B %d %Y'):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass
    raise ValueError('no valid date format found')

accident_df['date'] = accident_df['date'].apply(try_parsing_date)
incident_df.loc[incident_df['date'].str.contains('Sept '), 'date'] = incident_df.loc[incident_df['date'].str.contains('Sept '), 'date'].str.replace("Sept", "September")
incident_df.loc[incident_df['date'].str.contains('Apirl'), 'date'] = incident_df.loc[incident_df['date'].str.contains('Apirl '), 'date'].str.replace("Apirl", "April")
incident_df['date'] = incident_df['date'].apply(try_parsing_date)
serious_incident_df['date'] = serious_incident_df['date'].apply(try_parsing_date)

#### Clean aircraft_registration column

In [334]:
# accident_df = pd.read_csv("accidents.csv")
# incident_df = pd.read_csv("incidents.csv")
# serious_incident_df = pd.read_csv("serious_incidents.csv")

Check if there's any inconsistencies with the format of the aircraft_registration column

In [335]:
accident_df.aircraft_registration.unique()

array(['RP-C8158', 'B-5498', 'RP-C1811', 'RP-C938', 'RP-C2586',
       'RP-C1955', 'RP-R4169 & RP-R3029', 'RP-C3565', 'RP-R2923',
       'RP-C5034', 'RP-C9928', 'RP-R5835', 'RP-C7838', 'RP-C7252',
       'RP-C3560', 'RP-R3100', 'RP-R5898', 'RP-R1253', 'RP-C337',
       'RP-R3028', 'RP-C1752', 'RP-C3545', 'RP-R3026', 'RP-C6919',
       'RP-C2280', 'RP-R8856', 'RP-C8832', 'RP-C8992', 'RP-C2726',
       'RP-C8837', 'RP-C1136', 'RP-C1379', 'RP-C2792', 'RP-R5385',
       'RP-C9363', 'RP-C2346', 'RP-R2505', 'RP-C8864', 'RP-C7555',
       'RP-R5388', 'RP-C6932', 'RP-C6989', 'RP-R5837', 'RP-C8847',
       'RP-C1974', 'RP-R146', 'RP-C1745', 'RP-R3024', 'RP-R1420',
       'RP-R1750', 'RP-C8835', 'RP-C3533', 'RP-C8371', 'RP-C3699',
       'RP-C525', 'RP-C1034', 'RP-C1970', 'RP-C1001', 'RP-R5878',
       'RP-C9988', 'RP-R9908', 'RP-R5788', 'RP-C2045', 'RP-R1191',
       'RP-C3266', 'RP-R1425', 'RP-C3604', 'RP-C1095', 'RP-C7788',
       'RP-C8818', 'RP-C8836', 'RP-C8220', 'RP-C4431', 'RP-R2356',
  

In [336]:
tempRow = accident_df.loc[accident_df.aircraft_registration == "RP-R4169 & RP-R3029"]
tempRow

Unnamed: 0,date,aircraft_registration,aircraft_type,type_of_occurance,place_of_occurance,status,report,report_link,geocode,latitude,longitude,formatted_address
6,2017-05-19,RP-R4169 & RP-R3029,Allied AgCat G-164B & Thrush Aircraft S2R,Ground Collision,Mindanao Agricultural Training Center Airstrip...,Completed,Summary,https://www.caap.gov.ph?download=9829,{'address_components': [{'long_name': 'Sampao ...,7.619337,125.66968,"Sampao (Bienvenida), Kapalong, Davao del Norte..."


In [337]:
x = tempRow.aircraft_registration.str.split(' & ').tolist()
y = tempRow.aircraft_type.str.split(' & ').tolist()

b  = pd.DataFrame([list(map(lambda pair: {'reg': pair[0], 'type': pair[1]}, zip(x[0], y[0])))], 
             index=[tempRow.date, 
                    tempRow.type_of_occurance,
                    tempRow.place_of_occurance,
                    tempRow.status,
                    tempRow.report,
                    tempRow.report_link,
                    tempRow.geocode,
                    tempRow.latitude,
                    tempRow.longitude,
                    tempRow.formatted_address]).stack()

b = b.reset_index()[[0, 
                     'date', 
                     'type_of_occurance', 
                     'place_of_occurance',
                     'status', 
                     'report', 
                     'report_link', 
                     'geocode',
                     'latitude',
                     'longitude', 
                     'formatted_address']]

b.columns = ['temp',
             'date', 
             'type_of_occurance',  
             'place_of_occurance', 
             'status', 
             'report', 
             'report_link', 
             'geocode', 
             'longitude', 
             'latitude', 
             'formatted_address']

b['aircraft_registration'] = b.temp.loc[:].apply(lambda x: x.get('reg'))
b['aircraft_type'] = b.temp.loc[:].apply(lambda x: x.get('type'))
b = b.drop('temp', axis=1)
b


TypeError: unhashable type: 'dict'

In [338]:
# drop row with inconsistent string format for aircraft type and reg
accident_df = accident_df.drop(accident_df.loc[accident_df.aircraft_registration == "RP-R4169 & RP-R3029"].index, axis=0)

In [339]:
accident_df = pd.concat([accident_df, b], sort=True)
accident_df.reset_index(drop=True)

Unnamed: 0,aircraft_registration,aircraft_type,date,formatted_address,geocode,latitude,longitude,place_of_occurance,report,report_link,status,type_of_occurance
0,RP-C8158,McDonnell Douglas (MD) 369 E,2018-09-09 00:00:00,,,3.650000,160.266667,International Water near the Federated States ...,Summary,https://www.caap.gov.ph?download=9397,Completed,Ditched into the sea during over
1,B-5498,Boeing 737-800,2018-08-16 00:00:00,"Ninoy Aquino International Airport (MNL), Andr...",{'address_components': [{'long_name': 'Ninoy A...,14.512274,121.016508,Ninoy Aquino International Airport (RPLL) Mani...,Summary,https://www.caap.gov.ph?download=9119,Completed,Runway Lateral Excursion
2,RP-C1811,Robinson Helicopter Company R44 II,2018-06-06 00:00:00,"Barangay Polonuling, S Cotabato - Sarangani Rd...",{'address_components': [{'long_name': 'South C...,6.296014,124.985853,"DOLE Plantation, Sitio Glandang, Barangay Kabl...",Summary,http://www.caap.gov.ph?download=7517,Completed,Main Rotor Failure during Take-Off
3,RP-C938,BE-P35,2017-12-10 00:00:00,"Tuy, Batangas, Philippines","{'address_components': [{'long_name': 'Tuy', '...",14.022075,120.728354,"Skyhawk Airstrip, Tuy, Batangas, Philippines",Summary,http://www.caap.gov.ph?download=7806,Completed,Departure Stall
4,RP-C2586,Cessna 152,2017-11-26 00:00:00,"Subic Bay Freeport Zone, Philippines",{'address_components': [{'long_name': 'Subic B...,14.769004,120.320487,"South East Apron, Subic Bay International Airp...",Summary,http://www.caap.gov.ph?download=6472,Completed,Engine Fire
...,...,...,...,...,...,...,...,...,...,...,...,...
133,RP-C654,Cessna 172M,2008-02-17 00:00:00,"Hermana Mayor, Philippines",{'address_components': [{'long_name': 'Hermana...,15.796948,119.797608,"Hermana Mayor, Zambales",Summary,http://www.caap.gov.ph?download=3627,Completed,Engine Malfunction
134,RP-C229,Cessna 172M,2008-02-01 00:00:00,"Ninoy Aquino Ave, Parañaque, Metro Manila, Phi...",{'address_components': [{'long_name': 'Ninoy A...,14.501859,120.997766,"Reclamation area near S&R, Parañaque City",Summary,http://www.caap.gov.ph?download=3624,Completed,Fuel Starvation
135,RP-C2706,Cessna-172K,2008-01-27 00:00:00,"Mactan, Philippines",{'address_components': [{'long_name': 'Mactan'...,10.299046,123.963853,"Mactan Island, Cebu",Summary,http://www.caap.gov.ph?download=3621,Completed,Ditching
136,RP-R4169,Allied AgCat G-164B,2017-05-19,"Sampao (Bienvenida), Kapalong, Davao del Norte...",{'address_components': [{'long_name': 'Sampao ...,125.669680,7.619337,Mindanao Agricultural Training Center Airstrip...,Summary,https://www.caap.gov.ph?download=9829,Completed,Ground Collision


In [340]:
print(incident_df.aircraft_registration.unique())
print(serious_incident_df.aircraft_registration.unique())

['RP-R8306' 'RP-R2546' 'RP-C2791' 'RP-C3020' 'RP-C9914' 'RP-C4429'
 'RP-C2844' 'RP-C1123' 'RP-R5838' 'RP-C5090' 'RP-C8862' 'RP-C8759'
 'RP-C3278' 'RP-C9002' 'RP-C3025' 'RP-C8994' 'N888VS' 'RP-C8078'
 'RP-C8993' 'HZ-AIX' 'RP-C1737' 'RP-C3195' 'RP-C8832' 'RP-C3545'
 'RP-C5319' 'RP-C7257' 'RP-C3271' 'RP-C984' 'RP-C1955' 'RP-C143'
 'RP-R2998' 'RP-C1970' 'RP-C3801' 'RP-C5758' 'RP-R1750' 'RP-C860'
 'RP-C9980' 'RP-C4421' 'PH-JDK' 'RP-R3024' 'RP-C8864' 'RP-C2252'
 'RP-C9328' 'RP-R3618' 'RP-C989' 'RP-C1743' 'RP-C2928' 'RP-C1322'
 'RP-R2374' 'RP-C5522 / RP-C666' 'RP-C4427' 'RP-C8840' 'RP-C1086'
 'RP-C1939' 'RP-C8690' 'RP-C2282' 'RP-C9988' 'RP-C317' 'RP-C347'
 'RP-C1742' 'RP-C3546' 'RP-C596' 'RP-C484' 'RP-C3249' 'RP-C2584'
 'RP-C4328' 'RP-C1402' 'RP-C2622' 'RP-R5858' 'RP-178' 'RP-C2711'
 'RP-C3033' 'RP-C6890' 'RP-C8080' 'RP-C8204' 'RP-C3541' 'RP-C1038'
 'RP-C1118' 'RP-C916']
['RP-R8759' 'RP-R5385' 'RP-C8613' 'RP-C3192' 'RP-C4105' 'RP-R2806'
 'RP-C3237' 'RP-C5323' 'RP-C8972' 'RP-C3031' 'RP-C3336' 

---
### Save to csv

In [341]:
dict_df = {'accidents.csv': accident_df, 
           'incidents.csv': incident_df, 
           'serious_incidents.csv': serious_incident_df}

for key,val in dict_df.items():
    val.to_csv(key, encoding='utf-8', index=False)

Recommendations:

Explore other database like:
- https://aviation-safety.net/database/country/country.php?id=RP
- http://planecrashinfo.com/

In [69]:
# accident_df.loc[90:110]