#  <p style="text-align: center;">AIRCRASH INVESTIGATION</p>


# SOURCES

### Kaggle:

User MihirSethi
dataset: Airplane_Crashes_and_Fatalities_Since_1908.csv
link: 'https://www.kaggle.com/mihirsethi007/aircrash-data'

*See: interesting code*: https://www.kaggle.com/mihirsethi007/fork-of-air-crash-analysis-1908-2009

### planecrashinfo.com:
Richard Kebabjian  
dataset: web scraping
link: 'http://www.planecrashinfo.com/database.htm'

### datos.bancomundial.org: 

International Civil Aviation Organization (ICAO)
dataset: API_IS.AIR.DPRT_DS2_es_csv_v2_2169474.csv
link: 'https://datos.bancomundial.org/indicator/IS.AIR.DPRT'

# LIBRARIES:

In [42]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import urllib.request
from bs4 import BeautifulSoup

In [2]:
import warnings
warnings.filterwarnings("ignore")
plt.style.use('seaborn-bright')

## HYPOTHESES:

- The leading cause of accidents is human error, mostly on the pilot's part.
- Most deaths occur on the ground.


## PRELIMINARY ANALYSIS: KAGGLE DATASET

In [166]:
air = pd.read_csv('data/Airplane_Crashes_and_Fatalities_Since_1908.csv')

In [140]:
air.head()

Unnamed: 0,Date,Time,Location,Operator,Flight #,Route,Type,Registration,cn/In,Aboard,Fatalities,Ground,Summary
0,09/17/1908,17:18,"Fort Myer, Virginia",Military - U.S. Army,,Demonstration,Wright Flyer III,,1.0,2.0,1.0,0.0,"During a demonstration flight, a U.S. Army fly..."
1,07/12/1912,06:30,"AtlantiCity, New Jersey",Military - U.S. Navy,,Test flight,Dirigible,,,5.0,5.0,0.0,First U.S. dirigible Akron exploded just offsh...
2,08/06/1913,,"Victoria, British Columbia, Canada",Private,-,,Curtiss seaplane,,,1.0,1.0,0.0,The first fatal airplane accident in Canada oc...
3,09/09/1913,18:30,Over the North Sea,Military - German Navy,,,Zeppelin L-1 (airship),,,20.0,14.0,0.0,The airship flew into a thunderstorm and encou...
4,10/17/1913,10:30,"Near Johannisthal, Germany",Military - German Navy,,,Zeppelin L-2 (airship),,,30.0,30.0,0.0,Hydrogen gas which was being vented was sucked...


In [167]:
air.columns

Index(['Date', 'Time', 'Location', 'Operator', 'Flight #', 'Route', 'Type',
       'Registration', 'cn/In', 'Aboard', 'Fatalities', 'Ground', 'Summary'],
      dtype='object')

### Variable Notes:

- **Date:** Date of accident. These notes give the format as January 01, 2001, but the CSV stores it as MM/DD/YYYY (e.g. 09/17/1908)
- **Time:** Local time, in 24 hr. format unless otherwise specified
- **Airline/Op** (column `Operator`): Airline or operator of the aircraft
- **Flight #:** Flight number assigned by the aircraft operator
- **Route:** Complete or partial route flown prior to the accident
- **AC Type** (column `Type`): Aircraft type
- **Reg** (column `Registration`): ICAO registration of the aircraft
- **cn / ln** (column `cn/In`): Construction or serial number / Line or fuselage number
- **Aboard:** Total aboard (passengers / crew)
- **Fatalities:** Total fatalities aboard (passengers / crew)
- **Ground:** Total killed on the ground
- **Summary:** Brief description of the accident and cause if known

In [168]:
# Rename a column so that its name has no spaces or special characters
air.rename(columns={'Flight #':'nFlight'},
                        inplace = True)

In [169]:
air.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5268 entries, 0 to 5267
Data columns (total 13 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Date          5268 non-null   object 
 1   Time          3049 non-null   object 
 2   Location      5248 non-null   object 
 3   Operator      5250 non-null   object 
 4   nFlight       1069 non-null   object 
 5   Route         3562 non-null   object 
 6   Type          5241 non-null   object 
 7   Registration  4933 non-null   object 
 8   cn/In         4040 non-null   object 
 9   Aboard        5246 non-null   float64
 10  Fatalities    5256 non-null   float64
 11  Ground        5246 non-null   float64
 12  Summary       4878 non-null   object 
dtypes: float64(3), object(10)
memory usage: 535.2+ KB


In [170]:
air.isna().sum()

Date               0
Time            2219
Location          20
Operator          18
nFlight         4199
Route           1706
Type              27
Registration     335
cn/In           1228
Aboard            22
Fatalities        12
Ground            22
Summary          390
dtype: int64

In [171]:
air['unos'] = 1

In [242]:
# Replace the NaNs with 'Unknown' in all categorical and free-text
# (non-numeric) columns, where missing values add nothing to the analysis
lista = ['Time','Location','Operator','nFlight','Route','Type','Registration','cn/In',
         'Summary']
for campo in lista:
    # Assign the result back instead of calling fillna(inplace=True) on a column selection
    air[campo] = air[campo].fillna('Unknown')

In [172]:
air.groupby('Date')[['unos']].sum().sort_values('unos',ascending = False).head(10)

Unnamed: 0_level_0,unos
Date,Unnamed: 1_level_1
09/11/2001,4
08/31/1988,4
08/27/1992,4
02/28/1973,4
06/18/1972,4
08/28/1976,4
11/23/1962,3
03/23/1994,3
11/15/1934,3
05/13/1957,3
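
The helper column `unos` is only there to count rows per group. As a minimal sketch, the same count can be obtained with `value_counts`. The small frame below is made up for illustration and is not the real CSV:

```python
import pandas as pd

# Toy data standing in for the real 'Date' column (illustrative values only)
demo = pd.DataFrame({'Date': ['09/11/2001', '09/11/2001', '08/31/1988',
                              '09/11/2001', '08/31/1988', '11/23/1962']})

# value_counts() counts rows per distinct value and sorts them in descending
# order, so no helper column of ones is needed
crashes_per_date = demo['Date'].value_counts()
print(crashes_per_date.head(10))
```

On the real data this would be `air['Date'].value_counts().head(10)`.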


In [186]:
air.groupby('Time')[['unos']].sum().sort_values('unos',ascending = False).head()

Unnamed: 0_level_0,unos
Time,Unnamed: 1_level_1
15:00,32
12:00,31
11:00,29
19:30,26
16:00,26


In [198]:
# Analyze the Time field by string length.
air['len_time'] = air['Time'].str.len()

In [189]:
# Only a few values have a different length; I inspect them and fix them
air.groupby('len_time')[['unos']].sum().sort_values('unos',ascending = False)#.head(10)

Unnamed: 0_level_0,unos
len_time,Unnamed: 1_level_1
5.0,3033
4.0,7
7.0,6
6.0,3


In [199]:
air.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5268 entries, 0 to 5267
Data columns (total 15 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Date          5268 non-null   object 
 1   Time          5268 non-null   object 
 2   Location      5248 non-null   object 
 3   Operator      5250 non-null   object 
 4   nFlight       1069 non-null   object 
 5   Route         3562 non-null   object 
 6   Type          5241 non-null   object 
 7   Registration  4933 non-null   object 
 8   cn/In         4040 non-null   object 
 9   Aboard        5246 non-null   float64
 10  Fatalities    5256 non-null   float64
 11  Ground        5246 non-null   float64
 12  Summary       4878 non-null   object 
 13  unos          5268 non-null   int64  
 14  len_time      5268 non-null   int64  
dtypes: float64(3), int64(2), object(10)
memory usage: 617.5+ KB


In [151]:
air.loc[air['len_time']> 5,['Time']]

Unnamed: 0,Time
190,c: 1:00
213,c:17:00
228,c: 2:00
279,c:09:00
1462,c16:50
2599,c:09:00
3267,114:20
3390,c14:30
4838,c: 9:40


In [203]:
air['Time'] = air['Time'].map(lambda x: str(x).replace(' ','').replace('c:','').replace('c',''))
air['len_time'] = air['Time'].str.len()

In [214]:
air.loc[air['len_time']> 5,['Time']].groupby('Time').count()

Unknown


In [206]:
air.loc[3267,'Time'] = air.loc[3267,'Time'][1:]
air['len_time'] = air['Time'].str.len()

In [207]:
air.loc[3267,'Time']

'14:20'

In [201]:
air['len_time'] = air['Time'].str.len()
air.loc[air['len_time']<5,['Time']]

Unnamed: 0,Time
711,1:30
3536,0943
3584,1:00
4298,2:40
4848,2:00
4849,8:02
5156,9:30


In [216]:
air.loc[3536,'Time'] = '09:43'
air['len_time'] = air['Time'].str.len()

In [217]:
air.loc[air['len_time']<5,['Time']]

Unnamed: 0,Time
190,1:00
228,2:00
711,1:30
3584,1:00
4298,2:40
4838,9:40
4848,2:00
4849,8:02
5156,9:30


In [220]:
air.loc[air['len_time']== 4,['Time']] = '0' + air.loc[air['len_time']==4,['Time']]
air['len_time'] = air['Time'].str.len()

In [221]:
air.loc[air['len_time']<5,['Time']]

Unnamed: 0,Time
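
The cleanup steps above (drop spaces, drop the `c` / `c:` prefix, add a colon to `0943`, zero-pad single-digit hours) could be collected into one function. This is a sketch, not the notebook's method: `normalize_time` is a hypothetical helper, and reading the `c` prefix as "circa" is an assumption. Values it cannot parse, such as `114:20`, become `'Unknown'` and still need a manual fix like the one applied to row 3267.

```python
import re
import pandas as pd

def normalize_time(value):
    """Return an 'HH:MM' string, or 'Unknown' when the value cannot be parsed."""
    if pd.isna(value):
        return 'Unknown'
    # Drop spaces, then a leading 'c' or 'c:' marker (assumed to mean "circa")
    text = re.sub(r'^c:?', '', str(value).replace(' ', ''))
    # Accept H:MM, HH:MM and HHMM
    match = re.fullmatch(r'(\d{1,2}):?(\d{2})', text)
    if not match:
        return 'Unknown'
    hour, minute = int(match.group(1)), int(match.group(2))
    if hour > 23 or minute > 59:
        return 'Unknown'
    return f'{hour:02d}:{minute:02d}'

# Sample values taken from the outputs above, plus a missing one
raw = pd.Series(['c: 1:00', 'c:17:00', 'c16:50', '0943', '8:02', '114:20', None, '15:00'])
cleaned = raw.map(normalize_time)
print(cleaned.tolist())
```

Because every result is either `'Unknown'` or exactly five characters long, the `len_time` column would no longer be needed to find malformed values.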


In [222]:
air.groupby('Time')[['unos']].sum().sort_values('unos',ascending = False)#.head(10)

Unnamed: 0_level_0,unos
Time,Unnamed: 1_level_1
Unknown,2219
15:00,32
12:00,31
11:00,29
19:30,26
...,...
13:28,1
13:26,1
13:24,1
13:22,1


In [235]:
air.groupby('Route')[['unos']].sum().sort_values('unos',ascending = False).head(10)

Unnamed: 0_level_0,unos
Route,Unnamed: 1_level_1
Unknown,1706
Training,81
Sightseeing,29
Test flight,17
Test,6
Sao Paulo - Rio de Janeiro,5
Villavicencio - Mitu,4
Bogota - Barranquilla,4
Saigon - Paris,4
Sao Paulo - Porto Alegre,4


In [88]:
# Number of records with a missing Route
air['Route'].isnull().sum()

1706

In [224]:
# Replace the NaNs in Route with 'Unknown'
air['Route'] = air['Route'].fillna('Unknown')

In [233]:
air[air['Route'].str.contains('-')]

Unnamed: 0,Date,Time,Location,Operator,nFlight,Route,Type,Registration,cn/In,Aboard,Fatalities,Ground,Summary,unos,len_time
56,09/06/1921,Unknown,"Paris, France",Franco-Roumaine,,Varsovie - Strasbourg - Paris,Potez IX,F-ADCD,160,5.0,5.0,0.0,Crashed while making an approach to Le Bourget...,1,7
80,12/23/1923,02:30,Over the Mediterranean Sea,Military - French Navy,,Toulon - Algiers,Zeppelin Dixmunde (airship),L-72,,52.0,52.0,0.0,"Crashed while on a flight from Toulon, France ...",1,5
83,04/24/1924,Unknown,Over the English Channel,KLM Royal Dutch Airlines,,"Lympne, England - Rotterdam, The Netherlands",Fokker F.III,H-NABS,1535,3.0,3.0,0.0,,1,7
96,09/03/1925,05:30,"Caldwell, Ohio",Military - U.S. Navy,,"Lakehurst, NJ - S.t Louis, MO",Dirigible ZR-1 Shenandoah (airship),ZR-1,,43.0,14.0,0.0,The Shenandoah was flying over Southern Ohio w...,1,5
102,07/03/1926,Unknown,"Rossaugpt, Czechoslovakia",Compagnie Internationale de Navigation Aérienne,,Paris - Prague,Caudron C-61,F-AFBT,5307,7.0,7.0,0.0,Crashed while en route.,1,7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5261,04/29/2009,06:00,"Massamba, DemocratiRepubliof Congo",Bako Air,,"Bangui, CAR- Brazzaville, Congo - Harare, Zimb...",Boeing B-737-200,TL-ADM,22264/753,7.0,7.0,0.0,Crashed while en route on a ferrying flight. T...,1,5
5263,05/20/2009,06:30,"Near Madiun, Indonesia",Military - Indonesian Air Force,,Jakarta - Maduin,Lockheed C-130 Hercules,A-1325,1982,112.0,98.0,2.0,"While on approach, the military transport cras...",1,5
5264,05/26/2009,Unknown,"Near Isiro, DemocratiRepubliCongo",Service Air,,Goma - Isiro,Antonov An-26,9Q-CSA,5005,4.0,4.0,,The cargo plane crashed while on approach to I...,1,7
5265,06/01/2009,00:15,"AtlantiOcean, 570 miles northeast of Natal, Br...",Air France,447,Rio de Janeiro - Paris,Airbus A330-203,F-GZCP,660,228.0,228.0,0.0,The Airbus went missing over the AtlantiOcean ...,1,5


In [225]:
air.groupby('Location')[['unos']].sum().sort_values('unos',ascending = False)

Unnamed: 0_level_0,unos
Location,Unnamed: 1_level_1
"Moscow, Russia",15
"Sao Paulo, Brazil",15
"Rio de Janeiro, Brazil",14
"Manila, Philippines",13
"Anchorage, Alaska",13
...,...
"Mannheim, Germany",1
"Manta, Ecuador",1
"Manus Island, New Guinea",1
"Manzanares, Colombia",1


In [226]:
# SPLIT THE LOCATION VALUES TO OBTAIN A COLUMN WITH THE COUNTRIES
separado = air["Location"].str.split(",", n=-1, expand=True)
separado

Unnamed: 0,0,1,2,3
0,Fort Myer,Virginia,,
1,AtlantiCity,New Jersey,,
2,Victoria,British Columbia,Canada,
3,Over the North Sea,,,
4,Near Johannisthal,Germany,,
...,...,...,...,...
5263,Near Madiun,Indonesia,,
5264,Near Isiro,DemocratiRepubliCongo,,
5265,AtlantiOcean,570 miles northeast of Natal,Brazil,
5266,Near Port Hope Simpson,Newfoundland,Canada,


In [228]:
air.groupby('Operator')[['unos']].sum().sort_values('unos',ascending = False)#.head(10)

Unnamed: 0_level_0,unos
Operator,Unnamed: 1_level_1
Aeroflot,179
Military - U.S. Air Force,176
Air France,70
Deutsche Lufthansa,65
United Air Lines,44
...,...
Everest Air,1
Europe Aero Service EAS,1
Eurojet Italila,1
Euroair,1


In [243]:
air.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5268 entries, 0 to 5267
Data columns (total 15 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Date          5268 non-null   object 
 1   Time          5268 non-null   object 
 2   Location      5268 non-null   object 
 3   Operator      5268 non-null   object 
 4   nFlight       5268 non-null   object 
 5   Route         5268 non-null   object 
 6   Type          5268 non-null   object 
 7   Registration  5268 non-null   object 
 8   cn/In         5268 non-null   object 
 9   Aboard        5246 non-null   float64
 10  Fatalities    5256 non-null   float64
 11  Ground        5246 non-null   float64
 12  Summary       5268 non-null   object 
 13  unos          5268 non-null   int64  
 14  len_time      5268 non-null   int64  
dtypes: float64(3), int64(2), object(10)
memory usage: 617.5+ KB


# NUMERIC VARIABLES

In [245]:
air.describe()

Unnamed: 0,Aboard,Fatalities,Ground,unos,len_time
count,5246.0,5256.0,5246.0,5268.0,5268.0
mean,27.554518,20.068303,1.608845,1.0,5.842445
std,43.076711,33.199952,53.987827,0.0,0.987604
min,0.0,0.0,0.0,1.0,5.0
25%,5.0,3.0,0.0,1.0,5.0
50%,13.0,9.0,0.0,1.0,5.0
75%,30.0,23.0,0.0,1.0,7.0
max,644.0,583.0,2750.0,1.0,7.0


In [246]:
air[air['Ground']==2750]

Unnamed: 0,Date,Time,Location,Operator,nFlight,Route,Type,Registration,cn/In,Aboard,Fatalities,Ground,Summary,unos,len_time
4803,09/11/2001,08:47,"New York City, New York",American Airlines,11,Boston - Los Angeles,Boeing 767-223ER,N334AA,22332/169,92.0,92.0,2750.0,The aircraft was hijacked shortly after it lef...,1,5
4804,09/11/2001,09:03,"New York City, New York",United Air Lines,175,Boston - Los Angeles,Boeing B-767-222,N612UA,21873/41,65.0,65.0,2750.0,The aircraft was hijacked shortly after it lef...,1,5


In [248]:
airno11s = air[air['Ground']<2750]
airno11s.describe()

Unnamed: 0,Aboard,Fatalities,Ground,unos,len_time
count,5234.0,5244.0,5244.0,5244.0,5244.0
mean,27.569736,20.067315,0.560641,1.0,5.839054
std,43.105573,33.212478,5.768364,0.0,0.987057
min,0.0,0.0,0.0,1.0,5.0
25%,5.0,3.0,0.0,1.0,5.0
50%,13.0,9.0,0.0,1.0,5.0
75%,30.0,23.0,0.0,1.0,7.0
max,644.0,583.0,225.0,1.0,7.0
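
The row count falls from 5268 to 5244 here, although only two rows have `Ground == 2750`. The other 22 rows match the number of missing `Ground` values reported earlier, which is consistent with `NaN < 2750` evaluating to `False`. A small sketch on made-up data shows the difference between the two filters:

```python
import numpy as np
import pandas as pd

# Toy column with two outlier rows and one missing value (illustrative only)
demo = pd.DataFrame({'Ground': [0.0, 2750.0, np.nan, 225.0, 2750.0]})

# NaN < 2750 is False, so this filter drops the NaN row as well
below = demo[demo['Ground'] < 2750]

# ne() treats NaN as "not equal", so only the two 2750 rows are removed
not_outlier = demo[demo['Ground'].ne(2750)]

print(len(below), len(not_outlier))
```

Which variant is preferable depends on whether rows with unknown ground fatalities should stay in the summary statistics.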


In [252]:
air[air['Ground']<225].sort_values('Ground',ascending = False)

Unnamed: 0,Date,Time,Location,Operator,nFlight,Route,Type,Registration,cn/In,Aboard,Fatalities,Ground,Summary,unos,len_time
4802,09/11/2001,09:45,"Arlington, Virginia.",American Airlines,77,Washington D.C. - Los Angeles,Boeing B-757-223,N644AA,24602/365,64.0,64.0,125.0,The aircraft was hijacked after taking off fro...,1,5
2933,10/13/1976,Unknown,"Santa Cruz, Bolivia",Lloyd Aéreo Boliviano,Unknown,Santa Cruz - Viru,Boeing B-707-31,N730JP,17671/48,3.0,3.0,113.0,The aircraft failed to climb and crashed into ...,1,7
2091,12/24/1966,19:15,"Binh Tahi, Da Nang, Vietnam",Flying Tiger Line,Unknown,Unknown,Canadair CL-44D4-1,N228SW,31,4.0,4.0,107.0,The cargo plane undershot runway by 1 mile whi...,1,5
1833,02/01/1963,17:15,"Ankara, Turkey",Middle East Airlines / Military - Turkish Air ...,265,Nicosia - Ankara,Vickers Viscount 754D,OD-ADE,244,17.0,17.0,87.0,Midair collision between a civilian and milita...,1,5
4875,07/27/2002,12:45,"Lviv, Ukraine",Military - Ukraine Air Force,Unknown,Unknown,Sukhoi Su-27,42,Unknown,2.0,0.0,85.0,The Su-76 was performing aerobatics when it cr...,1,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1793,08/01/1962,13:15,"Near Kathmandu, Nepal",Royal Nepal Airlines,Unknown,Katmandu - New Deli,Douglas DC-3,9N-AAH,6216,10.0,10.0,0.0,"Crashed into a 11,200 ft. mountain, 100 miles ...",1,5
1792,07/30/1962,Unknown,"Coulommiers, France",Air France,Unknown,Training,Douglas DC-3,F-BAOE,11769,8.0,4.0,0.0,Crashed during a training flight.,1,7
1791,07/28/1962,Unknown,"Sochi, Russia",Aeroflot,Unknown,Unknown,Antonov An-10A,CCCP-11186,0402003,81.0,81.0,0.0,Struck a mountain while attempting to land. Ch...,1,7
1790,07/22/1962,23:19,"Honolulu, Hawaii",Canadian PacifiAir Lines,323,"Honolulu - Nadi, Fiji",Bristol Britannia 314,CF-CZB,13394,40.0,27.0,0.0,"Shortly after takeoff, a fire warning indicati...",1,5


In [63]:
'''
To assess whether I can include hypotheses about the cause of the accident,
I build a dictionary and a list of causes, in case I can use them later.
The source is a URL within planecrashinfo itself:
http://www.planecrashinfo.com/cause.htm
'''

# Accident causes as a dictionary
dicausa = {'PILOT ERROR':['Improper procedure', 'Flying VFR into IFR conditions',
                          'Controlled flight into terrain','Descending below minima',
                          'Spatial disorientation', 'Premature descent','Excessive landing speed',
                          'Missed runway','Fuel starvation','Navigation error',
                          'Wrong runway takeoff/landing','Midair collision caused by primary pilot'],
           'MECHANICAL':['Engine failure','Equipment failure','Structural failure','Design flaw'],
           'WEATHER':['Severe turbulence','Windshear','Mountain wave','Poor visibility','Heavy rain',
                      'Severe winds','Icing','Thunderstorms','Lightning strike'],
           'SABOTAGE':['Hijacking','Shot down','Explosive device aboard','Pilot suicide'],
           'OTHER':['ATC error','Ground crew error','Overloaded','Improperly loaded cargo',
                    'Bird strike','Fuel contamination','Pilot incapacitation','Obstruction on runway',
                    'Midair collision caused by other aircraft',
                    'Fire/smoke in flight (cabin, cockpit, cargo hold)','Maintenance error']
          }

listcausa = ['Improper procedure', 'Flying VFR into IFR conditions',
             'Controlled flight into terrain','Descending below minima',
             'Spatial disorientation', 'Premature descent','Excessive landing speed',
             'Missed runway','Fuel starvation','Navigation error',
             'Wrong runway takeoff/landing','Midair collision caused by primary pilot',
             'Engine failure','Equipment failure','Structural failure','Design flaw',
             'Severe turbulence','Windshear','Mountain wave','Poor visibility','Heavy rain',
             'Severe winds','Icing','Thunderstorms','Lightning strike',
             'Hijacking','Shot down','Explosive device aboard','Pilot suicide',
             'ATC error','Ground crew error','Overloaded','Improperly loaded cargo',
             'Bird strike','Fuel contamination','Pilot incapacitation','Obstruction on runway',
             'Midair collision caused by other aircraft',
             'Fire/smoke in flight (cabin, cockpit, cargo hold)','Maintenance error']

In [71]:
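# Look for each value of the cause list in the 'Summary' field
# (object dtype so the column can hold the default 0 and the cause strings)
air['sum2'] = pd.Series(0, index=air.index, dtype=object)

for causa in listcausa:
    # regex=False matches the phrase literally (one entry contains parentheses);
    # na=False treats missing summaries as non-matches
    air.loc[air['Summary'].str.contains(causa, regex=False, na=False), 'sum2'] = causa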

In [77]:
# Because Summary is free text with no fixed wording, only 325 records
# contain one of the phrases literally, so this will not be enough
# for the analysis.
air.groupby('sum2')['unos'].sum()

sum2
0                                 4943
ATC error                            8
Controlled flight into terrain       6
Design flaw                          1
Engine failure                      51
Fuel contamination                   6
Fuel starvation                     19
Heavy rain                           1
Hijacking                            3
Icing                               37
Improper procedure                   4
Improperly loaded cargo              2
Maintenance error                    2
Navigation error                    13
Overloaded                          21
Poor visibility                      5
Premature descent                   10
Severe turbulence                    3
Severe winds                         1
Shot down                           95
Spatial disorientation              12
Structural failure                   7
Thunderstorms                        1
Windshear                           17
Name: unos, dtype: int64
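
Literal, case-sensitive matching misses summaries that word a cause differently or in lower case. As a hedged sketch, a case-insensitive match that records the category rather than the phrase could look like this. The `causes` dictionary is a reduced stand-in for `dicausa`, and the summaries are invented examples:

```python
import pandas as pd

# Reduced, illustrative version of the cause dictionary
causes = {'PILOT ERROR': ['Fuel starvation', 'Navigation error'],
          'WEATHER': ['Icing', 'Windshear']}

# Invented summaries, including one missing value
summaries = pd.Series(['Crashed after fuel starvation on approach.',
                       'Severe icing on the wings.',
                       'Crashed while en route.',
                       None])

category = pd.Series('Unmatched', index=summaries.index)
for group, phrases in causes.items():
    for phrase in phrases:
        # Literal, case-insensitive substring match; missing summaries never match
        hit = summaries.str.contains(phrase, case=False, regex=False, na=False)
        category.loc[hit] = group

print(category.tolist())
```

Case-insensitive substring matching also raises the risk of false positives (for example, `icing` occurs inside `servicing`), so the matches would still need a manual check.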

In [89]:
# Work on a copy of the 'air' df: I am going to remove NaNs so that
# string searches with .str work.
air2 = air.copy()

In [94]:
air2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5268 entries, 0 to 5267
Data columns (total 15 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Date          5268 non-null   object 
 1   Time          3049 non-null   object 
 2   Location      5268 non-null   object 
 3   Operator      5250 non-null   object 
 4   nFlight       1069 non-null   object 
 5   Route         3562 non-null   object 
 6   Type          5241 non-null   object 
 7   Registration  4933 non-null   object 
 8   cn/In         4040 non-null   object 
 9   Aboard        5246 non-null   float64
 10  Fatalities    5256 non-null   float64
 11  Ground        5246 non-null   float64
 12  Summary       5268 non-null   object 
 13  unos          5268 non-null   int64  
 14  sum2          5268 non-null   object 
dtypes: float64(3), int64(1), object(11)
memory usage: 617.5+ KB


In [93]:
lista = ['Operator','nFlight','Route','Type','Registration','cn/In',
         'Time']
air2['Location'] = air2['Location'].fillna('Unknown')

# PRELIMINARY ANALYSIS: planecrashinfo.com

In [78]:
url = 'http://www.planecrashinfo.com/CATATC.htm'

tablas = pd.read_html(url)

tablas[0]

Unnamed: 0,0,1,2,3
0,Air Traffic Control Errors,Air Traffic Control Errors,Air Traffic Control Errors,Air Traffic Control Errors
1,6/13/1947,"Leesburg, Virginia",Pennsylvania AL,A contributing cause was the faulty clearance ...
2,05/22/1948,"Khabarovsk, Russia",Aeroflot,ATC error.
3,11/01/1949,"Arlington, Virginia",Eastern Air Lines,ATC error.
4,10/05/1952,"Skvoritsy, Russia",Aeroflot,Midair collision with a TC-62 aircraft. ATC er...
5,04/14/1958,"Castel de Fels, Spain",Aviaco,Another aircraft was permitted to takeoff with...
6,07/21/1961,"Shemya, Alaska",Alaska AL,Lack of guidance from air traffic controller d...
7,02/08/1965,"New York, New York",Eastern AL,Placement of the two aircraft on a near head o...
8,03/09/1967,"Urbana, Ohio",Trans World Airlines,ATC systems inadequate to separate controlled ...
9,03/05/1969,"San Juan, Puerto Rico",Prinair,A trainee vectored the aircraft into mountaino...
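
In the table returned by `read_html`, the columns are labelled 0 to 3 and the first row repeats the page title. A minimal sketch of tidying such a table follows. It uses a hand-built stand-in for `tablas[0]`, and the column names `Date`, `Location`, `Operator`, `Summary` and `Cause` are assumptions, not names provided by the site:

```python
import pandas as pd

# Stand-in for tablas[0]: integer column labels and a title repeated across row 0
raw = pd.DataFrame([
    ['Air Traffic Control Errors'] * 4,
    ['05/22/1948', 'Khabarovsk, Russia', 'Aeroflot', 'ATC error.'],
    ['11/01/1949', 'Arlington, Virginia', 'Eastern Air Lines', 'ATC error.'],
])

# Keep the title as a value, drop the title row, then name the columns
cause = raw.iloc[0, 0]
atc = raw.iloc[1:].reset_index(drop=True)
atc.columns = ['Date', 'Location', 'Operator', 'Summary']  # assumed names
atc['Cause'] = cause
print(atc)
```

Keeping the page title in a `Cause` column would let tables from several cause pages be concatenated and later compared with the Kaggle data.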


In [46]:
url = 'https://www.airfleets.es/crash/crash_year_1970.htm'
html = urllib.request.urlopen(url)
soup = BeautifulSoup(html, 'html.parser')

# look up elements by tag and attributes

tag = soup.find_all('div',class_ ='ten columns')
tag[0]

HTTPError: HTTP Error 403: Forbidden
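
Some servers answer `HTTP Error 403: Forbidden` when a request carries the default Python `User-Agent`. Whether that is the cause here is an assumption. The sketch below only builds a request with an explicit header: `build_request` and the user-agent string are hypothetical, and the network call is left commented out because the site may still refuse it or disallow scraping in its terms.

```python
import urllib.request

def build_request(url, user_agent='Mozilla/5.0 (compatible; notebook-example)'):
    """Build a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})

url = 'https://www.airfleets.es/crash/crash_year_1970.htm'
req = build_request(url)

# Network call, not executed here; it may still be refused by the server:
# html = urllib.request.urlopen(req, timeout=10)
print(req.get_header('User-agent'))
```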