# List of Fighter and Attack Aircraft 

This is a list of military aircraft designed primarily for air-to-air combat (fighters) or for ground attack (attack aircraft).

We will use pandas, requests, and Beautiful Soup to scrape the Wikipedia lists of fighter aircraft and attack aircraft.

In [1]:
import pandas as pd
import requests
import re
from bs4 import BeautifulSoup

page = requests.get('https://en.wikipedia.org/wiki/List_of_fighter_aircraft').text
soup = BeautifulSoup(page, 'html.parser')
table = soup.find('table', class_="wikitable sortable")

df = pd.read_html(str(table))
df = pd.concat(df)
df['Purpose'] = 'Fighter'
print(df)

                                         Type  Country                 Class  \
0                                    ACAZ C.2  Belgium      Two-seat fighter   
1                     Adamoli-Cattani fighter    Italy                   NaN   
2                                    AD Scout       UK  Zeppelin interceptor   
3     ADA/HAL Advanced Medium Combat Aircraft    India                   NaN   
4                       AEG D.I, D.II & D.III  Germany                   NaN   
...                                       ...      ...                   ...   
1302                   Yakovlev Yak-50 (1949)     USSR                   NaN   
1303                            Yatsenko I-28     USSR                   NaN   
1304                          Yokosuka D4Y2-S    Japan         Night fighter   
1305                     Yokosuka P1Y2 Kyokko    Japan         Night fighter   
1306            Zeppelin-Lindau (Dornier) D.I  Germany                   NaN   

      Date      Status  No.          No

In [2]:
page = requests.get('https://en.wikipedia.org/wiki/List_of_attack_aircraft').text
soup = BeautifulSoup(page, 'html.parser')
table = soup.find('table', class_="wikitable sortable")

df2 = pd.read_html(str(table))
df2 = pd.concat(df2)
df2['Purpose'] = 'Attack'
print(df2)

                   Type  Country  Class  Date  Status   No.  Notes Purpose
0              AEG DJ.I  Germany    NaN  1918     NaN    1+    NaN  Attack
1               AEG J.I  Germany    NaN  1917     NaN   609    NaN  Attack
2      Aermacchi MB-326    Italy    NaN  1957     NaN   650    NaN  Attack
3      Aermacchi MB-339    Italy    NaN  1976     NaN  213+    NaN  Attack
4      Aermacchi SF.260    Italy    NaN  1964     NaN  860+    NaN  Attack
..                  ...      ...    ...   ...     ...   ...    ...     ...
249  Westland Whirlwind       UK    NaN  1938     NaN   116    NaN  Attack
250     Westland Wyvern       UK    NaN  1946     NaN   127    NaN  Attack
251           Xian JH-7    China    NaN  1988     NaN   240    NaN  Attack
252        Yokosuka B4Y    Japan    NaN  1935     NaN   205    NaN  Attack
253        Yokosuka D4Y    Japan    NaN  1940     NaN  2038    NaN  Attack

[254 rows x 8 columns]

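Both cells above repeat the same fetch, parse, and label steps, so they could be folded into one function. The sketch below is one possible way to do that, not the notebook's original code: the names `table_to_frame` and `scrape_table` are hypothetical, and the 30-second timeout is an arbitrary choice. It also wraps the table HTML in `StringIO`, because recent pandas versions deprecate passing a literal HTML string to `read_html`.

```python
from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup


def table_to_frame(html, purpose):
    """Parse the first 'wikitable sortable' table in `html` and label its rows.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="wikitable sortable")
    if table is None:
        raise ValueError("no table with class 'wikitable sortable' found")
    # Only one table is passed in, so read_html returns a one-element list
    frame = pd.read_html(StringIO(str(table)))[0]
    frame["Purpose"] = purpose
    return frame


def scrape_table(url, purpose):
    """Download `url` and return its first sortable wikitable as a DataFrame."""
    response = requests.get(url, timeout=30)  # timeout value is an assumption
    response.raise_for_status()               # fail loudly on HTTP errors
    return table_to_frame(response.text, purpose)
```

With this in place, `scrape_table('https://en.wikipedia.org/wiki/List_of_fighter_aircraft', 'Fighter')` and `scrape_table('https://en.wikipedia.org/wiki/List_of_attack_aircraft', 'Attack')` would stand in for the two cells above.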

### Reading data in pandas

In [3]:
# Read headers
df.columns

Index(['Type', 'Country', 'Class', 'Date', 'Status', 'No.', 'Notes',
       'Purpose'],
      dtype='object')

### Concatenate dataframes

In [4]:
frames = [df, df2]

result = pd.concat(frames)

result

| | Type | Country | Class | Date | Status | No. | Notes | Purpose |
|---|---|---|---|---|---|---|---|---|
| 0 | ACAZ C.2 | Belgium | Two-seat fighter | 1926 | Prototype | 1 | | Fighter |
| 1 | Adamoli-Cattani fighter | Italy | | 1918 | Prototype | 1 | | Fighter |
| 2 | AD Scout | UK | Zeppelin interceptor | 1915 | Prototype | 4 | | Fighter |
| 3 | ADA/HAL Advanced Medium Combat Aircraft | India | | 2019 | Project | 0 | [1] | Fighter |
| 4 | AEG D.I, D.II & D.III | Germany | | 1917 | Prototype | 3 | | Fighter |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 249 | Westland Whirlwind | UK | | 1938 | | 116 | | Attack |
| 250 | Westland Wyvern | UK | | 1946 | | 127 | | Attack |
| 251 | Xian JH-7 | China | | 1988 | | 240 | | Attack |
| 252 | Yokosuka B4Y | Japan | | 1935 | | 205 | | Attack |

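Note that the concatenated frame keeps each source table's own row labels, so labels such as 249 to 252 occur once for a fighter and once for an attack aircraft. If unique labels are wanted, `pd.concat` accepts `ignore_index=True`. A minimal sketch on toy data (the two small frames are illustrative stand-ins, not the scraped tables):

```python
import pandas as pd

# Toy stand-ins for the scraped frames (illustrative values only)
fighters = pd.DataFrame({"Type": ["ACAZ C.2", "AD Scout"], "Purpose": "Fighter"})
attackers = pd.DataFrame({"Type": ["AEG J.I", "Xian JH-7"], "Purpose": "Attack"})

# Default behaviour: each frame keeps its own labels, so 0 and 1 repeat
stacked = pd.concat([fighters, attackers])
print(stacked.index.tolist())

# ignore_index=True renumbers the rows from 0 to n-1
renumbered = pd.concat([fighters, attackers], ignore_index=True)
print(renumbered.index.tolist())
```

Repeated labels are harmless for the filtering and sorting done here, but they matter as soon as rows are looked up by label with `.loc`.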

### Make changes to the data

In [5]:
# Drop column
result = result.drop(columns=['Notes'])

result

| | Type | Country | Class | Date | Status | No. | Purpose |
|---|---|---|---|---|---|---|---|
| 0 | ACAZ C.2 | Belgium | Two-seat fighter | 1926 | Prototype | 1 | Fighter |
| 1 | Adamoli-Cattani fighter | Italy | | 1918 | Prototype | 1 | Fighter |
| 2 | AD Scout | UK | Zeppelin interceptor | 1915 | Prototype | 4 | Fighter |
| 3 | ADA/HAL Advanced Medium Combat Aircraft | India | | 2019 | Project | 0 | Fighter |
| 4 | AEG D.I, D.II & D.III | Germany | | 1917 | Prototype | 3 | Fighter |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 249 | Westland Whirlwind | UK | | 1938 | | 116 | Attack |
| 250 | Westland Wyvern | UK | | 1946 | | 127 | Attack |
| 251 | Xian JH-7 | China | | 1988 | | 240 | Attack |
| 252 | Yokosuka B4Y | Japan | | 1935 | | 205 | Attack |


### Filtering data

In [6]:
# Find Aircraft from the US
us_aircraft = result.loc[(result['Country'] == "US")]
us_aircraft

| | Type | Country | Class | Date | Status | No. | Purpose |
|---|---|---|---|---|---|---|---|
| 14 | Aeromarine PG-1 | US | Fighter-bomber | 1922 | Prototype | 1 | Fighter |
| 37 | Albree Pigeon-Fraser Pursuit | US | | 1917 | Prototype | 3 | Fighter |
| 129 | Bell YFM-1 Airacuda | US | Interceptor | 1937 | Production | 13 | Fighter |
| 130 | Bell XFL Airabonita | US | Carrier fighter | 1940 | Prototype | 1 | Fighter |
| 131 | Bell P-39 Airacobra | US | | 1938 | Production | 9584 | Fighter |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 242 | Vought F4U Corsair | US | | 1940 | | 12571 | Attack |
| 243 | Vought SB2U Vindicator | US | | 1936 | | 260 | Attack |
| 244 | Vought SBU Corsair | US | | 1933 | | 125 | Attack |
| 245 | Vultee V-11, V-12 and A-19 | US | | 1935 | | 224 | Attack |

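The same `.loc` pattern extends to more than one condition. A small sketch on toy data (the frame below is an illustrative stand-in for `result`, reusing a few names from the tables above):

```python
import pandas as pd

# Toy frame with the same column names as `result` (illustrative rows only)
aircraft = pd.DataFrame({
    "Type": ["Bell P-39 Airacobra", "Vought F4U Corsair", "AD Scout", "Xian JH-7"],
    "Country": ["US", "US", "UK", "China"],
    "Purpose": ["Fighter", "Attack", "Fighter", "Attack"],
})

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
us_fighters = aircraft.loc[(aircraft["Country"] == "US") & (aircraft["Purpose"] == "Fighter")]

# isin() keeps rows whose value matches any entry in the list
us_or_uk = aircraft.loc[aircraft["Country"].isin(["US", "UK"])]

print(us_fighters)
print(us_or_uk)
```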

### Sorting data by Date - Part 1

In [7]:
# sort_values returns a new frame; reassigning avoids inplace=True on a filtered slice
us_aircraft = us_aircraft.sort_values('Date')

TypeError: '<' not supported between instances of 'str' and 'int'

Sorting by Date raised a TypeError, which means the column holds a mix of strings and integers rather than a single numeric type.

On closer inspection, several entries contain question marks or parentheses.

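One way to confirm this kind of problem is to count the Python types stored in the column and list the entries that are not plain digits. The sketch below uses a toy Series; its values are made up to mimic the question marks and parentheses described above and are not taken from the scraped data.

```python
import pandas as pd

# Toy Date column mixing integers and strings (illustrative values only)
dates = pd.Series([1926, "1918", "1940?", "(1937)", 1915], name="Date")

# Count the Python types stored in the column; more than one type
# is what makes '<' comparisons fail during sorting
print(dates.map(type).value_counts())

# List the entries that are not made up of digits only
as_text = dates.astype(str)
suspicious = dates[~as_text.str.fullmatch(r"\d+")]
print(suspicious.tolist())
```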
### Remove special characters from Date column by extracting numbers only

In [8]:
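# Keep only the first run of digits in each Date string.
# The raw string avoids an invalid-escape warning; expand=False returns a Series.
us_aircraft = us_aircraft.assign(Date = lambda x: x['Date'].str.extract(r'(\d+)', expand=False))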

### Sorting data by Date - Part 2

In [9]:
us_aircraft = us_aircraft.sort_values('Date')

us_aircraft

| | Type | Country | Class | Date | Status | No. | Purpose |
|---|---|---|---|---|---|---|---|
| 32 | Boeing GA-1 | US | | 1920 | | 10 | Attack |
| 55 | Curtiss Falcon | US | | 1924 | | 488 | Attack |
| 77 | Douglas XA-2 | US | | 1926 | | 1 | Attack |
| 155 | Martin BM | US | | 1929 | | 35 | Attack |
| 96 | Fokker XA-7 | US | | 1931 | | 1 | Attack |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1254 | Vultee XP-54 | US | | | Prototype | 2 | Fighter |
| 1255 | Vultee P-66 Vanguard | US | | | Production | 146 | Fighter |
| 1256 | Waco CSO-A/240A | US | | | Production | 11 | Fighter |
| 1257 | Waco CTO-A | US | | | Prototype | 1 | Fighter |

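In the sorted output, the Fighter rows end up with no Date at all. A likely cause is that `.str` methods return NaN for entries that are not strings, so any dates stored as integers are lost in the extraction step. The values that survive are strings, which sort as text rather than as numbers. One possible fix, sketched on toy data and not part of the original notebook, converts every entry to text before extracting and then to numbers before sorting:

```python
import pandas as pd

# Toy stand-in for us_aircraft with a mixed-type Date column (illustrative values)
toy = pd.DataFrame({
    "Type": ["A", "B", "C", "D"],
    "Date": [1938, "1920", "1940?", "(1917)"],
})

# Without astype(str), the integer entry has no string to search and becomes NaN
lossy = toy["Date"].str.extract(r"(\d+)", expand=False)
print(lossy.isna().tolist())

# Convert to text first, pull out the first run of digits, then make it numeric
years = toy["Date"].astype(str).str.extract(r"(\d+)", expand=False)
toy = toy.assign(Date=pd.to_numeric(years, errors="coerce"))

# Numeric values now sort chronologically
toy = toy.sort_values("Date")
print(toy)
```

Entries with no digits at all would still come out as NaN here, which `sort_values` places at the end by default.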