# Exploring the World Historical Battle Database
Access to this database was granted by its creator and curator, [Dr. Shuhei Kitamura of Osaka University](https://osf.io/j357k). It's important we acknowledge his generosity. 

Canada's involvement in world conflicts is an important part of Canadian history, which makes it a good topic for applying data science to social studies. This database lets us explore battles from throughout human history and across the world.

In [30]:
import pandas as pd
import numpy as np
import plotly.express as px
import warnings
import pycountry_convert as pc
import requests

In [31]:
# Read in the data
data = pd.read_excel('data/whbd_v11.xlsx')
data2 = pd.read_csv('data/historical_battles.csv')
data = data.merge(data2,how='left')
data = data[data['year'].notna()]
data['year'] = data['year'].astype(int)
data.sort_values('year',inplace=True)
data

Unnamed: 0,uid,bid,lr,bname,year,year_end,war,bell,mult_sides,win,...,entire,unknown,ongoing,plan,nopage,wd_url,wp_url,casualties,True Location,continent
142,143,53,l,Battle of Zhuolu,-2500,,,Yanhuang tribe,,win,...,,,,,,http://www.wikidata.org/entity/Q1064923,https://en.wikipedia.org/wiki/Battle_of_Zhuolu,,,
143,144,53,r,Battle of Zhuolu,-2500,,,Jiuli tribes,,loss,...,,,,,,http://www.wikidata.org/entity/Q1064923,https://en.wikipedia.org/wiki/Battle_of_Zhuolu,,,
19448,19449,6850,l,Battle of Banquan,-2500,,,Shennong (tribe),,loss,...,,,,,,http://www.wikidata.org/entity/Q755758,https://en.wikipedia.org/wiki/Battle_of_Banquan,,,
19449,19450,6850,r,Battle of Banquan,-2500,,,Youxiong (tribe),,win,...,,,,,,http://www.wikidata.org/entity/Q755758,https://en.wikipedia.org/wiki/Battle_of_Banquan,,,
21481,21482,7590,r,Battle of Uruk,-2271,,,Sumerian provinces (?),,loss,...,,,,,,http://www.wikidata.org/entity/Q3309009,https://en.wikipedia.org/wiki/Battle_of_Uruk,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
17506,17507,6213,l,Battle of Baghuz Fawqani,2019,,Deir ez-Zor offensive,United Kingdom,,win,...,,,,,,http://www.wikidata.org/entity/Q61843818,https://en.wikipedia.org/wiki/Battle_of_Baghuz...,,"{'country': 'Syria', 'country_code': 'sy'}",Asia
17505,17506,6213,l,Battle of Baghuz Fawqani,2019,,Deir ez-Zor offensive,France,,win,...,,,,,,http://www.wikidata.org/entity/Q61843818,https://en.wikipedia.org/wiki/Battle_of_Baghuz...,,"{'country': 'Syria', 'country_code': 'sy'}",Asia
17504,17505,6213,l,Battle of Baghuz Fawqani,2019,,Deir ez-Zor offensive,United States,,win,...,,,,,,http://www.wikidata.org/entity/Q61843818,https://en.wikipedia.org/wiki/Battle_of_Baghuz...,,"{'country': 'Syria', 'country_code': 'sy'}",Asia
17510,17511,6213,r,Battle of Baghuz Fawqani,2019,,Deir ez-Zor offensive,Wilayat al-Sham,,loss,...,,,,,,http://www.wikidata.org/entity/Q61843818,https://en.wikipedia.org/wiki/Battle_of_Baghuz...,,"{'country': 'Syria', 'country_code': 'sy'}",Asia
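
Because `merge` is called without an `on` argument, pandas joins on every column name the two tables share. The following is a minimal sketch, using invented toy frames rather than the real files, of how `indicator=True` can show which rows found a match. The column values here are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for the two source tables (values are invented)
left = pd.DataFrame({'uid': [1, 2, 3], 'bname': ['A', 'B', 'C']})
right = pd.DataFrame({'uid': [1, 3], 'casualties': [100, 250]})

# With no `on` argument, merge joins on the shared column names (here: 'uid')
merged = left.merge(right, how='left', indicator=True)

# '_merge' is 'both' where a match was found and 'left_only' otherwise
match_counts = merged['_merge'].value_counts()
print(match_counts)
```

Running the same check on the real tables would show how many battles actually picked up values from the second file.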


In [32]:
# See what the columns contain
data.columns

Index(['uid', 'bid', 'lr', 'bname', 'year', 'year_end', 'war', 'bell',
       'mult_sides', 'win', 'uk', 'fr', 'de', 'sp', 'sw', 'tr', 'at', 'ru',
       'nl', 'it', 'pt', 'dk', 'habsburg', 'hre', 'lat', 'lng', 'locn',
       'naval', 'river', 'lake', 'air', 'multiple', 'entire', 'unknown',
       'ongoing', 'plan', 'nopage', 'wd_url', 'wp_url', 'casualties',
       'True Location', 'continent'],
      dtype='object')

In [48]:
country_of_interest = 'France' #you can change this to a country you are interested in

# Filter to the battles that list the chosen country as a combatant, and drop unused columns
country_data = data[data['bell']==country_of_interest].drop(['uk', 'fr', 'de', 'sp', 'sw', 'tr', 'at', 
                                             'ru', 'nl', 'it', 'pt', 'dk', 'habsburg', 'hre',
                                             'naval', 'river', 'lake', 'air', 'multiple', 'entire',
                                             'ongoing', 'plan', 'nopage', 'unknown'], axis=1)
display(country_data)

Unnamed: 0,uid,bid,lr,bname,year,year_end,war,bell,mult_sides,win,lat,lng,locn,wd_url,wp_url,casualties,True Location,continent
5248,5249,1763,r,Battle of Gisors,1198,,,France,,loss,49.261667,1.743611,"Courcelles-lès-Gisors, Oise, Picardy, France",http://www.wikidata.org/entity/Q2236918,https://en.wikipedia.org/wiki/Battle_of_Gisors,,,
1968,1969,659,r,Battle of Arnemuiden,1338,,Hundred Years' War,France,,win,51.645000,3.373000,Arnemuiden (Walcheren island),http://www.wikidata.org/entity/Q1523324,https://en.wikipedia.org/wiki/Battle_of_Arnemu...,,,
17477,17478,6205,l,Battle of Champtoceaux,1341,,War of the Breton Succession,France,,win,47.315972,-1.292889,"Champtoceaux, Brittany",http://www.wikidata.org/entity/Q616448,https://en.wikipedia.org/wiki/Battle_of_Champt...,,,
2232,2233,742,r,Battle of St Pol de Léon,1346,,War of the Breton Succession,France,,loss,48.684959,-3.986864,"Saint-Pol-de-Léon, Finistère, France",http://www.wikidata.org/entity/Q1576140,https://en.wikipedia.org/wiki/Battle_of_St_Pol...,,,
2224,2225,740,r,Battle of La Roche-Derrien,1347,,War of the Breton Succession,France,,loss,48.744522,-3.256612,"La Roche-Derrien, France",http://www.wikidata.org/entity/Q1575537,https://en.wikipedia.org/wiki/Battle_of_La_Roc...,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
21426,21427,7567,l,Battle of Tabqa (2017),2017,,Raqqa campaign (2016–17),France,,major win,35.836666,38.548054,,http://www.wikidata.org/entity/Q29258911,https://en.wikipedia.org/wiki/Battle_of_Tabqa,,"{'country': 'Syria', 'country_code': 'sy'}",Asia
9091,9092,3179,l,Battle of Raqqa (2017),2017,,Raqqa campaign (2016–17),France,,win,35.950000,39.016667,"Raqqa, Raqqa Governorate, Syria",http://www.wikidata.org/entity/Q30140821,https://en.wikipedia.org/wiki/Battle_of_Raqqa_...,,"{'country': 'Syria', 'country_code': 'sy'}",Asia
16613,16614,5893,l,Battle of In-Delimane,2018,,Northern Mali conflict,France,,win (partial),16.600000,1.600000,"In-Delimane, Gao Region, Mali",http://www.wikidata.org/entity/Q54866576,https://en.wikipedia.org/wiki/Battle_of_In-Del...,,"{'country': 'Mali', 'country_code': 'ml'}",Africa
15993,15994,5684,l,Battle of Al Hudaydah,2018,,Al Hudaydah governorate offensive,France,,ceasefire agreement,14.802222,42.951111,"Al Hudaydah, Yemen",http://www.wikidata.org/entity/Q48735086,https://en.wikipedia.org/wiki/Battle_of_Al_Hud...,,"{'country': 'Yemen', 'country_code': 'ye'}",Asia
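
Filtering with `==` keeps only rows whose belligerent name matches exactly. If the database records a country under several labels (an assumption; the labels below are invented), a substring match may catch more rows, at the cost of some false positives. A small sketch of the difference:

```python
import pandas as pd

# Invented belligerent labels, for illustration only
toy = pd.DataFrame({'bell': ['France', 'Kingdom of France', 'Free France', 'Prussia', None]})

country = 'France'

# Exact match: only rows whose label is precisely the country name
exact = toy[toy['bell'] == country]

# Substring match, ignoring case; na=False treats missing names as non-matches
loose = toy[toy['bell'].str.contains(country, case=False, na=False, regex=False)]

print(len(exact), len(loose))
```

Whether the looser match is appropriate depends on how the real `bell` values are written, so it is worth inspecting `data['bell'].unique()` first.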


We can now list the unique names of the battles in which the chosen country is recorded as a participant:

In [49]:
country_data = country_data[country_data['bname'].notna()]

list(country_data['bname'].sort_values().unique())

['3rd battle of Gao',
 '4th battle of Gao',
 '5th Battle of Gao',
 'Action at Cherbourg',
 'Action of 10 April 1795',
 'Action of 13 January 1797',
 'Action of 13 March 1806',
 'Action of 18 June 1799',
 'Action of 27 June 1798',
 'Action of 30 May 1798',
 'Action of 6 July 1746',
 'Action of 8 June 1755',
 'Action of March 1677',
 'Action of the Hohenzollern Redoubt',
 'Armada of 1779',
 'Attack on Mers-el-Kébir',
 'Battle at Port-la-Joye',
 'Battle at The Lizard',
 'Battle of Abbeville',
 'Battle of Abukir (1799)',
 'Battle of Abukir (1801)',
 'Battle of Adwa',
 'Battle of Ajdabiya',
 'Battle of Al Hudaydah',
 'Battle of Al jurf',
 'Battle of Alasay',
 'Battle of Albert',
 'Battle of Albuera',
 'Battle of Alcantara',
 'Battle of Alexandria',
 'Battle of Algiers',
 'Battle of Alkmaar',
 'Battle of Almonacid',
 'Battle of Altenheim',
 'Battle of Amiens',
 'Battle of Amstetten',
 'Battle of Arara',
 'Battle of Arnemuiden',
 'Battle of Arras',
 'Battle of Arroyo dos Molinos',
 'Battle of

We can plot the battles on a map using their latitude and longitude coordinates.

You can drag to move around the map, zoom in and out to get more clarity. Hovering over each data point lists the name of the battle, as well as the war in which the battle was fought.

In [50]:
fig = px.scatter_geo(country_data, lat='lat', lon='lng', 
               hover_name='bname', 
               hover_data=['year'],
               color= 'war',
               title='Battles participated in by ' + country_of_interest)

fig.update_layout(showlegend=False)
fig.show()

Let's animate through the years to see the battles that the chosen country has participated in.

In [51]:
warnings.filterwarnings("ignore")
years = list(country_data['year'].unique())

animate_country_data = pd.DataFrame(country_data)

def animation_years(row):
    global animate_country_data
   
    df = pd.DataFrame(columns=animate_country_data.columns)
    index = years.index(row['year'])
    for i in years[index+1:]:
        row['year'] = i  # set the year by label rather than by column position
        df.loc[len(df.index)] = row
    
    animate_country_data = pd.concat([animate_country_data,df],ignore_index=True)

for i in range(len(country_data.index)):
    animation_years(country_data.iloc[i])
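
The loop above repeats each battle for its own year and for every later year, so that markers stay on the map once they appear. As a sketch of an alternative, and not the notebook's own method, the same expansion can be written with a cross join. The helper name `expand_for_animation` is hypothetical, and `how='cross'` assumes pandas 1.2 or newer.

```python
import pandas as pd

def expand_for_animation(df, year_col='year'):
    """Repeat each row for its own year and every later year present in df."""
    frames = pd.DataFrame({'frame_year': sorted(df[year_col].unique())})
    # Cross join pairs every battle with every candidate frame year
    expanded = df.merge(frames, how='cross')
    # Keep only the frames at or after the year the battle took place
    expanded = expanded[expanded['frame_year'] >= expanded[year_col]]
    # Overwrite the year so it can be used directly as animation_frame
    expanded = expanded.drop(columns=year_col).rename(columns={'frame_year': year_col})
    return expanded.reset_index(drop=True)

# Toy data: three invented battles in three different years
toy = pd.DataFrame({'bname': ['X', 'Y', 'Z'], 'year': [1900, 1910, 1920]})
expanded_toy = expand_for_animation(toy)
print(expanded_toy)
```

This avoids growing a DataFrame row by row, which tends to be slow when there are hundreds of battles and hundreds of distinct years.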



In [52]:
def rank_rows(df, battles):
    # Give every row a rank so that rows are grouped by battle,
    # with battles ordered by the year in which they first appear.
    df['sort_rank'] = 0
    curr_rank = 1
    for b in battles:
        for i in df.index[df['bname'] == b]:
            # Use .loc rather than chained indexing so the value is written to df itself
            df.loc[i, 'sort_rank'] = curr_rank
            curr_rank += 1
    return df


animate_country_data.sort_values(['year'], inplace=True)
battles = list(animate_country_data.sort_values('year')['bname'].unique())
animate_country_data = rank_rows(animate_country_data, battles)

animate_country_data.sort_values('sort_rank', inplace=True)
animate_country_data

Unnamed: 0,uid,bid,lr,bname,year,year_end,war,bell,mult_sides,win,lat,lng,locn,wd_url,wp_url,casualties,True Location,continent,sort_rank
0,5249,1763,r,Battle of Gisors,1198,,,France,,loss,49.261667,1.743611,"Courcelles-lès-Gisors, Oise, Picardy, France",http://www.wikidata.org/entity/Q2236918,https://en.wikipedia.org/wiki/Battle_of_Gisors,,,,1
619,5249,1763,r,Battle of Gisors,1338,,,France,,loss,49.261667,1.743611,"Courcelles-lès-Gisors, Oise, Picardy, France",http://www.wikidata.org/entity/Q2236918,https://en.wikipedia.org/wiki/Battle_of_Gisors,,,,2
620,5249,1763,r,Battle of Gisors,1341,,,France,,loss,49.261667,1.743611,"Courcelles-lès-Gisors, Oise, Picardy, France",http://www.wikidata.org/entity/Q2236918,https://en.wikipedia.org/wiki/Battle_of_Gisors,,,,3
621,5249,1763,r,Battle of Gisors,1346,,,France,,loss,49.261667,1.743611,"Courcelles-lès-Gisors, Oise, Picardy, France",http://www.wikidata.org/entity/Q2236918,https://en.wikipedia.org/wiki/Battle_of_Gisors,,,,4
622,5249,1763,r,Battle of Gisors,1347,,,France,,loss,49.261667,1.743611,"Courcelles-lès-Gisors, Oise, Picardy, France",http://www.wikidata.org/entity/Q2236918,https://en.wikipedia.org/wiki/Battle_of_Gisors,,,,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
617,15994,5684,l,Battle of Al Hudaydah,2018,,Al Hudaydah governorate offensive,France,,ceasefire agreement,14.802222,42.951111,"Al Hudaydah, Yemen",http://www.wikidata.org/entity/Q48735086,https://en.wikipedia.org/wiki/Battle_of_Al_Hud...,,"{'country': 'Yemen', 'country_code': 'ye'}",Asia,47412
47415,15994,5684,l,Battle of Al Hudaydah,2019,,Al Hudaydah governorate offensive,France,,ceasefire agreement,14.802222,42.951111,"Al Hudaydah, Yemen",http://www.wikidata.org/entity/Q48735086,https://en.wikipedia.org/wiki/Battle_of_Al_Hud...,,"{'country': 'Yemen', 'country_code': 'ye'}",Asia,47413
616,16614,5893,l,Battle of In-Delimane,2018,,Northern Mali conflict,France,,win (partial),16.6,1.6,"In-Delimane, Gao Region, Mali",http://www.wikidata.org/entity/Q54866576,https://en.wikipedia.org/wiki/Battle_of_In-Del...,,"{'country': 'Mali', 'country_code': 'ml'}",Africa,47414
47414,16614,5893,l,Battle of In-Delimane,2019,,Northern Mali conflict,France,,win (partial),16.6,1.6,"In-Delimane, Gao Region, Mali",http://www.wikidata.org/entity/Q54866576,https://en.wikipedia.org/wiki/Battle_of_In-Del...,,"{'country': 'Mali', 'country_code': 'ml'}",Africa,47415


By clicking the "play" button at the bottom, we can step through the battles that the chosen country has fought over the years.

In [53]:
px.scatter_geo(animate_country_data, lat='lat', lon='lng', 
               height=800, hover_name='bname', 
               animation_frame= 'year',
               animation_group='war',
               title='Battles participated in by ' + country_of_interest)

Now let's add the number of casualties for each battle that the chosen country fought in. We will scale the size of each bubble to the number of casualties, so larger bubbles indicate battles with more casualties.

In [55]:
casualties_con = country_data[country_data['casualties'].notna()]

px.scatter_geo(casualties_con, lat='lat', lon='lng', 
               height=800, hover_name='bname', 
               hover_data=['war', 'year'],
               size = 'casualties',
               color='casualties',
               title='Casualties of battles participated in by ' + country_of_interest)
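
Bubble sizes need a numeric column. The sketch below assumes, without having checked the real file, that `casualties` might contain text or missing entries, and shows one way to coerce it to numbers before plotting. The toy values are invented.

```python
import pandas as pd

# Invented values: a numeric string, a missing entry, free text, and a zero
toy = pd.DataFrame({'bname': ['A', 'B', 'C', 'D'],
                    'casualties': ['1200', None, 'unknown', 0]})

# Coerce to numbers; anything that cannot be parsed becomes NaN
toy['casualties'] = pd.to_numeric(toy['casualties'], errors='coerce')

# Keep rows that can be drawn as a bubble: a known, positive casualty count
plottable = toy[toy['casualties'] > 0]
print(plottable)
```

If the real column is already numeric, the coercion step changes nothing and only the filter matters.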

We can also create a bar graph that shows the bloodiest wars that the chosen country has been a part of.

In [56]:
con_bloodiest_war = country_data.groupby('war',as_index=False)['casualties'].sum()
con_bloodiest_war = con_bloodiest_war[con_bloodiest_war['casualties'] > 0]
con_bloodiest_war.sort_values('casualties',inplace=True)

In [57]:
# Take the ten wars with the highest casualty totals
top_10_bloodiest_con_wars = con_bloodiest_war.nlargest(10, 'casualties')

fig = px.bar(top_10_bloodiest_con_wars,x='war',y='casualties',title= country_of_interest + "'s Bloodiest Wars")

fig.show()
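
Picking the top ten rows of a frame sorted in ascending order is easy to get wrong with slices. A minimal sketch on invented data compares two reverse slices with `nlargest`; note that a slice such as `[:10:-1]` stops at position 10 rather than taking ten rows.

```python
import pandas as pd

# Fourteen invented wars with casualty totals already in ascending order
totals = pd.DataFrame({'war': list('ABCDEFGHIJKLMN'),
                       'casualties': range(1, 15)})

# Walks backwards from the last row and stops before position 10
not_top_ten = totals[:10:-1]

# Walks backwards from the last row and takes exactly ten rows
top_by_slice = totals[:-11:-1]

# nlargest gives the same ten rows without relying on the sort order
top_by_nlargest = totals.nlargest(10, 'casualties')

print(len(not_top_ten), len(top_by_slice), len(top_by_nlargest))
```

`nlargest` is usually the safer choice because it still works if the frame has not been sorted beforehand.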

We can also look at which continents a selection of countries have fought in the most.

In [72]:
countries = ['Canada','Russia','India'] #you can change and add/remove countries in this list

# Keep rows with a known continent whose belligerent is one of the selected countries.
# .copy() gives an independent frame, so the column changes below do not touch `data`.
battles_continent = data[data['continent'].notna() & data['bell'].isin(countries)].copy()
battles_continent['interest'] = True
battles_continent.rename(columns={'bell':'Country'},inplace=True)
battles_continent

Unnamed: 0,uid,bid,lr,bname,year,year_end,war,Country,mult_sides,win,...,unknown,ongoing,plan,nopage,wd_url,wp_url,casualties,True Location,continent,interest
1978,1979,662,r,Battle of Poltava,1709,,Great Northern War,Russia,,win,...,,,,,http://www.wikidata.org/entity/Q152486,https://en.wikipedia.org/wiki/Battle_of_Poltava,,"{'country': 'Ukraine', 'country_code': 'ua'}",Europe,True
5875,5876,1978,l,Siege of Danzig,1734,,War of the Polish Succession,Russia,,win,...,,,,,http://www.wikidata.org/entity/Q2354105,https://en.wikipedia.org/wiki/Siege_of_Danzig_...,,"{'country': 'Poland', 'country_code': 'pl'}",Europe,True
18503,18504,6552,r,Battle of Gross-Jägersdorf,1757,,Seven Years' War,Russia,,win (tactical) loss (strategic),...,,,,,http://www.wikidata.org/entity/Q696689,https://en.wikipedia.org/wiki/Battle_of_Gross-...,,"{'country': 'Russia', 'country_code': 'ru'}",Europe,True
18033,18034,6392,r,Battle of Zorndorf,1758,,Seven Years' War,Russia,,inconclusive,...,,,,,http://www.wikidata.org/entity/Q663405,https://en.wikipedia.org/wiki/Battle_of_Zorndorf,,"{'country': 'Poland', 'country_code': 'pl'}",Europe,True
3145,3146,1055,r,Battle of Kunersdorf,1759,,Seven Years' War,Russia,,win,...,,,,,http://www.wikidata.org/entity/Q170400,https://en.wikipedia.org/wiki/Battle_of_Kuners...,,"{'country': 'Poland', 'country_code': 'pl'}",Europe,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5584,5585,1881,l,2016 Khanasir offensive,2016,,Syrian Civil War,Russia,,win,...,,,,,http://www.wikidata.org/entity/Q22947394,https://en.wikipedia.org/wiki/2016_Khanasir_of...,,"{'country': 'Syria', 'country_code': 'sy'}",Asia,True
7358,7359,2522,l,Aleppo offensive (November–December 2016),2016,,Battle of Aleppo,Russia,,win,...,,,,,http://www.wikidata.org/entity/Q27894014,https://en.wikipedia.org/wiki/Aleppo_offensive...,,"{'country': 'Syria', 'country_code': 'sy'}",Asia,True
7410,7411,2535,r,Palmyra offensive in December 2016,2016,,Syrian Civil War,Russia,,loss (partial),...,,,,,http://www.wikidata.org/entity/Q27988133,https://en.wikipedia.org/wiki/Palmyra_offensiv...,,"{'country': 'Syria', 'country_code': 'sy'}",Asia,True
5167,5168,1731,r,War in Donbass (Avdiivka),2017,,War in Donbass,Russia,,ceasefire,...,,,,,http://www.wikidata.org/entity/Q22117640,https://en.wikipedia.org/wiki/Battle_of_Avdiivka,,"{'country': 'Ukraine', 'country_code': 'ua'}",Europe,True


In [69]:
# The 'bell' column was renamed to 'Country' in the previous cell.
# size() counts every row, including battles with a missing location.
continent_grouped = battles_continent.groupby(['continent','Country']).size().reset_index(name='Number of Battles Fought')
continent_grouped.sort_values('Number of Battles Fought',ascending=False,inplace=True)
fig = px.bar(continent_grouped,x='continent',y='Number of Battles Fought',color='Country',title='Number of Battles fought by each country in different Continents')
fig.show()
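
When counting battles per group, it matters whether rows with missing values are included. A small sketch on invented data shows that `count()` on a column skips missing entries, while `size()` counts every row:

```python
import pandas as pd

toy = pd.DataFrame({
    'continent': ['Europe', 'Europe', 'Asia', 'Asia'],
    'Country': ['Russia', 'Russia', 'Russia', 'India'],
    'locn': ['Poltava', None, 'Aleppo', 'Delhi'],  # one location is missing
})

# count() only counts rows where 'locn' is present
by_count = toy.groupby(['continent', 'Country'])['locn'].count().reset_index(name='n')

# size() counts all rows in each group
by_size = toy.groupby(['continent', 'Country']).size().reset_index(name='n')

print(by_count)
print(by_size)
```

Which one is appropriate depends on whether a battle without a recorded location should still count toward the total.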

These results only include battles fought after Confederation, because the sovereign state of Canada did not exist before then. We can also look at battles that took place within the geographic area of present-day Canada by keeping only the latitude and longitude values in that region:

In [73]:
# Southernmost point of Canada is Lake Erie, ON, at 41°40' N; easternmost is Cape Spear, NL, at 52°37' W
# We also need to exclude a single WWII Pacific battle that happened off the coast of Alaska that didn't involve Canada
NA_data = data[(data['lat']>41.6) & 
               (data['lng']<-52.6) & 
               (data['lng']>-160)]

# There are also many wars in this subset that don't feature Canada, so we list them here for removal:
remove = ['American Revolutionary War',
          'Sioux Wars',
          "Red Cloud's War",
          'Dakota War of 1862',
          'Russo-Tlingit War',
          'Great Sioux War of 1876',
          'Powder River Expedition',
          'American Civil War',
          'Yellowstone Expedition of 1873',
          'Nez Perce War',
          'Comanche Campaign',
          'Boston campaign',
          'Modoc War',
          'American Revolution',
          "King Philip's War",
          'Black Hawk War',
          'Colorado War',
          'American Indian Wars',
          'Forage War',
          "Coeur d'Alene War",
          'Yakima War',
          'Philadelphia campaign',
          'Ghost Dance War']

# Remove wars listed above, as well as battles without a specific war (that all happened in the USA)
NA_data = NA_data[(~NA_data['war'].isin(remove)) & (~NA_data['war'].isnull())]
          
          
          
fig = px.scatter_geo(NA_data, lat='lat', lon='lng', 
               hover_name='bname', 
               color='war',
               hover_data=['year'], 
               fitbounds='locations',
               title='Historical battles fought in present-day Canada')

fig.update_layout(showlegend=False)
fig.show()
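
The latitude and longitude filter above is a simple bounding box. As a sketch, it can be wrapped in a small helper and tried on a few rows; the helper name `within_bounds` is hypothetical and the coordinates below are approximate values chosen for illustration.

```python
import pandas as pd

def within_bounds(df, lat_min, lon_min, lon_max, lat_col='lat', lon_col='lng'):
    """Return rows north of lat_min and strictly between lon_min and lon_max."""
    mask = (df[lat_col] > lat_min) & (df[lon_col] > lon_min) & (df[lon_col] < lon_max)
    return df[mask]

# Approximate coordinates, for illustration only
toy = pd.DataFrame({
    'bname': ['Batoche', 'Gettysburg', 'Waterloo', 'Dutch Harbor'],
    'lat': [52.75, 39.81, 50.68, 53.89],
    'lng': [-106.12, -77.23, 4.41, -166.54],
})

# Same limits as the cell above: north of 41.6, between 160 W and 52.6 W
in_box = within_bounds(toy, lat_min=41.6, lon_min=-160, lon_max=-52.6)
print(in_box['bname'].tolist())
```

A box like this still covers part of the northern United States, which is why the list of wars to remove is needed as a second step.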

In [45]:
list(NA_data['bname'].sort_values().unique())

['Action of 8 June 1755',
 'Battle at Port-la-Joye',
 'Battle at St. Croix',
 "Battle of Baker's Farm",
 'Battle of Batoche',
 'Battle of Beauharnois',
 'Battle of Beauport',
 'Battle of Beaver Dams',
 'Battle of Big Sandy Creek',
 'Battle of Bloody Creek',
 'Battle of Brownstown',
 'Battle of Buffalo',
 'Battle of Carillon',
 'Battle of Chedabucto',
 'Battle of Chippawa',
 "Battle of Cook's Mills",
 "Battle of Crysler's Farm",
 'Battle of Cut Knife',
 "Battle of Devil's Hole",
 'Battle of Duck Lake',
 'Battle of Eccles Hill',
 'Battle of Fish Creek',
 'Battle of Fort Albany',
 'Battle of Fort Beauséjour',
 'Battle of Fort Bull',
 'Battle of Fort Dearborn',
 'Battle of Fort Erie',
 'Battle of Fort Frontenac',
 'Battle of Fort George',
 'Battle of Fort Loyal',
 'Battle of Fort Niagara',
 'Battle of Fort Oswego',
 'Battle of Fort Pitt',
 "Battle of Frenchman's Butte",
 "Battle of Frenchman's Creek",
 'Battle of Frenchtown',
 'Battle of Fundy Bay',
 'Battle of Grand Pré',
 'Battle of Hamp

As before, we can also include casualties and scale the bubble sizes to the number of casualties in each battle.

In [46]:
NA_casualties = NA_data[NA_data['casualties'].notna()]

px.scatter_geo(NA_casualties, lat='lat', lon='lng', 
               height=800, hover_name='bname', 
               size = 'casualties',
               size_max = 30,
               color = 'casualties',
               hover_data=['war','year'], 
               fitbounds='locations',
               title='Historical battles fought in present-day Canada')

In [74]:
fig = px.scatter_mapbox(NA_data, lat="lat", lon="lng", hover_name="bname", hover_data=['year'],
                        color_discrete_sequence=["fuchsia"],zoom=2.5)
fig.update_layout(
    mapbox_style="white-bg",
    mapbox_layers=[
        {
            "below": 'traces',
            "sourcetype": "raster",
            "sourceattribution": "United States Geological Survey",
            "source": [
                "https://basemap.nationalmap.gov/arcgis/rest/services/USGSImageryOnly/MapServer/tile/{z}/{y}/{x}"
            ]
        }
      ])
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

## Next Steps

Although the data are incomplete for some battles, it could be interesting to bring in the number (or rate) of casualties for each battle and size the markers in proportion to it. Casualties are only a rough estimate of a battle's importance, but they are a good first step.

We could also use the plotting function to animate the conflicts throughout Canadian history.

Lastly, we could focus on WWI or WWII data and look at the advance of the Allied gains in Europe, highlighting battles that Canada was a major part of.