# Wars by Region & Type

Jenna Jordan

Group members: Jenna Jordan, Dennis Piehl, Gianni Pezzarossi, Xue Lu, and Ryan Wang.

Group name: Allied Against An Anonymous Axis (aka 5A)

GitHub repo: https://github.com/jenna-jordan/IS590DV-FinalProject

## Introduction

This notebook explores how the Correlates of War (CoW) and UCDP/PRIO Armed Conflict datasets compare in terms of the number of wars of each type and the number of wars occurring in each world region. These are two different datasets, so there are some noticeable differences in the categories. For example, CoW tracks "Oceania" as a region and "Non-State Wars" as a war type, while UCDP/PRIO has neither category. There is also a significant difference in the date range: CoW records wars from 1816 to 2007, while UCDP/PRIO records wars from 1946 to 2018.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Wrangle CoW Data

I need to transform the CoW data (already normalized) to get the number of wars per region/type per year.

For the notebook containing my work in normalizing/tidying the CoW war data, please see: https://github.com/jenna-jordan/international-relations-database-extended/blob/master/Wrangle_Data/CoW_Normalize.ipynb

In [2]:
cow_con = pd.read_csv("./Data/CorrelatesOfWar/wars.csv")
cow_par = pd.read_csv("./Data/CorrelatesOfWar/war_participants.csv")
cow_reg = pd.read_csv("./Data/CorrelatesOfWar/war_locations.csv")

FileNotFoundError: [Errno 2] File b'./Data/CorrelatesOfWar/wars.csv' does not exist: b'./Data/CorrelatesOfWar/wars.csv'

In [None]:
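cow_par['StartDate'] = pd.to_datetime(cow_par['StartDate'])
cow_par['EndDate'] = pd.to_datetime(cow_par['EndDate'])
# Missing end dates are filled with the last day of CoW coverage (2007),
# using a Timestamp so the column stays datetime-typed
cow_par['EndDate'] = cow_par['EndDate'].fillna(pd.Timestamp('2007-12-31'))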

In [None]:
cow_merged = cow_par.merge(cow_reg, on='WarID').merge(cow_con, on='WarID').reset_index(drop=True)
cow_merged

Code citation: I originally found this handy way of transforming rows with date ranges into a time series at https://stackoverflow.com/questions/42151886/expanding-pandas-data-frame-with-date-range-in-columns

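Unfortunately, one side effect of this method is that a war is only recorded for years in which it is ongoing on January 1. Because `freq='YS'` generates year-start dates, a war that starts and ends within the same calendar year produces no rows at all, and a war's first year is dropped unless the war begins on January 1. This may result in a slightly lower number of wars per category per year.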

In [None]:
cow_war_ts = pd.concat([pd.DataFrame({'year': pd.date_range(row.StartDate, row.EndDate, freq='YS'),
                                        'cow_id': row.PolityID,
                                        'WarID': row.WarID,
                                        'WarRegion': row.Region,
                                        'WarType': row.WarTypeName}, 
                                columns=['year', 'cow_id', 'WarID', 'WarRegion', 'WarType']) 
                           for i, row in cow_merged.iterrows()], ignore_index=True)
cow_war_ts
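To make the side effect described above concrete, here is a minimal sketch using two hypothetical wars (the dates are invented for illustration and are not taken from the CoW data). It also shows one possible alternative, a hypothetical helper named `years_touched`, that counts every calendar year a war overlaps. This is not the method used in this notebook, and switching to it would change the counts.

```python
import pandas as pd

# Hypothetical date ranges, invented for illustration only
within_one_year = pd.date_range("1900-03-01", "1900-11-30", freq="YS")
across_years = pd.date_range("1900-03-01", "1902-06-30", freq="YS")

# freq="YS" only yields January 1 dates that fall inside the range
years_within = list(within_one_year.year)   # no January 1 in range
years_across = list(across_years.year)      # starting year 1900 is skipped

def years_touched(start, end):
    """Hypothetical alternative: every calendar year the range overlaps."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    return list(range(start.year, end.year + 1))

print(years_within)
print(years_across)
print(years_touched("1900-03-01", "1900-11-30"))
print(years_touched("1900-03-01", "1902-06-30"))
```

With the `YS` approach the short war disappears entirely and the longer war loses its first year, which is the undercount noted above.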

This is the table I will use to plot the number of wars (CoW) by region (per year).

In [None]:
cow_war_ts_gbregion = cow_war_ts.groupby(['year', 'WarRegion']).agg({'WarID': 'nunique'}).reset_index()
cow_regions_toplot = cow_war_ts_gbregion.pivot(index='year', columns='WarRegion')['WarID'].reset_index().fillna(0)
cow_regions_toplot['year'] = cow_regions_toplot['year'].dt.year
cow_regions_toplot = cow_regions_toplot[cow_regions_toplot['year'] < 2008]
cow_regions_toplot
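As a small illustration of the groupby-then-pivot pattern used for these tables, the sketch below runs the same steps on a toy frame. The rows are invented for illustration; only the column names follow the table above. Counting with `nunique` matters because each war appears once per participant, so a plain row count would overstate the number of wars.

```python
import pandas as pd

# Toy data, invented for illustration: war 1 has two participant rows in 1900
toy = pd.DataFrame({
    "year": [1900, 1900, 1900, 1901],
    "WarID": [1, 1, 2, 2],
    "WarRegion": ["Asia", "Asia", "Europe", "Europe"],
})

# Count distinct wars per (year, region), so duplicate participant rows collapse
counts = toy.groupby(["year", "WarRegion"]).agg({"WarID": "nunique"}).reset_index()

# Reshape to one column per region; years with no war in a region become 0
wide = counts.pivot(index="year", columns="WarRegion")["WarID"].reset_index().fillna(0)
print(wide)
```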

This is the table I will use to plot the number of wars (CoW) by type (per year).

In [None]:
cow_war_ts_gbtype = cow_war_ts.groupby(['year', 'WarType']).agg({'WarID': 'nunique'}).reset_index()
cow_type_toplot = cow_war_ts_gbtype.pivot(index='year', columns='WarType')['WarID'].reset_index().fillna(0)
cow_type_toplot['year'] = cow_type_toplot['year'].dt.year
cow_type_toplot = cow_type_toplot[cow_type_toplot['year'] < 2008]
cow_type_toplot

## Wrangle UCDP/PRIO data

I need to transform the UCDP/PRIO data (already normalized) to get the number of wars per region/type per year.

For the notebook containing my work in normalizing/tidying the UCDP/PRIO war data, please see: https://github.com/jenna-jordan/international-relations-database-extended/blob/master/Wrangle_Data/UCDP-PRIO_Normalize.ipynb

In [None]:
ucdp_par = pd.read_csv("./Data/UCDP-PRIO_ArmedConflict/participants_gw.csv")
ucdp_obs = pd.read_csv("./Data/UCDP-PRIO_ArmedConflict/observations.csv")
ucdp_con = pd.read_csv("./Data/UCDP-PRIO_ArmedConflict/conflicts.csv")
ucdp_eps = pd.read_csv("./Data/UCDP-PRIO_ArmedConflict/episodes.csv")
ucdp_reg = pd.read_csv("./Data/UCDP-PRIO_ArmedConflict/regions.csv")

UCDP/PRIO tracks conflicts at a lower threshold than CoW, starting at 25 deaths per year instead of 1,000 deaths per year. To make sure I am comparing apples to apples, I need to filter out all observations (conflict-years) with an intensity level of 'Minor', keeping only those coded as 'War'. Since this dataset is already organized by year, I don't need the time-series transformation used for the CoW data.

In [None]:
ucdp_merged = ucdp_obs.merge(ucdp_reg, on=['conflict_id']).merge(ucdp_con[['conflict_id', 'type_of_conflict']], on='conflict_id')
ucdp_merged = ucdp_merged[ucdp_merged['intensity_level'] == 'War']
ucdp_merged
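The effect of the intensity filter can be sketched on a toy set of conflict-year observations. The rows below are invented for illustration; the column names and the 'Minor'/'War' labels follow the ones used in this notebook.

```python
import pandas as pd

# Toy conflict-year observations, invented for illustration
toy_obs = pd.DataFrame({
    "conflict_id": [10, 10, 11, 12],
    "year": [1990, 1991, 1990, 1990],
    "intensity_level": ["Minor", "War", "War", "Minor"],
})

# Keep only conflict-years that reach the 'War' intensity level
wars_only = toy_obs[toy_obs["intensity_level"] == "War"]

# Count distinct conflicts at war in each year
wars_per_year = wars_only.groupby("year")["conflict_id"].nunique()
print(wars_per_year)
```

Note that the filter works per conflict-year, not per conflict: conflict 10 is dropped for 1990 but kept for 1991.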

This is the table I will use to plot the number of wars (UCDP/PRIO) by region (per year).

In [None]:
ucdp_merged_gbregion = ucdp_merged.groupby(['incompatibility_region', 'year']).agg({'conflict_id':'nunique'}).reset_index()
# A single reset_index() is enough; a second one would add a stray 'index' column to the table
ucdp_regions_toplot = ucdp_merged_gbregion.pivot(index='year', columns='incompatibility_region')['conflict_id'].reset_index().fillna(0)
ucdp_regions_toplot

This is the table I will use to plot the number of wars (UCDP/PRIO) by type (per year).

In [None]:
ucdp_merged_gbtype = ucdp_merged.groupby(['type_of_conflict', 'year']).agg({'conflict_id':'nunique'}).reset_index()
ucdp_type_toplot = ucdp_merged_gbtype.pivot(index='year', columns='type_of_conflict')['conflict_id'].reset_index().fillna(0)
ucdp_type_toplot

## Export data

In [None]:
cow_regions_toplot.to_csv("../Data/Visualization_Ready_Datasets/cow_regions_areaplot_data.csv", index=False)
cow_type_toplot.to_csv("../Data/Visualization_Ready_Datasets/cow_type_areaplot_data.csv", index=False)
ucdp_regions_toplot.to_csv("../Data/Visualization_Ready_Datasets/ucdp_regions_areaplot_data.csv", index=False)
ucdp_type_toplot.to_csv("../Data/Visualization_Ready_Datasets/ucdp_type_areaplot_data.csv", index=False)

## Plots: by Region and by Type, according to CoW and UCDP/PRIO

In [None]:
plt.rcParams["figure.figsize"] = (20,10)
plt.stackplot(cow_regions_toplot['year'], cow_regions_toplot['Africa'], cow_regions_toplot['Asia'], cow_regions_toplot['Europe'], 
              cow_regions_toplot['Middle East'], cow_regions_toplot['W. Hemisphere'], cow_regions_toplot['Oceania'], 
             labels=['Africa', 'Asia', 'Europe', 'Middle East', 'W. Hemisphere', 'Oceania'])
plt.legend(loc='upper left')
plt.xlabel('year')
plt.ylabel('number of wars')
plt.title('Number of Wars Over Time Per Region: \n According to the Correlates of War (1816-2007)')
plt.show()

In [None]:
plt.rcParams["figure.figsize"] = (20,10)
plt.stackplot(ucdp_regions_toplot['year'], ucdp_regions_toplot['Africa'], ucdp_regions_toplot['Asia'], ucdp_regions_toplot['Europe'], 
              ucdp_regions_toplot['Middle East'], ucdp_regions_toplot['Americas'], 
             labels=['Africa', 'Asia', 'Europe', 'Middle East', 'Americas'])
plt.legend(loc='upper left')
plt.xlabel('year')
plt.ylabel('number of wars')
plt.title('Number of Wars Over Time Per Region: \n According to UCDP/PRIO (1946-2018)')
plt.show()

In [None]:
cow_type_toplot.columns

In [None]:
plt.rcParams["figure.figsize"] = (20,10)
plt.stackplot(cow_type_toplot['year'], cow_type_toplot['Inter-State War '], cow_type_toplot['Intra-State War '], 
              cow_type_toplot['Extra-State War '], cow_type_toplot['Non-State War '], 
             labels=['Inter-State War', 'Intra-State War', 'Extra-State War', 'Non-State War'])
plt.legend(loc='upper left')
plt.xlabel('year')
plt.ylabel('number of wars')
plt.title('Number of Wars Over Time Per Type: \n According to the Correlates of War (1816-2007)')
plt.show()

In [None]:
plt.rcParams["figure.figsize"] = (20,10)
plt.stackplot(ucdp_type_toplot['year'], ucdp_type_toplot['Interstate'], ucdp_type_toplot['Internal'], 
              ucdp_type_toplot['Extrasystemic'], 
             labels=['Interstate', 'Internal', 'Extrasystemic'])
plt.legend(loc='upper left')
plt.xlabel('year')
plt.ylabel('number of wars')
plt.title('Number of Wars Over Time Per Type: \n According to UCDP/PRIO (1946-2018)')
plt.show()