# Water Scarcity and Global Conflict Analysis
This project explores the relationship between armed conflict and water scarcity by integrating datasets from three sources: FAO AQUASTAT water statistics, the University of Alabama Militarized Interstate Events (MIE) data, and the Correlates of War (COW) country codes. We use geospatial and environmental data to assess how water scarcity influences the occurrence and intensity of conflicts.

## Definitions
- Scarcity: demand for a good or service exceeds its availability (Oxford Languages).
- Supply: total freshwater resources available in cubic meters per person, per year (The ImpEE Project).
- Withdrawal: amount extracted for use by country (The ImpEE Project).
- Water Stress: ratio between total freshwater withdrawn (TFWW) and total renewable freshwater resources (TRWR). Water stress = TFWW / TRWR (Wikipedia).
- Water Scarcity: volume of fresh water available does not meet the per person per day recommendations for human health (University of Nottingham).
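
As a small worked example of the water stress ratio defined above, the sketch below computes TFWW / TRWR and expresses it as a percentage. The function name `water_stress` and the input figures are hypothetical, chosen only for illustration.

```python
def water_stress(tfww: float, trwr: float) -> float:
    """Return water stress as a percentage: 100 * TFWW / TRWR."""
    if trwr <= 0:
        raise ValueError("TRWR must be positive")
    return 100.0 * tfww / trwr


# Hypothetical country: withdraws 2.0 x 10^9 m3/year
# out of 4.0 x 10^9 m3/year of renewable freshwater
stress_pct = water_stress(2.0, 4.0)
print(stress_pct)  # 50.0
```

The AQUASTAT SDG 6.4.2 indicator also accounts for environmental flow requirements in the denominator, so this simplified ratio should not be expected to reproduce the AQUASTAT values exactly.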

In [1]:
# Dependencies
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import linregress
from functools import partial, reduce

## Data Collection and Cleanup

#### Collect csv files

In [2]:
# Collect water scarcity data from the Food and Agriculture Organization (FAO) https://data.apps.fao.org/aquastat/?lang=en
aqua_csv = pd.read_csv('Resources/AQUASTAT Dissemination System.csv')

# Collect international conflict data from the University of Alabama https://internationalconflict.ua.edu/data-download/
mie_csv = pd.read_csv('Resources/ua-mie-1.0.csv')
micnames = pd.read_csv('Resources/ua-micnames-1.0.csv')

# Collect country codes from Correlates of War (COW) https://correlatesofwar.org/data-sets/cow-country-codes-2/
COW_Country_Codes = pd.read_csv('Resources/COW-country-codes.csv')

#### Clean up the Militarized Interstate Events (MIE) csv file

In [3]:
# Copy the dataframe with only the columns we want 
conflicts_df = mie_csv[['styear', 'ccode1', 'eventnum', 'micnum', 'hostlev', 'ccode2']].copy()

# Build two lookup dictionaries: country code -> country name, and confrontation number -> confrontation name
code_to_country = pd.Series(COW_Country_Codes.StateNme.values, index=COW_Country_Codes.CCode).to_dict()
conflict_name = pd.Series(micnames.micname.values, index= micnames.micnum).to_dict()

# Map the country codes to their names from the dictionary and replace
conflicts_df['ccode1'] = conflicts_df['ccode1'].map(code_to_country)
conflicts_df['ccode2'] = conflicts_df['ccode2'].map(code_to_country)
conflicts_df['micnum'] = conflicts_df['micnum'].map(conflict_name)

# Rename columns headers
conflicts_df = conflicts_df.rename(columns={'styear': 'Year',
                                            'ccode1': 'Country',
                                            'ccode2': 'Target Country',
                                            'eventnum': 'Event Number',
                                            'micnum': 'Conflict Name',
                                            'hostlev': 'Hosility Level'
                                            })

# Display
conflicts_df.head()

|   | Year | Country | Event Number | Conflict Name | Hosility Level | Target Country |
|---|---|---|---|---|---|---|
| 0 | 1902 | United States of America | 1 | Alaska Boundary Dispute (1902) | 3 | United Kingdom |
| 1 | 1913 | Austria-Hungary | 1 | Serbian and Austro-Hungarian Fighting over Alb... | 2 | Yugoslavia |
| 2 | 1946 | Albania | 2 | British Attempts to Pass the Albanian Corfu Ch... | 4 | United Kingdom |
| 3 | 1946 | United Kingdom | 3 | British Attempts to Pass the Albanian Corfu Ch... | 3 | Albania |
| 4 | 1946 | United Kingdom | 4 | British Attempts to Pass the Albanian Corfu Ch... | 3 | Albania |
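
`Series.map` with a dictionary returns NaN for any key the dictionary does not contain, so codes missing from the lookup silently become missing values. The sketch below shows one way to list such codes. The toy frame `events` and the code `999` are made up for illustration and are not taken from the MIE data.

```python
import pandas as pd

# Toy stand-ins for the MIE events and the COW lookup (illustrative only)
events = pd.DataFrame({'ccode1': [2, 200, 999], 'ccode2': [200, 339, 2]})
code_to_country = {2: 'United States of America',
                   200: 'United Kingdom',
                   339: 'Albania'}

# Series.map yields NaN for any code that is not a key in the dictionary
mapped = events['ccode1'].map(code_to_country)

# Collect the codes that failed to map so they can be inspected
unmapped_codes = sorted(events.loc[mapped.isna(), 'ccode1'].unique().tolist())
print(unmapped_codes)  # [999]
```

Running a check like this before overwriting `ccode1` and `ccode2` would make it visible whether any events lose their country name in the mapping step.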


#### Clean up the AQUASTAT csv file

In [12]:
# Copy the dataframe with only the columns we want 
aqua_df = aqua_csv[['Year', 'Area', 'Variable', 'Value', 'Unit']].copy()

# Rename column header
aqua_df = aqua_df.rename(columns={'Area': 'Country'})

# Filter by variable type, rename the value and unit columns, and drop the variable column
pop = (aqua_df[aqua_df['Variable'].str.contains('Population density')]
       .rename(columns={'Value': 'Pop Density', 'Unit': 'Pop Unit'})
       .drop(columns='Variable'))
pop_tot = (aqua_df[aqua_df['Variable'].str.contains('Total population')]
           .rename(columns={'Value': 'Total Pop', 'Unit': 'Tot Pop Unit'})
           .drop(columns='Variable'))
stress = (aqua_df[aqua_df['Variable'].str.contains('SDG 6.4.2. Water Stress')]
          .rename(columns={'Value': 'Water Stress', 'Unit': 'Stress Unit'})
          .drop(columns='Variable'))
freshw = (aqua_df[aqua_df['Variable'].str.contains('Total freshwater withdrawal')]
          .rename(columns={'Value': 'Tot FreshW Wthdrl', 'Unit': 'Wthdrl Unit'})
          .drop(columns='Variable'))
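
The four filters above repeat one pattern: select rows by variable name, rename `Value` and `Unit`, and drop `Variable`. A minimal sketch of that pattern as a reusable function follows. The helper name `extract_variable` and the toy data are hypothetical. It also passes `regex=False` and `na=False`, which is an assumption about the desired matching (literal text, missing variable names skipped) rather than what the notebook does.

```python
import pandas as pd


def extract_variable(df, label, value_name, unit_name):
    """Keep rows whose 'Variable' contains `label`, then rename and drop columns."""
    # regex=False matches the label literally; na=False skips missing names
    mask = df['Variable'].str.contains(label, regex=False, na=False)
    return (df[mask]
            .rename(columns={'Value': value_name, 'Unit': unit_name})
            .drop(columns='Variable'))


# Toy long-format table shaped like aqua_df (values are made up)
toy = pd.DataFrame({
    'Year': [1970, 1970],
    'Country': ['Israel', 'Israel'],
    'Variable': ['Population density', 'SDG 6.4.2. Water Stress'],
    'Value': [131.7, 133.2],
    'Unit': ['inhab/km2', '%'],
})

toy_stress = extract_variable(toy, 'SDG 6.4.2. Water Stress',
                              'Water Stress', 'Stress Unit')
print(toy_stress.columns.tolist())
```

With a helper like this, each of the four variables becomes a one-line call, and a change to the matching rule only has to be made in one place.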

#### Merge Water Conflict Analysis Datasets

In [13]:
# Merge the dataframes
merge1_df = pd.merge(pop, pop_tot, on=['Country', 'Year'])
merge2_df = pd.merge(stress, freshw, on=['Country', 'Year'])
aqua_df = pd.merge(merge1_df, merge2_df, on=['Country', 'Year'])
water_conflicts_df = pd.merge(conflicts_df, aqua_df, on=['Country', 'Year'])

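# Label events whose confrontation number has no entry in micnames (NaN after mapping) as 'No Conflict'.
# Note: pd.merge defaults to an inner join, so every row here is a recorded event, not a conflict-free country-year.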
water_conflicts_df['Conflict Name'] = water_conflicts_df['Conflict Name'].fillna('No Conflict')

# Display
water_conflicts_df.head()

|   | Year | Country | Event Number | Conflict Name | Hosility Level | Target Country | Pop Density | Pop Unit | Total Pop | Tot Pop Unit | Water Stress | Stress Unit | Tot FreshW Wthdrl | Wthdrl Unit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1976 | Argentina | 1 | No Conflict | 4 | United Kingdom | 9.455566 | inhab/km2 | 26290.257 | 1000 inhab | 7.657308 | % | 27.6 | 10^9 m3/year |
| 1 | 1970 | Israel | 225 | The Six Day War of 1967 | 4 | Jordan | 131.731174 | inhab/km2 | 2907.307 | 1000 inhab | 133.157191 | % | 1.543425 | 10^9 m3/year |
| 2 | 1970 | Israel | 226 | The Six Day War of 1967 | 4 | Jordan | 131.731174 | inhab/km2 | 2907.307 | 1000 inhab | 133.157191 | % | 1.543425 | 10^9 m3/year |
| 3 | 1970 | Israel | 228 | The Six Day War of 1967 | 4 | Saudi Arabia | 131.731174 | inhab/km2 | 2907.307 | 1000 inhab | 133.157191 | % | 1.543425 | 10^9 m3/year |
| 4 | 1970 | Israel | 229 | The Six Day War of 1967 | 3 | Jordan | 131.731174 | inhab/km2 | 2907.307 | 1000 inhab | 133.157191 | % | 1.543425 | 10^9 m3/year |
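
Because `pd.merge` performs an inner join by default, country-years with water data but no recorded event do not appear in `water_conflicts_df`. If conflict-free country-years are wanted for comparison, one option is a left merge from the water side with `indicator=True`. The sketch below uses made-up toy frames (`aqua`, `conflicts`) to show the difference. It is an alternative under that assumption, not the merge used in this notebook.

```python
import pandas as pd

# Toy stand-ins (made-up values): water data for three country-years, one event
aqua = pd.DataFrame({'Country': ['A', 'A', 'B'],
                     'Year': [1970, 1971, 1970],
                     'Water Stress': [10.0, 11.0, 50.0]})
conflicts = pd.DataFrame({'Country': ['A'],
                          'Year': [1970],
                          'Conflict Name': ['Example Dispute']})

# Inner merge (the pd.merge default) keeps only country-years with an event
inner = pd.merge(conflicts, aqua, on=['Country', 'Year'])

# Left merge from the water side keeps every country-year; indicator=True
# adds a '_merge' column recording whether a matching event was found
left = pd.merge(aqua, conflicts, on=['Country', 'Year'],
                how='left', indicator=True)

# Keep the name where an event matched, otherwise label the row 'No Conflict'
left['Conflict Name'] = left['Conflict Name'].where(left['_merge'] == 'both',
                                                    'No Conflict')

print(len(inner), len(left))  # 1 3
print(left['Conflict Name'].tolist())
```

In this variant, `'No Conflict'` marks country-years without any event, which is a different meaning from the fill applied in the cell above.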


In [14]:
# Write the new merged dataframe to a csv file
water_conflicts_df.to_csv('Resources/water_conflicts_df.csv')
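
By default, `DataFrame.to_csv` writes the row index as an unnamed first column, which `pd.read_csv` then loads as `Unnamed: 0`. The sketch below illustrates this with a toy frame and an in-memory buffer. Whether the index should be kept in `water_conflicts_df.csv` depends on how later steps read the file, so treat `index=False` as an option rather than a required change.

```python
import io

import pandas as pd

df = pd.DataFrame({'Year': [1970], 'Country': ['Israel']})

# Default to_csv writes the row index as an unnamed first column
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
cols_default = pd.read_csv(with_index).columns.tolist()

# index=False omits the index column from the file
no_index = io.StringIO()
df.to_csv(no_index, index=False)
no_index.seek(0)
cols_clean = pd.read_csv(no_index).columns.tolist()

print(cols_default)  # ['Unnamed: 0', 'Year', 'Country']
print(cols_clean)    # ['Year', 'Country']
```

Reading the file back with `pd.read_csv(..., index_col=0)` is another way to avoid the extra column when the index has been written.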