## Spatial Data Analysis --- Geopandas

### Data

The data I am using for module 07 come from the [data on African conflict](https://www.acleddata.com/data/) provided by [The Armed Conflict Location & Event Data Project (ACLED)](https://www.acleddata.com/about-acled/). I originally downloaded and wrangled the data in modules 03 and 04, which left me with two files: a cleaned CSV of the original file and a CSV of the number of fatalities per country per year.

In [33]:
%matplotlib inline

import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point

# change default figsize (width, height in inches)
plt.rcParams['figure.figsize'] = (15.0, 12.0)

In [35]:
# load in csv data using pandas
conflicts = pd.read_csv('./data/africa_conflict_cleaned.csv')
fatalities = pd.read_csv('./data/fatalities_per_year.csv')

# check
conflicts.head()

Unnamed: 0.1,Unnamed: 0,YEAR,COUNTRY,LOCATION,LATITUDE,LONGITUDE,EVENT_TYPE,INTERACTION,NOTES,FATALITIES
0,0,1997,Algeria,Douaouda,36.6725,2.7894,Violence against civilians,27,5 January: Beheading of 5 citizens in Douaouda...,5
1,1,1997,Algeria,Hassasna,36.1333,0.8833,Violence against civilians,27,Two citizens were beheaded in Hassasna.,2
2,2,1997,Algeria,Hassi El Abed,34.9664,-0.2903,Violence against civilians,27,Two citizens were killed in a raid on the vill...,2
3,3,1997,Algeria,Blida,36.4686,2.8289,Violence against civilians,27,4 January: 16 citizens were murdered in the vi...,16
4,4,1997,Algeria,Douaouda,36.6725,2.7894,Violence against civilians,27,5 January: Killing of 18 citizens in the Olivi...,18
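
The leftover unnamed column in the output above is typically what pandas produces when a CSV was saved together with its index and then read back without saying so. A minimal sketch of that behavior, using a hypothetical miniature CSV rather than the real file, and assuming that this is indeed how the cleaned file was written:

```python
import io
import pandas as pd

# Hypothetical miniature of a CSV that was saved with its index
# (note the empty first header field).
raw = ",YEAR,COUNTRY,FATALITIES\n0,1997,Algeria,5\n1,1997,Algeria,2\n"

# Default read: the saved index comes back as an extra unnamed column.
with_extra = pd.read_csv(io.StringIO(raw))

# index_col=0 tells pandas to use the first column as the index instead.
without_extra = pd.read_csv(io.StringIO(raw), index_col=0)

print(list(with_extra.columns))
print(list(without_extra.columns))
```

Reading with `index_col=0` (or writing with `to_csv(..., index=False)` in the first place) would make the later positional drop of the first column unnecessary.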


The CSV data are now loaded as DataFrames. `conflicts` has latitude and longitude columns, which means I can convert it to a GeoDataFrame like we did earlier in the lesson with the toxic release inventory. `fatalities` does not have geometry yet. Because it has no coordinates, a spatial join is not possible; instead, I need to join it by country name to the geometries for African countries.

In [50]:
# convert conflicts df to gdf
# first, convert lat and lon to shapely geometries using Point
# remember that lon is the x and lat is the y
geoms = [Point(xy) for xy in zip(conflicts.LONGITUDE, conflicts.LATITUDE)]
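conflicts_mod = conflicts.drop(['LATITUDE', 'LONGITUDE'], axis=1)
# also drop the first column, which is the leftover unnamed index column
conflicts_mod = conflicts_mod.drop(conflicts_mod.columns[0], axis=1)

# define crs (WGS 84); the {'init': 'epsg:4326'} dict form is deprecated
crs = 'EPSG:4326'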

# create gdf
conflicts_geo = gpd.GeoDataFrame(conflicts_mod, crs=crs, geometry=geoms)

# check
conflicts_geo.head()

Unnamed: 0,YEAR,COUNTRY,LOCATION,EVENT_TYPE,INTERACTION,NOTES,FATALITIES,geometry
0,1997,Algeria,Douaouda,Violence against civilians,27,5 January: Beheading of 5 citizens in Douaouda...,5,POINT (2.7894 36.6725)
1,1997,Algeria,Hassasna,Violence against civilians,27,Two citizens were beheaded in Hassasna.,2,POINT (0.8833 36.1333)
2,1997,Algeria,Hassi El Abed,Violence against civilians,27,Two citizens were killed in a raid on the vill...,2,POINT (-0.2903 34.9664)
3,1997,Algeria,Blida,Violence against civilians,27,4 January: 16 citizens were murdered in the vi...,16,POINT (2.8289 36.4686)
4,1997,Algeria,Douaouda,Violence against civilians,27,5 January: Killing of 18 citizens in the Olivi...,18,POINT (2.7894 36.6725)
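
The same conversion can also be written without building the `Point` list by hand. This is a sketch of an alternative, not what the cell above does: it uses `gpd.points_from_xy`, which is available in recent geopandas releases, on two rows copied from the table above.

```python
import pandas as pd
import geopandas as gpd

# Two rows copied from the conflicts table, keeping only a few columns.
conflicts = pd.DataFrame({
    "LOCATION": ["Douaouda", "Hassasna"],
    "LATITUDE": [36.6725, 36.1333],
    "LONGITUDE": [2.7894, 0.8833],
})

# points_from_xy takes x (longitude) first, then y (latitude).
conflicts_geo = gpd.GeoDataFrame(
    conflicts.drop(columns=["LATITUDE", "LONGITUDE"]),
    geometry=gpd.points_from_xy(conflicts["LONGITUDE"], conflicts["LATITUDE"]),
    crs="EPSG:4326",
)

print(conflicts_geo)
```

Either approach should give the same geometries; the vectorized call mainly saves the list comprehension and the `shapely` import.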


In [56]:
# load in countries json as gdf
countries = gpd.read_file('./data/countries.json')
countries

Unnamed: 0,id,featurecla,scalerank,LABELRANK,SOVEREIGNT,SOV_A3,ADM0_DIF,LEVEL,TYPE,ADMIN,...,NAME_KO,NAME_NL,NAME_PL,NAME_PT,NAME_RU,NAME_SV,NAME_TR,NAME_VI,NAME_ZH,geometry
0,0,Admin-0 country,1,3,Zimbabwe,ZWE,0,2,Sovereign country,Zimbabwe,...,짐바브웨,Zimbabwe,Zimbabwe,Zimbábue,Зимбабве,Zimbabwe,Zimbabve,Zimbabwe,辛巴威,POLYGON ((31.28789062500002 -22.40205078125001...
1,1,Admin-0 country,1,3,Zambia,ZMB,0,2,Sovereign country,Zambia,...,잠비아,Zambia,Zambia,Zâmbia,Замбия,Zambia,Zambiya,Zambia,赞比亚,"POLYGON ((30.39609375000001 -15.64306640625, 3..."
2,2,Admin-0 country,1,3,Yemen,YEM,0,2,Sovereign country,Yemen,...,예멘,Jemen,Jemen,Iémen,Йемен,Jemen,Yemen,Yemen,也门,"(POLYGON ((53.08564453125001 16.648388671875, ..."
3,3,Admin-0 country,3,2,Vietnam,VNM,0,2,Sovereign country,Vietnam,...,베트남,Vietnam,Wietnam,Vietname,Вьетнам,Vietnam,Vietnam,Việt Nam,越南,"(POLYGON ((104.06396484375 10.3908203125, 104...."
4,4,Admin-0 country,5,3,Venezuela,VEN,0,2,Sovereign country,Venezuela,...,베네수엘라,Venezuela,Wenezuela,Venezuela,Венесуэла,Venezuela,Venezuela,Venezuela,委內瑞拉,(POLYGON ((-60.82119140624999 9.13837890624999...
5,5,Admin-0 country,6,6,Vatican,VAT,0,2,Sovereign country,Vatican,...,바티칸 시국,Vaticaanstad,Watykan,Vaticano,Ватикан,Vatikanstaten,Vatikan,Thành Vatican,梵蒂冈,"POLYGON ((12.43916015625001 41.898388671875, 1..."
6,6,Admin-0 country,1,4,Vanuatu,VUT,0,2,Sovereign country,Vanuatu,...,바누아투,Vanuatu,Vanuatu,Vanuatu,Вануату,Vanuatu,Vanuatu,Vanuatu,萬那杜,"(POLYGON ((166.74580078125 -14.82685546875001,..."
7,7,Admin-0 country,1,3,Uzbekistan,UZB,0,2,Sovereign country,Uzbekistan,...,우즈베키스탄,Oezbekistan,Uzbekistan,Usbequistão,Узбекистан,Uzbekistan,Özbekistan,Uzbekistan,乌兹别克斯坦,"(POLYGON ((70.94677734375 42.24868164062499, 7..."
8,8,Admin-0 country,1,4,Uruguay,URY,0,2,Sovereign country,Uruguay,...,우루과이,Uruguay,Urugwaj,Uruguai,Уругвай,Uruguay,Uruguay,Uruguay,乌拉圭,"POLYGON ((-53.37060546875 -33.7421875, -53.419..."
9,9,Admin-0 country,3,6,Federated States of Micronesia,FSM,0,2,Sovereign country,Federated States of Micronesia,...,미크로네시아 연방,Micronesia,Mikronezja,Micronésia,Микронезия,Mikronesiens federerade stater,Mikronezya,Micronesia,密克罗尼西亚联邦,"(POLYGON ((162.983203125 5.325732421874989, 16..."


In [58]:
# prep a geom gdf for joining by selecting only relevant columns
countries_geom = countries[['ADMIN', 'geometry']]

# write data to file as GeoJSON (driver set explicitly instead of relying on the file extension)
countries_geom.to_file('./data/countries_geom.json', driver='GeoJSON')

# check
countries_geom.head()

Unnamed: 0,ADMIN,geometry
0,Zimbabwe,POLYGON ((31.28789062500002 -22.40205078125001...
1,Zambia,"POLYGON ((30.39609375000001 -15.64306640625, 3..."
2,Yemen,"(POLYGON ((53.08564453125001 16.648388671875, ..."
3,Vietnam,"(POLYGON ((104.06396484375 10.3908203125, 104...."
4,Venezuela,(POLYGON ((-60.82119140624999 9.13837890624999...


In [59]:
# merge dataframes together using 'ADMIN' as the key
# an inner join means geoms for non-African countries are not included
merged = countries_geom.merge(fatalities, on='ADMIN', how='inner')
merged.head()


# didn't work: on='ADMIN' requires a column named 'ADMIN' in both dataframes,
# and fatalities does not have one
# rename the key column in one of them and try again later

KeyError: 'ADMIN'
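
The `KeyError` can be addressed in two ways, sketched here on toy pandas DataFrames. The column name `COUNTRY` for the fatalities table is an assumption (it matches the conflicts table, but `fatalities` is never displayed above), and the fatality numbers are made up for illustration. The geometry column is left out to keep the sketch short.

```python
import pandas as pd

# Toy stand-in for countries_geom (geometry omitted for brevity).
countries_geom = pd.DataFrame({"ADMIN": ["Algeria", "Zambia", "Yemen"]})

# ASSUMPTION: the fatalities table keys on 'COUNTRY'; values are invented.
fatalities = pd.DataFrame({
    "COUNTRY": ["Algeria", "Algeria", "Zambia"],
    "YEAR": [1997, 1998, 1997],
    "FATALITIES": [10, 20, 3],
})

# Option 1: name a different key column on each side.
merged = countries_geom.merge(
    fatalities, left_on="ADMIN", right_on="COUNTRY", how="inner"
)

# Option 2: rename the key so both frames share it, then use on=.
merged_renamed = countries_geom.merge(
    fatalities.rename(columns={"COUNTRY": "ADMIN"}), on="ADMIN", how="inner"
)

print(merged)
print(merged_renamed)
```

Option 1 keeps both key columns in the result, while option 2 keeps only `ADMIN`. With the real data, calling `merge` on the GeoDataFrame (as the failed cell does) rather than on the plain DataFrame should keep the result a GeoDataFrame.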