This repository has been archived by the owner on Sep 7, 2021. It is now read-only.
---
title: "Integrating Conflict Data"
output:
  workflowr::wflow_html:
    toc: false
editor_options:
  chunk_output_type: console
---
This page shows how to merge Geo-PKO data with conflict data and visualise the results. The examples used here are Uppsala University's ViEWS project, which forecasts conflict risk, and the Uppsala Conflict Data Programme (UCDP), one of the world's leading sources of data on armed conflict. Merging these datasets can provide insights into the links between conflict risk and peacekeeping deployments, and help policymakers make effective peacekeeping decisions where the risk of conflict is high.
## Setting up
Load packages.
```{r, warning=FALSE, message=FALSE}
library(plyr)       # attached before dplyr so dplyr's verbs are not masked
library(sp)
library(tidyr)
library(dplyr)
library(geojsonio)
library(broom)
library(rgdal)
library(tidyverse)
library(ggplot2)
library(leaflet)
library(sf)
library(spdep)
library(maptools)
library(rjson)
library(RJSONIO)
library(rmapshaper)
library(htmltools)
library(htmlwidgets)
library(knitr)      # kable()
library(kableExtra) # kable_styling(), scroll_box()
```
### Geo-PKO x ViEWS
First, we import the datasets. We're using two sets of forecast data from ViEWS. Both forecast the risk of state-based conflict, non-state conflict, and one-sided violence over the next 36 months in Africa. One is at country level, and the other is at PRIO-grid level, a geospatial unit from the Peace Research Institute Oslo that divides the world into cells of 0.5 x 0.5 decimal degrees (roughly 55km x 55km at the equator). We're using a version of Geo-PKO data that has been coded to the ViEWS format and includes numerical month IDs and PRIO-grid IDs. Lastly, we're also importing a version of the Geo-PKO data at the country level.
```{r}
geopkoviews <- read.csv("geopko_pgm.csv")
predictors <- read.csv("ensemble_pgm.csv")
countrypredict <- read.csv("ensemble_cm.csv")
countrygeopkoviews <- read.csv("geopko_cm.csv")
```
Here's what the Geo-PKO dataset looks like in the ViEWS format, showing six rows from the dataset.
```{r}
kable(geopkoviews[90545:90550,]) %>% kable_styling() %>%
scroll_box(width = "100%", height = "200px")
```
Months are coded numerically and, rather than coordinates, the data is mapped to a PRIO-grid ID. `pg_month_id` gives a unique month identifier to each PRIO-grid square. To use the published Geo-PKO dataset (as opposed to the ViEWS version), you'll need to match ViEWS' `month_id` to Geo-PKO's `month` and `year` fields. This page will be updated when the new dataset is published, including how to do this merge.
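As a stopgap, the conversion from calendar dates to `month_id` can be sketched as below. It assumes that `month_id` counts months from January 1980 (so month 1 is January 1980). That assumption is consistent with this page's mapping of months 475-486 to July 2019 - June 2020, but it should be checked against the ViEWS documentation. The helper `to_month_id` and the toy data frame are hypothetical.

```r
# Hypothetical helper: convert a calendar year and month to a ViEWS-style month_id.
# Assumption: month_id 1 corresponds to January 1980.
to_month_id <- function(year, month) {
  (year - 1980) * 12 + month
}

# Toy stand-in for the published Geo-PKO data, which has `year` and `month` fields
geopko_toy <- data.frame(
  year  = c(2019, 2019, 2020),
  month = c(7, 12, 6)
)
geopko_toy$month_id <- to_month_id(geopko_toy$year, geopko_toy$month)
geopko_toy
```

With a `month_id` column in place, the published data could be filtered and joined the same way as the ViEWS-formatted version used on this page.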
### Filtering, joining and subsetting the datasets at the PRIO-grid level
The predictor dataset begins with July 2020 and forecasts the risk of conflict over the following 36 months. Here's a preview of the data within it, showing state-based (`sb`), non-state (`ns`) and one-sided violence (`os`) forecasts.
```{r}
kable(predictors[90545:90550,]) %>% kable_styling() %>%
scroll_box(width = "100%", height = "200px")
```
The Geo-PKO dataset we're working with includes data from previous years, but we just want the past year (July 2019 - June 2020), so we'll filter the data to include only that period. Both the Geo-PKO and ViEWS datasets include `pg_id`, a unique code that corresponds to a specific grid square on the map. This is what we'll use to merge the datasets. _Please note the ViEWS-adapted version of the Geo-PKO dataset is actual only to month 474 (2018), and has not yet been updated to actual data from months following this. Therefore in this exercise we are working with extrapolated data for months 475-486. This page will be updated when the new datasets are published._
```{r}
# filtering for troop deployments over the most recent year
geopkoviews2 <- geopkoviews %>%
  filter(between(month_id, 475, 486))
# merging geopko with priogrid predictor data
# every deployment row is kept; grid cells without a forecast get NA
priogriddf <- left_join(
  geopkoviews2, predictors,
  by = "pg_id")
```
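Because the join key is `pg_id` alone, each observed month for a grid cell is paired with every forecast month for that cell. The toy example below is a sketch of that behaviour, not the real data: the data frames, their values and the `_obs`/`_fcst` suffixes are made up for illustration, and it assumes both tables carry a `month_id` column.

```r
library(dplyr)

# Toy deployments: 2 grid cells x 2 observed months
troops_toy <- data.frame(
  pg_id     = c(1, 1, 2, 2),
  month_id  = c(475, 476, 475, 476),
  no_troops = c(100, 120, 0, 30)
)
# Toy forecasts: 2 grid cells x 3 forecast months
forecast_toy <- data.frame(
  pg_id    = rep(c(1, 2), each = 3),
  month_id = rep(487:489, times = 2),
  average_allwthematic_sb = c(0.10, 0.20, 0.15, 0.01, 0.02, 0.03)
)

# Joining on pg_id alone pairs every observed month with every forecast month.
# Recent dplyr versions warn about such many-to-many matches; the warning is
# silenced here because the pairing is exactly what we want to inspect.
joined_toy <- suppressWarnings(left_join(
  troops_toy, forecast_toy,
  by = "pg_id",
  suffix = c("_obs", "_fcst")  # keeps the two month_id columns apart
))

nrow(joined_toy)   # 2 cells x 2 observed months x 3 forecast months
names(joined_toy)
```

Without a `suffix`, dplyr would name the two clashing columns `month_id.x` and `month_id.y`, which is worth keeping in mind when referring to `month_id` after the join.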
That's the dataset merged! Now, to work with it, we'll need to change the class of the `no_troops` and `pg_id` fields to 'numeric' so we can run functions like maximum, average etc.
```{r}
priogriddf$no_troops <- as.numeric(priogriddf$no_troops)
priogriddf$pg_id <- as.numeric(priogriddf$pg_id)
```
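One thing to watch with `as.numeric()`: if a column was read in as a factor (the default for text columns in `read.csv()` before R 4.0), converting it directly returns the internal level codes rather than the values. The toy vector below is hypothetical and only illustrates the difference.

```r
# Toy column stored as a factor
troops_factor <- factor(c("10", "200", "35"))

# Direct conversion returns the level codes, not the troop numbers
codes <- as.numeric(troops_factor)

# Converting via character returns the values themselves
values <- as.numeric(as.character(troops_factor))

codes
values
```

Checking `class(priogriddf$no_troops)` before converting shows which case applies.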
Next, we're going to take the maximum value of `no_troops` over the year we're working with (July 2019 - June 2020), so we can compare conflict forecasts with the maximum number of troops deployed to a location in the year prior. We're also taking the maximum value of each forecast (`average_allwthematic_sb`, `average_allwthematic_ns` and `average_allwthematic_os`) for each location, and removing duplicates so we're left with only one row per location.
```{r}
pgnewdf <- priogriddf %>%
  select(pg_id, pg_month_id, no_troops, unpol_dummy, no_tcc,
         average_allwthematic_sb, average_allwthematic_ns, average_allwthematic_os)
# per grid cell, keep the rows with the maximum troop number and maximum forecasts
pgnewdf1 <- pgnewdf %>%
  group_by(pg_id) %>%
  dplyr::filter(no_troops == max(no_troops)) %>%
  dplyr::filter(average_allwthematic_sb == max(average_allwthematic_sb)) %>%
  dplyr::filter(average_allwthematic_ns == max(average_allwthematic_ns)) %>%
  dplyr::filter(average_allwthematic_os == max(average_allwthematic_os))
# remove duplicates so each grid cell keeps a single row
pgnewdf2 <- subset(pgnewdf1, !duplicated(subset(pgnewdf1, select = c(pg_id, no_troops))))
head(pgnewdf2)
```
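The chained filters keep only rows that satisfy every condition at once. If the goal is simply one row per grid cell holding each maximum, a grouped `summarise()` is an alternative worth considering. This is a sketch on made-up data, not the method used on this page; the output column names `max_troops` and `max_sb` are hypothetical.

```r
library(dplyr)

# Toy merged data: several rows per grid cell, with a missing troop value
toy <- data.frame(
  pg_id = c(1, 1, 1, 2, 2),
  no_troops = c(100, 120, NA, 0, 30),
  average_allwthematic_sb = c(0.10, 0.20, 0.15, 0.01, 0.03)
)

# One row per grid cell; each maximum is computed independently of the others
per_cell_max <- toy %>%
  group_by(pg_id) %>%
  dplyr::summarise(
    max_troops = max(no_troops, na.rm = TRUE),
    max_sb     = max(average_allwthematic_sb, na.rm = TRUE)
  )
per_cell_max
```

`dplyr::summarise()` is called with its namespace because `plyr` also exports a `summarise()`, and whichever package is attached last wins.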
### Preparing the shapefile and merging with data of interest
The PRIO-grid covers the entire world in cells of 0.5 x 0.5 decimal degrees (roughly 55km x 55km at the equator). That means that if we want to map it, we'll be working with large files, so keep that in mind when you're reading in the shapefile:
```{r}
shapefile <- rgdal::readOGR(".../ViEWS/pgc.geojson")
```
The shapefile contains both geospatial polygon data and numerical data that corresponds to the ViEWS dataset; specifically, a PRIO-grid ID and a country ID. Here's what the non-spatial data looks like, showing six rows in the dataset.
```{r}
kable(shapefile@data[101:106,]) %>% kable_styling() %>%
scroll_box(width = "100%", height = "200px")
```
To work with the data within this shapefile, we fortify it, which converts the spatial object into a regular data frame. We also copy the row names into an `id` column to make them easier to work with. Finally, we merge the shapefile with `pgnewdf2`, which we created earlier.
```{r}
# fortify
shapefile@data$id <- rownames(shapefile@data)
shapefile.df <- fortify(shapefile, region = "id")
# merge data of interest
shapefile.df <- merge(shapefile, pgnewdf2, by.x = "pg_id", by.y = "pg_id",
                      all.x = FALSE, all.y = TRUE, duplicateGeoms = TRUE)
# checking it worked - shapefile.df has the new attributes, yay
head(shapefile.df@data)
```
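`rgdal` and `maptools` have since been retired from CRAN, so the same polygon-plus-attributes merge can also be sketched with `sf`, which is already loaded above. The example below builds three toy grid cells in code rather than reading the real GeoJSON (with real data, `sf::st_read()` would take the place of the hand-built object). The helper `make_cell`, the IDs and the troop numbers are all hypothetical.

```r
library(sf)

# Hypothetical helper: a square cell with its lower-left corner at (xmin, ymin)
make_cell <- function(xmin, ymin, size = 0.5) {
  st_polygon(list(rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)              # the ring is closed by repeating the first point
  )))
}

# Toy grid of three cells, each with a pg_id
grid_toy <- st_sf(
  pg_id = c(1, 2, 3),
  geometry = st_sfc(make_cell(0, 0), make_cell(0.5, 0), make_cell(1, 0), crs = 4326)
)

# Toy attribute table covering two of the three cells
attrs_toy <- data.frame(pg_id = c(1, 3), no_troops = c(150, 40))

# merge() on an sf object keeps the geometry; only matching cells are retained
merged_toy <- merge(grid_toy, attrs_toy, by = "pg_id")
merged_toy
```

An `sf` object like `merged_toy` can be passed straight to `leaflet()`, so no fortify step is needed on this route.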
### Mapping Geo-PKO and ViEWS data
To map the data, we're going to use the `leaflet` package (and a bunch of others to support it). The first thing we do is set up our colour palette and bins. The other thing we include is a small segment of code that fixes spacing between any NA value in the legend, and the remainder of the legend.
```{r}
bins <- c(0, 10, 20, 50, 100, 200, 500, 1000, Inf)
pal <- colorNumeric("viridis", NULL)
#to fix spacing of NA in legend
css_fix <- "div.info.legend.leaflet-control br {clear: both;}" # CSS to correct spacing
html_fix <- htmltools::tags$style(type = "text/css", css_fix) # Convert CSS to HTML
```
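Note that `bins` is defined above while the palette itself is continuous (`colorNumeric`). If you'd rather shade by discrete classes, a binned palette could be built from the same break points with `colorBin`. This is a sketch only; the name `pal_bins` and the sample values are hypothetical.

```r
library(leaflet)

bins <- c(0, 10, 20, 50, 100, 200, 500, 1000, Inf)

# A palette function that maps each value to the colour of the bin it falls in
pal_bins <- colorBin("viridis", domain = NULL, bins = bins)

# 5 and 9 share the first bin; 15 and 2500 fall in later bins
troop_colours <- pal_bins(c(5, 9, 15, 2500))
troop_colours
```

A binned palette produces a stepped legend in `addLegend()`, which can be easier to read than a continuous ramp when values are heavily skewed.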
Next, let's map. We include three colour layers to shade squares according to their conflict forecast value. These layers cover state-based conflict, non-state conflict, and one-sided violence. Troop deployments are incorporated as labels, which you can see for each square on hover.
```{r}
map <- leaflet(shapefile.df) %>%
  addTiles() %>%
  addPolygons(color = "#444444", weight = 0.25, smoothFactor = 0.5,
              opacity = 0.05, fillOpacity = 0.4,
              fillColor = ~pal(shapefile.df$average_allwthematic_sb),
              group = "State-Based Conflict",
              highlightOptions = highlightOptions(color = "white", weight = 2,
                                                  bringToFront = FALSE)) %>%
  addPolygons(color = "#444444", weight = 0.25, smoothFactor = 0.5,
              opacity = 0.05, fillOpacity = 0.4,
              fillColor = ~pal(shapefile.df$average_allwthematic_ns),
              group = "Non-State Conflict",
              highlightOptions = highlightOptions(color = "white", weight = 2,
                                                  bringToFront = FALSE)) %>%
  addPolygons(color = "#444444", weight = 0.25, smoothFactor = 0.5,
              opacity = 0.05, fillOpacity = 0.4,
              fillColor = ~pal(shapefile.df$average_allwthematic_os),
              group = "One-Sided Violence",
              highlightOptions = highlightOptions(color = "white", weight = 2,
                                                  bringToFront = FALSE)) %>%
  addPolygons(color = "#444444", weight = 0.1, smoothFactor = 0.5,
              opacity = 0.0, fillOpacity = 0.0,
              fillColor = ~pal(shapefile.df$no_troops),
              label = paste("Troops Deployed: ", shapefile.df$no_troops),
              labelOptions = labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px", color = "blue"),
                textsize = "15px", direction = "auto"),
              highlightOptions = highlightOptions(color = "white", weight = 2,
                                                  bringToFront = FALSE)) %>%
  addLegend("bottomright",
            pal = pal,
            values = shapefile.df$average_allwthematic_sb,
            title = "Conflict Forecast",
            opacity = 1) %>%
  addLayersControl(
    baseGroups = c("State-Based Conflict", "Non-State Conflict", "One-Sided Violence"),
    options = layersControlOptions(collapsed = FALSE)
  )
map <- map %>% htmlwidgets::prependContent(html_fix) # legend NA fix
# to save as HTML, you can use the following code:
# saveWidget(map, file="pkoviews - priogrid.html")
map
```
And there we have it: an interactive map to view recent peacekeeping deployments (2019-2020) and projected conflict risk over the next 36 months.
### Doing the same at country level
Now we're going to go through the same steps as above, but with the country-level datasets, so we can view the same information at country level.
```{r}
# filtering geo-pko for the most recent 12 months
countrygeopkoviews2 <- countrygeopkoviews %>%
  filter(between(month_id, 475, 486))
# selecting relevant variables from the predictor dataset
countrypredict2 <- countrypredict %>%
  select(month_id, country_id, average_basewthematic_sb, average_basewthematic_ns, average_basewthematic_os)
# merging the two datasets
countrydf <- left_join(
  countrygeopkoviews2, countrypredict2,
  by = c("country_id", "month_id"))
# converting classes to numeric
countrydf$no_troops <- as.numeric(countrydf$no_troops)
countrydf$country_id <- as.numeric(countrydf$country_id)
# selecting relevant variables from the merged dataset
cnewdf <- countrydf %>%
  select(country_id, country_month_id, no_troops, unpol_dummy, no_tcc,
         average_basewthematic_sb, average_basewthematic_ns, average_basewthematic_os)
# filtering to include only the maximum number of troops and the maximum conflict forecast per location
cnewdf1 <- cnewdf %>%
  group_by(country_id) %>%
  dplyr::filter(no_troops == max(no_troops)) %>%
  dplyr::filter(average_basewthematic_sb == max(average_basewthematic_sb)) %>%
  dplyr::filter(average_basewthematic_ns == max(average_basewthematic_ns)) %>%
  dplyr::filter(average_basewthematic_os == max(average_basewthematic_os))
# removing duplicates so we have only one value per location for number of troops and conflict forecast
cnewdf2 <- subset(cnewdf1, !duplicated(subset(cnewdf1, select = c(country_id, no_troops))))
# reading in the shapefile
cshapefile <- rgdal::readOGR("C:/Users/tanus/Documents/R/R Files/ViEWS/country.geojson")
# fortifying and merging the shapefile with our dataset
cshapefile@data$id <- rownames(cshapefile@data)
cshapefile.df <- fortify(cshapefile, region = "id")
cshapefile.df <- merge(cshapefile, cnewdf2, by.x = "country_id", by.y = "country_id",
                       all.x = FALSE, all.y = TRUE, duplicateGeoms = TRUE)
# mapping using leaflet (we already set up the bins, colour palette and legend fix in the PRIO-grid-level process)
cmap <- leaflet(cshapefile.df) %>%
  addTiles() %>%
  addPolygons(color = "#444444", weight = 0.25, smoothFactor = 0.5,
              opacity = 0.05, fillOpacity = 0.4,
              fillColor = ~pal(cshapefile.df$average_basewthematic_sb),
              group = "State-Based Conflict",
              highlightOptions = highlightOptions(color = "white", weight = 2,
                                                  bringToFront = FALSE)) %>%
  addPolygons(color = "#444444", weight = 0.25, smoothFactor = 0.5,
              opacity = 0.05, fillOpacity = 0.4,
              fillColor = ~pal(cshapefile.df$average_basewthematic_ns),
              group = "Non-State Conflict",
              highlightOptions = highlightOptions(color = "white", weight = 2,
                                                  bringToFront = FALSE)) %>%
  addPolygons(color = "#444444", weight = 0.25, smoothFactor = 0.5,
              opacity = 0.05, fillOpacity = 0.4,
              fillColor = ~pal(cshapefile.df$average_basewthematic_os),
              group = "One-Sided Violence",
              highlightOptions = highlightOptions(color = "white", weight = 2,
                                                  bringToFront = FALSE)) %>%
  addPolygons(color = "#444444", weight = 0.1, smoothFactor = 0.5,
              opacity = 0.0, fillOpacity = 0.0,
              fillColor = ~pal(cshapefile.df$no_troops),
              label = paste("Troops Deployed: ", cshapefile.df$no_troops),
              labelOptions = labelOptions(
                style = list("font-weight" = "normal", padding = "3px 8px", color = "blue"),
                textsize = "15px", direction = "auto"),
              highlightOptions = highlightOptions(color = "white", weight = 2,
                                                  bringToFront = FALSE)) %>%
  addLegend("bottomright",
            pal = pal,
            values = cshapefile.df$average_basewthematic_sb,
            title = "Conflict Forecast",
            opacity = 1) %>%
  addLayersControl(
    baseGroups = c("State-Based Conflict", "Non-State Conflict", "One-Sided Violence"),
    options = layersControlOptions(collapsed = FALSE)
  )
cmap <- cmap %>% htmlwidgets::prependContent(html_fix) # legend NA fix
# to save as HTML, use the following code:
# saveWidget(cmap, file="pkoviews - country.html")
cmap
```
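The country-level join matches on both `country_id` and `month_id`, so a deployment row only picks up a forecast when the two datasets share the same month. A quick way to check how many rows go unmatched is an `anti_join()`, sketched below on made-up data. The toy data frames and their values are hypothetical.

```r
library(dplyr)

# Toy deployments: observed country-months
deploy_toy <- data.frame(
  country_id = c(10, 10, 20),
  month_id   = c(485, 486, 486),
  no_troops  = c(500, 650, 80)
)
# Toy forecasts: only one country-month overlaps with the deployments
forecast_cm_toy <- data.frame(
  country_id = c(10, 20),
  month_id   = c(486, 487),
  average_basewthematic_sb = c(0.4, 0.1)
)

# Deployment rows with no forecast for the same country AND month
unmatched <- anti_join(deploy_toy, forecast_cm_toy,
                       by = c("country_id", "month_id"))
unmatched
```

If most rows turn up in `unmatched`, the forecast columns after the `left_join()` will be mostly `NA`, and the subsequent `max()` filters will drop those rows.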
This now gives us insights at both the PRIO-grid level and country level. This can be extended to include more useful features; for example, a time-slider can help us identify how the risk of conflict changes given peacekeeping deployments, and vice versa. We'll soon add information on merging with UCDP data, which is already used around the world to inform conflict prevention and response.