Measuring the Landscape of Civil War

This is the package, documentation, and replication repository for the paper "Measuring the Landscape of Civil War," Journal of Peace Research, February 15, 2018.

The Paper:

Measuring the Landscape of Civil War - Read the Paper

Measuring the Landscape of Civil War - Read the Online Appendix

The Authors:

Replication Code and Analysis

Self Contained Package

All of the files necessary for reproducing our analysis are included in a self-contained R package, "MeasuringLandscape." You can install the package MeasuringLandscapeCivilWar from GitHub with the instructions below:

if(!require(devtools)) install.packages("devtools")


The analysis and figures in the paper and statistical appendix are produced in a number of R Notebooks.

NOTE: Several parts of this analysis are stochastic, so specific coefficient estimates and p-values will vary with each execution. Substantive results will be consistent across runs. We encourage the reader to run the replication multiple times and observe the variation.

  • 00 Project Setup: Useful commands for installing necessary packages and setting up the project.

File Preparation:

  • 01 Prep Events Counts: Loads and cleans a novel dataset of violent events observed during the 1950s Mau Mau Rebellion.
  • 02 Prep Gazetteers: Cleans and combines a large number of gazetteers of place names for looking up locations by name and retrieving their coordinates.

Fuzzy Matcher: A supervised learning pipeline for matching two placenames to one another even when they are spelled slightly differently.
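As a rough illustration of the kind of input such a matcher could use, the sketch below computes a normalized edit-distance similarity for pairs of place names with base R's `adist`. It is not the package's pipeline: the example names, the `pairs` data frame, and the choice of edit distance as a feature are all assumptions made for illustration. A supervised model would be trained on features like this one together with hand-labeled match/non-match pairs.

```r
# Hypothetical example pairs; these names are illustrative, not project data.
pairs <- data.frame(
  a = c("Nyeri", "Fort Hall", "Nakuru"),
  b = c("Nyere", "Fort Hal", "Kisumu"),
  stringsAsFactors = FALSE
)

# Generalized Levenshtein (edit) distance for each pair, ignoring case.
edit_dist <- mapply(
  function(x, y) adist(x, y, ignore.case = TRUE)[1, 1],
  pairs$a, pairs$b
)

# Normalize by the longer string so the feature lies in [0, 1],
# where 1 means the two spellings are identical.
max_len <- pmax(nchar(pairs$a), nchar(pairs$b))
pairs$similarity <- 1 - unname(edit_dist) / max_len

print(pairs)
```

Near-identical spellings score close to 1, while unrelated names score much lower, which is the signal a classifier can learn a threshold on.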

Georeferencer: A supervised learning pipeline for assigning a real-world coordinate to a placename.

  • 05 Georeferencer: Takes in locations of events described as text and returns all possible matches across different gazetteers.
  • 06 Ensemble and Hand Rules: Ranks the returned matches from best to worst in two ways: first, using simple hand rules about which kinds of match to prefer over others; second, using a supervised model that attempts to predict which match will be geographically closest to the true location (fewest kilometers away from the right answer).
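The hand-rule ranking in step 06 can be pictured with a small sketch. The rule used here (prefer exact matches over fuzzy ones, then higher similarity) and every column name and value in `candidates` are assumptions for illustration, not the rules or data the notebooks actually use.

```r
# Hypothetical candidate matches for two events; all values are illustrative.
candidates <- data.frame(
  event_id   = c(1, 1, 1, 2, 2),
  gazetteer  = c("A", "B", "C", "A", "B"),
  match_type = c("fuzzy", "exact", "exact", "fuzzy", "fuzzy"),
  similarity = c(0.90, 1.00, 1.00, 0.70, 0.85),
  stringsAsFactors = FALSE
)

# Assumed hand rule: exact matches outrank fuzzy ones.
rule_priority <- c(exact = 1, fuzzy = 2)
candidates$priority <- unname(rule_priority[candidates$match_type])

# Sort within each event from best to worst: lower priority value first,
# then higher similarity. Remaining ties keep their original order.
ranked <- candidates[order(candidates$event_id,
                           candidates$priority,
                           -candidates$similarity), ]

# Keep the top-ranked candidate for each event.
best <- ranked[!duplicated(ranked$event_id), ]
print(best)
```

A supervised ranker would replace the fixed `priority` column with a predicted distance to the true location, but the sort-and-take-first step would look the same.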

Analysis: Main analysis of the paper.

  • 07 Recall Accuracy: Rate georeferencing options in terms of recall (how many event locations they recover) and accuracy (how far away their imputed locations tend to be from the true location).

  • 08 Predict Missingness DV: Rate georeferencing options in terms of how systematic they are at recovering locations for certain kinds of events but not others.

  • 09 Predicted Effects: Demonstrate what kinds of events tend to be systematically excluded, here in terms of whether the event would have received an original military coordinate or not.

  • 10 Bias: Demonstrate that the kinds of locations that are imputed are different from the true locations, in terms of things like population, distance from roads, ruggedness, etc.

  • 11 So What: Demonstrate that different georeferencing decisions will produce different results in a simple linear regression model in terms of both statistical significance and substantive effects.

  • 12 Kenya Events with Suggested Codings: Release the event dataset with a single georeferencing based on the ensemble method.
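The recall and accuracy measures described in step 07 can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the `events` data frame and its coordinates are made up, the great-circle (haversine) distance stands in for whatever distance the notebooks compute, and the median is only one possible summary of "how far away" imputed locations tend to be.

```r
# Great-circle distance in kilometers via the haversine formula.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical events: true coordinates plus imputed ones (NA = not recovered).
events <- data.frame(
  true_lat = c(-0.42, -0.72, -1.29, -0.30),
  true_lon = c(36.95, 37.15, 36.82, 36.07),
  imp_lat  = c(-0.42, -0.80, NA, -0.30),
  imp_lon  = c(36.95, 37.10, NA, 36.50)
)

# Recall: share of events for which the method returned any location.
recovered <- !is.na(events$imp_lat) & !is.na(events$imp_lon)
recall <- mean(recovered)

# Accuracy: distance between imputed and true locations, for recovered events.
error_km <- haversine_km(events$true_lat, events$true_lon,
                         events$imp_lat, events$imp_lon)
median_error_km <- median(error_km, na.rm = TRUE)

cat("recall:", recall, "\n")
cat("median error (km):", round(median_error_km, 1), "\n")
```

Reporting the two numbers side by side matters because a method can recover many locations while placing them far from the truth, or the reverse.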

