Mapping Education Deserts: A Toolset for Examining Physical Access to Schooling and Other Public Goods with Open-Source Geospatial Analysis
This is the companion code repository for the paper The Last Mile in School Access: Mapping Education Deserts in Developing Countries, by Daniel Rodriguez-Segura and Brian Heseung Kim. The article is available open access in Development Engineering: https://doi.org/10.1016/j.deveng.2021.100064
From the paper abstract:
With recent advances in high-resolution satellite imagery and machine vision algorithms, fine-grain geospatial data on population are now widely available: kilometer-by-kilometer, worldwide. In this paper, we showcase how researchers and policymakers in developing countries can leverage these novel data to precisely identify “education deserts” – localized areas where families lack physical access to education – at unprecedented scale, detail, and cost-effectiveness. We demonstrate how these analyses could valuably inform educational access initiatives like school construction and transportation investments, and outline a variety of analytic extensions to gain deeper insight into the state of school access across a given country. We conduct a proof-of-concept analysis in the context of Guatemala, which has historically struggled with educational access, as a demonstration of the utility, viability, and flexibility of our proposed approach. We find that the vast majority of Guatemalan population lives within 3 km of a public primary school, indicating a generally low incidence of distance as a barrier to education in that context. However, we still identify concentrated pockets of population for whom the distance to school remains prohibitive, revealing important geographic variation within the strong country-wide average. Finally, we show how even a small number of optimally-placed schools in these areas, using a simple algorithm we develop, could substantially reduce the incidence of education deserts in this context. We make our entire codebase available to the public – fully free, open-source, heavily documented, and designed for broad use – allowing analysts across contexts to easily replicate our proposed analyses for other countries, educational levels, and public goods more generally.
We recommend reading the paper closely before applying these analytic tools for your own purposes, as there are several important methodological considerations to be aware of. That said, we are excited to provide this codebase freely to all in the hope that other analysts and policymakers can apply these tools to support equity and accessibility across varying school levels, country contexts, and public goods (e.g., wells, libraries, etc.).
Using this Repository
To use this toolset, begin by downloading the files into a folder of your choice, alongside your chosen population and school location datasets and a simple shapefile of your country (more info on obtaining these data can be found in the paper). Then open and edit <Code/00_main.R>. This main code file includes all the instructions, documentation, and file dependency info you need to get started. All file specifications, paths, and analytic decisions are set there explicitly in the first half of the file, with instructions for completing each specification. Note that all parameters and code are set by default to replicate our analysis as presented for public primary schooling in Guatemala; we have added explicit comments within each script indicating where code should be altered for your own purposes.
Because each context and data circumstance is different, you may find that this codebase does not exactly suit your needs (e.g., you run into data type errors, or it doesn't quite provide the output you wanted). Each individual code file (e.g., <Code/01_import.R>) also includes heavily documented code for each step and can be easily modified to accommodate differing file formats, variable names, and so on. You may find the sf cheat sheet useful for any geospatial data transformations you require: https://github.com/rstudio/cheatsheets/blob/master/sf.pdf
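For readers new to sf, the kind of transformation involved can be illustrated with a minimal sketch. This is not the repository's code: the data frames, column names, and coordinates below are hypothetical, EPSG:32615 (UTM zone 15N) is an assumed projection chosen only so that distances come out in meters, and the 3 km cutoff simply echoes the figure mentioned in the abstract. The sketch converts point tables to sf objects, reprojects them, and measures each population point's distance to its nearest school.

```r
library(sf)

# Hypothetical school locations (longitude/latitude in WGS84); illustrative values only.
schools <- data.frame(
  school_id = c("A", "B"),
  lon = c(-90.51, -91.52),
  lat = c(14.63, 14.84)
)

# Hypothetical population grid-cell centroids with population counts.
pop <- data.frame(
  cell_id = 1:3,
  lon = c(-90.50, -91.00, -91.50),
  lat = c(14.58, 14.70, 14.85),
  population = c(120, 45, 80)
)

# Convert plain data frames to sf point objects in WGS84 (EPSG:4326).
schools_sf <- st_as_sf(schools, coords = c("lon", "lat"), crs = 4326)
pop_sf <- st_as_sf(pop, coords = c("lon", "lat"), crs = 4326)

# Reproject to a metric CRS so distances are in meters.
# EPSG:32615 (UTM zone 15N) is an assumption for this sketch; pick a CRS suited to your country.
schools_utm <- st_transform(schools_sf, 32615)
pop_utm <- st_transform(pop_sf, 32615)

# Index of the nearest school for each population point.
nearest <- st_nearest_feature(pop_utm, schools_utm)

# Distance (km) from each population point to its nearest school.
pop_utm$dist_km <- as.numeric(
  st_distance(pop_utm, schools_utm[nearest, ], by_element = TRUE)
) / 1000

# Flag points farther than an assumed 3 km threshold.
pop_utm$is_desert <- pop_utm$dist_km > 3

# Share of the (hypothetical) population living beyond the threshold.
desert_share <- sum(pop_utm$population[pop_utm$is_desert]) / sum(pop_utm$population)
```

With real data, the two data frames would instead come from your own population and school location files, and the threshold and projection should follow the choices documented in the main code file and the paper.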
Keep in mind that this is a multi-stage analysis that still requires some familiarity with R to use properly. Even so, please don't hesitate to reach out to us with questions, concerns, or other inquiries! You can also open an issue or send a pull request to this repository. We'd be happy to help however we can!