Permalink
Cannot retrieve contributors at this time
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
338 lines (247 sloc)
9.85 KB
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--- | |
title: "Reading and writing GeoJSON files in R" | |
author: "Robin Lovelace" | |
date: "11/26/2014" | |
output: html_document | |
--- | |
## Introduction | |
[GeoJSON](http://geojson.org/) is an open data format that | |
extends [JSON](http://www.json.org/), the JavaScript Object Notation | |
to encode spatial information. | |
GeoJSON has the following advantages: | |
- All the geographical information is self contained (unlike shapefiles, which have multiple files for one spatial object) | |
- It is an open standard which is becoming increasingly popular | |
- GeoJSON files display as maps on GitHub by default! | |
The disadvantages are: | |
- GeoJSON files are [quite large](http://zevross.com/blog/2014/04/22/spatial-data-on-a-diet-tips-for-file-size-reduction-using-topojson/) compared with non text formats such as shapefiles and | |
spatial databases. | |
- GeoJSON files can be slow to read/write and are not supported by | |
all GIS software (yet!). | |
## Reading GeoJSONs with readOGR | |
As illustrated by this [post](http://www.jose-gonzalez.org/create-geojson-r/#.VHVoMY_6IrE), it is easy to read and write GeoJSON files | |
in R, using the [**rgdal**](http://cran.r-project.org/web/packages/rgdal/index.html) package, which calls | |
[gdal](http://www.gdal.org/) commands. Here's an example using | |
publicly available GeoJSON file: | |
```{r, echo=FALSE} | |
system("rm /tmp/gas*") | |
``` | |
```{r} | |
library(rgdal) # the library we'll be using | |
u <- "https://raw.githubusercontent.com/codeforboston/open_data_cambridge/master/Infrastructure/UtilityFacilities.geojson" | |
downloader::download(url = u, destfile = "/tmp/gas.GeoJSON") | |
gas <- readOGR(dsn = "/tmp/gas.GeoJSON", layer = "OGRGeoJSON") | |
summary(gas) | |
``` | |
Note that to test whether the GeoJSON driver works on your system, | |
you can use the following [command](http://stackoverflow.com/questions/24183007/is-it-possible-to-read-geojson-or-topojson-file-in-r-to-draw-a-choropleth-map): | |
```{r} | |
"GeoJSON" %in% ogrDrivers()$name | |
``` | |
As shown in the above example, | |
GeoJSON files can have coordinate reference system (CRS) | |
allocated to them through the `"crs"` 'member': | |
```{r, engine='node', eval=FALSE} | |
"crs": { | |
"type": "name", | |
"properties": { | |
"name": "urn:ogc:def:crs:OGC:1.3:CRS84" | |
} | |
} | |
``` | |
Note that we can see this definition in the `gas.GeoJSON` file: | |
```{r} | |
readLines("/tmp/gas.GeoJSON")[3] | |
``` | |
This 'member' is defined in plain text in the GeoJSON file | |
before the spatial `"features"` are defined. Please see this | |
small example of [gas stations](http://giscience.tw.rpi.edu/rpi_group_projects/watershedders/data/gas_station.geojson) to see how the CRS | |
is defined in the context of a complete GeoJSON file. | |
## Writing GeoJSONs with writeOGR | |
To write GeoJSON files, the sensibly named corrollary of | |
`readOGR` is used, `writeOGR`: | |
```{r} | |
if(file.exists("/tmp/gas2.geojson")){ | |
file.remove("/tmp/gas2.geojson") | |
} | |
``` | |
```{r} | |
geojsonio::geojson_write(gas, file = "/tmp/gas2.geojson") | |
``` | |
## Multifeature geojsons | |
Based on [this](https://stackoverflow.com/questions/30583048/convert-features-of-a-multifeature-geojson-into-r-spatial-objects/30593077#30593077), geojson files can usually be loaded into R with `readOGR`, as illustrated above. | |
However, this fails for multifeature geojsons. | |
Reproducible example: | |
```{r} | |
downloader::download("https://github.com/Robinlovelace/Creating-maps-in-R/raw/master/data/test-multifeature.geojson", "test.geojson") | |
try(test <- rgdal::readOGR("test.geojson", "OGRGeoJSON")) # fails with: | |
``` | |
There are 2 solutions. One uses rgdal and the gdal argument | |
`require_geomType`. The other uses geojsonio. | |
### Multifeature geojsons with rgdal | |
The first solution uses new features introduced in gdal 0.9-3. | |
It is worth checking (and maybe updating) your gdal version. | |
(Thanks to [@hrbrmstr](https://stackoverflow.com/users/1457051/hrbrmstr) for this detailed information.) | |
```{r} | |
rgdal::getGDALVersionInfo() # check the version of gdal | |
system("gdal-config --version") # asking gdal its version from bash | |
``` | |
``` | |
Before update, this was: | |
[1] "GDAL 1.11.1, released 2014/09/24" | |
After updating gdal, e.g. like this (Ubuntu), it changed: | |
$ sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable && sudo apt-get update | |
$ sudo apt-get install gdal-bin | |
``` | |
gdal can provide us with information about the layers in the | |
multipart geojson. | |
```{r} | |
ogrListLayers("test.geojson") | |
``` | |
It can also provide more general information about the object, | |
if we ask it correctly. | |
```{r} | |
try(ogrInfo("test.geojson")) # no information on the file | |
system("ogrinfo test.geojson ") | |
try(ogrInfo("test.geojson", "OGRGeoJSON")) | |
``` | |
The output from the previous lines indicates how we can | |
access just one layer. We use the argument `require_geomType` | |
to specify which of the geometries to load: | |
```{r} | |
testp_info <- | |
ogrInfo("test.geojson", "OGRGeoJSON", require_geomType="wkbPoint") | |
testp_df <- as.data.frame(testp_info$iteminfo) | |
head(testp_df) | |
``` | |
Note we can use the same command for any of the geometries: | |
```{r, eval=FALSE} | |
ogrInfo("test.geojson", "OGRGeoJSON", require_geomType = "wkbPolygon") | |
``` | |
``` | |
NOTE: keeping only 23 wkbPolygon of 781 features | |
note that extent applies to all features | |
Source: "test.geojson", layer: "OGRGeoJSON" | |
Driver: GeoJSON number of rows 781 | |
selected geometry type: wkbPolygon with 23 rows | |
Feature type: wkbPolygon with 2 dimensions | |
Extent: (12.48326 41.88355) - (12.5033 41.89629) | |
CRS: +proj=longlat +datum=WGS84 +no_defs | |
Number of fields: 78 | |
name type length typeName | |
1 area 4 0 String | |
2 bicycle 4 0 String | |
3 highway 4 0 String | |
``` | |
Armed with this information, we can read in the features: | |
```{r} | |
points <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbPoint") | |
lines <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbLineString") | |
polygons <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbPolygon") | |
``` | |
Let's plot the data (see https://stackoverflow.com/questions/30583048/ ) | |
```{r} | |
plot(lines, col="#7f7f7f") | |
plot(polygons, add=TRUE) | |
plot(points, add=TRUE, col="red") | |
``` | |
## Solution from geojsonio | |
This is an alternative solution: | |
```{r} | |
test <- geojsonio::geojson_read("test.geojson") | |
``` | |
Then we can filter out the feature types of interest: | |
```{r} | |
obj = Filter( | |
function(f){f$geometry$type=="LineString"}, | |
test$features) | |
``` | |
## Solution from Spacedman | |
```{r} | |
mess <- 'ogr2ogr -where "OGR_GEOMETRY=\'LINESTRING\'" -f "GeoJSON" lines.geojson test.geojson' | |
system(mess) | |
testl <- rgdal::readOGR("lines.geojson", layer = "OGRGeoJSON") | |
``` | |
[1]: https://github.com/Robinlovelace/Creating-maps-in-R/blob/master/vignettes/geoJSON.Rmd | |
[2]: https://github.com/Robinlovelace/Creating-maps-in-R/blob/master/data/test-multifeature.geojson | |
## gdal fails to convert certain types of CRS | |
Let's see what R has created from writeOGR: | |
```{r} | |
gas2 <- readLines("/tmp/gas.GeoJSON") | |
gas2[1:4] | |
``` | |
Frustratingly, this fails to write the CRS into the file. | |
**rgdal**'s ability to read a crs from a GeoJSON file is no | |
guarantee of it's ability to then write it out again. | |
It seems | |
**rgdal** has **lost the CRS in translation**. | |
This can problem be caused by using an old version of | |
`gdal`. Not in this case though: incomplete CRS values can | |
lead `rgdal` to omit the `"crs"` in the file, as explained by | |
Roger Bivand in a lengthy [conversation](http://r-sig-geo.2731867.n2.nabble.com/WriteOGR-to-GeoJSON-loses-CRS-td7586913.html) | |
on the [R-sig-geo](https://stat.ethz.ch/mailman/listinfo/r-sig-geo) | |
email list. To test that the problem is the fault of gdal and | |
not R, we can use the `ogr2ogr` command line tool, supplied by | |
gdal. | |
```{r, echo=FALSE} | |
system("rm /tmp/gas3.*") | |
``` | |
```{r} | |
# use call gdal from the operating system to convert original GeoJSON: | |
system("ogr2ogr -f 'ESRI Shapefile' /tmp/gas3.shp '/tmp/gas.GeoJSON' ") | |
gas3 <- readOGR(dsn = "/tmp", layer = "gas3") | |
proj4string(gas3) | |
proj4string(gas) | |
``` | |
The above code shows that gdal has successfully converted the | |
original GeoJSON file to a shapefile, maintaining the CRS. | |
(Note that the CRS is identical but its definition is slightly different.) | |
Let's take a look at the `.proj` file that gdal created: | |
```{r} | |
gas3_proj <- readLines("/tmp/gas3.prj", skipNul = T) | |
gas3_proj | |
``` | |
There are various ways to solve the problem. | |
The 'outside R' solution would be to write the file | |
first as a shapefile, and then use `ogr2ogr` to convert | |
it to a GeoJSON. But as shown in the code below, this fails: | |
```{r} | |
writeOGR(gas, "/tmp/", layer = "gas4", driver = "ESRI Shapefile") | |
system("ogr2ogr -f 'GeoJSON' /tmp/gas5.GeoJSON '/tmp/gas4.shp' ") | |
readLines("/tmp/gas4.prj") | |
readLines("/tmp/gas5.GeoJSON")[1:3] | |
``` | |
# CRS definitions | |
As Roger Bivand shows, if the CRS is defined in certain | |
ways, the `"crs"` member will be written to the GeoJSON file. | |
But it is difficult to know how to define a CRS other than | |
by its [EPSG name](http://spatialreference.org/ref/epsg/4283/): | |
```{r} | |
gas6 <- gas # duplicate the gas object | |
proj4string(gas6) <- CRS("+init=epsg:4283") | |
proj4string(gas6) | |
# the below works, but incorrect CRS | |
# proj4string(gas6) <- CRS("+proj=longlat +ellps=clrk66 +datum=NAD27") | |
geojsonio::geojson_write(gas6, file = "/tmp/gas6.geojson") | |
readLines("/tmp/gas6.geojson")[1:4] | |
``` | |
# A manual solution | |
To solve the problem manually, we can simply add the correct projection | |
to the GeoJSON file that has been created. Thankfully this is straightforward, as line 3 of the files are left empty by gdal, presumably ready | |
for such an eventuality: | |
```{r} | |
gas7 <- readLines("/tmp/gas2.geojson") | |
gas7[1:4] | |
gas7[3] <- '"crs": {"type": "EPSG", "properties": {"code": 4283}},' | |
writeLines(gas7, con = "/tmp/gas8.GeoJSON") | |
gas8 <- readOGR(dsn = "/tmp/gas8.GeoJSON", layer = "OGRGeoJSON") | |
proj4string(gas8) | |
``` | |
This is a messy solution but it's the only one I could find to write | |
GeoJSON files with a CRS defined for Australian data services' default CRS. | |
```{r, echo=FALSE} | |
# library(rgdal) | |
# library(pryr) | |
# pryr::show_c_source(readOGR) | |
# showMethods(readOGR) | |
# ??readOGR | |
``` | |