Skip to content
Permalink
master
Switch branches/tags
Go to file
 
 
Cannot retrieve contributors at this time
---
title: "Reading and writing GeoJSON files in R"
author: "Robin Lovelace"
date: "11/26/2014"
output: html_document
---
## Introduction
[GeoJSON](http://geojson.org/) is an open data format that
extends [JSON](http://www.json.org/), the JavaScript Object Notation
to encode spatial information.
GeoJSON has the following advantages:
- All the geographical information is self contained (unlike shapefiles, which have multiple files for one spatial object)
- It is an open standard which is becoming increasingly popular
- GeoJSON files display as maps on GitHub by default!
The disadvantages are:
- GeoJSON files are [quite large](http://zevross.com/blog/2014/04/22/spatial-data-on-a-diet-tips-for-file-size-reduction-using-topojson/) compared with non text formats such as shapefiles and
spatial databases.
- GeoJSON files can be slow to read/write and are not supported by
all GIS software (yet!).
## Reading GeoJSONs with readOGR
As illustrated by this [post](http://www.jose-gonzalez.org/create-geojson-r/#.VHVoMY_6IrE), it is easy to read and write GeoJSON files
in R, using the [**rgdal**](http://cran.r-project.org/web/packages/rgdal/index.html) package, which calls
[gdal](http://www.gdal.org/) commands. Here's an example using
publicly available GeoJSON file:
```{r, echo=FALSE}
system("rm /tmp/gas*")
```
```{r}
library(rgdal) # the library we'll be using
u <- "https://raw.githubusercontent.com/codeforboston/open_data_cambridge/master/Infrastructure/UtilityFacilities.geojson"
downloader::download(url = u, destfile = "/tmp/gas.GeoJSON")
gas <- readOGR(dsn = "/tmp/gas.GeoJSON", layer = "OGRGeoJSON")
summary(gas)
```
Note that to test whether the GeoJSON driver works on your system,
you can use the following [command](http://stackoverflow.com/questions/24183007/is-it-possible-to-read-geojson-or-topojson-file-in-r-to-draw-a-choropleth-map):
```{r}
"GeoJSON" %in% ogrDrivers()$name
```
As shown in the above example,
GeoJSON files can have coordinate reference system (CRS)
allocated to them through the `"crs"` 'member':
```{r, engine='node', eval=FALSE}
"crs": {
"type": "name",
"properties": {
"name": "urn:ogc:def:crs:OGC:1.3:CRS84"
}
}
```
Note that we can see this definition in the `gas.GeoJSON` file:
```{r}
readLines("/tmp/gas.GeoJSON")[3]
```
This 'member' is defined in plain text in the GeoJSON file
before the spatial `"features"` are defined. Please see this
small example of [gas stations](http://giscience.tw.rpi.edu/rpi_group_projects/watershedders/data/gas_station.geojson) to see how the CRS
is defined in the context of a complete GeoJSON file.
## Writing GeoJSONs with writeOGR
To write GeoJSON files, the sensibly named corrollary of
`readOGR` is used, `writeOGR`:
```{r}
if(file.exists("/tmp/gas2.geojson")){
file.remove("/tmp/gas2.geojson")
}
```
```{r}
geojsonio::geojson_write(gas, file = "/tmp/gas2.geojson")
```
## Multifeature geojsons
Based on [this](https://stackoverflow.com/questions/30583048/convert-features-of-a-multifeature-geojson-into-r-spatial-objects/30593077#30593077), geojson files can usually be loaded into R with `readOGR`, as illustrated above.
However, this fails for multifeature geojsons.
Reproducible example:
```{r}
downloader::download("https://github.com/Robinlovelace/Creating-maps-in-R/raw/master/data/test-multifeature.geojson", "test.geojson")
try(test <- rgdal::readOGR("test.geojson", "OGRGeoJSON")) # fails with:
```
There are 2 solutions. One uses rgdal and the gdal argument
`require_geomType`. The other uses geojsonio.
### Multifeature geojsons with rgdal
The first solution uses new features introduced in gdal 0.9-3.
It is worth checking (and maybe updating) your gdal version.
(Thanks to [@hrbrmstr](https://stackoverflow.com/users/1457051/hrbrmstr) for this detailed information.)
```{r}
rgdal::getGDALVersionInfo() # check the version of gdal
system("gdal-config --version") # asking gdal its version from bash
```
```
Before update, this was:
[1] "GDAL 1.11.1, released 2014/09/24"
After updating gdal, e.g. like this (Ubuntu), it changed:
$ sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable && sudo apt-get update
$ sudo apt-get install gdal-bin
```
gdal can provide us with information about the layers in the
multipart geojson.
```{r}
ogrListLayers("test.geojson")
```
It can also provide more general information about the object,
if we ask it correctly.
```{r}
try(ogrInfo("test.geojson")) # no information on the file
system("ogrinfo test.geojson ")
try(ogrInfo("test.geojson", "OGRGeoJSON"))
```
The output from the previous lines indicates how we can
access just one layer. We use the argument `require_geomType`
to specify which of the geometries to load:
```{r}
testp_info <-
ogrInfo("test.geojson", "OGRGeoJSON", require_geomType="wkbPoint")
testp_df <- as.data.frame(testp_info$iteminfo)
head(testp_df)
```
Note we can use the same command for any of the geometries:
```{r, eval=FALSE}
ogrInfo("test.geojson", "OGRGeoJSON", require_geomType = "wkbPolygon")
```
```
NOTE: keeping only 23 wkbPolygon of 781 features
note that extent applies to all features
Source: "test.geojson", layer: "OGRGeoJSON"
Driver: GeoJSON number of rows 781
selected geometry type: wkbPolygon with 23 rows
Feature type: wkbPolygon with 2 dimensions
Extent: (12.48326 41.88355) - (12.5033 41.89629)
CRS: +proj=longlat +datum=WGS84 +no_defs
Number of fields: 78
name type length typeName
1 area 4 0 String
2 bicycle 4 0 String
3 highway 4 0 String
```
Armed with this information, we can read in the features:
```{r}
points <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbPoint")
lines <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbLineString")
polygons <- readOGR("test.geojson", "OGRGeoJSON", require_geomType="wkbPolygon")
```
Let's plot the data (see https://stackoverflow.com/questions/30583048/ )
```{r}
plot(lines, col="#7f7f7f")
plot(polygons, add=TRUE)
plot(points, add=TRUE, col="red")
```
## Solution from geojsonio
This is an alternative solution:
```{r}
test <- geojsonio::geojson_read("test.geojson")
```
Then we can filter out the feature types of interest:
```{r}
obj = Filter(
function(f){f$geometry$type=="LineString"},
test$features)
```
## Solution from Spacedman
```{r}
mess <- 'ogr2ogr -where "OGR_GEOMETRY=\'LINESTRING\'" -f "GeoJSON" lines.geojson test.geojson'
system(mess)
testl <- rgdal::readOGR("lines.geojson", layer = "OGRGeoJSON")
```
[1]: https://github.com/Robinlovelace/Creating-maps-in-R/blob/master/vignettes/geoJSON.Rmd
[2]: https://github.com/Robinlovelace/Creating-maps-in-R/blob/master/data/test-multifeature.geojson
## gdal fails to convert certain types of CRS
Let's see what R has created from writeOGR:
```{r}
gas2 <- readLines("/tmp/gas.GeoJSON")
gas2[1:4]
```
Frustratingly, this fails to write the CRS into the file.
**rgdal**'s ability to read a crs from a GeoJSON file is no
guarantee of it's ability to then write it out again.
It seems
**rgdal** has **lost the CRS in translation**.
This can problem be caused by using an old version of
`gdal`. Not in this case though: incomplete CRS values can
lead `rgdal` to omit the `"crs"` in the file, as explained by
Roger Bivand in a lengthy [conversation](http://r-sig-geo.2731867.n2.nabble.com/WriteOGR-to-GeoJSON-loses-CRS-td7586913.html)
on the [R-sig-geo](https://stat.ethz.ch/mailman/listinfo/r-sig-geo)
email list. To test that the problem is the fault of gdal and
not R, we can use the `ogr2ogr` command line tool, supplied by
gdal.
```{r, echo=FALSE}
system("rm /tmp/gas3.*")
```
```{r}
# use call gdal from the operating system to convert original GeoJSON:
system("ogr2ogr -f 'ESRI Shapefile' /tmp/gas3.shp '/tmp/gas.GeoJSON' ")
gas3 <- readOGR(dsn = "/tmp", layer = "gas3")
proj4string(gas3)
proj4string(gas)
```
The above code shows that gdal has successfully converted the
original GeoJSON file to a shapefile, maintaining the CRS.
(Note that the CRS is identical but its definition is slightly different.)
Let's take a look at the `.proj` file that gdal created:
```{r}
gas3_proj <- readLines("/tmp/gas3.prj", skipNul = T)
gas3_proj
```
There are various ways to solve the problem.
The 'outside R' solution would be to write the file
first as a shapefile, and then use `ogr2ogr` to convert
it to a GeoJSON. But as shown in the code below, this fails:
```{r}
writeOGR(gas, "/tmp/", layer = "gas4", driver = "ESRI Shapefile")
system("ogr2ogr -f 'GeoJSON' /tmp/gas5.GeoJSON '/tmp/gas4.shp' ")
readLines("/tmp/gas4.prj")
readLines("/tmp/gas5.GeoJSON")[1:3]
```
# CRS definitions
As Roger Bivand shows, if the CRS is defined in certain
ways, the `"crs"` member will be written to the GeoJSON file.
But it is difficult to know how to define a CRS other than
by its [EPSG name](http://spatialreference.org/ref/epsg/4283/):
```{r}
gas6 <- gas # duplicate the gas object
proj4string(gas6) <- CRS("+init=epsg:4283")
proj4string(gas6)
# the below works, but incorrect CRS
# proj4string(gas6) <- CRS("+proj=longlat +ellps=clrk66 +datum=NAD27")
geojsonio::geojson_write(gas6, file = "/tmp/gas6.geojson")
readLines("/tmp/gas6.geojson")[1:4]
```
# A manual solution
To solve the problem manually, we can simply add the correct projection
to the GeoJSON file that has been created. Thankfully this is straightforward, as line 3 of the files are left empty by gdal, presumably ready
for such an eventuality:
```{r}
gas7 <- readLines("/tmp/gas2.geojson")
gas7[1:4]
gas7[3] <- '"crs": {"type": "EPSG", "properties": {"code": 4283}},'
writeLines(gas7, con = "/tmp/gas8.GeoJSON")
gas8 <- readOGR(dsn = "/tmp/gas8.GeoJSON", layer = "OGRGeoJSON")
proj4string(gas8)
```
This is a messy solution but it's the only one I could find to write
GeoJSON files with a CRS defined for Australian data services' default CRS.
```{r, echo=FALSE}
# library(rgdal)
# library(pryr)
# pryr::show_c_source(readOGR)
# showMethods(readOGR)
# ??readOGR
```