Some inspiration for this post came from the beautiful books of Lovelace et al. (2019), Pebesma & Bivand (2019) and Hijmans (2019), and from various websites.
Luckily, quite a list of open standards is available! Below, some powerful and widely used single-file formats are introduced. Single-file data sources are readily amenable to exchange and publication.
I see you can’t wait to start practicing, so you can also head straight over to the tutorial on vector formats and the tutorial on the GeoTIFF raster format! In these tutorials, a comparison table of vector/raster file formats is also presented.
GDAL (Geospatial Data Abstraction Library) is by far the most used collection of open-source drivers for raster and vector geospatial data formats.
In other words, it is the preferred workhorse for reading and writing many geospatial file formats, and it is used in the background by a lot of geospatial applications. Using GDAL is the easiest way to conform to open standards.
So, in R we use packages that use GDAL in the background, such as rgdal, sp, sf, raster and stars.
[…] (filename.gpkg). It shares this property with shapefiles, which however pose multiple limitations [1], so the GeoPackage is a more than suitable replacement.
A GeoJSON file (filename.geojson) contains one vector layer. Note that one vector layer can combine different feature geometry types, e.g. points and linestrings. JSON itself is a common and straightforward open data format. It is a text file readable by both humans and machines (see the tutorial for an example). GeoJSON adds the necessary specification to JSON for standardized storage of geographic feature data, but it is still a plain JSON text file.
([…] sf-repository).
[…] RFC7946=YES. With this GDAL option, on-the-fly reprojection to WGS84 happens automatically, and coordinates are written with 7 decimal places, i.e. a precision of approximately 1 cm. Given these advantages, we advise using RFC7946 explicitly. Several functions in R allow the user to provide options that are passed to GDAL, so we can ask for RFC7946 output (see the tutorial).
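To make the GeoJSON points above a bit more tangible, here is a minimal sketch. It is written in Python with only the standard library, purely as an illustration; the tutorials themselves work in R. It does not call GDAL and it is not an implementation of the RFC7946=YES option. It only shows that a GeoJSON layer is plain JSON text, that one layer can mix points and linestrings, and what rounding to 7 decimal places means on the ground. The feature names, the coordinates and the helper functions `round_coords` and `to_feature` are hypothetical.

```python
import json
import math

# Hypothetical example data: two features with different geometry types.
# Coordinates are longitude/latitude in WGS84, with more decimals than needed.
raw_features = [
    {"name": "sampling point", "type": "Point",
     "coordinates": [4.3517103456789, 50.8503396123456]},
    {"name": "transect", "type": "LineString",
     "coordinates": [[4.3517103456789, 50.8503396123456],
                     [4.4024643111111, 51.2194475222222]]},
]


def round_coords(coords, ndigits=7):
    """Recursively round a (possibly nested) GeoJSON coordinate array."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(c, ndigits) for c in coords]


def to_feature(item):
    """Wrap one record as a GeoJSON Feature with rounded coordinates."""
    return {
        "type": "Feature",
        "properties": {"name": item["name"]},
        "geometry": {"type": item["type"],
                     "coordinates": round_coords(item["coordinates"])},
    }


# One vector layer corresponds to one FeatureCollection.
collection = {"type": "FeatureCollection",
              "features": [to_feature(f) for f in raw_features]}

# The layer is ordinary JSON text: it can be written and read with any JSON tool.
geojson_text = json.dumps(collection, indent=2)
parsed = json.loads(geojson_text)
geometry_types = [f["geometry"]["type"] for f in parsed["features"]]
print(geometry_types)

# Ground distance covered by 1e-7 degree, measured along a meridian
# (east-west distances shrink further with the cosine of the latitude).
EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres
metres_per_degree = 2 * math.pi * EARTH_RADIUS_M / 360
step_cm = 1e-7 * metres_per_degree * 100
print(f"1e-7 degree corresponds to about {step_cm:.2f} cm")
```

Rounding at write time, as sketched here, keeps the file smaller without losing precision that matters for most field data, which is in line with the advice to prefer RFC7946 output.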
[…] (filename.tif). It uses a small set of reserved TIFF tags to store information about the CRS, extent and resolution of the raster.
Hijmans R. (2019). Spatial Data Science with R. URL: https://rspatial.org/.
Lovelace R., Nowosad J. & Muenchow J. (2019). Geocomputation with R. URL: https://geocompr.robinlovelace.net.
Pebesma E. & Bivand R. (2019). Spatial Data Science. URL: https://www.r-spatial.org/book.
[1] Some problems with shapefiles are: they’re not an open format, they consist of multiple files and they have restrictions regarding file size, column name length, number of columns and the feature types that can be accommodated.
[2] Note that personal geodatabases have their size limited to 250-500 MB; a GeoPackage can have a size of about 140 TB if the filesystem can handle it.
[3] Though GeoJSON 2008 has been obsoleted, the now recommended RFC7946 standard is still officially in a proposal stage. That is probably the reason why GDAL does not yet default to RFC7946. A somewhat confusing situation, it seems.
[4] When versioning GeoJSON files, mind the order of your data when rewriting them: reordering could produce large diffs. Interested in combining GeoJSON and GitHub? Surprise yourself!