/
index.qmd
190 lines (146 loc) · 13.3 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
---
title: "Upcoming changes to popular R packages for spatial data: what you need to do"
author: Jakub Nowosad
date: "2023-06-04"
slug: rgdal-retirement
categories:
- posts
- rstats
tags: [rspatial, rgdal, rstats]
draft: false
---
## The issue
Three popular R packages for spatial data handling won't be available on CRAN after October 2023.^[Their retirement was [announced almost two years ago](https://stat.ethz.ch/pipermail/r-sig-geo/2021-September/028760.html).] These packages are:
- **rgdal**: a package that provides bindings to the [GDAL](https://gdal.org/) and [PROJ](https://proj.org/) libraries.
In other words, it gives a capability of reading and writing spatial data formats, creating coordinate reference systems and projecting geometries.
- **rgeos**: a package that provides bindings to the [GEOS](https://libgeos.org/) library.
It allows to perform various spatial operations, such as buffering, clipping, or unioning.
- **maptools**: an older set of various tools for manipulating spatial data pre-dating **rgdal** and **rgeos**.
As you can see, two of these packages, **rgdal** and **rgeos** link and allow access to **external** (non-R) spatial libraries for reading and writing spatial data formats and performing spatial operations, such as reprojections and spatial joins.
If you are a user or developer of any of these packages, you need to be aware of the upcoming changes.
From the user perspective, you won't be able to install these packages from CRAN soon, and thus your workflows will be broken.
On the other hand, if you are a developer and your package uses some of the above packages, it will stop working and you won't be able to retain (or submit) it to CRAN.
Very soon you will need to find some alternatives to these packages.
Gladly, most of **rgdal**, **rgeos**, and **maptools** functionality is already available in modern R packages, such as **sf** and **terra**, as you will see below.
## The context
**rgdal** and **maptools** were first published on CRAN in 2003, and **rgeos** in 2011.
These packages were developed when the spatial data analysis ecosystem in R was [still very young](https://r.geocompx.org/intro.html#the-history-of-r-spatial), and they were subsequently used for many (ecological) data analysis workflows.
Writing a popular, complex package is a lot of work, and maintaining it is even much more work.
This can be seen, for example, by looking at [the release versions of the **rgdal** package](https://cran.r-project.org/src/contrib/Archive/rgdal/).
There were 151 releases of this package between 2003 and 2023, which means that there was a new release every 7 weeks on average.
Now, let's consider that each release requires many hours of (usually voluntary) work.
**rgdal**, **maptools**, and **rgeos** have a few things in common.
While they are independent, they are also closely related to each other -- all of them depend on the **sp** package for spatial data representation in R.^[This they share with the **raster** package, which was builtd on **sp** representations and used **rgdal** and **rgeos** under the hood. Since September 2022, **raster** uses **terra**, not **rgdal** or **rgeos**.]
Another thing they have in common is that currently we have more modern alternatives to them, such as **sf** (first released in 2016) and **terra** (2020).
Finally, the main author of **rgdal**, **maptools**, and **rgeos** is the same person -- Roger Bivand, who retired from his position at the Norwegian School of Economics in 2021.
Thus, given the workloads of maintaining these packages, the existence of modern alternatives, and the retirement of the main author, it was decided to retire these packages.^[More details about the context of these changes may be found in the paper by [Bivand (2021)](https://doi.org/10.1007/s10109-020-00336-0).]
## How to test if your code is affected?
The most basic way to test if your code is affected by the upcoming changes is to check if you use **rgdal**, **rgeos**, and **maptools** in your scripts or as dependencies in your packages.
If so -- your workflows may be broken soon!
There is also a possibility that you use the **sp** package but not the affected packages.
In this case, you may still be touched by the changes, because the **sp** package had interacted with **rgdal** and **rgeos** in the background.
For example, if you run the `spTransform()` function from the **sp** package, it used the **rgdal** package in the background.
Thus, you may add the following code to your script to check if it will work in the future:
```{r}
options("sp_evolution_status" = 2) # use sf instead of rgdal and rgeos in sp
library(sp)
```
If you get an error, it means that your code is affected by the changes.^[But if you make the **sf** package available, `spTransform()` will use it instead of **rgdal**. Similarly, many workflows using **raster** can make **terra** available, and remove any mention of **rgdal** or **rgeos**.]
There is also another possible situation -- you are using some packages that depend on the affected packages.
This one is the most difficult to check, because you need to inspect the dependencies of all the packages you use.^[However, maintainers of these packages have been actively contacted with the information of both the need to act, and suggesting ways to migrate from retiring packages to **sf** or **terra**. Many packages depended on **rgdal** and **rgeos** because **raster** used to depend on them; they can safely be replaced by the use by **raster** of **terra** "under the hood".]
## Solutions for users
My first advice would be: do yourself a favor, and just stop using the retired packages today.
Write your new scripts using modern alternatives, such as **sf** or **terra**.
In most cases they are easy to use, and you will be able to find a lot of examples and other resources online.
If you have some old scripts that use the retired packages, you have a few options and your decision should depend on your circumstances.
If you do not plan to use them in the future, you can just leave them as they are.
On the other hand, if you want to perform similar spatial operations, you should consider rewriting the scripts using modern alternatives -- see the tables below for the equivalents of the most popular functions from **rgdal**, **rgeos**, and **maptools**.
You also may have old **sp** objects that you want to use in the future.
Then, you can convert them to the modern equivalents with functions such as `sf::st_as_sf()` or `terra::vect()`.^[Read more about this in the [Conversions between different spatial classes in R](https://geocompx.org/post/2021/spatial-classes-conversion/) and [Progress on R-spatial evolution, Apr 2023](https://r-spatial.org/r/2023/04/10/evolution3.html#conserving-sp-workflows) blog posts.]
Finally, you may decide to still use the retired packages.
You can do that by either not updating your R installation and packages, or by creating a docker image with the old R and packages versions.
However, this is generally not a good idea, because **rgdal**, **rgeos**, and **maptools** will be archived on CRAN soon, and thus you (and other people) won't be able to install them directly.^[Their installation will require downloading source packages from the CRAN package archive, and the providing the external software libraries needed to install **rgdal** and **rgeos** from source.]
## Solutions for package developers
Look at your package dependencies and check if you use any of the affected packages.
If that is the case, you need to find alternatives to them.
In general, there are two main alternative options^[There are also other options, such as using the **vapour** or **geos** package, but they are not discussed here.]:
- **sf** package, which is a modern alternative to **rgdal** and **rgeos**, and also provides a spatial vector data representation in R.
- **terra** package, which gives support to working with spatial vector and raster data in R.
You just need to be aware that functions in these packages are not always identical to the ones in the affected packages.
They may have different arguments, different defaults, expect different inputs, or return different outputs.
For example, **rgdal**'s `readOGR()` function returns a `Spatial*DataFrame` object, while **sf**'s `read_sf()` returns an `sf` object and **terra**'s `vect()` returns a `SpatVector` object.^[Of course, you can convert these objects back to **sp** classes and carry on as before, if you choose. However, this choice is most suited to legacy settings rather than actively maintained packages.]
```{r}
#| echo: false
#| eval: true
rgdal_df = tibble::tribble(
~rgdal, ~sf, ~terra,
"readOGR()", "read_sf()", "vect()",
"writeOGR()", "write_sf()", "writeVector()",
)
knitr::kable(rgdal_df)
```
The **rgeos** functions also have their equivalents in **sf** and **terra**.^[You just need to be aware that they may return different values compared to **rgeos**. For example, **sf** may use [a different algorithm to calculate the area of a polygon for data with a geographical coordinate reference system (CRS).](https://r-spatial.org/r/2020/06/17/s2.html)]
```{r}
#| echo: false
#| eval: true
rgeos_df = tibble::tribble(
~rgeos, ~sf, ~terra,
"gArea()", "st_area()", "expanse()",
"gBuffer()", "st_buffer()", "buffer()",
"gCentroid()", "st_centroid()", "centroids()",
"gDistance()", "st_distance()", "distance()",
"gIntersection()", "st_intersection()", "crop()",
"gIntersects()", "st_intersects()", "relate()",
)
```
Interestingly, the most often used **maptools** functions have been already deprecated for a long time and alternative tools for the same purposes existed, e.g., `unionSpatialPolygons()` could be replaced with `rgeos::gUnaryUnion()`, and `maptools::spRbind()` with `sp::rbind()`.
The table below shows their modern substitutes.
```{r}
#| echo: false
#| eval: true
maptools_df = tibble::tribble(
~maptools, ~sf, ~terra,
"unionSpatialPolygons()", "st_union()", "aggregate()",
"spRbind()", "rbind()", "rbind()",
)
knitr::kable(rgeos_df)
knitr::kable(maptools_df)
```
The **sp** package is not going to be removed from CRAN soon, but it is good to stop using it because it is no longer being actively developed.
Here you can find some replacements for its basic functions:
```{r}
#| echo: false
#| eval: true
sp_df = tibble::tribble(
~sp, ~sf, ~terra,
"bbox()", "st_bbox()", "ext()",
"coordinates()", "st_coordinates()", "crds()",
"identicalCRS()", "st_crs(x) == st_crs(y)", "crs(x) == crs(y)",
"over()", "st_intersects()", "relate()",
"point.in.polygon()", "st_intersects()", "relate()",
"proj4string()", "st_crs()", "crs()",
"spsample()", "st_sample()", "spatSample()",
"spTransform()", "st_transform()", "project()",
)
knitr::kable(sp_df)
```
You also may want to visit the [comparison table](https://github.com/r-spatial/evolution/blob/main/pkgapi_230305_refs.csv) and the [migration wiki](https://github.com/r-spatial/sf/wiki/Migrating) to get more complete lists of alternative functions to the ones from **rgdal**, **rgeos**, and **maptools**.
Additionally, you may still want to accept **sp** objects as inputs and return them as outputs of your functions.
In such case, you can use the `sf::st_as_sf()` or `terra::vect()` functions to convert **sp** objects to the modern equivalents, perform required spatial operations with **sf**/**terra**, and then convert the results back to **sp** objects with `as(x, "Spatial")`.
Packages depending on **sp** would also need to add **sf** as their weak dependency.^[See the coercion section of the [Progress on R-spatial evolution, Apr 2023](https://r-spatial.org/r/2023/04/10/evolution3.html) blog post.]
## More resources
You may use the [**evolution**](https://github.com/r-spatial/evolution/issues) GitHub repository to ask questions about the retirement process.
To learn more about the **rgdal**, **rgeos**, and **maptools** retirement, you can check the following resources:
- [The announcement of their retirement](https://stat.ethz.ch/pipermail/r-sig-geo/2021-September/028760.html)
- A blog post series about the retirement process: [part 1](https://r-spatial.org/r/2022/04/12/evolution.html), [part 2](https://r-spatial.org/r/2022/12/14/evolution2.html), [part 3](https://r-spatial.org/r/2023/04/10/evolution3.html), [part 4](https://r-spatial.org//r/2023/05/15/evolution4.html)
- A video about this retirement process: <https://youtu.be/TlpjIqTPMCA>
- Scripts converting code from the Applied Spatial Data Analysis with R 2nd edition book to run without retiring packages: <https://github.com/rsbivand/sf_asdar2ed>
On the other hand, if you want to learn more about their alternatives and how to use them, there are many resources available online:
- **sf** package's [official website](https://r-spatial.github.io/sf/)
- **terra** package's [documentation](https://rspatial.github.io/terra/reference/terra-package.html)
- The [Geocomputation with R](https://r.geocompx.org/) book
- The [Spatial Data Science with applications in R](https://r-spatial.org/book/) book and its [section](https://r-spatial.org/book/sp-raster.html) on the retirement process
By exploring these resources, you will be able to find alternative workflows that suit your needs and continue your work with spatial data in R.
Good luck!
## Acknowledgements
I would like to thank [Roger Bivand](https://fosstodon.org/@rsbivand) and [Maximilian H.K. Hesselbarth](https://www.maxhesselbarth.com/) for their comments and suggestions that helped to improve this document.