Refactor description of sf interfaces, big it up #743

Merged
merged 15 commits into from
Feb 5, 2022
2 changes: 1 addition & 1 deletion 01-introduction.Rmd
@@ -168,7 +168,7 @@ if(knitr::is_latex_output()){

It would have been difficult to produce Figure \@ref(fig:interactive) using R a few years ago, let alone as an interactive map.
This illustrates R's flexibility and how, thanks to developments such as **knitr** and **leaflet**, it can be used as an interface to other software, a theme that will recur throughout this book.
The use of R code, therefore, enables teaching geocomputation with reference to reproducible examples such as that provided in Figure \@ref(fig:interactive) rather than abstract concepts.
The use of R code, therefore, enables teaching geocomputation with reference to reproducible examples representing real-world phenomena, rather than just abstract concepts.

## Software for geocomputation
<!--rl-->
51 changes: 47 additions & 4 deletions 02-spatial-data.Rmd
@@ -8,6 +8,7 @@ This is the first practical chapter of the book, and therefore it comes with som
We assume that you have an up-to-date version of R installed and that you are comfortable using software with a command-line interface such as the integrated development environment (IDE) RStudio.
<!--or VSCode?-->

<!-- Should we update these references to more up-to-date resources? -->
If you are new to R, we recommend reading Chapter 2 of the online book *Efficient R Programming* by @gillespie_efficient_2016 and learning the basics of the language with reference to resources such as @grolemund_r_2016.
Organize your work (e.g., with RStudio projects) and give scripts sensible names such as `02-chapter.R` to document the code you write as you learn.
\index{R!pre-requisites}
@@ -107,8 +108,16 @@ source("https://github.com/Robinlovelace/geocompr/raw/main/code/02-vectorplots.R
knitr::include_graphics(c("figures/vector_lonlat.png", "figures/vector_projected.png"))
```

**sf** is a package providing classes for geographic vector data.
Not only does **sf** supersede **sp**, it also provides a consistent command-line interface to GEOS\index{GEOS} and GDAL\index{GDAL}, superseding **rgeos** and **rgdal** (described in Section \@ref(the-history-of-r-spatial)).
Review comment (Member):

@Robinlovelace one overall comment that came to mind yesterday -- maybe we could limit mentions of sp to a minimum in the main text, and leave it to footnotes (as we already have dropped it from the first edition of the book)...

The **sf** package provides classes for geographic vector data and a consistent command-line interface to important low-level libraries for geocomputation:

- GEOS\index{GEOS}, for geometry operations such as calculating buffers and centroids, covered in Chapter \@ref(geometric-operations)
Review comment (Member):

I would suggest different order of the libraries: start with GDAL, then PROJ, then GEOS and S2. I am also unsure if we should list liblwgeom here, as it is not mentioned in the sf intro message...

Review comment (Member):

I suggested different order partially to keep GEOS and S2 together as they have overlapping capabilities but for data with different CRSs

Review comment (Member):

"calculating buffers and centroids on data with a projected CRS" (or similar)?

- PROJ, a powerful library for coordinate system transformations, which underlies the content covered in Chapter \@ref(reproj-geo-data)
- [S2](https://s2geometry.io/), a spherical geometry engine written in C++ developed by Google, via the [**s2**](https://r-spatial.github.io/s2/) package, covered in Section \@ref(s2) below and in Chapter \@ref(reproj-geo-data)
- GDAL\index{GDAL}, for reading, writing and manipulating a wide range of geographic data formats, covered in Chapter \@ref(read-write)
<!-- - [liblwgeom](https://github.com/postgis/postgis/tree/master/liblwgeom), a geometry engine used by PostGIS, via the [**lwgeom**](https://r-spatial.github.io/lwgeom/) package -->

**sf** reports these interfaces the first time the package is loaded: this is the meaning of the message `Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1; sf_use_s2() is TRUE` that appears below the `library(sf)` command at the beginning of this chapter.
At the time of writing (2022), **sf** is the only package/system *in any language* that provides a unified interface to this wide range of high-performance functionality from an interactive command-line environment for reproducible research.
This section introduces **sf** classes in preparation for subsequent chapters (Chapters \@ref(geometric-operations) and \@ref(read-write) cover the GEOS and GDAL interface, respectively).
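
The versions reported in the startup message can also be queried after loading. The following is a minimal sketch, assuming a recent **sf** release in which `sf_extSoftVersion()` returns a named character vector with `GEOS`, `GDAL` and `PROJ` entries; the version numbers printed will differ between systems.

```r
library(sf)

# Versions of the external libraries that sf is linked to
versions = sf_extSoftVersion()
versions[c("GEOS", "GDAL", "PROJ")]

# Whether the S2 spherical geometry engine is currently switched on
sf_use_s2()
```

Printing these values alongside results is one way to make it easier to reproduce an analysis on another machine.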

### An introduction to simple features {#intro-sf}
@@ -258,7 +267,6 @@ class(nc_dfr)
class(nc_tbl)
```


As described in Chapter \@ref(attr), which shows how to manipulate `sf` objects with **tidyverse** functions, **sf** is now the go-to package for analysis of spatial vector data in R (notwithstanding the **spatstat** package ecosystem, which provides numerous functions for spatial statistics).
Many popular packages build on **sf**, as reflected in the rise in its number of downloads per day shown in Section \@ref(r-ecosystem) in the previous chapter.
Transitioning established packages and workflows away from legacy packages **rgeos** and **rgdal** takes time [@bivand_progress_2021], but the process was given a sense of urgency by messages printed when they were loaded, which state that they "will be retired by the end of 2023".
@@ -774,6 +782,42 @@ sf::st_crs(df_sf) = 4326
It is fast and reliable at 'casting' geometry columns to different types, a topic covered in Chapter \@ref(geometric-operations).
Benchmarks, in the package's [documentation](https://dcooley.github.io/sfheaders/articles/examples.html#performance) and in test code developed for this book, show it is much faster than the `sf` package for such operations.

### Spherical geometry operations with S2 {#s2}
Review comment (Member):

BTW -- do we mention what is a "planar geometry" anywhere in the book?


Spherical geometry engines are based on the fact that the world is round, whereas simple mathematical procedures for geocomputation, such as calculating a straight line between two points or the area enclosed by a polygon, assume planar (projected) geometries.
Since **sf** version 1.0.0, R supports spherical geometry operations 'out of the box', thanks to its interface to Google's S2 spherical geometry engine via the **s2** interface package.
S2 is perhaps best known as an example of a Discrete Global Grid System (DGGS).
Another example is the [H3](https://eng.uber.com/h3/) global hexagonal hierarchical spatial index [@bondaruk_assessing_2020].

Although potentially useful for describing locations anywhere on Earth using character strings such as [e66ef376f790adf8a5af7fca9e6e422c03c9143f](https://developers.google.com/maps/documentation/gaming/concepts_playable_locations), the main benefit of **sf**'s interface to S2 is its provision of drop-in functions for distance, buffer, and area calculations, as described in **sf**'s built-in documentation, which can be opened with the command [`vignette("sf7")`](https://r-spatial.github.io/sf/articles/sf7.html).

**sf** can run in two modes with respect to S2: on and off.
By default the S2 geometry engine is turned on, as can be verified with the following command:

```{r}
sf_use_s2()
```

An example of the consequences of turning the geometry engine off is shown below, by creating buffers around the `india` object created earlier in the chapter (note the warnings emitted when S2 is turned off):

```{r}
india_buffer_with_s2 = st_buffer(india, 1)
sf_use_s2(FALSE)
india_buffer_without_s2 = st_buffer(india, 1)
```

```{r s2example, fig.show='hold', out.width="60%", echo=FALSE, fig.cap="Example of the consequences of turning off the S2 geometry engine. Both representations of a buffer around India were created with the same command but the light green polygon object was created with S2 switched on, resulting in a buffer of 1 m. The larger red polygon was created with S2 switched off, resulting in a buffer with inaccurate units of degrees longitude/latitude."}
plot(st_geometry(india_buffer_without_s2), expandBB = c(0, 0.2, 0.1, 1), col = "red")
plot(st_geometry(india_buffer_with_s2), expandBB = c(0, 0.2, 0.1, 1), col = "lightgreen", add = TRUE)
```

Throughout this book we will assume that S2 is turned on, unless explicitly stated otherwise.
Turn it on again with the following command.

```{r}
sf_use_s2(TRUE)
```
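
Because `sf_use_s2()` changes a global setting, it is easy to forget to switch it back. The sketch below shows one possible way to limit the change to a single expression. `with_s2()` is a hypothetical helper written for this illustration, not a function provided by **sf**; it relies on `sf_use_s2()` returning the previous setting when called with an argument.

```r
library(sf)

# Hypothetical helper (not part of sf): evaluate `expr` with S2 switched
# on or off, then restore whichever setting was active beforehand.
with_s2 = function(use_s2, expr) {
  old = suppressMessages(sf_use_s2(use_s2))   # returns the previous setting
  on.exit(suppressMessages(sf_use_s2(old)))   # restore it when the function exits
  force(expr)                                 # expr is evaluated lazily, i.e. only here
}

# A point in geographic coordinates (lon/lat, EPSG:4326)
pt = st_sfc(st_point(c(0, 0)), crs = "EPSG:4326")

# Buffer computed with S2 off: the distance is treated as degrees (a warning is expected)
buf_planar = with_s2(FALSE, st_buffer(pt, 1))

# Buffer computed with S2 on: the distance is treated as meters
buf_s2 = with_s2(TRUE, st_buffer(pt, 1))

# The setting in place before the calls is unchanged
sf_use_s2()
```

The `on.exit()` call means the previous setting is restored even if the expression throws an error.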

## Raster data

The spatial raster data model represents the world with a continuous grid of cells (often also called pixels; Figure \@ref(fig:raster-intro-plot):A).
@@ -1028,7 +1072,6 @@ For now, it is sufficient to know:
- Knowing which CRS your data is in, and whether it is in geographic (lon/lat) or projected (typically meters), is important and has consequences for how R handles spatial and geometry operations
- CRSs of `sf` objects can be queried with the function `st_crs()`, CRSs of `terra` objects can be queried with the function `crs()`
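
The two points above can be sketched for vector data as follows. The coordinates and the choice of EPSG:32630 (WGS 84 / UTM zone 30N) are illustrative assumptions, not values taken from the chapter.

```r
library(sf)

# A single point in geographic coordinates (WGS 84, EPSG:4326)
pt_geo = st_sfc(st_point(c(-0.1, 51.5)), crs = "EPSG:4326")

# The same point reprojected to a projected CRS (illustrative choice of UTM zone)
pt_proj = st_transform(pt_geo, "EPSG:32630")

# Query the CRS and check whether coordinates are lon/lat
st_crs(pt_geo)$epsg
st_is_longlat(pt_geo)
st_is_longlat(pt_proj)

# Units of the projected CRS as reported by GDAL
st_crs(pt_proj)$units_gdal
```

Checking `st_is_longlat()` before a geometry operation is a quick way to anticipate whether **sf** will treat distances as meters on a sphere or as units of the projected CRS.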


```{r vector-crs, echo=FALSE, fig.cap="Examples of geographic (WGS 84; left) and projected (NAD83 / UTM zone 12N; right) coordinate systems for a vector data type.", message=FALSE, fig.asp=0.56, fig.scap="Examples of geographic and projected CRSs (vector data)."}
# source("https://github.com/Robinlovelace/geocompr/raw/main/code/02-vector-crs.R")
knitr::include_graphics("figures/02_vector_crs.png")
2 changes: 1 addition & 1 deletion _01-ex.Rmd
@@ -3,7 +3,7 @@ E1. Think about the terms 'GIS'\index{GIS}, 'GDS' and 'geocomputation' described

E2. Provide three reasons for using a scriptable language such as R for geocomputation instead of using a graphical user interface (GUI) based GIS such as QGIS\index{QGIS}.

E3. Name two advantages and two disadvantages of using mature vs recent packages for geographic data analysis\index{geographic data analysis} (for example **sp** vs **sf**\index{sf}, or **raster** vs **terra**).
E3. Think about real-world problems you would like to solve with help from geographic data. Try sketching a workflow, with rectangles representing datasets and ovals representing methods/processes to transform them, with a pen and paper or a digital sketching tool such as Excalidraw.

<!--toDo: rl -->
<!--add solutions!-->
16 changes: 16 additions & 0 deletions geocompr.bib
@@ -225,6 +225,22 @@ @book{blangiardo_spatial_2015
keywords = {\#nosource}
}

@article{bondaruk_assessing_2020,
title = {Assessing the State of the Art in {{Discrete Global Grid Systems}}: {{OGC}} Criteria and Present Functionality},
shorttitle = {Assessing the State of the Art in {{Discrete Global Grid Systems}}},
author = {Bondaruk, Ben and Roberts, Steven A. and Robertson, Colin},
date = {2020-03-01},
journaltitle = {Geomatica},
volume = {74},
number = {1},
pages = {9--30},
publisher = {{NRC Research Press}},
issn = {1195-1036},
doi = {10.1139/geomat-2019-0015},
url = {https://cdnsciencepub.com/doi/abs/10.1139/geomat-2019-0015},
urldate = {2021-08-12}
}

@book{borcard_numerical_2011,
title = {Numerical Ecology with {{R}}},
author = {Borcard, Daniel and Gillet, François and Legendre, Pierre},