Refactor description of sf interfaces, big it up #743

Merged
Robinlovelace merged 15 commits into main from s2-2022 on Feb 5, 2022

Robinlovelace
Collaborator

No description provided.

@Robinlovelace Robinlovelace marked this pull request as ready for review February 4, 2022 11:08
@Robinlovelace
Collaborator Author

This can basically close #705, bringing us close to submission of part 1 for peer review. Is anyone up for adding content to C5 on geometry operations to match this? Not essential but maybe worth mentioning in passing.

@@ -108,7 +108,16 @@ knitr::include_graphics(c("figures/vector_lonlat.png", "figures/vector_projected
```

**sf** is a package providing classes for geographic vector data.
Not only does **sf** supersede **sp**, it also provides a consistent command-line interface to GEOS\index{GEOS} and GDAL\index{GDAL}, superseding **rgeos** and **rgdal** (described in Section \@ref(the-history-of-r-spatial)).
Member

@Robinlovelace one overall comment that came to mind yesterday -- maybe we could limit mentions of sp to a minimum in the main text, and leave it to footnotes (as we have already dropped it from the first edition of the book)...

Not only does **sf** supersede **sp**, it also provides a consistent command-line interface to GEOS\index{GEOS} and GDAL\index{GDAL}, superseding **rgeos** and **rgdal** (described in Section \@ref(the-history-of-r-spatial)).
Not only does **sf** supersede **sp**, it also provides a consistent command-line interface to five important low level libraries for geocomputation:

- GEOS\index{GEOS}, for geometry operations such as calculating buffers and centroids, covered in Chapter \@ref(geometric-operations)
Member

I would suggest a different order of the libraries: start with GDAL, then PROJ, then GEOS and S2. I am also unsure if we should list liblwgeom here, as it is not mentioned in the sf intro message...

Member

I suggested a different order partly to keep GEOS and S2 together, as they have overlapping capabilities but for data with different CRSs.

- [S2](https://s2geometry.io/), a spherical geometry engine written in C++ developed by Google, via the [**s2**](https://r-spatial.github.io/s2/) package, covered in Section \@ref(s2) below and in Chapter \@ref(reproj-geo-data)

Information about these interfaces is provided by **sf** the first time the package is loaded, explaining the meaning of the text `Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1; sf_use_s2() is TRUE` that appears below the `library(sf)` command at the beginning of this chapter.
**sf** is the only package/system *in any language* that provides a unified interface to this wide range of high performance functionality from an interactive command-line environment for reproducible research.
Member

Maybe it would be worth saying "is currently".

```{r vector-crs, echo=FALSE, fig.cap="Examples of geographic (WGS 84; left) and projected (NAD83 / UTM zone 12N; right) coordinate systems for a vector data type.", message=FALSE, fig.asp=0.56, fig.scap="Examples of geographic and projected CRSs (vector data)."}
# source("https://github.com/Robinlovelace/geocompr/raw/main/code/02-vector-crs.R")
knitr::include_graphics("figures/02_vector_crs.png")
```

### Spherical geometry operations with S2 {#s2}
Member

Shouldn't this subsection be in the vector data section? S2 only works on spatial vectors...

@@ -774,6 +782,42 @@ sf::st_crs(df_sf) = 4326
It is fast and reliable at 'casting' geometry columns to different types, a topic covered in Chapter \@ref(geometric-operations).
Benchmarks, in the package's [documentation](https://dcooley.github.io/sfheaders/articles/examples.html#performance) and in test code developed for this book, show it is much faster than the `sf` package for such operations.

### Spherical geometry operations with S2 {#s2}
Member

BTW -- do we mention what "planar geometry" is anywhere in the book?

- GDAL\index{GDAL}, for reading, writing and manipulating a wide range of geographic data formats, covered in Chapter \@ref(read-write)
- GEOS\index{GEOS}, for geometry operations such as calculating buffers and centroids, covered in Chapter \@ref(geometric-operations)
Member

"calculating buffers and centroids on data with a projected CRS" (or similar)?

@Robinlovelace
Collaborator Author

Heads-up @Nowosad I've updated the intro to sf and its interfaces thanks to your comments. As of the most recent commit it says:

**sf** provides classes for geographic vector data and a consistent command-line interface to important low-level libraries for geocomputation:

- PROJ, a powerful library for coordinate system transformations, which underlies the content covered in Chapter \@ref(reproj-geo-data)
- GDAL\index{GDAL}, for reading, writing and manipulating a wide range of geographic data formats, covered in Chapter \@ref(read-write)
- GEOS\index{GEOS}, a planar geometry engine for operations such as calculating buffers and centroids on data with a projected CRS, covered in Chapter \@ref(geometric-operations)
- S2, a spherical geometry engine written in C++ developed by Google, via the **s2** package, covered in Section \@ref(s2) below and in Chapter \@ref(reproj-geo-data)

At the time of writing (2022), **sf** is the only package in any language that provides a unified command line interface (CLI) to this wide range of geographic libraries in an interactive environment for data science.
Information about these interfaces is printed by **sf** the first time the package is loaded: the message `Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1; sf_use_s2() is TRUE` that appears below the `library(sf)` command at the beginning of this chapter tells us the versions of the linked GEOS, GDAL and PROJ libraries (these vary between computers and over time) and whether or not the S2 interface is turned on.
A unique feature of **sf** is that you can switch the default geometry engine used on unprojected data: 'switching off' S2 can be done with the command `sf::sf_use_s2(FALSE)`, meaning that the planar geometry engine GEOS will be used by default for all geometry operations, including those on unprojected data.
As we will see in Section \@ref(s2), planar geometry is based on 2-dimensional space.
Planar geometry engines such as GEOS assume 'flat' (projected) coordinates while spherical geometry engines such as S2 assume unprojected (lon/lat) coordinates.
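
For reviewers who want to try the engine switch described in the text, here is a minimal sketch. It assumes **sf** is installed with its **s2** dependency. The object names (`pts`, `buf_planar`, `buf_spherical`, `s2_before`), the two lon/lat coordinates and the buffer distances are illustrative and not taken from the book. `sf_use_s2()` takes a logical value, and calling it without arguments reports the current setting.

```r
library(sf)

# Versions of the external libraries that this sf installation links to
sf_extSoftVersion()

# Remember the current engine setting so that it can be restored afterwards
s2_before = sf_use_s2()

# Two arbitrary points in unprojected (lon/lat) coordinates
pts = st_sfc(st_point(c(0, 51.5)), st_point(c(2.35, 48.85)), crs = "EPSG:4326")

# S2 off: GEOS treats the coordinates as planar, so dist is in degrees
# (sf emits a warning that lon/lat data is not buffered correctly)
sf_use_s2(FALSE)
buf_planar = st_buffer(pts, dist = 1)

# S2 on: the spherical engine is used, so dist is in metres
sf_use_s2(TRUE)
buf_spherical = st_buffer(pts, dist = 100000)

# Restore the original setting
sf_use_s2(s2_before)
```

The same `st_buffer()` call is interpreted differently depending on the active engine, which is why the sketch stores and restores the original setting rather than leaving S2 switched off.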

@Nowosad
Member

Nowosad commented Feb 5, 2022

👍🏻

@Nowosad
Member

Nowosad commented Feb 5, 2022

Feel free to merge it.

@Robinlovelace
Collaborator Author

Thanks for the review. Merging 🚀

@Robinlovelace Robinlovelace merged commit c3729d9 into main Feb 5, 2022
@Robinlovelace Robinlovelace deleted the s2-2022 branch February 5, 2022 14:15
@jannes-m
Collaborator

jannes-m commented Feb 5, 2022

I would tone this down since, e.g., PostGIS also makes use of GDAL, GEOS and PROJ (among others), and an S2 extension is also available. I would guess the same is true for GRASS, and if it does not support S2, then it most certainly supports an ellipsoidal representation of the Earth. QGIS uses the same libraries and there is also some S2 support (I only did a very quick check). PostGIS, GRASS and QGIS also provide command line interfaces (Python and SQL).

@Robinlovelace
Collaborator Author

Thanks for the feedback @jannes-m. Can you put in a PR or just commit to the main branch that tones it down?

@Robinlovelace
Collaborator Author

PR preferred here as it allows conversation and iteration before making the change.
