should sf_use_s2 be FALSE by default? #2141

Closed
eblondel opened this issue Mar 31, 2023 · 3 comments

Comments

@eblondel

I'm wondering about the rationale behind having sf_use_s2 set to TRUE by default when loading sf:

  • First, this has been introduced later. For backward compatibility, it would make more sense to have it set FALSE by default.
  • sf, a.k.a. Simple Features, although this is not explicitly mentioned in sf, has its origin in the ISO / OGC standard specifications (ISO 19125). This is technology independent, whereas within this standards landscape, which sf implements very well in R, Google's s2 is a technology-dependent feature that AFAIK has nothing to do with the original Simple Features standard. It looks more like a plugin added to sf.
    Based on this, it would be better to have sf_use_s2 set to FALSE by default. What do you think?

Thanks in advance

@edzer
Member

edzer commented Apr 3, 2023

Hi Emmanuel, great question.

> First, this has been introduced later. For backward compatibility, it would make more sense to have it set FALSE by default.

That attitude would put new users, unaware of the history, in the "old" flat-Earth / GIS mode without good reason.

> This is technology independent

sf is an implementation, and an implementation can never be technology independent. The "traditional" open source SF implementation has been (and still is) JTS, which has been translated into C++ as GEOS. In the sp days, rgeos was the GEOS backend doing geometry operations; in sf, the links to GEOS were integrated (essentially for maintenance reasons). I don't think that anything in ISO 19125 tells you that the area of POLYGON((0 89,1 89,1 90,0 90,0 89)), where the coordinates are long/lat geodetic coordinates associated with OGC:CRS84, should be 1 (eh, one what?), yet that is what GEOS gives you (it is unaware of the CRS). Go ask it to draw a buffer of size 1 around that polygon, and try to explain the outcome to a newcomer in spatial data science.
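To make the "one what?" point concrete, here is a minimal base R sketch (no sf, GEOS or s2 involved) that contrasts a planar shoelace area for that ring with the area of the same cell on a sphere. The helper names `planar_area` and `sphere_cell_area` are hypothetical, the mean Earth radius of 6371008.8 m is an assumption, and the spherical formula treats the cell as bounded by meridians and parallels, which is only an approximation of what a spherical geometry engine computes with great-circle edges.

```r
# Ring from the thread as (lon, lat) in degrees; first vertex repeated at the end.
ring <- matrix(c(0, 89,  1, 89,  1, 90,  0, 90,  0, 89), ncol = 2, byrow = TRUE)

# Planar (shoelace) area: treats degrees as if they were Cartesian units.
planar_area <- function(xy) {
  n <- nrow(xy)
  x <- xy[, 1]
  y <- xy[, 2]
  abs(sum(x[-n] * y[-1] - x[-1] * y[-n])) / 2
}

# Area of a cell bounded by two meridians and two parallels on a sphere:
# radius^2 * dlon * (sin(lat2) - sin(lat1)), with angles converted to radians.
# The radius is an assumed mean Earth radius in metres.
sphere_cell_area <- function(lon1, lon2, lat1, lat2, radius = 6371008.8) {
  rad <- pi / 180
  radius^2 * (lon2 - lon1) * rad * (sin(lat2 * rad) - sin(lat1 * rad))
}

area_flat   <- planar_area(ring)               # unit: "square degrees"
area_sphere <- sphere_cell_area(0, 1, 89, 90)  # unit: square metres
```

Under these assumptions the planar result is 1 in a unit that has no physical meaning, while the spherical cell is a thin wedge touching the pole of roughly 108 square kilometres.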

Before 1.0, sf tried to compute distances and areas for geodetic coordinates correctly, using the routines in GeographicLib, but buffers and intersections would still be "wrong" (flat Earth). s2geometry provides an alternative to GEOS that doesn't assume the Earth is flat but does all computations on the sphere, which is a more realistic approximation of the Earth's surface than the flat plane. s2geometry is an open source library, just like GEOS, and is actively used by Google and maintained by its engineers.

The reason that s2 is an independent R package is purely practical; GEOS, GDAL and PROJ could have been set up as independent packages. sf would surely have used package geos if it had been available in 2016.

> it would be better to have sf_use_s2 set to FALSE by default.

See also Chrisman's quote used as the motto of this chapter. Could you describe what you think would go better if we'd continue to propagate the GIS worldview that geodetic coordinates should be treated as Cartesian coordinates?

@eblondel
Author

eblondel commented Apr 3, 2023

Thanks for your answer. On backward compatibility: this "attitude" (as you call it) is about keeping software behavior identical over time, which is a minimum requirement in software engineering, because current users have been building workflows on top of it, and some of those break because the introduction of s2 as the default broke compatibility with previous behavior. There are not only newcomers, there are also oldcomers who build software and analyses on top of sf. I don't think this is an issue for new users, as long as you inform them of the limitations and of ways to mitigate them for specific use cases (working at global scale), and at least set up a transition phase in which you progressively introduce a feature as a plugin, setting it to FALSE by default, and maybe later make it the default, so that people become aware of it and adapt accordingly. But maybe this is what you did, and I might have missed this transition across sf releases.

For the rest, I still think there is a confusion between the name of the package, which makes complete sense within the frame of the standards (since the package has the name of the standard), and what has been extended with s2, in the sense that some processes that do not fail with standard-compliant libraries fail with s2, maybe because the data model behind it does not rely on the ISO/OGC standard. If you read my post carefully, you will see I'm not questioning s2's capability.

I may share some of your arguments related to data projection at global scale, but GIS and spatial data science is far from being bound to global scale, and the entire GIS community extensively uses metric data projections, and for good reasons. Personally, I don't want to claim to question the main foundations of GIS science just because of the global use case (which I know very well because I practice it through international organizations). As for newcomers, they learn spatial data science, and part of this learning also consists of specific data handling and use cases they will discover in time.

Cheers

@edzer
Member

edzer commented Apr 3, 2023

> I may share some of your arguments related to data projection at global scale, but GIS and spatial data science is far from being bound to global scale,

It's not only global scale data, it's also all data that is close to the poles, data crossing the antimeridian, and directional problems (e.g., computing buffers or distances) further away from the equator.
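As a rough illustration of the antimeridian and latitude effects mentioned here, the following base R sketch compares a "flat" distance in degrees with a great-circle (haversine) distance. The function names `haversine_km` and `planar_deg`, the sample coordinates, and the mean Earth radius are assumptions made for this example; haversine is a spherical approximation and not the routine that sf, GEOS or s2 use internally.

```r
# Great-circle distance on a sphere (haversine formula), inputs in degrees.
# The radius is an assumed mean Earth radius in kilometres.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371.0088) {
  rad  <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (lon2 - lon1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# "Flat Earth" distance: Euclidean distance on raw lon/lat, in degrees.
planar_deg <- function(lon1, lat1, lon2, lat2) {
  sqrt((lon2 - lon1)^2 + (lat2 - lat1)^2)
}

# One degree of longitude is always 1 unit on the plane,
# but shrinks on the sphere as latitude increases.
lats <- c(0, 45, 60, 80, 89)
one_degree_km <- haversine_km(0, lats, 1, lats)

# Two points on either side of the antimeridian.
flat_gap      <- planar_deg(179, 0, -179, 0)    # 358 "degrees" apart on the plane
sphere_gap_km <- haversine_km(179, 0, -179, 0)  # only 2 degrees of arc apart
```

On the plane the two antimeridian points appear to be almost a full circle apart, while on the sphere they are neighbours; likewise, a planar buffer of fixed size in degrees covers less and less east-west ground as latitude increases.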

> and the entire GIS community extensively uses metric data projections, and for good reasons.

sf_use_s2(TRUE) didn't change any of that, and I've never discouraged anyone from doing that; I only discouraged the implicit use of plate carrée or equirectangular projections when data are not projected. If you want that, you can do it by using st_transform() or sf_use_s2(FALSE). I don't think it's good as a default (although it still is for plot.sf(), but that might change; pkg tmap, for example, has more sensible defaults).
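The two opt-outs named here, st_transform() and sf_use_s2(FALSE), could look like the sketch below. It assumes sf is installed; the polygon, the choice of UTM zone 31N (EPSG:32631) and the 1000 m buffer distance are made up for illustration and do not come from the thread.

```r
library(sf)

# Hypothetical study area: a one-degree cell in geographic coordinates.
cell <- st_as_sfc("POLYGON((5 50, 6 50, 6 51, 5 51, 5 50))", crs = "OGC:CRS84")

# Option 1: project explicitly, then do planar work in metres.
# EPSG:32631 (UTM zone 31N) is an assumed choice for this longitude range.
cell_utm <- st_transform(cell, "EPSG:32631")
buf_utm  <- st_buffer(cell_utm, dist = 1000)

# Option 2: switch s2 off for a block of code and restore the previous
# setting afterwards. sf_use_s2() invisibly returns the old value when set.
old <- sf_use_s2(FALSE)
s2_state_inside <- sf_use_s2()   # FALSE while the legacy behavior is active
# ... legacy planar operations on long/lat data would go here ...
sf_use_s2(old)
```

Saving and restoring the previous value keeps the switch local to the code that needs the old behavior, so the rest of a session or package is not silently affected.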
