Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New algorithm for planar st_make_valid #1655

Closed
dbaston opened this issue Apr 22, 2021 · 18 comments
Closed

New algorithm for planar st_make_valid #1655

dbaston opened this issue Apr 22, 2021 · 18 comments

Comments

@dbaston
Copy link
Contributor

dbaston commented Apr 22, 2021

GEOS 3.10 will include a new method for making invalid geometries valid. Should this method become the default in sf ? Is there interest in exposing parameters for the "MakeValid" algortithm in sf ? Currently the parameters would be:

  • choice of algorithm
  • whether "collapsed geometries" should be retained (e.g., an invalid polygon that is turned into a line)

The major difference between the two algorithms is the handling of invalid polygons with self-intersections and overlaps Here's some examples of how the new algorithm handles invalid polygonal geometry:

Polygon with self-overlap

image

Old result:

image

Polygon with overlapping holes

image

Old result:
image

MultiPolygon with overlapping elements

image

Old result:
image

Originally posted by @dr-jts in libgeos/geos#433 (comment)

@dbaston
Copy link
Contributor Author

dbaston commented Apr 22, 2021

@dbaston Dan: do you know whether the choice of GEOS_MAKE_VALID_ORIGINAL or GEOS_MAKE_VALID_BUFFERED also runs through GEOSMakeValid_r() controlled say by an environment variable, or only GEOSMakeValidWithOptions_r(). If the former, reverse dependency checking will not involve changes in compiled code in R packages using GEOS, which would let us report quicker. I'm rebuilding the github pramsey fork main-geometry-fixer branch now, having first forgotten to make sure that I had the right branch. However, just running revdeps for rgeos/sf with the existing GEOSMakeValid_r() may be a no-op if that simply uses GEOS_MAKE_VALID_ORIGINAL, so I'd like to be sure that the new code is actually being used.

@rsbivand GEOS does not use environment variables. Currently the only way to see the new behavior would be to use the GEOSMakeValidWithOptions signature.

@rsbivand
Copy link
Member

Thanks! I am now adding a proof-of-concept to rgeos (R-forge) including an environment variable switch to aid testing. If this is helpful, copyng into sf shouldn't be hard.

@rsbivand
Copy link
Member

rsbivand commented Apr 22, 2021

@pramsey: this thread is more specific (transfer from #1649 (comment)). So far in rgeos replicating the examples given (I had to fake two of the three as I couldn't find them in https://docs.google.com/document/d/19YEQS0goSpZlwaYivS6ZpxJ5gRth2gdG6aY2XVd5fcc/edit# or https://github.com/locationtech/jts/blob/master/modules/tests/src/test/resources/testxml/misc/TestInvalidA.xml):

image
image
image

but so far no luck using the options to keep collapsed geometries (the second buffered should also keep collapsed). Please let me know when you push to the branch in your fork.

@dr-jts
Copy link

dr-jts commented Apr 22, 2021

but so far no luck using the options to keep collapsed geometries (the second buffered should also keep collapsed). Please let me know when you push to the branch in your fork.

@rsbivand I'm unclear about your test above - I don't see any obvious collapse.

If you want to test the keepCollapse option, try this geometry:

MULTIPOLYGON (((10 40, 40 40, 40 10, 10 10, 10 40), (16 17, 31 32, 24 25, 16 17)), ((50 40, 50 40, 50 40, 50 40, 50 40)))

Result with keepCollapse:

GEOMETRYCOLLECTION (POINT (50 40), POLYGON ((40 40, 40 10, 10 10, 10 40, 40 40)))

image

Note that keepCollapse keeps entire geometries which have collapsed, but does not keep collapsed holes.

@rsbivand
Copy link
Member

rsbivand commented Apr 23, 2021

After updating to the changed function name and parameter object, I see:

image

which looks about right. sp doesn't like collections of objects of different types, sf is more compliant, but even in the sp object setting this looks OK. SVN rev 653 has completed building, so the rgeos source code will be easier to access from http://download.r-forge.r-project.org/src/contrib/rgeos_0.5-7.tar.gz, but will (of course) fail on any other 3.10.0 than https://github.com/pramsey/geos/tree/main-geometry-fixer (< 3.10.0 is OK). Function help page at https://rgeos.r-forge.r-project.org/reference/topo-unary-gMakeValid.html (updated to add another environment variable to set keep collapsed).

@pramsey
Copy link

pramsey commented Apr 23, 2021

@rsbivand I merged into main, so hopefully that is the last re-naming (sorry, did one more on you) you have to handle. Thanks for taking an early look at the output, @dr-jts will be happy to see the feedback.

@rsbivand
Copy link
Member

rsbivand commented Apr 24, 2021

@pramsey thanks for the heads-up!

With status before this commit, I did not see any differences between original/buffered/buffered-keepCollapsed for the ~200 R packages that are reverse dependencies of rgeos, but did see

"mapmisc" Error in gTopoDim(spgeom2) : class not supported:SpatialCollections
"maptools" Error in rgeos::gSimplify(spgeom = SP, tol = tolerance, topologyPreserve = topologyPreserve) : Geometry collections may not contain other geometry collections

that I don't recall seeing before - in all three 3.10.dev variants. I don't think this is caused by 3.9/3.10 improvements in GEOS, but if it is, this is useful because the previous behaviour was not as intended. So far so good - next is to implement the changes in sf #1649 in relation to s2 integration for spherical geometries. The reverse dependencies of sf will be more informative, because many of those of rgeos are older, so were written before GEOSMakeValid() was available, and most likely do not exercise the function. rgeos at SVN rev. 663 is up to date with the gitea main GEOS repo with LINEWORK/STRUCTURE.

@rsbivand
Copy link
Member

Error analysis: in 3.10.0, there is a change in geometries returned by GEOSSimplify_r():

"mapmisc" Error in gTopoDim(spgeom2) : class not supported:SpatialCollections : Problems in GEOSSimplify_r() returning geometry collections for individual geometries - not a problem in 3.9.1. Resolved by changing the default in rgeos::gSimplify() to use GEOSTopologyPreserveSimplify_r() instead by default if not given specifically by the user or rev-dep package.

"maptools" Error in rgeos::gSimplify(spgeom = SP, tol = tolerance, topologyPreserve = topologyPreserve) : Geometry collections may not contain other geometry collections: Failure was again in GEOSSimplify_r() returning a polygon and a line but the implementation only supported same dimension in and out; resolution, use GEOSTopologyPreserveSimplify_r(). In 3.9.1, GEOSSimplify_r() passes, no failure. The specific example is at: http://maptools.r-forge.r-project.org/reference/thinnedSpatialPoly.html#examples.

Otherwise, no visible changes from GEOSMakeValid_r(), but as noted above, rgeos uses were mostly written pre-3.8.0, so probably don't use GEOSMakeValid_r() anyway. More sf rev-deps do use st_make_valid().

@pramsey
Copy link

pramsey commented Apr 25, 2021

This is kind of odd, there's been very little change in that code line, but there is one commit since 3.9, could this have had the effect you see? libgeos/geos@e6c6fa3#diff-97c48336edbb9a6f90de3e7b4424c22c8a97c9337631ac5cab30f8ff44ad6e3e

@rsbivand
Copy link
Member

rsbivand commented Apr 25, 2021

Possibly, since in the observed cases before 3.10, simplified polygons were returned just as polygons (dimension 2), but with 3.10dev there are geometry collections with polygons and lines unless topology is preserved.

@dr-jts
Copy link

dr-jts commented Apr 25, 2021

This is kind of odd, there's been very little change in that code line, but there is one commit since 3.9, could this have had the effect you see? libgeos/geos@e6c6fa3#diff-97c48336edbb9a6f90de3e7b4424c22c8a97c9337631ac5cab30f8ff44ad6e3e

That could well be the problem. And it's likely important to fix (in JTS and GEOS) since it's (a) a regression and (b) unwanted behaviour. I guess there weren't any unit tests for DPSimplfy that produced that behaviour.

@dr-jts
Copy link

dr-jts commented Apr 25, 2021

Any chance of getting a simple reproducer for the simplify error?

@rsbivand
Copy link
Member

rsbivand commented Apr 26, 2021

Here are two North Carolina NAD27 counties, which return non-polygon output for 3.10dev with topology preserve FALSE, but are OK for topology preserve TRUE for the same tolerance value of 0.05 (for 3.9.1, both TRUE and FALSE pass) for this file:
nc34.zip
In rgeos:

> library(rgeos)
Loading required package: sp
rgeos version: 0.5-7, (SVN revision 666)
 GEOS runtime version: 3.10.0dev-CAPI-1.15.0 
 Linking to sp version: 1.4-6 
 Polygon checking: TRUE 
> nc34 <- readWKT(readLines("nc34.wkt"))
> xx <- gSimplify(nc34, tol=0.05)
> xx <- gSimplify(nc34, tol=0.05, topologyPreserve=TRUE)
> xx <- gSimplify(nc34, tol=0.05, topologyPreserve=FALSE)
output subgeometry 1, row.name: 2
subsubgeometry 0: Polygon
subsubgeometry 1: Polygon
subsubgeometry 2: LineString
Error in gSimplify(nc34, tol = 0.05, topologyPreserve = FALSE) : 
  Geometry collections may not contain other geometry collections
> class(nc34)
[1] "SpatialPolygons"
attr(,"package")
[1] "sp"

image

The output for county 4 alone passes, because it is a valid geometry collection, so 3 and 4 (here base-1 1 & 2, base-0 0 & 1) are both needed to show the geometry collection in geometry collection problem, which is not supported in the R class representation, in which geometry collections can only contain points, lines, polygons and their multi-variants, but not other geometry collections. I think that Currituck National Wildlife Refuge gets reduced to a line string here when topology preserve is FALSE. For 3.9.1, this line is discarded before attempting to convert back to the sp representation:

Screenshot from 2021-04-26 11-00-40

@dr-jts
Copy link

dr-jts commented Apr 26, 2021

Thanks for the reproducer. This is definitely a regression. Reported upstream as GEOS 1115. We'll work on resurrecting the old behaviour.

@dr-jts
Copy link

dr-jts commented Apr 26, 2021

GEOS DouglasPeuckerSimplifier is now fixed to avoid returning mixed collections (libgeos/geos@f3dfbdc)

@rsbivand
Copy link
Member

@dr-jts Thanks, the two failing reverse dependencies are now behaving as before without needing to impose topology preservation. Let's see how quickly sf takes up GEOSMakeValidWithParams(), because more of sf's reverse dependencies use st_make_valid().

edzer added a commit that referenced this issue Jun 22, 2022
@edzer
Copy link
Member

edzer commented Jun 22, 2022

OK, I now see, using @dr-jts ' example above:

library(sf)
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.0; sf_use_s2() is TRUE
"MULTIPOLYGON (((10 40, 40 40, 40 10, 10 10, 10 40), (16 17, 31 32, 24 25, 16 17)), ((50 40, 50 40, 50 40, 50 40, 50 40)))" |> 
  st_as_sfc() |> 
  st_make_valid() |>  # current defaults: geos_method = "valid_structure", geos_keep_collapsed = TRUE
  st_as_text()
# [1] "GEOMETRYCOLLECTION (POINT (50 40), POLYGON ((10 40, 40 40, 40 10, 10 10, 10 40)))"

"MULTIPOLYGON (((10 40, 40 40, 40 10, 10 10, 10 40), (16 17, 31 32, 24 25, 16 17)), ((50 40, 50 40, 50 40, 50 40, 50 40)))" |> 
  st_as_sfc() |> 
  st_make_valid(geos_method = "valid_linework") |>  # should ignore keep_collapsed
  st_as_text()
# [1] "GEOMETRYCOLLECTION (POLYGON ((40 40, 40 10, 10 10, 10 40, 40 40)), MULTILINESTRING ((16 17, 24 25), (24 25, 31 32)), POINT (50 40))"

"MULTIPOLYGON (((10 40, 40 40, 40 10, 10 10, 10 40), (16 17, 31 32, 24 25, 16 17)), ((50 40, 50 40, 50 40, 50 40, 50 40)))" |> 
  st_as_sfc() |> 
  st_make_valid(geos_keep_collapsed = FALSE) |> 
  st_as_text()
# [1] "POLYGON ((10 40, 40 40, 40 10, 10 10, 10 40))"

Two questions:

  1. is adopting the new strategy (valid_structure & keep_collapsed) is a good default? Or should the old strategy be a better default, so we don't get any revdep surprises now?
  2. in the GEOS docs I see for setKeepCollapsed: "When this parameter is not set to 0, the GEOS_MAKE_VALID_STRUCTURE method will drop any component that has collapsed into a lower dimensionality. For example, a ring collapsing to a line, or a line collapsing to a point." Should drop be keep? If that is not the case, then I don't understand the output I get.

@dr-jts
Copy link

dr-jts commented Jun 22, 2022

  1. in the GEOS docs I see for setKeepCollapsed: "When this parameter is not set to 0, the GEOS_MAKE_VALID_STRUCTURE method will drop any component that has collapsed into a lower dimensionality. For example, a ring collapsing to a line, or a line collapsing to a point." Should drop be keep? If that is not the case, then I don't understand the output I get.

Yes, that should be "keep". I'll make the change.

edzer added a commit that referenced this issue Jun 22, 2022
@edzer edzer closed this as completed Mar 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants