Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change GEOSVoronoiDiagram to return edges as MultiLineString #702

Merged
merged 4 commits into from
Nov 22, 2022

Conversation

BuonOmo
Copy link
Contributor

@BuonOmo BuonOmo commented Oct 14, 2022

change

the getDiagramEdges now only returns a MultiLineString, whether there is one or many lines. Which reduces to two the number of possible kind of geometries given by GEOSVoronoiDiagram

reason

I've been working on adding voronoi_diagram to RGeo. And looking at the Shapely PR for voronoi diagram and mine in RGeo, it seems like the GEOSVoronoiDiagram CAPI method causes confusion by returning three possible kind of geometries:

  • GeometryCollection (when onlyEdges is false)
  • MultiLineString (onlyEdges is true and there are at least two lines)
  • LineString (onlyEdges is true and there is only one line)

The parameter onlyEdges changes the outcome, which is ok to reason about. However, the fact that we could have either a LineString or a MultiLineString is way harder to work around in my opinion.

Note for reviewer

It's been year since I haven't made a single line of C++, feel free to fix the implementation! To be honest, I've copied the code from other parts of the library, until one was working 😅

Alternative

We could also wrap the result in a GeometryCollection, to comply to the original description and be even more consistent. But I guess this would be more work here, and less efficient as well

@@ -138,7 +138,15 @@ VoronoiDiagramBuilder::getDiagramEdges(const geom::GeometryFactory& geomFact)
}
std::unique_ptr<geom::Geometry> clipPoly(geomFact.toGeometry(&diagramEnv));
std::unique_ptr<Geometry> clipped(clipPoly->intersection(edges.get()));
return clipped;

if (dynamic_cast<LineString*>(clipped.get())) {
Copy link
Member

@pramsey pramsey Oct 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it will work, but it would be nice to avoid all the reaching into the unique_ptr<> quite so much.

if (clipped->getGeometryTypeId() == GEOS_LINESTRING) {
  std::vector<std::unique_ptr<LineString>> lines;
  lines.emplace_back(clipped.release());
  return geomFact.createMultiLineString(std::move(lines));
} else {
  return clipped;
}

This hasn't been subjected to a compiler or anything, but I think it will work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks for the idea!

@BuonOmo
Copy link
Contributor Author

BuonOmo commented Oct 17, 2022

Actually, after running a benchmark, the alternative is not really slower than the first proposition. Hence I'll change for the alternative. I'll do that in a separate commit, which will be easier to reverse.

Benchmark
///usr/bin/env g++ ${0} -std=c++11 -isystem ../google/benchmark/include -I/usr/local/include -L../google/benchmark/build/src -L/usr/local/lib -lbenchmark -lgeos -o ${0%%.cc} && ${0%%.cc} ; exit $?
#include <benchmark/benchmark.h>

#include <geos/triangulate/VoronoiDiagramBuilder.h>

#include <geos/io/WKTReader.h>
#include <geos/io/WKBReader.h>

using namespace geos::triangulate;
using namespace geos::geom;
using namespace geos::io;

#define bench(name, wkt) static void multiline_##name(benchmark::State& state) { \
  geos::triangulate::VoronoiDiagramBuilder builder; \
  std::unique_ptr<Geometry> geom(readTextOrHex(wkt)); \
  const GeometryFactory& geomFact(*GeometryFactory::getDefaultInstance()); \
  builder.setSites(*geom); \
  for (auto _ : state) { \
    builder.getDiagramEdges(geomFact); \
  } \
} \
BENCHMARK(multiline_##name); \
static void collection_##name(benchmark::State& state) { \
  geos::triangulate::VoronoiDiagramBuilder builder; \
  std::unique_ptr<Geometry> geom(readTextOrHex(wkt)); \
  const GeometryFactory& geomFact(*GeometryFactory::getDefaultInstance()); \
  builder.setSites(*geom); \
  for (auto _ : state) { \
    builder.getDiagramEdgesCollection(geomFact); \
  } \
} \
BENCHMARK(collection_##name);

static std::unique_ptr<Geometry>
readTextOrHex(const char* wkt)
{
    WKTReader wktreader;
    WKBReader wkbreader;
    std::string wktstr(wkt);
    if ("01" == wktstr.substr(0, 2) || "00" == wktstr.substr(0, 2))
    {
        std::stringstream is(wkt);
        return wkbreader.readHEX(is);
    }
    else
        return wktreader.read(wktstr);
}

bench(short, "LINESTRING (10 10, 20 20, 30 30)")
bench(long, "0104000080170000000101000080EC51B81E11A20741EC51B81E85A51C415C8FC2F528DC354001010000801F85EB5114A207415C8FC2F585A51C417B14AE47E1BA3540010100008085EB51B818A20741A8C64B3786A51C413E0AD7A3709D35400101000080000000001BA20741FED478E984A51C413E0AD7A3709D3540010100008085EB51B818A20741FED478E984A51C413E0AD7A3709D354001010000800AD7A37016A20741FED478E984A51C413E0AD7A3709D35400101000080000000001BA2074154E3A59B83A51C413E0AD7A3709D3540010100008085EB51B818A2074154E3A59B83A51C413E0AD7A3709D354001010000800AD7A37016A2074154E3A59B83A51C413E0AD7A3709D35400101000080000000001BA20741AAF1D24D82A51C413E0AD7A3709D3540010100008085EB51B818A20741AAF1D24D82A51C413E0AD7A3709D35400101000080F6285C8F12A20741EC51B81E88A51C414160E5D022DB354001010000802222222210A2074152B81EC586A51C414160E5D022DB354001010000804F1BE8B40DA2074152B81EC586A51C414160E5D022DB354001010000807B14AE470BA2074152B81EC586A51C414160E5D022DB354001010000802222222210A20741B81E856B85A51C414160E5D022DB354001010000804F1BE8B40DA20741B81E856B85A51C414160E5D022DB354001010000807B14AE470BA20741B81E856B85A51C414160E5D022DB35400101000080A70D74DA08A20741B81E856B85A51C414160E5D022DB35400101000080D4063A6D06A20741B81E856B85A51C414160E5D022DB354001010000807B14AE470BA207411F85EB1184A51C414160E5D022DB35400101000080A70D74DA08A207411F85EB1184A51C414160E5D022DB35400101000080D4063A6D06A207411F85EB1184A51C414160E5D022DB3540")

BENCHMARK_MAIN();
Run on (10 X 24.1209 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB
  L1 Instruction 128 KiB
  L2 Unified 4096 KiB (x10)
Load Average: 3.19, 2.77, 2.67
-----------------------------------------------------------
Benchmark                 Time             CPU   Iterations
-----------------------------------------------------------
multiline_short      313247 ns       313187 ns         2235
collection_short     318665 ns       318316 ns         2197
multiline_long      2349749 ns      2349218 ns          294
collection_long     2407711 ns      2406856 ns          291

Personnaly I prefer this option as it is even simpler in the CAPI GEOSVoronoiDiagram usage.

BuonOmo added a commit to BuonOmo/geos that referenced this pull request Oct 17, 2022
**change**

the getDiagramEdges now only returns a GeometryCollection, whether there
is one or many lines. So `GEOSVoronoiDiagram` now can only return a
GeometryCollection itself.

**reason**

I've been working on adding `voronoi_diagram` to RGeo. And looking at
the [Shapely PR for voronoi diagram][1] and [mine in RGeo][2], it seems
like the `GEOSVoronoiDiagram` CAPI method causes confusion by returning
three possible kind of geometries:

- `GeometryCollection` (when `onlyEdges` is false)
- `MultiLineString` (`onlyEdges` is true and there are at least two lines)
- `LineString` (`onlyEdges` is true and there is only one line)

The fact that we could have either a LineString or a MultiLineString is
quite confusing. And wrapping the whole result in a GeometryCollection
is actually [not a significant overhead][3]. Hence the choice of always
returning a GeometryCollection.

[1]: shapely/shapely#881 (comment)
[2]: rgeo/rgeo#334 (comment)
[3]: libgeos#702 (comment)
BuonOmo added a commit to BuonOmo/geos that referenced this pull request Oct 17, 2022
**change**

the getDiagramEdges now only returns a GeometryCollection, whether there
is one or many lines. So `GEOSVoronoiDiagram` now can only return a
GeometryCollection itself.

**reason**

I've been working on adding `voronoi_diagram` to RGeo. And looking at
the [Shapely PR for voronoi diagram][1] and [mine in RGeo][2], it seems
like the `GEOSVoronoiDiagram` CAPI method causes confusion by returning
three possible kind of geometries:

- `GeometryCollection` (when `onlyEdges` is false)
- `MultiLineString` (`onlyEdges` is true and there are at least two lines)
- `LineString` (`onlyEdges` is true and there is only one line)

The fact that we could have either a LineString or a MultiLineString is
quite confusing. And wrapping the whole result in a GeometryCollection
is actually [not a significant overhead][3]. Hence the choice of always
returning a GeometryCollection.

[1]: shapely/shapely#881 (comment)
[2]: rgeo/rgeo#334 (comment)
[3]: libgeos#702 (comment)
@BuonOmo BuonOmo requested a review from pramsey November 7, 2022 12:36
@pramsey
Copy link
Member

pramsey commented Nov 9, 2022

I'm a little concerned about the CI failure, is this rebased to main? I don't really see how your API change could cause the failure, but neither do I want to merge it if it's somehow coming from there....

@dbaston
Copy link
Member

dbaston commented Nov 9, 2022

I think we're ok, the failure was fixed in main at 433b91a

@dr-jts
Copy link
Contributor

dr-jts commented Nov 9, 2022

Does this change the return value to always be an actual GeometryCollection, which contains LineStrings if onlyEdges is true?

If so, I am not in favour of this change. One reason is that MultiLineString is more widely supported than a GeometryCollection in the GEOS API (i.e. by spatial predicates and overlay ops). Also, since it is a subclass of GeometryCollection there should be no difference in downstream processing.

Was the stated intention was to eliminate the possibility of returning a single LineString? This is more reasonable., especially since I'm not sure a Voronoi diagram can ever contain a single edge. Although there are other places in the API which can return either a single LineString or a MultiLineString, so does this really need to a special case?

Note that the getNumGeometries and getGeometryN methods are on the Geometry class. This is a deliberate difference to the SFS standard API, and was done to allow more uniform handling of collections and singleton geometries.

@BuonOmo
Copy link
Contributor Author

BuonOmo commented Nov 10, 2022

@pramsey the failure was in main at the time indeed, so I'll rebase

@dr-jts, both options were considered (and commits for both options are available to be discussed in the branch). I'd be happy to rollback to the previous one.

For the reasoning behind it, I'm not directly using GEOS C++ API, but GEOS C API to maintain its Ruby bindings (RGeo)

As I have seen in the Shapely implementation of voronoi diagram, and as I see now for RGeo, the fact that this method can return this wide range of types is confusing. And the single LineString is even more, as in more dynamic languages where you can directly iterate over the result, it would be counter intuitive to offer iteration over a single geometry.

I'll rollback to the initial proposition of making sure we at least have a collection. Tell me if that works for you or not.

EDIT: You'll find two commits with the two proposal, and the commit messages indicating the reasoning behind. Hopefully the latter is ok with you

**change**

the getDiagramEdges now only returns a GeometryCollection, whether there
is one or many lines. So `GEOSVoronoiDiagram` now can only return a
GeometryCollection itself.

**reason**

I've been working on adding `voronoi_diagram` to RGeo. And looking at
the [Shapely PR for voronoi diagram][1] and [mine in RGeo][2], it seems
like the `GEOSVoronoiDiagram` CAPI method causes confusion by returning
three possible kind of geometries:

- `GeometryCollection` (when `onlyEdges` is false)
- `MultiLineString` (`onlyEdges` is true and there are at least two lines)
- `LineString` (`onlyEdges` is true and there is only one line)

The fact that we could have either a LineString or a MultiLineString is
quite confusing. And wrapping the whole result in a GeometryCollection
is actually [not a significant overhead][3]. Hence the choice of always
returning a GeometryCollection.

[1]: shapely/shapely#881 (comment)
[2]: rgeo/rgeo#334 (comment)
[3]: libgeos#702 (comment)
Improve consistency of GEOSVoronoiDiagram
**change**

the getDiagramEdges now only returns a MultiLineString, whether there is
one or many lines. Which reduces to two the number of possible kind of
geometries given by `GEOSVoronoiDiagram`

**reason**

I've been working on adding `voronoi_diagram` to RGeo. And looking at
the [Shapely PR for voronoi diagram][1] and [mine in RGeo][2], it seems
like the `GEOSVoronoiDiagram` CAPI method causes confusion by returning
three possible kind of geometries:

- `GeometryCollection` (when `onlyEdges` is false)
- `MultiLineString` (`onlyEdges` is true and there are at least two lines)
- `LineString` (`onlyEdges` is true and there is only one line)

The parameter `onlyEdges` changes the outcome, which is ok to reason
about. However, the fact that we could have either a LineString or a
MultiLineString is way harder to work around in my opinion.

[1]: shapely/shapely#881 (comment)
[2]: rgeo/rgeo#334 (comment)
@@ -107,13 +107,13 @@ class GEOS_DLL VoronoiDiagramBuilder {
std::unique_ptr<geom::GeometryCollection> getDiagram(const geom::GeometryFactory& geomFact);

/** \brief
* Gets the faces of the computed diagram as a {@link geom::GeometryCollection}
* Gets the faces of the computed diagram as a {@link geom::MultiLineString}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should say "edges"

@@ -107,13 +107,13 @@ class GEOS_DLL VoronoiDiagramBuilder {
std::unique_ptr<geom::GeometryCollection> getDiagram(const geom::GeometryFactory& geomFact);

/** \brief
* Gets the faces of the computed diagram as a {@link geom::GeometryCollection}
* Gets the faces of the computed diagram as a {@link geom::MultiLineString}
* of {@link geom::LineString}s, clipped as specified.
*
* @param geomFact the geometry factory to use to create the output
* @return the faces of the diagram
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should say "edges"

@@ -107,13 +107,13 @@ class GEOS_DLL VoronoiDiagramBuilder {
std::unique_ptr<geom::GeometryCollection> getDiagram(const geom::GeometryFactory& geomFact);

/** \brief
* Gets the faces of the computed diagram as a {@link geom::GeometryCollection}
* Gets the faces of the computed diagram as a {@link geom::MultiLineString}
* of {@link geom::LineString}s, clipped as specified.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should say "with the edge lines clipped as specified"

@dr-jts
Copy link
Contributor

dr-jts commented Nov 10, 2022

The MultiLineString option looks good to me.

The C API doc needs some work to describe the two potential results. Feel free to take a crack at this, or it can be fixed up later.

@dr-jts
Copy link
Contributor

dr-jts commented Nov 10, 2022

To me it feels like maybe there should be a separate C API function to return Voronoi edges (which is how this is exposed in PostGIS). Maybe that horse has left the barn, though.

@BuonOmo
Copy link
Contributor Author

BuonOmo commented Nov 10, 2022

To me it feels like maybe there should be a separate C API function to return Voronoi edges (which is how this is exposed in PostGIS). Maybe that horse has left the barn, though.

@dr-jts yeah I felt exactly the same way when implementing. Since lots of libraries depend on GEOS CAPI, I did not dare.

I'll try and and fix the documentation, but I'm not sure I'd be the best at this. I'll try tomorrow, if it is not enough, I'll revert so you can merge this before it takes too long. What do you think?

@dr-jts
Copy link
Contributor

dr-jts commented Nov 10, 2022

I'll try and and fix the documentation, but I'm not sure I'd be the best at this. I'll try tomorrow, if it is not enough, I'll revert so you can merge this before it takes too long. What do you think?

Sounds good!

@dr-jts
Copy link
Contributor

dr-jts commented Nov 10, 2022

Another reason for sticking with MultiLineString as the return type for edges is that it's non-breaking change. Switching to a GeometryCollection might break API.

@BuonOmo
Copy link
Contributor Author

BuonOmo commented Nov 10, 2022

Another reason for sticking with MultiLineString as the return type for edges is that it's non-breaking change. Switching to a GeometryCollection might break API.

It is still breaking since it doesn't return a LineString anymore. Although this could arguably be considered a bug fix.

And per documentation, returning a GeometryCollection would actually be already, since that is what is written.

Anyway, I'm fooling around. The MultiLineString choices seems practical, and will be easy enough to explain in RGeo

@BuonOmo BuonOmo removed the request for review from pramsey November 12, 2022 14:09
@BuonOmo BuonOmo requested a review from dr-jts November 12, 2022 14:09
@BuonOmo
Copy link
Contributor Author

BuonOmo commented Nov 20, 2022

@pramsey @dr-jts is there any blocker left?

@robe2
Copy link
Member

robe2 commented Nov 22, 2022

@dr-jts @dbaston @pramsey Is this okay to commit or something else needs to be done?

@dr-jts dr-jts changed the title Improve consistency of GEOSVoronoiDiagram Change GEOSVoronoiDiagram to return edges as MultiLineString Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants