Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

centroid fixups #629

Merged
merged 18 commits into from
Mar 8, 2021
Merged

centroid fixups #629

merged 18 commits into from
Mar 8, 2021

Conversation

michaelkirk
Copy link
Member

@michaelkirk michaelkirk commented Feb 20, 2021

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

  • Fixes the implementation for a MultiLineString with zero length lines
  • impl Centroid for Geometry and it's remaining variants.

FYI: I found these via the JTS test suite runner I'm working on here: https://github.com/michaelkirk/jts-test-runner/pull/1/files

Status:

Copy link
Member

@frewsxcv frewsxcv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran out of time and still need to read through impl<T> Centroid for GeometryCollection<T>, but the rest looks good! Would be great if we had tests for the new implementations, but no worries if you don't have capacity

Cargo.toml Show resolved Hide resolved
.iter()
.flat_map(|line_string| Some(Point::from(*line_string.0.first()?)))
.collect();
MultiPoint(points).centroid()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to explore in the future is extracting the body of the impl Centroid for MultiPoint implementation into a separate function that takes an iterator of Points. Then we can call it here to avoid the Vec allocation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I've gone ahead and pushed a commit to this PR to achieve this.

geo/src/algorithm/centroid.rs Show resolved Hide resolved
}

fn centroid_of_coords<T>(
mut coords_iter: impl Iterator<Item = Coordinate<T>>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered implementing this as impl Centroid for CoordsIter but thought it might be a footgun, since, e.g. Polygon implements CoordsIter, but we'd never want to use the impl CoordsIter for Centroid for a Polygon.

Copy link
Member

@frewsxcv frewsxcv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ty!

geo/src/algorithm/centroid.rs Outdated Show resolved Hide resolved
geo/src/algorithm/centroid.rs Outdated Show resolved Hide resolved
(Dimensions::TwoDimensional, Some(centroid)) => {
let mut centroid_accum = centroid_2d_accum.unwrap_or_default();

// REVIEW: unsigned?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so? It hurts my brain thinking about it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this should be unsigned_area. We don't enforce orientations of polygon, and for centroid, the weights can't be negative (it's a convex combination)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verified that some test cases fail if I switch this to signed_area.

| Geometry::Polygon(_)
| Geometry::MultiPolygon(_)
| Geometry::Rect(_)
| Geometry::Triangle(_) => T::zero(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ever expect this branch to get hit? If not, do you have opinions about using unreachable!() here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's complicated 😅

For Geometry::Point(_), and Geometry::MultiPoint(_), I'm confident that it should never happen and could add an unreachable.

HasDimensions for Rect and Triangle supports degenerating to 1-D.

impl<C: CoordNum> HasDimensions for Triangle<C> {
    ...
    fn dimensions(&self) -> Dimensions {
        if self.0 == self.1 && self.1 == self.2 {
            // degenerate triangle is a point
            Dimensions::ZeroDimensional
        } else if self.0 == self.1 || self.1 == self.2 || self.2 == self.0 {
            // degenerate triangle is a line
            Dimensions::OneDimensional
        } else {
            Dimensions::TwoDimensional
        }
    }
    ...
}

So I guess it's expected for some inputs.

For Polygon/MultiPolygon... it's certainly impossible for valid Polygons (closed linestring with 3+ points), but am not confident that we'd want to crash a production instance.

How about a debug_assert for those?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a debug_assert for those?

Up to you! My preference is to have assertions (or a comment) since it'll indicate to the reader that this is not a branch we expect to get hit, and could surface bugs

@michaelkirk michaelkirk force-pushed the mkirk/centroid-fixups branch 2 times, most recently from 6da8207 to 7343b5d Compare February 23, 2021 19:14
@michaelkirk michaelkirk force-pushed the mkirk/centroid-fixups branch 2 times, most recently from 61eb102 to c3b3ff2 Compare February 25, 2021 23:09
@michaelkirk
Copy link
Member Author

Ok! I've completed my TODOS:

Status:

I rebased, but here are the changes since your last review @frewsxcv:
https://github.com/georust/geo/pull/629/files/4b5ab0862bb06e19501534035414513c9053c1df..c3b3ff2a19a9db14709a708e4a3aeb8436f58675

geo/Cargo.toml Outdated
@@ -34,6 +34,9 @@ use-serde = ["serde", "geo-types/serde"]
[dev-dependencies]
approx = "0.4.0"
criterion = { version = "0.3" }
# jts-test-runner is an internal crate which exists only to be part of the geo test suite.
# As such it's kept unpublished. It's in a separate repo primarily because it's kind of large.
jts-test-runner = { git = "https://github.com/georust/jts-test-runner", commit = "bf873f7858792498b70451c2818c40d4e0975264" }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be controversial... but it seemed like a lot of ceremony to go through the publish dance for something which is basically an internal dependency.

I could be convinced otherwise!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anything, I'd reconsider just including the crate in the main geo repo - It's 15MB of xml.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to note is that the current cargo package size limit is 10mb…

Copy link
Member Author

@michaelkirk michaelkirk Feb 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good to know! My current solution is indeed to just agglomerate the test fixtures into the jts-test-runner crate binary.

I'm including a pretty small subset at this point, but good to know that we may eventually need to do something else.

Still, I'm now more inclined to leave it as a separate crate.

(Dimensions::TwoDimensional, Some(centroid)) => {
let mut centroid_accum = centroid_2d_accum.unwrap_or_default();

// REVIEW: unsigned?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this should be unsigned_area. We don't enforce orientations of polygon, and for centroid, the weights can't be negative (it's a convex combination)

geo/Cargo.toml Outdated
@@ -34,6 +34,9 @@ use-serde = ["serde", "geo-types/serde"]
[dev-dependencies]
approx = "0.4.0"
criterion = { version = "0.3" }
# jts-test-runner is an internal crate which exists only to be part of the geo test suite.
# As such it's kept unpublished. It's in a separate repo primarily because it's kind of large.
jts-test-runner = { git = "https://github.com/georust/jts-test-runner", commit = "bf873f7858792498b70451c2818c40d4e0975264" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm for leaving it as a separate unpublished crate. The Cargo.toml should be rev = <commit hash> (not commit =)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, thanks!

(Dimensions::Empty, _) | (_, None) => continue,
(Dimensions::ZeroDimensional, Some(centroid)) => {
let mut centroid_accum = centroid_0d_accum.unwrap_or_default();
let weight = T::one();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inaccurate IMO. Suppose we have a GeomCollection (MultiPoint with 3 points, MultiPoint with 4points), we would want the centroid to be the same as MultiPoint(the 7 points together). However, the above logic will weight them differently.

I think we should extend our Centroid trait to add a method that returns both the centroid Point, and the total weight of the geometry. The weight is number of points, or the total distance, or total area as per the dimension. It should also simplify our code in many cases were we re-compute the weights.

Copy link
Member Author

@michaelkirk michaelkirk Feb 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooph good catch. I'm going to rework this, but might not have time for a few days.

Thank you so much for catching this mistake!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I believe I've addressed this, but it ended being non-trivial. In the process I found a few more bugs and fleshed out some more tests (almost always the issues were with degenerate geometries).

I think we should extend our Centroid trait to add a method that returns both the centroid Point, and the total weight of the geometry.

I started with this approach, and it would've worked fine, but I felt things ended up cleaner when I bundled the state and logic into a separate private CentroidOperation accumulation struct.

Even in cases where there is a single geometry, we'll still utilize the aggregating operation, which at first seems weird. This is primarily to avoid redundantly having to recompute the centroids weight in the case the geometry does appear in a collection. Since computing the centroid essentially involves computing its weight anyway, there's not much lost by tracking it in this way, except some small extra stack allocations in some cases.

Since I've essentially re-written the trait at this point, it'd be great if I could get another review.

Copy link
Contributor

@rmanoka rmanoka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool work @michaelkirk ! There's only one minor change I've requested in one of the degeneracy handling logic (1-D triangle vs closed line string).

fn add_triangle(&mut self, triangle: &Triangle<T>) {
match triangle.dimensions() {
ZeroDimensional => self.add_coord(triangle.0),
OneDimensional => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems different from how we are handling a degenerate ring. I think we should simply add all three line segments in this case rather than just the longest one. The centroid point itself doesn't change, but the weight would be double what it is now.

These sort of degenerate cases are not really valid as per SFS, so we just have to be internally consistent. Below is a natural test case that users would expect to pass (IMO).

#[test]
fn degenerate_triangle_like_ring() {
    let triangle = Triangle(c(0., 0.), c(1., 1.), c(2., 2.));
    let poly: Polygon<_> = triangle.clone().into();

    let line = Line::new(c(0., 1.), c(1., 3.));

    let g1 = GeometryCollection(vec![triangle.into(), line.into()]);
    let g2 = GeometryCollection(vec![poly.into(), line.into()]);
    assert_eq!(g1.centroid(), g2.centroid());
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a great point.

Plus your suggestion is simpler. =)

I've rebased and pushed up one new commit to address this for Triangles and Rects (which had the same issue).

Of these Geometry cases:
- only Triangle/Rect support collapsing to 1-D
- Point/MultiPoint should truly never be 1-D
- Polygon/MultiPolygon don't *currently* support being 1-D, but I could
  imagine a future that similarly allows invalid Polygons to collapse.
  In that case I'd rather hit a debug_assert and return a reasonable
  value here than crash someone in production.
Introduces a private CentroidOperation to accumulate results.
@michaelkirk
Copy link
Member Author

Thanks for the review @rmanoka!

Since it was a big change, I'm going to leave it open for a couple days before merging in case @frewsxcv or others want to take another look.

@michaelkirk
Copy link
Member Author

bors r+

@bors
Copy link
Contributor

bors bot commented Mar 8, 2021

Build succeeded:

@bors bors bot merged commit 6e37af2 into master Mar 8, 2021
bors bot added a commit that referenced this pull request Apr 13, 2021
639: Introduce the geomgraph module for DE-9IM Relate trait r=michaelkirk a=michaelkirk

- [x] I agree to follow the project's [code of conduct](https://github.com/georust/geo/blob/master/CODE_OF_CONDUCT.md).
- [x] I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users.
---

Fixes #513, #515

(I'm sorry it's so large)

~~I'm going to leave it as a draft (edit: 🤦 I failed to actually open the PR as a draft) while I wait to merge #636 and #638 and then do some rebasing, but I don't anticipate doing other large changes before review.~~ *update: ready for review!*

Here's some of the earlier work in pursuit of this:

#514
#516
#523
#524 
#538
#552
#561
#611
#628
#629
#636 

Primarily, this introduces the geomgraph module for a DE-9IM `Relate` trait.

geomgraph implements a topology graph largely inspired by JTS's module of the same name:
https://github.com/locationtech/jts/tree/jts-1.18.1/modules/core/src/main/java/org/locationtech/jts/geomgraph

You can see some of the reference code if you omit the "REMOVE JTS COMMENTS" commit. In some places the implementation is quite close to the JTS source. 

The overall "flow" is pretty similar to that of JTS, but in the small, there were some divergences. It's not easy (or desirable) to literally translate a Java codebase making heavy use of inheritance and pointers to rust. Additionally, I chose to take advantage of `Option` and rust's enums with associated data to make some case analysis more explicit.

There is a corresponding PR in our [jts-test-runner](georust/jts-test-runner#6) crate which includes the bulk of the tests for the new Relate trait.

## Algorithm Overview 

This functionality is accessed on geometries, via the `Relate` trait, e.g. `line.relate(point)` which returns a DE-9IM [`IntersectionMatrix`](https://en.wikipedia.org/wiki/DE-9IM#Matrix_model).

The `Relate` trait is driven by the `RelateOperation`. The `RelateOperation` builds a `GeometryGraph` for each of the two geometries being related. 

A `GeometryGraph` is a systematic way to organize the "interesting" parts of a geometry's structure - e.g. where its vertices, lines, and areas lie relative to one another.

Once the `RelateOperation` has built the two `GeometryGraph`s, it uses them to efficiently compare the two Geometries's structures, outputting the `IntesectionMatrix`.





Co-authored-by: Michael Kirk <michael.code@endoftheworl.de>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
bors bot added a commit that referenced this pull request Apr 13, 2021
639: Introduce the geomgraph module for DE-9IM Relate trait r=frewsxcv,rmanoka a=michaelkirk

- [x] I agree to follow the project's [code of conduct](https://github.com/georust/geo/blob/master/CODE_OF_CONDUCT.md).
- [x] I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users.
---

Fixes #513, #515

(I'm sorry it's so large)

~~I'm going to leave it as a draft (edit: 🤦 I failed to actually open the PR as a draft) while I wait to merge #636 and #638 and then do some rebasing, but I don't anticipate doing other large changes before review.~~ *update: ready for review!*

Here's some of the earlier work in pursuit of this:

#514
#516
#523
#524 
#538
#552
#561
#611
#628
#629
#636 

Primarily, this introduces the geomgraph module for a DE-9IM `Relate` trait.

geomgraph implements a topology graph largely inspired by JTS's module of the same name:
https://github.com/locationtech/jts/tree/jts-1.18.1/modules/core/src/main/java/org/locationtech/jts/geomgraph

You can see some of the reference code if you omit the "REMOVE JTS COMMENTS" commit. In some places the implementation is quite close to the JTS source. 

The overall "flow" is pretty similar to that of JTS, but in the small, there were some divergences. It's not easy (or desirable) to literally translate a Java codebase making heavy use of inheritance and pointers to rust. Additionally, I chose to take advantage of `Option` and rust's enums with associated data to make some case analysis more explicit.

There is a corresponding PR in our [jts-test-runner](georust/jts-test-runner#6) crate which includes the bulk of the tests for the new Relate trait.

## Algorithm Overview 

This functionality is accessed on geometries, via the `Relate` trait, e.g. `line.relate(point)` which returns a DE-9IM [`IntersectionMatrix`](https://en.wikipedia.org/wiki/DE-9IM#Matrix_model).

The `Relate` trait is driven by the `RelateOperation`. The `RelateOperation` builds a `GeometryGraph` for each of the two geometries being related. 

A `GeometryGraph` is a systematic way to organize the "interesting" parts of a geometry's structure - e.g. where its vertices, lines, and areas lie relative to one another.

Once the `RelateOperation` has built the two `GeometryGraph`s, it uses them to efficiently compare the two Geometries's structures, outputting the `IntesectionMatrix`.





Co-authored-by: Michael Kirk <michael.code@endoftheworl.de>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
@michaelkirk michaelkirk deleted the mkirk/centroid-fixups branch April 16, 2021 19:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants