New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hole intersection at tangential coordinate #10332

Closed
wants to merge 6 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@nknize
Member

nknize commented Mar 31, 2015

OGC SFA 2.1.10 assertion 3 (https://portal.opengeospatial.org/files/?artifact_id=829 ) allows interior boundaries to touch exterior boundaries provided they intersect at a single point. Issue #9511 provides an example where a valid shape is incorrectly interpreted as invalid (a false violation of assertion 3). When the intersecting point appears as the first and last coordinate of the interior boundary in a polygon, the ShapeBuilder incorrectly counted this as multiple intersecting vertices.

The fix required a little more than just a logic check. Passing the duplicate vertices resulted in a connected component in the edge graph causing a false invalid self crossing polygon. This required additional logic to the edge assignment in order to correctly segment the connected components. Finally, an additional hole validation has been added along with proper unit tests for testing valid and invalid conditions (including dateline crossing polys).

closes #9511

[GEO] Fix hole intersection at tangential coordinate
OGC SFA 2.1.10 assertion 3 allows interior boundaries to touch exterior boundaries provided they intersect at a single point. Issue #9511 provides an example where a valid shape is incorrectly interpreted as invalid (a false violation of assertion 3).  When the intersecting point appears as the first and last coordinate of the interior boundary in a polygon, the ShapeBuilder incorrectly counted this as multiple intersecting vertices. The fix required a little more than just a logic check. Passing the duplicate vertices resulted in a connected component in the edge graph causing an invalid self crossing polygon. This required additional logic to the edge assignment in order to correctly segment the connected components. Finally, an additional hole validation has been added along with proper unit tests for testing valid and invalid conditions (including dateline crossing polys).

closes #9511
* Validates only 1 vertex is tangential (shared) between the interior and exterior of a polygon
*/
protected void validateHole(BaseLineStringBuilder shell, BaseLineStringBuilder hole) {
ArrayList exterior = Lists.newArrayList(Sets.newHashSet(shell.points));

This comment has been minimized.

@rjernst

rjernst Mar 31, 2015

Member

Why the intermediate hash set? The hash set will have random iteration order right? So then the retainAll below becomes O(N^2)? Why not just copy the array lists, sort, and then call retainAll?

This comment has been minimized.

@rjernst

rjernst Mar 31, 2015

Member

Actually, it looks like retainAll always uses contains (I was thinking it could be smarter with a sorted list, but of course it has no way to know it is sorted!). So I would copy, sort, then just iterate both in step?

This comment has been minimized.

@nknize

nknize Mar 31, 2015

Member

I noticed HashSet inherits retainAll from AbstractCollection. So I eliminated the intermediate ArrayList and used it directly to avoid both unnecessary memory use and having to sort.

do {
current.coordinate = shift(current.coordinate, shift);
current.coordinate = shift(current.coordinate, shift);

This comment has been minimized.

@rjernst

rjernst Mar 31, 2015

Member

I know this was already here, but could you rename the variable shift to something that doesn't collide with the function name? Maybe datelineShift?

This comment has been minimized.

@rjernst

rjernst Mar 31, 2015

Member

Or datelineOffset?

@nknize

This comment has been minimized.

Member

nknize commented Apr 1, 2015

Thanks for the feedback @rjernst. You're pretty busy with field mappings so maybe @colings86 has some feedback?

@s1monw

This comment has been minimized.

Contributor

s1monw commented Apr 2, 2015

@colings86 I assigned you here

@clement-tourriere

This comment has been minimized.

Be careful, this PR can create an infinite loop while parsing polygon, if two following edges have the same coordinates.

For instance :

{
    "type": "Polygon",
    "coordinates": [
      [
        [
          -50.2734375,
          43.83452678223682
        ],
        [
          -8.7890625,
          53.74871079689897
        ],
        [
          -8.7890625,
          53.74871079689897
        ],
        [
          -8.7990625,
          53.74871079689897
        ],
        [
          -50.2734375,
          43.83452678223682
        ]
      ]
    ]
  }

To detect a closed loop, you can check that current coordinate is different than previous one :
if (visitedEdge.containsKey(current.coordinate)&& !current.coordinate.equals(prev.coordinate)) {

Fix infinite loop for self-loops in edge graph
The PR review process caught an infinite loop for polygons containing duplicate sequential coordinates (excluding start and end points). This created a self loop in the edge graph that caused an infinite loop during traversal. One proposed solution was to check for duplicate points, ignore, and proceed. This violates the OGC SFA definition of a Simple Curve (the base of a LinearRing). Per 6.1.6.1 in v.1.2.1, "A Curve is simple if it does not pass through the same point twice with the possible exception of the two end points...A curve that is simple and closed is a Ring".

Keeping true to "fail early", this commit catches self-loops during creation of the edge and throws an IllegalShapeException. It complys with the OGC spec and avoids any further processing if an invalid ring is provided. One potential downside is the additional rigor it adds to loosely defined GeoJSON.
@nknize

This comment has been minimized.

Member

nknize commented Apr 2, 2015

Many thanks for the catch @clement-tourriere! I suppose this is a good time for us to re-enable our random shape testing.

I've added a commit that catches self-loops during edge creation and throws an InvalidShapeException if found. This does add more rigor to the shape validation, but it also complies with OGC SFA 6.1.6.1 while staying true to the "fail early" design.

@clement-tourriere

This comment has been minimized.

Contributor

clement-tourriere commented Apr 3, 2015

This kind of validation could be problematic. JTS library considers these kind of shapes as valid. So we can't easily know, before sending shapes to ES cluster, which ones will be valid. I really think that ignoring duplicate sequential coordinates (excluding first and last) in ring could be a better solution.

@nknize

This comment has been minimized.

Member

nknize commented Apr 3, 2015

You raise an interesting point re: pre-validation. JTS violates other OGC specs (e.g., orientation) and handles dateline/pole wrapping differently, so I wouldn't necessarily suggest it as ground truth for pre-validation. Many of the JTS issues will be refactored and corrected. In the meantime you can pre-validate using the ES java API prior to sending it to the cluster. Its an extra step but more cost effective than validating with JTS - and you'll get ground truth w.r.t ES expectations.

@clintongormley clintongormley added v1.5.2 and removed v1.5.1 labels Apr 9, 2015

HashSet interior = Sets.newHashSet(hole.points);
exterior.retainAll(interior);
if (exterior.size() >= 2) {
throw new InvalidShapeException("Invalid polygon, interior cannot share more than one point");

This comment has been minimized.

@colings86

colings86 Apr 9, 2015

Member

Would this be clearer as "Invalid polygon, interior cannot share more than one point with the exterior"?

// found a closed loop - we have two connected components so we need to slice into two distinct components
if (visitedEdge.containsKey(current.coordinate)) {
if (connectedComponents > 0 && current.next != edge) {
throw new InvalidShapeException("Shape contains more than one tangential point");

This comment has been minimized.

@colings86

colings86 Apr 10, 2015

Member

Should we make this message a bit more user friendly? I'm not sure may users will know what a tangential point is

@colings86

This comment has been minimized.

Member

colings86 commented Apr 10, 2015

@nknize left one more error message comment but otherwise LGTM

@nknize

This comment has been minimized.

Member

nknize commented Apr 10, 2015

merged

@nknize nknize closed this Apr 10, 2015

@nknize nknize removed the review label Apr 10, 2015

@clintongormley clintongormley added the >bug label Apr 11, 2015

@clintongormley clintongormley changed the title from [GEO] Fix hole intersection at tangential coordinate to Fix hole intersection at tangential coordinate May 29, 2015

@nknize nknize deleted the nknize:fix/9511 branch May 27, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment