Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IsValidOp.isValid(LineString) does not check for repeated points #486

Open
jagill opened this issue Oct 10, 2019 · 10 comments
Open

IsValidOp.isValid(LineString) does not check for repeated points #486

jagill opened this issue Oct 10, 2019 · 10 comments

Comments

@jagill
Copy link

jagill commented Oct 10, 2019

The isValid method for LineStrings does not check for repeated points. This is a condition for validity in the ISO spec (ISO/IEC 13249-3:2016 section 7.2.3), and is caught by ESRI.

IsValidOp only checks if the coordinates are individually valid, and that there are enough points in the LineString (checkTooFewPoints dedups consecutive points):

  private void checkValid(LineString g)
  {
    checkInvalidCoordinates(g.getCoordinates());
    if (validErr != null) return;
    GeometryGraph graph = new GeometryGraph(0, g);
    checkTooFewPoints(graph);
  }

You can check this with "LINESTRING (0 0, 0 1, 0 1, 1 1, 1 0, 0 0)" :

Geometry geom = new WKTReader().read(wkt);
assert(geom.isValid() == false); // FAILS

Note that also geom.isSimple() == true (which is arguably correct).

I'm happy to make a PR to fix this if desired. And if I'm misunderstanding something, let me know!

@dr-jts
Copy link
Contributor

dr-jts commented Oct 10, 2019

That spec is behind a paywall. Do you have a copy?

@dr-jts
Copy link
Contributor

dr-jts commented Oct 10, 2019

This has come up in the PostGIS project as well:

https://trac.osgeo.org/postgis/ticket/3425

In that context I wrote in defence of the current behaviour, based on my interpretation of the OGC Simple Feature Access for SQL (SFS) spec:

https://lists.osgeo.org/pipermail/postgis-users/2010-March/026265.html
https://gis.stackexchange.com/questions/334261/st-isvalid-and-duplicate-vertex/334264#334264

Is the ISO spec more clear on this issue?

Anyway, this behaviour is intentional in JTS. And it's been there forever, so I am disinclined to change it for the default. Perhaps it could be provided as an option to IsValidOp though, for those that require it.

Is this causing an actual problem for you?

@jagill
Copy link
Author

jagill commented Oct 10, 2019

The ISO spec is a licensed product (unfortunately). I'll quote the relevant bit (under fair use) below; constructing the LineString will ultimately call the ST_Points setter quoted. The ISO spec is more clear on the distinction between ST_IsValid vs ST_IsSimple, and a Curve vs a LineString, which I didn't find clear in the open documentation (like what you linked to).

One thing the ISO spec is more clear on is that checking validity means checks on construction of geometries, and mostly are O(N) time and O(1) space checks (with one or two annoying exceptions). Simplicity checks are generally more expensive (like calculating self-intersections). The spec is unclear if a geometry can be invalid but simple: at least with my current reading, it seems like you can have invalid but simple geometries, which is non-intuitive. I appreciate that JTS doesn't fail on invalid geometries; I think it's approach of constructing permissively but allowing a check is the right one.

A second thing is that, viewed as a Curve, consecutive repeated points are not a validity issue. As your argument correctly points out, consecutive repeated points don't affect the embedded image of the curve, which is all Curve validity cares about. However, LineStrings have an explicit prohibition against repeated points in their construction.

This is an issue for me because I'm exploring replacing Presto's use of Esri with JTS (partially motived by packed coordinates), and this would be a non-backwards-compatible change, and one that reduces our compliance with the spec. I don't know if any algorithms would be ill-behaved for these geometries.

On the other hand, consistency with PostGIS is also a goal, so I'd like to keep as close to upstream as possible!

CREATE METHOD ST_Points
(apointarray ST_Point ARRAY[ST_MaxGeometryArrayElements]) RETURNS ST_LineString
FOR ST_LineString
BEGIN
-- If apointarray is the null value, contains null elements, or
-- contains consecutive duplicate points, then raise an exception. CALL ST_CheckConsecDups(apointarray);
-- If SELF is the null value, then return the null value. IF SELF IS NULL THEN
         RETURN CAST (NULL AS ST_LineString);
      END IF;
-- Check that there are no mixed spatial reference -- systems between SELF and apointarray.
IF (CARDINALITY(apointarray) > 0) AND
(SELF.ST_SRID() <> ST_CheckSRID(apointarray)) THEN SIGNAL SQLSTATE '2FF10'
SET MESSAGE_TEXT = 'mixed spatial reference systems'; END IF;
      RETURN
         SELF.ST_PrivateDimension(1).
END

@dr-jts
Copy link
Contributor

dr-jts commented Oct 11, 2019

Thanks for the spec extract. It does seem clear that repeated points are invalid. I assume this applies to LinearRings as well?

There's a few places where this comes into play:

  1. Geometry constructors - should it error in input with repeated points? Generally JTS tries to be lenient about what is put into a geometry by a client. One key reason is because for one thing it's useful to be able to hold invalid geometry, for further processing. This is the PostGIS philosophy as well
  2. Geometry validation - As noted previously, the spec behaviour could be an optional flag (similar to the flag for ESRI-style polygon topology)
  3. Geometry constructions - (i.e. geometries constructed as a result of JTS operations) JTS tries to construct geometry as cleanly as possible. In particular, repeated points should not be emitted (if they are it's a bug, which should be easy to fix)

Do you which of the above is required for your use cases?

it should be straightforward to implement pre-or-post processors to check for repeated points, if that's desired.

@dr-jts
Copy link
Contributor

dr-jts commented Oct 11, 2019

The issue of polygon topology semantics is a much bigger one, because it's harder (code and performance-wise) to check. I assume that the ISO spec is in line with the SFS here? And the ESRI library is not? Is that an issue as well?

@jagill
Copy link
Author

jagill commented Oct 11, 2019

LinearRings are considered LinearStrings and have all the properties/prohibitions inherited from them.

I really like JTS's approach of not erroring out when constructing invalid geometries, and leaving validity checks to isValid(). The permissive construction is important for serializing and deserializing possibly dirty data, as well as for performance reasons. So I'd like option 2, and putting it behind an optional flag works fine.

"Cleaning up" on construction might be ok, but this prevents people from figuring out if the underlying WKT/WKB/etc is invalid. Ie, ST_IsValid(ST_GeometryFromText('LineString(0 0, 1 1, 1 1, 2 2)')) should return false, but if you clean it up it will return true. Is a boolean like cleanupOnConstruction possible for GeometryFactory/CoordinateSequenceFactory?

I'm happy to leave polygon semantics discussion for another time :). In particular, the ISO spec requires polygon validity to include simplicity in the constituent rings, which is very expensive. I view this as an unfortunate part of the spec. SFS does not really distinguish between isValid() and isSimple(); in fact I don't think it defines ST_IsValid at all.

@dr-jts
Copy link
Contributor

dr-jts commented Oct 11, 2019

"Cleaning up" on construction might be ok, but this prevents people from figuring out if the underlying WKT/WKB/etc is invalid.

Sorry, there was a typo in my comment, which made it unclear that point 3 actually refers to geometry constructed by JTS as a result of operations. Hopefully clearer now (see edit above), and not contentious.

I'm happy to leave polygon semantics discussion for another time :). In particular, the ISO spec requires polygon validity to include simplicity in the constituent rings, which is very expensive.

it probably is a bit more expensive to handle simple rings than without this provision. Although the alternative ESRI-style semantic still does not require that valid rings are noded, and thus still requires a relatively expensive check for node points.

SFS does not really distinguish between isValid() and isSimple(); in fact I don't think it defines ST_IsValid at all.

Correct, the SFS does not define IsValid - a glaring omission. It quickly became obvious during JTS development that checking validity was complex, and required for usability.

@jratike80
Copy link

I am not sure if this is off-topic or not, but I found this type of problematic geometry when I tried to understand why ArcMap and ArcGIS stopped rendering a large layer. If the jump in Z is not at the start/end point then the geometry gets rendered. OpenJUMP reports that there are consecutive points but they are consecutive only when it comes to X and Y. I believe that according to SFS this is not a polygon at all because it is not planar. However, I think that geometry validators usually do not care if XYZ polygons are planar or not. Just wondering about what JTS should say about it.

POLYGON Z ((
        257520 6782200 1, 
        257580 6782200 1, 
        257520 6782140 1, 
        257520 6782200 0, 
        257520 6782200 1
    ))

@dr-jts
Copy link
Contributor

dr-jts commented Jun 21, 2022

Just wondering about what JTS should say about it.

Currently JTS validation only checks X and Y. So this example is considered valid.

It seems overlay finicky to not be able to render because of this situation! Aren't vertical lines supported?

@jratike80
Copy link

It seems overlay finicky to not be able to render because of this situation! Aren't vertical lines supported?

Only at that place of geometry (end/start point). In the middle of the ring the vertical segment does not prevent rendering. I tried to make a bug report through the local ESRI support but I do not know if it was filed or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants