Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CQL2 (Text): WKT Definition #800

Closed
eseglem opened this issue Mar 9, 2023 · 4 comments · Fixed by #904
Closed

CQL2 (Text): WKT Definition #800

eseglem opened this issue Mar 9, 2023 · 4 comments · Fixed by #904

Comments

@eseglem
Copy link
Contributor

eseglem commented Mar 9, 2023

I have noticed a few things which may be issues with the WKT definition in the grammar BNF. There are multiple definitions for WKT around, and I don't actually have access to the ISO spec, so it is hard to be certain what is correct and what isn't.

  • point is defined with an optional Z coord: xCoord yCoord [zCoord] but none of the *TaggedText allow for a "Z" in the text. For example: POINT Z (0 0 0) would be the correct (or at least standard) way to write a 3D point in WKT. But would not be valid according to the grammar.
    • Suggestion: Add ["Z"] to each of the *TaggedText definitions.
    • This doesn't actually enforce "Z" matching up with 2D / 3D coordinates, but I was not worried about that part. At least shapely and PostGIS don't care if its there or not for input. But they will include it when outputting WKT.
  • A BBOX does not seem to be a part of actual WKT but it is included in spatialLiteral. So, according to the grammar, you could have GEOMETRYCOLLECTION (BBOX (0 0 0 0), BBOX (0 0 0 0 0 0)) but tools like shapely and PostGIS are not able to parse that.
    • Suggestion: Create a geometryLiteral without bboxTaggedText and use it for geometryCollectionText
  • The grammar does not support EMPTY as an option. So POLYGON EMPTY would not be parseable. This may be intentional, but I could see some value in having it: S_EQUALS("my_geom", POLYGON EMPTY). And if nothing else, it would seem to be a compatibility issue.
    • Suggestion: Add "EMPTY" to each of the *TaggedText definitions.
    • If it were added to *Text that would allow weird things like POLYGON (EMPTY, EMPTY) which also doesn't work.

I don't know if the following are actually incorrect or not but they do seem to lead to compatibility issues. It doesn't appear to be possible to use some values which are valid in the grammar as input to many tools. It is also an issue when trying to go back and forth between cql2-text and cql2-json.

  • Since {"," point} is repeated 0 or more times, the grammar allows for LINESTRING(0 0). But that isn't a valid LineString in GeoJSON and tools like shapely and PostGIS throw an error when trying to parse it. They require at least 2 points.
    • Suggestion: Use lineStringText = "(" point "," point {"," point} ")"; Requiring at least 2 points..
  • The grammar also allows for polygons with 1-3 points. Like POLYGON((0 0, 1 1)) which also cannot be represented by GeoJSON or parsed by other tools. They require at least 4 points. (And the last point needs to be the same as the first, but I don't think that part can be represented in BNF.)
    • Suggestion: Add linearRingText = "(" point "," point "," point "," point {"," point } ")"; to require at least 4 points and use it instead of line polygonText = "(" linearRingText {"," linearRingText } ")";
  • The GeoJSON spec states "implementations SHOULD avoid nested GeometryCollections" so things end up in a weird spot if the WKT allows for it, but the GeoJSON implementation does not.
    • Personally, I would prefer to see it not allow nesting, but it also seems to be valid WKT. So I may be in the wrong here.
    • I also just noticed, while writing this up, that the cql2.json spec doesn't include GeometryCollection at all. I had mostly ignored that part of the spec and have been going directly from the GeoJSON RFC and https://github.com/geojson/schema.

All of that combined would be something like this:

spatialLiteral = geometryLiteral
               | geometryCollectionTaggedText
               | bboxTaggedText;

geometryLiteral = pointTaggedText
                | linestringTaggedText
                | polygonTaggedText
                | multipointTaggedText
                | multilinestringTaggedText
                | multipolygonTaggedText;

emptySet = "EMPTY"

pointTaggedText = "POINT" ["Z"] ( pointText | emptySet  );
linestringTaggedText = "LINESTRING" ["Z"] ( lineStringText | emptySet  );
polygonTaggedText = "POLYGON" ["Z"] ( polygonText | emptySet );
multipointTaggedText = "MULTIPOINT" ["Z"] ( multiPointText | emptySet );
multilinestringTaggedText = "MULTILINESTRING" ["Z"] ( multiLineStringText | emptySet );
multipolygonTaggedText = "MULTIPOLYGON" ["Z"] ( multiPolygonText | emptySet );
geometryCollectionTaggedText = "GEOMETRYCOLLECTION" ["Z"] ( geometryCollectionText | emptySet );

pointText = "(" point ")";
lineStringText = "(" point "," point {"," point} ")";
linearRingText = "(" point "," point "," point "," point {"," point } ")";
polygonText = "(" linearRingText {"," linearRingText} ")";
multiPointText = "(" pointText {"," pointText} ")";
multiLineStringText = "(" lineStringText {"," lineStringText} ")";
multiPolygonText = "(" polygonText {"," polygonText} ")";
geometryCollectionText = "(" geometryLiteral {"," geometryLiteral} ")";
@nmtoken
Copy link

nmtoken commented Mar 9, 2023

the WKT specification is: https://docs.opengeospatial.org/is/18-010r7/18-010r7.html

@eseglem
Copy link
Contributor Author

eseglem commented Mar 10, 2023

the WKT specification is: https://docs.opengeospatial.org/is/18-010r7/18-010r7.html

Unfortunately that is WKT for Coordinate Reference Systems and not geometries. I was referring to ISO/IEC 13249-3:2016. The best I have found are these two:

  • https://www.ogc.org/standard/sfa/ Section 7.2 (Page 52)
    • It has separate definitions for Regular, Z, M, and ZM though. Which may be the most correct, but the suggestions I made simplify it with Regular and Z combined. And ignored M.
  • https://github.com/postgis/postgis/blob/master/doc/bnf-wkt.txt
    • This one has point with optional Z and M and combines them all into a single definition. Which is like what I suggested, but with M included.
    • Includes some extras which aren't included elsewhere. Curve, Ring, Surface, Triangle, etc.

I thought there was more of an issue with how "EMPTY" was being handled in various grammars. Where polygonText = "EMPTY" | ... and multiPolygonText = "EMPTY" | "(" polygonText {"," polygonText} ")" so that would say MULTIPOLYGON(EMPTY, EMPTY) is valid. And it didn't seem like there was consistency in how things like this are handled. I went back to test this more thoroughly, and perhaps I was running into some bugs.

  • It applies to Polygon, MultiPoint, MultiLineString, and MultiPolygon in both of those definitions.
  • Polygon
    • Shapely parses POLYGON (EMPTY, ...)
    • PostGIS will throw an error. (Bug?)
  • MultiPoint, MultiLineString, and MultiPolygon - They both parse
  • Output WKT
    • Shapely outputs EMPTY. (Bug?)
    • PostGIS outputs the (EMPTY, ...)

To match those definitions "EMTPY" should be on *Text vs *TaggedText.

pointTaggedText = "POINT" ["Z"] pointText;
linestringTaggedText = "LINESTRING" ["Z"] lineStringText;
polygonTaggedText = "POLYGON" ["Z"] polygonText;
multipointTaggedText = "MULTIPOINT" ["Z"] multiPointText;
multilinestringTaggedText = "MULTILINESTRING" ["Z"] multiLineStringText;
multipolygonTaggedText = "MULTIPOLYGON" ["Z"] multiPolygonText;
geometryCollectionTaggedText = "GEOMETRYCOLLECTION" ["Z"] geometryCollectionText;

pointText = emptySet | "(" point ")";
lineStringText = emptySet | "(" point "," point {"," point} ")";
linearRingText = emptySet | "(" point "," point "," point "," point {"," point } ")";
polygonText = emptySet | "(" linearRingText {"," linearRingText} ")";
multiPointText = emptySet | "(" pointText {"," pointText} ")";
multiLineStringText = emptySet | "(" lineStringText {"," lineStringText} ")";
multiPolygonText = emptySet | "(" polygonText {"," polygonText} ")";
geometryCollectionText = emptySet | "(" geometryLiteral {"," geometryLiteral} ")";

Unfortunately, that does lead to some compatibility issues with GeoJSON. If you have MULTIPOLYGON (EMPTY, EMPTY) you might try {"type":"MultiPolygon","coordinates":[[],[]]} which is what PostGIS will output. PostGIS will accept it back in but it gets rolled up to "coordinates": [] as GeoJSON and EMPTY as WKT.

So really, everything related to "EMPTY" is a mess.

@clausnagel
Copy link
Member

clausnagel commented Mar 7, 2024

Found this to be open since almost one year. It seems that this issue has not been considered/addressed in the draft that is currently under vote at the SWG? I guess at least 3D coordinates need fixing in the BNF.

@cportele
Copy link
Member

@eseglem - Apologies, we overlooked this issue.

OGC WKT is defined in the normative reference OGC 06-103r4: OpenGIS® Implementation Standard for Geographic information - Simple feature access - Part 1: Common architecture. As far as I know, the SQL MM WKT extends OGC WKT.

On the comments/proposals, I agree with all the issues that you have identified and will create a PR.

The WKT grammar in 7.2.3 of OGC 06-103r4 for 3D coordinates includes a "z". So we should do this in CQL2 as well.

I also would say that we should prohibit the inclusion of a GEOMETRYCOLLECTION in another GEOMETRYCOLLECTION like we do in CQL2 JSON (the PR #902 already adds GeometryCollection to the CQL2 JSON schema).

The only aspect from you list that I am unsure about is the inclusion of EMPTY geometries. The GeoJSON schemas do not support something like {"type":"Point", "coordinates": []}.

If we would allow EMPTY in CQL2 Text, it would still not be possible to represent something like S_EQUALS("my_geom", POINT EMPTY) in CQL2 JSON. I would, therefore, prefer to not include EMPTY in CQL2. I will exclude these changes for now in the PR.

{"type":"Polygon", "coordinates": []} is for some reason allowed, but the semantics are basically undefined as GeoJSON only states a permission how this could be interpreted:

GeoJSON processors MAY interpret Geometry objects with empty "coordinates" arrays as null objects.

PS: By the way, the WKT grammar allows MULTIPOINT(EMPTY, EMPTY), but not POLYGON(EMPTY).

cportele added a commit that referenced this issue Mar 10, 2024
Closes #800. As discussed in #800 (comment), all suggestions are included except the proposal to support EMPTY in WKT geometries.

In addition, the following changes are included:

- change version and document generation to 21-065r1
- add missing requirements for the `basic-spatial-functions-plus` requirements class in the encoding requirements classes
- add missing conditional dependencies in the encoding ATSs
- fix a spelling mistake in the ATS
- fix an incorrect requirement identifier
Features Part 3: Filtering / Common Query Language (CQL2) automation moved this from In review to Done Mar 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging a pull request may close this issue.

4 participants