Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

st_as_text: format of MULTIPOINT #1219

Closed
ateucher opened this issue Dec 16, 2019 · 10 comments · Fixed by #1221
Closed

st_as_text: format of MULTIPOINT #1219

ateucher opened this issue Dec 16, 2019 · 10 comments · Fixed by #1221

Comments

@ateucher
Copy link
Contributor

ateucher commented Dec 16, 2019

I think that format of MULTIPOINT when output as wkt using st_as_text() may not be in a standard format, in that I think each point element should be surrounded by parentheses. E.g.,

MULTIPOINT ((1 2), (3 4))
# vs
MULTIPOINT (1 2, 3 4)

The Simple Features Specification is a little ambiguous, but in the examples (pg 61) it shows it with the inner parentheses. In addition, this stackoverflow answer outlines a good case for the inclusion of inner parentheses.

This has affected me as I have been using a geoserver wfs service to obtain spatial data, and supplying multipoint features to a geometry predicate (e.g., INTERSECTS) without nested parenthesis fails:

library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(httr)

mp_sfc <- sf::st_as_sfc("MULTIPOINT ((1236285 463726.8), (1228264 463547.4))")
mp_sfc
#> Geometry set for 1 feature 
#> geometry type:  MULTIPOINT
#> dimension:      XY
#> bbox:           xmin: 1228264 ymin: 463547.4 xmax: 1236285 ymax: 463726.8
#> epsg (SRID):    NA
#> proj4string:    NA
#> MULTIPOINT (1236285 463726.8, 1228264 463547.4)

mp <- sf::st_as_text(mp_sfc)
mp # no inner parentheses
#> [1] "MULTIPOINT (1236285 463726.8, 1228264 463547.4)"

format_multipoint <- function(x) {
  gsub("(-?[0-9.]+\\s+-?[0-9.]+)(,?)", "(\\1)\\2", x)
}

mp2 <- format_multipoint(mp)
mp2 # inner parentheses added
#> [1] "MULTIPOINT ((1236285 463726.8), (1228264 463547.4))"

# Issue a request to geoserver using the multipoint without parentheses around
# points
a <- GET("https://openmaps.gov.bc.ca/geo/pub/wfs",
          query = list(
            SERVICE = "WFS",
            VERSION = "2.0.0",
            REQUEST = "GetFeature",
            outputFormat = "application/json",
            typeNames = "WHSE_BASEMAPPING.GBA_LOCAL_REG_GREENSPACES_SP",
            SRSNAME = "EPSG:3005",
            CQL_FILTER = paste0("INTERSECTS(SHAPE, ", mp, ")")
          ))

URLdecode(a$url)
#> [1] "https://openmaps.gov.bc.ca/geo/pub/wfs?SERVICE=WFS&VERSION=2.0.0&REQUEST=GetFeature&outputFormat=application/json&typeNames=WHSE_BASEMAPPING.GBA_LOCAL_REG_GREENSPACES_SP&SRSNAME=EPSG:3005&CQL_FILTER=INTERSECTS(SHAPE, MULTIPOINT (1236285 463726.8, 1228264 463547.4))"
a$status_code
#> [1] 400
content(a, as = "text")
#> No encoding supplied: defaulting to UTF-8.
#> [1] "<?xml version=\"1.0\" encoding=\"UTF-8\"?><ows:ExceptionReport xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" xmlns:ows=\"http://www.opengis.net/ows/1.1\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" version=\"2.0.0\" xsi:schemaLocation=\"http://www.opengis.net/ows/1.1 http://openmaps.gov.bc.ca/geo/schemas/ows/1.1.0/owsAll.xsd\">\r\n  <ows:Exception exceptionCode=\"NoApplicableCode\">\r\n    <ows:ExceptionText>Could not parse CQL filter list.\r\norg.geotools.filter.AttributeExpressionImpl cannot be cast to org.locationtech.jts.geom.Coordinate Parsing : INTERSECTS(SHAPE, MULTIPOINT (1236285 463726.8, 1228264 463547.4)).</ows:ExceptionText>\r\n  </ows:Exception>\r\n</ows:ExceptionReport>\r\n"

# Issue a request to geoserver using the multipoint *with* parentheses around
# points
b <- httr::GET("https://openmaps.gov.bc.ca/geo/pub/wfs",
          query = list(
            SERVICE = "WFS",
            VERSION = "2.0.0",
            REQUEST = "GetFeature",
            outputFormat = "application/json",
            typeNames = "WHSE_BASEMAPPING.GBA_LOCAL_REG_GREENSPACES_SP",
            SRSNAME = "EPSG:3005",
            CQL_FILTER = paste0("INTERSECTS(SHAPE, ", mp2, ")")
          ))

URLdecode(b$url)
#> [1] "https://openmaps.gov.bc.ca/geo/pub/wfs?SERVICE=WFS&VERSION=2.0.0&REQUEST=GetFeature&outputFormat=application/json&typeNames=WHSE_BASEMAPPING.GBA_LOCAL_REG_GREENSPACES_SP&SRSNAME=EPSG:3005&CQL_FILTER=INTERSECTS(SHAPE, MULTIPOINT ((1236285 463726.8), (1228264 463547.4)))"
b$status_code
#> [1] 200
b_content <- content(b, as = "text")
read_sf(b_content)
#> Simple feature collection with 2 features and 19 fields
#> geometry type:  POLYGON
#> dimension:      XY
#> bbox:           xmin: 1228180 ymin: 463348.8 xmax: 1236544 ymax: 463796.4
#> epsg (SRID):    3005
#> proj4string:    +proj=aea +lat_1=50 +lat_2=58.5 +lat_0=45 +lon_0=-126 +x_0=1000000 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
#> # A tibble: 2 x 20
#>   id    LOCAL_REG_GREEN… PARK_NAME PARK_TYPE PARK_PRIMARY_USE REGIONAL_DISTRI…
#>   <chr>            <int> <chr>     <chr>     <chr>            <chr>           
#> 1 WHSE…               19 55F - Gr… Local     Green Space      Metro Vancouver 
#> 2 WHSE…               20 50B - Ut… Local     Park             Metro Vancouver 
#> # … with 14 more variables: MUNICIPALITY <chr>, CIVIC_NUMBER <int>,
#> #   CIVIC_NUMBER_SUFFIX <chr>, STREET_NAME <chr>, LATITUDE <dbl>,
#> #   LONGITUDE <dbl>, WHEN_UPDATED <date>, WEBSITE_URL <chr>,
#> #   LICENCE_COMMENTS <chr>, FEATURE_AREA_SQM <dbl>, FEATURE_LENGTH_M <dbl>,
#> #   OBJECTID <int>, SE_ANNO_CAD_DATA <chr>, geometry <POLYGON [m]>

Created on 2019-12-16 by the reprex package (v0.3.0)

cc @boshek

@edzer
Copy link
Member

edzer commented Dec 16, 2019

I agree that the grammar description copied to the SO answer implies double parentheses. Funny enough, GEOS doesn't do it:

> lwgeom::st_astext(st_multipoint(rbind(c(0,1), c(10,11))))
[1] "MULTIPOINT(0 1,10 11)"

but GDAL/OGR does; here exported to GPKG and read in:

/tmp$ ogrinfo x.gpkg x
INFO: Open of `x.gpkg'
      using driver `GPKG' successful.

Layer name: x
Geometry: Multi Point
Feature Count: 1
Extent: (0.000000, 1.000000) - (10.000000, 11.000000)
Layer SRS WKT:
(unknown)
FID Column = fid
Geometry Column = geom
g: Real (0.0)
OGRFeature(x):1
  g (Real) = 1
  MULTIPOINT ((0 1),(10 11))

@ateucher
Copy link
Contributor Author

I noticed that as well... though I thought that the lwgeom::st_astext was a PostGIS function, not GEOS?

Incidentally, I believe that the database which is being served by geoserver is Oracle, and I've seen elsewhere that while some software/databases accept either, Oracle is pickier (e.g., https://viswaug.wordpress.com/2011/02/13/well-known-text-wkt-representation-for-multipoint/).

Do you think it's worth an issue in the postgis repo as well?

@ateucher
Copy link
Contributor Author

In PostGIS they explicitly don't use inner parens: https://github.com/postgis/postgis/blob/54399b9f6b0f02e8db9444f9f042b8d4ca6d4fa4/liblwgeom/lwout_wkt.c#L218-L245. Though looking at the git blame, that was 10 yrs ago, seems the standard has specified the parentheses since then.

@edzer
Copy link
Member

edzer commented Dec 17, 2019

In the sfa document the examples in 6.1.2.6.3 conflict with the grammar and with the example in Table 6. So best we can say is that the standard is ambiguous.

You are right about lwgeom taking the postgis (liblwgeom) function.

@ateucher
Copy link
Contributor Author

ateucher commented Dec 17, 2019

Thanks @edzer, I agree about the ambiguity. What do you think is the best approach for sf? From what I can see, I think the nested parentheses are universally supported, while the bare point coordinate pairs are not.

@edzer
Copy link
Member

edzer commented Dec 17, 2019

If everyone can read the double parens, I see no objection to use them in sf. But we'd have to do a full reverse dependency check before committing to master.

@ateucher
Copy link
Contributor Author

Ok thanks! I'll try to find some time to explore further and put a PR together

ateucher added a commit to ateucher/sf that referenced this issue Dec 18, 2019
@ateucher
Copy link
Contributor Author

Just confirming that while PostGIS's st_astext() returns a MULTIPOINT without nested parens, it does accept them as valid:

CREATE TABLE mytable (
  id SERIAL PRIMARY KEY,
  geom GEOMETRY(Multipoint, 26910),
  name VARCHAR(128)
);
 
-- Add a multipoint (using nested parens)
INSERT INTO mytable (geom, name) 
VALUES (
  ST_GeomFromText('MULTIPOINT ((0 0), (1 1))', 26910), 
  'test'
);

-- Read it back as text
SELECT id, name, st_astext(geom) geomtext FROM mytable;
--  id | name |      geomtext
-- ----+------+---------------------
--   1 | test | MULTIPOINT(0 0,1 1)
-- (1 row)

@ateucher
Copy link
Contributor Author

I have posted a message to the PostGIS mailing list.

ateucher added a commit to ateucher/sf that referenced this issue Dec 18, 2019
- prnt.MULTIPOINT is also used for CIRCULARSTRING and CURVE
@dbaston
Copy link
Contributor

dbaston commented Dec 19, 2019

FWIW, the GEOS WKT parser accepts both forms and refers to the single-paren version as the "deprecated form": https://github.com/libgeos/geos/blob/master/src/io/WKTReader.cpp#L292. This has been the case since 2006.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants