WFS / GML parse issue, but QGIS loads GML as file fine? #45017
Comments
I tried to debug this myself by setting a breakpoint in the part of the parser where the error is returned, and grabbed the output (output.txt).
I was told that the xmltodict module is also Expat-based (if I am correct, the parsing in QGIS is done via the Expat XML library), so I tried:
but that runs fine, with no parse issue?
This is a GeoServer bug, not a QGIS one. GeoServer should either refuse to expose such a layer directly or modify the attributes whose names start with a digit. The name of an XML element must be a valid QName (https://en.wikipedia.org/wiki/QName), which implies that the unqualified (local) part cannot start with a digit. libxml2 rejects b.gml:
The OGR GML driver does too when forced to use Xerces-C:
Similarly if using the OGR GMLAS driver:
Here Xerces-C rejects the DescribeFeatureType response directly (the GMLAS driver is fully schema-aware). So your question is a good one: why does the OGR GML driver, in its default Expat mode, accept the file, while the QGIS QgsGmlStreamingParser, which also uses Expat, emits a "not well-formed (invalid token)" error? The difference is whether Expat runs in namespace-unaware or namespace-aware mode. That can easily be seen when using the Python Expat bindings:
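A minimal sketch of such a check, using only the standard-library `xml.parsers.expat` module. The document below is a hypothetical stand-in for b.gml (the namespace URI and element content are made up); only the digit-leading local name mirrors the reported data. Passing a `namespace_separator` to `ParserCreate` is what turns on namespace-aware parsing.

```python
import xml.parsers.expat as expat

# Hypothetical minimal document: the element's local name starts with a digit,
# like "regelink:220_1_hsi" in the report. The namespace URI is made up.
DOC = b'<ns:220_1_hsi xmlns:ns="http://example.com/ns">x</ns:220_1_hsi>'


def try_parse(namespace_aware):
    """Return None if Expat accepts DOC, otherwise the ExpatError message."""
    # Passing a separator string switches Expat to namespace-aware mode;
    # None keeps the default namespace-unaware mode.
    separator = " " if namespace_aware else None
    parser = expat.ParserCreate(namespace_separator=separator)
    try:
        parser.Parse(DOC, True)
    except expat.ExpatError as exc:
        return str(exc)
    return None


print("namespace-unaware:", try_parse(False))
print("namespace-aware:  ", try_parse(True))
```

With a typical Expat build, the namespace-unaware call should accept the document, while the namespace-aware call should report "not well-formed (invalid token)".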
I'd say that Expat not rejecting the file in namespace-unaware mode could be considered a bug (I'm not sure whether it is intended; perhaps people running in that mode are expected to want laxer checks). I don't think we should try to do anything on the QGIS side about it. Doing so would mean switching the parsing to namespace-unaware mode, which could add potential fragility.
Thanks @rouault for your research and explanation. I created an issue at GeoServer: https://osgeo-org.atlassian.net/jira/software/c/projects/GEOS/issues/GEOS-10231 I'm not really happy that QGIS parses the same output in different ways; it's not very consistent. But the current behaviour at least makes QGIS a little forgiving (in the case of the file, at least). I do wonder whether QGIS could give some more useful information to the average user. A lot of people are not aware of the Log Messages panel, or are just not able to check it. The parser's warning actually points to ':' or 'regelink:', which are fine; it is the characters that follow that are the problem, and that tricked me too. Should I close this one?
What is the bug or the crash?
QGIS fails to show the features of a layer served by a GeoServer WFS.
BUT: when I replay the same request via curl and download the GML, QGIS is fine with it.
QGIS tries to request 1 feature several times, but says the response is not 'well-formed':
Retrying request https://myserver/wfs?SERVICE=WFS&REQUEST=GetFeature&VERSION=2.0.0&TYPENAMES=regelink:polygons&COUNT=1&SRSNAME=urn:ogc:def:crs:EPSG::28992: 3/3
2021-09-10T13:28:36 WARNING Error when parsing GetFeature response : Error: not well-formed (invalid token) on line 1, column 3809
If I use curl to retrieve it, QGIS loads it fine.
Position 3809 falls exactly on the colon (":") in the following string:
regelink:220_1_hsi
See
b.zip (one feature)
and
c.zip (more features)
Note: the attributes in this data start with a number (they come from a PostGIS db); I'm aware this causes trouble.
Note 2: I'm not sure whether "k:220" can denote some UTF code or something similar?
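For reference, a rough way to spot such attribute names before publishing a layer. This is only a sketch: the pattern is a simplified, ASCII-only approximation of the XML NCName rule (the real rule also allows many non-ASCII letters), and `invalid_local_names` is a hypothetical helper, not part of QGIS, GeoServer or PostGIS.

```python
import re

# Simplified, ASCII-only approximation of an XML NCName: a letter or
# underscore first, then letters, digits, underscores, dots or hyphens.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def invalid_local_names(column_names):
    """Return the names that fail the simplified NCName check."""
    return [name for name in column_names if not NCNAME_ASCII.fullmatch(name)]


# Hypothetical column list; only "220_1_hsi" is taken from the report.
print(invalid_local_names(["220_1_hsi", "geom", "_id"]))
```

Any name this returns would need renaming or aliasing before it can serve as the local part of an element name.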
Steps to reproduce the issue
These are 1200 parcels in the Netherlands (EPSG:28992) with attributes whose names start with a number.
Versions
3.16 -> master
Supported QGIS version
New profile
Additional context
No response