QEP 53: WFS provider enhancements #53

Open
rouault opened this issue Jan 26, 2016 · 10 comments

@rouault

rouault commented Jan 26, 2016

QGIS Enhancement 53 (was 35): WFS provider enhancements

Date 2016/01/25
Author Even Rouault (@rouault)
Contact even.rouault at spatialys.com
Status Draft
Funding Land Information New Zealand
Version QGIS 2.16/3.0+

Summary

The purpose of this enhancement proposal is to address a number of existing issues in the WFS provider (listed in the below "Issue Tracking IDs" section), as well as to extend it to support WFS 1.1 and 2.0 protocols. Changes to the QGIS WFS server side are out of scope of this proposal.

Proposed Solution

Due to the scope of changes described below, a substantial rework/rewrite of the WFS provider will be done.

WFS 1.1.0 support

WFS 1.1 lets a server expose, through the "DefaultMaxFeatures" constraint of its GetCapabilities response, the maximum number of features it can return in a single GetFeature response (when it has such a limit). This is implemented, for example, by TinyOWS, MapServer and GeoServer. The value can be used to detect when not all features intersecting the view are available, so that a new GetFeature request is issued when zooming in. Currently the threshold used to detect truncated responses is hardcoded to 500, which is rather arbitrary. Reading DefaultMaxFeatures will solve this.
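
As an illustration only, reading that constraint could be sketched as follows in Python. The sample capabilities document is a hypothetical, heavily trimmed one, the helper names are invented for this example, and the lookup matches elements by local name wherever they appear, which is an assumption about where servers place the constraint.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical, heavily trimmed WFS 1.1.0 capabilities document.
SAMPLE_CAPABILITIES = """<?xml version="1.0"?>
<wfs:WFS_Capabilities version="1.1.0"
    xmlns:wfs="http://www.opengis.net/wfs"
    xmlns:ows="http://www.opengis.net/ows">
  <ows:OperationsMetadata>
    <ows:Operation name="GetFeature">
      <ows:Constraint name="DefaultMaxFeatures">
        <ows:Value>1000</ows:Value>
      </ows:Constraint>
    </ows:Operation>
  </ows:OperationsMetadata>
</wfs:WFS_Capabilities>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def default_max_features(capabilities_xml: str) -> Optional[int]:
    """Return the advertised DefaultMaxFeatures, or None if absent or invalid."""
    root = ET.fromstring(capabilities_xml)
    for elem in root.iter():
        if local_name(elem.tag) == "Constraint" and elem.get("name") == "DefaultMaxFeatures":
            for child in elem:
                if local_name(child.tag) == "Value" and child.text:
                    try:
                        return int(child.text.strip())
                    except ValueError:
                        return None
    return None


def response_may_be_truncated(feature_count: int, limit: Optional[int]) -> bool:
    """A response that reaches the server limit is probably incomplete."""
    return limit is not None and feature_count >= limit
```

With the sample document above, `default_max_features(SAMPLE_CAPABILITIES)` gives the limit that would replace the hardcoded 500.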

The default format for a GetFeature response in WFS 1.1 is GML 3.1. QGIS only supports GML 2 currently. The impact is mainly on the parsing of geometries, whose XML representation can be much more diverse than in GML 2. The QgsOgcUtils class has some support for GML 3 parsing but it is quite embryonic. It is proposed to rely on the OGR_G_CreateFromGML() function, which supports GML 3.1/3.2/3.3. Another important impact of WFS 1.1 is that coordinate axis order is assumed to be the one of the EPSG database, i.e. lat,lon order for geographic SRS (and Y/X for a few projected coordinate systems too). Due to issues in (old) server implementations, and similarly to the WCS and WMS connection configurations, the options "Ignore axis orientation" and "Invert axis orientation" will be offered to fix problems that might arise.
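
One possible reading of how the two options combine is sketched below in Python. The exact semantics are an assumption made for this example (the proposal does not spell them out), and the function names are hypothetical: "ignore" disables the EPSG-mandated swap, and "invert" flips whatever decision results.

```python
from typing import List, Tuple

Coord = Tuple[float, float]


def must_swap_axes(epsg_axes_inverted: bool,
                   ignore_axis_orientation: bool,
                   invert_axis_orientation: bool) -> bool:
    """Decide whether received coordinate pairs must be swapped to get x,y.

    epsg_axes_inverted: True when the EPSG definition of the SRS is lat,lon
    (or Y/X), as is the case for EPSG:4326.
    """
    # Assumption: "ignore" means the server does not honour the EPSG order.
    swap = epsg_axes_inverted and not ignore_axis_orientation
    # Assumption: "invert" flips the decision, whatever it was.
    if invert_axis_orientation:
        swap = not swap
    return swap


def to_xy(coords: List[Coord], swap: bool) -> List[Coord]:
    """Return the coordinates in x,y order, swapping each pair if requested."""
    return [(b, a) for a, b in coords] if swap else list(coords)
```

For a compliant WFS 1.1 server returning EPSG:4326 data, `to_xy([(48.85, 2.35)], must_swap_axes(True, False, False))` yields the pair in lon,lat order.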

Not in scope (among other aspects of the WFS 1.1 spec):

  • WFS-T 1.1.0

WFS 2.0.0 support / filter improvements

All main features of WFS 1.1.0 also apply to WFS 2.0.0. One interesting new feature of WFS 2.0 is the support for paged GetFeature requests, i.e. the client is able to browse through the features matching a filter with multiple GetFeature requests carrying an explicit STARTINDEX parameter, even if the server implements a limit for the maximum number of features returned. The provider will make use of that capability when it is advertised by the server in its GetCapabilities response.
Note: some server implementations pre-dating WFS 2.0, like MapServer 6.4, have paging capabilities in WFS 1.0/1.1, but there is no standard way of advertising it, so this is left out of scope.
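
A minimal sketch of such a paging loop, in Python, is given below. It is not the provider code: the function names and the `fetch_page` callback (which stands in for the HTTP request and GML parsing) are hypothetical, and the stop condition (a page shorter than the requested size) is an assumption.

```python
from typing import Callable, Iterator, List
from urllib.parse import urlencode


def page_url(base_url: str, type_name: str, start_index: int, page_size: int) -> str:
    """Build a WFS 2.0 GetFeature URL for one page of results."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
        "STARTINDEX": start_index,
        "COUNT": page_size,
    }
    return base_url + "?" + urlencode(params)


def iter_features(base_url: str, type_name: str, page_size: int,
                  fetch_page: Callable[[str], List[str]]) -> Iterator[str]:
    """Yield features page by page until a short (or empty) page is received."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    start_index = 0
    while True:
        features = fetch_page(page_url(base_url, type_name, start_index, page_size))
        yield from features
        # Assumption: a page shorter than requested means there is no more data.
        if len(features) < page_size:
            break
        start_index += len(features)
```

Because the function is a generator, a consumer that stops iterating also stops the following page requests, which is the behaviour wanted when a download is cancelled.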

WFS 2.0 servers can also advertise capabilities for doing server-side joins (attribute joins, spatial joins, ...). The provider will be upgraded to allow the user to specify such joins. The closest match to the Filter Encoding 2.0 specification used to send WFS 2.0 queries is a subset of SQL. So the "Filter" column in the list of layers will be replaced by a "SQL" column allowing both filters (WHERE clause) and joins to be defined. The GUI used to define the SQL will have a text zone at its top to enter the statement and, in the lower part, a "wizard" mode similar to the "SQL query builder" of DB Manager, modified so that joins appear explicitly.

wfs_sql

The supported subset of SQL will allow statements like the following :

SELECT colX [AS alias], other.colY FROM mainLayer [[AS]? aliasMainLayer]? [JOIN otherLayer [[AS]? aliasOtherLayer]? ON mainLayer.foo = otherLayer.bar | ST_xxx(mainLayer.foo, ]* [WHERE filter_expr]? [ORDER BY [mainLayer.X [ASC|DESC]]+]?

The tables (layers), columns and functions proposed will be determined from the capabilities exposed by the server.

There are SQL builders in QGIS (for example from QGIS expressions), but no SQL statement analyzer, so it is proposed to use the one of GDAL/OGR, which is already used by the OGR WFS client driver to support the above-mentioned statements. The OGR SQL parser is however not in the public API of OGR (C++ interface only, not guaranteed to be stable), so we will import the relevant code into the QGIS code base (within a dedicated namespace / with function renaming, to avoid symbol clashes).

After adding a new WFS layer to the project, editing the SQL statement will be possible in the layer properties.

Not in scope (among other aspects of the WFS 2.0 spec):

  • Support for using/creating/deleting WFS (server-)stored queries
  • WFS-T 2.0

WFS-T support

The scope of this work doesn't include an upgrade of WFS-T to support WFS-T 1.1 or 2.0. So only WFS-T 1.0 will be supported. This will require the user to explicitly select the WFS 1.0 version (see "Protocol version selection").

Protocol version selection

The version of the protocol chosen by default will be the version advertised by the server in its GetCapabilities response (typically the highest version it supports). The request sent to the server will include the parameter ACCEPTVERSIONS=1.0.0,1.1.0,2.0.0 so that the server advertises a version that QGIS can support. If, for some particular reason (need for WFS-T, defective implementation by the server, or any other compatibility problem), the user wants a particular version to be used, a combobox with the choices Automatic/1.0/1.1/2.0 will be added to the Create/Modify WFS connection dialogs.
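
The negotiation described above could look roughly like the following Python sketch. The function names are hypothetical, and sending a VERSION parameter when the user forces a version is an assumption of this example, not something the proposal states.

```python
from typing import Optional
from urllib.parse import urlencode

SUPPORTED_VERSIONS = ("1.0.0", "1.1.0", "2.0.0")


def get_capabilities_url(base_url: str, forced_version: Optional[str] = None) -> str:
    """Build the GetCapabilities URL for 'Automatic' or a user-forced version."""
    params = {"SERVICE": "WFS", "REQUEST": "GetCapabilities"}
    if forced_version is None:
        # Automatic mode: let the server pick among the versions we can handle.
        params["ACCEPTVERSIONS"] = ",".join(SUPPORTED_VERSIONS)
    else:
        if forced_version not in SUPPORTED_VERSIONS:
            raise ValueError(f"unsupported WFS version: {forced_version}")
        # Assumption: a forced version is requested with the VERSION parameter.
        params["VERSION"] = forced_version
    # Keep commas readable instead of percent-encoding them.
    return base_url + "?" + urlencode(params, safe=",")


def effective_version(advertised_version: str,
                      forced_version: Optional[str] = None) -> str:
    """Return the version to use: the forced one, else the advertised one."""
    if forced_version is not None:
        return forced_version
    if advertised_version not in SUPPORTED_VERSIONS:
        raise ValueError(f"server advertised unsupported version: {advertised_version}")
    return advertised_version
```

A user needing WFS-T would pick 1.0 in the combobox, which corresponds to `forced_version="1.0.0"` here.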

Profile of GML supported : Simple Feature Level 0

WFS 1.1 and 2.0 servers can return GML-encoded features of potentially complex structure, rather than a simple flat structure with a single geometry attribute and properties of simple types (integers, real numbers, strings...). That flat structure is the Simple Feature Level 0 profile of GML: it is the one currently supported by QGIS, and it will remain the target with this QEP.
The exception to this rule is when issuing a WFS 2.0 join request : the GetFeature response is a document with a more complex structure, containing fields from the main and joined layers.

Background GetFeature processing

Currently the result of a GetFeature request must be completely downloaded and parsed before the first feature is returned to the calling layer. This is not ideal as it is blocking for the user. It is proposed that the processing of the GetFeature response is done in a worker thread and that the features are passed to the main thread as it iterates over them. This way the UI will not be blocked when processing large requests (the progress dialog will be made non-modal). Cancelling a download (either through the Cancel button, or through zoom/pan operations requiring a new request) will cancel both the current request and the following ones that would have occurred with WFS 2.0 paging. (Note: non-GUI based mode will still be possible).
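
The worker-thread arrangement can be illustrated with the following Python sketch, which uses a bounded queue between a downloader thread and the iterating thread. This is only an illustration under stated assumptions: the class name, the queue size and the polling timeout are invented for the example, and the real provider would be written in C++ with Qt.

```python
import queue
import threading
from typing import Any, Iterable, Iterator

_END = object()  # sentinel telling the consumer that the download finished


class BackgroundDownload:
    """Hand features from a worker thread to the thread iterating over them."""

    def __init__(self, source: Iterable[Any], max_pending: int = 100):
        self._source = source
        # Bounded queue: the worker waits when the consumer cannot keep up.
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=max_pending)
        self._cancelled = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _put(self, item: Any) -> bool:
        """Queue an item unless cancelled; never block forever on a full queue."""
        while not self._cancelled.is_set():
            try:
                self._queue.put(item, timeout=0.05)
                return True
            except queue.Full:
                continue
        return False

    def _run(self) -> None:
        try:
            for feature in self._source:
                if not self._put(feature):
                    return  # cancelled: stop pulling from the source
        finally:
            self._put(_END)  # no-op if cancelled

    def cancel(self) -> None:
        """Stop the worker and end any iteration in progress."""
        self._cancelled.set()

    def __iter__(self) -> Iterator[Any]:
        while not self._cancelled.is_set():
            try:
                item = self._queue.get(timeout=0.05)
            except queue.Empty:
                continue
            if item is _END:
                return
            yield item
```

If `source` is itself a lazy paged iterator, cancelling stops the worker from pulling further pages, so the following requests are never sent.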

Caching

The WFS provider has 2 modes :

  • the default mode, called "caching mode", where a single GetFeature request is sent without explicit bounding box filter for the whole layer.
  • another mode, "non-caching", where a GetFeature request is sent with a BBOX parameter for the current bounding box.

The UI to select those modes is somewhat confusing as there is a "Cache filter" column (enabled by default) in the layer list and a general checkbox "Only request features overlapping the current view extent" (disabled by default). It appears that the latter, when enabled, overrides the former, but this is confusing for users. It is proposed to remove the "Cache filter" column and to keep only the general checkbox "Only request features overlapping the current view extent".

It is also proposed to offer the possibility to change the caching/non caching mode after adding the layer in the project by editing the layer properties.

One point to be aware of is that the terminology "caching" vs "non-caching" is somewhat misleading, since in both modes the features downloaded from the server in the last GetFeature response are currently kept in memory by the provider (allowing for example faster rendering if the view bounding box does not change in "non-caching" mode). This can become a problem when working with huge layers and servers without a feature count limit (servers that can for example satisfy the request in streaming mode), when the amount of available RAM might not be sufficient to store all features. This issue will also be met in WFS 2.0 mode with paging. It is proposed that the downloaded features be serialized instead into a temporary on-disk file (probably a Spatialite DB). This cache file will not be persisted across sessions. In non-caching mode, its content will be augmented with the new features downloaded from the server each time a new request is needed (note to self: we will have to store BBOXes and distinguish those whose requests have run to completion from those that have only been partially processed, so as to decide when a new download is needed).
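
The bookkeeping mentioned in the note above could be sketched as follows in Python. This is a simplification under explicit assumptions: it uses a plain SQLite temporary file rather than Spatialite, stores point features only, and the class, table and column names are hypothetical.

```python
import os
import sqlite3
import tempfile
from typing import Iterable, List, Tuple

BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Row = Tuple[str, float, float, str]       # (gml_id, x, y, payload)


def contains(outer: BBox, inner: BBox) -> bool:
    """True when 'inner' lies entirely within 'outer'."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


class FeatureCache:
    """Temporary on-disk cache of features plus the BBOXes fully downloaded."""

    def __init__(self) -> None:
        fd, self.path = tempfile.mkstemp(suffix=".sqlite")
        os.close(fd)
        self._db = sqlite3.connect(self.path)
        self._db.execute("CREATE TABLE features "
                         "(gml_id TEXT PRIMARY KEY, x REAL, y REAL, payload TEXT)")
        self._db.execute("CREATE TABLE completed_bbox "
                         "(xmin REAL, ymin REAL, xmax REAL, ymax REAL)")

    def add_features(self, rows: Iterable[Row]) -> None:
        # INSERT OR IGNORE: features seen in an earlier request are not duplicated.
        self._db.executemany("INSERT OR IGNORE INTO features VALUES (?, ?, ?, ?)", rows)
        self._db.commit()

    def mark_completed(self, bbox: BBox) -> None:
        """Record a BBOX only once its request ran to the end (not if cancelled)."""
        self._db.execute("INSERT INTO completed_bbox VALUES (?, ?, ?, ?)", bbox)
        self._db.commit()

    def needs_download(self, bbox: BBox) -> bool:
        """A new request is needed unless a completed BBOX covers this one."""
        done = self._db.execute(
            "SELECT xmin, ymin, xmax, ymax FROM completed_bbox").fetchall()
        return not any(contains(tuple(d), bbox) for d in done)

    def features_in(self, bbox: BBox) -> List[str]:
        cur = self._db.execute(
            "SELECT gml_id FROM features "
            "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ? ORDER BY gml_id",
            (bbox[0], bbox[2], bbox[1], bbox[3]))
        return [row[0] for row in cur]

    def close(self) -> None:
        """Drop the cache: it is not meant to survive the session."""
        self._db.close()
        os.remove(self.path)
```

A request that was cancelled midway never reaches `mark_completed()`, so a later request on the same extent is sent again and only adds the features that are still missing.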

Other changes

The QgsWFSProvider class seems to have had support for a local file mode (as opposed to HTTP/HTTPS WFS connections) working with a .gml file and its accompanying .xsd if present. I did not manage to use it through the QGIS UI, so it looks like dead code. It will be removed with this proposal as a clean-up. Similar functionality can be obtained by opening the .gml file as a vector layer with the OGR GML driver.

Affected Files

At least:

  • src/gui/qgsmanageconnectionsdialog.cpp
  • src/core/qgsgml.*
  • src/core/qgsogcutils.*
  • src/providers/wfs/*

Performance Implications

Discussed in above "Caching" paragraph.

Tests

There are currently no tests in the QGIS test suite for the WFS provider. It is intended to add some, probably using a locally instantiated HTTP server serving predefined XML documents.

Backwards Compatibility

No major issue foreseen at this stage. Existing WFS connections in the configuration will be reused and existing .xml files with exported WFS connections will be imported even if lacking new options. Users that need WFS-T will have to manually select WFS 1.0 version if connecting to a server with WFS 1.1 and/or 2.0 capabilities. Existing projects with WFS layers should still be usable (probably at the expense of not displaying the existing filter due to the change of QGIS expression syntax to SQL)

Issue Tracking IDs

The following issues will be addressed by this proposal:

  1. Panning a non-cached WFS layer causes selection to change - https://hub.qgis.org/issues/10106
  2. WFS client not requesting new features when not-cached - https://hub.qgis.org/issues/9444
  3. New WFS connection option - Max number of features returned - https://hub.qgis.org/issues/9450
  4. WFS client implement simple logic for caching and non-caching - https://hub.qgis.org/issues/14121
  5. WFS layers time out, even though the timeout is not reached - https://hub.qgis.org/issues/9395
  6. $geometry parameter does not expand to layer geometry column name - https://hub.qgis.org/issues/7600
  7. Ensure the OGC filter XML expression contains the GML namespace references - https://hub.qgis.org/issues/14119
  8. Implement Better Filtering for WFS layers - https://hub.qgis.org/issues/14120 (Could make 6 redundant if implemented as proposed)
  9. WFS 2.0 client provider: https://hub.qgis.org/issues/14122
  10. WFS non cached: infinite flashing: http://hub.qgis.org/issues/14156

Documents and links of interest

Votes

(required)

@nyalldawson
Contributor

Hi @rouault,

I don't have a lot of experience with WFS, but it's great to see the undermaintained WFS provider getting some attention. A couple of small comments regarding this:

  • Consider implementing a python provider test which inherits from ProviderTestCase (see eg tests/src/python/test_provider_mssql.py). This will help identify any issues in the provider where handling of different requests differs from the other vector providers
  • I'd also strongly suggest implementing handling of QgsFeatureRequest::FilterFids in QgsWFSFeatureIterator. This will give a HUGE performance boost across lots of areas when using WFS layers.
  • Similarly, I'd suggest implementing an expression compiler for QgsExpression -> WFS filters, so that QGIS expression based filters can be handled on the server side (rather than fetching all features and testing them in QGIS). Have a look at QgsSqlExpressionCompiler, and the implementations in QgsOgrExpressionCompiler, QgsSpatiaLiteExpressionCompiler etc. This will also give a huge performance boost to the provider.

@rouault
Author

rouault commented Jan 26, 2016

@nyalldawson

Consider implementing a python provider test which inherits from ProviderTestCase (see eg tests/src/python/test_provider_mssql.py). This will help identify any issues in the provider where handling of different requests differs from the other vector providers

Thanks for the pointer. Looking at the runGetFeatureTests() method, due to the large number of tests, it might require a real WFS server especially, which might be impractical. I'd thought rather to have a dummy server returning a few prepared responses. Or perhaps have a dual strategy: rather extensive tests requiring to setup a real WFS server (that would be optionally run), and more simple ones that could be run without requiring a WFS server.

I'd also strongly suggest implementing handling of QgsFeatureRequest::FilterFids in QgsWFSFeatureIterator. This will give a HUGE performance boost across lots of areas when using WFS layers.

Which actions can trigger such type of filter ? This should be implementable through a FeatureId (1.0) /GmlObjectId (1.1) /ResourceId (2.0) filter
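
For illustration, building such an identifier filter per protocol version might look like the Python sketch below. The function name is hypothetical, the output is a bare filter fragment rather than a full request, and namespace handling is simplified.

```python
from typing import Sequence
from xml.sax.saxutils import quoteattr


def id_filter(version: str, feature_ids: Sequence[str]) -> str:
    """Return a filter selecting features by identifier for a WFS version."""
    if not feature_ids:
        raise ValueError("at least one feature id is required")
    if version.startswith("1.0"):
        # Filter Encoding 1.0: FeatureId with a 'fid' attribute.
        items = "".join(f"<ogc:FeatureId fid={quoteattr(i)}/>" for i in feature_ids)
        return f'<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">{items}</ogc:Filter>'
    if version.startswith("1.1"):
        # Filter Encoding 1.1: GmlObjectId with a 'gml:id' attribute.
        items = "".join(f"<ogc:GmlObjectId gml:id={quoteattr(i)}/>" for i in feature_ids)
        return ('<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc" '
                f'xmlns:gml="http://www.opengis.net/gml">{items}</ogc:Filter>')
    if version.startswith("2.0"):
        # Filter Encoding 2.0: ResourceId with a 'rid' attribute.
        items = "".join(f"<fes:ResourceId rid={quoteattr(i)}/>" for i in feature_ids)
        return f'<fes:Filter xmlns:fes="http://www.opengis.net/fes/2.0">{items}</fes:Filter>'
    raise ValueError(f"unsupported WFS version: {version}")
```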

Similarly, I'd suggest implementing an expression compiler for QgsExpression -> WFS filters

There's already one implemented in QgsOgcUtils::expressionToOgcFilter() (will require work to support new WFS versions) used currently at layer creation when the user defines the layer. I guess this will have to be used again for filters defined afterwards and combined with the initial filter. But not all QGIS expressions can be turned into valid WFS server filters, so client-side fallback will be sometimes needed.

@NathanW2
Member

Who is the main owner of the WFS code?

@nyalldawson
Contributor

I'd also strongly suggest implementing handling of QgsFeatureRequest::FilterFids in QgsWFSFeatureIterator. This will give a HUGE performance boost across lots of areas when using WFS layers.
Which actions can trigger such type of filter ? This should be implementable through a FeatureId (1.0) /GmlObjectId (1.1) /ResourceId (2.0) filter

Lots - mostly related to operations on a set of selected features. Eg, zooming to selected features, saving selected features, processing operations which work on selected features.

@rouault
Author

rouault commented Jan 26, 2016

Lots - mostly related to operations on a set of selected features. Eg, zooming to selected features, saving selected features, processing operations which work on selected features.

OK, so I think that in practice if the features are already selected, they will be already locally cached, and no extra server query will be needed.

@nyalldawson
Contributor

There's already one implemented in QgsOgcUtils::expressionToOgcFilter() (will require work to support new WFS versions) used currently at layer creation when the user defines the layer. I guess this will have to be used again for filters defined afterwards and combined with the initial filter. But not all QGIS expressions can be turned into valid WFS server filters, so client-side fallback will be sometimes needed.

That's exactly how the existing expression feature request filters work - they hand off as much work as possible to the server, and then implement any further checks as a fallback to match the behaviour of QGIS expressions. Eg, the spatialite provider can't hand off case-sensitive matches, so it performs a case-insensitive match server side and then performs an extra filter on the QGIS side to strip out non case-matching results. It still results in far fewer features sent from the provider -> QGIS to handle.

@nyalldawson
Contributor

Oops - pressed the wrong button :)

@rouault rouault changed the title QEP 35 (number TBC): WFS provider enhancements QEP 35: WFS provider enhancements Mar 11, 2016
rouault added a commit to rouault/QGIS that referenced this issue Apr 5, 2016
First part of qgis/QGIS-Enhancement-Proposals#53
(QEP 35: WFS provider enhancements)

Improvements:
- Version autodetection
- On-disk caching of downloaded features
- Background download and progressive rendering
- WFS 1.1 and 2.0 support
- WFS 2.0 GetFeature paging
- Add provider tests

Fixes:
- qgis#10106: Panning a non-cached WFS layer causes selection to change
- qgis#9444: WFS client not requesting new features when not-cached
- qgis#14156: WFS non cached: infinite flashing
- qgis#9450 : New WFS connection option - Max number of features returned
- qgis#14122: Implement WFS 2.0 client provider (partial. no joins or stored queries)

Not in scope: WFS-T 1.1 and 2.0. But WFS-T 1.0 kept (and tested)
@NathanW2
Member

+1 This looks really good to me. Thanks to LINZ for funding the work!

@NathanW2
Member

@rouault will this document change much? If not, I will apply the Final Draft tag.

@rouault
Author

rouault commented Apr 11, 2016

@NathanW2 There will likely be some adjustments

rouault added a commit to rouault/QGIS that referenced this issue May 12, 2016
…ovements

Second part of qgis/QGIS-Enhancement-Proposals#53
(QEP 35: WFS provider enhancements)

- URI parameter with sql with SELECT / FROM / JOIN / WHERE / ORDER BY clauses
- handle WFS 2.0 joins
- handle DateTime fields
- enable "Only request features overlapping the view extent" by default (and memorize the settings)
- rework DescribeFeatureType parsing to handle responses with several documents, and some support for attribute types being complexType
- rework feature transfer between downloader and iterator so as to avoid uncontrolled RAM usage when the iterator cannot keep up with the downloader
- turn on WAL journaling for better reader / writer concurrency
- add retry logic based on the 'Max retry in case of tile request errors' setting (renamed 'Max retry in case of tile or feature request errors')
- error to MessageBar in case of failed download
- in progress dialog, add a "Hide" button to mask the dialog
- improve automated testing
- add testing of the GUI of QgsWFSSourceSelect
@NathanW2 NathanW2 changed the title QEP 35: WFS provider enhancements QEP 53: WFS provider enhancements Jan 29, 2017