Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WFS maxFeatures support broken in many cases. #4011

Closed
mapserver-bot opened this issue Apr 4, 2012 · 61 comments
Closed

WFS maxFeatures support broken in many cases. #4011

mapserver-bot opened this issue Apr 4, 2012 · 61 comments
Assignees
Milestone

Comments

@mapserver-bot
Copy link

Reporter: sdlime
Date: 2011/09/06 - 14:48
Trac URL: http://trac.osgeo.org/mapserver/ticket/4011
Passing a value for maxFeatures in a WFS request doesn't work in some cases. This is because of how that filtering is done relative to FILTER processing. Since FILTERs are used increasingly for both attribute and some spatial filters (e.g. intersect) this becomes a problem. The layerObj maxfeatures property is used...

Presently it's up to a driver to apply maxfeatures, for example, PostGIS sets a limit on the SQL statement. Not all drivers implement this so it's impossible to use maxFeatures in some cases. A better solution would be apply feature counting in the query and drawing functions that request geometries from drivers. This allows for all drivers to be supported and makes it possible to count results after all requisite filters have been applied.

Steve

@mapserver-bot
Copy link
Author

Author: myOpenLayersUName
Date: 2011/09/06 - 21:22
I have compiled a version of MapServer for Linux using the patches attached to this ticket and tested the WFS queries I have been previously unable to perform against Shapefiles.

I was able to successfully perform the queries with filters and maxFeatures.

Yay!!!! You can't see it, but I'm doing a 'very' 'happy dance' in my cubicle!

@mapserver-bot
Copy link
Author

Author: adube
Date: 2011/09/26 - 15:54
I just tried this patch and it solved my issue as well : the filters of my queries are performed before the maxFeatures, thus now behaves as expected. Thanks sdlime for the patch.

@mapserver-bot
Copy link
Author

Author: assefa
Date: 2011/09/26 - 16:33
Hi Steve,

Pagination support with WFS: the wfs request can send a startindex and a maxfeatures. The startindex is an integer value the allows users to paginate through a set of results.

Current support is done at 2 levels:

  • for the oracle driver, the layer->startindex element is taken into account to do the queries.
  • for other drivers: msGMLWriteWFSQuery is where we pass through the elements until we get to the start index and output from there to the max features (if set).

With the patch proposed in this bug, we also need to take into account the startindex and apply the logic in the query and draw functions.
Note also that removing the logic at the driver side and put it higher up might end up not being efficient when you deal with a big set of results (which is usually the case for clients using pagination)

@mapserver-bot
Copy link
Author

Author: sdlime
Date: 2011/09/29 - 06:12
I'd appreciate any ideas on how to efficiently support both. Pagination support is more uneven than max results. Kinda seems like if both pagination AND max results are set then the drivers that support pagination would produce a result set with max results or less and would be unaffected by my patch. I mean, in PostGIS if you set a start value the local result set still starts at 0 doesn't it?

If no pagination is necessary then the drivers wouldn't limit things on their side and would let MapServer do it (as in the patch).

Steve

@mapserver-bot
Copy link
Author

Author: aboudreault
Date: 2011/10/04 - 15:44
Steve, we tried the patch on a development server and it seems that it introduced a new bug. If the startIndex is >0, the maxfeature is not taken into account.

@mapserver-bot
Copy link
Author

Author: sdlime
Date: 2011/10/04 - 17:16
Replying to [comment:7 aboudreault]:

Steve, we tried the patch on a development server and it seems that it introduced a new bug. If the startIndex is >0, the maxfeature is not taken into account.

Which driver? I may have been a bit overzealous in my gutting of the Oracle and/or PostGIS code. My thinking now is that we'd only use maxfeatures IF start index is set as well.

Steve

@mapserver-bot
Copy link
Author

Author: aboudreault
Date: 2011/10/04 - 19:32
We are using postgis.

@mapserver-bot
Copy link
Author

Author: rouault
Date: 2011/10/04 - 21:39
(in case it might help, or confuse ;-), I've read a possibly related discussion on GeoServer side : see the "Paging" paragraph at http://geoserver.org/display/GEOS/GSIP+61+-+WFS+2.0 )

@mapserver-bot
Copy link
Author

Author: sdlime
Date: 2011/10/06 - 21:17
Replying to [comment:9 aboudreault]:

We are using postgis.

Try backing out the patch for mappostgis.c and test again. It rally shouldn't matter if the driver tries to honor maxfeatures.

Steve

@mapserver-bot
Copy link
Author

Author: aboudreault
Date: 2012/02/07 - 00:17
Steve, the performance impact is considerable. Oracle and PostGIS both support a maxfeature and index parameters. It looks to me that it's a way better to keep those drivers using the native method of doing this...

@mapserver-bot
Copy link
Author

Author: aboudreault
Date: 2012/02/07 - 00:28
well, I'm not aware of the other issues... so I will assume that we really can't use those features from the db. I'll attach my modified patch that fixed my issue (maxfeatures ignored if startindex is set but the driver doesn't support it) Please, review and let me know if you and I should commit.

@mapserver-bot
Copy link
Author

Author: aboudreault
Date: 2012/02/07 - 00:37
note that I wasn't really sure of the use of the 3 if statements (maxfeatures>0 && startindex<0, etc.).. that's what I've modified/simplified. Take a look at it and let me know if you see something wrong for the desired behavior.

@mapserver-bot
Copy link
Author

Author: sdlime
Date: 2012/02/07 - 16:47
This looks good at first glance. A better solution than my original patch. I missed the layer supports paging API function originally. Have you been able to test across providers with both start index and max features?

Steve

@mapserver-bot
Copy link
Author

Author: aboudreault
Date: 2012/02/07 - 17:19
I've tested postgis and shapefiles providers. I do not have any other providers here.

@mapserver-bot
Copy link
Author

Author: aboudreault
Date: 2012/02/07 - 18:39
Steve, I see that the startindex is handled in the drivers... shouldn't it be handled in mapserver too? I guess we will experience issues when using maxfeature ''and'' startindex if they are not applied at the same level.. I would suggest to put this in 6.0.3... to be sure to not break 6.0.2.

@mapserver-bot
Copy link
Author

Author: aboudreault
Date: 2012/02/07 - 20:46
for the record, here is how pagination should be done in ORACLE (provided by mdsmith):

select * 
  from ( select a.*, rownum rnum
           from ( YOUR_QUERY_GOES_HERE -- including the order by ) a
          where rownum <= MAX_ROWS )
 where rnum >= MIN_ROWS

@mapserver-bot
Copy link
Author

attachment http://trac.osgeo.org/mapserver/attachment/ticket/4011/4011.patch :

   Patch against branch-6-0 for query side of things...

@mapserver-bot
Copy link
Author

@ghost ghost assigned sdlime Apr 5, 2012
@sdlime
Copy link
Member

sdlime commented May 15, 2012

Been thinking about this more. I'm not convinced at this point we can do this on the driver side unless the driver fully supports common expressions (used for attribute and shape queries). The reason is that we don't know that a feature is in the result set unless is passes filtering in the query functions. The exception might be bounding box queries where the bbox is applied solely on the driver side.

I'm not sure the performance will be that bad. Ok, it will be with paging as you reach the latter pages but that's unavoidable unless everything can be done on the driver side. Setting max features is the far more useful use case and there is really no penalty doing that check on the query function side since you don't have to send extra features over the wire.

Personally I'd like to drop paging support in 6.2. It flat out doesn't work except for the most basic (bbox) query type. If we can limit paging to that case then that would be ideal- don't know if that's possible.

I had envisioned that drivers would be able to examine filters to decide if they could support them, there's a layer API function for that purpose. If a driver supported a filter and supported paging then we could things on the driver side, otherwise it should be purely on the MapServer side.

Steve

@ghost ghost assigned aboudreault May 16, 2012
@aboudreault
Copy link
Member

cc @sdlime , I begin to check this ticket

@aboudreault
Copy link
Member

@sdlime, do you have a test case I could use to be sure I understand the isse and the desired result. Something that use shapefile and/or postgis with FILTERS and good examples of the issue with the expressions?

@sdlime
Copy link
Member

sdlime commented May 17, 2012

Here's the issue, let's say a user issues a query (either straight MapServer or via WFS) to intersect a geometry and apply an attribute filter against a PostGIS-based layer. The problem is that the full query isn't executed in PostGIS. All we do in PostGIS is a bbox query to get candidate shapes and the other tests are done within MapServer. In a case like this you can't do the maxresults or paging computations on the database side since you have no idea if geometry met the filter criteria. See what I mean? It's always been this way except for a couple of simple cases (Assefa would be able to detail).

Complex filters are really quite expensive in this regard because we basically have to process every feature on the MapServer side. Using something like a shapefile or other local data this isn't that big a deal, sucks for network-based resources. Really need to work on a Common Filters => SQL capability to push everything to the database whenever possible. I think we need to develop an RFC.

I'm of the opinion that we should keep this out of the 6.2 release. The patches would still be available for users not using PostgreSQL/Oracle.

Steve

@aboudreault
Copy link
Member

I think I understand. The issue cannot be fixed easily. The only easy solution would be to do what you wanted initially... handling the maxfeature/pagination in mapserver. However, this will highly affect the performance of database backends. I agree that we should keep this out of 6.2 and write a RFC.

@sdlime
Copy link
Member

sdlime commented May 17, 2012

I think it only highly affects performance w/pagination. With maxfeatures the database server is delivering no more data than it would normally would.

@jratike80
Copy link

We have a few featuretypes with more than million rows and we would still be able use WFS for delivering the whole dataset for users. Paging with plain GetFeature would suit that use case perfectly. Could it be possible to support paging already in 6.2 if GetFeature is sent without a filter? Even better if BBOX could be supported. Paging is not officially supported by WFS 1.0.0. and 1.1.0 so we can consider it as a vendor option and then it would not be critical to omit it with filters.

@aboudreault
Copy link
Member

@sdlime, could we do that for WFS Getfeature requests:

1- If the layer does not have FILTER set, do the maxfeature + pagination in drivers (mostly for DB backends).
2- A DB backends HAVE to support maxfeature AND pagination or nothing. (I think it's not a problem)
3- If the layer does have FILTER set, do the maxfeature + pagination in mapserver. The sql would only include the BBOX as currently. This will affect the performance... and it will be stated in a debug message... but at least maxfeature/pagination would be supported for that layer too.

This should be well documented in the migration guide.

Would be nice to get this for 6.2. What do you think?

@msmitherdc
Copy link
Contributor

@aboudreault This would work for our purposes as we try not to set the filters at the WFS level but move them to the sql level so they get processed by the database driver (oracle)

@sdlime
Copy link
Member

sdlime commented Jun 8, 2012

Ok, looking at this now. I think in general this is the right approach. I understand that layer->paginate means supporting both paging and maxfeatures, correct? In regards on when to enable:

DrawVectorLayer() (mapscript draw calls?) - Actually layer->maxfeatures has long been supported for drawing from the CGI or MapScript. In fact that's what it was added for originally. It's an easy to deal with high resolution data. For example, you might sort lakes by area and draw only the 50 largest in a single map draw. My guess is that this use should go away to avoid conflict w/query use. It's not a giant deal since you generally don't query these types of layers.

Drawing uses a bbox so this is probably ok.

msQueryByPoint - ok to enable
msQueryByRect - ok to enable

msQueryByAttribute - ok to enable (I think) since the filter must be native to the RDBMS

WFS GetFeature (if not complex filter set) - doesn't this just call the msQueryBy... functions?

Regarding multiple layer types. It seems like we can just do paging/maxfeatures within each layer. I guess there's no harm in doing so although it makes little sense since there's no way to control paging/maxfeatures on a by/layer basis.

Steve

@sdlime
Copy link
Member

sdlime commented Jun 9, 2012

Alan, can we do this without adding the paging parameter to a layer? nice to avoid more complications there and most drivers won't ever support this. Could store in driver layer info... Not a giant deal but I thought I'd bring it up.

@aboudreault
Copy link
Member

@sdlime , I've made a new patch at: http://download.osgeo.org/mapserver/tickets/b4011-150612.patch

The paging flag is not at driver level. Since there is a reset in all msQueryBy* function, (layer close/open), you'll see that I've been force to get the paging to be able to restore it, so I've also added a msLayerGetPaging function. Otherwise, thing was just reseted. Basically, the queries behavior:

msQueryByPoint: Driver pagination enabled.
msQueryByRect: Driver pagination enabled.
msQueryByAttribute Driver pagination enabled.
msQueryByFilter: Driver pagination disabled.
msQueryByShape: Driver pagination disabled.
msQueryByIndex: Driver pagination disabled.

In mapwfs.c, I set the paging to MS_FALSE if it's a complex query and/or if the request is made on more than 1 layer.

@sdlime
Copy link
Member

sdlime commented Jun 16, 2012

Does the WFS code actually try to execute queries independently of the msQueryBy... functions? Was the maxfeatures check yanked from the shapefile driver (should be)...

Steve

Sent from my iPad

On Jun 15, 2012, at 11:04 AM, Alan Boudreaultreply@reply.github.com wrote:

@sdlime , I've made a new patch at: http://download.osgeo.org/mapserver/tickets/b4011-150612.patch

The paging flag is not at driver level. Since there is a reset in all msQueryBy* function, (layer close/open), you'll see that I've been force to get the paging to be able to restore it, so I've also added a msLayerGetPaging function. Otherwise, thing was just reseted. Basically, the queries behavior:

msQueryByPoint: Driver pagination enabled.
msQueryByRect: Driver pagination enabled.
msQueryByAttribute Driver pagination enabled.
msQueryByFilter: Driver pagination disabled.
msQueryByShape: Driver pagination disabled.
msQueryByIndex: Driver pagination disabled.

In mapwfs.c, I set the paging to MS_FALSE if it's a complex query and/or if the request is made on more than 1 layer.


Reply to this email directly or view it on GitHub:
#4011 (comment)

@aboudreault
Copy link
Member

@sdlime , No, well not that I am aware. WFS getfeature call msQueryBy*. Yes the maxfeatures check has been removed from the mapshape.c driver.

@aboudreault
Copy link
Member

I've just committed my changes in master in 0d4eef8.

@sdlime
Copy link
Member

sdlime commented Jun 21, 2012

Saw you committed, woot!

Sent from my iPad

On Jun 20, 2012, at 6:35 AM, Alan Boudreaultreply@reply.github.com wrote:

@sdlime , No, well not that I am aware. WFS getfeature call msQueryBy*. Yes the maxfeatures check has been removed from the mapshape.c driver.


Reply to this email directly or view it on GitHub:
#4011 (comment)

@sdlime
Copy link
Member

sdlime commented Jun 24, 2012

Compiling master without postgis support enabled seems broken:

mappostgis.c: In function 'msPostGISEnablePaging':
mappostgis.c:3303: error: 'msPostGISLayerInfo' undeclared (first use in this function)
mappostgis.c:3303: error: (Each undeclared identifier is reported only once
mappostgis.c:3303: error: for each function it appears in.)
mappostgis.c:3303: error: 'layerinfo' undeclared (first use in this function)
mappostgis.c:3314: error: syntax error before ')' token
mappostgis.c: In function 'msPostGISGetPaging':
mappostgis.c:3322: error: 'msPostGISLayerInfo' undeclared (first use in this function)
mappostgis.c:3322: error: 'layerinfo' undeclared (first use in this function)
mappostgis.c:3333: error: syntax error before ')' token
make: *** [mappostgis.lo] Error 1

Steve

@tbonfort
Copy link
Member

is there any need for documenting this one, or can it be closed now ?

@tbonfort
Copy link
Member

wfs_filter_startindex2.xml autotest is failing.

  • startindex=0&maxfeatures=10 => OK (10 rows returned)
  • startindex=10&maxfeatures=10 => NOK (1 row returned)

wfs_filter_startindex.xml is also failing, but that is only a question of ordering compared to what is expected. For wfs_filter_startindex2.xml it seems more serious.

@aboudreault
Copy link
Member

The startindex was badly handled. There was a maxfeature check in all msQuery functions but the startindex was handle in mapgml.c. So, most of the time there was not enough features in the resultcache when the startindex was set. ie. maxfeature=3 with startindex=10: FAILS... there was only 3 features in the cache.

  • I've removed the startindex handling in mapgml.c
  • I've changed the msLayerDefaultGetPaging return value to MS_FALSE. (this applies to all drivers without pagination) so the built-in code take care of the paging.
  • I've added the startindex handling in all msQuery functions. Basically, the startindex is only applied there if the layer paging was set to false.

@sdlime - Questions:

  • Please take a look at the changeset and let me know if you see any issues about handling startindex in mapquery.
  • Should I remove totally the maxfeature parameter in msGMLWriteWFSQuery ? I don't think we need it anymore
  • There are some code in mapwfs.c, line 2416 which I'm not sure. I think it should be removed since everything is handled is mapquery.c, but would like your comment about that..

@sdlime
Copy link
Member

sdlime commented Aug 29, 2012

"Should I remove totally the maxfeature parameter in msGMLWriteWFSQuery ? I don't think we need it anymore"

Yes... Maxfeatures should be applied at query time. The template code allows some limiting through the tags themselves but that's a different mechanism.

"There are some code in mapwfs.c, line 2416 which I'm not sure..."

My understanding is that all WFS queries are handled through the msQueryBy... functions so you are correct. There's also code around L2383 (branch-6-2) that also takes maxfeatures into account. Offhand I'd say that isn't needed because maxfeatures is applied before hand. However, looking at that code it looks like maxfeatures is applied across layers and we're doing it within layers (which I think is more reasonable). I don't know the WFS specification well though. Do you?

Steve

@jratike80
Copy link

"However, looking at that code it looks like maxfeatures is applied across layers and we're doing it within layers (which I think is more reasonable)"

If you mean a case where request is having a list of layers and maxfeatures
&TYPENAME=layer1,layer2,layer3&MAXFEATURES=30000
then server should output 30000 features at maximum. It can mean that the whole layer1 with 10000 features will be returned, 20000 first features from the big layer2 and no features at all from the layer3.
With this behaviour the request and result does not necessarily make much sense from the users point of view but it seems to be the default for ArcGIS. But if one thinks that maxfeatures is there mostly for protecting the server and for setting some upper limit for the net traffic per request then it is understandable.

@sdlime
Copy link
Member

sdlime commented Aug 29, 2012

Couple of questions:

  •      Can you point me to the spot in the WFS spec where this is spelled out?
    
  •      What does GeoServer do?
    
  •      You can’t protect a server with a user configurable parameter. I could just as easily set no limit. You can protect the client application though.
    

If we do have to make it query-wide then we may have to do some things differently:

  •      Add a global maxfeatures parameter to the queryObj (this is what the WFS code would set)
    
  •      Dynamically set layer->maxfeatures on what’s left based on the query->maxfeatures
    
  •      Layer->maxfeatures should still be respected even without query->maxfeatures set…
    

Likewise, paging becomes a global operation (startindex also exists in the queryObj).

Steve

@jratike80
Copy link

Hi,
Web Feature Service Implementation Specification
OGC 04-094
Version: 1.1.0
Page 35
" The optional maxFeatures attribute can be used to limit the number of explicitly requested features (i.e. features specified via the GetFeature/Query/@typename attribute) that a GetFeature request presents in the response document. The maxFeatures value applies to whole result set and the constraint in applied to the features in the order in which they are presented. In addition, feature members contained in a requested feature collection do not count – the requested feature collection counts as one feature. Once the maxFeatures limit is reached, the result set is truncated at that point. There is no default value defined and the absense of the attribute means that all feature type instances in the result should should be returned to the client."

MaxFeatures can be used for protecting WFS server because it can be set on the server side too and clients cannot override (hopefully) the setting. Mapserver, Geoserver and TinyOWS all support setting MaxFeatures on server side. My TinyOWS does not seem to advertise it but Mapserver does
ows:Constraint name="DefaultMaxFeatures">ows:Value100000/ows:Value/ows:Constraint
Captured from http://188.64.1.61/cgi-bin/mapserver_wfs?service=WFS&version=1.1.0&request=GetCapabilities

DefaultMaxFeatures is on page 86, table 7b in the WFS 1.1.0 standard.

-Jukka-

@sdlime
Copy link
Member

sdlime commented Aug 29, 2012

Will have to let Alan react. I think it's still very manageable. Just need the global values to alter layer values (down, not up if they are already set).

Steve

@aboudreault
Copy link
Member

I've modified WFSGetFeature to support maxfeatures across layers. Here's the current behavior of MS:

WFSGetFeature:

  • maxfeature/1 layer: driver pagination enabled
  • maxfeature/2+layers: driver pagination enabled (Yes, the global maxfeatures is set properly after the first request)
  • maxfeature+startindex/1 layer: driver pagination disabled
  • maxfeature+startindex/2+layers: driver pagination disabled

MapScript:

  • Everything seems to by within layers. setting layer->maxfeatures and/or startindex will affect only the layer itself, even if we call map->queryBy*. Is that ok?

I'm a little bit concerned about my change in a beta release...... haven't been able to test all queryBy* functions.

@sdlime
Copy link
Member

sdlime commented Aug 30, 2012

"maxfeature+startindex/1 layer: driver pagination disabled", should that be enabled? Seems like with only one layer in play many drivers can support both.

Where is global maxfeatures stored? We could add a means in MapScript of setting global maxfeatures/startindex values I suppose but really I think the within layer behavior makes the most sense anyway.

@aboudreault
Copy link
Member

sorry, was a typo: maxfeature+startindex/1 layer: driver pagination enabled.

global maxfeatures/startindex are stored in the queryObj. But I also think we should let things are they are now using mapscript. The maxfeatures server protection is more a standard of WFSGetFeature.

@tbonfort
Copy link
Member

considering this one closed now

mkofahl pushed a commit to faegi/mapserver that referenced this issue Apr 9, 2013
mkofahl pushed a commit to faegi/mapserver that referenced this issue Apr 9, 2013
mkofahl pushed a commit to faegi/mapserver that referenced this issue Apr 9, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants