Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add expression compiler for postgres #2012

Closed
wants to merge 7 commits into from
Closed

Conversation

m-kuhn
Copy link
Member

@m-kuhn m-kuhn commented Apr 27, 2015

This is a preliminary prototype pull request which is not ready for merging. Do only look into it if you would like to discuss this topic on a theoretical / technical / architectural level.

Basics

This pull request allows to filter features already on server side hence making use of server side indexes, (stronger hardware) and reducing network load. This can result in a huge performance gain in certain scenarios (requesting few features of a big table). Only supported expressions will be sent to the database. Expressions using unsupported operators or functions will gracefully fallback to local evaluation.

To make use of this feature

  • Enable it in options: data sources
  • QgsFeatureRequest().setFilterExpression( expression )
  • or QgsVectorLayer::getFeatures( expression )

Testing

To be able to develop pragmatically testing for postgres server was introduced. Travis now brings up a postgres (9.1) with postgis (2.1) server and the unit tests run against that. The expressions are executed with and without compiler against that dataset so it can be compared that the same features are filtered, hence the compiler generates a suitable result. This will make it very easy to just let the same compiler run for other databases and test it with the same dataset on other backends.

Visitor pattern

It does not make use of the visitor pattern. The main issue with the visitor pattern is, that there is no way to return something from a visited node (return type is void). It may be possible to overcome this by making the return type a QVariant (or keeping somehow track of the currently visited node in the visitor class but that's pretty hacky).
If that's done, there's still a (possible) problem of not having context information about the currently visited node. I do not know if this will be a requirement but it's worth considering. It would probably be possible to keep track of that in the visitor class.
The current solution works quite nicely so I am not yet convinced that using the visitor pattern will have benefits. Please prove me wrong :)

Feature requests with FilterRect

A QgsFeatureRequest can either contain a FilterRect or a FilterExpression but not both (mutually exclusive). FilterRects are pretty well supported already by most providers and sent to the backend. Expressions not (yet). Sometimes a combination of these two filter types may come in handy (In particular in the rule based renderer where you may filter with an expression but also the currently visible extent should be used as filter criteria).

The FilterRect could be translated to an expression like category == 2 AND geom INTERSECTS( visibleextent ). On most providers this will result in fetching all features and filtering them with the expression engine and thus a considerable (unacceptable) performance impact. So in order to go with that solution one would need to try to compile the expression and only if a successful compilation has been reported the FilterExpression should be set and in any other case the FilterRect should be set.

The FilterRect could be changed so it will not be handled mutually exclusive but additionally. This will result in having to rewrite the logic in all the providers to handle that properly. If the FilterRect isNull() it will not be considered, if it is set it will be appended to the WHERE clause with an AND. This is a change in semantics that will probably not affect many things (as hardly ever somebody will have set two filters on a request, because so far only one was considered anyway...).

Acknowledgements

This work has been kindly supported by the Swiss QGIS User Group and SIGE

@nyalldawson
Copy link
Collaborator

The implementation looks good to me, and a big +1 for inclusion of the Postgres unit tests.

@m-kuhn
Copy link
Member Author

m-kuhn commented May 2, 2015

@wonder-sk Do you have any comments concerning the usage of the visitor pattern?

/** The name of the function. */
QString name();
const QString& name();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: returning implicitly shared classes (like QString, QVariant, QList etc) by value is IMHO better. Qt API does that all the time, and it leaves you some extra flexibility: one can return a temporary instance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC returning a QString will create a deep copy, returning a const QString will create a shallow copy. Not sure though.

I don't think return by reference will hurt if one wants to return a temporary one can change it again, we never cared much for ABI compatibility. But I don't mind much.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK there's no deep copying. The copy constructor (and assignment operator) of implicitly shared objects always makes a shallow copy. Deep copy is only involved by a call to detach() which is used by non-const methods. (When using QSharedDataPointer, the non-const operator->() calls detach() automatically).

It seems to me that returning const QString does not have much use:
http://stackoverflow.com/questions/8716330/purpose-of-returning-by-const-value

My preference is QString for returns, const QString& for arguments - to be consistent with Qt API

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds all reasonable, it seems my memory tricked me. I'll revert these changes.

@wonder-sk
Copy link
Member

@m-kuhn great stuff, looking forward to see more of it :-)

Regarding visitors, I don't really have a strong preference. I understand that strictly using visitor would be a bit clumsy here. Also the "expression to OGC filter" compiler is implemented in the same way like you did it for postgres.

@NathanW2 NathanW2 removed the Prototype label May 7, 2015
@m-kuhn m-kuhn force-pushed the expcomp branch 8 times, most recently from 4a97ae9 to aa0c230 Compare May 21, 2015 14:04
@m-kuhn m-kuhn closed this May 21, 2015
@m-kuhn m-kuhn reopened this May 21, 2015
@m-kuhn m-kuhn force-pushed the expcomp branch 2 times, most recently from 896a205 to 6b3a427 Compare May 21, 2015 15:11
This commit allows to filter features already on server side. Only supported
expressions will be sent to the database. Expressions using unsupported
operators or functions will gracefully fallback to local evaluation.

To make use of this feature

 * Enable it in options: data sources
 * QgsFeatureRequest().setFilterExpression( expression )
 * or QgsVectorLayer::getFeatures( expression )
basically to get a framework to test the expression compiler
@nyalldawson
Copy link
Collaborator

manually merged

@m-kuhn m-kuhn deleted the expcomp branch May 22, 2015 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants