Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Refactor SQLite plugin #1039

Open
kkaefer opened this Issue · 3 comments

2 participants

@kkaefer

Currently, the SQLite plugin tries to do a lot of magic to figure out the table the geometry column in a query came from in order to use the correct index. The naming convention for indexes is idx_TABLENAME_FIELDNAME. The plugin needs to know the table name in order to construct the query, which references the R*Tree index table.

I suggest removing all index join functionality from the plugin and instead rely on views that automatically join the index to the table. The end user only supplies the view name and SQLite handles the R*Tree index join in the background. On binding, we check the fields of query, and if there's a xmin, xmax, ymin and ymax column in the result columns of the query, we just assume that the table is actually a view and internally joins to the index. That change would make the SQLite plugin more similar to how the PostGIS plugin works.

Additionally, instead of using IN() subqueries for the indexes, we should recommend using a JOIN to filter by bbox:

SELECT [fields] FROM ([query]) WHERE id IN
    (SELECT pkid FROM idx_osm_minorroads_geometry
        WHERE xmin >= -1 AND ymin >= -1 AND xmax >= 1 AND ymax <= 1);

becomes

SELECT [fields] FROM ([query])
    WHERE xmin >= -1 AND ymin >= -1 AND xmax <= 1 AND ymax <= 1;

with [query] being a view

SELECT m.*, k.xmin, k.xmax, k.ymin, k.ymax
    FROM [table] m
    JOIN [table_index] k
    ON m.id = k.pkid;

or some other query that operates on that view. Alternatively, users can just specify the query that joins the data table to the R*Tree index table.

My benchmarks suggest that this is a lot faster (I measured an improvement of about 3x). Adding the views into the mix doesn't have any measurable negative effects on query performance.

If a table doesn't have xmin/xmax/ymin/ymax fields, we just run a non-spatial query:

SELECT [fields] FROM ([query]);

This doesn't deal with indexes in external files or automatic index creation. Instead of automatically creating an index, we should expose an API that allows creating the index and the according view programatically so that end users can use a product that employs Mapnik to explicitly create an indexed view.

@springmeyer
Owner
  • great plan, this looks awesome
  • indexing api is planned, soon. Let me know if this demands it landing first.
  • my only concern is that we likely should continue to support, as a fallback, the idx_TABLENAME_FIELDNAME method since I think is what spatialite uses (though not outside the db), and I figure we should be able to read directly from a spatialite enabled db without having to re-index.
@kkaefer

I've already switched the imposm sqlite plugin to create a table osm_foo with an index table idx_osm_foo_geometry. The view that joins the index internally is named osm_foo_indexed now. This would enable both types of tables in Mapnik: It first checks whether the query it got passed returns xmin/xmax/ymin/ymax fields, if not, it tries to join the idx_*_geometry. That means we sort of rely on the user choosing the right (indexed) view.

@springmeyer
Owner

re: in in queries, looks like this might have been optimized in 3.7.15: "Enhance the IN operator to allow it to make use of indices that have numeric affinity." from http://www.sqlite.org/news.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.