[GEOS-6842] Improve the dbase reader to filter WHERE clauses using an ODBC driver #763

Merged
merged 1 commit into from
Mar 7, 2015

ahuarte47
Contributor

Manages the optional RECNO field index of the ShapefileDataStore to speed up querying through an optional DBF ODBC provider.

For now this improvement is implemented for two ODBC providers running on Windows, but the same feature could be offered through another ODBC driver on Linux.

It fixes:
https://osgeo-org.atlassian.net/browse/GEOS-6842

This PR replaces #697

@ahuarte47
Contributor Author

Note:

The improvement is noticeable for big shapefiles. With an available ODBC provider, this patch avoids the full read of the data that GeoTools performs by default. Also, in some situations GeoServer forces the shapefile to be read twice in a single WFS request.
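
To illustrate why knowing a record number is enough to skip the full scan: a dBASE file stores fixed-length records right after its header, so the position of record N can be computed instead of searched for. The following is only a minimal sketch of that arithmetic, not the code of this patch; the class and method names are hypothetical and the sample header values are made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for illustration; not part of GeoTools or of this patch. */
class DbfRecordLocator {

    /** Reads an unsigned little-endian 16-bit value from the DBF header. */
    static int readUInt16(byte[] header, int position) {
        return ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).getShort(position) & 0xFFFF;
    }

    /** Header length in bytes, stored at bytes 8-9 of the DBF header. */
    static int headerLength(byte[] header) {
        return readUInt16(header, 8);
    }

    /** Length of one record in bytes, stored at bytes 10-11 of the DBF header. */
    static int recordLength(byte[] header) {
        return readUInt16(header, 10);
    }

    /** Byte offset in the file of the record with the given 1-based record number. */
    static long recordOffset(int recno, int headerLength, int recordLength) {
        if (recno < 1) {
            throw new IllegalArgumentException("Record numbers are 1-based");
        }
        return headerLength + (long) (recno - 1) * recordLength;
    }

    public static void main(String[] args) {
        // Made-up sample header: 321-byte header, 100-byte records.
        byte[] header = new byte[32];
        ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN)
                  .putShort(8, (short) 321)
                  .putShort(10, (short) 100);

        long offset = recordOffset(3, headerLength(header), recordLength(header));
        System.out.println("Record 3 starts at byte " + offset);
    }
}
```

Once an external index or driver has returned the matching record numbers, only those records need to be read from disk.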

*Shapefile "parcela_rustica.shp" used for my tests:
- ~500,000 2D polygons
- ~500 MB in size

*Results:
- GeoTools without changes: 4,100ms
- GeoTools using the 'Advantage StreamlineSQL ODBC Driver': 1,068ms
- GeoTools using the 'Advantage StreamlineSQL ODBC Driver' + CDX file: 630ms

We create the CDX files with Visual FoxPro; they are optional.
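
As a quick reading aid for the timings above, the speedup factor is simply the baseline time divided by the measured time, which gives roughly 3.8x with the ODBC driver and roughly 6.5x with the additional CDX file. A small sketch of that arithmetic (the class name is hypothetical; the numbers are the ones reported above):

```java
import java.util.Locale;

/** Hypothetical helper for illustration only. */
class SpeedupCalc {

    /** Returns how many times faster the measured run is than the baseline. */
    static double speedup(double baselineMs, double measuredMs) {
        if (measuredMs <= 0) {
            throw new IllegalArgumentException("Measured time must be positive");
        }
        return baselineMs / measuredMs;
    }

    public static void main(String[] args) {
        double baselineMs = 4100; // GeoTools without changes

        System.out.println(String.format(Locale.ROOT,
                "ODBC driver: %.1fx", speedup(baselineMs, 1068)));
        System.out.println(String.format(Locale.ROOT,
                "ODBC driver + CDX file: %.1fx", speedup(baselineMs, 630)));
    }
}
```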

Sorry, I cannot attach the code as a zip file, so I am inserting it directly in this message.

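import java.io.File;
import java.io.IOException;

import org.geotools.data.FileDataStore;
import org.geotools.data.FileDataStoreFinder;
import org.geotools.data.Query;
import org.geotools.data.simple.SimpleFeatureCollection;
import org.geotools.data.simple.SimpleFeatureIterator;
import org.geotools.data.simple.SimpleFeatureSource;
import org.geotools.filter.text.cql2.CQL;
import org.geotools.filter.text.cql2.CQLException;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.filter.Filter;
import org.opengis.geometry.BoundingBox;

/**
 * Shapefile tester application!
 */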
public class App 
{
    public static int s_featureCount = 0;

    public static void main( String[] args )
    {           
        int featureCount = 0;

        try
        {
            com.vividsolutions.jts.util.Stopwatch stopwatch = 
                  new com.vividsolutions.jts.util.Stopwatch();

            File shapeFile = new File("D:/Mapas/Navarra/SHAPEs/Catastro/parcela_rustica.shp");
            String sql = "DMUNICIPIO = 157 AND DPOLIGONO = 11 AND DPARCELA = 80";

            /**
             * Test the optional RECNO field to quickly look up records.
             * 
             * For now it only works with two ODBC drivers running on Windows:
             *  - Microsoft ODBC FoxPro Driver (x86).
             *  - Advantage StreamlineSQL ODBC driver (x86/x64).
             * It is feasible to use the 'Advantage StreamlineSQL ODBC' driver on Linux platforms.
             */ 
            java.sql.DriverManager.setLogWriter(new java.io.PrintWriter(System.out));
            featureCount += ProcessShapeFile(shapeFile, sql, true);

            System.out.println("OK! Count="+featureCount+" Time="+stopwatch.getTimeString());
        }
        catch (Exception error)
        {
            System.out.println("ERROR="+error.getMessage());
        }
    }
    public static int ProcessShapeFile(File shapeFile, String sql, boolean traceBoundingBox) 
          throws IOException, CQLException
    {
        int featureCount = 0;

        FileDataStore store = FileDataStoreFinder.getDataStore(shapeFile);
        SimpleFeatureSource featureSource = store.getFeatureSource();

        Filter filter = CQL.toFilter(sql);       
        Query query = new Query("", filter);
        query.setMaxFeatures(2);

        SimpleFeatureCollection featureCollection = featureSource.getFeatures(query);
        SimpleFeatureIterator featureIterator = featureCollection.features();

        try
        {
            while (featureIterator.hasNext())
            {
                SimpleFeature feature = featureIterator.next();

                if (traceBoundingBox)
                {
                    BoundingBox bbox = feature.getBounds();
                    System.out.println(bbox.toString());
                }
                featureCount++;
            }
        }
        finally
        {
            // Always release the file handles, even if the iteration fails.
            featureIterator.close();
            store.dispose();
        }

        return featureCount;
    }
}

We had to implement this patch in order to publish GeoServer WFS services backed by shapefiles as big as this one. Our customer does not allow us to convert the geometries to another geodatabase format (e.g. PostGIS).

A real WFS request to a GeoServer instance running with this patch:
https://idena.tracasa.es/ogc/wfs?service=WFS&request=GetFeature&srsName=epsg%3A25830&typeName=CATAST_Pol_ParcelaRusti%2CCATAST_Pol_ParcelaUrba%2CCATAST_Pol_ParcelaMixta&version=1.1.0&cql_filter=CMUNICIPIO%3D27%20AND%20POLIGONO%3D1%20AND%20PARCELA%3D300&outputFormat=json

It uses 3 shapefiles of around 700 MB.
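
For readability, the percent-encoded parameters of the request above (for example `cql_filter` and `typeName`) can be decoded with the standard `java.net.URLDecoder`. This is only a small sketch to make the request easier to read; the class and method names are hypothetical and the parsing is deliberately simple.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only. */
class WfsQueryDecoder {

    /** Splits the query string of a URL into decoded key/value pairs, keeping their order. */
    static Map<String, String> decodeQuery(String url) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        int start = url.indexOf('?');
        String query = start >= 0 ? url.substring(start + 1) : "";
        if (query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(key), decode(value));
        }
        return params;
    }

    private static String decode(String text) {
        try {
            return URLDecoder.decode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String url = "https://idena.tracasa.es/ogc/wfs?service=WFS&request=GetFeature"
                + "&srsName=epsg%3A25830"
                + "&typeName=CATAST_Pol_ParcelaRusti%2CCATAST_Pol_ParcelaUrba%2CCATAST_Pol_ParcelaMixta"
                + "&version=1.1.0"
                + "&cql_filter=CMUNICIPIO%3D27%20AND%20POLIGONO%3D1%20AND%20PARCELA%3D300"
                + "&outputFormat=json";

        Map<String, String> params = decodeQuery(url);
        System.out.println("typeName   = " + params.get("typeName"));
        System.out.println("cql_filter = " + params.get("cql_filter"));
    }
}
```

The decoded `cql_filter` has the same shape as the `sql` string used in the test application: a conjunction of equality checks on attribute columns.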

Best Regards
Alvaro

@aaime
Member

aaime commented Mar 7, 2015

Merging, thanks for your contribution and for providing explanations.

aaime added a commit that referenced this pull request Mar 7, 2015
[GEOS-6842] Improve the dbase reader to filter WHERE clauses using an ODBC driver
@aaime aaime merged commit 9e915ff into geotools:master Mar 7, 2015
@ahuarte47
Contributor Author

Thank you, Andrea!
