Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Makes GPKG prefered input format for OGR algs #Fixes 29097 #32082

Merged
merged 4 commits into from
Oct 6, 2019

Conversation

SrNetoChan
Copy link
Member

Description

Currently, when a GDAL algorithm uses a memory layer (or runs from the temporary output in a processing model) the input layer is converted to a shapefile, with all its drawbacks. The bigger one being that the names of the columns get truncated.

This makes the default transaction format Geopackage.

Fixes #29097

Checklist

  • Commit messages are descriptive and explain the rationale for changes
  • Commits which fix bugs include Fixes #11111 at the bottom of the commit message
  • I have read the QGIS Coding Standards and this PR complies with them
  • New unit tests have been added for core changes
  • I have run the scripts/prepare-commit.sh script before each commit
  • I have evaluated whether it is appropriate for this PR to be backported, backport requests are left as label or comment

@@ -89,6 +89,7 @@ def getOgrCompatibleSource(self, parameter_name, parameters, context, feedback,
# and extract selection if required
ogr_data_path = self.parameterAsCompatibleSourceLayerPath(parameters, parameter_name, context,
QgsVectorFileWriter.supportedFormatExtensions(),
'gpkg',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, don't hardcode this format (even if it's better than shp :) )

QgsVectorFileWriter.supportedFormatExtensions()[0] returns 'gpkg' is GDAL have gpkg support (it can be disabled) instead 'shp'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't call it hardcoded... but who am I? :-p

If gdal has support to gpkg, why isn't the first format from that list used? It must be forced somewhere else. No?

@SrNetoChan
Copy link
Member Author

@lbartoletti seems that the SHP is "hardcoded" directly on the parameterAsCompatibleSourceLayerPath() method, because if no preferedFormat argument is given, "shp" is used:

https://qgis.org/api/classQgsProcessingParameters.html#a9a9de0b49b654398111a2f9190154972

So, it seems to me that this is the only way to do it, unless we want to change the default format for the the function, which I suppose will affect many more algorithms.

What I propose is that at least for GDAL algorithms, which supports gpkg (and it's no probable that someone switch it off), we start to use geopackage as preferredFormat instead.

If we agree on this, I will work on fixing the tests (that are probably expecting an shapefile as output)

@nyalldawson
Copy link
Collaborator

I believe what @lbartoletti is asking is that you first test to see if gpkg is an available format for writing vectors, and if not, fallback to shapefile.

@SrNetoChan
Copy link
Member Author

SrNetoChan commented Oct 2, 2019

@nyalldawson @lbartoletti

Ahh! So, confirm that the gpkg driver for GDAL driver before setting it as default? If it is, use gpkg, if not (who knows why the user has turned it off) use shapefile instead (which AFAIK is always on)?

Can someone point me to the class or function where I could check that? thanks!

@lbartoletti
Copy link
Member

I believe what @lbartoletti is asking is that you first test to see if gpkg is an available format for writing vectors, and if not, fallback to shapefile.

Totally :)

supports gpkg (and it's no probable that someone switch it off),

Unfortunately, I know some people who disable it (and others because they require dependencies). ¯_(ツ)_/¯

@SrNetoChan
Copy link
Member Author

SrNetoChan commented Oct 2, 2019

Nevermind, as soon as you turn the gpkg driver off, the QgsVectorFileWriter.supportedFormatExtensions() no longer returns the gpkg, but as you were expecting, the preferred format is used and caBum!

Is there a better way to check if the gpkg is active than simpling calling QgsVectorFileWriter.supportedFormatExtensions() and check if it's there?

@SrNetoChan
Copy link
Member Author

This seems to work, but I am not sure it's a very elegant way of going it.

@SrNetoChan
Copy link
Member Author

Maybe an even better way would be to "fix" the classQgsProcessingParameters (or some of the methods in which it relies upon) to make sure that the preferredFormat is present in the compatibleFormats list, before trying to use it.

@nyalldawson
Copy link
Collaborator

This approach looks fine to me. I think the logic does belong here, since only the provider would know what fallbacks to use in case of a missing preferred format.

@lbartoletti are you happy with this one?

ogr_data_path = self.parameterAsCompatibleSourceLayerPath(parameters, parameter_name, context,
QgsVectorFileWriter.supportedFormatExtensions(),
preferred_format,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QgsVectorFileWriter.supportedFormatExtensions()[0] will works more simply.

What I got on my system when GDAL is built:

Without external option

QgsVectorFileWriter.supportedFormatExtensions()
['shp', '000', 'bna', 'csv', 'dgn', 'dxf', 'geojson', 'geojsonl', 'geojsons', 'gml', 'gpx', 'gxt', 'json', 'kml', 'tab', 'txt', 'xml']

With gpkg/sqlite3 (and other formats):

QgsVectorFileWriter.supportedFormatExtensions()
['gpkg', 'shp', '000', 'bna', 'csv', 'dgn', 'dxf', 'geojson', 'geojsonl', 'geojsons', 'gml', 'gpx', 'gxt', 'ili', 'itf', 'json', 'kml', 'ods', 'sqlite', 'tab', 'txt', 'xlsx', 'xml', 'xtf']

By default, supportedFormatExtensions() returns the list sorted by recommanded formats, so it's not necessary to add this if/else statement.

@lbartoletti
Copy link
Member

Thanks @SrNetoChan

@alexbruy
Copy link
Contributor

alexbruy commented Oct 4, 2019

As this changes default output format, GDAL tests should be adjusted too.

@SrNetoChan
Copy link
Member Author

@alexbruy I have worked on that already. Would it make sense to add a new test to test this fallback mechanism?

If so, how could I disabled and enable gdal gpkg driver using PyQGIS?

@SrNetoChan
Copy link
Member Author

@lbartoletti I have changed the code to use your latest suggestion. It looks much better.

@alexbruy I think the changes I made to the tests should be enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Processing/OGR: do not use shapefile as format for temp input files
4 participants