BUG: Exporting a GeoDataframe with column 'fid' to GeoPackage will delete that column #1035
Comments
Example script to recreate:
Output will be as follows:
I noticed in Link: https://gdal.org/programs/ogr2ogr.html#cmdoption-ogr2ogr-preserve-fid |
Update: This is expected behavior for the driver and occurs on write. GPKG uses "fid" as the index. If a "fid" column is present, it is assigned as the index "id". A clearer example would use a number other than 1 for the "fid" value in the above example. Let's say we use "12" instead. The resulting output would have the "fid" column missing from the columns, but the index value for the first row would be "12" instead of "0". Reading in the file with
In the above example we can see the "12" value being assigned to the feature id. But it is not preserved and carried through to the index in GeoPandas:
Returns...
|
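A GeoPackage is an SQLite database in which each feature table has an integer primary key column, and the GDAL driver names that column "fid" by default. The following is only a sketch of that idea using Python's built-in sqlite3 module, not what GDAL actually executes: the table name `features` and the `name` column are made up for illustration. It shows a supplied "fid" of 12 becoming the row key rather than being stored as an ordinary attribute.

```python
import sqlite3

# In-memory stand-in for a feature table; NOT a complete GeoPackage.
# The table name "features" and the column "name" are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (fid INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)

# A row that supplies its own fid: the value 12 becomes the row key.
conn.execute("INSERT INTO features (fid, name) VALUES (?, ?)", (12, "first"))

# A row without a fid: SQLite assigns the next key after the largest one.
conn.execute("INSERT INTO features (name) VALUES (?)", ("second",))

rows = conn.execute("SELECT fid, name FROM features ORDER BY fid").fetchall()
print(rows)  # [(12, 'first'), (13, 'second')]
conn.close()
```

Because the key lives in the primary key column, a reader that renumbers rows from 0 on load would discard the 12 unless it copies the key somewhere explicitly.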
I took some additional notes on what is happening and captured it here: http://kuanbutts.com/2019/07/02/gpkg-write-from-geopandas/ Ultimately, I am not sure if this is a decision GeoPandas should be making. I can see the argument that this is something that a user might want to specifically handle themselves. The only item I think might be valid in terms of making a modification to GeoPandas would be that, when reading in a GPKG, create an additional column |
@kuanb thanks for the exploration, very useful! There is also some related discussion of FIDs in fiona in Toblerity/Fiona#327. An additional problem in handling this consistently is that it probably also depends on the driver. I am also not sure what is best here. We could set the ids as the index or as an 'fid' column; it's not that hard code-wise if we decide that is best (although it is somewhat backwards incompatible). But I would prefer not to do too many driver-specific things in geopandas. |
Any update on this? It would be great to have the option of explicitly indexing by the GeoJSON feature IDs instead of just re-sequencing based on read order. I'm specifically interested in this because I'm splitting the data between vector tiles and a Postgres database, so if I can't specify the index manually, there's a chance that features could end up with different indices, which would make it impossible to share information between the tiles and the DB. |
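To illustrate what "indexing by the GeoJSON feature IDs" could mean, here is a minimal sketch using only the standard json module. It is not GeoPandas behavior; the sample FeatureCollection and the ids "a-17" and "b-03" are invented for the example. It relies only on the fact that a GeoJSON Feature may carry an optional top-level "id" member.

```python
import json

# Hypothetical sample data; a Feature's top-level "id" member is optional in GeoJSON.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "id": "a-17",
     "geometry": {"type": "Point", "coordinates": [1.0, 2.0]},
     "properties": {"name": "first"}},
    {"type": "Feature", "id": "b-03",
     "geometry": {"type": "Point", "coordinates": [3.0, 4.0]},
     "properties": {"name": "second"}}
  ]
}
"""

collection = json.loads(geojson_text)

# Key each feature by its own id instead of by its position in the file.
features_by_id = {
    feature["id"]: feature
    for feature in collection["features"]
    if "id" in feature  # features without an id are skipped in this sketch
}

print(sorted(features_by_id))  # ['a-17', 'b-03']
print(features_by_id["b-03"]["properties"]["name"])  # second
```

Keys built this way stay stable regardless of read order, which is the property needed when the same features are stored in two places.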
Also interested in this. Mainly for the part about reading the fid values properly from an existing geopackage rather than writing the fid column. I need to do some checks on the geometries in a geopackage file and print out which rows need to be manually checked, and the fid would be the logical thing to print out for users to find the rows to be checked... but not possible due to the current behaviour of just ditching the fid column. At first sight, I think the safest solution would be to add an fid column rather than change the pandas index for (backwards)compatibility reasons... |
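Until the fid is exposed on read, one possible way to get at the values is to query the file directly, since a GeoPackage is an SQLite database. The sketch below makes several assumptions: the helper `read_fids` is hypothetical, the key column is assumed to be named "fid", and the table is assumed to be named after the layer. The demo file is a plain SQLite stand-in rather than a real GeoPackage.

```python
import os
import sqlite3
import tempfile


def read_fids(path, table, fid_column="fid"):
    """Return the key column values of one table, in key order (hypothetical helper)."""
    # Identifiers cannot be bound as SQL parameters, so quote them by hand.
    quoted_table = '"' + table.replace('"', '""') + '"'
    quoted_fid = '"' + fid_column.replace('"', '""') + '"'
    conn = sqlite3.connect(path)
    try:
        query = f"SELECT {quoted_fid} FROM {quoted_table} ORDER BY {quoted_fid}"
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()


# Stand-in file: a plain SQLite database with a made-up "parcels" table.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.sqlite")
conn = sqlite3.connect(demo_path)
conn.execute("CREATE TABLE parcels (fid INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO parcels (fid, name) VALUES (?, ?)", [(12, "a"), (40, "b")]
)
conn.commit()
conn.close()

fids = read_fids(demo_path, "parcels")
print(fids)  # [12, 40]
```

The returned list could then be attached as an extra column to a frame read from the same layer, provided both are in the same row order, which should be verified rather than assumed.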
I am still a bit hesitant to always convert the But I certainly understand that, if you want access to the FID of the GeoPackage file, the current situation is quite annoying. So here are some options I can think of:
Thoughts? Somebody interested in doing a PR for one of those? |
I forgot about this back then... but a new case where a solution for this would have made life a lot easier arrived... So, in response to @jorisvandenbossche: I think your second option sounds great. It is safe backwards-compatibility wise, and as it is an index, it sounds reasonable that it would end up as the index of the resulting GeoDataFrame as well. I also noticed |
Any solution? In my project |
This is my workaround, hope it can be useful: https://gist.github.com/MaxDragonheart/46445a150aac9d528dadd2ec877203a5 |
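Separately from the linked gist (whose contents are not reproduced here), one way a user could handle the collision themselves is to rename a column called 'fid' before writing, so that the driver does not consume it as the feature id. This is a minimal sketch under stated assumptions: the helper name `fid_safe_renames` and the "_orig" suffix are hypothetical, and matching the name case-insensitively is an assumption, not confirmed driver behavior.

```python
def fid_safe_renames(columns, reserved="fid", suffix="_orig"):
    """Build a rename mapping for columns that collide with a reserved name.

    Hypothetical helper: the case-insensitive comparison is an assumption
    about how the reserved name might be matched.
    """
    existing = set(columns)
    renames = {}
    for column in columns:
        if str(column).lower() == reserved.lower():
            new_name = f"{column}{suffix}"
            # Avoid clobbering a column that already has the new name.
            while new_name in existing:
                new_name += "_"
            renames[column] = new_name
            existing.add(new_name)
    return renames


mapping = fid_safe_renames(["fid", "name", "geometry"])
print(mapping)  # {'fid': 'fid_orig'}
```

With a GeoDataFrame `gdf`, the mapping could be applied before export, for example `gdf.rename(columns=fid_safe_renames(gdf.columns)).to_file("out.gpkg", driver="GPKG")`, where "out.gpkg" is a placeholder path. The original values then survive as an ordinary attribute column under the new name.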
Steps:
1. Create a GeoDataFrame with a column 'fid'
2. Export as a GeoPackage using to_file()
3. Read the exported GeoPackage
Bug
The 'fid' column should exist, but is missing.
Hunches
A scan of both the GeoPandas and the Fiona source code reveals no reason this bug should occur, so I suspect that it's actually an issue with GDAL.