extract gdf index and use it during serialization of .to_geojson() and .to_gdf() #165

Merged: 3 commits, Aug 11, 2022


mattijn
Owner

@mattijn mattijn commented Aug 11, 2022

This PR fixes issue #164.

If the source GeoDataFrame has a unique index, it is extracted and reused as the index when calling .to_gdf(). The same identifier is also written as the id member of every feature when serializing to GeoJSON with .to_geojson().
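A minimal sketch of the round trip described above, using plain Python dictionaries rather than the package's own code. Records keyed by a unique index become GeoJSON features whose `id` member carries the index label, and reading the `id` members back restores the original keys. The helper names `records_to_features` and `features_to_records` are hypothetical.

```python
def records_to_features(records):
    """Turn {index_label: {"geometry": ..., other properties}} into GeoJSON features."""
    features = []
    for label, record in records.items():
        # Everything except the geometry goes into the feature's properties.
        properties = {k: v for k, v in record.items() if k != "geometry"}
        features.append(
            {
                "type": "Feature",
                "id": label,  # the source index label travels as the feature id
                "properties": properties,
                "geometry": record["geometry"],
            }
        )
    return features


def features_to_records(features):
    """Rebuild the index-keyed records from the features' id members."""
    return {
        f["id"]: {**f["properties"], "geometry": f["geometry"]} for f in features
    }


records = {
    "NL": {"name": "Netherlands", "geometry": {"type": "Point", "coordinates": [5.3, 52.1]}},
    "BE": {"name": "Belgium", "geometry": {"type": "Point", "coordinates": [4.5, 50.5]}},
}

features = records_to_features(records)
restored = features_to_records(features)
```

This only works cleanly when the labels are unique, since duplicate `id` values would overwrite each other when the records are rebuilt.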

Additionally, this PR tries to reuse the .crs of the source GeoDataFrame.

@mattijn mattijn merged commit fbf3238 into master Aug 11, 2022
@mattijn mattijn deleted the keep-source-index-from-gdf branch August 11, 2022 15:45
Comment on lines +549 to +550
if geom.index.is_unique:
self._data = geom.to_dict(orient="index")
Contributor

Would it be better to warn the user when the index is not unique?

Comment on lines 385 to +388
import geopandas

gdf = geopandas.GeoDataFrame().from_features(features=fc["features"], crs=crs)
gdf = gdf.set_index(geopandas.pd.json_normalize(fc["features"])["id"].values)
Contributor

It seems a little odd to me to write GeoDataFrame().from_features and geopandas.pd.json_normalize.
from_features is a class method, so we can call it directly on GeoDataFrame without creating an instance first.
geopandas.pd is a shortcut for reaching pandas, but it's a bit confusing.

from geopandas import GeoDataFrame
from pandas import json_normalize

return (
    GeoDataFrame.from_features(features=fc["features"], crs=crs)
    .set_axis(json_normalize(fc["features"])["id"].values)
)
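To illustrate the point about class methods with a toy class (the name `Table` is hypothetical and unrelated to geopandas): calling a class method on an instance works, but the instance is only used to look up its class and is then thrown away, so constructing it is unnecessary.

```python
class Table:
    # Counts how many Table objects have been constructed.
    instances_created = 0

    def __init__(self, rows=None):
        Table.instances_created += 1
        self.rows = list(rows or [])

    @classmethod
    def from_rows(cls, rows):
        # Receives the class, never the instance it was called on.
        return cls(rows)


# Called on a throwaway instance: two objects are constructed.
Table.instances_created = 0
a = Table().from_rows([1, 2])
created_via_instance = Table.instances_created

# Called on the class itself: only one object is constructed.
Table.instances_created = 0
b = Table.from_rows([1, 2])
created_via_class = Table.instances_created
```

Both calls return an equivalent object; the second form simply avoids building an empty one first.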

@mattijn
Owner Author

mattijn commented Aug 12, 2022

Thanks @Zeroto521 for the valuable feedback!
If you have time, could you send a PR?
Otherwise I will come back to this in a few days.

@Zeroto521
Contributor

Please check #166.

I only linted serialize_as_geodataframe.
As for the warning, I'm still not sure whether to add it or not.

        if geom.index.is_unique:
            self._data = geom.to_dict(orient="index")
        else:
            # warn the user the index is not unique
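One way the `else` branch could be filled in, as a sketch only: the thread leaves open whether a warning should be emitted at all. The function `records_by_index`, the warning text, and the fallback to positional keys are all assumptions, and plain lists stand in for the GeoDataFrame index and rows.

```python
import warnings


def records_by_index(labels, rows):
    """Key rows by their labels when unique, else warn and key by position."""
    if len(set(labels)) == len(labels):
        return dict(zip(labels, rows))
    # Assumed behaviour: tell the user and fall back to positional keys.
    warnings.warn(
        "index is not unique; falling back to positional keys",
        UserWarning,
        stacklevel=2,
    )
    return dict(enumerate(rows))
```

A caller that considers duplicates expected can silence this with the standard `warnings` filters, which is one argument for a warning over an exception.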

@mattijn
Owner Author

mattijn commented Aug 12, 2022

Thanks for raising this concern. I think it is best to work with a unique index internally and store the index of the source separately. Then the source index can contain duplicates without creating potential issues for this package.
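The approach described here could look roughly like the following sketch, which is an illustration and not the package's code. Rows are keyed internally by position, which is always unique, while the source labels are kept in a separate list and reattached on output. The class `IndexedRows` and its method are hypothetical.

```python
class IndexedRows:
    """Store rows under a unique positional key; keep source labels apart."""

    def __init__(self, labels, rows):
        if len(labels) != len(rows):
            raise ValueError("labels and rows must have the same length")
        # Source labels may contain duplicates; they are only stored here.
        self.source_index = list(labels)
        # Internal keys 0..n-1 are unique by construction.
        self._data = dict(enumerate(rows))

    def to_pairs(self):
        """Reattach the source label to each row, in the original order."""
        return [(self.source_index[i], row) for i, row in self._data.items()]


store = IndexedRows(["a", "a", "b"], [{"v": 1}, {"v": 2}, {"v": 3}])
pairs = store.to_pairs()
```

Because the internal keys never depend on the source labels, duplicate labels such as the two `"a"` entries survive the round trip without colliding.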

2 participants