Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add from_wkt and from_wkb tools #293

Merged
merged 3 commits into from
Jun 24, 2024

Conversation

giorgiobasile
Copy link
Contributor

Implements dask_geopandas.from_wkt and dask_geopandas.from_wkb, taking inspiration from dask_geopandas.points_from_xy.

Closes #208

@giorgiobasile giorgiobasile force-pushed the from-wkt-wkb branch 2 times, most recently from c5fc899 to f36271d Compare May 28, 2024 09:16
@giorgiobasile giorgiobasile marked this pull request as ready for review May 28, 2024 09:27
@jorisvandenbossche
Copy link
Member

@giorgiobasile Thanks for the PR!

One question we have to think about is the exact API: do we want to do this at the dataframe level, or at the (geo)series level? Right now you added a function that takes a dataframe and returns a GeoSeries. But if it returns a GeoSeries, we could also let it accept a Series (and the user can do the column selection themselves). Or otherwise let it accept and return a (Geo)DataFrame.

@giorgiobasile
Copy link
Contributor Author

giorgiobasile commented May 30, 2024

Thanks @jorisvandenbossche for your remarks!

One question we have to think about is the exact API: do we want to do this at the dataframe level, or at the (geo)series level? Right now you added a function that takes a dataframe and returns a GeoSeries.

Yes, the reason I sticked with the dask.dataframe -> gpd.GeoSeries interface was mainly for consistency with dask_geopandas.points_from_xy which I took inspiration from, as I needed to do something like:

gdf = dask_geopandas.from_dask_dataframe(
    ddf, geometry=dask_geopandas.from_wkt(ddf, geometry_wkt='wkt', crs="EPSG:4326")
)

Just double checking I'm getting what you are suggesting:

But if it returns a GeoSeries, we could also let it accept a Series (and the user can do the column selection themselves)

So this would change to allow for something like:

gdf = dask_geopandas.from_dask_dataframe(
    ddf, geometry=dask_geopandas.from_wkt(ddf["wkt"], crs="EPSG:4326")
)

Or otherwise let it accept and return a (Geo)DataFrame.

In this case the user would have a unified API, like:

gdf = dask_geopandas.from_wkt(ddf, geometry_wkt='wkt', crs="EPSG:4326")

Does this all make sense? Also, If we go for any of the latter two cases, should the interface be applied to dask_geopandas.points_from_xy as well, for sake of consistency?

@jorisvandenbossche
Copy link
Member

Apologies for the slow follow-up!

Yes, the reason I sticked with the dask.dataframe -> gpd.GeoSeries interface was mainly for consistency with dask_geopandas.points_from_xy which I took inspiration from

Yes, that's a good point. points_from_xy has the constraint that it has multiple input columns, though, which we don't have for from_wkb, so we have a bit more flexibility.

But if it returns a GeoSeries, we could also let it accept a Series (and the user can do the column selection themselves)

So this would change to allow for something like:

gdf = dask_geopandas.from_dask_dataframe(
    ddf, geometry=dask_geopandas.from_wkt(ddf["wkt"], crs="EPSG:4326")
)

I think this feels the most natural API to me / most consistent with how it works in geopandas.

@giorgiobasile
Copy link
Contributor Author

Apologies for the slow follow-up!

No problem!

I think this feels the most natural API to me / most consistent with how it works in geopandas.

Ok sure, I just updated it accordingly.

@jorisvandenbossche jorisvandenbossche changed the title Add from_wkt and from_wkb tools ENH: Add from_wkt and from_wkb tools Jun 24, 2024
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, looks good!

@jorisvandenbossche jorisvandenbossche merged commit 7679bf9 into geopandas:main Jun 24, 2024
12 checks passed
@jorisvandenbossche
Copy link
Member

Thanks @giorgiobasile!

@giorgiobasile
Copy link
Contributor Author

Thanks @giorgiobasile!

@jorisvandenbossche thank you for your suggestions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create Dask GeoDataFrame from wkt string
2 participants