Missing real example #12

Closed
deeplook opened this issue Jul 9, 2020 · 2 comments

deeplook commented Jul 9, 2020

I'm trying to evaluate this package by writing a small example: read a simple CSV file with lat/lon columns, add a geometry column, and turn the result into a Dask-GeoDataFrame. However, I'm running into errors like these:

  • AttributeError: 'Series' object has no attribute 'map_partitions'
  • AttributeError: 'DataFrame' object has no attribute 'geometry'

import dask_geopandas
import dask.dataframe as dd
import pandas as pd
import geopandas as gd

df = pd.read_csv('airport_volume_airport_locations.csv')

# ok
gdf = gd.GeoDataFrame(
    df, geometry=gd.points_from_xy(df.Airport1Latitude, df.Airport1Longitude)
)

ddf = dask_geopandas.from_geopandas(df, npartitions=4)
# raises AttributeError: 'DataFrame' object has no attribute 'set_geometry'
ddf.set_geometry(
    dask_geopandas.points_from_xy(ddf, "Airport1Latitude", "Airport1Longitude")
)

jorisvandenbossche commented Jul 9, 2020

@deeplook thanks for taking a look and giving feedback! (and yes, examples are still very scarce ..)

dask_geopandas.from_geopandas expects a GeoDataFrame, but you are passing it a DataFrame (df, not gdf). We should probably check for that up front and raise an error directly, instead of letting it surface as an error in a later operation.
You could pass gdf instead, but then the xy-to-points conversion is of course not done with dask.
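
An up-front type check of the kind suggested here could look roughly like the sketch below. It is not dask_geopandas code: PlainFrame, GeoFrame, and from_geo_frame are hypothetical stand-ins for pandas.DataFrame, geopandas.GeoDataFrame, and dask_geopandas.from_geopandas, chosen so that the snippet runs without any of those libraries installed.

```python
class PlainFrame:
    """Hypothetical stand-in for pandas.DataFrame."""

    def __init__(self, columns):
        self.columns = list(columns)


class GeoFrame(PlainFrame):
    """Hypothetical stand-in for geopandas.GeoDataFrame."""


def from_geo_frame(data, npartitions):
    """Hypothetical stand-in for from_geopandas that fails fast on bad input."""
    if not isinstance(data, GeoFrame):
        # Raise immediately instead of failing later with an AttributeError.
        raise TypeError(
            f"expected a GeoFrame, got {type(data).__name__}; "
            "build the geometry column first"
        )
    # A real implementation would partition the data here.
    return {"data": data, "npartitions": npartitions}


df = PlainFrame(["Airport1Latitude", "Airport1Longitude"])
gdf = GeoFrame(df.columns + ["geometry"])

# Passing the plain frame is rejected right away with a clear message.
try:
    from_geo_frame(df, npartitions=4)
except TypeError as exc:
    error_message = str(exc)

# Passing the geo frame is accepted.
partitioned = from_geo_frame(gdf, npartitions=4)
```

With a check like this, the mistake of passing df instead of gdf is reported at the call site rather than in a later operation.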

For what you are trying to do, which is to create a dask_geopandas.GeoDataFrame from a dask DataFrame rather than from a geopandas.GeoDataFrame, you can do something like:

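# Partition the plain pandas DataFrame with dask.
ddf = dd.from_pandas(df, npartitions=4)
# Build the points lazily; x is longitude and y is latitude.
ddf["geometry"] = dask_geopandas.points_from_xy(ddf, "Airport1Longitude", "Airport1Latitude")
# Convert the dask DataFrame into a dask_geopandas GeoDataFrame.
gddf = dask_geopandas.from_dask_dataframe(ddf)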

The reason this is a bit more complicated than the set_geometry call you tried (which works for plain GeoPandas) is that GeoPandas monkey-patches the pandas DataFrame with a set_geometry method, so that a pandas DataFrame that already has a geometry column can be converted to a GeoDataFrame using that method.
We haven't (yet) done the same in dask_geopandas, though we probably could for consistency with geopandas.
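
To illustrate what monkey-patching means in this context, here is a minimal toy sketch. It does not use pandas or GeoPandas: PlainFrame and GeoFrame are hypothetical stand-ins for pandas.DataFrame and geopandas.GeoDataFrame, and the column contents are made up. The real GeoPandas implementation may differ in its details.

```python
class PlainFrame:
    """Hypothetical stand-in for pandas.DataFrame."""

    def __init__(self, columns):
        self.columns = dict(columns)


class GeoFrame(PlainFrame):
    """Hypothetical stand-in for geopandas.GeoDataFrame."""

    def __init__(self, columns, geometry):
        super().__init__(columns)
        self.geometry_name = geometry


def _set_geometry(self, column):
    """Return a GeoFrame that uses an existing column as its geometry."""
    if column not in self.columns:
        raise KeyError(column)
    return GeoFrame(self.columns, geometry=column)


# Before patching, the plain class has no such method.
had_method_before = hasattr(PlainFrame, "set_geometry")

# The monkey-patch: attach the function to the existing class after the fact.
PlainFrame.set_geometry = _set_geometry

# Any plain frame with a suitable column can now be converted.
frame = PlainFrame({"geometry": ["POINT (1 2)"]})
geo = frame.set_geometry("geometry")
```

A class that has not been patched in this way simply lacks the method, so calling it raises an AttributeError like the ones reported in this issue.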

deeplook commented Jul 9, 2020

Thanks! I'll surely come back with other things... ;)

deeplook closed this as completed Jul 9, 2020