BUG: pd.concat([..], axis=1) fails #1230
Comments
This appears to be causing plotnine to break on 0.6.* versions. For example:

import geopandas
from plotnine import *
ne = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
ggplot() + geom_map(ne)

Raises:

AttributeError: 'GeometryArray' object has no attribute 'view'

Stacktrace:

~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/plotnine/ggplot.py in __repr__(self)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/plotnine/ggplot.py in draw(self, return_ggplot)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/plotnine/ggplot.py in _draw(self, return_ggplot)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/plotnine/ggplot.py in _build(self)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/plotnine/layer.py in setup_data(self)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/plotnine/layer.py in setup_data(self)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/plotnine/geoms/geom_map.py in setup_data(self, data)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/pandas/core/reshape/concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/pandas/core/reshape/concat.py in get_result(self)
~/.virtualenvs/tidytuesday/lib/python3.6/site-packages/pandas/core/internals/managers.py in concatenate_block_managers(mgrs_indexers, axes, concat_axis, copy)
AttributeError: 'GeometryArray' object has no attribute 'view' |
Since GeoPandas allows only one geometry column to be specified, I was wondering what the ideal solution for this would be. I have been working around it by converting to a DataFrame. |
I'd like to work on fixing this. As far as I can see, there are two options
Would appreciate some input on what approach is best, or if there are other alternatives.

from geopandas import GeoDataFrame
from shapely.geometry import Point
import pandas as pd
geoms = [Point(0, 0), Point(1, 1)]
df = pd.DataFrame({"col1": [0, 1], "geometry": geoms})
df['geometry2'] = df['geometry']
df = df.rename(columns={'geometry2': 'geometry'})
gdf = GeoDataFrame(df) |
@m-richards we also have to figure out which of the geometry columns, in the case of unique names, should be set as the active geometry after concat (none?). In any case, I think that raising is a good way forward. If you rename, you again need to figure out which column should be active and which one comes from which gdf. I'd say that leaving that to be resolved by the user is the safer option. |
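To make the "raise and let the user resolve it" suggestion concrete, here is a minimal sketch of the kind of check it implies. It is not geopandas code: `check_geometry_columns` is a hypothetical helper, and it assumes the caller has already collected the geometry column names of each input frame as plain strings.

```python
from collections import Counter


def check_geometry_columns(geometry_names_per_frame):
    """Hypothetical helper (not part of geopandas).

    Takes one list of geometry column names per input frame and returns the
    flat list of names the axis=1 result would contain. Raises ValueError if
    a geometry column name would appear more than once.
    """
    all_names = [name for names in geometry_names_per_frame for name in names]
    counts = Counter(all_names)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(
            "Concatenation would produce duplicate geometry column names: "
            f"{duplicates}. Rename them before concatenating."
        )
    return all_names
```

With a check like this, the duplicate-name case fails at concat time with an explicit message, and choosing the active geometry column is left to the user.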
Currently, the geometry column of the first geodataframe gets set in geopandas/geopandas/geodataframe.py Lines 1385 to 1398 in 64598b6
So it turns out that there's another curious edge case:

In [3]: cities = geopandas.read_file(geopandas.datasets.get_path("naturalearth_cities")).rename_geometry('geom')
In [4]: countries = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres")).rename_geometry('geom')
In [5]: gdf = pd.concat([countries, cities], axis=1)
In [6]: gdf.geometry
Out[6]:
geom geom
0 MULTIPOLYGON (((180.00000 -16.06713, 180.00000... POINT (12.45339 41.90328)
1 POLYGON ((33.90371 -0.95000, 34.07262 -1.05982... POINT (12.44177 43.93610)
2 POLYGON ((-8.66559 27.65643, -8.66512 27.58948... POINT (9.51667 47.13372)
3 MULTIPOLYGON (((-122.84000 49.00000, -122.9742... POINT (6.13000 49.61166)
4 MULTIPOLYGON (((-122.84000 49.00000, -120.0000... POINT (158.14997 6.91664)
.. ... ...
197 None POINT (31.24802 30.05191)
198 None POINT (139.74946 35.68696)
199 None POINT (2.33139 48.86864)
200 None POINT (-70.66899 -33.44807)
201 None POINT (103.85387 1.29498)
[202 rows x 2 columns]

I don't know if that should become a separate issue, but it's certainly not great, because if the geometry column has been set normally, the |
Also, the "use the metadata of the first geodataframe" approach in

In [14]: cities = geopandas.read_file(geopandas.datasets.get_path("naturalearth_cities"))
In [16]: cities2 = cities.to_crs(crs=27700)
In [19]: gdf = pd.concat([cities, cities2], axis=0)
In [20]: gdf.geometry
Out[20]:
0 POINT (12.45339 41.90328)
1 POINT (12.44177 43.93610)
2 POINT (9.51667 47.13372)
3 POINT (6.13000 49.61166)
4 POINT (158.14997 6.91664)
...
197 POINT (3692494.09369 -1687352.07231)
198 POINT (3930565.10961 9764719.36170)
199 POINT (717570.12268 -105552.10645)
200 POINT (-6222461.46130 -12320935.04244)
201 POINT (13067799.34178 13920712.76573)
Name: geometry, Length: 404, dtype: geometry
In [21]: gdf.crs
Out[21]:
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

I think it probably makes sense to open a separate issue for that though. |
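The session above shows rows in EPSG:4326 and EPSG:27700 ending up in one column that reports EPSG:4326. One way to guard against that is to compare the inputs' CRS values before concatenating. The sketch below is only an illustration: `common_crs` is a hypothetical helper, and it assumes each CRS is given as a comparable value such as an authority string.

```python
def common_crs(crs_values):
    """Hypothetical helper (not part of geopandas).

    Returns the single CRS shared by all inputs, or None for empty input.
    Raises ValueError when the inputs do not all share the same CRS.
    """
    unique = []
    for crs in crs_values:
        # Keep first-seen order so the error message lists CRSs as given.
        if crs not in unique:
            unique.append(crs)
    if len(unique) > 1:
        raise ValueError(
            f"Cannot concatenate geometry columns with different CRS: {unique}"
        )
    return unique[0] if unique else None
```

For the example above, `common_crs(["EPSG:4326", "EPSG:27700"])` would raise instead of silently keeping the metadata of the first frame.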
This "works":
But once you do something spatial with the resulting GeoDataFrame (anything that accesses the "geometry column"), things break, because there are two columns named "geometry".
In 0.6.0 this already started failing when doing the concat: