BUG: .to_parquet unable to write empty GeoDataFrame #3137
Comments
Presumably the main use case for writing an empty Parquet file is to preserve column information and possibly the CRS (if set), rather than to write a truly empty DataFrame (no columns, no CRS, etc.). As a special case for GeoPandas on top of pandas, we also preserve information about the geometry column of the GeoDataFrame. While pandas allows writing empty DataFrames to Parquet, we do not always do so, for a few reasons, some of which may be out of scope for GeoPandas:
So I think that we already support writing an empty GeoDataFrame that includes a geometry column and is compatible with the GeoParquet spec, and the only real place we could improve is to make it clearer that if you call |
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of geopandas.
(optional) I have confirmed this bug exists on the main branch of geopandas.
Code Sample, a copy-pastable example
Generated output (short version):
ValueError: Writing to Parquet/Feather requires string column names
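The original code sample is not preserved on this page. As a minimal sketch, assuming the report refers to a GeoDataFrame created with no arguments (no columns, no geometry, no CRS), a call like the following would be expected to fail on the reported versions (geopandas 0.14.2, pandas 2.1.4). The file name is hypothetical, and the exact exception and message may differ between versions, so the sketch only captures whatever is raised:

```python
import os
import tempfile

import geopandas as gpd

# Assumption: a completely empty GeoDataFrame (no columns, no geometry, no CRS).
empty = gpd.GeoDataFrame()

error_message = None
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "empty.parquet")  # hypothetical file name
    try:
        empty.to_parquet(path)
    except Exception as exc:  # the exact exception type may vary by version
        error_message = f"{type(exc).__name__}: {exc}"

# Prints the captured error, or None if the write succeeded.
print(error_message)
```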
Problem description
I think this capability is useful (or even necessary) for generality. With GeoJSON, for example, an empty GeoDataFrame can easily be serialized and read back. I know that Parquet is column-oriented and, as the output suggests, the frame must have at least one column, so writing it like this works:
But with a set of columns (such as ['data1', 'data2'] or ['data']) that does not include "geometry", it throws this error at the print() call:
Output:
Expected Output
The generation of a file that will be read back as an Empty GeoDataFrame, without the need to specify any column names.
Output of geopandas.show_versions()
SYSTEM INFO
python : 3.11.7 (main, Dec 8 2023, 18:56:58) [GCC 11.4.0]
executable : /home/kaue/opensidewalkmap_beta/.venv/bin/python
machine : Linux-6.2.0-39-generic-x86_64-with-glibc2.35
GEOS, GDAL, PROJ INFO
GEOS : 3.11.2
GEOS lib : None
GDAL : 3.6.4
GDAL data dir: /home/kaue/opensidewalkmap_beta/.venv/lib/python3.11/site-packages/fiona/gdal_data
PROJ : 9.3.0
PROJ data dir: /home/kaue/opensidewalkmap_beta/.venv/lib/python3.11/site-packages/pyproj/proj_dir/share/proj
PYTHON DEPENDENCIES
geopandas : 0.14.2
numpy : 1.26.3
pandas : 2.1.4
pyproj : 3.6.1
shapely : 2.0.2
fiona : 1.9.5
geoalchemy2: None
geopy : None
matplotlib : None
mapclassify: None
pygeos : None
pyogrio : None
psycopg2 : None
pyarrow : 14.0.2
rtree : None
None