A faster file-based format for geometries with geopandas.
This project capitalizes on the very fast
feather file format to store geometry (points, lines, polygons) data for interoperability with geopandas.
Why does this exist?
This project exists because reading and writing standard spatial formats (e.g., shapefile) in
geopandas is slow. I was working with millions of geometries in multiple processing steps, and needed a fast way to read and write intermediate files.
In our benchmarks, we see about 5-6x faster file writes than writing from geopandas to shapefile via .to_file() on a GeoDataFrame.
We see about 2x faster reads compared to reading the same data with geopandas.
How does it work?
The feather format works brilliantly for standard pandas data frames. To leverage it, we convert the geometry data from shapely objects into Well-Known Binary (WKB) format, and then store that column as raw bytes.
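To make the "raw bytes" idea concrete, here is a minimal, dependency-free sketch of what WKB looks like for a 2D point. It is an illustration only, not this library's code: the helpers point_to_wkb and point_from_wkb are hypothetical names, and the sketch assumes little-endian byte order. In practice shapely produces WKB for you through a geometry's wkb attribute.

```python
import struct

# WKB layout for a 2D point (21 bytes in total):
#   1 byte   byte-order flag (1 = little endian)
#   4 bytes  geometry type as uint32 (1 = Point)
#   8 bytes  x as float64
#   8 bytes  y as float64
WKB_POINT = "<BIdd"


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    return struct.pack(WKB_POINT, 1, 1, x, y)


def point_from_wkb(data):
    """Decode little-endian WKB for a 2D point back into (x, y)."""
    byte_order, geom_type, x, y = struct.unpack(WKB_POINT, data)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("this sketch only handles little-endian 2D points")
    return (x, y)


# A "geometry column" becomes a plain list of bytes objects,
# which a columnar format can store as an ordinary binary column.
points = [(1.0, 2.0), (-122.5, 45.5)]
wkb_column = [point_to_wkb(x, y) for x, y in points]
decoded = [point_from_wkb(raw) for raw in wkb_column]
```

Because each geometry is just an opaque byte string, the file format does not need to know anything about points, lines, or polygons.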
We store the coordinate reference system as JSON in a sidecar file.
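The sidecar idea can be sketched with the standard library alone. This is an assumption-laden illustration rather than the library's implementation: the function names and the ".crs.json" suffix are hypothetical, and the CRS is represented as a plain dict.

```python
import json
import os
import tempfile


def write_crs_sidecar(data_path, crs):
    """Write the CRS as JSON next to the data file (hypothetical helper)."""
    sidecar_path = data_path + ".crs.json"  # hypothetical naming scheme
    with open(sidecar_path, "w") as f:
        json.dump(crs, f)
    return sidecar_path


def read_crs_sidecar(data_path):
    """Read the CRS back from the sidecar file written above."""
    with open(data_path + ".crs.json") as f:
        return json.load(f)


# Round trip in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "test.feather")
    crs = {"init": "epsg:4326"}
    sidecar_path = write_crs_sidecar(data_path, crs)
    sidecar_name = os.path.basename(sidecar_path)
    restored_crs = read_crs_sidecar(data_path)
```

Keeping the CRS in a separate file means the data file itself stays a plain table of columns.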
Available on PyPI at: https://pypi.org/project/geofeather/
pip install geofeather
Given an existing GeoDataFrame my_gdf, pass it into to_geofeather to write it to a feather file. To read it back, use from_geofeather:
my_gdf = from_geofeather('test.feather')
Right now, indexes are not supported in
feather files. To work around this, reset your index before calling to_geofeather.
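The index workaround might look like the sketch below. It uses a plain pandas DataFrame as a stand-in for a GeoDataFrame (which inherits reset_index and set_index from pandas), and the site_id index name is a made-up example.

```python
import pandas as pd

# A frame with a meaningful, named index (site_id is a hypothetical name).
df = pd.DataFrame(
    {"value": [10, 20, 30]},
    index=pd.Index(["a", "b", "c"], name="site_id"),
)

# Before writing: move the index into an ordinary column so it is
# stored alongside the other columns instead of being lost.
flat = df.reset_index()

# After reading the file back: restore the index from that column.
restored = flat.set_index("site_id")
```

Naming the index before resetting it gives the resulting column a predictable name to restore from later.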