Skip to content
Faster file-based I/O for geometry data with Geopandas
Python
Branch: master
Clone or download
Latest commit a45d3d2 Oct 4, 2019
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
geofeather Functional implementation, minimal tests Oct 3, 2019
tests Added tests and benchmarks for lines and polygons Oct 4, 2019
.gitignore
.travis.yml Functional implementation, minimal tests Oct 3, 2019
LICENSE Initial commit Oct 3, 2019
Pipfile Functional implementation, minimal tests Oct 3, 2019
Pipfile.lock Functional implementation, minimal tests Oct 3, 2019
README.md typo fix in README Oct 4, 2019
setup.py Functional implementation, minimal tests Oct 3, 2019

README.md

geofeather

Build Status Coverage Status

A faster file-based format for geometries with geopandas.

This project capitalizes on the very fast feather file format to store geometry (points, lines, polygons) data for interoperability with geopandas.

Introductory post.

Why does this exist?

This project exists because reading and writing standard spatial formats (e.g., shapefile) in geopandas is slow. I was working with millions of geometries in multiple processing steps, and needed a fast way to read and write intermediate files.

In our benchmarks, we see about 5-6x faster file writes than writing from geopandas to shapefile via .to_file() on a GeoDataFrame.

We see about 2x faster reads compared to geopandas read_file() function.

How does it work?

The feather format works brilliantly for standard pandas data frames. In order to leverage the feather format, we simply convert the geometry data from shapely objects into Well Known Binary (WKB) format, and then store that column as raw bytes.

We store the coordinate reference system using JSON format in a sidecar file .crs.

Installation

Available on PyPi at: https://pypi.org/project/geofeather/

pip install geofeather

Usage

Write

Given an existing GeoDataFrame my_gdf, pass this into to_geofeather:

to_geofeather(my_gdf, 'test.feather')

Read

my_gdf = from_geofeather('test.feather')

Indexes

Right now, indexes are not supported in feather files. In order to get around this, simply reset your index before calling to_geofeather.

You can’t perform that action at this time.