ENH: (optionally) use pygeos for vectorized GeometryArray operations (#…
jorisvandenbossche committed Mar 24, 2020
1 parent 6b037d0 commit 5d1181a
Showing 20 changed files with 1,333 additions and 496 deletions.
19 changes: 11 additions & 8 deletions .travis.yml
@@ -8,18 +8,18 @@ matrix:
- env: ENV_FILE="ci/travis/35-minimal.yaml"

# Python 3.6 test all supported Pandas versions
- env: ENV_FILE="ci/travis/36-pd023.yaml"
- env: ENV_FILE="ci/travis/36-pd024.yaml"
- env: ENV_FILE="ci/travis/36-pd023.yaml" PYGEOS=true
- env: ENV_FILE="ci/travis/36-pd024.yaml" PYGEOS=true

- env: ENV_FILE="ci/travis/37-latest-defaults.yaml" STYLE=true
- env: ENV_FILE="ci/travis/37-latest-conda-forge.yaml"
- env: ENV_FILE="ci/travis/37-latest-defaults.yaml" STYLE=true PYGEOS=true
- env: ENV_FILE="ci/travis/37-latest-conda-forge.yaml" PYGEOS=true

- env: ENV_FILE="ci/travis/38-latest-conda-forge.yaml"
- env: ENV_FILE="ci/travis/38-latest-conda-forge.yaml" PYGEOS=true

- env: ENV_FILE="ci/travis/37-dev.yaml" DEV=true
- env: ENV_FILE="ci/travis/37-dev.yaml" DEV=true PYGEOS=true

allow_failures:
- env: ENV_FILE="ci/travis/37-dev.yaml" DEV=true
- env: ENV_FILE="ci/travis/37-dev.yaml" DEV=true PYGEOS=true

install:
# Install conda
@@ -46,7 +46,10 @@ install:
- python -c "import geopandas; geopandas.show_versions();"

script:
- py.test geopandas --cov geopandas -v --cov-report term-missing
- echo "Testing without PyGEOS"
- USE_PYGEOS=0 pytest geopandas --cov geopandas -v --cov-report term-missing
- if [ "$PYGEOS" ]; then echo "Testing with PyGEOS"; fi
- if [ "$PYGEOS" ]; then USE_PYGEOS=1 pytest geopandas --cov geopandas -v --cov-report term-missing; fi
- if [ "$STYLE" ]; then black --check geopandas; fi
- if [ "$STYLE" ]; then flake8 geopandas; fi

2 changes: 1 addition & 1 deletion asv.conf.json
@@ -55,7 +55,7 @@
"matrix": {
"pandas": [],
"shapely": [],
"cython": [],
"pygeos": [],
"fiona": [],
"pyproj": [],
"rtree": [],
2 changes: 2 additions & 0 deletions ci/travis/36-pd023.yaml
@@ -11,6 +11,7 @@ dependencies:
- gdal=2.3
- fiona
#- pyproj
- geos
# testing
- pytest
- pytest-cov
@@ -28,3 +29,4 @@ dependencies:
- pyproj==2.3.1
- geopy
- codecov
- git+https://github.com/pygeos/pygeos.git
2 changes: 2 additions & 0 deletions ci/travis/36-pd024.yaml
@@ -8,6 +8,7 @@ dependencies:
- shapely
- fiona=1.7
#- pyproj
- geos
# testing
- pytest
- pytest-cov
@@ -25,3 +26,4 @@ dependencies:
- codecov
- geopy
- mapclassify
- git+https://github.com/pygeos/pygeos.git
2 changes: 2 additions & 0 deletions ci/travis/37-dev.yaml
@@ -9,6 +9,7 @@ dependencies:
- shapely
- fiona
- pyproj
- geos
# testing
- pytest
- pytest-cov
@@ -25,3 +26,4 @@ dependencies:
- codecov
- geopy
- mapclassify
- git+https://github.com/pygeos/pygeos.git
1 change: 1 addition & 0 deletions ci/travis/37-latest-conda-forge.yaml
@@ -8,6 +8,7 @@ dependencies:
- shapely
- fiona
- pyproj
- pygeos
# testing
- pytest
- pytest-cov
2 changes: 2 additions & 0 deletions ci/travis/37-latest-defaults.yaml
@@ -8,6 +8,7 @@ dependencies:
- shapely
- fiona
- pyproj
- geos
# testing
- pytest
- pytest-cov
@@ -24,3 +25,4 @@ dependencies:
- codecov
- geopy
- mapclassify
- git+https://github.com/pygeos/pygeos.git
1 change: 1 addition & 0 deletions ci/travis/38-latest-conda-forge.yaml
@@ -8,6 +8,7 @@ dependencies:
- shapely
- fiona
- pyproj
- pygeos
# testing
- pytest
- pytest-cov
42 changes: 42 additions & 0 deletions doc/source/install.rst
@@ -159,6 +159,46 @@ For plotting, these additional packages may be used:
- `mapclassify`_


Using the optional PyGEOS dependency
------------------------------------

Work is ongoing to improve the performance of GeoPandas. Currently, the
fast implementations of basic spatial operations live in the `PyGEOS`_
package (work is under way to contribute those improvements to Shapely).
Starting with GeoPandas 0.8, you can optionally use these experimental
speedups by installing PyGEOS, either with conda (from the conda-forge
channel) or with pip::

    # conda
    conda install pygeos --channel conda-forge
    # pip
    pip install pygeos

Whether the speedups are used is determined as follows:

- If PyGEOS is installed, it is used by default. Installing GeoPandas does
  not yet install PyGEOS automatically as a dependency; you need to install
  it manually.

- When PyGEOS is available, you can still toggle its use by:

  - Setting an environment variable (``USE_PYGEOS=0/1``). Note that this
    variable is only checked at the first import of GeoPandas.
  - Setting an option: ``geopandas.options.use_pygeos = True/False``. Note
    that although this option can be set during an interactive session, it
    only takes effect for GeoDataFrames created (e.g. by reading a file
    with ``read_file``) after the value was changed.
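
The precedence described in the list above can be sketched in plain Python. This is only an illustration under stated assumptions, not GeoPandas code: ``resolve_use_pygeos`` and its parameters are hypothetical names, and the ``installed`` flag stands in for a successful ``import pygeos``.

```python
def resolve_use_pygeos(installed, env_value=None, explicit=None):
    """Return whether PyGEOS would be used (hypothetical helper).

    installed -- stand-in for "import pygeos succeeded"
    env_value -- stand-in for the USE_PYGEOS environment variable ("0" or "1")
    explicit  -- stand-in for a value assigned to the use_pygeos option
    """
    if explicit is not None:
        # An explicitly set option wins over everything else.
        return bool(explicit)
    use = installed  # default: use PyGEOS when it is installed
    if env_value is not None:
        # The variable is parsed as an integer: "0" disables, "1" enables.
        use = bool(int(env_value))
    return use


print(resolve_use_pygeos(installed=True))                                # True
print(resolve_use_pygeos(installed=True, env_value="0"))                 # False
print(resolve_use_pygeos(installed=True, env_value="0", explicit=True))  # True
```

Because the value goes through ``int()``, a setting such as ``USE_PYGEOS=true`` would raise a ``ValueError`` in this sketch rather than enable the speedups; stick to ``0`` and ``1``.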

.. warning::

    The use of PyGEOS is experimental! Although it passes all tests,
    there might still be issues, and not all GeoPandas functions benefit
    from the speedups yet. Trying it out is very welcome! Any issues you
    encounter (reports of successful usage are interesting too) can be
    reported at https://gitter.im/geopandas/geopandas or
    https://github.com/geopandas/geopandas/issues


.. _PyPI: https://pypi.python.org/pypi/geopandas

.. _GitHub: https://github.com/geopandas/geopandas
@@ -204,3 +244,5 @@ For plotting, these additional packages may be used:
.. _GEOS: https://geos.osgeo.org

.. _PROJ: https://proj.org/

.. _PyGEOS: https://github.com/pygeos/pygeos/
3 changes: 2 additions & 1 deletion geopandas/__init__.py
@@ -1,3 +1,5 @@
from geopandas._config import options # noqa

from geopandas.geoseries import GeoSeries # noqa
from geopandas.geodataframe import GeoDataFrame # noqa
from geopandas.array import points_from_xy # noqa
@@ -12,7 +14,6 @@

import geopandas.datasets # noqa

from geopandas._config import options # noqa

# make the interactive namespace easier to use
# for `from geopandas import *` demos.
80 changes: 80 additions & 0 deletions geopandas/_compat.py
@@ -1,4 +1,5 @@
from distutils.version import LooseVersion
import os
import warnings

import pandas as pd

@@ -9,3 +10,82 @@
PANDAS_GE_024 = str(pd.__version__) >= LooseVersion("0.24.0")
PANDAS_GE_025 = str(pd.__version__) >= LooseVersion("0.25.0")
PANDAS_GE_10 = str(pd.__version__) >= LooseVersion("0.26.0.dev")


# -----------------------------------------------------------------------------
# Shapely / PyGEOS compat
# -----------------------------------------------------------------------------

USE_PYGEOS = None
PYGEOS_SHAPELY_COMPAT = None


def set_use_pygeos(val=None):
    """
    Set the global configuration on whether to use PyGEOS or not.

    The default is to use PyGEOS if it is installed. This can be overridden
    with an environment variable USE_PYGEOS (this is only checked at
    first import and cannot be changed during an interactive session).
    Alternatively, pass a value here to force a True/False value.
    """
    global USE_PYGEOS
    global PYGEOS_SHAPELY_COMPAT

    if val is not None:
        USE_PYGEOS = bool(val)
    else:
        if USE_PYGEOS is None:
            try:
                import pygeos  # noqa

                USE_PYGEOS = True
            except ImportError:
                USE_PYGEOS = False

            env_use_pygeos = os.getenv("USE_PYGEOS", None)
            if env_use_pygeos is not None:
                USE_PYGEOS = bool(int(env_use_pygeos))

    # validate the pygeos version
    if USE_PYGEOS:
        try:
            import pygeos  # noqa

            # validate the pygeos version
            if not str(pygeos.__version__) >= LooseVersion("0.6"):
                raise ImportError(
                    "PyGEOS >= 0.6 is required, version {0} is installed".format(
                        pygeos.__version__
                    )
                )

            # Check whether Shapely and PyGEOS use the same GEOS version.
            # Based on PyGEOS from_shapely implementation.

            from shapely.geos import geos_version_string as shapely_geos_version
            from pygeos import geos_capi_version_string

            # shapely has something like: "3.6.2-CAPI-1.10.2 4d2925d6"
            # pygeos has something like: "3.6.2-CAPI-1.10.2"
            if not shapely_geos_version.startswith(geos_capi_version_string):
                warnings.warn(
                    "The Shapely GEOS version ({}) is incompatible with the GEOS "
                    "version PyGEOS was compiled with ({}). Conversions between both "
                    "will be slow.".format(
                        shapely_geos_version, geos_capi_version_string
                    )
                )
                PYGEOS_SHAPELY_COMPAT = False
            else:
                PYGEOS_SHAPELY_COMPAT = True

        except ImportError:
            raise ImportError(
                "To use the PyGEOS speed-ups within GeoPandas, you need to install "
                "PyGEOS: 'conda install pygeos' or 'pip install pygeos'"
            )


set_use_pygeos()
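
The GEOS compatibility check in `set_use_pygeos` comes down to a string prefix comparison. The following is a minimal standalone sketch of that idea, not the GeoPandas implementation: `geos_versions_compatible` is a hypothetical helper, the first pair of version strings is taken from the comments in the hunk above, and the mismatching string is made up for illustration.

```python
import warnings


def geos_versions_compatible(shapely_geos_version, pygeos_geos_version):
    """Hypothetical helper: True if both libraries report the same GEOS build.

    Shapely's string carries a trailing revision hash, so a prefix test is
    used instead of an equality test.
    """
    compatible = shapely_geos_version.startswith(pygeos_geos_version)
    if not compatible:
        warnings.warn(
            "Shapely GEOS version ({}) differs from the PyGEOS one ({}); "
            "conversions between both will be slow.".format(
                shapely_geos_version, pygeos_geos_version
            )
        )
    return compatible


# Strings in the shape shown in the source comments: same GEOS build.
same = geos_versions_compatible("3.6.2-CAPI-1.10.2 4d2925d6", "3.6.2-CAPI-1.10.2")

# Illustrative mismatch; the warning is captured so it can be counted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    different = geos_versions_compatible("3.8.0-CAPI-1.13.1 ", "3.6.2-CAPI-1.10.2")

print(same, different, len(caught))  # True False 1
```

A mismatch only produces a warning and a flag here, mirroring the hunk above, where `PYGEOS_SHAPELY_COMPAT` is set to `False` instead of raising an error.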
38 changes: 36 additions & 2 deletions geopandas/_config.py
@@ -31,6 +31,8 @@ def __setattr__(self, key, value):
            if option.validator:
                option.validator(value)
            self._config[key] = value
            if option.callback:
                option.callback(key, value)
        else:
            msg = "You can only set the value of existing options"
            raise AttributeError(msg)
@@ -58,7 +60,7 @@ def __repr__(self):
else:
doc_text = u"No description available."
doc_text = indent(doc_text, prefix=" ")
description += doc_text
description += doc_text + "\n"
space = "\n "
description = description.replace("\n", space)
return "{}({}{})".format(cls, space, description)
@@ -100,4 +102,36 @@ def _validate_display_precision(value):
callback=None,
)

options = Options({"display_precision": display_precision})

def _validate_bool(value):
    if not isinstance(value, bool):
        raise TypeError("Expected bool value, got {0}".format(type(value)))


def _default_use_pygeos():
    import geopandas._compat as compat

    return compat.USE_PYGEOS


def _callback_use_pygeos(key, value):
    assert key == "use_pygeos"
    import geopandas._compat as compat

    compat.set_use_pygeos(value)


use_pygeos = Option(
    key="use_pygeos",
    default_value=_default_use_pygeos(),
    doc=(
        "Whether to use PyGEOS to speed up spatial operations. The default is True "
        "if PyGEOS is installed, and follows the USE_PYGEOS environment variable "
        "if set."
    ),
    validator=_validate_bool,
    callback=_callback_use_pygeos,
)


options = Options({"display_precision": display_precision, "use_pygeos": use_pygeos})
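
The validator-then-callback flow added to `__setattr__` can be tried out in isolation. The sketch below is a simplified stand-in and not the GeoPandas classes: only part of the real `Options` class is visible in this diff, so the `Option` namedtuple, the reduced `Options` class and the `calls` list are assumptions made for illustration.

```python
from collections import namedtuple

# Simplified stand-in for the real Option container (assumed field layout).
Option = namedtuple("Option", "key default_value doc validator callback")


class Options:
    def __init__(self, options):
        # Bypass our own __setattr__ while setting up internal state.
        super().__setattr__("_options", options)
        super().__setattr__(
            "_config", {key: opt.default_value for key, opt in options.items()}
        )

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails.
        try:
            return self._config[key]
        except KeyError:
            raise AttributeError("No such option: {}".format(key))

    def __setattr__(self, key, value):
        if key not in self._options:
            raise AttributeError("You can only set the value of existing options")
        option = self._options[key]
        if option.validator:
            option.validator(value)  # raises before anything is stored
        self._config[key] = value
        if option.callback:
            option.callback(key, value)  # runs only after validation passed


def _validate_bool(value):
    if not isinstance(value, bool):
        raise TypeError("Expected bool value, got {0}".format(type(value)))


calls = []  # records callback invocations for the demo

options = Options(
    {
        "use_pygeos": Option(
            key="use_pygeos",
            default_value=True,
            doc="demo option",
            validator=_validate_bool,
            callback=lambda key, value: calls.append((key, value)),
        )
    }
)

options.use_pygeos = False  # valid: stored, then the callback fires
try:
    options.use_pygeos = "yes"  # invalid: rejected before the callback
except TypeError:
    pass

print(options.use_pygeos, calls)  # False [('use_pygeos', False)]
```

Running the validator first means a rejected value never reaches the callback, so in the real code `compat.set_use_pygeos` would only ever see a proper `bool`.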
