# Writing efficient code for GeoPandas and Shapely in 2023

With the release of Shapely 2.0, GeoPandas-based code that was optimised years ago may no longer provide the best performance. This workshop will show you how to change that and write efficient, convenient GeoPandas code that takes advantage of the latest developments in the Python geospatial ecosystem.

**Martin Fleischmann, Joris van den Bossche**

08/03/2023, Basel

## Setup

Follow the ReadMe to set up the environment correctly. You should have these packages installed:

```
- geopandas
- pyogrio
- pyarrow
```
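
A quick way to confirm that the environment is ready is to check that each package can be found before running the rest of the notebook. This is a minimal sketch using only the standard library; the helper name `missing_packages` and the `required` list are illustrative and not part of the workshop materials.

```python
from importlib.util import find_spec

# Packages listed in the setup section above.
required = ["geopandas", "pyogrio", "pyarrow"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as missing, revisit the environment setup before continuing.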

## What is GeoPandas?

**Easy, fast and scalable geospatial analysis in Python**

From the docs:

> The goal of GeoPandas is to make working with geospatial data in python easier. It combines the capabilities of pandas and shapely, providing geospatial operations in pandas and a high-level interface to multiple geometries to shapely. GeoPandas enables you to easily do operations in python that would otherwise require a spatial database such as PostGIS.

## How to write efficient code for GeoPandas?

What efficient code looks like evolves as the whole ecosystem constantly develops better, smarter and faster tools. What was considered a good piece of code only a few years ago may not be optimal today.

This notebook contains a set of examples of common tasks. Each shows an approach that was recommended some time ago (by us, by the community, in the documentation, on StackOverflow or elsewhere) alongside the approach that is recommended today, with GeoPandas 0.12 and Shapely 2.0.

In [1]:
import geopandas

In [2]:
geopandas.show_versions()


SYSTEM INFO
-----------
python     : 3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:26:40) [Clang 14.0.6 ]
executable : /Users/martin/mambaforge/envs/geopandas-workshop/bin/python
machine    : macOS-13.2.1-arm64-arm-64bit

GEOS, GDAL, PROJ INFO
---------------------
GEOS       : 3.11.1
GEOS lib   : None
GDAL       : 3.6.2
GDAL data dir: /Users/martin/mambaforge/envs/geopandas-workshop/share/gdal
PROJ       : 9.1.1
PROJ data dir: /Users/martin/mambaforge/envs/geopandas-workshop/share/proj

PYTHON DEPENDENCIES
-------------------
geopandas  : 0.12.2
numpy      : 1.24.2
pandas     : 1.5.3
pyproj     : 3.4.1
shapely    : 2.0.1
fiona      : 1.9.1
geoalchemy2: None
geopy      : None
matplotlib : 3.7.0
mapclassify: 2.5.0
pygeos     : None
pyogrio    : 0.5.1
psycopg2   : None
pyarrow    : 11.0.0
rtree      : 1.0.1


- spatial predicates - use STRtree not binary predicates or sjoin
    - https://stackoverflow.com/questions/48097742/geopandas-point-in-polygon
    - https://stackoverflow.com/questions/62410871/how-do-i-test-if-point-is-in-polygon-multipolygon-with-geopandas-in-python
    - https://gis.stackexchange.com/questions/281652/finding-all-neighbors-using-geopandas
- efficient IO
    - pyogrio, parquet
- pairwise distance (shapely ufunc broadcasting - `shapely.distance(pts, np.reshape(pts, (-1, 1)))`)
    - https://stackoverflow.com/questions/64754025/calculate-all-distances-between-two-geodataframe-of-points-in-geopandas
- folium mapping
    - https://autogis-site.readthedocs.io/en/latest/lessons/lesson-5/interactive-maps.html
- WKT/WKB serialization
    - https://stackoverflow.com/questions/61122875/geopandas-how-to-read-a-csv-and-convert-to-a-geopandas-dataframe-with-polygons
    - https://stackoverflow.com/questions/61125808/geopandas-how-to-convert-the-column-geometry-to-string
- creating point geometry
    - https://stackoverflow.com/questions/50971914/what-is-the-most-efficient-way-to-convert-numpy-arrays-to-shapely-points
- nearest
    - https://stackoverflow.com/questions/30740046/calculate-distance-to-nearest-feature-with-geopandas
    - https://stackoverflow.com/questions/56520780/how-to-use-geopanda-or-shapely-to-find-nearest-point-in-same-geodataframe
- points to lines
    - https://stackoverflow.com/questions/51071365/convert-points-to-lines-geopandas
- specifying a projection (don't use `init...`)
    - https://gis.stackexchange.com/questions/218450/getting-polygon-areas-using-geopandas
- getting coordinates
    - https://gis.stackexchange.com/questions/287306/listing-all-polygon-vertices-coordinates-using-geopandas
- dissolve connected components
    - https://gis.stackexchange.com/questions/271733/geopandas-dissolve-overlapping-polygons
- rounding coordinates (and the issues it may cause)
    - https://gis.stackexchange.com/questions/321518/rounding-coordinates-to-five-decimals-in-geopandas
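
Several of the topics listed above come down to replacing Python-level loops with vectorised Shapely 2.0 functions. The following is a minimal sketch of three of them (creating point geometry, pairwise distance via broadcasting, and a spatial predicate through `STRtree`). The coordinates and the polygon are made up for illustration and are not taken from the workshop data.

```python
import numpy as np
import shapely
from shapely import STRtree

# Hypothetical sample coordinates, chosen only for illustration.
x = np.array([0.0, 3.0, 0.0])
y = np.array([0.0, 0.0, 4.0])

# Creating point geometry: a single vectorised call instead of building
# shapely.Point objects one by one in a Python loop.
pts = shapely.points(x, y)

# Pairwise distance: broadcasting an (n,) array against an (n, 1) array
# gives an (n, n) matrix of distances between all pairs of points.
dist = shapely.distance(pts, np.reshape(pts, (-1, 1)))

# Spatial predicate: build the tree once, then ask which of the indexed
# points the polygon contains. The result holds integer positions in `pts`.
poly = shapely.box(-1, -1, 1, 1)
tree = STRtree(pts)
hits = tree.query(poly, predicate="contains")

print(dist)
print(pts[hits])
```

Only the first point lies inside the box, so `hits` selects that single geometry. On data this small the benefit is invisible; the vectorised forms are meant for arrays with many geometries, where the per-object Python overhead dominates.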