In [None]:
import pandas as pd
import geopandas as gpd

# `geopandas == pandas + geometry`

In [None]:
sa1_data = pd.read_csv("sa1-wellington.csv")
sa1_data

In [None]:
sa1_geoms = gpd.read_file("sa1-wellington.gpkg")
sa1_geoms

In [None]:
sa1_geoms.plot()

In [None]:
sa1_geoms.merge(sa1_data)

Oh dear... what went wrong? This is a pretty common problem, especially with New Zealand Census data: one of our tables stores the SA1 ids as numbers, while the other stores them as 'objects' (in other words, strings), so the two id columns can't be matched against each other.

In [None]:
sa1_data.SA12018_V1_00.dtype, sa1_geoms.SA12018_V1_00.dtype

To fix this problem we need to convert one of the columns so that the two types match.

In [None]:
sa1_data.SA12018_V1_00 = sa1_data.SA12018_V1_00.astype(str)

And now the join operation works fine:

In [None]:
welly_census = sa1_geoms.merge(sa1_data)
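To see the mismatch and the fix in isolation, here is a minimal sketch on two toy tables. The column name `SA1_id`, the id values, and the counts are made up for illustration; they are not taken from the census files.

```python
import pandas as pd

# Hypothetical stand-ins for the two tables: one holds the ids as
# integers, the other holds the same ids as strings.
data = pd.DataFrame({"SA1_id": [7000001, 7000002], "pop": [120, 95]})
geoms = pd.DataFrame({"SA1_id": ["7000001", "7000002"], "name": ["a", "b"]})

# Merging on key columns of different types is expected to fail.
try:
    geoms.merge(data)
except ValueError as err:
    print(f"merge failed: {err}")

# Convert the integer ids to strings so the key types match.
data["SA1_id"] = data["SA1_id"].astype(str)

# validate="one_to_one" makes pandas complain if an id is duplicated
# in either table, which is a cheap sanity check on the join.
merged = geoms.merge(data, on="SA1_id", validate="one_to_one")
print(merged)
```

Naming the key with `on=` is optional here, since `merge` defaults to the columns the two tables share, but it makes the intent of the join explicit.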

And we can make a map!

In [None]:
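# Colour each SA1 by its CURPop value, with thin black outlines.
# Note: k (the number of classes) only takes effect when a
# classification scheme is also passed via the scheme argument.
ax = welly_census.plot(
    column="CURPop", cmap="Reds", k=9,
    ec="k", lw=0.25, figsize=(8, 8))
ax.set_axis_off()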

## Before we all get too excited
Some background on `geopandas`. 

In essence, `geopandas` simply adds two classes to `pandas`: `GeoSeries` and `GeoDataFrame`. A `GeoSeries` is a `pandas` `Series` that contains geometries, and also knows which coordinate reference system they are in. A `GeoDataFrame` is a `pandas` `DataFrame` that can contain one (or more) columns that are `GeoSeries`. Usually the geometry column will be called `geometry` or `geom`.
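As a rough illustration of how these pieces fit together, the sketch below builds a tiny `GeoDataFrame` by hand. The point coordinates and names are invented, and EPSG:2193 (NZTM) is an assumed coordinate reference system chosen only for the example.

```python
import geopandas as gpd
from shapely.geometry import Point

# A made-up two-row table with one geometry per row.
gdf = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(0, 0), Point(1, 1)],
    crs="EPSG:2193",  # assumed CRS, for illustration only
)

# The geometry column is a GeoSeries that carries the CRS with it.
print(type(gdf.geometry).__name__)
print(gdf.geometry.name)
print(gdf.geometry.crs)

# The non-geometry columns behave like ordinary pandas columns.
print(gdf["name"].str.upper())
```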

Let's take a look at the `GeoSeries` in this dataset.

In [None]:
welly_census.geometry

OK... that's not hugely informative. What about a single (multi)polygon?

In [None]:
welly_census.geometry[0]

This is the `shapely` module's slightly silly way of showing us a polygon (or any other geometry for that matter). `shapely` is the underlying package on which `geopandas`'s handling of geometry is based. To get a better idea of what's going on we can `print` a geometry.

In [None]:
print(welly_census.geometry[0])

If we want to look closer still we can use the [`shapely` API](https://shapely.readthedocs.io/) to interrogate a geometry further. For example:

In [None]:
list(welly_census.geometry[0].geoms[0].exterior.coords)

or

In [None]:
welly_census.geometry[0].area

or even

In [None]:
welly_census.geometry[0].buffer(100)

But delving deeply into the details of how geometries are handled in `geopandas` is beyond the scope of these sessions. Suffice it to say that you can dig into the details of individual geometries, pick them apart, and rebuild them if needed (and if you know what you are doing).
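For a sense of what picking a geometry apart and rebuilding it can look like, here is a small sketch using `shapely` directly. It works on a made-up square rather than on the census data, so the numbers are easy to check by eye.

```python
from shapely.geometry import MultiPolygon, Polygon

# A made-up 2 x 2 square, wrapped in a MultiPolygon to mirror the
# (multi)polygons in the census layer.
square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
multi = MultiPolygon([square])

# Pick it apart: first member polygon, then its outer ring coordinates.
first = multi.geoms[0]
coords = list(first.exterior.coords)
print(coords)

# Rebuild a polygon from those coordinates, here in reverse order.
rebuilt = Polygon(coords[::-1])
print(rebuilt.area)

# Buffering returns a new, larger geometry; the original is unchanged.
buffered = multi.buffer(1)
print(buffered.area > multi.area)
```

Note that the ring is closed, so the first coordinate is repeated at the end of `coords`.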

It's much more likely that you will apply geometric operations to whole collections of geometries held in a `GeoDataFrame`. In that context, the handling of coordinate reference systems is perhaps of more interest.

In [None]:
welly_census.crs

In [None]:
welly_census.to_crs(3857).crs

Projecting data into a new coordinate reference system really is that simple!
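One detail worth knowing is that `to_crs` returns a new object and leaves the original untouched. The sketch below shows this on a single made-up point; the longitude and latitude are only an approximate location in Wellington, and EPSG:2193 (NZTM) is an assumed target chosen for the example.

```python
import geopandas as gpd
from shapely.geometry import Point

# One point in lon/lat (EPSG:4326); the coordinates are approximate
# and used purely for illustration.
pts = gpd.GeoSeries([Point(174.7762, -41.2865)], crs="EPSG:4326")

# Reproject to NZTM. The result is a new GeoSeries.
nztm = pts.to_crs(2193)

print(pts.crs.to_epsg(), pts.iloc[0])
print(nztm.crs.to_epsg(), nztm.iloc[0])
```

To keep the reprojected version, assign the result to a name, as done with `nztm` here.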

In the next notebook, we'll make some maps.