| [**Overview**](./00_overview.ipynb) | [Getting Started](./01_jupyter_python.ipynb) | **Examples:** | [Access](./02_accessing_indexing.ipynb) | [Transform](./03_transform.ipynb) | [Plotting](./04_simple_vis.ipynb) | [Norm-Spiders](./05_norm_spiders.ipynb) | [Minerals](./06_minerals.ipynb) | [lambdas](./07_lambdas.ipynb) | [CIPW](./08_CIPW_Norm.ipynb) | [Lattice Strain](./09_lattice_strain.ipynb) | **Extensions:** | [ML](./11_geochem_ML.ipynb) | [Spatial Data](./12_spatial_geochem.ipynb) |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |

## Getting Started with Geopandas


In this notebook we'll work with geochemistry in a spatial context, mainly using [`geopandas`](https://geopandas.org/en/stable/). We'll also look at how to bring some *simple* interactivity to your `matplotlib` figures, which applies equally well to non-spatial cases.

In [None]:
import contextily as cx
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import pyrolite

In [None]:
gdf = gpd.read_file(
    "../data/regolith/CapricornSoilGeochem.shp"
)  # stores its own coordinate system, should be in crs="epsg:28350"
gdf = gdf.loc[gdf.Easting > 0]  # drop things which don't have locations

In [None]:
gdf.crs

In [None]:
gdf.columns

In [None]:
gdf.columns = [c.replace("_ppm", "") for c in gdf.columns]
gdf.columns

In [None]:
gdf.geometry

In [None]:
gdf.geometry[0].x  # the x coordinate from the first point

In [None]:
gdf.plot()

In [None]:
colour_by = "Al"

In [None]:
gdf.plot(
    c=gdf[colour_by]
)  # plot the data from our dataset, coloured by the column selected

In [None]:
ax = gdf.plot(c=gdf[colour_by])
plt.colorbar(ax.collections[0], label=colour_by)  # add a colourbar for the variable

## Quick Look at the Chemistry

We can have a look at what this chemistry looks like, here normalizing to an upper continental crust reference composition (Rudnick and Gao, 2014) and colouring by size fraction:

In [None]:
gdf.pyrochem.elements.apply(lambda x: np.where(x > 0, x, np.nan)).pyrochem.normalize_to(
    "UCC_RG2014", units="ppm"
).pyroplot.spider(
    figsize=(15, 8),
    c=gdf.Size_fract.apply(lambda x: "" if x is None else x),
    index_order="incompatibility",
    alpha=0.5,
    unity_line=True,
)
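
As a rough sketch of what normalizing to a reference composition amounts to: each element is divided by the reference value for that element, in the same units, so a sample identical to the reference plots along the unity line. The reference values and samples below are made up for illustration (they are *not* the Rudnick and Gao (2014) values), and this only mirrors the idea behind `normalize_to`, not its implementation.

```python
import pandas as pd

# Hypothetical reference values in ppm, for illustration only;
# these are NOT the Rudnick and Gao (2014) values.
reference_ppm = pd.Series({"La": 30.0, "Ce": 60.0, "Nd": 25.0})

# Two made-up samples, also in ppm.
samples_ppm = pd.DataFrame(
    {"La": [15.0, 60.0], "Ce": [60.0, 30.0], "Nd": [25.0, 50.0]},
    index=["sample_a", "sample_b"],
)

# Dividing a DataFrame by a Series along the columns aligns on element names,
# so each element is divided by its own reference value.
normalized = samples_ppm.div(reference_ppm, axis="columns")
print(normalized)
```

Values above 1 are enriched relative to the reference and values below 1 are depleted.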

There is a decent amount of REE data, but the patterns don't look like they'll tell us much in this case (otherwise we could look at e.g. `lambdas`), although there may be some Ce and Eu anomalies in there:

In [None]:
gdf.pyrochem.elements.apply(lambda x: np.where(x > 0, x, np.nan)).pyrochem.normalize_to(
    "UCC_RG2014", units="ppm"
).pyroplot.REE(
    c=gdf.Size_fract.apply(lambda x: "" if x is None else x),
    index_order="incompatibility",
    alpha=0.5,
    unity_line=True,
)

In [None]:
lambdas = gdf.pyrochem.REE.apply(
    lambda x: np.where(x > 0, x, np.nan)
).pyrochem.lambda_lnREE(anomalies=["Ce", "Eu"])
lambdas.iloc[:, -2:].pyroplot.scatter(
    c=gdf.Size_fract.apply(lambda x: "" if x is None else x),
)
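
For comparison, a common hand calculation expresses an anomaly as the ratio of an element's normalized abundance to the geometric mean of its neighbours (e.g. Eu against Sm and Gd, Ce against La and Pr). The sketch below uses that conventional definition with made-up normalized values; it is not necessarily how `lambda_lnREE` derives its anomalies, and the function name is hypothetical.

```python
import math


def geometric_anomaly(element_n, lighter_n, heavier_n):
    """Ratio of a normalized abundance to the geometric mean of its two neighbours."""
    return element_n / math.sqrt(lighter_n * heavier_n)


# Made-up normalized abundances, for illustration only.
eu_anomaly = geometric_anomaly(element_n=0.5, lighter_n=1.0, heavier_n=1.0)  # Eu vs Sm, Gd
ce_anomaly = geometric_anomaly(element_n=4.0, lighter_n=2.0, heavier_n=2.0)  # Ce vs La, Pr

print(eu_anomaly, ce_anomaly)
```

A ratio below 1 indicates a negative anomaly and a ratio above 1 a positive one.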

## Looking at Geochemical PCA in Spatial Context

In [None]:
from sklearn.decomposition import PCA

n_components = 5
pca = PCA(n_components=n_components)

In [None]:
input_df = (
    gdf.drop(columns=["Re"])  # ,"Hg", "Te", 'S', 'Cd', 'Ag'])
    .pyrochem.elements.apply(
        lambda x: np.where(x > 0, x, np.nanmin(x[x > 0] / 3))
    )  # replace non-positive values with a third of the column's smallest positive value (~1/3 of detection limit)
    .pyrochem.normalize_to("UCC_RG2014", units="ppm")
    .dropna(how="all", axis=1)
    .apply(np.log)
)
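
The replacement step inside `input_df` can be looked at in isolation. This is a minimal sketch on a made-up column, assuming that non-positive values encode "below detection"; the helper name `fill_nonpositive` is hypothetical. Note that missing values are filled as well, because `NaN > 0` evaluates to `False`.

```python
import numpy as np


def fill_nonpositive(values, fraction=1 / 3):
    """Replace anything that is not > 0 with a fraction of the smallest positive value."""
    values = np.asarray(values, dtype=float)
    smallest_positive = np.nanmin(values[values > 0])
    return np.where(values > 0, values, smallest_positive * fraction)


# Made-up column: a negative flag, a zero, a missing value and two real measurements.
column = np.array([-1.0, 0.0, np.nan, 3.0, 12.0])
filled = fill_nonpositive(column)
print(filled)
```

Filling with a small positive number keeps every value strictly positive, which the later `np.log` transform requires.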

In [None]:
pca_scores = gpd.GeoDataFrame(
    pca.fit_transform(input_df),
    columns=["PCA{}".format(ix) for ix in range(n_components)],
    index=input_df.index,  # keep the original row labels so the geometry aligns after filtering
    geometry=gdf.geometry,
    dtype="float",
)
pca_scores

In [None]:
pd.DataFrame(
    pca.components_,
    columns=input_df.columns,
    index=["PCA{}".format(ix) for ix in range(n_components)],
    dtype="float",
).pyroplot.spider(
    figsize=(12, 4),
    c=["PCA{}".format(ix) for ix in range(n_components)],
    logy=False,
    index_order="incompatibility",
)

In [None]:
from pyrolite.plot.color import process_color  # bug in geopandas colour processing?

cmap = plt.get_cmap("cividis").copy()

fig, ax = plt.subplots(1, n_components, sharex=True, sharey=True, figsize=(15, 3))
ax = list(ax.flat)
for a, c in zip(ax, pca_scores.columns.tolist()):
    a.set_title(c)
    a = pca_scores.plot(
        ax=a,
        c=process_color(pca_scores[c].values, cmap="cividis")["c"],
    )

## Basemaps with Contextily

In [None]:
ax = gdf.plot(c=gdf[colour_by])
plt.colorbar(ax.collections[0], label=colour_by)
cx.add_basemap(ax, crs=gdf.crs.to_string())  # add a basemap under our dataset

In [None]:
ax = gdf.plot(c=gdf[colour_by])
plt.colorbar(ax.collections[0], label=colour_by)
cx.add_basemap(
    ax, crs=gdf.crs.to_string(), source=cx.providers.Esri.WorldImagery, zoom=10
)  # add a basemap under our dataset, with the ESRI satellite imagery

In [None]:
%matplotlib widget
# use an interactive backend for matplotlib
fig, ax = plt.subplots(1, figsize=(10, 6))  # create a figure with a specific size

gdf.plot(c=gdf[colour_by], marker="D", ax=ax)  # plot our dataset, coloured by the selected column

fig.colorbar(
    ax.collections[0], label=colour_by, shrink=0.8
)  # add a colourbar for the variable

ax.set(
    xlabel="Easting",
    ylabel="Northing",
    aspect="equal",
    xlim=(125000, 800000),
    ylim=(7.1e6, 7.65e6),
)
# modify some of the axis defaults, expand so we have broader context

cx.add_basemap(
    ax,
    crs=gdf.crs.to_string(),
    source=cx.providers.Esri.WorldImagery,
    zoom=10,
    attribution=False,
)  # add a basemap under our dataset

## Exporting for External Use

You can easily re-export the data to the original format (here, `shapefile`), or instead export to a more open, less platform-dependent format like `geopackage` (a single file with spatial information, instead of the multiple files used for `.shp`):

In [None]:
gdf.to_file("../data/regolith/processed_soil_geochem.shp")

In [None]:
gdf.to_file("../data/regolith/processed_soil_geochem.gpkg")

You could download these and open them in e.g. QGIS.
