# Plotting extended

<img alt="matplotlib topo" align="right" style="width:40%" src="https://matplotlib.org/3.1.1/_images/sphx_glr_topographic_hillshading_001.png">

Not all datasets fit into simple scatter, line or histogram plots. Python still has your back.
Python is also good for plotting maps, although that is a little more advanced and we won't
discuss it here.  The best library for this is [cartopy](https://scitools.org.uk/cartopy/docs/latest/).



Here we will cover:
1. Density plotting with colourmaps - avoid overplotting
2. Multi-axis plotting: scatter plotting geochemical data
3. Circular plotting (rose/wind plots): plotting focal mechanisms from the NZ Moment Tensor database
4. 3D plotting

For more examples of what matplotlib can do, have a look at their
[gallery](https://matplotlib.org/3.1.1/gallery/index.html). The image on the
right is generated with matplotlib, and the [source code is here](https://matplotlib.org/3.1.1/gallery/specialty_plots/topographic_hillshading.html#sphx-glr-gallery-specialty-plots-topographic-hillshading-py).

In [1]:
%matplotlib widget

## Density plotting

Overplotting data can lead to misinterpretation.  Overplotting is when you have
many overlapping points on a plot. A simple example is a 2-D normal distribution
sampled at discrete points:

In [2]:
import numpy as np
import matplotlib.pyplot as plt

# Set a random state so that I can test the output
np.random.seed(42)

n = 100000
x = np.random.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n)

fig, ax = plt.subplots()
ax.scatter(x, y)
plt.show()


There are *many* overlapping points here.  It would be better to represent these data by their
density.  Matplotlib offers a few ways to do this; one of the simplest is `hexbin`, which
grids the data into hexagonal bins and uses colour to represent the number of points within
each bin:

In [3]:
fig, ax = plt.subplots()
collection = ax.hexbin(x, y)
colorbar = fig.colorbar(collection)
colorbar.set_label("Count")


If you find yourself with a dense 2D dataset, plotting the density of points can be really helpful!
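The default `hexbin` settings can hide structure in the sparse tails of a distribution. The sketch below shows two options that may help: `gridsize` sets the number of hexagons across the x-direction, and `bins="log"` puts the colours on a logarithmic scale. The data are regenerated here (same distribution as above, but a different random draw) so the cell runs on its own; the particular values of `gridsize` and `mincnt` are illustrative choices, not recommendations.

```python
import numpy as np
import matplotlib.pyplot as plt

# Same distribution as the example above, regenerated so this cell is standalone
rng = np.random.default_rng(42)
n = 100000
x = rng.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * rng.standard_normal(n)

fig, ax = plt.subplots()
# Finer hexagons, log colour scale, and leave hexagons with no points blank
hexes = ax.hexbin(x, y, gridsize=50, bins="log", mincnt=1)
colorbar = fig.colorbar(hexes)
colorbar.set_label("Count (log scale)")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

A log scale is worth trying whenever a few very dense bins wash out everything else.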

## Multi-axis plotting

### Subplots

We keep using the:
```python
fig, ax = plt.subplots()
```
syntax for starting a plotting session, but we haven't explored the strength of subplots.
As the name suggests, subplots let you make multiple plots in one.  To explore this we will
use a dataset provided by Colin Wilson that contains a range of geochemical data. These data are 
from the Huckleberry Ridge Tuff and were published in a paper by [Swallow et al., 2019](https://academic.oup.com/petrology/article/60/7/1371/5524670). 

We will read it
in using pandas and make some scatter plots (as any good/bad/indifferent geochemist would/should).

In [11]:
import pandas as pd

geochem = pd.read_csv(
    "data/Edited Swallow et al J Petrol data for plotting.csv",
    index_col="Sample")
print(geochem)

         SiO2  TiO2  Al2O3  Fe2O3 (T)   MnO   MgO   CaO  Na2O   K2O   P2O5  \
Sample                                                                       
YP114   72.98  0.26  13.56       2.96  0.05  0.08  0.82  4.05  5.22  0.040   
YP307   76.50  0.13  11.85       1.70  0.04  0.08  1.15  2.87  5.67  0.020   
YP359   74.56  0.17  12.83       2.06  0.04  0.23  1.28  2.95  5.86  0.030   
YP363   76.61  0.12  12.06       1.73  0.04  0.04  0.69  2.78  5.94  0.010   
YP414   76.08  0.14  12.14       1.79  0.04  0.06  0.96  2.77  5.99  0.020   
...       ...   ...    ...        ...   ...   ...   ...   ...   ...    ...   
YP564   76.64  0.17  12.23       2.13  0.02  0.01  0.37  3.30  4.89  0.030   
YP603   76.93  0.19  11.77       2.29  0.04  0.01  0.39  3.48  4.58  0.020   
YP081   75.68  0.10  12.81       1.84  0.04  0.14  0.68  3.34  5.35  0.021   
YP133   76.69  0.10  12.04       1.55  0.02  0.25  1.45  3.03  4.85  0.020   
YP600   76.39  0.10  12.54       1.59  0.03  0.06  1.01  2.84  5

Let's make a plot of P2O5 against SiO2:

In [12]:
fig, ax = plt.subplots()
ax.scatter(geochem["SiO2"], geochem["P2O5"])


<matplotlib.collections.PathCollection at 0x7f8bdebfbd10>
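So far we have only asked `plt.subplots()` for a single `Axes`. As a minimal sketch of what multiple panels might look like, the cell below stacks three scatter plots that share an x-axis. To keep it runnable without the CSV file it uses a hypothetical stand-in table, `demo`, filled with random numbers; only the column names are borrowed from the printout above, and the values are not real geochemistry.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the geochem table: random values, NOT real data
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "SiO2": rng.uniform(72.0, 78.0, 50),
    "TiO2": rng.uniform(0.1, 0.3, 50),
    "K2O": rng.uniform(4.5, 6.0, 50),
    "P2O5": rng.uniform(0.01, 0.04, 50),
})

oxides = ["TiO2", "K2O", "P2O5"]
# One row per oxide; sharex=True links the SiO2 axis across all panels
fig, axes = plt.subplots(nrows=len(oxides), sharex=True, figsize=(6, 8))
for ax, oxide in zip(axes, oxides):
    ax.scatter(demo["SiO2"], demo[oxide], s=10)
    ax.set_ylabel(oxide)
# Only the bottom panel needs an x-label because the x-axis is shared
axes[-1].set_xlabel("SiO2")
fig.tight_layout()
```

With the real data loaded, replacing `demo` with `geochem` should give the equivalent figure, assuming those columns are present.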

### Sharing axes

Sometimes it is useful to show several datasets on one set of axes.  Matplotlib can add a second
y-axis that shares the same x-axis, which lets us plot data with different scales on one figure.
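One way to get a second y-axis is `Axes.twinx()`, which creates a new `Axes` overlaid on the first and sharing its x-axis. The cell below is a sketch using made-up curves (`small` and `large` are hypothetical names and values chosen only because their scales differ by orders of magnitude).

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: two quantities on very different scales
t = np.linspace(0.0, 10.0, 200)
small = np.sin(t)
large = 1000.0 * np.exp(0.3 * t)

fig, ax_left = plt.subplots()
# twinx() returns a second Axes drawn on top of the first, sharing the x-axis
ax_right = ax_left.twinx()

ax_left.plot(t, small, color="tab:blue")
ax_left.set_ylabel("small", color="tab:blue")
ax_right.plot(t, large, color="tab:red")
ax_right.set_ylabel("large", color="tab:red")
ax_left.set_xlabel("t")
```

Colouring each y-label to match its line helps the reader see which scale belongs to which dataset.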

## Circular plotting

### New Zealand Moment Tensor database

To demonstrate plotting circular values we are
going to play around with the New Zealand Centroid Moment Tensor database, maintained 
by John Ristau of GNS.  This dataset is publicly available
on the [GeoNet github page](https://github.com/GeoNet/data). Centroid Moment Tensors are a little
like focal mechanisms: they are a way of modeling the faulting style of an earthquake.  They are
a little more complex than focal mechanisms because they allow for *non-double couple* forces, and
so can also describe explosions and implosions and any combination thereof.

To start off, we will write a little function to download the data from the website and 
read it into a pandas dataframe. We only care about the column `"strike1"` for this example,
but feel free to explore the database more at your leisure.

In [3]:
import requests
import pandas as pd

def get_geonet_cmt():
    """Download the GeoNet CMT catalogue, save it to the data directory and read it."""
    response = requests.get(
        "https://raw.githubusercontent.com/GeoNet/data/master/"
        "moment-tensor/GeoNet_CMT_solutions.csv")
    # Stop here on a failed download rather than saving an error page as CSV
    response.raise_for_status()
    with open("data/GeoNet_CMT_solutions.csv", "wb") as f:
        f.write(response.content)
    return pd.read_csv("data/GeoNet_CMT_solutions.csv", parse_dates=["Date"])

In [4]:
cmt_solutions = get_geonet_cmt()
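With the strikes in hand, a rose plot is essentially a histogram drawn on polar axes. The sketch below uses hypothetical, randomly generated strikes so that it runs without a download. It assumes strikes are given in degrees from 0 to 360, measured clockwise from north; check that against the catalogue before passing in `cmt_solutions["strike1"]`. The 10-degree bin width is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical strikes in degrees (assumed 0-360, clockwise from north);
# swap in cmt_solutions["strike1"] to plot the real catalogue
rng = np.random.default_rng(1)
strikes = rng.normal(220.0, 30.0, 500) % 360.0

# Count strikes in 10-degree bins
bin_width = 10.0
edges = np.arange(0.0, 360.0 + bin_width, bin_width)
counts, _ = np.histogram(strikes, bins=edges)
# Polar axes expect angles in radians; place each bar at its bin centre
centres = np.deg2rad(edges[:-1] + bin_width / 2.0)

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.set_theta_zero_location("N")  # put 0 degrees at the top
ax.set_theta_direction(-1)       # make angles increase clockwise
ax.bar(centres, counts, width=np.deg2rad(bin_width), edgecolor="k")
```

Setting the zero location and direction makes the plot read like a compass, which is the usual convention for strike.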

## 3D plotting

Some data are best in 3-D! 
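As a minimal sketch of how a 3-D plot can be set up, the cell below creates an `Axes` with `projection="3d"` and scatters a hypothetical point cloud, colouring the points by their z-value. The data are random numbers invented for illustration. The `mpl_toolkits.mplot3d` import is only there to register the 3-D projection on older matplotlib versions; recent versions should not need it.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers "3d" on older matplotlib)

# Hypothetical point cloud: random positions, NOT real data
rng = np.random.default_rng(2)
x = rng.uniform(-10.0, 10.0, 300)
y = rng.uniform(-10.0, 10.0, 300)
z = -rng.uniform(0.0, 30.0, 300)  # negative values, so points plot downwards

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
# Colour by z so the third dimension is readable from any viewing angle
points = ax.scatter(x, y, z, c=z, s=8)
fig.colorbar(points, ax=ax).set_label("z")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
```

With the `widget` backend used in this notebook, the resulting axes can be rotated with the mouse.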