# Exercise 3.2 Mesh plots (cartopy)

Matplotlib offers three functions to plot three-dimensional data (x, y, and a value z) in two dimensions using a colored mesh:

 * pcolormesh
 * pcolor
 * imshow

This is useful for showing gridded model data or observations on their own grid (we will introduce the interpolating functions `contour` and `contourf` later).

We will show the usage of `pcolormesh` in this exercise. This function is recommended over the others because:

 * `imshow` assumes that all data elements in your array are to be rendered at the same size, whereas `pcolormesh`/`pcolor` associate elements of the data array with rectangular elements whose size may vary over the rectangular grid (shamelessly stolen from this [stackoverflow answer](https://stackoverflow.com/a/21169703)).
 * `pcolormesh` is [about 1 to 3 orders of magnitude faster](http://thomas-cokelaer.info/blog/wp-content/uploads/2014/05/pcolor_erformance.png) than `pcolor`.
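
To make the first point concrete, here is a minimal sketch of a grid whose cells differ in size. The edge values are made up for illustration; arrays like these could be passed as `ax.pcolormesh(x_edges, y_edges, data)`, which `imshow` cannot represent because it draws every element at the same size.

```python
import numpy as np

# Hypothetical non-uniform grid: cells get wider in x and taller in y.
x_edges = np.array([0.0, 1.0, 3.0, 6.0, 10.0])  # 5 edges -> 4 columns
y_edges = np.array([0.0, 0.5, 2.0, 4.0])        # 4 edges -> 3 rows

# One value per cell: shape is (rows, columns) = (3, 4).
data = np.arange(12).reshape(3, 4)

# Width and height of every cell follow from the edges.
cell_width = np.diff(x_edges)
cell_height = np.diff(y_edges)

# Area of each rectangle, same shape as the data array.
cell_area = np.outer(cell_height, cell_width)

print(cell_width)
print(cell_height)
print(cell_area.shape == data.shape)
```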

Note that most of what we show here for georeferenced plots also applies to plain (non-georeferenced) `pcolormesh`.

## Import libraries

In [None]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

%matplotlib inline

## Load data

### Function to create artificial data:

In [None]:
# artificial data


def sample_data_3d(nlons, nlats):
    """Returns `lons`, `lats`, and fake `data`

    adapted from:
    http://scitools.org.uk/cartopy/docs/v0.15/examples/axes_grid_basic.html
    """
    
    dlat = 180. / nlats / 2
    dlon = 360. / nlons

    lat = np.linspace(-90 + dlat, 90 - dlat, nlats)   
    lon = np.linspace(0, 360 - dlon, nlons)

    lons, lats = np.meshgrid(np.deg2rad(lon), np.deg2rad(lat))
    wave = 0.75 * (np.sin(2 * lats) ** 8) * np.cos(4 * lons)
    mean = 0.5 * np.cos(2 * lats) * ((np.sin(2 * lats)) ** 2 + 2)
    data = wave + mean
    
    return lon, lat, data

## CMIP 5, historical precipitation climatology (1986 to 2005)

The netCDF file contains the historical and projected climatological precipitation, as well as the relative change between them, from all CMIP5 models for RCP8.5 (Taylor et al., 2012).

The data was prepared in [another notebook](../data/prepare_CMIP5_map.ipynb).

In [None]:
fN = '../data/cmip5_delta_pr_rcp85_map.nc'

# load data, omitting some unnecessary variables
pr = xr.open_dataset(fN, drop_variables=['pr_rel', 'proj', 'agree_sign', 'pval'])

pr

## First pcolormesh plot

`pcolormesh` takes x, y, z as input:

In [None]:
# create sample data
lon, lat, data = sample_data_3d(90, 48)

# ====

ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

ax.pcolormesh(lon, lat, data)

ax.set_global()

### Exercise
 * plot the climatological precipitation amount

In [None]:
# get data
lon, lat, hist = pr.lon, pr.lat, pr.hist

# plot

ax = plt.axes(projection=ccrs.Robinson())

ax.coastlines()

# code here

### Solution

In [None]:
# get data
lon, lat, hist = pr.lon, pr.lat, pr.hist

# plot

ax = plt.axes(projection=ccrs.Robinson())

ax.coastlines()

ax.pcolormesh(lon, lat, hist, transform=ccrs.PlateCarree())

ax.set_global()

This looks all right, but what's with the white stripe?

Commonly, lat and lon specify the center of each gridcell. However, `pcolormesh` assumes that the coordinates specify the edges of the gridcells and *silently truncates the topmost row and the rightmost column* in the plot!
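
The bookkeeping behind this can be sketched with NumPy alone. The snippet does not call matplotlib; it only mimics the behaviour described above for one row of a hypothetical 20° grid, where `lon_centers` and `values` are made-up names.

```python
import numpy as np

# 18 cell centers: 0, 20, ..., 340 (one data value per center)
lon_centers = np.arange(0.0, 360.0, 20.0)
values = np.arange(lon_centers.size)

# If the 18 coordinates are interpreted as edges, they bound only 17 cells.
cell_left = lon_centers[:-1]
cell_right = lon_centers[1:]
n_cells_drawn = cell_left.size

# Only the first 17 values find a cell; the last one is dropped.
drawn = values[:n_cells_drawn]
dropped = values[n_cells_drawn:]

# The cells are also shifted by half a cell: the first one spans
# 0..20 instead of the intended -10..10.
first_cell_span = (cell_left[0], cell_right[0])

print(n_cells_drawn, dropped, first_cell_span)
```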

This becomes more obvious if we have fewer data points.

In [None]:
# create sample data
lon, lat, data = sample_data_3d(nlons=18, nlats=9)

# this is never displayed!
data[:, -1] = 5

# ====

ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

h = ax.pcolormesh(lon, lat, data, transform=ccrs.PlateCarree())

# plot the lat and lon data

lons, lats = np.meshgrid(lon, lat)
ax.plot(lons.flatten(), lats.flatten(), 'o', transform=ccrs.PlateCarree(), ms=4, c='r')

ax.set_global()

The red points show the original lat and lon coordinates - they should be in the center of the gridcells.

Notice how there are only 8 rows and 17 columns displayed! This can be remedied by passing the edges instead of the centers of the gridcells:

In [None]:
print(lon)
print(lat)

In [None]:
# create sample data
lon, lat, data = sample_data_3d(18, 9)

# ====

ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

LON = np.arange(-10, 351, 20)
LAT = np.arange(-90, 91, 20)


h = ax.pcolormesh(LON, LAT, data, transform=ccrs.PlateCarree())

# plot the lat and lon data

lons, lats = np.meshgrid(lon, lat)
ax.plot(lons.flatten(), lats.flatten(), 'o', transform=ccrs.PlateCarree(), ms=4, c='r')

ax.set_global()


# ====

print(LAT.shape, lat.shape)
print(LON.shape, lon.shape)
print(data.shape)


Perfect. Notice how LAT (LON) has one more element than lat (lon) and data!
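
For a regular grid with known spacing, the edges can also be derived from the centers instead of being typed in by hand. The following is a small sketch for the same 18 x 9 grid; the variable names are chosen for illustration only.

```python
import numpy as np

dlon, dlat = 20.0, 20.0

# Cell centers of the 18 x 9 sample grid.
lon_centers = np.arange(0.0, 360.0, dlon)    # 0, 20, ..., 340
lat_centers = np.arange(-80.0, 81.0, dlat)   # -80, -60, ..., 80

# Shift every center by half a cell and append the final upper edge.
lon_edges = np.append(lon_centers - dlon / 2, lon_centers[-1] + dlon / 2)
lat_edges = np.append(lat_centers - dlat / 2, lat_centers[-1] + dlat / 2)

# Sanity check: the midpoints of neighbouring edges are the centers again.
lon_mid = 0.5 * (lon_edges[:-1] + lon_edges[1:])
lat_mid = 0.5 * (lat_edges[:-1] + lat_edges[1:])

print(lon_edges.shape, lon_centers.shape)
print(lat_edges.shape, lat_centers.shape)
print(np.allclose(lon_mid, lon_centers), np.allclose(lat_mid, lat_centers))
```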

### Exercise

 * apply the same correction for the cmip5 precipitation data

In [None]:
print(pr.lon.values)
print('Delta lon:', np.unique(np.diff(pr.lon.values)))

print(pr.lat.values)
print('Delta lat:', np.unique(np.diff(pr.lat.values)))

In [None]:
# get data
lon, lat, hist = pr.lon, pr.lat, pr.hist

# plot

ax = plt.axes(projection=ccrs.Robinson())

ax.coastlines()

# create coordinates of edges

# LON = 
# LAT = 

ax.pcolormesh(lon, lat, hist, transform=ccrs.PlateCarree())

ax.set_global()

### Solution

In [None]:
# get data
lon, lat, hist = pr.lon, pr.lat, pr.hist

# plot

ax = plt.axes(projection=ccrs.Robinson())

ax.coastlines()

# create coordinates of edges

LON = np.arange(0, 361, 2.5)
LAT = np.arange(-90, 91, 2.5)

ax.pcolormesh(LON, LAT, hist, transform=ccrs.PlateCarree())

ax.set_global()

Of course, calculating the edges can be done in a function:

In [None]:
def _infer_interval_breaks(coord):
    """
    >>> _infer_interval_breaks(np.arange(5))
    array([-0.5,  0.5,  1.5,  2.5,  3.5,  4.5])
    """
    coord = np.asarray(coord)
    deltas = 0.5 * (coord[1:] - coord[:-1])
    first = coord[0] - deltas[0]
    last = coord[-1] + deltas[-1]
    return np.r_[[first], coord[:-1] + deltas, [last]]
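
Because the function works with the differences between neighbouring points, it is not restricted to evenly spaced coordinates. The sketch below repeats the same logic under the hypothetical name `infer_breaks` (so the block runs on its own) and applies it to a made-up, unevenly spaced coordinate.

```python
import numpy as np


def infer_breaks(coord):
    """Same logic as `_infer_interval_breaks`, repeated to be self-contained."""
    coord = np.asarray(coord, dtype=float)
    # half the distance between neighbouring centers
    deltas = 0.5 * (coord[1:] - coord[:-1])
    # extrapolate the outermost edges with the nearest half-distance
    first = coord[0] - deltas[0]
    last = coord[-1] + deltas[-1]
    return np.r_[[first], coord[:-1] + deltas, [last]]


# unevenly spaced centers (illustrative values)
coord = np.array([0.0, 1.0, 3.0, 7.0])
breaks = infer_breaks(coord)

print(breaks)
print(breaks.size == coord.size + 1)
```

Each interior break lies halfway between two neighbouring centers, so for uneven spacing a center is generally not in the middle of its own cell.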

In [None]:
# create sample data
lon, lat, data = sample_data_3d(18, 9)

# ====

ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

LON = _infer_interval_breaks(lon)
LAT = _infer_interval_breaks(lat)

h = ax.pcolormesh(LON, LAT, data, transform=ccrs.PlateCarree())

ax.set_global()

I provide an advanced version of this function in `utils.py`:

In [None]:
import utils

from importlib import reload

reload(utils)


In [None]:
# create sample data
lon, lat, data = sample_data_3d(18, 9)

# ====

ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

LON, LAT = utils.infer_interval_breaks(lon, lat)

h = ax.pcolormesh(LON, LAT, data, transform=ccrs.PlateCarree())

### Exercise

 * use `utils.infer_interval_breaks` for the cmip5 precipitation data

In [None]:
# get data
lon, lat, hist = pr.lon, pr.lat, pr.hist

# plot

ax = plt.axes(projection=ccrs.Robinson())

ax.coastlines()

# replace here
LON = np.arange(0, 361, 2.5)
LAT = np.arange(-90, 91, 2.5)

ax.pcolormesh(LON, LAT, hist, transform=ccrs.PlateCarree())

ax.set_global()

### Solution

In [None]:
# get data
lon, lat, hist = pr.lon, pr.lat, pr.hist

# plot

ax = plt.axes(projection=ccrs.Robinson())

ax.coastlines()

LON, LAT = utils.infer_interval_breaks(lon, lat)

ax.pcolormesh(LON, LAT, hist, transform=ccrs.PlateCarree())

ax.set_global()

## lat extends from -90...90

Some models/datasets have lat values that extend from -90 to 90, for example output from CESM (Community Earth System Model) or HadGEM (Hadley Centre Global Environment Model).

It still makes sense to infer the interval breaks because:

 * otherwise we would lose one row of data
 * the lat coordinates may still be the center of the gridcell, except for the two poles (this is e.g. the case for CESM)

### Open random temperature field from CESM

In [None]:
fN = '../data/cesm_temp.nc'

cesm = xr.open_dataset(fN)

cesm.lat

The same problem occurs with this dataset:

In [None]:
ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

ax.pcolormesh(cesm.lon, cesm.lat, cesm.temp, transform=ccrs.PlateCarree())

ax.set_global()

In [None]:
ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

LON, LAT = utils.infer_interval_breaks(cesm.lon, cesm.lat)
ax.pcolormesh(LON, LAT, cesm.temp, transform=ccrs.PlateCarree())

ax.set_global()

This creates a warning (because lat is now outside of the allowed range). We can correct this by clipping the values to the range -90...90.
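
What the clipping does to the inferred edges can be sketched as follows. The helper `infer_breaks_lon_lat` and its `clip` argument are hypothetical stand-ins written for this example; this is not the code from `utils.py`, and the coarse grid is made up.

```python
import numpy as np


def infer_breaks(coord):
    """Edges halfway between neighbouring centers, extrapolated at both ends."""
    coord = np.asarray(coord, dtype=float)
    deltas = 0.5 * np.diff(coord)
    first = coord[0] - deltas[0]
    last = coord[-1] + deltas[-1]
    return np.concatenate([[first], coord[:-1] + deltas, [last]])


def infer_breaks_lon_lat(lon, lat, clip=False):
    """Hypothetical wrapper: infer edges, optionally clip lat to -90...90."""
    lon_edges = infer_breaks(lon)
    lat_edges = infer_breaks(lat)
    if clip:
        lat_edges = np.clip(lat_edges, -90, 90)
    return lon_edges, lat_edges


# coarse grid whose lat centers include both poles
lat = np.linspace(-90, 90, 7)    # -90, -60, ..., 90
lon = np.arange(0.0, 360.0, 60.0)

_, lat_raw = infer_breaks_lon_lat(lon, lat)
lon_edges, lat_clipped = infer_breaks_lon_lat(lon, lat, clip=True)

print(lat_raw)      # outermost edges lie beyond the poles
print(lat_clipped)  # outermost edges are pulled back to -90 and 90
```

Only the two outermost lat edges change; the polar rows simply become half as tall as the other rows.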

In [None]:
np.clip?

In [None]:
ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

LON, LAT = utils.infer_interval_breaks(cesm.lon, cesm.lat, clip=True)

ax.pcolormesh(LON, LAT, cesm.temp, transform=ccrs.PlateCarree())

ax.set_global()

## Bonus: xarray

Until now we used xarray only as a 'data store' and did the plotting as

    ax.pcolormesh(ds.lon, ds.lat, ds.data, ...)

However, `xarray` also has its own dedicated plotting functions, which allow you to write:

    ds.data.plot.pcolormesh(ax=ax, ...)

This plotting function already infers the interval breaks. Note that `xarray` does some additional things under the hood, and cannot apply the clipping of the values.

(There is much more to be said about plotting with xarray; here, I only want to mention the `interval_breaks` thingy.)

In [None]:
ax = plt.axes(projection=ccrs.PlateCarree())

ax.coastlines()

cesm.temp.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())

ax.set_global()

### Exercise

 * plot the cmip5 precipitation data with xarray

In [None]:
# code here

### Solution

In [None]:
ax = plt.axes(projection=ccrs.Robinson())

ax.coastlines()

pr.hist.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())

ax.set_global()