# Exercise 2.1 Mesh plots (cartopy)
prepared by M. Hauser

Here we learn how to plot data on a mesh grid. This is important for *gridded* model data or observations (we will introduce the interpolating functions `contour` and `contourf` in [Exercise 2.2](ex2_2_contour.ipynb)). This exercise shows how to use `pcolormesh`. Other functions offer similar functionality (`pcolor` and `imshow`), but `pcolormesh` is recommended over them.

> Most of what we show here for georeferenced plots also applies to normal usage of `pcolormesh`.

## Goals

 * Know how to plot gridded data on a map using `pcolormesh`
 * Restrict the shown range of the data, including discrete levels
 * Showcase the xarray interface to `pcolormesh`
 * Save figures (including rasterizing and dpi)

## Import libraries

In [None]:
# Specific to our jupyterhub setup
import os
os.environ["PROJ_DATA"] = "/data/python_intro/miniconda/pkgs/proj-9.2.1-ha643af7_0"

In [None]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

In [None]:
import mplotutils as mpu

## First pcolormesh plot

`pcolormesh` takes x, y, z as input. x and y are the coordinates while z determines the color of each pixel.

We showcase `pcolormesh` using artificial [sample data](https://scitools.org.uk/cartopy/docs/latest/gallery/miscellanea/axes_grid_basic.html).


In [None]:
# create sample data
lon, lat, data = mpu.sample_data_map(nlons=90, nlats=45)

In [None]:
print(f"{lon.shape  = }")
print(f"{lat.shape  = }")
print(f"{data.shape = }")

print(f"{lon[:4]    = }")
print(f"{lat[:4]    = }")

`lon` and `lat` are the coordinates and define the centers of the grid cells. Each grid cell has a size of 4° x 4°. We pass `lon`, `lat` and `data` to `pcolormesh` to plot it:

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()))
ax.coastlines()

ax.pcolormesh(lon, lat, data)
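
The same call works on a plain matplotlib axes without a map projection. The following is a minimal sketch, assuming a regular 4° grid; the coordinates and the artificial field are made up for illustration and are not the values returned by `mpu.sample_data_map`. It shows that `x` and `y` are 1D coordinate arrays while `z` is 2D with shape `(len(lat), len(lon))`.

```python
import matplotlib.pyplot as plt
import numpy as np

# cell centers of an assumed regular 4° x 4° global grid
lon = np.arange(-178, 180, 4)  # 90 values
lat = np.arange(-88, 90, 4)  # 45 values

# artificial data with shape (lat, lon)
LON, LAT = np.meshgrid(np.deg2rad(lon), np.deg2rad(lat))
data = np.cos(LAT) * np.sin(3 * LON)

f, ax = plt.subplots()

# shading="auto" treats lon/lat as cell centers if they have
# the same length as the corresponding data dimension
h = ax.pcolormesh(lon, lat, data, shading="auto")
plt.colorbar(h, ax=ax)
```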

## Load CMIP 5 data: historical precipitation climatology (1986 to 2005)

We will load a NetCDF file with historical and projected climatological precipitation, as well as the relative change between them, from all CMIP5 models for RCP8.5 (Taylor et al., [2012](https://doi.org/10.1175/BAMS-D-11-00094.1)).

The data was prepared in [another notebook](../data/prepare_CMIP5_map.ipynb).

In [None]:
file = "../data/cmip5_delta_pr_rcp85_map.nc"

# load data, omitting some unnecessary variables
pr = xr.open_dataset(file, drop_variables=["pr_rel", "proj", "agree_sign", "pval"])

pr

### Exercise

* Plot the climatological precipitation amount (`pr.hist`)
 > Pass `pr.lon`, `pr.lat`, and `pr.hist` to `ax.pcolormesh`.

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

# code here

### Solution

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

ax.pcolormesh(pr.lon, pr.lat, pr.hist, transform=ccrs.PlateCarree())

## Cell centers

matplotlib assumes that the x and y coordinates are the cell centers and calculates the cell boundaries from them (if the number of coordinates is equal to the number of data points).

We can illustrate this best with an example with only a few datapoints. The red points show the original lat and lon coordinates in the center of the gridcells.

In [None]:
# create sample data
lon, lat, data = mpu.sample_data_map(nlons=18, nlats=9)

# ====

f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()))
ax.coastlines()

h = ax.pcolormesh(lon, lat, data, transform=ccrs.PlateCarree(), ec="0.5", lw=0.5)

# plot the gridcell centers

LON, LAT = np.meshgrid(lon, lat)
ax.plot(LON.flatten(), LAT.flatten(), "o", transform=ccrs.PlateCarree(), ms=1, c="r")

> In earlier versions of matplotlib the behavior of `pcolormesh` was different - it removed one row and column of the data, and a white row and column was shown. Thus, if you use earlier versions of matplotlib you will need to take care of this by manually extrapolating the coordinates (check `mpu.infer_interval_breaks?`).
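
To make the relation between cell centers and cell boundaries concrete, here is a minimal numpy sketch for regularly spaced coordinates. The helper `centers_to_edges` is hypothetical and written only for this illustration; it is not the implementation used by matplotlib or by `mpu.infer_interval_breaks`.

```python
import numpy as np


def centers_to_edges(centers):
    """Infer cell edges from 1D cell centers (hypothetical helper)."""
    centers = np.asarray(centers, dtype=float)

    # midpoints between neighbouring centers are the inner edges
    inner = 0.5 * (centers[:-1] + centers[1:])

    # extrapolate by half a cell width at both ends
    first = centers[0] - 0.5 * (centers[1] - centers[0])
    last = centers[-1] + 0.5 * (centers[-1] - centers[-2])

    return np.concatenate([[first], inner, [last]])


# 18 cell centers, 20° apart (values assumed for illustration)
lon = np.arange(-170, 180, 20)
lon_edges = centers_to_edges(lon)

print(lon.shape, lon_edges.shape)
print(lon_edges)
```

For `n` cell centers this yields `n + 1` edges, here running from -180 to 180.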

## Data range, colorbar and color range

By default `pcolormesh` shows the entire range of data - this is often not desirable.

We can set the shown data limits with `vmin` and `vmax`. Because values outside this range are then clipped, we should let the viewers know. We can do this by using the `extend` keyword in the colorbar. It takes the values

 * `'neither'` (default)
 * `'both'`
 * `'min'`
 * `'max'`

**Let's illustrate this with a random temperature field from a climate model**

In [None]:
file = "../data/cesm_temp.nc"
cesm = xr.open_dataset(file)
cesm.temp

In [None]:
f, axs = plt.subplots(2, 1, subplot_kw=dict(projection=ccrs.PlateCarree()))

for ax in axs:
    ax.coastlines()

ax = axs[0]

h = ax.pcolormesh(
    cesm.lon,
    cesm.lat,
    cesm.temp - 273.15,
    transform=ccrs.PlateCarree(),
    cmap="RdBu_r",
)
plt.colorbar(h, ax=ax)
ax.set_title("Entire data range shown")

ax = axs[1]

h = ax.pcolormesh(
    cesm.lon,
    cesm.lat,
    cesm.temp - 273.15,
    transform=ccrs.PlateCarree(),
    vmin=-30,
    vmax=30,
    cmap="RdBu_r",
)
plt.colorbar(h, ax=ax, extend="both")
ax.set_title("Data range restricted and extend='both'")
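
The sketch below illustrates why the colorbar should signal the clipping. It uses matplotlib's `Normalize` and a colormap directly, with made-up temperature values: data outside `vmin`...`vmax` receive the same color as the limits (as long as the colormap's default under and over colors are used), so they cannot be told apart in the plot.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

cmap = plt.get_cmap("RdBu_r")

# linear scaling of the range -30...30 to 0...1
norm = Normalize(vmin=-30, vmax=30)

# values below vmin map to numbers < 0, values above vmax to numbers > 1
print(norm(-45.0), norm(-30.0), norm(30.0), norm(45.0))

# the colormap assigns them the color of the nearest limit
color_below = cmap(norm(-45.0))
color_vmin = cmap(norm(-30.0))
color_above = cmap(norm(45.0))
color_vmax = cmap(norm(30.0))

print(color_below == color_vmin)
print(color_above == color_vmax)
```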

### Exercise
 * Clip the precipitation values to the range 0...3000
 * Indicate that the values extend at the upper bound

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

h = ax.pcolormesh(pr.lon, pr.lat, pr.hist, transform=ccrs.PlateCarree())

plt.colorbar(h)

### Solution

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

h = ax.pcolormesh(
    pr.lon, pr.lat, pr.hist, transform=ccrs.PlateCarree(), vmin=0, vmax=3000
)

plt.colorbar(h, extend="max")

## Color levels

In addition to limiting the color scale, it is also possible to display the data in color levels. To create a discrete color scale instead of a continuous one, we need to pass `norm` to `pcolormesh`. `norm` is a function that normalizes data to the 0.0...1.0 interval. By default it scales the data linearly between the minimum and maximum data values; for discrete levels we need to pass one that maps the data to the levels instead. We also need to pass a changed colormap (`cmap`).

For this, we can make use of a small helper function in mplotutils: `mpu.from_levels_and_cmap`.

> When doing this, we need to specify `extend` in `from_levels_and_cmap` and no longer in the colorbar.

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()))
ax.coastlines()

levels = np.arange(-30, 31, 10)
cmap, norm = mpu.from_levels_and_cmap(levels, cmap="RdBu_r", extend="both")

h = ax.pcolormesh(
    cesm.lon,
    cesm.lat,
    cesm.temp - 273.15,
    transform=ccrs.PlateCarree(),
    norm=norm,
    cmap=cmap,
)
plt.colorbar(h)
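
To see what a level-based `norm` does, here is a minimal sketch that builds one by hand with matplotlib's `BoundaryNorm`. This only illustrates the idea; it is not necessarily how `mpu.from_levels_and_cmap` is implemented, and it omits the `extend` handling. The sample values are made up.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm

levels = np.arange(-30, 31, 10)  # 7 boundaries -> 6 color bins

# colormap with exactly one color per bin
cmap = plt.get_cmap("RdBu_r", len(levels) - 1)

# maps each value to the index of the bin it falls into
norm = BoundaryNorm(levels, ncolors=cmap.N)

values = np.array([-25.0, 5.0, 25.0])
bins = np.asarray(norm(values))
print(bins)

# all values within one bin get the same color
print(cmap(norm(1.0)) == cmap(norm(9.0)))
```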

### Exercise

* Create discrete levels for the precipitation data from 0 to 3000 with a spacing of 500
* The colormap is called `"viridis"`
* Let your viewers know that the colorbar has a restricted range.

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

# levels = ...
# cmap, norm = ...
h = ax.pcolormesh(pr.lon, pr.lat, pr.hist, transform=ccrs.PlateCarree())

plt.colorbar(h)

### Solution

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

levels = np.arange(0, 3001, 500)
cmap, norm = mpu.from_levels_and_cmap(levels, cmap="viridis", extend="max")
h = ax.pcolormesh(
    pr.lon, pr.lat, pr.hist, transform=ccrs.PlateCarree(), cmap=cmap, norm=norm
)

plt.colorbar(h)

## Using xarray

Until now we used xarray only as a 'data store' and did the plotting as

```python
ax.pcolormesh(ds.lon, ds.lat, ds.data, ...)
```
    
However, `xarray` also has its dedicated plotting functions, which allow you to do:

```python
ds.data.plot.pcolormesh(ax=ax, ...)
```

It is good to know how to create the plot directly with matplotlib. However, for daily work, I almost always create the plot with xarray. This simplifies certain aspects of the plot, e.g. you don't need to pass the coordinates explicitly. Under the hood, xarray uses matplotlib to create the figure.

### Example

In [None]:
temp = cesm.temp - 273.15
temp.plot()

### Exercise

* Plot `pr.hist` calling its `.plot` method.

In [None]:
# code here

### Solution

In [None]:
pr.hist.plot()

This does not create a map plot. For this we will need to pass a `GeoAxesSubplot` to the plot method, as `pr.hist.plot(ax=ax)`.

### Exercise

* Create a map plot with a `Robinson` projection.
* Add coastlines.
* Plot the CMIP5 precipitation data with xarray
* Restrict the shown data range using `vmax`
 

In [None]:
# f, ax = ...

### Solution

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

pr.hist.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree(), vmax=3000)

### Differences xarray - matplotlib

Comparing the xarray plot with the one from matplotlib/cartopy, you can see a number of differences.

* xarray automatically chooses the `"RdBu_r"` colormap if the data crosses 0 and the `"viridis"` colormap otherwise.
* If the data crosses 0 the displayed range is symmetric (unless otherwise specified with `vmin` and `vmax`)
* It adds a colorbar. This can be controlled with the `add_colorbar` keyword.
* If you restrict the color range with `vmin` or `vmax`, it automatically adds triangles at the end of the colorbar to indicate that the values were "cut off".
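
These defaults can be checked on a small synthetic field. The sketch below uses made-up data (not the CESM or CMIP5 files) on a plain axes and assumes xarray's default plotting options have not been changed.

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

lon = np.arange(-170, 180, 20)
lat = np.arange(-80, 90, 20)

# synthetic field ranging from -1 to 3, i.e. it crosses 0
values = np.linspace(-1, 3, lat.size * lon.size).reshape(lat.size, lon.size)
da = xr.DataArray(values, coords={"lat": lat, "lon": lon}, dims=("lat", "lon"))

f, axs = plt.subplots(1, 2, figsize=(10, 3))

h_div = da.plot(ax=axs[0])  # data crosses 0
h_seq = (da + 2).plot(ax=axs[1])  # data is all positive

# colormap and color limits chosen by xarray
print(h_div.cmap.name, h_div.get_clim())
print(h_seq.cmap.name, h_seq.get_clim())
```

The first panel should use a diverging colormap with limits symmetric around 0, although the data itself only ranges from -1 to 3.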

## Color levels - xarray

Using the xarray plotting interface you can also directly pass `levels` without using `mpu.from_levels_and_cmap`:

In [None]:
# get data
temp = cesm.temp - 273.15

# plot
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

levels = np.arange(-40, 41, 10)
temp.plot(ax=ax, transform=ccrs.PlateCarree(), levels=levels, extend="both")

### Exercise

* Replace `vmax` with `levels`

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

pr.hist.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree(), vmax=3000)

### Solution

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

levels = np.arange(0, 3001, 500)

pr.hist.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree(), levels=levels)

## Saving figures // rasterized

> This applies to figures created by matplotlib and xarray.

There is nothing special about saving a map figure:

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

pr.hist.plot(transform=ccrs.PlateCarree())

plt.savefig("precip_global.pdf")

However, the files can grow large very quickly, especially if you save a `pcolormesh` figure as pdf: the pdf is saved as a vector graphic and each mesh cell is its own element.

It is, however, possible to rasterize certain elements of the plot, e.g. the `pcolormesh`.

> It is then important that you increase the resolution of the saved figure, e.g. by setting `dpi=300`.

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

pr.hist.plot(transform=ccrs.PlateCarree(), rasterized=True)

plt.savefig("precip_global_rasterized.pdf", dpi=300)

* Compare the file size of `'precip_global.pdf'` and `'precip_global_rasterized.pdf'`.

> The following may not work on Windows.

In [None]:
! ls -lh precip_global*.pdf

* Open `precip_global_rasterized.pdf` and zoom in; you can see that the coastlines are not rasterized.

> Setting `rasterized=True` can help make smaller figures. By setting `dpi` to e.g. `300` you can do so without (much) loss of quality.

Therefore, I recommend always using this option.
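
If `ls` is not available, the file sizes can also be compared from Python. The following is a minimal sketch using plain matplotlib and a synthetic field; the file names and the temporary output folder are assumptions made for this illustration. The resulting sizes depend on the data, the number of grid cells and the `dpi`.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

# smooth synthetic field on a 200 x 200 grid
x = np.linspace(-3, 3, 200)
y = np.linspace(-3, 3, 200)
X, Y = np.meshgrid(x, y)
data = np.exp(-(X**2 + Y**2))

# write the figures to a temporary folder
outdir = tempfile.mkdtemp()

sizes = {}
for rasterized in (False, True):
    f, ax = plt.subplots()
    h = ax.pcolormesh(x, y, data, shading="auto", rasterized=rasterized)

    fname = os.path.join(outdir, f"mesh_rasterized_{rasterized}.pdf")
    f.savefig(fname, dpi=300)
    plt.close(f)

    # file size in bytes
    sizes[rasterized] = os.path.getsize(fname)

print(sizes)
```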


### Another example

We look at another example of the precipitation data. We select a region where the grid cell edges appear tilted in the Robinson projection. Because the edges of the mesh cells are then not vertical (but the pixel edges are), the `dpi` keyword is especially important.



In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

pr.hist.plot.pcolormesh(
    transform=ccrs.PlateCarree(),
    cmap="Blues",
    vmax=2500,
    add_colorbar=False,
)

ax.set_extent([-150, -130, 30, 70], ccrs.PlateCarree())

plt.savefig("precip_detail.pdf")

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

pr.hist.plot.pcolormesh(
    transform=ccrs.PlateCarree(),
    cmap="Blues",
    vmax=2500,
    add_colorbar=False,
    rasterized=True,
)

ax.set_extent([-150, -130, 30, 70], ccrs.PlateCarree())

plt.savefig("precip_detail_rasterized.pdf")

In [None]:
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
ax.coastlines()

pr.hist.plot.pcolormesh(
    transform=ccrs.PlateCarree(),
    cmap="Blues",
    vmax=2500,
    add_colorbar=False,
    rasterized=True,
)

ax.set_extent([-150, -130, 30, 70], ccrs.PlateCarree())

plt.savefig("precip_detail_rasterized_dpi.pdf", dpi=300)

* Open the three pdfs (`precip_detail.pdf`, `precip_detail_rasterized.pdf`, and `precip_detail_rasterized_dpi.pdf`) and compare their quality (zoom in).
* Compare the size of the pdfs.

> The following may not work on Windows.

In [None]:
! ls -lh precip_detail*