# Getting started with xarray and Matplotlib

This notebook will show how to use xarray’s convenient matplotlib-backed plotting interface to visualize your datasets.

[**xarray**](http://xarray.pydata.org/en/stable/) introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-like multidimensional arrays, which allows for a more intuitive, more concise, and less error-prone developer experience. *Xarray* plotting functionality is a thin wrapper around [**matplotlib**](https://matplotlib.org/), a comprehensive library for creating static, animated, and interactive visualizations in Python.

Let's start by importing the main Python modules we'll use in this notebook:

In [None]:
%matplotlib inline
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib as mpl
import glob
from os.path import expanduser
home = expanduser("~")

### Load data

All the input datasets are located under the ```data``` folder and organized according to the directory structure defined in the [**CMIP6 Data Reference Syntax**](https://pcmdi.llnl.gov/CMIP6/Guide/dataUsers.html). 

In these examples we’ll use a ```tas``` (near-surface air temperature) file, which is part of the ```CMIP6``` dataset produced by ```CMCC``` with the ```CMCC-CM2-SR5``` global coupled general circulation model for the ```ssp585``` experiment (an update of the emission-driven RCP8.5 scenario, based on SSP5).

As shown in [**this**](Quick_Start_intake-esm.ipynb) notebook, you can use ```intake-esm``` to search for and discover data, and to load the desired data assets (NetCDF files) into xarray datasets.

First let's load up a dataset to visualize.

In [None]:
input_file = home+"/data/CMIP6/ScenarioMIP/CMCC/CMCC-CM2-SR5/ssp585/r1i1p1f1/Amon/tas/gn/v20200622/tas_Amon_CMCC-CM2-SR5_ssp585_r1i1p1f1_gn_201501-210012.nc"
ds = xr.open_dataset(input_file).load()
ds

This dataset contains a three-dimensional variable, called ```tas```, with dimensions ```(time, lat, lon)```.

### Basic plotting: .plot()
DataArray objects have a ```plot``` method, which creates plots using matplotlib. By default ```.plot()``` makes
- a line plot for 1-D arrays using ```plt.plot()```

- a pcolormesh plot for 2-D arrays using ```plt.pcolormesh()```

- a histogram for everything else using ```plt.hist()```
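
As a minimal sketch of this dispatch rule, the cell below builds a small synthetic array (random values standing in for ```tas```, not the CMIP6 file) and calls ```.plot()``` on 1-D, 2-D and 3-D selections. The name ```demo``` and the array shape are assumptions made only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr

# Synthetic stand-in for tas (hypothetical values, not the CMIP6 file)
rng = np.random.default_rng(0)
demo = xr.DataArray(
    280 + 10 * rng.random((4, 5, 6)),
    dims=("time", "lat", "lon"),
    name="tas",
)

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3))
lines = demo.isel(lat=0, lon=0).plot(ax=ax1)  # 1-D selection -> line plot
mesh = demo.isel(time=0).plot(ax=ax2)         # 2-D selection -> pcolormesh
hist_out = demo.plot(ax=ax3)                  # 3-D array -> histogram of all values
plt.tight_layout()
```

The return values differ accordingly: a list of lines, a mesh object, and the tuple returned by ```hist()```.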

##### Histograms
```tas``` is three-dimensional, so we get a histogram of temperature values. Notice the label on the x-axis. One of xarray’s convenient plotting features is that it uses the ```attrs``` of ```tas``` to nicely label axes and colorbars.

In [None]:
ds.tas.plot()
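
To see where such labels come from, here is a small sketch with a synthetic 1-D array whose ```long_name``` and ```units``` attributes are made up for the example. After plotting, the axis label can be read back from the matplotlib axes.

```python
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr

# Synthetic 1-D array; the metadata below is invented for illustration
demo = xr.DataArray(
    np.linspace(270.0, 300.0, 12),
    dims="time",
    name="tas",
    attrs={"long_name": "Air temperature", "units": "K"},
)

fig, ax = plt.subplots()
demo.plot(ax=ax)

# For a line plot the data are on the y-axis, so the label built
# from long_name and units ends up there
ylabel = ax.get_ylabel()
print(ylabel)
```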

You can pass extra arguments to the underlying ```hist()``` call. For example, we can change the type of histogram to draw:
- ```bar``` (default) is a traditional bar-type histogram;
- ```barstacked``` is a bar-type histogram where multiple data are stacked on top of each other;
- ```step``` generates a lineplot that is by default unfilled;
- ```stepfilled``` generates a lineplot that is by default filled.

We can define the orientation: ```vertical``` (default) or ```horizontal```.

We can change the ```color```: a color or a sequence of colors, one per dataset.

In [None]:
ds.tas.plot(histtype='step', orientation='horizontal', color='red')

#### 2D plots

Now we will explore 2D plots. Let’s select a single timestep of ```tas``` to visualize. The dataset covers a time period from January 2015 to December 2100 and has a monthly time frequency: so, for example, we can use an index equal to 2 to select the third time point, that is March 2015.  

In [None]:
ds.tas.isel(time=2).plot()
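
Selecting by position with ```isel``` has a label-based counterpart, ```sel```. The sketch below uses a synthetic monthly time axis (an assumption; the real file may use a different calendar and mid-month timestamps) to show that both approaches pick the same March 2015 slice.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly data starting in January 2015 (hypothetical values)
time = pd.date_range("2015-01-01", periods=12, freq="MS")
demo = xr.DataArray(
    np.random.default_rng(0).random((12, 4, 5)),
    coords={"time": time},
    dims=("time", "lat", "lon"),
    name="tas",
)

by_index = demo.isel(time=2)        # third time step, selected by position
by_label = demo.sel(time="2015-03")  # same month, selected by label

# A partial date string can match several time steps, so squeeze()
# is used to drop a length-1 time dimension if one is kept
same = np.array_equal(by_label.squeeze().values, by_index.values)
print(same)
```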

The x- and y-axes are labeled with full names (```Longitude``` and ```Latitude```) along with units. The colorbar has a nice label, again with units, and the title indicates the timestamp of the data presented.

Here is a more complicated figure that explicitly selects the second ```longitude``` value, sets ```time``` as the x-axis, uses a robust color range, and customizes the colorbar.

In [None]:
ds.tas.isel(lon=1).plot(
    x="time",  # Coordinate for x axis
    robust=True,  # Set colormap range to 2nd and 98th percentile of data instead of the extreme values
    cbar_kwargs={
        "orientation": "horizontal",
        "label": "custom label",
        "pad": 0.2,
    },  # passed to plt.colorbar
)
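
The effect of ```robust=True``` can be reproduced by hand. The sketch below, on synthetic all-positive data with one artificial outlier, computes the 2nd and 98th percentiles with NumPy and passes them as ```vmin``` and ```vmax```. It assumes the data do not cross zero, since otherwise xarray may switch to a diverging colormap with symmetric limits.

```python
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr

# Synthetic field with a single extreme value (hypothetical data)
rng = np.random.default_rng(0)
values = 280 + rng.normal(0, 5, (30, 20))
values[0, 0] = 400.0
demo = xr.DataArray(values, dims=("lat", "time"), name="tas")

# Manual equivalent of the robust color limits
vmin, vmax = np.percentile(values, [2, 98])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3))
mesh_robust = demo.plot(ax=ax1, robust=True)
mesh_manual = demo.plot(ax=ax2, vmin=vmin, vmax=vmax)
plt.tight_layout()
```

In both panels the outlier no longer stretches the color scale, because it lies above the upper limit.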

#### 1D line plots
```xarray``` is also able to plot lines by wrapping ```plt.plot()```. As in the earlier examples, the axes are labelled and keyword arguments can be passed to the underlying ```matplotlib``` call.

In the example below, we select a single spatial point, identified by its indices (```lat```, ```lon```), and we slice the time sequence to get every nth timestep (e.g. n = 12, to get every January from 2015 to 2100).

In [None]:
tas1d = ds.tas.isel(lat=10, lon=10).isel(time=slice(None,None,12))
print(tas1d.time.dt.strftime("%Y%m%d %H%M%S").values)
tas1d.plot()

Additional arguments can be passed directly to the matplotlib function. For example, for a 1D array ```.plot()``` calls ```matplotlib.pyplot.plot```, passing in the index and the array values as x and y, respectively. So, to make a thin-line plot with purple triangles, the usual matplotlib keyword arguments can be used:

In [None]:
tas1d.plot(
    color="purple",
    marker="^",
    linewidth=0.2,  # thin line; a number rather than a string
)
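
The same kind of styling can also be expressed with a matplotlib format string passed as a positional argument to ```.plot.line()```. The sketch below uses a synthetic 1-D array and the format string ```"b-^"``` (blue solid line with triangle markers); both are assumptions chosen only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr

# Synthetic 1-D series (hypothetical values)
demo = xr.DataArray(np.linspace(270.0, 290.0, 10), dims="time", name="tas")

fig, ax = plt.subplots()
# "b-^" is forwarded to matplotlib: blue, solid line, triangle-up markers
(line,) = demo.plot.line("b-^", ax=ax, linewidth=0.2)

print(line.get_marker(), line.get_linestyle(), line.get_linewidth())
```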

#### Multiple lines from a 2D DataArray

Now we want to compare line plots of temperature at three different latitudes. We can use the ```hue``` kwarg to do this.

In [None]:
ds.tas.isel(time=1).sel(lat=[40, 50, 60], method="nearest").plot(
    x="lon", hue="lat"
)

Similarly, we can use line plots to check the variation of air temperature over time (one value per year, using ```time=slice(None,None,12)```) at three different latitude indices (```lat=[19, 20, 21]```) along a single longitude index (```lon=50```).

It is required to explicitly specify either:

- ```x```: the dimension to be used for the x-axis, or
- ```hue```: the dimension you want to represent by multiple lines.

Thus, we can make the plot by specifying either ```hue='lat'``` or ```x='time'```. 

If required, the automatic legend can be turned off using ```add_legend=False```.

In [None]:
ds.tas.isel(lon=50, lat=[19, 20, 21],time=slice(None,None,12)).plot.line(x="time")
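
As a quick check that the two keywords are interchangeable for a 2D array, the sketch below plots the same synthetic (```time```, ```lat```) array twice, once with ```x="time"``` and once with ```hue="lat"```. The data and coordinate values are invented for the example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import xarray as xr

# Synthetic (time, lat) array with three latitudes (hypothetical values)
time = pd.date_range("2015-01-01", periods=8, freq="MS")
demo = xr.DataArray(
    np.random.default_rng(0).random((8, 3)),
    coords={"time": time, "lat": [19.0, 20.0, 21.0]},
    dims=("time", "lat"),
    name="tas",
)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3))
lines_x = demo.plot.line(ax=ax1, x="time")                       # hue is inferred as lat
lines_hue = demo.plot.line(ax=ax2, hue="lat", add_legend=False)  # x is inferred as time
plt.tight_layout()
```

Both calls draw one line per latitude, with the same data along each line.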

We can now combine the two plots into a single figure:

In [None]:
tas = ds.tas - 273.15  # to celsius

# Prepare the figure
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4), sharey=True)

# Selected latitude values
isel_lats = [40, 50, 60]

# Temperature vs longitude plot
tas.sel(lat=isel_lats, method="nearest").isel(time=1).plot.line(ax=ax1, hue='lat')
ax1.set_ylabel('°C')

# Temperature vs time plot
tas.sel(lon=50, lat=isel_lats,method="nearest").isel(time=slice(None,None,12)).plot.line(ax=ax2, x='time', add_legend=False)
ax2.set_ylabel('')

# Show
plt.tight_layout()
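
Arithmetic such as the Kelvin-to-Celsius conversion may not carry the original ```attrs``` over to the result (this depends on the xarray version and its ```keep_attrs``` setting), which is one reason to set axis labels by hand. A possible alternative, sketched here on a tiny synthetic array with made-up metadata, is to copy the attributes explicitly and update the units.

```python
import numpy as np
import xarray as xr

# Tiny synthetic series in kelvin (hypothetical metadata)
demo_k = xr.DataArray(
    np.array([273.15, 283.15, 293.15]),
    dims="time",
    name="tas",
    attrs={"long_name": "Air temperature", "units": "K"},
)

demo_c = demo_k - 273.15  # convert to Celsius
# Copy the metadata explicitly and record the new units
demo_c.attrs = {**demo_k.attrs, "units": "°C"}

print(demo_c.values, demo_c.attrs)
```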

#### Dimension along y-axis

It is also possible to make line plots such that the data are on the x-axis and a dimension is on the y-axis. This can be done by specifying the appropriate ```y``` keyword argument.

In [None]:
ds.tas.isel(time=11, lon=[10, 30]).plot(y="lat", hue="lon")

#### Step plots

Alternatively, a step plot can be made from 1D data.

The argument ```where``` defines where the steps should be placed. Options are:

- ```pre``` (default): the interval ```(x[i-1], x[i]]``` has the value ```y[i]```.
- ```post```: the interval ```[x[i], x[i+1])``` has the value ```y[i]```.
- ```mid```: steps occur half-way between the x positions

In [None]:
tas1d = ds.tas.isel(lat=10, lon=10).isel(time=slice(None,None,60)) #every 5 years
tas1d.plot.step(where="mid")
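
To compare the three options side by side, the sketch below draws the same four-point synthetic series with each value of ```where```. The array and the variable names are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr

# Four synthetic points, enough to see where the steps fall
demo = xr.DataArray(
    [1.0, 3.0, 2.0, 4.0],
    coords={"x": [0, 1, 2, 3]},
    dims="x",
    name="y",
)

fig, axes = plt.subplots(1, 3, figsize=(10, 3), sharey=True)
styles = {}
for ax, where in zip(axes, ["pre", "post", "mid"]):
    (line,) = demo.plot.step(ax=ax, where=where)
    ax.set_title(f'where="{where}"')
    # Record the matplotlib draw style that each option maps to
    styles[where] = line.get_drawstyle()
plt.tight_layout()

print(styles)
```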

This is particularly handy when plotting data grouped with ```Dataset.groupby_bins()```.

In this case, the ```bins``` argument is an ```int```, so it defines the number of equal-width bins in the range of x. We could also use a sequence to define the bin edges allowing for non-uniform bin width.

In [None]:
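# bins=5 gives five equal-width latitude bins; a list of edges such as
# [0, 23.5, 66.5, 90] could be passed instead for non-uniform bins
tas_grp = ds.tas.mean(["time", "lon"]).groupby_bins(group="lat", bins=5)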
tas_mean = tas_grp.mean()
tas_std = tas_grp.std()

tas_mean.plot.step()
(tas_mean + tas_std).plot.step(ls=":")
(tas_mean - tas_std).plot.step(ls=":")

plt.title("Zonal mean temperature")
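
For the non-uniform case, here is a minimal sketch that passes explicit bin edges. It uses a synthetic northern-hemisphere temperature profile that decreases linearly with latitude (an assumption made so the bin means are easy to verify) and edges at 0, 23.5, 66.5 and 90 degrees.

```python
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr

# Synthetic zonal profile: temperature decreasing linearly with latitude
lat = np.linspace(2.5, 87.5, 18)
demo = xr.DataArray(300 - 0.5 * lat, coords={"lat": lat}, dims="lat", name="tas")

# Explicit, non-uniform bin edges: tropics, mid-latitudes, polar region
edges = [0, 23.5, 66.5, 90]
zonal = demo.groupby_bins("lat", bins=edges).mean()

fig, ax = plt.subplots()
zonal.plot.step(ax=ax)
ax.set_title("Mean temperature per latitude band (synthetic data)")

print(zonal.values)
```

Latitudes that fall outside the given edges are not assigned to any bin, so the edges should cover the range of interest.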

## Faceting

Faceting is an effective way of visualizing variations of 3D data: 2D slices are each shown in a panel (subplot), and the third dimension varies among panels.

Let's start by computing the monthly means. Note that the dimensions are now ```lat```, ```lon```, ```month```.

In [None]:
monthly_means = ds.groupby("time.month").mean()
# xarray's groupby reductions drop attributes. Let's assign them back so we get nice labels.
monthly_means.tas.attrs = ds.tas.attrs
monthly_means

We want to visualize how the monthly mean air temperature varies with the month of the year.

The simplest way is to specify the ```row``` or ```col``` kwargs, which are expected to be a dimension name. Here we use ```month``` so that each panel of the plot presents the mean temperature field in a given month. Since a 12-column plot would be too small to interpret, we can “wrap” the facets into multiple rows using ```col_wrap```.

In [None]:
fg = monthly_means.tas.plot(
    col="month",
    col_wrap=3,  # each row has a maximum of 3 columns
)
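
The same pattern can be tried without the CMIP6 file. The sketch below builds two years of synthetic monthly data on a coarse grid (all values and coordinates are invented), computes the monthly means, and facets them with ```col_wrap=3```, which should give a grid of 4 rows by 3 columns.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import xarray as xr

# Two years of synthetic monthly data on a coarse grid (hypothetical values)
time = pd.date_range("2015-01-01", periods=24, freq="MS")
demo = xr.DataArray(
    280 + np.random.default_rng(0).random((24, 6, 8)),
    coords={
        "time": time,
        "lat": np.linspace(-75, 75, 6),
        "lon": np.linspace(0, 315, 8),
    },
    dims=("time", "lat", "lon"),
    name="tas",
)

# Average all Januaries, all Februaries, and so on
monthly = demo.groupby("time.month").mean()

# One panel per month, wrapped after every third column
fg = monthly.plot(col="month", col_wrap=3)
fg.fig.suptitle("Monthly means (synthetic data)", y=1.02)
```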

We can customize faceted plots in the same way as non-faceted plots:

In [None]:
fg = monthly_means.tas.plot(
    col="month",
    col_wrap=4,
    robust=True,
    cmap=mpl.cm.coolwarm,
    cbar_kwargs={
        "orientation": "horizontal",
        "shrink": 0.8,
        "aspect": 40,
        "pad": 0.1,
    },
)

In [None]:
fg = monthly_means.tas.plot(col="month", col_wrap=4)

# Plot contours on each panel
fg.map_dataarray(
    xr.plot.contour, x="lon", y="lat", colors="k", levels=13, add_colorbar=False
)

# Add a red point on Rome, Italy
fg.map(lambda: plt.plot(12.496366, 41.902782, markersize=20, marker=".", color="r"))

Faceting also works for line plots.

In [None]:
fg = ds.tas.groupby("time.year").mean().sel(lat=[-70,-30,0,30,70],method="nearest").sel(year=slice(None,None,11)).plot(
   x="lon", hue="lat", col="year", col_wrap=4
)

### Other types of plot

Contour plot using ```DataArray.plot.contour()```

In [None]:
ds.tas.isel(time=0).plot.contour()

Filled contour plot using ```DataArray.plot.contourf()```

In [None]:
ds.tas.isel(time=0).plot.contourf()

Surface plot using ```DataArray.plot.surface()```

In [None]:
ds.tas.T.isel(time=0).plot.surface()

Since ```xarray```’s default plotting functionality builds on ```matplotlib```, we can seamlessly use ```cartopy``` to make nice maps. The [**Quick_Start_cartopy**](Quick_Start_cartopy.ipynb) notebook shows some basic examples on how to use ```cartopy``` with ```matplotlib``` to create professional and publishable maps with only a few lines of code.