In [None]:
import numpy as np
import xarray as xr
import holoplot.xarray

XArray provides a convenient and very powerful wrapper to label the axes and coordinates of n-dimensional arrays. It therefore differs significantly from the other data types supported by HoloPlot. This user guide covers how to visualize and explore data of different dimensionality, ranging from simple 1D data, to 2D image-like data, to multi-dimensional cubes of data.

For these examples we’ll use the North American air temperature dataset:

In [None]:
air_ds = xr.tutorial.load_dataset('air_temperature')
air = air_ds.air
air_ds

## 1D Plots

Selecting the data at a particular lat/lon coordinate we get a 1D dataset of air temperatures over time:

In [None]:
air1d = air.isel(lat=10, lon=10)
air1d.plot()
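
To make the effect of this selection concrete, the following is a minimal sketch on a small synthetic cube, so it runs without downloading the tutorial data. The array sizes, coordinate values, and the names `cube` and `series` are made up for illustration; only the `isel` call mirrors the cell above.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the tutorial dataset (hypothetical sizes and values).
rng = np.random.default_rng(0)
cube = xr.DataArray(
    rng.normal(280.0, 10.0, size=(8, 4, 5)),
    dims=("time", "lat", "lon"),
    coords={
        "time": np.arange(8),
        "lat": [75.0, 72.5, 70.0, 67.5],
        "lon": [200.0, 202.5, 205.0, 207.5, 210.0],
    },
    name="air",
)

# Selecting a single integer position on lat and lon drops both dimensions,
# leaving a 1D series indexed by time.
series = cube.isel(lat=2, lon=3)
print(series.dims, series.shape)

# The selected location is kept as scalar coordinates on the result.
print(float(series.lat), float(series.lon))
```

Because only `time` remains as a dimension, a line plot over time is the natural default for such an array.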

We can also subselect the data further and use HoloViews' ability to overlay plots with the ``*`` operator:

In [None]:
air1d_sel = air1d.isel(time=slice(0, 200))
air1d_sel.plot(color='purple') * air1d_sel.plot.scatter(marker='o', color='purple', size=5)

### Selecting multiple coordinates

If we select multiple coordinates along one axis and plot a chart type such as a line, the data will automatically be split by that coordinate:

In [None]:
air.isel(lon=10, lat=[19,21,22]).plot.line()
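
The shape of the data behind such a plot can be sketched as follows. This uses a small synthetic cube with made-up sizes and coordinate values, and the `lines` dictionary is a hypothetical helper for illustration rather than anything the plotting API exposes.

```python
import numpy as np
import xarray as xr

# Small synthetic cube (hypothetical sizes and coordinate values).
rng = np.random.default_rng(1)
cube = xr.DataArray(
    rng.normal(280.0, 10.0, size=(6, 4, 3)),
    dims=("time", "lat", "lon"),
    coords={
        "time": np.arange(6),
        "lat": [75.0, 72.5, 70.0, 67.5],
        "lon": [200.0, 202.5, 205.0],
    },
    name="air",
)

# A scalar index drops lon; a list index keeps lat as a dimension of length 3.
subset = cube.isel(lon=1, lat=[0, 2, 3])

# One time series per selected latitude: conceptually, one line each.
lines = {float(lat): subset.sel(lat=lat).values for lat in subset.lat.values}
print(subset.dims, sorted(lines))
```

The remaining `lat` dimension is what the plot is split by, giving one curve per selected latitude.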

## 2D Plots

If the ``DataArray`` is 2D, the plot method will automatically infer that the data should be displayed as an image:

In [None]:
air2d = air.isel(time=500)
air2d.plot(colorbar=True, width=500)

## 3(+)D Plots

If the data has more than two dimensions, the plot method will automatically apply a ``groupby`` along any remaining dimensions, allowing their values to be selected via a slider:

In [None]:
air.plot(colorbar=True, width=500)
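
The idea of grouping along the remaining dimensions can be sketched with plain xarray. This is an illustration under assumptions, not HoloPlot's implementation: the synthetic cube, the choice of `image_dims`, and the `frames` dictionary are all hypothetical.

```python
import numpy as np
import xarray as xr

# Small synthetic cube (hypothetical sizes and coordinate values).
rng = np.random.default_rng(2)
cube = xr.DataArray(
    rng.normal(280.0, 10.0, size=(5, 4, 3)),
    dims=("time", "lat", "lon"),
    coords={
        "time": np.arange(5),
        "lat": [75.0, 72.5, 70.0, 67.5],
        "lon": [200.0, 202.5, 205.0],
    },
    name="air",
)

# Two dimensions are used for the image axes; any others act as "key" dimensions.
image_dims = ("lon", "lat")
key_dims = [d for d in cube.dims if d not in image_dims]

# One 2D frame per value of the remaining dimension. A slider then amounts to
# choosing which key to look up.
frames = {int(t): cube.sel(time=t) for t in cube.time.values}
print(key_dims, len(frames), frames[3].dims)
```

With more than one remaining dimension, the keys would become tuples, one entry per key dimension.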

## Statistical plots

Statistical plots such as histograms, kernel-density estimates, or violin and box-whisker plots aggregate the data across one or more of the coordinate dimensions. For example, plotting a histogram provides a summary of all the air temperature values:

In [None]:
air.plot.hist()
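
The aggregation behind a histogram can be sketched with NumPy alone. The synthetic values and the bin count below are assumptions chosen for illustration; the point is only that every value in the cube contributes to the counts, regardless of its coordinates.

```python
import numpy as np

# Synthetic temperatures with dimensions (time, lat, lon); hypothetical values.
rng = np.random.default_rng(3)
values = rng.normal(280.0, 10.0, size=(8, 4, 5))

# Flatten across all coordinate dimensions and count values per bin.
counts, edges = np.histogram(values.ravel(), bins=10)
print(counts)
print(edges.round(1))
```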

Using the ``by`` keyword we can break down the distribution of the air temperature across one or more variables:

In [None]:
air.plot.violin('air', by='lat')
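
As a rough numerical counterpart to splitting a distribution by latitude, the sketch below computes quartiles per latitude with NumPy. The synthetic array and its `(time, lat, lon)` axis order are assumptions; a violin plot shows a full density estimate per group, of which these quartiles are only a summary.

```python
import numpy as np

# Synthetic temperatures with dimensions (time, lat, lon); hypothetical values.
rng = np.random.default_rng(4)
values = rng.normal(280.0, 10.0, size=(8, 4, 5))

# Aggregate over time (axis 0) and lon (axis 2), keeping one group per latitude.
quartiles = np.percentile(values, [25, 50, 75], axis=(0, 2))
print(quartiles.shape)  # one row per percentile, one column per latitude
```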

## Datashading

When plotting a large amount of data at once, it is also possible to use the datashading abilities of the plot API. By enabling ``datashade`` and passing ``by=[]`` to declare that the data should not be grouped by another coordinate variable, we can plot all the datapoints, showing us the spread of air temperatures in the dataset. Additionally, we can overlay an aggregate of the same data:

In [None]:
air.plot.scatter('time', 'air', by=[], datashade=True, use_dask=True) *\
air.plot.line('time', 'air', by=[], use_dask=True).aggregate(function=np.mean)
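
Conceptually, datashading replaces individual points with counts on a fixed-size grid. The sketch below imitates that idea with `np.histogram2d`; it is a NumPy stand-in under assumed data and an assumed grid size, not the Datashader API, and it also computes the per-time mean that corresponds to the overlaid aggregate.

```python
import numpy as np

# Synthetic temperatures with dimensions (time, lat, lon); hypothetical values.
rng = np.random.default_rng(5)
n_time, n_lat, n_lon = 50, 4, 5
values = rng.normal(280.0, 10.0, size=(n_time, n_lat, n_lon))

# One (time, air) point per grid cell and time step. ravel() walks the array
# with time varying slowest, so each time index repeats n_lat * n_lon times.
times = np.repeat(np.arange(n_time), n_lat * n_lon)
temps = values.ravel()

# Bin all points into a fixed 25 x 20 grid of counts, independent of how many
# points there are.
counts, t_edges, v_edges = np.histogram2d(times, temps, bins=(25, 20))

# The overlaid aggregate: mean temperature across lat and lon per time step.
mean_by_time = values.mean(axis=(1, 2))
print(counts.shape, mean_by_time.shape)
```

The size of the count grid stays fixed however many points are binned, which is what keeps this approach practical for large datasets.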