# BASIC INTERACTIVE DATA VISUALIZATION

The following code can be used & adapted to quickly generate interactive plots and explore output data from ACCESS-OM2, as well as other climate models. 

We'll use the **Panel** & **HoloViews** libraries, which provide automatic tools for a faster, more intuitive understanding of the outputs.

If you run into trouble, or have any questions, suggestions or comments, just send an email to Alfonso Acosta Gonçalves (a.acostagoncalves@unsw.edu.au).

# Imports & Data Loading

The code must be run on **Kernel 3-19.10**, which is the only one where panel & holoviews have been installed.

In [None]:
import cmocean.cm as cmo
import holoviews as hv
import numpy as np
import pandas as pd
import panel as pn
import xarray as xr

# Disable warning messages
import warnings
warnings.filterwarnings('ignore')

# Enable panels & the bokeh back-end for Holoviews
pn.extension()
hv.extension('bokeh')
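
Silencing every warning for the whole session can also hide messages you'd want to see. As a possible alternative, here's a minimal sketch that mutes warnings only around one noisy call. `noisy_step` is a made-up stand-in for whichever library call is producing the warnings:

```python
import warnings

def noisy_step():
    # Hypothetical stand-in for a library call that emits a warning
    warnings.warn("this call is deprecated", DeprecationWarning)
    return 42

# Silence warnings only while this block runs; the previous warning
# filters are restored automatically when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy_step()
```

Outside the `with` block, warnings behave as they did before, so anything unexpected elsewhere in the notebook still shows up.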

Data loading:

In [None]:
# Open example output files
# Monthly data
dsm = xr.open_dataset('/g/data3/hh5/tmp/cosima/access-om2/1deg_jra55_ryf9091_spinup1_B1/output051/ocean/ocean_month.nc')
# Yearly data
dso = xr.open_dataset('/g/data3/hh5/tmp/cosima/access-om2/1deg_jra55_ryf9091_spinup1_B1/output051/ocean/ocean.nc')

# 1. BASIC INTERACTIVE VISUALIZATION

The easiest way to reuse the code is to define a function that receives the data we want to show in the interactive plot, along with some visualization options. That way, we can simply change the function call to show whichever variable we want. But first, for clarity, let's build a basic interactive plot directly in the code to investigate how the mixed-layer depth varies from month to month.

(We'll also take this opportunity to learn some Spanish, the language of the future! Spanish variable names also keep us from shadowing Python built-ins such as `map`.)

In [None]:
# Mixed-Layer Depth through time
datos = dsm.mld # DataArray from the output Dataset containing all the information to plot
valor = "mld"   # Name of the DataArray variable that will be used as the color-coded value
                # in the plots. In this case, since we're plotting dsm.mld, "mld" is
                # the name of the value to plot. This can be seen when looking at dsm.mld:
                # <xarray.DataArray 'mld' (time: 60, yt_ocean: 300, xt_ocean: 360)> ...
                # It could've been "temp", or "salt", or...
mapa = cmo.deep_r # Colormap to be used in the plot
titulo = "MLD through time" # Title of the plot

The **hv.extension** instruction below only needs to be run once, which is why it was placed at the beginning of the script, right after the imports. It's repeated here as a reminder that a HoloViews "back-end" must be specified (either 'bokeh' or 'matplotlib'); otherwise no plots will be shown, and there will be _no error message_ either (!). Also, if you generate some plots with the Matplotlib back-end, you'll need to change it back to Bokeh when required.

In [None]:
hv.extension('bokeh')

# Transform our DataArray into a HoloViews Dataset, so HoloViews knows how to manipulate it
hv_ds = hv.Dataset(datos)

# Transform the Dataset into a "QuadMesh", which is the standard mesh-grid plot.
# kdims specifies which DataArray dimensions to use as X & Y coordinates in the plot.
# vdims specifies the name of the variable we are plotting.
# dynamic specifies that only the data necessary for the current plot should be loaded into memory.
# That way, when exploring large datasets that cannot fit into memory, HoloViews can instead
# just load enough data to show the current plot without overloading the system.
qm = hv_ds.to(hv.QuadMesh, kdims=["xt_ocean", "yt_ocean"], vdims=[valor], dynamic=True)
# Standard plot options, plus which bokeh tools we want to use. In our case,
# 'hover' is one of the most useful ones since it will display precise data at the mouse's location.
# Bokeh provides otherwise standard tools for moving around the plot, zooming, saving the plot,
# and resetting the plot to its original state.
qm.opts(title=titulo, cmap=mapa, width=900, height=500, colorbar=True, 
        tools=['hover'])

# Since the DataArray we're using has 3 dimensions (time, xt_ocean, yt_ocean) and we have only
# used the last 2 dimensions, Holoviews automatically provides a "widget" to interact or manipulate
# ALL MISSING DIMENSIONS. In our case, it will generate a slider so we can move through time,
# updating the plot dynamically.
# We then create a panel, specifying that we want the widget on top of the plot & centred.
hv_panel = pn.panel(qm, widget_location='top', center=True)
# The following line is not necessary; it just prints the panel's structure so we can see
# how to get a handle on the widget to change its width and make it larger.
# It can be commented out to avoid printing the panel structure.
hv_panel.pprint()
# Based on the hv_panel structure printed on screen, we access the slider to change its width.
hv_panel[1][0][1][0].width = 500
# Show the panel on screen.
hv_panel
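
The hard-coded index path `hv_panel[1][0][1][0]` depends on how this particular Panel version nests its rows and columns, so it may stop working after an upgrade. One possible alternative is to search the nested structure for the widget instead. The sketch below shows the idea on plain Python lists with a made-up `FakeSlider` class; both are stand-ins, not real Panel objects:

```python
def find_all(layout, predicate):
    """Depth-first search through a nested, list-like layout."""
    found = []
    for item in layout:
        if predicate(item):
            found.append(item)
        elif isinstance(item, (list, tuple)):
            found.extend(find_all(item, predicate))
    return found

class FakeSlider:
    """Hypothetical stand-in for a Panel slider widget."""
    def __init__(self):
        self.width = 250

# Nested lists standing in for the structure printed by pprint();
# the slider sits at the same [1][0][1][0] position used above.
slider = FakeSlider()
fake_panel = ["plot", [[None, [slider]]]]

# Widen every slider we find, wherever it is nested
for widget in find_all(fake_panel, lambda obj: isinstance(obj, FakeSlider)):
    widget.width = 500
```

On a real panel, Panel's own `select` method (for example `hv_panel.select(pn.widgets.Widget)`) is intended for this kind of search. Check that your installed version provides it before relying on it.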

Notice that the **plot's colorbar changes its values** to accommodate the min & max values of each "current" plot. This is caused by `dynamic=True`, which only loads into memory the data necessary for the current plot, and hence cannot know the min & max values of the plots that haven't been loaded yet.

If desired, the values can be manually set to remain constant with: 

`qm = qm.redim.range(mld=(min_range, max_range))`
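
If you don't already know sensible limits, one option is to take them from the data itself. Here's a minimal sketch using a small synthetic NumPy array as a stand-in for `dsm.mld` (the array shape and values are made up), with NaNs playing the role of land cells:

```python
import numpy as np

# Synthetic stand-in for dsm.mld with dims (time, yt_ocean, xt_ocean)
rng = np.random.default_rng(0)
mld = rng.uniform(5.0, 1500.0, size=(12, 4, 6))
mld[:, 0, 0] = np.nan  # pretend this grid cell is land

# Extremes over ALL time steps, ignoring the NaN (land) cells
min_range = float(np.nanmin(mld))
max_range = float(np.nanmax(mld))
```

With the real data, `float(dsm.mld.min())` and `float(dsm.mld.max())` should do the same job, at the cost of reading the whole variable once.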


And, to avoid having to replace _'mld'_ in the code when visualizing other variables, we can just use the eval function to dynamically build the expression for us: 

`qm = eval("qm.redim.range(" + valor + "=(min_range, max_range))")`
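
Building code as a string and running it through `eval` works, but it can be hard to debug. A possible alternative is to unpack a dictionary so that the string in `valor` becomes the keyword name. The sketch below uses a made-up `redim_range` function as a stand-in for `qm.redim.range`, just to show what arrives on the other side:

```python
def redim_range(**ranges):
    """Hypothetical stand-in for qm.redim.range: returns the ranges it was given."""
    return ranges

valor = "mld"
min_range, max_range = 0, 2000

# Build the keyword argument from the string, then unpack it with **
resultado = redim_range(**{valor: (min_range, max_range)})
```

With the real object, the equivalent call would be `qm = qm.redim.range(**{valor: (min_range, max_range)})`.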

We'll use this new information to build a function to display our interactive plots.

# 2. FUNCTION-BASED VISUALIZATION

In [None]:
def holo_plot(datos, valor, mapa, titulo, min_range=None, max_range=None):
    """
    Returns a Holoviews Panel for interactive data visualization.
    
    Parameters:
    -----------
        datos: DataArray with the data to plot
        valor: String with the name of the data value to plot
        mapa: Colormap to use on the plot (a colormap object such as cmo.deep_r, or its name)
        titulo: String with the title of the plot
        min_range: Float or Int with the minimum value of the colorbar range
        max_range: Float or Int with the maximum value of the colorbar range
                   (the range is only fixed when both min_range & max_range are given)
    """
    
    hv.extension('bokeh')

    # Transform our DataArray into a HoloViews Dataset, so HoloViews knows how to manipulate it
    hv_ds = hv.Dataset(datos)

    # Transform the Dataset into a "QuadMesh", which is the standard mesh-grid plot.
    # kdims specifies which dimensions to use as X & Y coordinates in the plot.
    # vdims specifies the name of the variable we are plotting.
    # dynamic specifies that only the data necessary for the current plot should be loaded into memory.
    # That way, when exploring large datasets that cannot fit into memory, holoviews can instead
    # just load enough data to show the current plot without overloading the system.
    qm = hv_ds.to(hv.QuadMesh, kdims=["xt_ocean", "yt_ocean"],  vdims=[valor], dynamic=True)
    # Standard plot options, plus which bokeh tools we want to use. In our case, 'hover' is one
    # of the most useful ones since it will display precise data at the mouse's location.
    # Bokeh provides otherwise standard tools for moving around the plot, zooming, saving the plot,
    # and resetting the plot to its original state.
    qm.opts(title=titulo, cmap=mapa, width=900, height=500, colorbar=True, tools=['hover'])
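    # Fix the colorbar range if provided. Unpacking a dict lets us use the string in
    # valor as the keyword name, without building code as a string for eval.
    if min_range is not None and max_range is not None:
        qm = qm.redim.range(**{valor: (min_range, max_range)})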

    # Since the DataArray we're using has 3 dimensions (time, xt_ocean, yt_ocean) and we have only
    # used the last 2 dimensions, Holoviews automatically provides a "widget" to interact or manipulate
    # all missing dimensions. In our case, it will generate a slider so we can move through time,
    # updating the plot dynamically.
    # We then create a panel, specifying that we want the widget on top & centred.
    hv_panel = pn.panel(qm, widget_location='top', center=True)
    #hv_panel.pprint()
    # Based on the hv_panel structure printed on screen, we access the slider to change its width.
    hv_panel[1][0][1][0].width = 500
    # Return the panel; the notebook displays it when it's the last expression in a cell.
    return hv_panel

Let's check that the function behaves as expected:

In [None]:
# MLD through time, with a fixed colorbar-range between 0 & 2000 m
holo_plot(datos = dsm.mld, 
          valor = "mld",
          mapa = cmo.deep_r,
          titulo = "MLD through time",
          min_range = 0, 
          max_range = 2000)

Now that the holoviews function is set, we can just change the function call to inspect other variables. For example, to take a look at sea-level height:

In [None]:
holo_plot(datos = dsm.sea_level, 
          valor = "sea_level",
          mapa = cmo.deep_r,
          titulo = "SSH through time")

# 3. WHY BOTHER?

If you're wondering why bother with any of this when **ncview** can already do all this: the power of this approach lies in that **we can manipulate the data before it's displayed**, any way we want. For example: since we have monthly data over 5 years, we can average the data for each month of the year to smooth out interannual variability. The averaged data no longer has a 'time' dimension; it has a 'month' dimension instead. As ever, Holoviews will realize that this dimension is not being used as a plot axis, and will provide a widget to interact with it:

In [None]:
# Monthly-averaged Mixed-Layer Depth
holo_plot(datos = dsm.mld.groupby('time.month').mean('time'), 
          valor = "mld",
          mapa = cmo.deep_r,
          titulo = "Monthly-averaged MLD")
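
To make it concrete what `groupby('time.month').mean('time')` computes, here's a NumPy-only sketch on a synthetic series. It assumes 5 complete years of monthly values starting in January, with no spatial dimensions; the numbers are made up:

```python
import numpy as np

n_years = 5
# Calendar month of each sample: 1..12, repeated once per year
months = np.tile(np.arange(1, 13), n_years)
# Year index of each sample: 0,0,...,0,1,1,...,4
years = np.repeat(np.arange(n_years), 12)
# Synthetic signal: a seasonal cycle (10 * month) plus a slow yearly drift
values = months * 10.0 + years

# Group the samples by calendar month and average each group across the years
climatology = np.array([values[months == m].mean() for m in range(1, 13)])
```

The 60 input values collapse into 12 averages, one per calendar month, which is why the widget now steps through months rather than through time.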

We could also look at the data grouped by season. In this case, the resulting 'season' dimension holds labels rather than numbers, so Holoviews will change the widget to one that selects one of the seasons.

In [None]:
holo_plot(datos = dsm.mld.groupby('time.season').mean('time'), 
          valor = "mld",
          mapa = cmo.deep_r,
          titulo = "Seasonally-averaged MLD")

# 4. CONCLUSION

Anybody can now copy-paste the imports and the **holo_plot** function definition from section 2, and be ready to interactively explore their data!

For example: the temperature data is actually 4-dimensional, with lon, lat, time **& depth**. Do we need to change anything in our code to explore this additional 4th dimension? No. Holoviews will realize that there are 2 dimensions that are not being used to plot the data (time & depth), and will thus provide 2 widgets to interact with them any way we want:

In [None]:
holo_plot(datos = dso.temp, 
          valor = "temp",
          mapa = cmo.thermal,
          titulo = "Temperature through time & depth")
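
A simple way to predict which widgets you'll get is to list the dimensions that aren't used as plot axes. The sketch below is only a mental model, not how HoloViews works internally, and `'st_ocean'` is an assumed name for the depth dimension of `dso.temp`:

```python
def widget_dims(all_dims, kdims):
    """Dimensions not used as plot axes, in their original order."""
    return [d for d in all_dims if d not in kdims]

# 'st_ocean' is an assumed name for the depth dimension
temp_dims = ["time", "st_ocean", "yt_ocean", "xt_ocean"]
mld_dims = ["time", "yt_ocean", "xt_ocean"]
plot_axes = ["xt_ocean", "yt_ocean"]

print(widget_dims(mld_dims, plot_axes))   # ['time']
print(widget_dims(temp_dims, plot_axes))  # ['time', 'st_ocean']
```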

**Tip**: If you're in Chrome, when you click on the widget and the slider turns blue, you can use the keyboard arrows to move left & right for easier exploration!

And one final thing: not everything in this life is a QuadMesh. We can use these same principles to plot other things. For example: Let's say we want to look at the time evolution of temperatures at different depths. In our example, we only have data for 5 years, but the same code will work if you open multiple .nc files together to get a longer time period.

In [None]:
hv.extension('bokeh')

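# Since temperature data is 4-dimensional, we average temperatures over lons & lats
# (a simple unweighted mean, so grid-cell area is not accounted for),
# plot the time dimension, and interact with the depth dimension.
# Subtracting 273.15 converts the temperatures from Kelvin to degrees Celsius.
exp_data = dso.temp.mean(['yt_ocean', 'xt_ocean']) - 273.15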

# Same procedure as before, we just create a Curve instead of a QuadMesh
hv_ds = hv.Dataset(exp_data)
wi = hv_ds.to(hv.Curve, kdims=['time'], vdims=['temp'], label="Temperature", dynamic=True)
# We specify that the axes on the frame should adapt to the current data being displayed.
# If the axes didn't move and our first plot was showing temperatures ranging from, say,
# 10 to 15 degrees, by the time we got to the bottom of the ocean and temperatures were 
# around 0 degrees, we wouldn't be able to see anything on the plot. (Try it!)
wi.opts(framewise=True)
wi.opts(width=600, height=600, 
        padding=(0, 0.05), # Specify that we want some space (0.05) between the min value
                           # and the horizontal axis, so they don't touch
        fontsize={'labels':12, 'yticks':12, 'title':12})

# Centre the widget on top of the plot
hv_panel = pn.panel(wi, center=True, widget_location='top')
# Make the widget wider, for more precise control
hv_panel[1][0][1][0].width = 500
# Show the result
hv_panel
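
As for opening several .nc files together to get a longer time period, the first step is usually to collect the file paths in order. Here's a minimal sketch that assumes the run directory holds sibling `outputNNN/ocean/ocean.nc` folders like the `output051` one used above. `list_outputs` is a made-up helper, and the demo runs on a throwaway directory rather than on the real data:

```python
import tempfile
from pathlib import Path

def list_outputs(run_dir, filename="ocean.nc"):
    """Return the matching file from every output* directory, sorted by path."""
    return sorted(Path(run_dir).glob(f"output*/ocean/{filename}"))

# Demo on a temporary directory tree that mimics the assumed run layout
with tempfile.TemporaryDirectory() as tmp:
    for n in (52, 51, 53):
        carpeta = Path(tmp) / f"output{n:03d}" / "ocean"
        carpeta.mkdir(parents=True)
        (carpeta / "ocean.nc").touch()
    found = [p.parent.parent.name for p in list_outputs(tmp)]

print(found)  # ['output051', 'output052', 'output053']
```

The resulting list could then be passed to xarray's `xr.open_mfdataset`, which opens several files as one Dataset. The exact options depend on your xarray version, so check its documentation.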

Y eso es todo por hoy (and that's all for today) ^_^ If you're hungry for more, check out the Intermediate level, where we'll take control of which widgets are displayed, and how our plots react to them!