<b><img align="left" src = https://project.lsst.org/sites/default/files/Rubin-O-Logo_0.png width=250, style="padding: 10px"> 
<p><p><p><p><p><p>
<b>Interactive Image Visualization</b> <br>
Last verified to run on <b>2021-09-17</b> with LSST Science Pipelines release <b>w_2021_33</b> <br>
Contact authors: Leanne Guy <br>
Target audience: All DP0 delegates. <br>
Container Size: medium or large <br>
Questions welcome at <a href="https://community.lsst.org/c/support/dp0">community.lsst.org/c/support/dp0</a> <br>
Find DP0 documentation and resources at <a href="https://dp0-1.lsst.io">dp0-1.lsst.io</a> <br>

**Credit:** This tutorial was inspired by a notebook originally developed by Keith Bechtol in the context of the LSST Stack Club. It has been updated and extended for DP0.1 by Leanne Guy. Please consider acknowledging Leanne Guy and Keith Bechtol in any publications or software releases that make use of this notebook's contents.

### Learning Objectives

This tutorial introduces three open-source Python libraries that enable powerful interactive visualization of images and catalogs. 
 1. [**HoloViews**](http://holoviews.org): Produce high-quality interactive visualizations easily by annotating datasets rather than using direct calls to a plotting library
 2. [**Bokeh**](https://bokeh.org): A powerful data visualization library that provides interactive features, including brushing and linking between multiple plots. This tutorial uses `Holoviews` with `Bokeh` as its plotting backend.
 3. [**Datashader**](https://datashader.org): Accurately render very large datasets quickly and flexibly.
 
These packages are part of the [PyViz](http://pyviz.org/) ecosystem of tools intended for visualization in a web browser and can be used to create quite sophisticated dashboard-like interactive displays and widgets. The goal of this tutorial is to provide an introduction and starting point from which to create more advanced, custom interactive visualizations. 

### Logistics
This notebook is intended to be runnable on `data.lsst.cloud`. Note that occasionally the notebook may seem to stall, or the interactive features may seem disabled. If this happens, usually a restart of the kernel fixes the issue. You might also need to log out of the RSP and start a "large" instance of the JupyterLab environment. In some examples shown in this notebook, the order in which the cells are run is important for understanding the interactive features, so you may want to re-run the set of cells in a given section if you encounter unexpected behavior. Note that some of the examples require manual selection of points on a graph to run correctly.

### Setup

In [1]:
# General python imports
import numpy as np
import pandas as pd
import warnings

# Update this option setting as you prefer
pd.set_option('display.max_rows', 5)

# Astropy
from astropy.visualization import  ZScaleInterval, AsinhStretch
from astropy import units as u
from astropy.coordinates import SkyCoord

# Rubin TAP service utilities
from rubin_jupyter_utils.lab.notebook import get_tap_service, retrieve_query

# Bokeh and Holoviews for visualization
import bokeh
from bokeh.io import output_notebook, show
from bokeh.models import ColumnDataSource, Range1d, HoverTool
from bokeh.models import Selection, CDSView, GroupFilter
from bokeh.plotting import figure, gridplot
from bokeh.transform import factor_cmap

import holoviews as hv
from holoviews import streams, opts
from holoviews.operation.datashader import datashade, dynspread, rasterize
from holoviews.plotting.util import process_cmap

import datashader as dsh

# Set the holoviews plotting library to be bokeh
# You will see the holoviews + bokeh icons displayed when the library is loaded successfully
hv.extension('bokeh')

# Display bokeh plots inline in the notebook
output_notebook()



In [2]:
# What versions of bokeh, holoviews and datashader are we working with?
# This is important when referring to online documentation as
# APIs can change between versions.
print("Bokeh version: " + bokeh.__version__)
print("Holoviews version: " + hv.__version__)
print("Datashader version: " + dsh.__version__)

Bokeh version: 2.3.3
Holoviews version: 1.14.5
Datashader version: 0.13.0


In [3]:
# Prevent some helpful but ancillary warning messages from printing
# during some LSST DM Release calls
warnings.simplefilter("ignore", category=FutureWarning)
warnings.simplefilter("ignore", category=UserWarning)

In [4]:
# What version of the LSST Science Pipelines are we using?
! echo $IMAGE_DESCRIPTION
! eups list -s | grep lsst_distrib

Recommended (Weekly 2021_33)
lsst_distrib          22.0.1-3-g7ae64ea+3baa4596b0 	w_2021_33 current setup


### 1. Data preparation

The basis for any data visualization is the underlying data. In this tutorial we will work with both tabular data and images. 

#### 1.1 DP0.1 tabular dataset
We will execute a cone search about a defined coordinate with a specified radius using the Rubin TAP service. For more details about using the TAP service and ADQL queries, please refer to tutorial 02_Intermediate_TAP_Query.

In [5]:
# Get a Rubin TAP service instance. 
service = get_tap_service()
assert service is not None

In [6]:
# Define a reference position on the sky and a radius in arcminutes for a cone search
c1 = SkyCoord(ra=59.7955707*u.degree, dec=-29.91176471*u.degree, frame='icrs')
radius = 15.882353 * u.arcmin

In [7]:
query = "SELECT obj.ra, obj.dec, obj.objectId, obj.mag_g, obj.mag_r, " \
        "obj.mag_i, obj.mag_g_cModel, obj.mag_r_cModel, obj.mag_i_cModel," \
        "obj.psFlux_g, obj.psFlux_r, obj.psFlux_i, obj.cModelFlux_g, " \
        "obj.cModelFlux_r, obj.cModelFlux_i, obj.tract, obj.patch, " \
        "obj.extendedness, obj.good, obj.clean, " \
        "truth.mag_r as truth_mag_r, truth.match_objectId, "\
        "truth.flux_g, truth.flux_r, truth.flux_i, truth.truth_type, " \
        "truth.match_sep, truth.is_variable " \
        "FROM dp01_dc2_catalogs.object as obj " \
        "JOIN dp01_dc2_catalogs.truth_match as truth " \
        "ON truth.match_objectId = obj.objectId " \
        "WHERE CONTAINS(POINT('ICRS', obj.ra, obj.dec),"\
        "CIRCLE('ICRS', " + str(c1.ra.value) + ", " + str(c1.dec.value) + ", " \
        + str(radius.to(u.deg).value) + " )) = 1 " \
        "AND truth.match_objectid >= 0 "\
        "AND truth.is_good_match = 1"
# print(query)
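
The cone-search constraint in the query above is assembled by string concatenation. As a minimal sketch, the same clause can be built with a small helper. The name `cone_clause` is hypothetical (it is not part of the Rubin TAP utilities), and the radius is converted from arcminutes to degrees by hand so that the example has no dependencies.

```python
def cone_clause(ra_deg, dec_deg, radius_arcmin,
                ra_col="obj.ra", dec_col="obj.dec"):
    """Return an ADQL cone-search constraint (hypothetical helper)."""
    radius_deg = radius_arcmin / 60.0  # the ADQL CIRCLE radius is in degrees
    return (
        f"CONTAINS(POINT('ICRS', {ra_col}, {dec_col}), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg:.6f})) = 1"
    )


clause = cone_clause(59.7955707, -29.91176471, 15.882353)
print(clause)
```

The returned string can be placed after `WHERE` in a query like the one above; rounding the radius to six decimal places is an arbitrary choice made for readability.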

In [8]:
%%time
# Execute the query and convert the results to a pandas dataframe
data = service.search(query).to_table().to_pandas() 
assert len(data) == 102096



CPU times: user 6.92 s, sys: 269 ms, total: 7.18 s
Wall time: 9.54 s


In [9]:
# Map the truth type to a descriptive string
data['truth_type'] = data['truth_type'].map(
    {1: 'galaxy', 2: 'star', 3: 'SNe'})

In [10]:
assert data[data["truth_type"] == "star"].shape[0] == 2164
assert data[data["truth_type"] == "galaxy"].shape[0] == 99932
assert data[data["truth_type"] == "SNe"].shape[0] == 0

In [11]:
# Compute some colours and add to the result set. 
data['gmi'] = data['mag_g_cModel'] - data['mag_i_cModel']
data['rmi'] = data['mag_r_cModel'] - data['mag_i_cModel']
data['gmr'] = data['mag_g_cModel'] - data['mag_r_cModel']

#### 1.2 DP0.1 Images
We will work with two images retrieved via the Butler: a `calexp` and a `coadd` image. For DP0.1, images can only be accessed via the `butler` (<a href="https://pipelines.lsst.io/modules/lsst.daf.butler/index.html">documentation</a>), an LSST Science Pipelines software package that allows you to fetch the LSST data you want without having to know its location or format. For more details on how to use the Butler, see tutorial `04_Intro_to_Butler`. We will retrieve a calexp and a deep r-band coadd image, specifying a tract and patch to work with. 

In [12]:
# Instantiate the Butler initializing it with the repository name and the DP0.1 collection identifier
from lsst.daf.butler import Butler
repo = 's3://butler-us-central1-dp01'
collection = '2.2i/runs/DP0.1'
butler = Butler(repo, collections=collection)



In [13]:
# Define a calibrated exposure and retrieve it via the Butler
calexpId = {'visit': 192350, 'detector': 175, 'band': 'i'}
calexp = butler.get('calexp', **calexpId)
assert calexp is not None
# Source table for this exposure
calexpSrc = butler.get('src', **calexpId)

In [14]:
# Define a deep coadded image and retrieve it via the Butler
coaddId = {'tract': 4226, 'patch': 17, 'band': 'r'}
coadd = butler.get('deepCoadd', **coaddId)
assert coadd is not None
# Source table for this coadd
coaddSrc = butler.get('deepCoadd_forced_src', coaddId)

### 2. Holoviews

[Holoviews](https://holoviews.org) supports easy analysis and visualization by annotating data rather than utilizing direct calls to plotting packages. For this tutorial, we will use [Bokeh](https://bokeh.org) as the plotting library backend for Holoviews. This is defined in the `Setup` section above with the `hv.extension('bokeh')` call. Holoviews supports several plotting libraries, and there is an exercise for the user at the end of this notebook to explore using Holoviews with other plotting packages. 

#### 2.1 Visualizing tabular data with Holoviews

The basic core primitives of Holoviews are [Elements](http://holoviews.org/Reference_Manual/holoviews.element.html); hv.Element. Elements are simple wrappers around your data that provide a semantically meaningful visual representation. An Element may be a set of Points, an Image, a Curve, a Histogram, a Scatter, etc. See the Holoviews [Reference Gallery](http://holoviews.org/reference/index.html) for all the various types of Elements that can be created with Holoviews. 

In this first example we will use the Holoviews [Scatter Element](http://holoviews.org/reference/elements/bokeh/Scatter.html) to quickly visualize the catalog data retrieved in section 1 as a scatter plot. HoloViews maintains a strict separation between content and presentation. This separation is achieved by maintaining sets of keyword values as `options` that specify how `Elements` are to appear.  In this first example we will apply the default options and remove the toolbar. 

In [15]:
# Make a simple scatter plot of the data using the Scatter element. 
hv.Scatter(data).options(toolbar=None)

The `data` object contains many columns. If no columns are specified explicitly, the `Scatter` element takes the first two columns for x and y respectively.  

Now let's bin the data in RA using the robust [Freedman Diaconis Estimator](https://numpy.org/doc/stable/reference/generated/numpy.histogram_bin_edges.html#numpy.histogram_bin_edges) and plot
the resulting distribution using the Holoviews [Histogram Element](http://holoviews.org/reference/elements/bokeh/Histogram.html). 
We will also add in some basic plot options. Read more about [customizing plots](https://holoviews.org/user_guide/Customizing_Plots.html) via `options`. Note that `options` can be shortened to `opts`.

In [16]:
# np.histogram returns the counts first, then the bin edges
count, ra_edges = np.histogram(data['ra'], bins='fd')
# Supply the edges and values together as a tuple
ra_distribution = hv.Histogram((ra_edges, count)).opts(
    title="RA distribution", color='darkmagenta',
    xlabel='RA', fontscale=1.2,
    height=400, width=400)


In [17]:
ra_distribution
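
For reference, the Freedman Diaconis rule sets the bin width to 2 * IQR / n^(1/3), where IQR is the interquartile range and n the number of samples. The sketch below checks this against `np.histogram` on synthetic values (a made-up stand-in for the RA column, not DP0.1 data). It assumes NumPy rounds the bin count up so that equal-width bins cover the full data range.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=59.8, scale=0.1, size=5000)  # synthetic stand-in for RA values

# Freedman Diaconis bin width: 2 * IQR / n^(1/3)
q75, q25 = np.percentile(x, [75, 25])
fd_width = 2.0 * (q75 - q25) * x.size ** (-1.0 / 3.0)

# The number of equal-width bins is rounded up to span the data range
expected_bins = int(np.ceil((x.max() - x.min()) / fd_width))
counts, edges = np.histogram(x, bins='fd')

print(f"FD width: {fd_width:.5f}, bins: {counts.size}")
```

Because the bin count is rounded up, the actual bin width is at most the Freedman Diaconis width.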

Next, let's create a layout of several plots. A `Layout` is a type of `Container` that can contain any HoloViews object. Other types of Containers include Overlay, GridSpace, DynamicMap, etc. See the Holoviews [Reference Gallery](http://holoviews.org/reference/index.html) for the full list of Containers that can be created with Holoviews. See [Building Composite Objects](http://holoviews.org/user_guide/Building_Composite_Objects.html) for the full details about the ways Containers can be composed.

The `+` operator is used to create a Layout. 

In [24]:
# Slice the data and set some more options
skyplot = hv.Scatter(data[["ra", "dec"]]).opts(
    title="Skyplot", toolbar='above', tools = ['hover'], 
    height=350, width=350) 

# Build a composite object using the `+` operator
skyplots = skyplot + \
           skyplot[59.67:60.07, -30.71:-29.78].opts(
    title="Skyplot region", tools=[]) + \
           ra_distribution.options(height=350, width=350)

In [None]:
skyplots

A few things to note about these three plots. The first two are the same plot, with the second one showing only a subset of the data. When you zoom in on either of these two plots, both change in the same manner. The third plot is a different plot object and is not linked to the other two, so it will not change in response to actions on either of them. Try zooming in on the RA distribution plot: you will notice that the data are not rebinned and that the two skyplots do not change. This is because the plots are not linked. We will see how to link plots in Section 3.

Next, let's set up some default plot options to avoid duplicating long lists every time we want to make a plot. As different plotting packages typically provide different customization capabilities, we will define one set of options for a Bokeh backend and one for a matplotlib backend.

In [None]:
# Bokeh specific customizations as a python dictionary 
plot_style_bkh = dict(alpha=0.4, color='darkmagenta', 
                      marker='triangle', size=3,
                      xticks=5, yticks=5,
                      height=400, width=400, 
                      toolbar='above')

# Matplotlib specific customizations
plot_style_mpl = dict(alpha=0.2, color='c', marker='s', 
                      fig_size = 200, s=3, 
                      fontsize=14, xticks=8, yticks=8)

Instead of subsetting a dataset to choose which columns to plot, Holoviews allows us to specify the dimensionality directly. 
`kdims` are the key dimensions, or independent variable(s), and `vdims` are the value dimensions, or dependent variable(s). 

In [None]:
# Use the bokeh plot style
plot_style = plot_style_bkh
hv.Scatter(data, 
           kdims=['gmi'], vdims=['mag_g_cModel']
          ).opts(invert_yaxis=True,
                 xlabel="G-I color", ylabel="G magnitude",
                 **plot_style)

Dimensions can be specified as strings, as above, but they are in fact rich objects. Dimension objects support a long descriptive label, which complements the short programmer-friendly name. Let's look at a color-color diagram of the stars in the dataset.  

In [None]:
# Axes as rich objects
rmi = hv.Dimension('rmi', label='(r-i)', range = (-0.8,3.0))
gmr = hv.Dimension('gmr', label='(g-r)', range = (-0.8,3.0))

Let's make a colour-colour scatter plot of just the stars in the dataset and also display the distribution of samples along both value dimensions using the `hist()` method of the [Scatter Element](http://holoviews.org/reference/elements/bokeh/Scatter.html).

In [None]:
stars = data[data["truth_type"] == 'star']
col_col = hv.Scatter(stars, kdims=gmr, 
                     vdims=rmi).opts(**plot_style)

#  Use the hist method to show the distribution of samples along both value dimensions.
col_col = col_col.hist(dimension = [gmr,rmi], 
                       num_bins=100, adjoin=True)

In [None]:
col_col

Try zooming in on regions of the plot. The histograms are automatically recomputed.  

The techniques used to apply customizations in the cells above rely on standard python syntax and are the recommended way to customize your visualizations in HoloViews. Holoviews also supports IPython magics. Magics are a much older approach that is not standard python and is specific to notebooks. [Holoviews notebook magic](https://holoviews.org/user_guide/Notebook_Magics.html) supports both line and cell magics. Here is an example of using magics to plot the same spatial distribution of `Objects` as above.

In [None]:
%%opts Scatter [tools=['hover'], toolbar='above',height=400, width=400](color='grey')
hv.Scatter(data)

Our result set above contained a lot of columns. We often want to be selective about which information we show in the hover tool and to customize the names. We do this by creating a custom hover tool. 

In [None]:
raDecHover = HoverTool(
    tooltips=[
        ('ra/dec', '@ra{%0.4f} / @dec{%0.4f}'),
        ('rmag', '@mag_r_cModel{0.00}'),
        ('type', '@truth_type'),
    ],
    # Formatter keys are the column references used in the tooltips,
    # not the tooltip labels
    formatters={
        '@ra': 'printf',
        '@dec': 'printf',
        '@mag_r_cModel': 'numeral',
    },
    point_policy="follow_mouse"
)

In [None]:
hv.Scatter(data).opts(tools=[raDecHover], **plot_style_bkh)

#### 2.2. Visualizing exposure images with Holoviews

In the tutorial 03_Image_Display_and_Manipulation we saw how to use the `lsst.afw.display` library to visualize exposure images and in tutorial 03b_Image_Display_with_Firefly we saw how to do the same using Firefly. In this example we demonstrate image visualization at the pixel level with Holoviews.

We will use the Holoviews Image Element to visualize a calexp. We will then overlay a Holoviews DynamicMap on the image to compute and display elements dynamically, allowing exploration of large datasets. DynamicMaps generate elements on the fly, allowing exploration of parameters with arbitrary resolution. DynamicMaps are lazy in the sense that they only compute as much data as the user wishes to explore. An Overlay is a collection of HoloViews objects that are displayed simultaneously, e.g. a Curve superimposed on a Scatter plot of data. You can build an Overlay from any two HoloViews objects, which can have different types, using the `*` operator. 

First, we will apply the same asinh stretch and zscale interval as was applied in `04_Intro_to_Butler` directly to the calexp object.

In [None]:
# Apply an asinh/zscale mapping to the data 
transform = AsinhStretch() + ZScaleInterval()
scaledImage = transform(calexp.image.array)
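
To give a feel for what the stretch does, here is a NumPy-only sketch. It assumes the asinh stretch has the form asinh(x / a) / asinh(1 / a) with a = 0.1, which is how the astropy documentation describes `AsinhStretch`; a simple min/max normalization stands in for the zscale interval, which is more involved. The pixel values are made up.

```python
import numpy as np


def asinh_stretch(values, a=0.1):
    """Asinh stretch of values that are already normalized to [0, 1]."""
    return np.arcsinh(values / a) / np.arcsinh(1.0 / a)


pixels = np.array([10.0, 20.0, 50.0, 200.0, 1000.0])  # made-up pixel values

# Simplification: min/max normalization stands in for the zscale interval
normalized = (pixels - pixels.min()) / (pixels.max() - pixels.min())
stretched = asinh_stretch(normalized)
print(np.round(stretched, 3))
```

The stretch keeps the end points at 0 and 1 but raises the faint values, which is why low surface brightness features become easier to see.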

Define the (left, bottom, right, top) edges of the sensor image.

In [None]:
bounds_img = (0, 0, calexp.getDimensions()[0], calexp.getDimensions()[1])

To use the matplotlib convention of placing the origin of a plot in the top left, we need to flip the data.

In [None]:
scaledImage = np.flipud(scaledImage)
bounds_img = (0, calexp.getDimensions()[1], calexp.getDimensions()[0],0)
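
As a small illustration of what the flip does to pixel indices, the sketch below uses a toy 4 x 3 array rather than the calexp. After `np.flipud`, row `r` of the flipped array holds row `n_rows - 1 - r` of the original, which is worth keeping in mind when looking up pixel values from positions on the displayed image.

```python
import numpy as np

image = np.arange(12).reshape(4, 3)   # toy image: 4 rows (y), 3 columns (x)
flipped = np.flipud(image)            # reverse the row order

n_rows = image.shape[0]
row, col = 1, 2
# Row r of the flipped array is row (n_rows - 1 - r) of the original
assert flipped[row, col] == image[n_rows - 1 - row, col]
print(flipped)
```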

In [None]:
# Define some default plot options for the Image
img_opts = dict(height=400, width=500, 
                xaxis="bottom", 
                padding = 0.01, fontsize={'title': '8pt'},
                colorbar=True, toolbar='right', show_grid=True,
                tools=['hover']
               )     

In [None]:
# Make a function to generate a plot title from the dataId.
def dataIdToString(dataId: dict) -> str:
    title = "DC2 image: "
    for key, value in dataId.items():
        title += str(key) + ": " + str(value) + " "
    return title.strip() 

In [None]:
# Create the Image element.
img = hv.Image(scaledImage, bounds=bounds_img,
               kdims=['x-axis', 'y-axis']).opts(
    cmap = "Greys_r",  xlabel = 'X', ylabel ='Y',
    title = dataIdToString(calexpId),
    **img_opts)

In [None]:
rasterize(img)

#### 2.3 Overlaying source detections on an image

Now let's overlay the source detections from the `Source` catalog on this image. We will use the Points Element for the detections to overlay.

In [None]:
s = calexpSrc.getColumnView()
coords = s.getX(), s.getY()

In [None]:
f"Number of src detections is, {len(coords[1])}"

You can view an HTML rendering of the `src` table by getting an `astropy.table.Table` version of it. The coordinate columns are `coord_ra` and `coord_dec`.

In [None]:
detections = hv.Points(coords).opts(
    fill_color=None, size = 9, color="darkorange")

The `*` operator is used to overlay one Element on another.

In [None]:
rasterize(img).opts(cmap = 'Greys_r') * detections

### 3.0 Interactive Image Exploration with Holoviews Streams and DynamicMap

Now let's add some interactive exploration capability using Holoviews [Streams](http://holoviews.org/user_guide/Streaming_Data.html) and [DynamicMap](https://holoviews.org/reference/containers/bokeh/DynamicMap.html). A DynamicMap is an explorable multi-dimensional wrapper around a callable that returns HoloViews objects. The core concept behind a stream is simple: it defines one or more parameters that can change over time, and any code that depends on those parameter values is refreshed automatically when they change.

First create a DynamicMap with a box stream so that we can explore selected sections of the image.

In [None]:
boundsxy = (0, 0, 0, 0)
box = streams.BoundsXY(source=img, bounds=boundsxy)
dynamicMap = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box])

In [None]:
# Display the image and overlay the DynamicMap
rasterize(img) * dynamicMap

Using the interactive callback features on the image plots, such as the selection box, we can explore regions of the image.  Use the box select tool on the image above to select a region and then execute the cell below to get the box boundary coordinates. 

In [None]:
box
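
The stream and callback pattern used above can be pictured with a toy model in plain Python. `ToyStream` is a hypothetical class written for this illustration only; it is not the HoloViews implementation and leaves out everything to do with rendering.

```python
class ToyStream:
    """Hypothetical stand-in for a stream: holds parameters, notifies subscribers."""

    def __init__(self, **params):
        self.params = dict(params)
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def update(self, **params):
        # Store the new parameter values, then re-run every dependent callback
        self.params.update(params)
        for callback in self._subscribers:
            callback(**self.params)


rendered = []


def draw_box(bounds):
    # A real DynamicMap callback would return a HoloViews element here
    rendered.append(f"Bounds{bounds}")


stream = ToyStream(bounds=(0, 0, 0, 0))
stream.subscribe(draw_box)
stream.update(bounds=(10, 20, 110, 220))  # e.g. the user drags a selection box
print(rendered)
```

In the notebook, the box select tool plays the role of `update`, and the lambda passed to `hv.DynamicMap` plays the role of `draw_box`.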

Here's another version of the image with a tap stream instead of box select.  Try zooming in on an interesting part of the image and then click somewhere to place an 'X' marker. 

In [None]:
posxy = hv.streams.Tap(source=img, x=0.5 * calexp.getDimensions()[0],
                       y=0.5 * calexp.getDimensions()[1])
marker = hv.DynamicMap(lambda x, y: hv.Points([(x, y)]), streams=[posxy])
rasterize(img)* marker.opts(color='white', marker='x', size=20)

'X' marks the spot! What's the value at that location? Execute the next cell to find out.

In [None]:
print('The value at position (%.3f, %.3f) is %.3f' %
      (posxy.x, posxy.y, calexp.image.array[-int(posxy.y), int(posxy.x)]))

### 4.0 Brushing and linking between scatter plots with Bokeh

We will now look at the Bokeh plotting library directly to demonstrate how to set up brushing and linking between two panels showing different representations of the same dataset. A selection applied to either panel will highlight the selected points in the other panel.

Based on [Bokeh linked brushing](http://bokeh.pydata.org/en/latest/docs/user_guide/interaction/linking.html#linked-brushing)

#### 4.1 Data preparation
The basis for any data visualization is the underlying data. Getting the data preparation phase right is key to creating powerful visualizations. 
Bokeh works with a ColumnDataSource (CDS).  A CDS is essentially a collection of sequences of data that have their own unique column name. We will create a CDS from the data returned by the query above and pass it directly to bokeh. The CDS is the core of bokeh plots. Bokeh automatically creates a CDS from data passed as python lists or numpy arrays.  CDS are useful as they allow data to be shared between multiple plots and renderers, enabling brushing and linking. 

In [None]:
# Create a column data source for the plots to share. 
col_data = dict(x0=data['ra'] - c1.ra.value,
                y0=data['dec'] - c1.dec.value,
                x1=data['gmr'],
                y1=data['mag_g_cModel'],
                ra=data['ra'],
                dec=data['dec'])
source = ColumnDataSource(data=col_data)

# Additional data can be added to the CDS after creation
source.data['objectId'] = data['objectId']
source.data['rmi'] = data['rmi']
source.data['gmr'] = data['gmr']
source.data['mag_r_cModel'] = data['mag_r_cModel']
source.data['truth_type'] = data['truth_type']

# Create a view on truth_type stars
stars = CDSView(source=source,
                filters=[GroupFilter(column_name='truth_type', group="star")])

In [None]:
type(source)

#### 4.2 Colour-Magnitude Diagram linked to a Skyplot

We will use bokeh to plot a color-magnitude diagram (g vs. g-r), making use of the cModel magnitudes, alongside a skyplot, and link the two.

In [None]:
# Create a custom hover tool on both panels
hover_left = HoverTool(tooltips=[("(RA,DEC)", "(@ra, @dec)"),
                                 ("(g-r,g)", "(@x1, @y1)"),
                                 ("ObjectId", "@objectId")])
hover_right = HoverTool(tooltips=[("(RA,DEC)", "(@ra, @dec)"),
                                  ("(g-r,g)", "(@x1, @y1)"),
                                  ("ObjectId", "@objectId")])
tools = "box_zoom,box_select,lasso_select,reset,help"
tools_left = [hover_left, tools]
tools_right = [hover_right, tools]

In [None]:
# Create a new plot and add a renderer. We will look at stars only
stars = CDSView(source=source,
                filters=[GroupFilter(column_name='truth_type', group="star")])

left = figure(tools=tools_left, plot_width=400, plot_height=400,
              output_backend="webgl",
              title='Spatial: Centered on (RA, Dec) = (%.2f, %.2f)' %
              (c1.ra.value, c1.dec.value))
left.circle('x0', 'y0', hover_color='firebrick', 
            source=source, view=stars, # select only stars
            selection_fill_color='steelblue',
            selection_line_color='steelblue',
            nonselection_fill_color='silver',
            nonselection_line_color='silver')
left.x_range = Range1d(0.4, -0.4)
left.y_range = Range1d(-0.4, 0.4)
left.xaxis.axis_label = 'Delta RA'
left.yaxis.axis_label = 'Delta DEC'

# create another new plot and add a renderer
right = figure(tools=tools_right, plot_width=400, plot_height=400, 
               output_backend="webgl",
               title='CMD')
right.circle('x1', 'y1', hover_color='firebrick', 
             source=source, view=stars,  # Select only stars
             selection_fill_color='steelblue',
             selection_line_color='steelblue',
             nonselection_fill_color='silver',
             nonselection_line_color='silver')
right.x_range = Range1d(-1.5, 2.5)
right.y_range = Range1d(32., 16.)
right.xaxis.axis_label = 'g - r'
right.yaxis.axis_label = 'g'

p = gridplot([[left, right]])
show(p)

Use the hover tool to see information about individual datapoints (e.g., the objectId). This information should appear automatically as you hover the mouse over the datapoints. Notice that the data points highlighted in red on one panel with the hover tool are also highlighted on the other panel.

Next, click on the selection box icon (with a "+" sign) or the selection lasso icon found in the upper right corner of the figure. Use the selection box and selection lasso to make various selections in either panel by clicking and dragging on either panel. The selected data points will be displayed in the other panel.

### 5.0 Further analysis with Holoviews Linked Streams

If we want to do subsequent calculations with the set of selected points, we can use HoloViews linked streams for custom interactivity. The following visualization is a modification of the HoloViews example linked below. As in the example above, use the selection box and selection lasso to select datapoints on the left panel. The selected points should appear in the right panel. Finally, notice that as you change the selection on the left panel, the mean x- and y-values of the selected datapoints are shown in the title of the right panel.

Based on [Holoviews Selection1D points](http://holoviews.org/reference/streams/bokeh/Selection1D_points.html)

In [None]:
# Declare some points
points = hv.Points((data['ra'] - c1.ra.value, data['dec'] - c1.dec.value)
                  ).options(tools=['box_select', 'lasso_select'])

# Declare points as source of selection stream
selection = streams.Selection1D(source=points)

# Define a function that uses the selection indices to slice points and compute stats
def selected_info(index):
    selected = points.iloc[index]
    if index:
        label = 'Mean x, y: %.3f, %.3f' % tuple(selected.array().mean(axis=0))
    else:
        label = 'No selection'
    return selected.relabel(label).options(color='red')


# Combine points and DynamicMap
# Notice the syntax used here: the "+" sign makes side-by-side panels
points + hv.DynamicMap(selected_info, streams=[selection])

In the next cell, we access the indices of the selected datapoints. We could use these indices to select a subset of the full sample for further examination.

In [None]:
print(len(selection.index))
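
The statistics shown in the panel title come down to indexing an array of points with the selected indices. A NumPy-only sketch of that step, using a toy array of (x, y) points and a hand-written index list in place of `selection.index` (`selection_label` is a name chosen for this illustration):

```python
import numpy as np

xy = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # toy (x, y) points
index = [1, 2]  # indices in the form a selection stream would report them


def selection_label(xy, index):
    """Build the title text from the selected rows."""
    if len(index) == 0:
        return 'No selection'
    mean_x, mean_y = xy[index].mean(axis=0)
    return 'Mean x, y: %.3f, %.3f' % (mean_x, mean_y)


print(selection_label(xy, index))
print(selection_label(xy, []))
```

The same index list could be used to pull the matching rows out of the full `data` table, for example with `data.iloc[selection.index]`.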

### 6.0  Visualizing Larger Datasets with Datashader

The interactive features of Bokeh work well with datasets up to a few tens of thousands of data points. To efficiently explore larger datasets, we'd like to use another visualization model that offers better scalability, namely Datashader.

In the examples below, notice that as one zooms in on the datashaded two-dimensional histograms, the bin sizes are dynamically adjusted to show finer or coarser granularity in the distribution. This allows one to interactively explore large datasets without having to manually adjust the bin sizes while panning and zooming. Zoom in all the way and you can see individual points (i.e., bins contain either zero or one count). If you zoom in far enough, the individual points are represented by extremely small pixels in datashader that are difficult to see. A solution is to apply `dynspread` to the output of `datashade`, which preserves a finite size for the plotted points.

The next cell also uses the concept of linked Streams in HoloViews for custom interactivity, in this case to create a selection box. We'll use that selection box tool in the following cell.

#### 6.1 Color-color plot 

Here we plot a colour-colour diagram of the cModel magnitudes obtained from the query in 1. Data preparation.

In [None]:
# Create color-color plot using bokeh
plot_options = {'plot_height': 400, 'plot_width': 800,
                'tools': ['hover', 'box_select', 'reset', 'help']}

hover = HoverTool(tooltips=[("objectId", "@objectId"),
                            ("(RA,DEC)", "(@ra, @dec)"),
                            ("(g-r,r-i)", "(@gmr, @rmi)"),
                            ("type", "@truth_type")])

p = figure(title="Colour-Colour Diagram (cModel magnitudes)",
           x_axis_label="g-r", y_axis_label="r-i",
           x_range=(-2.0, 3.0), y_range=(-2.0, 3.0),
           **plot_options)
p.circle(x='gmr', y='rmi', source=source,
         size=3, alpha=0.3,
         hover_color='firebrick',
         legend_field="truth_type",
         color=factor_cmap('truth_type', 'Category10_3',
                           ['galaxy', 'star', 'SNe']))
p.add_tools(hover)
show(p)

We see that even with a medium sized dataset of ~100K points, this plot suffers from overplotting.  A classic strategy is to specify transparency of the glyphs so we can better see sparse and dense areas. In the plot above we have `alpha=0.3`. This helps but washes out the detail in the sparser regions. An additional problem is that we cannot add too many glyphs to any plot. 

Holoviews + Datashader allows us to plot millions to billions of points and so produce much more informative plots. Datashader rasterizes or aggregates datasets into regular grids that can then be further analysed or viewed as images. 
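
The aggregation idea can be sketched with NumPy alone: bin the points into a fixed grid of counts, one cell per output pixel, and colour the grid instead of drawing one glyph per point. This is only a conceptual illustration using `np.histogram2d` on synthetic colours; it is not how Datashader is implemented, and the grid size of 800 x 300 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
gmr = rng.normal(0.8, 0.4, size=100_000)  # synthetic colours, not DP0.1 data
rmi = rng.normal(0.4, 0.3, size=100_000)

# Aggregate the points into a fixed 800 x 300 grid of counts ("pixels")
counts, x_edges, y_edges = np.histogram2d(
    gmr, rmi, bins=(800, 300), range=[(-2.0, 3.0), (-2.0, 3.0)])

# Every point inside the plotted range lands in exactly one grid cell
in_range = (gmr >= -2.0) & (gmr <= 3.0) & (rmi >= -2.0) & (rmi <= 3.0)
print(counts.shape, int(counts.sum()), int(in_range.sum()))
```

The cost of drawing the grid depends on the number of cells rather than the number of points, which is why this approach scales to very large catalogs.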

In [None]:
# Create a holoviews object to hold and plot data
points = hv.Points((source.to_df()['gmr'], 
                    source.to_df()['rmi'])).opts(
    tools=['box_select', 'lasso_select'])


# Create the linked streams instance
boundsxy = (0, 0, 0, 0)
box = streams.BoundsXY(source=points, bounds=boundsxy)
bounds = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box])


# Apply the datashader
dynspread(datashade(points, cmap="Viridis").opts(
    width=800, height=300, tools=['hover'],
    padding=0.05, show_grid=True,
    xlim=(-2.0, 3.0), ylim=(-2.0, 3.0),
    xlabel="g-r", ylabel="r-i")) * bounds

This `datashade` plot of the same color-color diagram as above shows much more detail.  Select the `wheel zoom` and adjust the image as you interact with the plot. Note how the shades of color of the data points change according to the local density.

#### 6.2 Adding a callback function
Next we will add callback functionality to the colour-colour diagram above to retrieve the indices of selected points. We use the box selection tool to create a selection box for a two-dimensional histogram and then count the number of datapoints within the selection region.

> STOP - Select some data points from the plot above using the box select tool before proceeding

In [None]:
selection = (points.data.x > box.bounds[0]) \
    & (points.data.y > box.bounds[1]) \
    & (points.data.x < box.bounds[2]) \
    & (points.data.y < box.bounds[3])
print('The selection box contains %i datapoints'%(np.sum(selection)))
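
The same box test can be wrapped in a small function, sketched here on toy arrays. `in_box` is a hypothetical helper name, and the bounds follow the (left, bottom, right, top) order used by the box stream.

```python
import numpy as np


def in_box(x, y, bounds):
    """Boolean mask of points strictly inside (left, bottom, right, top)."""
    left, bottom, right, top = bounds
    return (x > left) & (x < right) & (y > bottom) & (y < top)


x = np.array([0.1, 0.5, 0.9, 1.5])
y = np.array([0.2, 0.6, 1.2, 0.4])
mask = in_box(x, y, (0.0, 0.0, 1.0, 1.0))
print('The selection box contains %i datapoints' % mask.sum())
```

With the initial bounds of `(0, 0, 0, 0)` the mask is empty, which is why a box must be drawn on the plot before the count is meaningful.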

Now we will plot a spatial distribution on the sky of all the data and link it to a two-dimensional histogram of the data in the box selection.

In [None]:
# First, create a holoviews dataset instance. 
# Here we label some of the columns.
kdims = [('ra', 'RA(deg)'), ('dec', 'Dec(deg)')]
vdims = [('mag_r_cModel', 'r(mag)')]
ds = hv.Dataset(source.to_df(), kdims, vdims)

In [None]:
points = hv.Points(ds)
boundsxy = (np.min(ds.data['ra']), np.min(ds.data['dec']),
            np.max(ds.data['ra']), np.max(ds.data['dec']))
box = streams.BoundsXY(source=points, bounds=boundsxy)
box_plot = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box])

In [None]:
# Custom callback functionality to update the linked histogram
def log_inf(x):
    return np.log(x) if x > 0 else 0


def update_histogram(bounds=boundsxy):
    selection = (ds.data['ra'] > bounds[0]) & \
                (ds.data['dec'] > bounds[1]) & \
                (ds.data['ra'] < bounds[2]) & \
                (ds.data['dec'] < bounds[3])

    selected_mag = ds.data.loc[selection]['mag_r_cModel']
    frequencies, edges = np.histogram(selected_mag)
    hist = hv.Histogram((list(map(log_inf, frequencies)), edges))
    return hist
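
The element-wise `log_inf` mapping used in the callback can also be written in vectorized form. The sketch below uses a made-up array of counts and checks that both versions agree; it is offered as an alternative, not as a change to the cell above.

```python
import numpy as np


def log_inf(x):
    return np.log(x) if x > 0 else 0


frequencies = np.array([0, 1, 10, 250])  # made-up histogram counts

# Vectorized form: take the log only where the count is positive, else keep 0
log_freq = np.zeros(frequencies.shape, dtype=float)
np.log(frequencies, out=log_freq, where=frequencies > 0)

print(log_freq)
```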

In [None]:
dmap = hv.DynamicMap(update_histogram, streams=[box]).options(
    height=400, width=400)
datashade(points,
          cmap=process_cmap("Viridis", provider="bokeh")) * \
box_plot.options(height=400, width=400) + \
dmap

Try changing the box selection and watch as the histogram is recomputed and displayed. 

### 7.0  Optional exercises for the user 

 1. Holoviews works with several plotting libraries: Bokeh, matplotlib and plotly. As an exercise, try changing the Holoviews plotting library to be `matplotlib` instead of `bokeh` in the `Setup` cell at the beginning of the notebook with `hv.extension('matplotlib')`. You will see the holoviews + matplotlib icons displayed when the library is loaded successfully. Run the cells in section 2.1 again and compare the outputs. Try again with another plotting library. Don't forget to set the plotting library back to whichever you prefer to use for the rest of this tutorial.
 
 2. In the image display sections, try using the coadd image instead of the calexp image. 
 
 3. Try writing a different callback function to use in section 6.2. 