<img align="left" src = https://project.lsst.org/sites/default/files/Rubin-O-Logo_0.png width=250, style="padding: 10px"> 
<b>Interactive Image Visualization</b> <br>
Last verified to run on <b>2021-06-25</b> with LSST Science Pipelines release <b>w_2021_25</b> <br>
Contact authors: Leanne Guy <br>
Credit: Originally developed by Keith Bechtol in the context of the Stack Club <br>
Target audience: All DP0 delegates. <br>
Container Size: medium <br>
Questions welcome at <a href="https://community.lsst.org/c/support/dp0">community.lsst.org/c/support/dp0</a> <br>
Find DP0 documentation and resources at <a href="https://dp0-1.lsst.io">dp0-1.lsst.io</a> <br>

**Table of Contents**
1. Introduction to interactive visualization with Bokeh, HoloViews, and Datashader <br>
2. Interactive exposure image visualization<br>
3. DP0.1 catalog data sample<br>
4. Brushing and linking between scatter plots with Bokeh<br>
5. Further analysis with Holoviews linked streams<br>
6. Visualizing Larger Datasets with Datashader

### 0. Setup

In [None]:
# General python imports
import numpy as np

# Astropy
from astropy import units as u
from astropy.coordinates import SkyCoord

# Bokeh and Holoviews for visualization
import bokeh
from bokeh.io import output_notebook, show
from bokeh.models import ColumnDataSource, Range1d, HoverTool
from bokeh.models import Selection, CDSView, GroupFilter
from bokeh.plotting import figure, gridplot
from bokeh.transform import factor_cmap

import holoviews as hv
from holoviews import streams
from holoviews.operation.datashader import datashade, dynspread, rasterize
from holoviews.plotting.util import process_cmap
import datashader as dsh

hv.extension('bokeh')

# Display bokeh plots inline in the notebook
output_notebook()

### 1.0 Introduction <br>

#### 1.1 Interactive Image Visualization with Bokeh, HoloViews, and Datashader<br>

In the tutorial 03_Image_Display_and_Manipulation (afw) we saw how to use the `lsst.afw.display` library to visualize exposure images. This tutorial demonstrates a few of the interactive features of the [Bokeh](https://bokeh.pydata.org/en/latest/), [HoloViews](http://holoviews.org/), and [Datashader](http://datashader.org/) plotting packages in the notebook environment. These packages are part of the [PyViz](http://pyviz.org/) set of Python tools intended for visualization use cases in a web browser, and can be used to create quite sophisticated dashboard-like interactive displays and widgets. The goal of this notebook is to provide an introduction and starting point from which to create more advanced, custom interactive visualizations. 

#### 1.2 Learning Objectives
After working through and studying this notebook you should be able to:
   1. Use `holoviews` to visualize and interact with an exposure image.
   2. Use `bokeh` to create interactive figures with brushing and linking between multiple plots.
   3. Use `holoviews` and `datashader` to create two-dimensional histograms with dynamic binning to efficiently explore large datasets.

#### 1.3 Logistics
This notebook is intended to be runnable on `data.lsst.cloud`. Note that occasionally the notebook may seem to stall, or the interactive features may seem disabled. If this happens, usually a restart of the kernel fixes the issue. You might also need to log out of the RSP and start a "large" instance of the JupyterLab environment. In some examples shown in this notebook, the order in which the cells are run is important for understanding the interactive features, so you may want to re-run the set of cells in a given section if you encounter unexpected behavior.

In [None]:
# What versions of bokeh, holoviews, and datashader are we working with?
# This is important when referring to online documentation as
# APIs can change between versions.
print("Bokeh version: " + bokeh.__version__)
print("Holoviews version: " + hv.__version__)
print("Datashader version: " + dsh.__version__)

### 2. Exposure Image Visualization

In this example we demonstrate image visualization at the pixel level with datashader.

#### 2.1 Finding and retrieving an image with the `butler`
For DP0.1, images can only be accessed via the `butler` (<a href="https://pipelines.lsst.io/modules/lsst.daf.butler/index.html">documentation</a>), an LSST Science Pipelines software package that allows you to fetch the LSST data you want without you having to know its location or format. For more details on how to use the Butler, see tutorial 04_Intro_to_Butler. 

We will retrieve a deep r-band coadd image, specifying a tract and patch.

In [None]:
# Load the Butler, which provides programmatic access to LSST data products.
from lsst.daf.butler import Butler

repo = 's3://butler-us-central1-dp01'
collection = '2.2i/runs/DP0.1'
butler = Butler(repo, collections=collection)

dataId = {'tract': 4226, 'patch': 17, 'band': 'r'}

# Retrieve a deep coadded calibrated exposure using the `butler` instance
image = butler.get('deepCoadd', **dataId)
assert image is not None

In [None]:
%%output size=200

# Display the coadd image retrieved above
bounds_img = (0, 0, image.getDimensions()[0], image.getDimensions()[1])
img = hv.Image(np.log10(image.image.array),
               bounds=bounds_img).options(colorbar=True,
                                          cmap=bokeh.palettes.Viridis256)

boundsxy = (0, 0, 0, 0)
box = streams.BoundsXY(source=img, bounds=boundsxy)
bounds = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box])

rasterize(img) * bounds

Interactive callback features, such as the selection box, can be used on image plots. Use the box select tool on the image above and then execute the cell below to get the box boundary coordinates. 

In [None]:
box
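
The `bounds` attribute of a `BoundsXY` stream is a `(left, bottom, right, top)` tuple in plot coordinates. As a minimal sketch of what one could do with it, the following converts such a tuple into an array cutout. It assumes the plot spans `(0, 0, width, height)` with one pixel per unit, as in the cell above, and that row 0 of the array is drawn at the top of the plot. The helper `cutout_from_bounds` and the small `demo` array are hypothetical stand-ins, not part of the notebook.

```python
import numpy as np


def cutout_from_bounds(array, bounds):
    """Return the sub-array covered by a (left, bottom, right, top) box.

    Assumes the plot spans (0, 0, width, height) with one pixel per unit
    and that row 0 of the array is drawn at the top of the plot.
    """
    height, width = array.shape
    left, bottom, right, top = bounds
    # Clip the box to the image and convert to integer pixel edges
    col_min = max(int(np.floor(left)), 0)
    col_max = min(int(np.ceil(right)), width)
    row_min = max(height - int(np.ceil(top)), 0)
    row_max = min(height - int(np.floor(bottom)), height)
    return array[row_min:row_max, col_min:col_max]


# Hypothetical 4 x 6 array standing in for image.image.array
demo = np.arange(24).reshape(4, 6)
sub = cutout_from_bounds(demo, (1.0, 0.0, 3.0, 2.0))
print(sub.shape)
```

With the real data, something like `cutout_from_bounds(image.image.array, box.bounds)` would be the analogous call after drawing a box.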

Here's another version of the image with a tap stream instead of box select. Click on the image to place an 'X' marker.

In [None]:
%%output size=200
%%opts Points (color='white' marker='x' size=20)

posxy = hv.streams.Tap(source=img, x=0.5 * image.getDimensions()[0],
                       y=0.5 * image.getDimensions()[1])
marker = hv.DynamicMap(lambda x, y: hv.Points([(x, y)]), streams=[posxy])

rasterize(img) * marker

'X' marks the spot! What's the value at that location? Execute the next cell to find out.

In [None]:
# hv.Image draws array row 0 at the top, so count rows from the end
row, col = -int(posxy.y) - 1, int(posxy.x)
print('The value at position (%.3f, %.3f) is %.3f' %
      (posxy.x, posxy.y, image.image.array[row, col]))

### 3.0  DP0.1 catalog data sample
In the following example we query the catalogs using the TAP service to obtain a sample of data. For more details about using the TAP service and ADQL queries, please refer to tutorial 02_Intermediate_TAP_Query. We will use the same query as in the 

#### 3.1 Create the Rubin TAP Service client

In [None]:
from rubin_jupyter_utils.lab.notebook import get_tap_service, retrieve_query
service = get_tap_service()
assert service is not None

#### 3.2 Query the DP0.1 catalogs

In [None]:
# Define a reference position on the sky and cone radius in arcminutes
# Coordinates obtained to optimize result set size 
c1 = SkyCoord(ra=59.7955707*u.degree, dec=-29.91176471*u.degree, frame='icrs')
radius = 15.882353 * u.arcmin

In [None]:
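# Get the corner coordinates of a box enclosing the search cone
# (no cos(dec) correction is applied to the RA extent)
half_width = radius.to(u.deg).value
ra1 = c1.ra.value - half_width
ra2 = c1.ra.value + half_width
dec1 = c1.dec.value - half_width
dec2 = c1.dec.value + half_width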
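
Lines of constant RA converge towards the poles, so a box that fully encloses the search cone needs to be wider in RA than in Dec. The sketch below shows one way to compute such a box, stretching the RA half-width by `1 / cos(dec)`. This is a small-angle approximation that is only reasonable away from the poles, and the helper name `cone_bounding_box` is hypothetical.

```python
import math


def cone_bounding_box(ra_deg, dec_deg, radius_deg):
    """Return (ra_min, ra_max, dec_min, dec_max) of a box enclosing a cone.

    The RA half-width is stretched by 1 / cos(dec), a small-angle
    approximation that is only reasonable away from the poles.
    """
    half_ra = radius_deg / math.cos(math.radians(dec_deg))
    return (ra_deg - half_ra, ra_deg + half_ra,
            dec_deg - radius_deg, dec_deg + radius_deg)


# Reference position and radius (converted to degrees) used in this notebook
ra_min, ra_max, dec_min, dec_max = cone_bounding_box(
    59.7955707, -29.91176471, 15.882353 / 60.0)
print(ra_min, ra_max, dec_min, dec_max)
```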

In [None]:
# ADQL query that counts the objects inside a polygon around the
# reference position
count_query = "SELECT COUNT(*) FROM dp01_dc2_catalogs.object " \
              "WHERE CONTAINS(POINT('ICRS', ra, dec), " \
              "POLYGON('ICRS', 59.48893247541351, -30.176108281372347, " \
              "60.10220892152004, -30.176108281372347, " \
              "60.10220892152004, -29.647064552820886, " \
              "59.48893247541351, -29.647064552820886)) = 1"

In [None]:
query = "SELECT obj.objectId, obj.ra, obj.dec, obj.mag_g, obj.mag_r, " \
        "obj.mag_i, obj.mag_g_cModel, obj.mag_r_cModel, obj.mag_i_cModel," \
        "obj.psFlux_g, obj.psFlux_r, obj.psFlux_i, obj.cModelFlux_g, " \
        "obj.cModelFlux_r, obj.cModelFlux_i, obj.tract, obj.patch, " \
        "obj.extendedness, obj.good, obj.clean, " \
        "truth.mag_r as truth_mag_r, truth.match_objectId, "\
        "truth.flux_g, truth.flux_r, truth.flux_i, truth.truth_type, " \
        "truth.match_sep, truth.is_variable " \
        "FROM dp01_dc2_catalogs.object as obj " \
        "JOIN dp01_dc2_catalogs.truth_match as truth " \
        "ON truth.match_objectId = obj.objectId " \
        "WHERE CONTAINS(POINT('ICRS', obj.ra, obj.dec),"\
        "CIRCLE('ICRS', " + str(c1.ra.value) + ", " + str(c1.dec.value) + ", " \
        + str(radius.to(u.deg).value) + " )) = 1 " \
        "AND truth.match_objectid >= 0 "\
        "AND truth.is_good_match = 1"
print(query)
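
The spatial constraint in the query above is assembled by string concatenation. As a sketch of an alternative, the same constraint can be built with an f-string helper, which keeps the numeric values and the ADQL syntax easier to read. The function `cone_constraint` and its parameter names are hypothetical and not part of any Rubin or TAP library.

```python
def cone_constraint(ra_deg, dec_deg, radius_deg,
                    ra_col="obj.ra", dec_col="obj.dec"):
    """Build an ADQL cone-search constraint as a string."""
    return (f"CONTAINS(POINT('ICRS', {ra_col}, {dec_col}), "
            f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg})) = 1")


# Illustrative values: the reference position with a rounded radius in degrees
where = cone_constraint(59.7955707, -29.91176471, 0.2647)
print(where)
```

The returned string could then be appended after `WHERE` when composing a full query.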

In [None]:
query = "SELECT obj.objectId, obj.ra, obj.dec, obj.mag_g, obj.mag_r, " \
        "obj.mag_i, obj.mag_g_cModel, obj.mag_r_cModel, obj.mag_i_cModel," \
        "obj.psFlux_g, obj.psFlux_r, obj.psFlux_i, obj.cModelFlux_g, " \
        "obj.cModelFlux_r, obj.cModelFlux_i, obj.tract, obj.patch, " \
        "obj.extendedness, obj.good, obj.clean, " \
        "truth.mag_r as truth_mag_r, truth.match_objectId, "\
        "truth.flux_g, truth.flux_r, truth.flux_i, truth.truth_type, " \
        "truth.match_sep, truth.is_variable " \
        "FROM dp01_dc2_catalogs.object as obj " \
        "JOIN dp01_dc2_catalogs.truth_match as truth " \
        "ON truth.match_objectId = obj.objectId " \
        "WHERE CONTAINS(POINT('ICRS', obj.ra, obj.dec), " \
        "POLYGON('ICRS', 59.53086482, -30.17647059, 60.06027658, -30.17647059, " \
        "60.06027658, -29.64705882, 59.53086482, -29.64705882)) = 1 " \
        "AND truth.match_objectid >= 0 AND truth.is_good_match = 1"
print(query)

In [None]:
%%time
# Execute the query and convert the results to a pandas dataframe
data = service.search(query).to_table().to_pandas()
#assert len(data) == 14424
assert len(data) == 102096

### 4.0 Brushing and linking between scatter plots with Bokeh

First, an example with brushing and linking between two panels showing different representations of the same dataset. 
A selection applied to either panel will highlight the selected points in the other panel.

Based on http://bokeh.pydata.org/en/latest/docs/user_guide/interaction/linking.html#linked-brushing 

In [None]:
# Create a column data source for the plots to share
col_data = dict(x0=data['ra'] - c1.ra.value,
                y0=data['dec'] - c1.dec.value,
                x1=data['mag_g_cModel'] - data['mag_r_cModel'],
                y1=data['mag_g_cModel'],
                ra=data['ra'],
                dec=data['dec'])
source = ColumnDataSource(data=col_data)

# Additional data can be added to the CDS after creation
source.data['objectId'] = data['objectId']
source.data['rmi'] = data['mag_r_cModel']-data['mag_i_cModel']
source.data['gmr'] = data['mag_g_cModel']-data['mag_r_cModel']
source.data['mag_r_cModel'] = data['mag_r_cModel']

In [None]:
# Create a custom hover tool on both panels
hover_left = HoverTool(tooltips=[("(RA,DEC)", "(@ra, @dec)"),
                                 ("(g-r,g)", "(@x1, @y1)"),
                                 ("ObjectId", "@objectId")])
hover_right = HoverTool(tooltips=[("(RA,DEC)", "(@ra, @dec)"),
                                  ("(g-r,g)", "(@x1, @y1)"),
                                  ("ObjectId", "@objectId")])
TOOLS = "box_zoom,box_select,lasso_select,reset,help"
TOOLS_LEFT = [hover_left, TOOLS]
TOOLS_RIGHT = [hover_right, TOOLS]

In [None]:
# Create views based on the truth type
# We will convert the truth_type integer to a more descriptive string
object_map = {1: 'galaxy', 2: 'star', 3: 'SNe'}
source.data['truth_type'] = data['truth_type'].map(object_map)

In [None]:
# create a new plot and add a renderer
left = figure(tools=TOOLS_LEFT, plot_width=400, plot_height=400,
              output_backend="webgl",
              title='Spatial: Centered on (RA, Dec) = (%.2f, %.2f)' %
              (c1.ra.value, c1.dec.value))
left.circle('x0', 'y0', hover_color='firebrick', source=source,
            selection_fill_color='steelblue',
            selection_line_color='steelblue',
            nonselection_fill_color='silver',
            nonselection_line_color='silver')
left.x_range = Range1d(0.4, -0.4)
left.y_range = Range1d(-0.4, 0.4)
left.xaxis.axis_label = 'Delta RA'
left.yaxis.axis_label = 'Delta DEC'

# create another new plot and add a renderer
right = figure(tools=TOOLS_RIGHT, plot_width=400,
               plot_height=400, output_backend="webgl",
               title='CMD')
right.circle('x1', 'y1', hover_color='firebrick', source=source,
             selection_fill_color='steelblue',
             selection_line_color='steelblue',
             nonselection_fill_color='silver',
             nonselection_line_color='silver')
right.x_range = Range1d(-0.5, 2.5)
right.y_range = Range1d(26., 16.)
right.xaxis.axis_label = 'g - r'
right.yaxis.axis_label = 'g'

p = gridplot([[left, right]])
show(p)

Use the hover tool to see information about individual datapoints (e.g., the objectId). This information should appear automatically as you hover the mouse over the datapoints. Notice that the data points highlighted in red on one panel with the hover tool are also highlighted on the other panel.

Next, click on the selection box icon (with a "+" sign) or the selection lasso icon found in the upper right corner of the figure. Use the selection box and selection lasso to make various selections in either panel by clicking and dragging on either panel. The selected data points will be displayed in the other panel.

### 5.0 Further analysis with Holoviews Linked Streams

If we want to do subsequent calculations with the set of selected points, we can use HoloViews linked streams for custom interactivity. The following visualization is a modification of this example. As in the example above, use the selection box and selection lasso to select datapoints on the left panel. The selected points should appear in the right panel. Finally, notice that as you change the selection on the left panel, the mean x- and y-values for the selected datapoints are shown in the title of the right panel.

In [None]:
%%opts Points [tools=['box_select', 'lasso_select']]
%%output size=150

# Declare some points
points = hv.Points((data['ra'] - c1.ra.value, data['dec'] - c1.dec.value))

# Declare points as source of selection stream
selection = streams.Selection1D(source=points)


# Function that uses the selection indices to slice points and compute stats
def selected_info(index):
    selected = points.iloc[index]
    if index:
        label = 'Mean x, y: %.3f, %.3f' % tuple(selected.array().mean(axis=0))
    else:
        label = 'No selection'
    return selected.relabel(label).options(color='red')


# Combine points and DynamicMap
# Notice the syntax used here: the "+" sign makes side-by-side panels
points + hv.DynamicMap(selected_info, streams=[selection])

In the next cell, we access the indices of the selected datapoints. We could use these indices to select a subset of the full sample for further examination.

In [None]:
print(len(selection.index))
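
As a minimal sketch of using such indices, the example below picks out the selected rows from a pair of small arrays and computes a summary statistic. The arrays and `selected_index` are hypothetical stand-ins for the catalog columns and for `selection.index`. With the pandas DataFrame from the query, `data.iloc[selection.index]` would perform the analogous row selection.

```python
import numpy as np

# Hypothetical stand-ins for catalog columns and for selection.index
ra = np.array([59.70, 59.80, 59.90, 60.00])
mag_r = np.array([21.5, 19.2, 23.1, 20.4])
selected_index = [1, 3]

# Integer-list indexing returns only the selected rows
subset_ra = ra[selected_index]
subset_mag = mag_r[selected_index]
print("Selected %d objects, mean r = %.2f"
      % (len(subset_mag), subset_mag.mean()))
```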

### 6.0  Visualizing Larger Datasets with Datashader

The interactive features of Bokeh work well with datasets up to a few tens of thousands of data points. To efficiently explore larger datasets, we'd like to use another visualization model that offers better scalability, namely Datashader.

In the examples below, notice that as one zooms in on the datashaded two-dimensional histograms, the bin sizes are dynamically adjusted to show finer or coarser granularity in the distribution. This allows one to interactively explore large datasets without having to manually adjust the bin sizes while panning and zooming. Zoom in all the way and you can see individual points (i.e., bins contain either zero or one count). If you zoom in far enough, the individual points are represented by extremely small pixels in datashader that are difficult to see. A solution is to wrap the `datashade` output in `dynspread`, which preserves a finite size for the plotted points.
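
The idea of dynamic binning can be illustrated without Datashader: keep the number of bins fixed and recompute the counts over whatever range is currently in view. The sketch below does this with `numpy.histogram2d` on random points. It is a conceptual illustration only, not how Datashader is implemented, and the helper `aggregate` and its grid size are hypothetical.

```python
import numpy as np


def aggregate(x, y, x_range, y_range, width=4, height=4):
    """Count points per cell on a fixed-size grid covering the viewport."""
    counts, _, _ = np.histogram2d(x, y, bins=(width, height),
                                  range=(x_range, y_range))
    return counts


rng = np.random.default_rng(seed=0)
x = rng.uniform(0.0, 1.0, size=10_000)
y = rng.uniform(0.0, 1.0, size=10_000)

# Same grid size, different viewport: zooming in makes each bin smaller
full = aggregate(x, y, (0.0, 1.0), (0.0, 1.0))
zoom = aggregate(x, y, (0.0, 0.1), (0.0, 0.1))
print(full.sum(), zoom.sum())
```

Because the grid size stays the same while the viewport shrinks, each bin covers a smaller area after zooming, which is why finer structure appears.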

The next cell also uses the concept of linked Streams in HoloViews for custom interactivity, in this case to create a selection box. We'll use that selection box tool in the following cell.

#### 6.1 Color-color plot 

Here we plot a color-color diagram of the cModel magnitudes obtained from the query in 3.2.

In [None]:
# Create color-color plot using bokeh
plot_options = {'plot_height': 300, 'plot_width': 800,
                'tools': ['hover', 'box_select', 'reset', 'help']}

hover = HoverTool(tooltips=[("objectId", "@objectId"),
                            ("(RA,DEC)", "(@ra, @dec)"),
                            ("(g-r,r-i)", "(@gmr, @rmi)"),
                            ("type", "@truth_type")])

p = figure(title="Colour-Colour Diagram (cModel magnitudes)",
           x_axis_label="g-r", y_axis_label="r-i",
           x_range=(-2.0, 3.0), y_range=(-2.0, 3.0),
           **plot_options)
p.circle(x='gmr', y='rmi', source=source,
         size=3, alpha=0.3,
         hover_color='firebrick',
         legend_field="truth_type",
         color=factor_cmap('truth_type', 'Category10_3',
                           ['galaxy', 'star', 'SNe']))
p.add_tools(hover)

# Change to gridplot with stars to the right 
show(p)

We see that even with a medium sized dataset of ~14K points, this plot suffers from overplotting.  A classic strategy is to specify transparency of the glyphs so we can better see sparse and dense areas. In the plot above we have `alpha=0.3`. This helps but washes out the detail in the sparser regions. An additional problem is that we cannot add too many glyphs to any plot. 
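
To see why transparency alone does not solve overplotting, consider how opacity accumulates when identical glyphs are drawn on top of each other. Assuming standard "over" alpha compositing, `n` overlapping glyphs with opacity `alpha` give a combined opacity of `1 - (1 - alpha)**n`. The helper name `stacked_opacity` is hypothetical.

```python
def stacked_opacity(alpha, n_overlapping):
    """Combined opacity of n identical glyphs drawn on top of each other."""
    return 1.0 - (1.0 - alpha) ** n_overlapping


# Opacity as more and more glyphs with alpha=0.3 overlap
for n in (1, 5, 10, 20):
    print(n, round(stacked_opacity(0.3, n), 3))
```

Under this assumption, about ten overlapping points already look almost fully opaque, so regions with tens and regions with thousands of points become hard to tell apart.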

Holoviews + Datashader allow us to plot millions to billions of points and to produce much more informative plots. Datashader rasterizes or aggregates datasets into regular grids that can then be further analysed or viewed as images. 

In [None]:
%%opts Points [tools=['box_select']]

# Create a holoviews object to hold and plot data
points = hv.Points((source.to_df()['gmr'], source.to_df()['rmi']))

# Create the linked streams instance
boundsxy = (0, 0, 0, 0)
box = streams.BoundsXY(source=points, bounds=boundsxy)
bounds = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box])

# Apply the datashader
dynspread(datashade(points, cmap="Viridis").opts(
    width=800, height=300,
    padding=0.05, show_grid=True,
    xlim=(-2.0, 3.0), ylim=(-2.0, 3.0),
    xlabel="g-r", ylabel="r-i")) * bounds

This `datashade` plot of the same color-color diagram as above shows much more detail.  Select the `wheel zoom` tool and adjust the image as you interact with the plot. Note how the shades of color of the data points change according to the local density.

Next we add callback functionality to the plot above and retrieve the indices of the selected points. First, use the box selection tool to create a selection box for the two-dimensional histogram above. Then run the cell below to count the number of datapoints within the selection region.

In [None]:
selection = (points.data.x > box.bounds[0]) \
    & (points.data.y > box.bounds[1]) \
    & (points.data.x < box.bounds[2]) \
    & (points.data.y < box.bounds[3])
print('The selection box contains %i datapoints'%(np.sum(selection)))

Now we will plot the spatial distribution on the sky of all the data and link it to a histogram of the r-band cModel magnitudes of the data in the box selection. Try changing the box selection and watch as the histogram is recomputed and displayed. 

In [None]:
# First, create a holoviews dataset instance. 
# Here we label some of the columns.
kdims = [('ra', 'RA(deg)'), ('dec', 'Dec(deg)')]
vdims = [('mag_r_cModel', 'r(mag)')]
ds = hv.Dataset(source.to_df(), kdims, vdims)

In [None]:
points = hv.Points(ds)
boundsxy = (np.min(ds.data['ra']), np.min(ds.data['dec']),
            np.max(ds.data['ra']), np.max(ds.data['dec']))
box = streams.BoundsXY(source=points, bounds=boundsxy)
box_plot = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box])

In [None]:
# Custom callback functionality to update the linked histogram
def log_inf(x):
    return np.log(x) if x > 0 else 0


def update_histogram(bounds=boundsxy):
    selection = (ds.data['ra'] > bounds[0]) & \
                (ds.data['dec'] > bounds[1]) & \
                (ds.data['ra'] < bounds[2]) & \
                (ds.data['dec'] < bounds[3])

    selected_mag = ds.data.loc[selection]['mag_r_cModel']
    frequencies, edges = np.histogram(selected_mag)
    hist = hv.Histogram((list(map(log_inf, frequencies)), edges))
    return hist
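
The `log_inf` helper above is applied to the bin counts one element at a time. A vectorized variant with NumPy is sketched below on a few made-up magnitudes. The function name `log_counts` is hypothetical. Note that with this convention a bin holding exactly one object has a log count of 0 and so looks the same as an empty bin.

```python
import numpy as np


def log_counts(values, bins=10):
    """Histogram with log-scaled counts; empty bins are set to 0."""
    frequencies, edges = np.histogram(values, bins=bins)
    logged = np.zeros(len(frequencies), dtype=float)
    nonzero = frequencies > 0
    # Take the log only where there is at least one count
    logged[nonzero] = np.log(frequencies[nonzero])
    return logged, edges


# Made-up magnitudes: three bright objects and one faint one
mags = np.array([18.0, 18.1, 18.2, 24.0])
logged, edges = log_counts(mags, bins=3)
print(logged)
```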

In [None]:
%%output size=150
dmap = hv.DynamicMap(update_histogram, streams=[box])
datashade(points,
          cmap=process_cmap("Viridis", provider="bokeh")) * box_plot + dmap

### Additional Documentation

If you'd like some more information on `bokeh`, `holoviews` and `datashader`, please have a look at the following websites:

* [Bokeh website](https://bokeh.org/)  
* [Holoviews website](http://holoviews.org/index.html)  
* [Datashader website](https://datashader.org/)