<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Data-visualization,-week-9" data-toc-modified-id="Data-visualization,-week-9-1">Data visualization, week 9</a></span></li><li><span><a href="#What's-an-interactive-data-viz?" data-toc-modified-id="What's-an-interactive-data-viz?-2">What's an interactive data viz?</a></span></li><li><span><a href="#How-does-interactiveness-matter?" data-toc-modified-id="How-does-interactiveness-matter?-3">How does interactiveness matter?</a></span></li><li><span><a href="#Do-interactive-visualizations-add-value?" data-toc-modified-id="Do-interactive-visualizations-add-value?-4">Do interactive visualizations add value?</a></span></li><li><span><a href="#My-two-cents..." data-toc-modified-id="My-two-cents...-5">My two cents...</a></span></li><li><span><a href="#What's-the-best-(Python)-lib-for-interactive-data-viz?" data-toc-modified-id="What's-the-best-(Python)-lib-for-interactive-data-viz?-6">What's the best (Python) lib for interactive data viz?</a></span></li><li><span><a href="#Both-Bokeh-and-Plotly-are-best-in-class-libraries" data-toc-modified-id="Both-Bokeh-and-Plotly-are-best-in-class-libraries-7">Both Bokeh and Plotly are best in-class libraries</a></span></li><li><span><a href="#Yet,-Bokeh-offers-some-unique-advantages" data-toc-modified-id="Yet,-Bokeh-offers-some-unique-advantages-8">Yet, Bokeh offers some unique advantages</a></span></li><li><span><a href="#Is-Plotly-stuck-in-the-middle?" 
data-toc-modified-id="Is-Plotly-stuck-in-the-middle?-9">Is Plotly stuck in the middle?</a></span></li><li><span><a href="#Some-interactive-viz-(&amp;-the-anatomy-of-Bokeh)" data-toc-modified-id="Some-interactive-viz-(&amp;-the-anatomy-of-Bokeh)-10">Some interactive viz (&amp; the anatomy of Bokeh)</a></span></li><li><span><a href="#Organization-of-the-API" data-toc-modified-id="Organization-of-the-API-11">Organization of the API</a></span></li><li><span><a href="#Plotting-'arm'-of-Bokeh-(a-simple-scatter-diagram)" data-toc-modified-id="Plotting-'arm'-of-Bokeh-(a-simple-scatter-diagram)-12">Plotting 'arm' of Bokeh (a simple scatter diagram)</a></span></li><li><span><a href="#Model-'arm'-of-Bokeh" data-toc-modified-id="Model-'arm'-of-Bokeh-13">Model 'arm' of Bokeh</a></span></li><li><span><a href="#Exemplars" data-toc-modified-id="Exemplars-14">Exemplars</a></span></li><li><span><a href="#A-special-case:-Dashboards" data-toc-modified-id="A-special-case:-Dashboards-15">A special case: Dashboards</a></span></li></ul></div>

In [2]:
from IPython.display import Video

# Data visualization, week 9

<img src='images/bokeh_artwork.png' width=600px />

# What's an interactive data viz?

An interactive data viz is a __digital artefact__ whose features vary as users interact with it

For example, visual forms, colors, and contents (i.e., data) may change as cases are filtered

# How does interactiveness matter?

'Design' takes on further meanings in the context of an interactive viz

In the context of a __static viz__, design choices concern the aesthetic and functional properties of a plot

In the context of an __interactive viz__, design choices also concern the situation in which users manipulate the plot as an artifact

Therefore, important considerations arise about the ergonomics of the plot – e.g.:

* how many filtering buttons?
* where to locate buttons?
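
As a rough illustration of these two questions, the sketch below builds the same plot with its controls either to the left of the plot or below it. It assumes a recent Bokeh release (3.x); the `build_layout` helper, the widget labels, and the data are hypothetical and only serve to compare placements.

```python
from bokeh.layouts import column, row
from bokeh.models import CheckboxGroup, Select
from bokeh.plotting import figure


def build_layout(controls_position="left"):
    """Return a layout with two filtering widgets placed left of or below a plot."""
    # toy plot (made-up numbers)
    p = figure(height=300, width=500, title="Hypothetical sales by quarter")
    p.line([1, 2, 3, 4], [3, 1, 4, 2])

    # two filtering controls only: every extra widget adds cognitive load
    region = Select(title="Region", value="North", options=["North", "South"])
    years = CheckboxGroup(labels=["2019", "2020"], active=[0, 1])
    controls = column(region, years)

    # placement is a design choice: side-by-side or stacked
    if controls_position == "left":
        return row(controls, p)
    return column(p, controls)


side_layout = build_layout("left")
stacked_layout = build_layout("below")
```

Passing either layout to `bokeh.io.show` renders it in the browser, which makes it easy to compare the two placements with actual users.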

# Do interactive visualizations add value?

**Not at all**

- as the number of dynamic features increases, it is highly likely that users attach different meanings and interpretations to the interactive visualization
- potentially, this creates ambiguity about the 'core message' of the visualization
- interactive visualizations should be avoided when the primary objective of the designer is to create wisdom (e.g., data journalism)

**Definitely**

- interactive viz creates value for users by giving them exactly what they need $\rightarrow$ consumer surplus increases
- sometimes, users have in-depth knowledge of a subject (e.g., music enthusiasts) $\rightarrow$ they can credibly engage with even complex interactive visualizations
- sometimes, users are very sophisticated $\rightarrow$ appreciating their information needs is not trivial $\rightarrow$ interactive visualizations reduce the risk of settling on the wrong focus

# My two cents...

Interactive data viz works best for:

+ exploratory purposes
    * e.g., I've got my own template I use to make sense of a new dataset
+ users that understand the business/societal context
    * e.g., consultants
+ heavy users
    * e.g., people working on operations

Caveat: after all, interactive viz is a newcomer to the business:

+ take care to balance this novelty with some conservative visual forms
+ don't create a cognitive overload on the user's part (keep the number of buttons under control)

# What's the best (Python) lib for interactive data viz?

<img src='images/batman_superman.jpg' width=500px/>

# Both Bokeh and Plotly are best-in-class libraries

<img src='images/bokeh_plotly.png' width=1000px/>

# Yet, Bokeh offers some unique advantages

+ Bokeh offers endless customization opportunities
+ as the library matures, flexibility and ease of use are starting to go together
+ the quality of Bokeh's documentation is unmatched

# Is Plotly stuck in the middle?

+ new high-level APIs, such as HoloViews, require even less coding than Plotly
+ integrating some Bokeh visualizations in HoloViews requires very few lines of code

<img src='images/holoviews.png' width=800px/>

In [None]:
"""
Bokeh app example using datashader for rasterizing a large dataset and
geoviews for reprojecting coordinate systems.

This example requires the 1.7GB nyc_taxi_wide.parq dataset which
you can obtain by downloading the file from AWS:

  https://s3.amazonaws.com/datashader-data/nyc_taxi_wide.parq

Place this parquet in a data/ subfolder and install the python dependencies, e.g.

  conda install datashader fastparquet python-snappy

You can now run this app with:

  bokeh serve --show nytaxi_hover.py

"""
import numpy as np
import holoviews as hv
import dask.dataframe as dd

from holoviews import opts
from holoviews.operation.datashader import aggregate

hv.extension('bokeh')
renderer = hv.renderer('bokeh')

# Set plot and style options
opts.defaults(
    opts.Curve(xaxis=None, yaxis=None, show_grid=False, show_frame=False,
               color='orangered', framewise=True, width=100),
    opts.Image(width=800, height=400, shared_axes=False, logz=True, colorbar=True,
               xaxis=None, yaxis=None, axiswise=True, bgcolor='black'),
    opts.HLine(color='white', line_width=1),
    opts.Layout(shared_axes=False),
    opts.VLine(color='white', line_width=1))

# Read the parquet file
df = dd.read_parquet('./data/nyc_taxi_wide.parq').persist()

# Declare points
points = hv.Points(df, kdims=['pickup_x', 'pickup_y'], vdims=[])

# Use datashader to rasterize and linked streams for interactivity
agg = aggregate(points, link_inputs=True, x_sampling=0.0001, y_sampling=0.0001)
pointerx = hv.streams.PointerX(x=np.mean(points.range('pickup_x')), source=points)
pointery = hv.streams.PointerY(y=np.mean(points.range('pickup_y')), source=points)
vline = hv.DynamicMap(lambda x: hv.VLine(x), streams=[pointerx])
hline = hv.DynamicMap(lambda y: hv.HLine(y), streams=[pointery])

sampled = hv.util.Dynamic(agg, operation=lambda obj, x: obj.sample(pickup_x=x),
                          streams=[pointerx], link_inputs=False)

hvobj = ((agg * hline * vline) << sampled)

# Obtain Bokeh document and set the title
doc = renderer.server_doc(hvobj)
doc.title = 'NYC Taxi Crosshair'

"""
Outcome: https://holoviews.org/gallery/apps/bokeh/nytaxi_hover.html
"""

# Some interactive viz (& the anatomy of Bokeh)

# Organization of the API

<img src='images/bokeh_api.png' width=1000px/>

# Plotting 'arm' of Bokeh (a simple scatter diagram)

In [1]:
# import
import numpy as np   # needed for the random draws below
import pandas as pd
from bokeh.plotting import figure, output_file, show

# fake data
# --+ x
x0 = np.random.normal(loc=0, scale=1, size=50)
x1 = np.random.normal(loc=0, scale=1, size=50)
x2 = np.random.normal(loc=0, scale=1, size=50)
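# --+ y (linear in x plus per-point noise; the noise needs size=50,
#        otherwise a single scalar is drawn and merely rescales x)
y0 = 1 + x0 * 1 + np.random.normal(loc=0, scale=0.1, size=50)
y1 = 1 + x1 * 1 + np.random.normal(loc=0, scale=0.2, size=50)
y2 = 1 + x2 * 1 + np.random.normal(loc=0, scale=0.4, size=50)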
# --+ get a df
df0 = pd.DataFrame({'x': x0, 'y': y0, 'z': np.repeat(0, 50)})
df1 = pd.DataFrame({'x': x1, 'y': y1, 'z': np.repeat(1, 50)})
df2 = pd.DataFrame({'x': x2, 'y': y2, 'z': np.repeat(5, 50)})
df = pd.concat([df0, df1, df2])

# color map
colormap = {0: 'orange', 1: 'black', 5: 'purple'}
colors = [colormap[x] for x in df['z']]

# initialize the figure
p = figure(title = "Some fake data")

# axes
p.xaxis.axis_label = 'Consumption of vinegar crisps (stones)'
p.yaxis.axis_label = 'Consumption of carbonated drink (gallons)'

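# plot data (scatter draws circle markers by default;
# p.circle with a size argument is deprecated in recent Bokeh releases)
p.scatter(df["x"], df["y"], color=colors, fill_alpha=0.2, size=10)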

# dump the chart
output_file("at_the_pub.html", title="at_the_pub.py example")

# show the chart
show(p)
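
One way to make a scatter like this respond to the user, sketched here assuming a recent Bokeh release (3.x): put the data in a `ColumnDataSource` and attach a `HoverTool`, so that each point reveals its values when the mouse passes over it. The group labels, colors, and random seed are made up for the illustration.

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# fake data: two groups around the same linear trend (illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(loc=0, scale=1, size=50)
df = pd.DataFrame({
    "x": x,
    "y": 1 + x + rng.normal(loc=0, scale=0.2, size=50),
    "group": ["a"] * 25 + ["b"] * 25,
})
df["color"] = df["group"].map({"a": "orange", "b": "purple"})

# a ColumnDataSource lets glyphs and tools refer to columns by name
source = ColumnDataSource(df)

p = figure(title="Fake data with hover tooltips", tools="pan,wheel_zoom,reset")
p.scatter("x", "y", source=source, color="color", fill_alpha=0.2, size=10)

# '@column' pulls the value of that column for the hovered point
hover = HoverTool(tooltips=[("group", "@group"),
                            ("x", "@x{0.00}"),
                            ("y", "@y{0.00}")])
p.add_tools(hover)
```

Calling `show(p)` from `bokeh.plotting` opens the chart; the tooltip is the only element added on top of the static version.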

# Model 'arm' of Bokeh

In [2]:
# minimal setup
import numpy as np
from bokeh.io import show
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, RangeTool
from bokeh.plotting import figure


# sample data
from bokeh.sampledata.stocks import AAPL

# data manipulation
dates = np.array(AAPL['date'], dtype=np.datetime64)
source = ColumnDataSource(data=dict(date=dates, close=AAPL['adj_close']))

# initialize the plot
# (height/width replace plot_height/plot_width, which Bokeh 3.0 removed)
p = figure(height=300, width=800,                            # plot size
           tools="xpan", toolbar_location=None,              # tools
           x_axis_type="datetime", x_axis_location="above",  # x-axis
           x_range=(dates[1500], dates[2500]),               # initial window
           background_fill_color="#efefef")                  # background

# plot data
p.line('date', 'close', source=source)

# decorations
p.yaxis.axis_label = 'Price'

# initialize lower panel plot
select = figure(title="Drag the middle and edges of the selection box to change the range above",
                height=130, width=800, y_range=p.y_range,
                x_axis_type="datetime", y_axis_type=None,
                tools="", toolbar_location=None, background_fill_color="#efefef")

# initialize data selector 
range_tool = RangeTool(x_range=p.x_range)
range_tool.overlay.fill_color = "navy"
range_tool.overlay.fill_alpha = 0.2

# plot data
select.line('date', 'close', source=source)
select.ygrid.grid_line_color = None

# add selector
select.add_tools(range_tool)
select.toolbar.active_multi = range_tool

# show plot
show(column(p, select))
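
Models can also be wired to each other directly, without a running server. The following is a minimal sketch, assuming a recent Bokeh release (3.x), in which a `Slider` model drives the marker size of a glyph through `js_link`; the data and the slider range are arbitrary.

```python
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Slider
from bokeh.plotting import figure

# toy data (illustrative only)
source = ColumnDataSource(data=dict(x=[1, 2, 3, 4, 5], y=[2, 5, 3, 6, 4]))

p = figure(height=300, width=400, title="Slider-controlled marker size")
renderer = p.scatter("x", "y", source=source, size=10)

# the slider is a model, and so is the glyph it controls
slider = Slider(start=2, end=30, value=10, step=1, title="Marker size")

# js_link copies slider.value into glyph.size in the browser on every change
slider.js_link("value", renderer.glyph, "size")

layout = column(slider, p)
```

Passing `layout` to `bokeh.io.show` produces a standalone HTML page; the link runs entirely in JavaScript, so no Python process is needed once the page is rendered.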

# A special case: Dashboards

![](https://cdn-images-1.medium.com/max/1200/1*KVtvnCCQ-IWTdUsl65UpZQ.gif)

<img alt="Image for post" width="1309" height="947" src="https://miro.medium.com/max/2618/1*CUyrsJpP5lkvVdheseAYXQ.png">

https://youtu.be/VWi3HAlKOUQ