## Interactivity in plotly

The goal of this notebook is to understand basic interactivity in plotly, i.e. the ability to update a plot in response to mouse events on that same plot.

## Setup environment

For this example, we'll use the same data as in examples 1 and 3. 

In [1]:
import plotly.graph_objects as go
import pandas as pd

In [2]:
FILE = "~/data/ElectricVehicle.xlsx"

In [3]:
df = pd.read_excel(FILE)
df.columns = ['distance_electric', 'fuel', 'electric', 'total_distance', 'speed', 'energy_recovered']
df['charge'] = df.distance_electric / df.total_distance
df.head()

Unnamed: 0,distance_electric,fuel,electric,total_distance,speed,energy_recovered,charge
0,27.7,0.3,12.0,29.4,43.3,,0.942177
1,24.1,0.5,14.9,24.9,14.1,,0.967871
2,1.9,4.7,3.2,21.9,22.4,,0.086758
3,0.1,5.9,5.6,21.8,16.8,2.1,0.004587
4,2.1,4.7,3.8,32.5,19.1,3.1,0.064615


## Use case: clustering

Let's cluster the data with scikit-learn's `KMeans`, using `charge` and `electric` as features.

In [4]:
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3)
kmeans.fit(df[['charge', 'electric']])

clusters = kmeans.predict(df[['charge', 'electric']])
df['cluster'] = clusters

## Plotting the clusters

We can produce a scatterplot in plotly where the color indicates the cluster number: 

In [6]:
figure = go.Figure(
    go.Scatter(
        x=df.charge,
        y=df.electric,
        mode='markers',
        marker=dict(
            size=8,
            color=df.cluster,
            line=dict(
                width=2,
                color='blue'
            )
        ),
    )   
)
figure.update_layout(
    title='Electric Vehicle',
    xaxis_title='Initial charge',
    yaxis_title='Electric consumption (kWh/100km)'
)




## Callbacks

Say we now want to highlight the clusters in an interactive way.

Each time the mouse moves over a point, its cluster should be highlighted (and the other clusters "turned off"). 

This can be useful for a dashboard for data exploration or in a business understanding task. 

### Method

- To add interactivity to a plotly plot, we can use a special type of figure object, `go.FigureWidget`.
- The traces of a `FigureWidget` (for example `widget.data[0]`) have methods that specify what happens on certain mouse events, such as `on_click` or `on_hover`.
- The argument of these methods is another function, known as a callback.
- The signature of the callback is `function_name(trace, points, selector)`. `trace`, `points` and `selector` carry different values related to the mouse event. For example, `points` is a structure that contains information about the points of the plot involved in the event. In the following example, `points.point_inds` is a list of the indices of all the points involved in the event (for a hover, there should be only one point).
- The context manager `batch_update` collects the updates made inside its block and applies them to the plot together when the block exits, instead of re-rendering the plot after each change.
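The callback mechanics described above can be tried without a notebook front end. The sketch below is a minimal illustration, not plotly's own code: `fake_points` is a hypothetical stand-in built with `types.SimpleNamespace` that mimics only the `point_inds` attribute, `opacity_for_hover` is an illustrative name, and `cluster_labels` is a small hard-coded list of sample labels.

```python
from types import SimpleNamespace

# Sample cluster labels, one per plotted point (hard-coded for illustration).
cluster_labels = [2, 2, 1, 1, 1, 1, 2, 0, 0, 1]


def opacity_for_hover(trace, points, selector):
    """Return one opacity per point: 1.0 for the hovered cluster, 0.1 otherwise."""
    if not points.point_inds:
        # No point under the cursor: keep everything fully visible.
        return [1.0] * len(cluster_labels)
    hovered_cluster = cluster_labels[points.point_inds[0]]
    return [1.0 if label == hovered_cluster else 0.1 for label in cluster_labels]


# Hypothetical stand-in for the object plotly passes as `points`.
fake_points = SimpleNamespace(point_inds=[7])
opacities = opacity_for_hover(trace=None, points=fake_points, selector=None)
print(opacities)
```

In a real callback, the resulting list would be assigned to the trace's `marker.opacity` rather than returned.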

In [7]:
df

Unnamed: 0,distance_electric,fuel,electric,total_distance,speed,energy_recovered,charge,cluster
0,27.7,0.3,12.0,29.4,43.3,,0.942177,2
1,24.1,0.5,14.9,24.9,14.1,,0.967871,2
2,1.9,4.7,3.2,21.9,22.4,,0.086758,1
3,0.1,5.9,5.6,21.8,16.8,2.1,0.004587,1
4,2.1,4.7,3.8,32.5,19.1,3.1,0.064615,1
5,0.0,4.6,3.6,32.7,33.3,1.6,0.0,1
6,56.4,1.5,12.0,69.3,22.1,4.5,0.813853,2
7,20.5,2.5,9.0,40.7,15.4,4.0,0.503686,0
8,53.9,3.1,8.8,121.7,24.9,7.4,0.442892,0
9,0.5,5.1,5.0,59.0,23.6,3.7,0.008475,1


In [8]:
import numpy as np

widget = go.FigureWidget(figure)

def update_point(trace, points, selector):
    # points.point_inds holds the positions of the hovered points (normally one)
    for i in points.point_inds:
        cluster_id = df.cluster.iloc[i]
        # Positions of all points that belong to the same cluster
        cluster_indices = np.where(df.cluster == cluster_id)[0]

        # Dim every point, then restore full opacity for the hovered cluster
        opacity = np.full(len(df), 0.1)
        opacity[cluster_indices] = 1
        with widget.batch_update():
            widget.data[0].marker.opacity = opacity


widget.data[0].on_hover(update_point)

widget

FigureWidget({
    'data': [{'marker': {'color': array([2, 2, 1, 1, 1, 1, 2, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 2, 2, 0, 2, 0, 2, 0,
                                         1, 1, 1, 1, 0, 0, 2, 0, 1, 2, 0, 2, 1, 1, 1, 2, 0, 0, 2], dtype=int32),
                         'line': {'color': 'blue', 'width': 2},
                         'size': 8},
              'mode': 'markers',
              'type': 'scatter',
              'uid': 'd0758bd5-8885-46f6-a280-6cdb81d502c1',
              'x': array([0.94217687, 0.96787149, 0.08675799, 0.00458716, 0.06461538, 0.        ,
                          0.81385281, 0.5036855 , 0.44289236, 0.00847458, 0.        , 0.01453958,
                          0.39339339, 0.        , 0.58885017, 0.        , 0.        , 0.92068966,
                          1.        , 0.08370044, 0.97084548, 0.24444444, 0.9972752 , 0.46897547,
                          0.02912621, 0.01848049, 0.        , 0.08962264, 0.46336429, 0.42716858,
                          0.94425087, 0.3
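
The hover callback above dims the other clusters but never restores them once the cursor leaves a point. One possible way to undo the highlight is a second callback registered with the trace's `on_unhover` method. The sketch below is an assumption about how that could look, not part of the original notebook: `reset_opacity` is a hypothetical name, and the stub trace built with `SimpleNamespace` only stands in for `widget.data[0]` so that the function can be run outside a notebook.

```python
from types import SimpleNamespace


def reset_opacity(trace, points, selector):
    """Restore full opacity for every marker of the trace."""
    # A scalar opacity applies to all markers of the trace.
    trace.marker.opacity = 1


# Stub standing in for widget.data[0] (assumption, for illustration only).
stub_trace = SimpleNamespace(marker=SimpleNamespace(opacity=[0.1, 1.0, 0.1]))
reset_opacity(stub_trace, points=None, selector=None)

# In the notebook, the callback would be registered on the real trace:
# widget.data[0].on_unhover(reset_opacity)
```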

## Common use case: dashboard

A common use case of callbacks in plotly is having a click in one plot update a property of another plot. 

In [9]:
DATA = "~/data/hotel_bookings/hotel_bookings.csv"
df = pd.read_csv(DATA)


In [11]:
# Produce two subplots in the same row

from plotly.subplots import make_subplots

CATEGORIES = ['Transient', 'Contract', 'Transient-Party', 'Group']

df['customer_category'] = pd.Categorical(df['customer_type'], categories=CATEGORIES, ordered=True)

fig = go.FigureWidget(
    make_subplots(rows=1, cols=2, column_widths=[0.5, 0.5])
)

fig.add_histogram(x=df['customer_category'], histnorm="percent", row=1, col=1)
fig.add_box(x=df['is_canceled'], y=df['lead_time'], row=1, col=2)

fig['layout']['xaxis']['title']='Type of customer'
fig['layout']['xaxis2']['title']='Has cancelled'
fig['layout']['yaxis1']['title']='Percentage'
fig['layout']['yaxis2']['title']='Lead time'

fig.update_layout(
    showlegend=False,
)


FigureWidget({
    'data': [{'histnorm': 'percent',
              'type': 'histogram',
              'uid': '7d31e9e8-cb3e-460c-a197-a726ed532dc6',
              'x': array(['Transient', 'Transient', 'Transient', ..., 'Transient', 'Transient',
                          'Transient'], dtype=object),
              'xaxis': 'x',
              'yaxis': 'y'},
             {'type': 'box',
              'uid': '7b061f98-3d12-4135-9ccb-509dc6791be3',
              'x': array([0, 0, 0, ..., 0, 0, 0]),
              'xaxis': 'x2',
              'y': array([342, 737,   7, ...,  34, 109, 205]),
              'yaxis': 'y2'}],
    'layout': {'showlegend': False,
               'template': '...',
               'xaxis': {'anchor': 'y', 'domain': [0.0, 0.45], 'title': {'text': 'Type of customer'}},
               'xaxis2': {'anchor': 'y2', 'domain': [0.55, 1.0], 'title': {'text': 'Has cancelled'}},
               'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'Percentage'}},
         

In [12]:
def update_right_plot(trace, points, selector):
    # Ignore events that carry no point
    if not points.point_inds:
        return

    # Look up the customer type of the clicked point and keep only its rows
    clicked_index = points.point_inds[0]
    selected_customer_type = df.customer_type[clicked_index]
    selected_data = df[df.customer_type == selected_customer_type]

    # Color the selected category blue and all the others light grey
    category_index = CATEGORIES.index(selected_customer_type)
    colors = ["lightgrey"] * len(CATEGORIES)
    colors[category_index] = "blue"

    # Apply the updates to both subplots together
    with fig.batch_update():
        fig.data[0].marker.color = colors
        fig.data[1].x = selected_data['is_canceled']
        fig.data[1].y = selected_data['lead_time']

fig.data[0].on_click(update_right_plot)
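
The list of bar colors built in the callback above can be checked in isolation. The following is a minimal sketch in plain Python; `bar_colors` and its `active` and `inactive` parameters are hypothetical names introduced here for illustration, and only the category list and the two color names come from the notebook.

```python
CATEGORIES = ['Transient', 'Contract', 'Transient-Party', 'Group']


def bar_colors(selected, categories=CATEGORIES, active="blue", inactive="lightgrey"):
    """Return one color per category, highlighting only the selected one."""
    return [active if category == selected else inactive for category in categories]


colors = bar_colors('Contract')
print(colors)
```

Building the list with a comprehension avoids hard-coding the number of categories, so the same logic still works if `CATEGORIES` changes.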
