## Introduction to Altair (part 2)

In this section, we will focus on adding interaction to Altair charts.

Through interaction we can transform static images into tools for exploration: highlighting points of interest, zooming in to reveal finer-grained patterns, and linking across multiple views to reason about multi-dimensional relationships.

We will visualize the same cars data as in part 1, taken from the vega-datasets collection.

In [2]:
# Import the modules
import pandas as pd
import altair as alt

cars = 'https://cdn.jsdelivr.net/npm/vega-datasets@1/data/cars.json'

### Set parameters

We'll start with a scatter plot of horsepower versus miles per gallon, with a color encoding for the origin. We set the opacity to 0.5.

In [3]:
alt.Chart(cars).mark_circle().encode(
    y = alt.Y("Miles_per_Gallon:Q", title="Miles Per Gallon"),
    x = alt.X("Horsepower:Q", title="Horsepower"),
    color=alt.Color("Origin:N"),
    opacity=alt.OpacityValue(0.5)
).properties(
    title="Miles Per Gallon vs. Horsepower by Origin"
)

The first thing we can do is add a slider that changes a setting in the figure, such as the opacity.

In [4]:
slider = alt.binding_range(min=0, max=1, step=0.01, name="Opacity: ")
op = alt.param(bind=slider, value=0.5)
alt.Chart(cars).mark_circle().add_params(
    op
).encode(
    y = alt.Y("Miles_per_Gallon:Q", title="Miles Per Gallon"),
    x = alt.X("Horsepower:Q", title="Horsepower"),
    color=alt.Color("Origin:N"),
    opacity=alt.OpacityValue(op)
).properties(
    title="Miles Per Gallon vs. Horsepower by Origin"
)

### Click and highlight

Maybe we don't want to change the opacity for all points. We just want to highlight a selected part of the data. There are two types of selections in Altair:

- selection_point: select a single discrete value, by default on click events. You can also select multiple discrete values. The first value is selected on mouse click and additional values toggled using shift-click.
- selection_interval: select a continuous range of values, initiated by mouse drag.

In [5]:
selection = alt.selection_point()
alt.Chart(cars).mark_circle().add_params(
    selection
).encode(
    y = alt.Y("Miles_per_Gallon:Q", title="Miles Per Gallon"),
    x = alt.X("Horsepower:Q", title="Horsepower"),
    color=alt.condition(selection, 'Origin:N', alt.value('grey')),
    opacity=alt.condition(selection, alt.value(1), alt.value(0.1))
).properties(
    title="Miles Per Gallon vs. Horsepower by Origin"
)

In [6]:
# In the beginning, no points are selected
selection = alt.selection_point(empty=False)
alt.Chart(cars).mark_circle().add_params(
    selection
).encode(
    y = alt.Y("Miles_per_Gallon:Q", title="Miles Per Gallon"),
    x = alt.X("Horsepower:Q", title="Horsepower"),
    color=alt.condition(selection, 'Origin:N', alt.value('grey')),
    opacity=alt.condition(selection, alt.value(1), alt.value(0.1))
).properties(
    title="Miles Per Gallon vs. Horsepower by Origin"
)

In [7]:
# Select points by mouse over
selection = alt.selection_point(on="mouseover")
alt.Chart(cars).mark_circle().add_params(
    selection
).encode(
    y = alt.Y("Miles_per_Gallon:Q", title="Miles Per Gallon"),
    x = alt.X("Horsepower:Q", title="Horsepower"),
    color=alt.condition(selection, 'Origin:N', alt.value('grey')),
    opacity=alt.condition(selection, alt.value(1), alt.value(0.1))
).properties(
    title="Miles Per Gallon vs. Horsepower by Origin"
)

In [8]:
# select interval
selection = alt.selection_interval(empty=False)
alt.Chart(cars).mark_circle().add_params(
    selection
).encode(
    y = alt.Y("Miles_per_Gallon:Q", title="Miles Per Gallon"),
    x = alt.X("Horsepower:Q", title="Horsepower"),
    color=alt.condition(selection, 'Origin:N', alt.value('grey')),
    opacity=alt.condition(selection, alt.value(1), alt.value(0.1))
).properties(
    title="Miles Per Gallon vs. Horsepower by Origin"
)

In [9]:
# Restrict the brush to the x channel: the selected X range spans all Y values
selection = alt.selection_interval(encodings=['x'])
alt.Chart(cars).mark_circle().add_params(
    selection
).encode(
    y = alt.Y("Miles_per_Gallon:Q", title="Miles Per Gallon"),
    x = alt.X("Horsepower:Q", title="Horsepower"),
    color=alt.condition(selection, 'Origin:N', alt.value('grey')),
    opacity=alt.condition(selection, alt.value(1), alt.value(0.1))
).properties(
    title="Miles Per Gallon vs. Horsepower by Origin"
)

### Select across multiple panels

This approach becomes even more powerful when the selection behavior is tied across multiple views of the data within a compound chart. 

In [10]:
selection = alt.selection_interval()
hrspwr_mpg = alt.Chart(cars).mark_circle().add_params(selection).encode(
    y = alt.Y("Miles_per_Gallon:Q", title="Miles Per Gallon"),
    x = alt.X("Horsepower:Q", title="Horsepower"),
    color=alt.condition(selection, 'Origin:N', alt.value('grey')),
    opacity=alt.condition(selection, alt.value(1), alt.value(0.1))
).properties(
    title="Miles Per Gallon vs. Horsepower by Origin",
    width=200, height=200
)
hrspwr_mpg | hrspwr_mpg.encode(
    x = "Acceleration:Q"  # overrides the previous x encoding
)

Another thing we can do is "zoom" into the data. We use the selection as a filter, so that a second figure is redrawn from only the selected points.

In [11]:
selection = alt.selection_interval()
point = alt.Chart(cars).mark_circle().encode(
    alt.X("Horsepower:Q"),
    alt.Y("Miles_per_Gallon:Q"),
    alt.Color("Origin:N")
).add_params(selection).properties(
    width=200, height=200
)

bar = alt.Chart(cars).mark_bar().encode(
    x = 'count()',
    y = 'Origin:N',
    color='Origin:N'
).transform_filter(selection)

point & bar

Interaction can also be driven from a legend rather than from the figure itself. In the following example, a small legend chart is built by hand; clicking an entry in it highlights the matching points in the figure and greys out the rest.

In [12]:
selection = alt.selection_point(fields=['Origin'])
color = alt.condition(
    selection, 
    alt.Color('Origin:N', legend=None),
    alt.value('lightgrey')
)
scatter = alt.Chart(cars).mark_circle().encode(
    x = 'Horsepower:Q',
    y = 'Miles_per_Gallon:Q',
    color=color
)
legend = alt.Chart(cars).mark_circle().encode(
    alt.Y('Origin:N').axis(orient="right"),
    color = color
).add_params(selection)
scatter | legend

In [13]:
selection = alt.selection_point(fields=['Origin', 'Cylinders'])

color = alt.condition(
    selection, 
    alt.Color('Origin:N', legend=None),
    alt.value('lightgrey')
)
scatter = alt.Chart(cars).mark_circle().encode(
    x = 'Horsepower:Q',
    y = 'Miles_per_Gallon:Q',
    color=color
)
legend = alt.Chart(cars).mark_circle().encode(
    alt.Y('Origin:N').axis(orient="right"),
    alt.X("Cylinders:O"),
    color = color
).add_params(selection)
scatter | legend

### Bindings & Widgets

We can now bind parameters to chart elements (e.g. legends) and widgets (e.g. drop-downs and sliders) to allow users to do detailed screening on the data. 

- binding_checkbox: Renders as checkboxes allowing for multiple selections of items.
- binding_radio: Radio buttons that force only a single selection
- binding_select: Drop down box for selecting a single item from a list
- binding_range: Shown as a slider to allow for selection along a scale.

[Examples](https://altair-viz.github.io/gallery/multiple_interactions.html#gallery-multiple-interactions)

What we need to set up: 
- Define the dropdown (options, name, etc.)
- Create a selection that binds the dropdown to the figure
- Make the figure

In [14]:
# Dropdown menu
input_dropdown = alt.binding_select(
    options=[None, "Europe", "Japan", "USA"], 
    labels=["All", "Europe", "Japan", "USA"],
    name="Region: "
)
selection = alt.selection_point(fields=["Origin"], bind=input_dropdown)

alt.Chart(cars).mark_circle().encode(
    x = alt.X("Horsepower:Q"),
    y = alt.Y("Miles_per_Gallon:Q"),
    color = alt.condition(selection, alt.Color("Origin:N"), alt.value('lightgrey'))
).add_params(selection)

In [18]:
# Radio Button
radio_button = alt.binding_radio(
    options=[None, "Europe", "Japan", "USA"], 
    labels=["All", "Europe", "Japan", "USA"],
    name="Region: "
)
selection = alt.selection_point(fields=["Origin"], bind=radio_button)

alt.Chart(cars).mark_circle().encode(
    x = alt.X("Horsepower:Q"),
    y = alt.Y("Miles_per_Gallon:Q"),
    color = alt.condition(selection, alt.Color("Origin:N"), alt.value('lightgrey'))
).add_params(selection)

In [19]:
# Radio button with a filter: points outside the chosen region disappear instead of turning grey
radio_button = alt.binding_radio(
    options=[None, "Europe", "Japan", "USA"], 
    labels=["All", "Europe", "Japan", "USA"],
    name="Region: "
)
selection = alt.selection_point(fields=["Origin"], bind=radio_button)

alt.Chart(cars).mark_circle().encode(
    x = alt.X("Horsepower:Q"),
    y = alt.Y("Miles_per_Gallon:Q"),
    color = alt.condition(selection, alt.Color("Origin:N"), alt.value('lightgrey'))
).add_params(selection).transform_filter(selection)

In [25]:
# Replicate example 1
# Scatter plot of miles per gallon vs. horsepower
# Dropdown for the shape of the marks
# Slider for the angle of the shape
# Slider for the size of the shape
# Slider for the stroke width of the shape
shape_dropdown = alt.binding_select(
    options=["arrow", "circle", "square", "diamond", "triangle", "wedge"], 
    labels=["arrow", "circle", "square", "diamond", "triangle", "wedge"],
    name="Shape: "
)
angle_slider = alt.binding_range(min=-359, max=359, step=1, name="Angle: ")
size_slider = alt.binding_range(min=1, max=200, step=1, name="Size: ")
stroke_slider = alt.binding_range(min=1, max=10, step=1, name="Stroke Width: ")
shape_var = alt.param(bind = shape_dropdown, value = "circle")
angle_var = alt.param(bind=angle_slider, value=0)
size_var = alt.param(bind=size_slider, value=20)
stroke_var = alt.param(bind=stroke_slider, value=2)
alt.Chart(cars).mark_point(
    shape=shape_var,
    angle=angle_var,
    size=size_var,
    strokeWidth=stroke_var
).encode(
    x = "Horsepower:Q", 
    y = "Miles_per_Gallon:Q"
).add_params(shape_var, angle_var, size_var, stroke_var)


In [31]:
# Replicate example 2
# Horsepower vs MPG
# Cutoff boundary slider. Everything to the left is red,
# everything to the right is blue
cutoff = alt.binding_range(min=0, max=250, step=1, name="Cutoff: ")
cutoff_var = alt.param(bind=cutoff, value=125)
alt.Chart(cars).mark_point().encode(
    x = "Horsepower:Q", 
    y = "Miles_per_Gallon:Q",
    color = alt.condition(
        alt.datum.Horsepower < cutoff_var, 
        alt.value("red"), 
        alt.value("blue")
    )
).add_params(cutoff_var)


### Exercise:

Create a scatter plot that shows the relationship between Horsepower and Acceleration. Color the points based on the Origin of the cars. To do:

- Add a dynamic filtering option that allows users to filter the dataset based on the number of cylinders. 
- Create a second plot: a bar chart showing the number of cars by the number of cylinders.
- Add an interactive slider that allows users to filter the data based on the car's weight

In [44]:
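radio = alt.binding_radio(options=[3, 4, 5, 6, 8], name="Cylinders: ")
selection = alt.selection_point(fields=['Cylinders'], bind=radio)
chart = alt.Chart(cars).mark_point().encode(
    x = "Horsepower:Q",
    y = "Acceleration:Q",
    color = "Origin:N"  # color the points by origin, as the exercise asks
).add_params(selection).transform_filter(selection)  # keep only the chosen cylinder count
chart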

### In-class activities: 

Titanic data again!
Now, take the three Altair plots you made last time. Select one of them and add some interaction to it (beyond the tooltips this time!). Save the plot as an HTML file.

In [32]:
import seaborn as sns
titanic_data = sns.load_dataset("titanic")
titanic_data.head()

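|   | survived | pclass | sex | age | sibsp | parch | fare | embarked | class | who | adult_male | deck | embark_town | alive | alone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | male | 22.0 | 1 | 0 | 7.2500 | S | Third | man | True | NaN | Southampton | no | False |
| 1 | 1 | 1 | female | 38.0 | 1 | 0 | 71.2833 | C | First | woman | False | C | Cherbourg | yes | False |
| 2 | 1 | 3 | female | 26.0 | 0 | 0 | 7.9250 | S | Third | woman | False | NaN | Southampton | yes | True |
| 3 | 1 | 1 | female | 35.0 | 1 | 0 | 53.1000 | S | First | woman | False | C | Southampton | yes | False |
| 4 | 0 | 3 | male | 35.0 | 0 | 0 | 8.0500 | S | Third | man | True | NaN | Southampton | no | True |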


In [56]:
input_dropdown = alt.binding_select(
    options=[None, 0, 1],
    labels = ["All", "Did not survive", "Survived"],  # survived == 0 means the passenger did not survive
    name="Survived?: "
)
selection = alt.selection_point(fields=["survived"], bind=input_dropdown)
alt.Chart(titanic_data).mark_circle().encode(
    x = alt.X("age:Q"),
    y = alt.Y("pclass:O"),
    color = alt.condition(selection, alt.Color("survived:N"), alt.value('lightgrey'))
).add_params(selection)

