# Adding Data-Driven Inputs

## Objectives

- Explore interactive data visualization capabilities in Altair.
- Implement selection tools to enhance data exploration within charts.
- Use selection bindings to create dynamic user inputs and link them to visual attributes.
- Integrate interactive features such as selection intervals, point selection, and binding inputs with the visual representation of automobile data.

## Background

This notebook uses Altair to add interactive elements to data visualizations. Using the Automobile Dataset from UCI, it demonstrates how selections and bindings let users dynamically filter, highlight, and explore data points within a chart.

## Datasets Used

Automobile Dataset from UCI: This dataset serves as a foundation for demonstrating various interactive visualizations and data-driven inputs in Altair.

## Automobile Dataset

In [1]:
import numpy as np
import pandas as pd

pd.set_option('display.max_columns', 10)
import altair as alt

We will use the Automobile Data Set [https://archive.ics.uci.edu/ml/datasets/automobile] from the UCI Machine Learning Repository [https://archive-beta.ics.uci.edu/]. It includes categorical and continuous variables. 

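The raw file has no header row and marks missing values with `?`, so we supply the column names ourselves and read `?` as `NaN`.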

In [2]:
# Defining the headers
headers = ["symboling", "normalized_losses", "make", "fuel_type", "aspiration", "num_doors", "body_style", 
        "drive_wheels", "engine_location", "wheel_base", "length", "width", "height", "curb_weight", 
        "engine_type", "num_cylinders", "engine_size", "fuel_system", "bore", "stroke", "compression_ratio", 
        "horsepower", "peak_rpm", "city_mpg", "highway_mpg", "price"]

df = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data",
                  header=None, names=headers, na_values="?" )
df.head()

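| | symboling | normalized_losses | make | fuel_type | aspiration | ... | horsepower | peak_rpm | city_mpg | highway_mpg | price |
|--:|--:|--:|:--|:--|:--|:--|--:|--:|--:|--:|--:|
| 0 | 3 | NaN | alfa-romero | gas | std | ... | 111.0 | 5000.0 | 21 | 27 | 13495.0 |
| 1 | 3 | NaN | alfa-romero | gas | std | ... | 111.0 | 5000.0 | 21 | 27 | 16500.0 |
| 2 | 1 | NaN | alfa-romero | gas | std | ... | 154.0 | 5000.0 | 19 | 26 | 16500.0 |
| 3 | 2 | 164.0 | audi | gas | std | ... | 102.0 | 5500.0 | 24 | 30 | 13950.0 |
| 4 | 2 | 164.0 | audi | gas | std | ... | 115.0 | 5500.0 | 18 | 22 | 17450.0 |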


## Selection Types

### Interval Selection

An interval selection allows us to select chart elements by clicking and dragging.

In [3]:
interval = alt.selection_interval()

In [4]:
# Click and drag over the graph
alt.Chart(df).mark_point(size=50).encode(
    x = alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y = alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    # Conditional coloring
    color = alt.condition(interval, 'fuel_type:N', alt.value('lightgrey'))     
).add_params(interval)

In [5]:
interval_x = alt.selection_interval(encodings=['x'])
interval_y = alt.selection_interval(encodings=['y'])

In [6]:
# Click and drag horizontally: this brush is restricted to the x-axis
alt.Chart(df).mark_point(size=50).encode(
    x = alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y = alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color = alt.condition(interval_x, 'fuel_type:N', alt.value('lightgrey'))    
).add_params(interval_x)

In [7]:
# Click and drag vertically: this brush is restricted to the y-axis
alt.Chart(df).mark_point(size=50).encode(
    x = alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y = alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color = alt.condition(interval_y, 'fuel_type:N', alt.value('lightgrey'))    
).add_params(interval_y)

### Single Selections

A point selection (`alt.selection_point`) lets us select individual chart elements with the mouse. By default, an element is selected on click, and Shift + click adds further elements to the selection.

In [8]:
point = alt.selection_point()

In [9]:
# Click over any bar
alt.Chart(df).mark_bar().encode(
    alt.X('make:N', title='Make'),
    alt.Y('count()', title='Count'),
    color = alt.condition(point, 'fuel_type:N', alt.value('lightgrey'))
).add_params(point)    

We can also select points on mouseover rather than on click. With `nearest=True`, the element closest to the cursor is selected.

In [10]:
point_nearest = alt.selection_point(on='mouseover', nearest=True)

In [11]:
# Hover over the graph
alt.Chart(df).mark_bar().encode(
    alt.X('make:N', title='Make'),
    alt.Y('count()', title='Count'),
    color = alt.condition(point_nearest, 'fuel_type:N', alt.value('lightgrey'))
).add_params(point_nearest)    

#### One variable

In [12]:
# Selecting multiple points at once, using one variable
points_1v = alt.selection_point(fields=['body_style'])

In [13]:
color=alt.condition(points_1v,
                    alt.Color('body_style:N', legend=None),
                    alt.value('lightgray')) 

In [14]:
scatter = alt.Chart(df).mark_circle(size=80).encode(    
    x = alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y = alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color = color,
    tooltip = 'make:N'
)

In [15]:
legend = alt.Chart(df).mark_circle(size=100).encode(
    y=alt.Y('body_style:N', axis=alt.Axis(orient='right')),
    color = color
).add_params(points_1v)

In [16]:
# Shift + click on the legend to select multiple values
scatter | legend 

#### Two variables

In [17]:
points_2v = alt.selection_point(fields=['body_style','fuel_type'])

In [18]:
color2 = alt.condition(points_2v,
            alt.Color('body_style:N', legend=None),
            alt.value('lightgray'))

In [19]:
scatter2 = alt.Chart(df).mark_circle(size=80).encode(
    x = alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y = alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color = color2,
    tooltip = 'make:N'
)

In [20]:
legend2 = alt.Chart(df).mark_circle(size=100).encode(
    y=alt.Y('body_style:N', axis=alt.Axis(orient='right')),
    x='fuel_type:N',
    color = color2
).add_params(points_2v)

In [21]:
# Shift + click on the legend to select multiple values
scatter2 | legend2

## Adding Data-Driven Inputs

We can now use the `bind` argument to attach data-driven input elements to a chart. Altair provides the following input bindings, some of which are demonstrated below.

| **Input Element** | **Description** |
|:--|:--|
| binding_select | **Drop down** box for selecting a single item from a list. |
| binding_radio | **Radio** buttons that force only a single selection. |
| binding_range | Shown as a **slider** to allow for selection along a scale. |
| binding_checkbox | A **checkbox** that toggles a single boolean value on or off. |

### Dropdown

In [22]:
options_dropdown = df.body_style.unique()
options_dropdown

array(['convertible', 'hatchback', 'sedan', 'wagon', 'hardtop'],
      dtype=object)

In [23]:
input_dropdown = alt.binding_select(
    options=options_dropdown, 
    name='Body: '
)
input_dropdown

BindRadioSelect({
  input: 'select',
  name: 'Body: ',
  options: array(['convertible', 'hatchback', 'sedan', 'wagon', 'hardtop'],
        dtype=object)
})

In [24]:
selection_d = alt.selection_point(
        fields = ['body_style'], 
        bind = input_dropdown,
        value='sedan'    
)

In [25]:
color_d = alt.condition(selection_d,
                alt.Color('body_style:N', legend=None),
                alt.value('lightgray')
)

In [26]:
alt.Chart(df).mark_circle(size=80).encode(
    x=alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y=alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color=color_d,
    tooltip='make:N'
).add_params(
    selection_d
)

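Adding a `None` option to the list lets the user clear the dropdown. With no value selected, the selection is empty and every point keeps its color.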

In [27]:
options_dropdown2 = [None, 'convertible', 'hatchback', 'sedan', 'wagon', 'hardtop']
options_dropdown2

[None, 'convertible', 'hatchback', 'sedan', 'wagon', 'hardtop']

In [28]:
input_dropdown2 = alt.binding_select(options=options_dropdown2, name='Body: ')
input_dropdown2

BindRadioSelect({
  input: 'select',
  name: 'Body: ',
  options: [None, 'convertible', 'hatchback', 'sedan', 'wagon', 'hardtop']
})

In [29]:
selection_d2 = alt.selection_point(
        fields = ['body_style'], 
        bind   = input_dropdown2                    
)    

In [30]:
color_d2 = alt.condition(selection_d2,
                alt.Color('body_style:N', legend=None),                
                alt.value('lightgray'))

In [31]:
alt.Chart(df).mark_circle(size=80).encode(
    x=alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y=alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color=color_d2,
    tooltip='make:N'
).add_params(
    selection_d2
)

To filter the data shown in the chart based on the user's selection, rather than only recoloring it, chain the `.transform_filter()` method onto the chart.

In [32]:
alt.Chart(df).mark_circle(size=80).encode(
    x=alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y=alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color=color_d2,
    tooltip='make:N'
).add_params(
    selection_d2
).transform_filter(
    selection_d2
)

### Radio Items

In [33]:
np.append([None], df.body_style.unique())

array([None, 'convertible', 'hatchback', 'sedan', 'wagon', 'hardtop'],
      dtype=object)

In [34]:
input_radio = alt.binding_radio(
    options=np.append([None], df.body_style.unique()), 
    name='Body: '
)
input_radio

BindRadioSelect({
  input: 'radio',
  name: 'Body: ',
  options: array([None, 'convertible', 'hatchback', 'sedan', 'wagon', 'hardtop'],
        dtype=object)
})

In [35]:
selection_r = alt.selection_point(
                    fields = ['body_style'], 
                    bind = input_radio,
                    #value = {'body_style': 'hatchback'}
)  

In [36]:
color_r = alt.condition(selection_r,
                alt.Color('body_style:N', legend=None),
                alt.value('lightgray'))

In [37]:
alt.Chart(df).mark_circle(size=80).encode(
    x=alt.X('horsepower:Q',  scale=alt.Scale(zero=False)),
    y=alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color=color_r,
    tooltip='make:N'
).add_params(
    selection_r
)

### Slider

Let's define a slider that highlights in red the cars whose price is greater than the slider's current value.

In [38]:
# Defining the slider
slider = alt.binding_range(
            min=0, 
            max=50000, 
            step=10, 
            name='Price: ')
slider

BindRange({
  input: 'range',
  max: 50000,
  min: 0,
  name: 'Price: ',
  step: 10
})

In [39]:
# Defining the parameter bound to the slider (initial value: 20000)
price_selector = alt.param(
    bind=slider, 
    value=20000
)

In [40]:
# Creating the chart and adding the selector
alt.Chart(df).mark_circle(size=60).encode(
    x=alt.X('price:Q',  scale=alt.Scale(zero=False)),
    y=alt.Y('highway_mpg:Q', scale=alt.Scale(zero=False)),
    color=alt.condition(
        alt.datum.price > price_selector,
        alt.value('red'),
        alt.value('silver')
    ),
    tooltip='make:N'
).add_params(
    price_selector
)

## Conclusions

- Interval selections enable dynamic data querying, allowing users to select and focus on specific subsets of data within visualizations.
- Single selections enhance user interaction by enabling selections through clicks or mouseovers, which can dynamically highlight or filter data points.
- Data-driven inputs, such as dropdowns, sliders, and radio buttons, allow users to control visualization parameters interactively, making the visualizations more flexible and responsive to user inputs.
- These interactive elements make Altair a powerful tool for exploratory data analysis, allowing users to manipulate and explore data visually in real time.

## References

- https://altair-viz.github.io/user_guide/interactions.html