In [1]:
# For this notebook you should have the following packages installed: pandas, altair, vega_datasets, matplotlib
# You can uncomment the line below and run this cell to install the required packages.

# !pip install pandas altair vega_datasets matplotlib

In [2]:
# importing libraries once
import altair as alt
import pandas as pd
from vega_datasets import data

In [3]:
# Get all datasets
iris_df = data.iris()
cars_df = data.cars()

# Vega-Altair - Declarative Visualization in Python

__Vega-Altair__ is a unique library in the Python data visualization ecosystem. We can use it to create interactive visualizations that improve exploratory data analysis.

The three core components of specifying interactions in __Vega-Lite__, and hence __Vega-Altair__, are:

- Parameters
- Filters & Conditions
- Widgets

### Parameter

Parameters are the basic building blocks of the __Vega-Altair__ interaction grammar. Parameters in a chart specification are analogous to variables in our Python code.

We can directly declare Python variables to control some aspects of a chart. We will use the Iris flower dataset for our examples.

In [4]:
iris_df.head()

Unnamed: 0,sepalLength,sepalWidth,petalLength,petalWidth,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


In [5]:
mark_size = 30

alt.Chart(iris_df).mark_point(size=mark_size).encode(
    x="sepalLength:Q",
    y="sepalWidth:Q",
    color="species:N"
)

When we change the value of the `mark_size` variable and re-run the cell, the chart is redrawn with the new point size.

Let us repeat the above example using Vega-Altair parameters. We can create a new parameter using `alt.param(value=<initial value>)`.

In [6]:
mark_size_param = alt.param(value=50)
mark_size_param

Parameter('param_1', VariableParameter({
  name: 'param_1',
  value: 50
}))

To use this parameter in a Vega-Altair plot, we call the `add_params` method to inform Vega-Altair about the existence of the parameter. We can then use the parameter anywhere in our specification.

In [7]:
alt.Chart(iris_df).mark_point(size=mark_size_param).encode(
    x="sepalLength:Q",
    y="sepalWidth:Q",
    color="species:N"
).add_params(
    mark_size_param
)

The above approach is overkill for something as simple as reusing a value in multiple places. The real utility of parameters becomes apparent when we want to bind something in our chart to user input.

#### Binding parameters

Vega-Altair comes with a series of input widgets we can use to add interactivity to our charts. The widgets are connected to our Vega-Altair plots using parameters. For now we will focus on binding a parameter to a widget; later in the notebook we will take a look at the different widgets in Vega-Altair.

We will specify a Slider widget which allows us to select a number between a range.

In [8]:
slider = alt.binding_range(min=1, max=100, step=1)
slider

BindRange({
  input: 'range',
  max: 100,
  min: 1,
  step: 1
})

We can now bind the slider to our parameter and then use the parameter in the plot specification.

In [9]:
slider = alt.binding_range(min=1, max=500, step=1, name="Mark size:  ")
mark_size_param = alt.param(value=30, bind=slider)
mark_size_param

Parameter('param_2', VariableParameter({
  bind: BindRange({
    input: 'range',
    max: 500,
    min: 1,
    name: 'Mark size:  ',
    step: 1
  }),
  name: 'param_2',
  value: 30
}))

In [10]:
alt.Chart(iris_df).mark_point(size=mark_size_param).encode(
    x="sepalLength:Q",
    y="sepalWidth:Q",
    color="species:N"
).add_params(
    mark_size_param
)

The parameters we discussed in the above examples are called variable parameters. They usually store a value and can be bound to input elements. Now we will look at selection parameters.

### Selection Parameters

Widgets provide a really powerful way to add interactivity to our chart. However, widgets don't let us interact directly with the chart. Selection parameters offer us ways to create queries to the dataset by directly manipulating the chart. We can use mouse clicks and even keyboard events to add such interactions. We will look at how to define a selection parameter using `interval selection` as an example. We will look at types of selections later in the notebook.

In [11]:
selection_param = alt.param(select="interval")
selection_param

Parameter('param_3', SelectionParameter({
  name: 'param_3',
  select: 'interval'
}))

We can also use the `alt.selection_interval` function to create such parameters. Moving forward, we will use the `selection_*` functions rather than the `param(select="interval")` style. Both styles are equivalent, since the `selection_*` functions are just convenience wrappers around `param`.

In [12]:
selection = alt.selection_interval()
selection

Parameter('param_4', SelectionParameter({
  name: 'param_4',
  select: IntervalSelectionConfig({
    type: 'interval'
  })
}))

We can now add the interval selection to our plot.

In [13]:
alt.Chart(iris_df).mark_point().encode(
    x="sepalLength:Q",
    y="sepalWidth:Q",
    color="species:N"
).add_params(
    selection
)

Similar to the `mark_size` param earlier, we created a `selection` and bound it to mouse interactions on the chart. However, the selection is not very useful yet. We need our chart to update in response to the selection, just as it did with the slider we added earlier. We can use conditional encodings and/or the filter transforms we discussed earlier in conjunction with selection parameters to achieve this.

### Conditional Encodings

We will look at a new way of specifying an encoding in a Vega-Altair plot: conditional encoding. Using the `condition` function, we can specify two different values for a particular encoding depending on whether a condition is satisfied or not. For example:

```python
encode(
    color=alt.condition(predicate, alt.value("red"), alt.value("blue"))
)
```

Here we specified that if `predicate` evaluates to true the color should be red, otherwise it should be blue. One of the possible values of `predicate` is a selection parameter. When we use a selection parameter as the predicate, the _true_ condition is met for points that lie within the selection. We will update our interactive plot to keep the species colors for points within the rectangular brush and show all other points in gray.

In [14]:
selection = alt.selection_interval()

alt.Chart(iris_df).mark_point().encode(
    x="sepalLength:Q",
    y="sepalWidth:Q",
    color=alt.condition(selection, "species:N", alt.value("gray"))
).add_params(
    selection
)

We can use chart composition to create multiple views that are linked together with a brush. We will use a scatterplot matrix (SPLOM) as an example of linked multiple views.

In [15]:
cols = ["sepalLength", "petalLength", "sepalWidth", "petalWidth"]

alt.Chart(iris_df, width=200, height=200).mark_point().encode(
    x=alt.X(alt.repeat("row"), type="quantitative"),
    y=alt.Y(alt.repeat("column"), type="quantitative"),
    color="species:N", 
    tooltip="species:N"
).repeat(
    row=cols,
    column=cols
)

In a static SPLOM, it is hard to follow a single point across charts. We can use interactivity to select points in one chart and have them highlighted in the others.

In [16]:
cols = ["sepalLength", "petalLength", "sepalWidth", "petalWidth"]

selection = alt.selection_interval()

alt.Chart(iris_df, width=200, height=200).mark_point().encode(
    x=alt.X(alt.repeat("row"), type="quantitative"),
    y=alt.Y(alt.repeat("column"), type="quantitative"),
    color=alt.condition(selection, "species:N", alt.value("gray")), 
    opacity=alt.condition(selection, alt.value(0.7), alt.value(0.1)), 
    tooltip="species:N"
).add_params(
    selection
).repeat(
    row=cols,
    column=cols
)

### Interactive Filtering

Similar to conditional encoding, we can use selection parameters as predicates for our filter transforms. We will recreate the example from the very beginning of our previous lecture, which showed a composite chart with a scatterplot and a bar chart for the _cars_ dataset. The scatterplot is interactive and can be used to control what data is shown in the bar chart. We will first create the static composite plot.

In [17]:
df = data.cars()

base_plot = alt.Chart(df)

scatterplot = base_plot.mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color="Origin:N",    
)

histogram = base_plot.mark_bar().encode(
    y="Origin:N",
    color="Origin:N",
    x="count():Q",
)

scatterplot & histogram

We will now add a selection parameter to filter the bar chart using the scatterplot.

Combining parameters and selections bound to various chart properties and encodings with chart composition allows us to create dynamic charts for easy exploration of the data.

In [18]:
df = data.cars()

brush_selection = alt.selection_interval()

base_plot = alt.Chart(df)

scatterplot = base_plot.mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(brush_selection, "Origin:N", alt.value("gray")),
    opacity=alt.condition(brush_selection, alt.value(0.7), alt.value(0.3))
    
).add_params(
    brush_selection
)

histogram = base_plot.mark_bar().encode(
    y="Origin:N",
    color="Origin:N",
    x="count():Q",
).transform_filter(
    brush_selection
)

scatterplot & histogram

### Types of Selections

#### Selection Interval

Interval selections allow us to specify selections within a range by clicking and dragging with the mouse pointer.

Selection intervals are represented by transparent rectangular marks on the plot. We already saw how to create selection intervals using the `selection_interval` function. Let us quickly look at the code again:

In [19]:
df = data.cars()

brush_selection = alt.selection_interval()

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(brush_selection, "Origin:N", alt.value("gray")),
    opacity=alt.condition(brush_selection, alt.value(0.7), alt.value(0.3))
    
).add_params(
    brush_selection
)

#### Selection Point

Point selections let us select points one at a time using mouse interactions like clicking (default behavior) or hovering.

We can create a point selection using `selection_point` or `param(select="point")`. Let us look at a few examples of point selections.

In [20]:
df = data.cars()

brush_selection = alt.selection_point()

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(brush_selection, "Origin:N", alt.value("gray")),
    opacity=alt.condition(brush_selection, alt.value(0.7), alt.value(0.3))
    
).add_params(
    brush_selection
)

We can hold down Shift while clicking to select multiple points. We can also change the behavior to select on hover instead by using `selection_point(on="mouseover")`. In addition, setting the `nearest` flag to `True` selects the mark closest to the mouse pointer instead of requiring the pointer to be exactly over a mark.

In [21]:
df = data.cars()

brush_selection = alt.selection_point(on="mouseover")

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(brush_selection, "Origin:N", alt.value("gray")),
    opacity=alt.condition(brush_selection, alt.value(0.7), alt.value(0.3))
    
).add_params(
    brush_selection
)

### Projecting Selections

While the default behavior of a selection parameter is to match exactly the data points that were selected, we can customize this behavior. We can change what the selection targets by projecting it over `fields` or `encodings`. Both `selection_interval` and `selection_point` accept the `fields` and `encodings` arguments. We will explore what these arguments do in the examples below.

Let's say we want to create a composite scatterplot and barchart for `cars` dataset with color encoded as `Origin`. Clicking on a bar should select all points for that category.

In [22]:
selection = alt.selection_point()

cars = data.cars()

color = alt.condition(
    selection,
    alt.Color('Origin:N').legend(None),
    alt.value('lightgray')
)

scatter = alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color=color,
    tooltip='Name:N'
)

barchart = alt.Chart(cars).mark_bar().encode(
    x="Origin:N",
    y="count()",
    color=color
).add_params(
    selection
)

scatter | barchart

Nothing gets selected in the scatterplot. Why is that?

By default, Vega-Altair tries to select the same point across multiple plots. However, since the bar chart is an aggregate chart, the point representing a specific origin does not exist in the scatterplot data. We can use the `fields` argument and set it to `["Origin"]` to tell Vega-Altair to only consider the `Origin` value when making selections.

In [23]:
selection = alt.selection_point(fields=["Origin"])

color = alt.condition(
    selection,
    alt.Color('Origin:N').legend(None),
    alt.value('lightgray')
)

scatter = alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color=color,
    tooltip='Name:N'
)

barchart = alt.Chart(cars).mark_bar().encode(
    x="Origin:N",
    y="count()",
    color=color
).add_params(
    selection
)

scatter | barchart

We can also project a selection over multiple fields. Here is an example from the Vega-Altair documentation. We have a scatterplot along with a two-dimensional legend. The scatterplot shows `Horsepower` vs `Miles_per_Gallon`, while the legend is a heatmap of `Origin` vs `Cylinders`. Selecting a group in the legend should select all `cars` that match the group in our scatterplot.

In [24]:
selection = alt.selection_point(fields=['Origin', 'Cylinders'])
color = alt.condition(
    selection,
    alt.Color('Origin:N').legend(None),
    alt.value('lightgray')
)

scatter = alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color=color,
    tooltip='Name:N'
)

legend = alt.Chart(cars).mark_rect().encode(
    alt.Y('Origin:N').axis(orient='right'),
    x='Cylinders:O',
    color=color
).add_params(
    selection
)

scatter | legend

We can use the `encodings` argument with interval selections to change the brushing behavior and restrict it to either the `x` or the `y` direction. Let us look at multiple charts with different interval selections.

In [25]:
selection_x = alt.selection_interval(encodings=['x']) 
selection_y = alt.selection_interval(encodings=['y']) 

# This is the default, so you don't have to specify it every time
selection_xy = alt.selection_interval(encodings=["x", "y"]) 

base_scatter_plot = alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    tooltip='Name:N'
)

scatter_x = base_scatter_plot.encode(
        color=alt.condition(selection_x, "Origin:N", alt.value("gray"))
    ).add_params(
        selection_x
    ).properties(title = "Scatterplot with 1-D X brush")

scatter_y = base_scatter_plot.encode(
        color=alt.condition(selection_y, "Origin:N", alt.value("gray"))
    ).add_params(
        selection_y
    ).properties(title = "Scatterplot with 1-D Y brush")

scatter_xy = base_scatter_plot.encode(
        color=alt.condition(selection_xy, "Origin:N", alt.value("gray"))
    ).add_params(
        selection_xy
    ).properties(title = "Scatterplot with 2-D brush")

scatter_x | scatter_y | scatter_xy

### Parameter Composition

We can compose parameters using the logical operators `AND (&)`, `OR (|)` and `NOT (~)`. Let us create a scatterplot that has two types of brushes. One is created by clicking and dragging while the `Shift` key is pressed, and the other while the `alt` (`option` on macOS) key is pressed. We will then logically combine the brushes to obtain our final selection.

In [26]:
df = data.cars()

shift_brush = alt.selection_interval(
    on="[mousedown[event.shiftKey], mouseup] > mousemove"
)

alt_brush = alt.selection_interval(
    on="[mousedown[event.altKey], mouseup] > mousemove",
    mark=alt.BrushConfig(fill="#FF6961", fillOpacity=0.3, stroke="#c23b22")
)

combo_brush = shift_brush | alt_brush

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(combo_brush, "Origin:N", alt.value("gray")),
    opacity=alt.condition(combo_brush, alt.value(0.7), alt.value(0.3))
    
).add_params(
    shift_brush, alt_brush
)

## Widgets

We saw the slider widget in a previous example. Here we will see the different widgets available in Vega-Altair and how to use them. For a detailed explanation, please refer to the documentation.

#### Checkbox

A checkbox is well suited to binding a boolean parameter. Here we use a checkbox to toggle the `color` encoding channel.

In [27]:
df = data.cars()

checkbox = alt.binding_checkbox(name="Encode `Origin` as color")
encode_color = alt.param(value=True, bind=checkbox)  # boolean, not the string "True"

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(encode_color, "Origin:N", alt.value("gray")),
).add_params(
    encode_color,
)

#### Radio

Radio buttons allow the user to select one option from many. We will use it here to select and highlight one particular value from a field.

In [28]:
df = data.cars()

origin_option = alt.binding_radio(name="Select an origin: ", options=df["Origin"].unique())
choose_origin =  alt.selection_point(fields=["Origin"], bind=origin_option)

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(choose_origin, "Origin:N", alt.value("gray")),
    opacity=alt.condition(choose_origin, alt.value(0.7), alt.value(0.2))
).add_params(
    choose_origin,
)

#### Dropdown

A dropdown allows us to select one value from a list of values. We will use it to filter our dataset using a value from the `Cylinders` field.

In [29]:
df = data.cars()

cylinder_dropdown = alt.binding_select(name="Select # of cylinders: ", options=df["Cylinders"].unique())
cylinder_count =  alt.selection_point(fields=["Cylinders"], bind=cylinder_dropdown)

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color="Origin:N",
).transform_filter(
    cylinder_count
).add_params(
    cylinder_count
)

#### Slider

A slider allows us to select a numeric value within a range. We will use it here to limit the scatterplot to cars introduced during a particular year.

In [30]:
df = data.cars()
df["Year_only"] = df["Year"].dt.year

year_slider = alt.binding_range(name="Select a year : ", min=df["Year_only"].min(), max=df["Year_only"].max(), step=1)
selected_year =  alt.selection_point(fields=["Year_only"], bind=year_slider)

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color="Origin:N",
).transform_filter(
    selected_year
).add_params(
    selected_year
)

#### HTML Inputs

Vega-Altair parameters can also be bound to any valid HTML input element. We will look at examples of such bindings later in the notebook.

## Binding

### Widgets

#### Lookups

In the above widget examples, we used the bindings to look up a particular value in our dataset. Such bindings are great for filtering and highlighting specific points based on some value in the dataset.

#### Comparison

We can also use bindings to create more complex comparisons instead of exact matches. We will create a scatterplot with a slider. The slider lets us select a cutoff for the `Year_only` column (the year extracted from `Year`) in the cars dataset. We will highlight the points that are at or above this cutoff.

We can use `alt.datum` to access any dimension in our `alt.condition` predicate.

In [31]:
df = data.cars()
df["Year_only"] = df["Year"].dt.year

year_slider = alt.binding_range(name="Cutoff Year : ", min=df["Year_only"].min(), max=df["Year_only"].max(), step=1)
selected_year =  alt.selection_point(fields=["Year_only"], 
                                     bind=year_slider,
                                     value=[{'Year_only': df["Year_only"].min()}])

predicate = alt.datum.Year_only >= selected_year.Year_only

alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(predicate, "Origin:N", alt.value("gray")),
    opacity=alt.condition(predicate, alt.value(0.7), alt.value(0.2)),
).add_params(
    selected_year
)

#### Logical

We previously saw an example of such a binding when we used a checkbox to toggle the color encoding.

### Binding Channels

By default, Vega-Altair does not let us bind an encoding channel to a widget. For example, we cannot directly change which column is encoded by an axis.

However, we can achieve this using a parameter in combination with a widget and a calculate transform. We will create a histogram for a column we select using a dropdown.

We will use the calculate transform to create a new column, and use the parameter and widget to set its values to those of the selected column.

In [32]:
df = data.cars()

column_name_select = alt.binding_select(
    options=['Miles_per_Gallon', 'Horsepower', 'Displacement', 'Weight_in_lbs', 'Acceleration'],
    name='Select a column: '
)

selected_column_param = alt.param(value="Miles_per_Gallon", bind=column_name_select)

alt.Chart(df).mark_bar().encode(
    x=alt.X("selected_column:Q", bin=True).title("Selected Column"),
    y="count()"
).transform_calculate(
    selected_column=f"datum[{selected_column_param.name}]"
).add_params(
    selected_column_param
)

### Binding Legends

We can also bind the chart legend to a selection parameter. Interactive legends let us focus on different parts of the dataset.

We will create a scatterplot where we can select `cars` from a specific origin using the legend.

In [33]:
df = data.cars()

selected_origin =  alt.selection_point(fields=["Origin"], bind="legend")


alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(selected_origin, "Origin:N", alt.value("gray")),
    opacity=alt.condition(selected_origin, alt.value(0.7), alt.value(0.2)),
).add_params(
    selected_origin
)

### Binding Scales

The final type of binding that is supported is `scales` binding. Binding an interval selection to `scales` lets us build a chart with zooming and panning capability.

In [34]:
selection = alt.selection_interval(bind='scales')

alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color='Origin:N',
).add_params(
    selection
)

Vega-Altair has a shortcut for such a binding as well. We can simply call the `interactive` method at the end of the chart specification to enable zooming and panning using scale binding.

In [35]:
alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color='Origin:N',
).interactive()

## Expressions

The final part of creating interactive charts using Vega-Altair is __expressions__. Expressions allow us to create custom interactions using the [_Vega expression language_](https://vega.github.io/vega/docs/expressions/). The expression language is a subset of the JavaScript programming language. For example, we can specify conditions using the ternary operator `(condition ? if_true : if_false)`:

```javascript
expr='year > 2000 ? "red" : "blue"'
```

Vega-Altair provides the `alt.expr` module, which allows us to create expressions without writing JavaScript. Assuming `year` is a parameter or expression object, the above expression can be recreated using:

```python
alt.expr.if_(year > 2000, "red", "blue")
```

Expressions can be used for binding with widgets, as well as in transforms like the calculate transform.
We will look at an example of using an expression for binding. However, this barely scratches the surface of what we can do with expressions. Please refer to the documentation for a deeper dive.

We will create a scatterplot with a search feature. Previously we learned that we can bind parameters to any HTML input element. We can create such bindings using `alt.binding`. Here we will use an HTML search box to get user input for a search term. We will use this search term to create an expression that serves as the predicate for highlighting cars whose model name matches the user input.

In [36]:
df = data.cars()

search = alt.param(
    value="",
    bind=alt.binding(
        input="search",
        placeholder="Enter a car model name",
        name="Search: "
    )
)

search_term_regex = alt.expr.regexp(search, "i")
predicate = alt.expr.test(search_term_regex, alt.datum.Name)

alt.Chart(df).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color=alt.condition(predicate, "Origin:N", alt.value("gray")),
    opacity=alt.condition(predicate, alt.value(0.7), alt.value(0.2)),
    tooltip=["Name:N"]
).add_params(
    search
).interactive()

## _New_: Accessing parameters from selections in Python

We saw a lot of ways to create interactive charts with Vega-Altair. However, the selections and parameters cannot be used outside the Vega-Altair plots. Vega-Altair recently (end of August 2023) added the `JupyterChart` class, which allows us to access the selections of an Altair chart from our Python code.

`JupyterChart` is a Jupyter Widget. Jupyter Widgets is a popular project for creating interactive widgets in Python.
Jupyter Widgets are different from the widgets we saw earlier, which are native Vega-Lite features. We don't need to learn about the Jupyter Widgets project to use `JupyterChart`.

>
> Warning: JupyterChart is a very recent addition, so there might be a few bugs
>

We can create a regular Vega-Altair plot and pass it to the `JupyterChart` class to create an interactive plot that can be accessed from Python code.

In [37]:
df = data.cars()

brush_selection = alt.selection_interval(name="brush")

base_altair_chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(brush_selection, "Origin:N", alt.value("gray")),
    opacity=alt.condition(brush_selection, alt.value(0.7), alt.value(0.3))
    
).add_params(
    brush_selection
)

jupyter_chart = alt.JupyterChart(base_altair_chart)
jupyter_chart

JupyterChart(spec={'config': {'view': {'continuousWidth': 300, 'continuousHeight': 300}}, 'data': {'name': 'da…

Visually we don't see a difference; however, we now have access to the internals of the Vega-Altair plot through the `jupyter_chart` variable.

In [38]:
jupyter_chart.selections

Selections({'brush': IntervalSelection(name='brush', value={}, store=[])})

In [39]:
jupyter_chart.selections.brush.value

{}

We can now use the selection value to query our pandas dataframe or do any data analysis. We will look at how to use different types of interactions.

### Interval selection

In [40]:
df = data.cars()

brush_selection = alt.selection_interval(name="brush")

base_altair_chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(brush_selection, "Origin:N", alt.value("gray")),
    opacity=alt.condition(brush_selection, alt.value(0.7), alt.value(0.3)),
    tooltip=["Name:N"]
).add_params(
    brush_selection
)

jupyter_chart = alt.JupyterChart(base_altair_chart)
jupyter_chart

JupyterChart(spec={'config': {'view': {'continuousWidth': 300, 'continuousHeight': 300}}, 'data': {'name': 'da…

In [44]:
jupyter_chart.selections.brush.value

{'Miles_per_Gallon': [24.705698649088543, 34.082499186197914],
 'Weight_in_lbs': [1936.2384541829429, 3002.9835510253906]}

In [45]:
def interval_filter(selection_value):
    return " and ".join([
        f"{v[0]} <= `{k}` <= {v[1]}"
        for k, v in selection_value.items()
    ])

interval_query = interval_filter(jupyter_chart.selections.brush.value)
interval_query

'24.705698649088543 <= `Miles_per_Gallon` <= 34.082499186197914 and 1936.2384541829429 <= `Weight_in_lbs` <= 3002.9835510253906'

In [46]:
df.query(interval_query)

Unnamed: 0,Name,Miles_per_Gallon,Cylinders,Displacement,Horsepower,Weight_in_lbs,Acceleration,Year,Origin
24,datsun pl510,27.0,4,97.0,88.0,2130,14.5,1970-01-01,Japan
26,peugeot 504,25.0,4,110.0,87.0,2672,17.5,1970-01-01,Europe
28,saab 99e,25.0,4,104.0,95.0,2375,17.5,1970-01-01,Europe
29,bmw 2002,26.0,4,121.0,113.0,2234,12.5,1970-01-01,Europe
35,datsun pl510,27.0,4,97.0,88.0,2130,14.5,1971-01-01,Japan
...,...,...,...,...,...,...,...,...,...
400,chevrolet camaro,27.0,4,151.0,90.0,2950,17.3,1982-01-01,USA
401,ford mustang gl,27.0,4,140.0,86.0,2790,15.6,1982-01-01,USA
403,dodge rampage,32.0,4,135.0,84.0,2295,11.6,1982-01-01,USA
404,ford ranger,28.0,4,120.0,79.0,2625,18.6,1982-01-01,USA
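
The query-string approach above produces an empty string when nothing is brushed, which `DataFrame.query` does not accept. One possible alternative, sketched here under the assumption that the selection value has the `{column: [low, high]}` shape shown above, is to build a boolean mask instead. The helper `interval_mask` and the sample data are hypothetical.

```python
import pandas as pd


def interval_mask(frame, selection_value):
    """Return a boolean mask of rows inside every selected [low, high] range."""
    # Start with all rows selected; an empty selection keeps everything.
    mask = pd.Series(True, index=frame.index)
    for column, (low, high) in selection_value.items():
        # between() is inclusive on both ends, like `low <= col <= high`.
        mask &= frame[column].between(low, high)
    return mask


# Hypothetical stand-in for the cars data and a brush value.
toy_df = pd.DataFrame({
    "Miles_per_Gallon": [18.0, 25.0, 31.0, 27.0],
    "Weight_in_lbs": [3500, 2600, 2000, 3200],
})
example_value = {"Miles_per_Gallon": [24.0, 35.0], "Weight_in_lbs": [1900, 3000]}

selected = toy_df[interval_mask(toy_df, example_value)]
everything = toy_df[interval_mask(toy_df, {})]
print(selected)
```

Treating an empty brush as "keep all rows" mirrors how the chart itself shows every point in color when nothing is brushed; returning no rows instead would be an equally valid design choice.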


### Point Selection

In [48]:
df = data.cars()

brush_selection = alt.selection_point(name="brush", encodings=["color"], bind="legend")

base_altair_chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(brush_selection, "Origin:N", alt.value("gray")),
    opacity=alt.condition(brush_selection, alt.value(0.7), alt.value(0.3)),
    tooltip=["Name:N"]
).add_params(
    brush_selection
)

jupyter_chart = alt.JupyterChart(base_altair_chart)
jupyter_chart

JupyterChart(spec={'config': {'view': {'continuousWidth': 300, 'continuousHeight': 300}}, 'data': {'name': 'da…

In [51]:
jupyter_chart.selections.brush.value

[{'Origin': 'Japan'}]

In [52]:
def point_filter(selection_value):
    return " or ".join([
        " and ".join([
            f"`{col}` == {repr(val)}" for col, val in sel.items()
        ])
        for sel in selection_value
    ])

point_query = point_filter(jupyter_chart.selections.brush.value)
point_query

"`Origin` == 'Japan'"

In [53]:
df.query(point_query)

Unnamed: 0,Name,Miles_per_Gallon,Cylinders,Displacement,Horsepower,Weight_in_lbs,Acceleration,Year,Origin
20,toyota corona mark ii,24.0,4,113.0,95.0,2372,15.0,1970-01-01,Japan
24,datsun pl510,27.0,4,97.0,88.0,2130,14.5,1970-01-01,Japan
35,datsun pl510,27.0,4,97.0,88.0,2130,14.5,1971-01-01,Japan
37,toyota corona,25.0,4,113.0,95.0,2228,14.0,1971-01-01,Japan
60,toyota corolla 1200,31.0,4,71.0,65.0,1773,19.0,1971-01-01,Japan
...,...,...,...,...,...,...,...,...,...
390,toyota corolla,34.0,4,108.0,70.0,2245,16.9,1982-01-01,Japan
391,honda civic,38.0,4,91.0,67.0,1965,15.0,1982-01-01,Japan
392,honda civic (auto),32.0,4,91.0,67.0,1965,15.7,1982-01-01,Japan
393,datsun 310 gx,38.0,4,91.0,67.0,1995,16.2,1982-01-01,Japan
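
A mask-based variant can also be sketched for point selections, assuming the selection value is a list of `{column: value}` dictionaries as shown above. Rows match if they agree with at least one dictionary on all of its columns. The helper `point_mask` and the sample data are hypothetical.

```python
import pandas as pd


def point_mask(frame, selection_value):
    """Return a boolean mask of rows matching any selected point."""
    # Start with nothing selected and OR in each selected point.
    mask = pd.Series(False, index=frame.index)
    for point in selection_value:
        # A row matches a point only if every projected column agrees.
        row_match = pd.Series(True, index=frame.index)
        for column, value in point.items():
            row_match &= frame[column] == value
        mask |= row_match
    return mask


# Hypothetical stand-in for the cars data and a selection value.
toy_df = pd.DataFrame({
    "Origin": ["USA", "Japan", "Europe", "Japan"],
    "Cylinders": [8, 4, 4, 6],
})
example_value = [{"Origin": "Japan", "Cylinders": 4}, {"Origin": "USA"}]

selected = toy_df[point_mask(toy_df, example_value)]
nothing = toy_df[point_mask(toy_df, [])]
print(selected)
```

Here an empty selection returns no rows, which avoids passing an empty query string to `DataFrame.query`. Comparing values directly also sidesteps quoting them with `repr` inside a query string.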


### Index Selection (Point selection without encodings or fields)

In [54]:
df = data.cars()

brush_selection = alt.selection_point(name="brush")

base_altair_chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(brush_selection, "Origin:N", alt.value("gray")),
    opacity=alt.condition(brush_selection, alt.value(0.7), alt.value(0.3)),
    tooltip=["Name:N"]
).add_params(
    brush_selection
)

jupyter_chart = alt.JupyterChart(base_altair_chart)
jupyter_chart

JupyterChart(spec={'config': {'view': {'continuousWidth': 300, 'continuousHeight': 300}}, 'data': {'name': 'da…

In [55]:
jupyter_chart.selections.brush.value

[137, 385, 394]

In [56]:
selected_points = jupyter_chart.selections.brush.value
selected_points

[137, 385, 394]

In [57]:
df.iloc[selected_points, :]

Unnamed: 0,Name,Miles_per_Gallon,Cylinders,Displacement,Horsepower,Weight_in_lbs,Acceleration,Year,Origin
137,ford pinto,26.0,4,122.0,80.0,2451,16.5,1974-01-01,USA
385,mazda glc custom,31.0,4,91.0,68.0,1970,17.6,1982-01-01,Japan
394,buick century limited,25.0,6,181.0,110.0,2945,16.4,1982-01-01,USA


## Summary

Today we covered the following topics:
- Interaction grammar in Vega-Altair
    - Parameters in Vega-Altair
        - Variable Parameters
        - Selection Parameters
    - Conditional Encoding
    - Interactive Filtering
    - Selections
        - Types of selection interactions
        - Projecting selections
    - Parameter composition
    - Types of widgets
    - Binding
        - Widgets
        - Channels
        - Legend
        - Scales
    - Expressions
- JupyterChart 

## Limits to interactivity in Jupyter Notebooks

We have multiple libraries which add interactivity to Jupyter Notebooks like Vega-Altair, Jupyter Widgets, Holoviz, etc.

Most of these libraries, including Vega-Altair, support interactions, but until recently they did not allow us to use the results of those interactions in Python code. As a result, there is a separation between the code sections of a notebook and its interactive sections.

Vega-Altair provides a way to bridge this gap using `JupyterChart`. However, we can only work with selections. Further, using interactions to drive the code in the cells below is fragile, since interactions are not saved and reloaded when we reopen the notebook.