# Interactive Plotting with Altair

Static visualizations are good, but they definitely have their limitations.

Interactive plotting in Python has seen a number of interesting developments in recent years, in particular Bokeh, plot.ly, and most recently Altair.

Whilst I like plot.ly and Bokeh, they can be complicated to get started with, although they are powerful once you know how to navigate their APIs. Altair is more declarative, by which I mean that the mapping from data to visual is more natural. There is a specific grammar for composing charts, and with it you can go far quickly.

**One caveat, however: by default Altair will only visualize up to 5000 rows of data. But there are ways to increase that limit :)**

```
alt.data_transformers.enable('default', max_rows=None)
```
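Lifting the limit means every row is embedded in the chart specification, which can make a notebook large and slow. An alternative, sketched below, is to downsample before plotting, which is what most cells in this notebook do with `fifa.sample(1000)`. The helper name `cap_rows` and the synthetic frame are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def cap_rows(df, max_rows=5000, seed=0):
    """Return df unchanged if it is small enough, else a random sample of max_rows rows."""
    if len(df) <= max_rows:
        return df
    return df.sample(n=max_rows, random_state=seed)


# Synthetic stand-in for a large table (not the FIFA data)
big = pd.DataFrame({'value': np.arange(20000)})
small = cap_rows(big)
print(len(big), len(small))
```

Fixing the seed keeps the sample, and therefore the chart, reproducible between runs.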

In [None]:
import altair as alt
import pandas as pd

alt.renderers.enable('notebook')

In [None]:
alt.data_transformers.enable('default', max_rows=None)

In [None]:
fifa = pd.read_csv('../data/fifa_player_data.csv')

In [None]:
fifa.sample(10)

## Looking at 1D distributions

Let's start with a basic one-dimensional set of data. I have the acceleration ratings of all the footballers, and I want to plot them along a single line.

In [None]:
alt.Chart(fifa.sample(1000)).mark_point().encode(
    x='Acceleration'
)

In [None]:
alt.Chart(fifa.sample(1000)).mark_point().encode(
    x='average(Acceleration)',
    y='Nationality'
)

In [None]:
alt.Chart(fifa.sample(1000)).mark_bar().encode(
    x='Nationality',
    y='average(Acceleration)'
)

In [None]:
alt.Chart(fifa.sample(1000)).mark_point().encode(
    x='Nationality',
    y='average(Acceleration)',
    size='count()',
    color='Club_Position',
    tooltip=['Club_Position', 'average(Acceleration)', 'count()']
).interactive()

## Line Charts

In [None]:
import numpy as np

# Simulated activity counts: 90 Poisson draws per player (not real data)
player_speed = pd.DataFrame()

player_speed['messi'] = np.random.poisson(10, 90)
player_speed['ronaldo'] = np.random.poisson(8, 90)
player_speed['martins'] = np.random.poisson(5, 90)

In [None]:
# Reshape from wide (one column per player) to long form (one row per observation)
player_speed = (
    player_speed.stack()
    .reset_index()
    .rename(columns={'level_0': 'time', 'level_1': 'player', 0: 'activity'})
)
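The reshape above turns the wide table (one column per player) into the long form that Altair encodings expect (one row per observation). As a hedged sketch on a tiny invented frame, the same result can also be reached with `melt`, which some readers find easier to follow. The names `wide`, `long_stack` and `long_melt` are introduced here purely for illustration.

```python
import pandas as pd

# Tiny invented wide table: one column per player, one row per time step
wide = pd.DataFrame({'messi': [10, 12], 'ronaldo': [8, 7]})

# Approach used in this notebook: stack the columns into the index
long_stack = (
    wide.stack()
    .reset_index()
    .rename(columns={'level_0': 'time', 'level_1': 'player', 0: 'activity'})
)

# Alternative: name the index, expose it as a column, then melt the players
long_melt = (
    wide.rename_axis('time')
    .reset_index()
    .melt(id_vars='time', var_name='player', value_name='activity')
)

print(long_stack)
print(long_melt)
```

Both frames hold the same rows; only the row order differs, which does not matter for plotting.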

In [None]:
import altair as alt
from vega_datasets import data


alt.Chart(player_speed).mark_line().encode(
    x='time',
    y='activity',
    color='player',
    tooltip=['time', 'activity', 'player']
).properties(
    width=600,
    height=150
)


In [None]:
highlight = alt.selection(type='single', on='mouseover',
                          fields=['player'], nearest=True)

base = alt.Chart(player_speed).encode(
    x='time',
    y='activity',
    color='player'
)

points = base.mark_circle().encode(
    opacity=alt.value(0)
).add_selection(
    highlight
).properties(
    width=600
)

lines = base.mark_line().encode(
    size=alt.condition(~highlight, alt.value(1), alt.value(3))
)

points + lines

## Scatter Charts

In [None]:
chart = alt.Chart(fifa.query('Club_Position != "Sub"').sample(1000)).mark_circle().encode(
    x='Acceleration',
    y='Speed',
    color='Club_Position',
    tooltip=['Acceleration', 'Speed', 'Name', 'Club_Position']
).interactive()

chart.display()

In [None]:
brush = alt.selection_interval()

alt.Chart(fifa.query('Nationality=="Israel"')).mark_circle().encode(
    alt.X(alt.repeat("column"), type='quantitative'),
    alt.Y(alt.repeat("row"), type='quantitative'),
    color=alt.condition(brush, 'Club_Position:N', alt.value('lightgray')),
    tooltip=['Name', 'Club_Position']
).properties(
    width=200,
    height=200
).add_selection(
    brush
).repeat(
    row=['Acceleration', 'Speed'],
    column=['Finishing', 'Strength']
)

## Histograms

### One Group

In [None]:
alt.Chart(fifa[fifa.Club_Position.isin(['ST', 'CB'])]).mark_area(
    opacity=0.3,
    interpolate='step'
).encode(
    alt.X('Reactions', bin=alt.Bin(maxbins=10)),
    alt.Y('count()', stack=None, axis=alt.Axis(title='Number of Players')),
    tooltip=['Club_Position']
)

### Multiple Groups

In [None]:
alt.Chart(fifa[fifa.Club_Position.isin(['ST', 'CB'])]).mark_area(
    opacity=0.3,
    interpolate='step'
).encode(
    alt.X('Short_Pass', bin=alt.Bin(maxbins=25)),
    alt.Y('count()', stack=None, axis=alt.Axis(title='Number of Players')),
    alt.Color(
        'Club_Position',
        scale=alt.Scale(range=['#0000ff', '#008000', '#ff0000'])
    ),
    tooltip=['Club_Position']
)

## Building Cross-Linked Plots

It's often more interesting to be able to interrogate your data interactively, for example by seeing how distributions change based on some selection.

Luckily, we can do this directly in our notebook without having to switch to a different tool, and it's rather easy.

The key ingredients here are selections and transform filters.

We add a selection to one plot, and that selection can then be applied to another plot with `transform_filter`.

In [None]:
brush = alt.selection(type='interval')

first_chart = alt.Chart(fifa.sample(1000)).mark_bar().encode(
    y='count(Nationality)',
    x=alt.X('Nationality',
        sort=alt.SortField(field='count', order='descending', op='max')
    ),
    color=alt.condition(brush, alt.value('blue'), alt.value('lightgray')),
).add_selection(
    brush
)

second_chart = alt.Chart(fifa.sample(1000)).mark_bar().encode(
    y='count(Acceleration)',
    x='Acceleration'
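).transform_filter(
    brush
)

# Stack the two charts so that brushing the first one filters the second
first_chart & second_chart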

In [None]:
brush = alt.selection(type='interval')
nationality_select = alt.selection(type='single', fields=['Nationality'])
club_select = alt.selection(type='single', fields=['Club_Position'])

points = alt.Chart().mark_circle().encode(
    x='Acceleration',
    y='Speed',
    color=alt.condition(brush, 'Nationality:N', alt.value('lightgray')),
    tooltip=['Club_Position', 'Name']
).add_selection(
    brush
).transform_filter(
    nationality_select
)

bars_nationality = alt.Chart().mark_bar().encode(
    color='Nationality',
    x='count(Nationality)',
    y=alt.Y('Nationality',
        sort=alt.SortField(field='count', order='descending', op='max')
    )
).properties(
    selection=nationality_select
).transform_filter(
    brush
)

bars_club_position = alt.Chart().mark_bar().encode(
    color='Club_Position',
    x='count(Club_Position)',
    y=alt.Y('Club_Position',
        sort=alt.SortField(field='count', order='descending', op='max')
    )
).properties(
    selection=club_select
).transform_filter(
    brush
)

alt.vconcat(points, bars_nationality, bars_club_position, data=fifa.sample(2000))