## Altair Express Tutorial

Altair Express is a high-level data visualization library for quickly creating statistical charts in Python. Modeled after the [seaborn library](https://seaborn.pydata.org/), Altair Express (abbreviated `alx`) lets you [create charts](http://www.dylanwootton.com/Altair-Express/API/visualizations.html) and add [interactions](http://www.dylanwootton.com/Altair-Express/API/interactiontechniques.html) in a single line of code.

Today, you'll be testing an alpha version of Altair Express. We'll walk you through 

First we'll import our libraries of interest:

In [None]:
# pip uninstall without checking 
# 

In [4]:
pip install vega_datasets

Note: you may need to restart the kernel to use updated packages.


In [5]:
import altair_express as alx
from vega_datasets import data
import pandas as pd


ModuleNotFoundError: No module named 'altair'

We'll be working with a pared-down **gapminder** dataset to explore the Altair Express API.

In [3]:
df = data.gapminder()
df.head()

NameError: name 'data' is not defined

In [11]:
new_df = df
new_df['super_duper_long_title_that_will_def_not_wrap'] = df['year']
alx.pairplot(new_df)

In [4]:
import altair as alt  # assumes the Altair build installed with Altair Express is importable as `altair`
from vega_datasets import data

source = data.cars()

chart = alt.Chart(source).mark_point().encode(
    x=alt.X('Horsepower',axis=alt.Axis(titleLimit=30)),
    y='Miles_per_Gallon',
    color='Origin',
    tooltip=['Name', 'Origin', 'Horsepower', 'Miles_per_Gallon']
)

chart.encoding.x.title = "Horsepower (hp)"
chart


## Visualizations 

Let's get more familiar with the dataset. To do this, we'll use Altair Express's visualization functions to [visualize our data](http://www.dylanwootton.com/Altair-Express/API/visualizations.html).

The `profile` function provides a quick API for visualizing the columns of your dataframe and helps us find areas that might be more interesting to look at.

In [5]:
alx.profile(df)
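
To give a sense of what a profiler has to decide for each column, here is a minimal sketch in plain pandas. It is not how `alx.profile` is implemented: the helper name `suggest_chart_types` and the small sample frame are hypothetical, and the rule (numeric columns get a histogram, everything else a count plot) is an assumption made for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def suggest_chart_types(frame: pd.DataFrame) -> dict:
    """Map each column name to a chart type a profiler might pick (hypothetical rule)."""
    return {
        column: "histogram" if is_numeric_dtype(frame[column]) else "countplot"
        for column in frame.columns
    }


# Tiny made-up sample with gapminder-like column names (values are illustrative only).
sample = pd.DataFrame(
    {
        "country": ["A", "B", "C"],
        "life_expect": [61.2, 70.5, 78.9],
        "pop": [1_000_000, 2_500_000, 4_000_000],
    }
)

suggestions = suggest_chart_types(sample)
print(suggestions)
```

A real profiler would likely also consider cardinality and date columns; this sketch only separates numeric from non-numeric data.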


### TODO: Add more examples of visualizations here



## Interaction Techniques
Static visualization gives us a sense of the data, but interaction can help us explore the data more thoroughly. 

Let's make our profile visualization interactive with the altair_express [highlight_brush()](http://www.dylanwootton.com/Altair-Express/API/interactions/highlightbrush.html#highlight-brush) interaction technique.

In [8]:
df = pd.read_csv('https://raw.githubusercontent.com/vega/vega-datasets/master/data/cars.csv')
alx.highlight_brush() + alx.profile(df)

URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)>

In [17]:
alx.hist(df,x='pop',x_scale=alt.Scale(type='log'))

By layering on the interaction, you can now brush and filter across your various data columns. The profiler can now be a powerful data selection tool. 

But now that we've explored the data, how do we drill down for more info?

Let's copy our selection as a query. You can copy the pandas query from the chart by pressing Cmd+C (on macOS) or Ctrl+C (on Windows).


In [18]:
from IPython.display import Video  # needed to embed the recording
Video('./static/profiler_interactions_and_copy.mp4',width=1000)

The above demonstration shows how to get our selected data from our charts. 
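
The copied text is an ordinary pandas query string. As a rough illustration of how brushed ranges could map to such a string, here is a minimal sketch. The helper `ranges_to_query` is hypothetical and is not part of Altair Express; the two-decimal formatting is an assumption.

```python
def ranges_to_query(ranges: dict) -> str:
    """Build a pandas query string from {column: (low, high)} brush extents."""
    clauses = [
        f"({column}>={low:.2f} and {column}<={high:.2f})"
        for column, (low, high) in ranges.items()
    ]
    # Several brushed columns are combined with a logical AND.
    return " and ".join(clauses)


query = ranges_to_query({"life_expect": (64.70, 80.56)})
print(query)
```

The resulting string can be passed straight to `DataFrame.query`.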

Let's see how this selected data fits on our scatterplot. We can do that by [layering](https://altair-viz.github.io/user_guide/compound_charts.html#layered-charts) our filtered chart over our unfiltered chart.

In [5]:
filtered_df = df.query(" (life_expect>=64.70 and life_expect<=80.56) ")



alx.scatterplot(df,x='life_expect',y='fertility',color='gray') + alx.scatterplot(filtered_df,x='life_expect',y='fertility',color='red')
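
The layering works because `DataFrame.query` returns only the rows that satisfy the expression, so the red points are a subset of the gray ones. A small check on made-up data (the values are illustrative and not taken from gapminder):

```python
import pandas as pd

# Made-up rows using the tutorial's column names; the values are illustrative only.
toy = pd.DataFrame(
    {
        "life_expect": [55.0, 64.7, 72.3, 80.56, 83.1],
        "fertility": [6.1, 4.0, 2.4, 1.9, 1.5],
    }
)

# Same expression shape as the query copied from the chart; both bounds are inclusive.
selected = toy.query("(life_expect>=64.70 and life_expect<=80.56)")

print(len(toy), len(selected))
print(selected["life_expect"].tolist())
```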



In [6]:
alx.filter_type(target='country')+alx.scatterplot(df,x='life_expect',y='fertility',color='gray')


In [26]:
alx.filter_brush()+alx.scatterplot(df,x='life_expect',y='fertility',color='gray')


### Composing Visualizations

As you saw with the last example, visualization can be particularly helpful when we compose charts. 

Altair offers three primary ways to construct multi-view charts. Because Altair Express returns an Altair object, all of these operations are natively supported.
- [Layering](https://altair-viz.github.io/user_guide/compound_charts.html#layered-charts) overlays the charts on each other. You can layer charts using the '+' operator.



In [6]:
df = data.seattle_weather()

countplot = alx.countplot(df,x='weather') # produce a countplot as the base of our chart

# layer that countplot and the text above it. 
# The second object reuses the base countplot as well, so axes don't need to be respecified
countplot + countplot.mark_text(
    align='center',
    baseline='middle',
    dy=-5  # Nudges text above the bar 
).encode(
    text='weather:N'
)


- [Vertical Concatenation](https://altair-viz.github.io/user_guide/compound_charts.html#vconcat-chart) stacks the charts vertically atop one another. You can vertically concatenate two charts with the '&' operator.


In [7]:
alx.hist(df,x='precipitation') & alx.hist(df,x='temp_max')

- [Horizontal Concatenation](https://altair-viz.github.io/user_guide/compound_charts.html#horizontal-concatenation) arranges the charts side by side. You can horizontally concatenate two charts with the '|' operator.


In [8]:
alx.scatterplot(df,x='precipitation',y='temp_max') | alx.hist(df,y='temp_max',height=200,width=50)

These operators are the basic building blocks for laying out charts together. Similarly, we can layer interactions on top of each other.

In [12]:
alx.tooltip_hover()+alx.highlight_brush()+alx.scatterplot(df,x='precipitation',y='temp_max')

Now that we have created interactions for single charts, how do you compose interactive visualizations?


The easiest way is to add them directly to your composed charts. This will make your entire dashboard interactive.

Here, we create a gapminder dashboard that lets you cross filter on your columns and see the changes on the chart below. 

In [5]:
df = data.gapminder()

profile = alx.profile(df)

barplot = alx.barplot(df,x='country',y='pop',sort='-y',width=600) # sort descending by the height of the bars

alx.highlight_brush() + alx.tooltip_hover() + alx.filter_slider(field='year') + (profile & barplot)

But you can also choose to make only certain parts of your dashboard interactive. For example, let's modify the gapminder visualization by adding the ability to brush on regions of the chart.

To do this, we'll be using the effects parameter of altair express. This lets you pass in interactions to have an effect on charts without making them interactive themselves. 

In [11]:
df = data.gapminder()


scatter =  alx.scatterplot(df, x='life_expect',y='fertility',color='cluster',size='pop')

slider = alx.filter_slider(field='year')
brush = alx.highlight_brush()

interactive_scatter = alx.tooltip_hover() + brush + slider  + scatter


# apply the interaction effects to the barplot 
barplot = alx.barplot(df,x='country',y='pop',sort='-y',effects={"filter":[slider,brush]})

# compose the visualizations together
interactive_scatter & barplot

For example, see this user brushing on the gapminder bubble chart and filtering using the provided year query widget. 

In [4]:
pip uninstall -y altair

Note: you may need to restart the kernel to use updated packages.


In [13]:
Video('./static/gapminder_interact.mp4',width=1000)          