## Interactive Data Visualization with Altair 
### A D3-Based Visualization Library

`Altair` is a powerful visualization library for Python built on the `Vega-Lite` library, which is developed by the University of Washington Interactive Data Lab (https://idl.cs.washington.edu/). `Vega-Lite` follows a "grammar of graphics" model of data visualization, much like the `R` package `ggplot2`, with one key difference: `Vega-Lite` and `Altair` are developed for interactive data visualizations on top of the `D3` JavaScript library.

Combining the beauty and interactivity of `D3` with the ease of use of `Altair` in Python yields a powerful tool for researchers to explore and understand their data wherever a `Jupyter Notebook` runs, whether that is a laptop, a High-Performance Computing (HPC) environment like MARCC, or an intermediate system like Google Colaboratory.

To motivate our use of `Jupyter Notebooks` as an infrastructure-agnostic tool for data visualization, we will run a simple visualization example, using `Altair`, in a `Jupyter Notebook`.

Let's begin by importing the `Altair` package into our running Python kernel, along with some example datasets to visualize.

In [36]:
import altair as alt
from vega_datasets import data

Now let's load `cars` from `vega_datasets`, a simple dataset comprising details about cars, including fuel consumption, manufacturer, origin, and horsepower. The data is stored as a `Pandas` dataframe; `Pandas` is a Python package for manipulating tabular data.

In [None]:
cars = data.cars()
cars.head()

Let's create a chart from the `cars` dataframe:

In [None]:
chart = alt.Chart(cars)

Now we can create a scatter plot by setting our mark to a point and encoding the columns we would like to visualize on the x and y axes, respectively:

In [None]:
alt.Chart(cars).mark_point().encode(
    x='Miles_per_Gallon',
    y='Horsepower'
)

We can now introduce an element of the interactive grammar, called a `selection`. Here an interval selection lets us drag out a rectangular region over the points:

In [None]:
interval = alt.selection_interval()

alt.Chart(cars).mark_point().encode(
    x='Miles_per_Gallon',
    y='Horsepower',
    color='Origin'
).add_params(
    interval  # Altair 5+; on Altair 3 and 4 use .add_selection(interval)
)


Let's make certain that only the currently selected values are colored by `Origin`; points outside the selection are drawn in light gray:

In [None]:
interval = alt.selection_interval()

alt.Chart(cars).mark_point().encode(
    x='Miles_per_Gallon',
    y='Horsepower',
    color=alt.condition(interval, 'Origin', alt.value('lightgray'))
).add_params(
    interval  # Altair 5+; on Altair 3 and 4 use .add_selection(interval)
)


Now we can add other, linked plots with the `|` operator, the same character that separates columns in a `Markdown` table:

In [None]:
interval = alt.selection_interval()

chart = alt.Chart(cars).mark_point().encode(
    x='Miles_per_Gallon',
    y='Horsepower',
    color=alt.condition(interval, 'Origin', alt.value('lightgray'))
).add_params(
    interval  # Altair 5+; on Altair 3 and 4 use .add_selection(interval)
)

chart | chart.encode(x='Acceleration')


This tutorial was adapted, in part, from the presentation given by Jake VanderPlas, Altair's creator, at PyCon 2018:

_Jake VanderPlas - Exploratory Data Visualization with Vega, Vega-Lite, and Altair - PyCon 2018_