# Data Visualization with Altair / Vega

[Altair](https://altair-viz.github.io/) is a declarative statistical visualization library, based on the [Vega](http://vega.github.io/vega) and [Vega-Lite](http://vega.github.io/vega-lite) grammar of graphics.

You need to install the `altair` and `vega` Python packages, plus `vega_datasets` if you want to run the examples from the documentation. Altair needs at least Python 3.5.3.

## Documentation

The [main documentation](https://altair-viz.github.io/) includes an extensive user guide as well as a gallery and case studies. Here are a few links to special topics that are often needed but somewhat hard to find.

* [Mark definition parameters explained](https://altair-viz.github.io/user_guide/generated/core/altair.MarkDef.html#)
* [X axis specifications](https://altair-viz.github.io/user_guide/generated/channels/altair.X.html)

**Trainings**
* [1-Hour Altair Tutorial](https://nbviewer.jupyter.org/github/kanitw/altair-tutorial/blob/master/ECS_Hackweek.ipynb)

## Initialization

To register the Altair renderer in the classic notebook, you need the following code (typically in your first code cell).

In [1]:
import altair as alt

# Enable Altair for notebooks (not needed for JupyterLab)
_ = alt.renderers.enable('notebook')

Note that this is not necessary with JupyterLab.

Rendering PNG images from Altair charts takes some additional setup. Your Jupyter installation needs the `selenium` Python package (see the `setup` folder for hints on that), which in turn requires the `chromedriver` executable to work.

The following code helps when you cannot install that program system-wide and instead download it into the folder of your notebook. That folder must then be added to the command search `PATH`, so the driver binary is found.

In [2]:
import os

# Find a 'chromedriver' in the notebook's directory
if os.getcwd() not in os.environ['PATH'].split(os.pathsep): 
    os.environ['PATH'] += os.pathsep + os.getcwd()
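
To confirm that the driver is actually found after extending `PATH`, a lookup with the standard library's `shutil.which` can help. The following is a minimal sketch, not part of the original notebook; the helper name `add_to_path` is hypothetical and only repackages the logic of the cell above so it can be reused.

```python
import os
import shutil


def add_to_path(directory, path_value):
    """Return path_value with directory appended, unless it is already listed.

    Hypothetical helper (an assumption for illustration, not an Altair API).
    """
    entries = path_value.split(os.pathsep) if path_value else []
    if directory not in entries:
        entries.append(directory)
    return os.pathsep.join(entries)


# Same effect as the cell above: make the notebook's folder searchable
os.environ['PATH'] = add_to_path(os.getcwd(), os.environ.get('PATH', ''))

# shutil.which returns the full path of the executable, or None if it is missing
driver = shutil.which('chromedriver')
print(driver or "chromedriver not found on PATH")
```

If the lookup prints the "not found" message, the PNG export will most likely fail later, so it is worth checking this before saving any charts.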

A more automated way uses the [chromedriver-binary](https://pypi.org/project/chromedriver-binary/) PyPI package to download the binary. Assuming you have installed that package into your runtime environment, all you need to do is import it, which extends your `PATH` so the driver binary is found at runtime.

In [3]:
import chromedriver_binary
!echo $PATH | tr : \\n | grep chrome
!ls -lh "{chromedriver_binary.chromedriver_filename}"

/opt/venvs/jupyterhub/lib/python3.6/site-packages/chromedriver_binary
-rwxr-xr-x 1 root root 12M Feb 21 12:24 /opt/venvs/jupyterhub/lib/python3.6/site-packages/chromedriver_binary/chromedriver


## Publishing Charts with `nbconvert`

Since Altair is based on the JavaScript-driven Vega, the internal notebook outputs are a mix of HTML and JavaScript. With IPython 7.2.0 (later versions might fix this), a `jupyter nbconvert --execute --to html` call will produce *empty* output cells.

To circumvent the problem, you can use HTML pointing to generated PNGs as the code cell output, instead of the Altair chart object itself. This is also more ‘git-friendly’ when you commit output cells, since the binary image data is stored in extra files and does not inflate the diffs for the notebook.

In [4]:
import pandas as pd
import altair as alt

def render_chart(chart, name, scale_factor=1.0, ext='png', publish=1):
    """Helper for chart output via non-embedded PNG images."""
    import os
    import time
    from IPython.display import HTML

    chart_img = "img/{}.{}".format(name, ext)
    os.makedirs(os.path.dirname(chart_img), exist_ok=True)  # 'img/' may not exist yet
    chart.save(chart_img, scale_factor=scale_factor)
    if publish:
        # The timestamp query string keeps the browser from showing a stale cached image
        return HTML('<img src="{}?{}">'
                    .format(chart_img, time.time()))
    else:  # return interactive original chart object when not publishing
        return chart

letters = list("Altair")
df = pd.DataFrame(dict(Letter=letters, Code=list(map(ord, letters))))

chart = alt.Chart(df).mark_bar().encode(
    y=alt.Y('Letter', sort=letters),
    x='Code',
).configure_view(height=150)

render_chart(chart, "ascii_bars")
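
The `time.time()` value that `render_chart` appends to the image URL presumably acts as a cache buster: the file name stays the same between runs, so a changing query string makes the browser reload the regenerated PNG. The sketch below isolates that step so it can be tried without Selenium. The function name `img_tag` and its `stamp` parameter are hypothetical and not part of the notebook or of Altair.

```python
import time
from html import escape


def img_tag(path, stamp=None):
    """Build an <img> tag whose URL carries a changing query string.

    Hypothetical helper for illustration; pass a fixed `stamp` to get
    reproducible output, or omit it to use the current time.
    """
    if stamp is None:
        stamp = time.time()
    # Escape the path so quotes or ampersands cannot break the attribute
    return '<img src="{}?{}">'.format(escape(path, quote=True), stamp)


print(img_tag("img/ascii_bars.png"))
```

Calling `render_chart(chart, "ascii_bars", publish=0)` instead returns the chart object itself, which is the more convenient mode while working interactively.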