# INFO 3402 – Week 13: Visualizing data in Altair and Vega

[Brian C. Keegan, Ph.D.](http://brianckeegan.com/)  
[Assistant Professor, Department of Information Science](https://www.colorado.edu/cmci/people/information-science/brian-c-keegan)  
University of Colorado Boulder  

Copyrighted and distributed under an [MIT License](https://opensource.org/licenses/MIT).

## Learning Objectives
* Visualizing data using "grammars of graphics"
* Using Altair to construct Vega specifications for visualizing data

## Resources
* Documentation
  * [Vega and Vega-Lite](https://vega.github.io/) documentation.
  * [Altair](https://altair-viz.github.io/index.html) documentation.
* Optional readings
  * Wilkinson, L. *[The Grammar of Graphics](https://www.springer.com/gp/book/9780387245447).*
  * Wilkinson, L. "[The Grammar of Graphics](https://link.springer.com/chapter/10.1007/978-3-642-21551-3_13)." *Handbook of Computational Statistics*, 2012.
  * Wickham, H. "[A Layered Grammar of Graphics](https://www.tandfonline.com/doi/abs/10.1198/jcgs.2009.07098)." *J. of Comp. and Graphical Statistics*, 2010.
  * Satyanarayan, A., Wongsuphasawat, K., & Heer, J. "[Declarative interaction design for data visualization](https://dl.acm.org/doi/abs/10.1145/2642918.2647360)." *Proc. UIST'14*.
  * Satyanarayan, A., Moritz, D., Wongsuphasawat, K., & Heer, J. "[Vega-Lite: A Grammar of Interactive Graphics](https://idl.cs.washington.edu/papers/vega-lite)." *IEEE Trans. Visualization & Comp. Graphics*, 2017.
  * VanderPlas, J., Granger, B.E., Heer, J., *et al*. "[Altair: Interactive Statistical Visualizations for Python](https://joss.theoj.org/papers/10.21105/joss.01057.pdf)." *The Journal of Open Source Software*, 2018.

## Install libraries

You only need to do this once. At the terminal (or Anaconda Prompt on Windows) run:

<code>conda install -c conda-forge altair vega_datasets</code>

If it has been a while since you last updated your packages, now may also be a good time to run <code>conda update --all</code>.

## Import libraries

In [None]:
import pandas as pd
import altair as alt

## Make some fake data

In [None]:
data = pd.DataFrame({'a': list('CCCDDDEEE'),
                     'b': [2, 7, 4, 1, 2, 6, 8, 4, 7]})

data

## Make a Chart object with the `data` inside

In [None]:
chart = alt.Chart(data)

## Make a basic mark

In [None]:
# No encodings yet, so every row is drawn as a point at the same position
chart.mark_point()

## Make a basic mark with encodings

In [None]:
# Map column a to the x-axis; the points now spread out by category
chart.mark_point().encode(x='a')

In [None]:
# Map column a to the x-axis and column b to the y-axis
chart.mark_point().encode(x='a', y='b')

## Try an alternative mark and encoding

In [None]:
# Swap the encodings: b on the x-axis, a on the y-axis
chart.mark_point().encode(x='b', y='a')

In [None]:
# Bar mark: one bar per value of a, with length equal to the mean of b
chart_bar = chart.mark_bar().encode(x='average(b)', y='a')
chart_bar
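
The `'average(b)'` shorthand asks for the mean of `b` within each value of `a`, so no aggregation is needed in pandas beforehand. As a rough cross-check, the same numbers can be computed with a pandas `groupby`; the variable name `means` is just an illustrative choice.

```python
import pandas as pd

# Same fake data as above
data = pd.DataFrame({'a': list('CCCDDDEEE'),
                     'b': [2, 7, 4, 1, 2, 6, 8, 4, 7]})

# Mean of b for each category in a; these should match the bar lengths
means = data.groupby('a')['b'].mean()
print(means)
```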

## Change mark color

In [None]:
# color= is a mark property here, so every bar gets the same fixed color
chart_bar = chart.mark_bar(color='red').encode(x='average(b)', y='a')
chart_bar

## Examine JSON output

In [None]:
print(chart_bar.to_json())

## Save chart to HTML

In [None]:
chart_bar.save('chart_bar.html')