# Declarative visualization

Declarative visualization depends upon [*tidy data*](https://www.jstatsoft.org/index.php/jss/article/view/v059i10/v59i10.pdf). The idea is that, instead of packing many observations about many phenomena into one row of a table, the data are structured so that each row holds a single observation.

An *untidy* table might combine several different kinds of observations in several columns of one row (for example, the amounts of electricity generated by fossil fuels, nuclear energy, and renewables):

In [None]:
import pandas as pd
pd.read_csv("data/iowa-electricity-untidy.csv")

A *tidy* table, on the other hand, is structured so that a single observation is in a single row.  We could tidy the previous table by restructuring it to store the year, the kind of electricity source, and the quantity generated:

In [None]:
iowa = pd.read_csv("data/iowa-electricity.csv")
iowa
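
If you only have the wide table, one way to do this restructuring yourself is pandas' `melt` method. The sketch below uses a small hypothetical table: the column names and the numbers are placeholders chosen for illustration, not the contents of the CSV files above.

```python
import pandas as pd

# Hypothetical wide ("untidy") table: one row per year, one column per source.
# The numbers are placeholders, not real Iowa figures.
wide = pd.DataFrame({
    "year": [2001, 2002],
    "fossil_fuels": [10, 11],
    "nuclear_energy": [3, 4],
    "renewables": [1, 2],
})

# Keep `year` as an identifier and turn every other column into
# (source, net_generation) pairs, giving one observation per row.
tidy = wide.melt(
    id_vars="year",
    var_name="source",
    value_name="net_generation",
)
print(tidy)
```

Each of the two wide rows becomes three tidy rows, one per energy source.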

Tidy data has many advantages; in this notebook, we will focus on its suitability for visualization.  Instead of a spreadsheet-style visualization of a wide table (in which expressions over several columns provide points to plot and explicit configuration shows how to draw them), tidy tables can be plotted *declaratively*, so that the values in some columns provide the location of a point and the values in other columns show how to render it.

We'll now see a quick example of how this works in the [Altair framework](https://altair-viz.github.io/index.html). (Altair is a Python library that exposes the [Vega visualization grammar](https://vega.github.io); this detail isn't important yet, but we mention it here because Vega comes up again later.)  First, we'll import Altair:

In [None]:
import altair as alt

As with some of the other libraries in this tutorial, we'll import Altair with an abbreviated alias because we'll be typing it often.  Our next step is to let Altair know we're working in a notebook environment:

In [None]:
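# Tell Altair to render charts for the notebook front end.
alt.renderers.enable('notebook')
pass  # end the cell with a statement so the return value above is not echoed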

We can then make our first chart!  We'll make a scatter plot, using the chart's `mark_point` method.  We'll use the year as the x axis, the net energy generation as the y axis, and we'll color each observation by its energy source.  To do this, we pass the `encode` method the names of the columns to use for the color and the x and y coordinates.

In [None]:
c = alt.Chart(iowa).mark_point().encode(
    x='year',
    y='net_generation',
    color='source',
)
c

There are a few things to notice here:

1.  Altair labeled our axes based on the names of the columns.
2.  Altair automatically chose scales for us based on the ranges of the data.
3.  Altair automatically chose colors for each distinct energy type.

Did you notice the button with the ellipsis to the right of the chart?  You can click on that menu to download an image of the plot or visit the chart in an online editor for the Vega visualization grammar.

We can also use Altair to make an interactive chart:

In [None]:
c.interactive()

We can customize the [encoding of the data points](https://altair-viz.github.io/user_guide/encoding.html) in several ways, like using the type of energy to select a shape instead of (or in addition to) a color:

In [None]:
alt.Chart(iowa).mark_point().encode(
    x='year',
    y='net_generation',
    color='source',
    shape='source'
).interactive()

We've just provided a taste of what you can do with Altair here.  Fortunately, it's quite easy to use!  See [the getting-started guide](https://altair-viz.github.io/getting_started/starting.html) for some ideas.  We suggest that you start by trying some different visualization techniques on this example dataset (or perhaps a dataset of your choice), or maybe [experiment with data transformations](https://altair-viz.github.io/user_guide/transform/index.html).