# Week 04: In-Class Assignment: <br> Visualization Technicalities


![gog](https://sakizo-blog.com/wp-content/uploads/2021/11/st_alt2-1.png)

### Instructions for Submitting

Please follow the format below when typing your names in the notebook. This is **mandatory** for all group submissions.
**Each student** will turn in a notebook. Note that this is different from the first ICA. You will still work together, but you will come up with your own answers.

- Each member's name must be written in the format:  
  **Last Name, First Name, Second Name**
- Separate each member's name with a **semicolon (;)**
- Do **not** include any extra text or formatting.
- **Delete this instruction text** and replace it with your names.

#### Example:
    Doe, Jane Marie; Smith, John Alan; Lee, Anna Grace;

⚠️ **Failure to follow this format will result in a reduction of your grade.**

Put your names in the next markdown cell.

Ni, Zhiqiang; Cho, Jungbum; Toaz, Benjamin, Ryan;

For this ICA, you will turn in **your own notebook**. You can work together as always, but I want each of you to try these steps on your own for a dataset of your choosing. 

In today's ICA we will explore the grammar of graphics (GoG), which is a less well-known, but increasingly popular, paradigm for producing visualizations.

We'll work with Altair today. But, there are many other GoG libraries out there and you should explore them! If I had to suggest perhaps the most obvious library to learn after ([before?](https://www.youtube.com/watch?v=jNiQaErXg8s)) Altair, it would be Plotly; as you have seen in some of the assignments, it is very powerful. Examine both for your projects.

Anyway, let's get to it!

We are going to walk through the steps of building a visualization **very slowly**, at what might seem like a ridiculously slow pace. But, the idea is that you appreciate how the "grammar" works.

Your first step is to rearrange the tables you are sitting at so that your group members can all look at one screen together. In the first part of this ICA you will:
* read what I wrote below and follow along (for the mpg dataset); you can show this from one of your group's laptops on one of your monitors,
* mimic each step with a second laptop on the other monitor (e.g., the iris or penguin dataset - you pick!).

I want to be sure you know how to perform these steps for your own dataset.


## Intro to `altair`

For today I'll use the "mpg" dataset and bring it in through `vega_datasets`, another place to get nice datasets. You will use another dataset of your group's choosing. Consider also using `vega_datasets` so that you get familiar with it.

Note: there may be some issues with this ICA: be prepared to install libraries you might not have and try different notebook environments. I last edited this in VS Code and struggled a bit. (Works now!)

In [None]:
# !pip install vega_datasets

In [None]:
# !pip install --upgrade altair vega_datasets jupyterlab

Let's get the cars dataset. Load your own dataset as well, so you can compare it with these steps.

In [None]:
import altair as alt
from vega_datasets import data
cars = data.cars()

We are going to start with [`.Chart`](https://altair-viz.github.io/getting_started/starting.html#the-chart-object), although Altair has [other options](https://altair-viz.github.io/user_guide/API.html#top-level-objects). The way you get started is to call Altair through `alt` (or whatever you chose to call it in the `import` statement above), create a chart and pass the dataframe. Note that the dataframe is the natural container for Altair (as opposed to lists, dictionaries or arrays), just as it was for Seaborn. Basically, this means Altair keeps track of the column names and can use them.

Run this piece of code, **which will generate an error**. Don't panic! 

In [None]:
alt.Chart(cars)

What Altair is complaining about is that it cannot map the data onto a visual without at least a minimal mapping. We need to use at least one more dot in the dot chain! Let's add a `.mark_point()` to map numbers to symbols.

Next, run this cell:


In [None]:
alt.Chart(cars).mark_point()

What do you see? This might not look like much - a circle in a square. Seems weird, right?

This result is expected in the GoG way of doing things: we have mapped the data onto a marker, _but have not given any more information_ about how it should be organized. Thus, we get a "0D" plot! Next time you need a 0D plot, you know how to make one!

Notice that in the GoG, nothing is taken for granted, very little is assumed for you. Think of this as starting with letters, then we can build words, then we can put words together, and so on.

The next "dot" that we need is an encoding (`.encode()`) that maps the markers to some geometry; that is, to an axis (or more). Let's do a simple encoding next:


In [None]:
alt.Chart(cars).mark_point().encode(
    x='Horsepower'
)

Gorgeous! Since we only encoded one variable to the $x$ axis, we get a 1D plot. I bet you have not seen many 0D and 1D plots before - welcome to GoG!

From `altair`'s docs: 
  The `encode()` method builds a key-value mapping between encoding channels (such as x, y, color, shape, size, etc.) to columns in the dataset, accessed by column name

Think of it as a dictionary!

Interestingly, the 1D plot above is basically a "rug plot", which we saw in Seaborn and you made in the HW. In Altair's GoG, a rug plot is a very natural type of plot.

Do this with your dataset; compare the two monitors.

To make an "official" rug plot, try `mark_tick()`.

But, let's keep going! Let's add an encoding to the $y$ axis.

In [None]:
base_chart = alt.Chart(cars).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon'
)
# You need this line to display the chart
base_chart

At this point, we have a normal 2D scatter plot. Relative to `matplotlib`, the syntax looks maybe unusual; but, there is a nice logic to it.

Let's make two jumps now, adding a color and also getting some of the true power of Altair by making the plot interactive. Again, do this with your second dataset as well. Here's how you do that:

In [None]:
# you may need this in VSCode 
alt.renderers.enable('mimetype')
cars = data.cars()

chart = alt.Chart(cars).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin'
).properties(
    width=600,
    height=400
).interactive()

chart

What interactions can you do? Can you pan? Zoom? Save a PNG? If you double click, it returns you to the default settings. Make sure all of this works for both datasets.

## Exploratory Data Analysis

What if we want to do EDA on this dataset and put a categorical variable on the $x$ axis?

Simple:

In [None]:
alt.Chart(cars).mark_point().encode(
    x='Origin',
    y='Miles_per_Gallon',
    color='Cylinders',
).interactive()

Try this with the categorical variables in the second dataset as well.

What about the $y$ axis?

In [None]:
alt.Chart(cars).mark_point().encode(
    y='Origin',
    x='Miles_per_Gallon',
    color='Cylinders',
).interactive()

As you can see, it is very easy to make logical changes to the layout.

Notice how the `color` option works for quantitative values (this case, where `Cylinders` is numeric) and discrete categories (the case above with `color='Origin'`).

What does this next code do? That is, what is the ":O"? [hint](https://altair-viz.github.io/user_guide/encodings/index.html#encoding-data-types)

In [None]:
alt.Chart(cars).mark_point().encode(
    y='Origin',
    x='Miles_per_Gallon',
    color='Cylinders:O',
).interactive()

What happens if you change the ":O" to ":N"?

Again, compare and contrast the two datasets on the two monitors.

## Your turn

Ok, you get the idea: GoG is very nice for EDA!

There is a lot more to GoG, but we want to keep this fairly short.


But, now it is your turn. To complete this ICA you will need to use the [Altair docs](https://altair-viz.github.io/index.html).


Take your dataset and vary these options:
* change `mark_point` to several of the [other options](https://altair-viz.github.io/user_guide/marks.html) (e.g., bar),
* how do you change colors?
* vary the [encodings](https://altair-viz.github.io/user_guide/encodings/index.html),
* how do you export a PNG or PDF?
* for DS, tooltips are extremely useful - [make a plot with tooltips](https://altair-viz.github.io/gallery/scatter_tooltips.html)
* make a plot that uses [facets](https://altair-viz.github.io/user_guide/compound_charts.html),
* look through the gallery and attempt some of the styles, such as area charts, circular plots, and so on,
* discuss with your group members what their plans are for visualizations in their projects - list ideas from your discussion in a markdown cell of how Altair plots might be used among your group members. (Or, are your group members planning to not use Altair at all?)

If you like Altair and want to use it for your project, take a look at [this example](https://altair-viz.github.io/case_studies/exploring-weather.html). If you want an East Lansing weather dataset, let me know.

In [None]:
import altair as alt
from vega_datasets import data
stocks = data.stocks()

In [None]:
alt.Chart(stocks).mark_bar().encode(
    x='price',
    color='symbol'
)

In [None]:
alt.Chart(stocks).mark_point().encode(
    x='date',
    y='price',
    color='symbol'
)

In [None]:
alt.Chart(stocks).mark_point().encode(
    x='date',
    y='price',
    color='symbol'
).properties(
    width=600,
    height=400
).interactive()

In [None]:
alt.Chart(stocks).mark_point().encode(
    x='date',
    y='price',
    color='symbol',
    tooltip=['date', 'price', 'symbol']
).properties(
    width=600,
    height=400
).interactive()

In [132]:
point = alt.Chart(stocks).mark_point().encode(
    x='date',
    y='price',
    color='symbol',
    tooltip=['date', 'price', 'symbol']
).properties(
    width=600,
    height=400
).interactive()
# Companion 1D strip of prices; note it uses mark_point despite the name `bar`
bar = alt.Chart(stocks).mark_point().encode(
    y='price',
    color='symbol',
    tooltip=['date', 'price', 'symbol']
).properties(
    width=100,
    height=400
).interactive()

point | bar


---
## Congratulations, you're done!

Submit this assignment by uploading your notebook to the course Desire2Learn web page.  Go to the "In Class Assignment" folder, find the appropriate submission link, and upload everything there. Make sure your name is on it!

&#169; Copyright 2025, Department of Computational Mathematics, Science and Engineering at Michigan State University.